We have a table in our database containing the static IP ranges of cities. It looks something like:
IP-FROM, IP-TO, CITY
100, 110, A
111, 168, B
...
965, 1000, Z
The above is sample data; the real table is huge, with almost 64k rows.
For each user on our website, we determine their city from their IP address by executing a SQL query against our SQL Server Express instance.
Since the data is static (for example, every user with an IP in the range 100 to 110 belongs to city A), we are unnecessarily hitting the database every time.
We were thinking of caching every unique IP visit. For example:
IP-100 mapped to A
IP-101 mapped to A
...
IP-110 mapped to A
But this would create 64k keys in memcache, and I feel there is no point in storing multiple keys that hold the same value when we already know the range.
Can we somehow do this in a better way, i.e. with a minimal number of memcache keys, or by using a different approach altogether?
A list of IP addresses can be sorted (since they are, essentially, numbers). If you have a large non-overlapping list of IP ranges, you can sort them into one big list. If you have a big, sorted list of values, you can do a binary search on it. With 64k items, you can search the whole list in about 16 comparisons (practically instant).
With the right indexes and queries, your DB might be able to do this for you. If you think it could be faster another way (hint: profile to find out whether it really is!) or are concerned about the extra trips to the database, you can cache the whole table's data in memory and search the list yourself. In high-level terms: load all the rows once, sorted by the start of each range, then binary-search that list for each incoming IP.
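A minimal sketch of that approach in Python, using the sample ranges from the question (in practice the list would be loaded once from the database rather than hard-coded):

```python
import bisect

# Non-overlapping ranges (ip_from, ip_to, city), sorted by ip_from.
# In a real app these ~64k rows would be read from the DB at startup.
ranges = [
    (100, 110, "A"),
    (111, 168, "B"),
    (965, 1000, "Z"),
]
starts = [r[0] for r in ranges]  # parallel list of range starts for bisect

def city_for_ip(ip):
    # Find the rightmost range whose start is <= ip (binary search),
    # then verify the IP actually falls inside that range.
    i = bisect.bisect_right(starts, ip) - 1
    if i >= 0 and ranges[i][0] <= ip <= ranges[i][1]:
        return ranges[i][2]
    return None  # the IP falls in a gap between ranges

print(city_for_ip(105))  # A
print(city_for_ip(500))  # None (gap between ranges)
```

Because the ranges don't overlap, searching only the start values is enough; one bounds check afterwards handles gaps between ranges.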
The binary search will need to consider both the start and end numbers. A custom comparator or a custom binary search should make this doable. This should be very fast.
You could also try caching the last few minutes' worth of IP addresses in a dictionary, but I think it's unlikely to be any faster.