Working with geospatial data is notoriously difficult: latitude and longitude are floating-point numbers that have to be stored and compared with high precision. It is also tempting to treat latitude and longitude as a flat grid, but that does not work, because the Earth is not flat and the math on a curved surface is harder.
For example, the great-circle distance between two points on a sphere, given their latitude and longitude, is computed with the haversine formula.
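As an illustration, the haversine formula can be written as d = 2r * asin(sqrt(sin^2(Δφ/2) + cos(φ1) * cos(φ2) * sin^2(Δλ/2))), where φ is latitude, λ is longitude (both in radians) and r is the radius of the sphere. The sketch below is one possible Ruby version; the method name `haversine_distance` and the mean Earth radius constant are assumptions made for this example, not something Redis exposes.

```ruby
# Mean Earth radius in meters. Treating the Earth as a sphere is an
# assumption of this sketch; the real shape is closer to an ellipsoid.
EARTH_RADIUS_M = 6_371_008.8

# Great-circle distance in meters between two points given in decimal degrees.
def haversine_distance(lat1, lon1, lat2, lon2)
  to_rad = Math::PI / 180
  phi1 = lat1 * to_rad
  phi2 = lat2 * to_rad
  d_phi = (lat2 - lat1) * to_rad
  d_lambda = (lon2 - lon1) * to_rad

  # a is the haversine of the central angle between the two points.
  a = Math.sin(d_phi / 2)**2 +
      Math.cos(phi1) * Math.cos(phi2) * Math.sin(d_lambda / 2)**2
  2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a))
end

# Distance between two nearby points in New York, roughly one kilometer.
puts haversine_distance(40.717855101298305, -74.00020246342898,
                        40.725856700515855, -73.99472237472686)
```

Even this simplest spherical version needs several trigonometric calls per pair of points, which is why it is attractive to hand such work over to the data store.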
Another common task related to latitude and longitude is finding the points that lie within a given radius on the Earth's surface. In other words, you have a large ball (the Earth) and you are trying to find the points inside a circle drawn on that ball. Moreover, the Earth is not a perfect sphere but an ellipsoid. As you might guess, the mathematical calculations for such an operation become quite complex.
In this article, we'll look at how Redis can help us minimize calculations when working with geospatial data.
Redis, which stands for Remote Dictionary Server, is a fast, open source key-value data store. Because of its speed, Redis is a popular choice for caching, session management, gaming, analytics, geospatial data, and more.
Geohash is a system for representing coordinates as a string. Geohashing uses Base32 encoding to convert latitude and longitude to a string. For example, the geohash of the Palace Square in St. Petersburg looks like this: udtscze2chgq. The length of a geohash determines its positional accuracy: the shorter the geohash, the less precise the coordinates it represents. A shorter geohash still describes the same location, only with less accuracy. You can try encoding coordinates as a geohash at http://geohash.org.
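To make the encoding more concrete, here is a minimal sketch of the standard geohash algorithm in Ruby. It repeatedly halves the longitude and latitude ranges, records one bit per halving (alternating between the two, starting with longitude), and maps every five bits to one Base32 character. The method name `geohash_encode` is hypothetical, and this is not the code Redis uses internally.

```ruby
# The Base32 alphabet used by geohash (it omits a, i, l and o).
GEOHASH_ALPHABET = "0123456789bcdefghjkmnpqrstuvwxyz"

# Encodes decimal-degree coordinates into a geohash of the given length.
def geohash_encode(latitude, longitude, length = 12)
  lat_range = [-90.0, 90.0]
  lon_range = [-180.0, 180.0]
  hash = +""
  bits = 0
  bit_count = 0
  use_longitude = true # bits alternate, starting with longitude

  while hash.length < length
    range, value = use_longitude ? [lon_range, longitude] : [lat_range, latitude]
    mid = (range[0] + range[1]) / 2
    if value >= mid
      bits = (bits << 1) | 1
      range[0] = mid # keep the upper half of the interval
    else
      bits <<= 1
      range[1] = mid # keep the lower half of the interval
    end
    use_longitude = !use_longitude
    bit_count += 1

    # Every five bits become one Base32 character.
    if bit_count == 5
      hash << GEOHASH_ALPHABET[bits]
      bits = 0
      bit_count = 0
    end
  end
  hash
end

long_hash = geohash_encode(57.64911, 10.40744, 12)
short_hash = geohash_encode(57.64911, 10.40744, 6)
puts short_hash                       # u4pruy
puts long_hash.start_with?(short_hash) # true
```

The last line shows the prefix property described above: truncating a geohash yields a valid, coarser geohash of the same location.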
Geospatial data storage is implemented in Redis using sorted sets (ZSET) as the underlying data structure, with on-the-fly encoding and decoding of location data and a dedicated API. This means that indexing, searching and sorting by location can be delegated to Redis with very few lines of code and minimal effort, using the built-in commands GEOADD, GEODIST, GEORADIUS and GEORADIUSBYMEMBER (or GEOSEARCH in newer versions).
A Geo Set is the basis for working with geospatial data in Redis: it is a data structure designed to manage geospatial indexes. Each Geo Set consists of one or more elements, and each element consists of a unique identifier and a pair of coordinates, longitude and latitude.
To add a new Geo Set (or a new element to an existing Geo Set) to a Redis store, use the GEOADD command. For clarity, I will give examples of commands in Redis, as well as in the Ruby client for working with Redis:
# Redis example:
GEOADD "buses" -74.00020246342898 40.717855101298305 "Bus A"
# Ruby example:
RedisClient.geoadd("buses", -74.00020246342898, 40.717855101298305, "Bus A")
These commands add the location of the bus "Bus A" to the Geo Set named "buses". If no Geo Set with this name is stored in Redis yet, it will be created. A new entry is added to the index only if the set does not already contain an entry with the same name ("Bus A"). In other words, "Bus A" is a unique identifier.
It is also possible to add multiple records at once with a single GEOADD call, which can help reduce network and database load. Record IDs must be unique:
# Redis example:
GEOADD "buses" -74.00020246342898 40.717855101298305 "Bus A" -73.99472237472686 40.725856700515855 "Bus B"
# Ruby example:
RedisClient.geoadd("buses", -74.00020246342898, 40.717855101298305, "Bus A", -73.99472237472686, 40.725856700515855, "Bus B")
The same command is used to update the location of a record. If GEOADD is called with entries that are already in the Geo Set, Redis simply updates the data for those entries. So as soon as Bus A starts moving, its location can be updated:
# Redis example:
GEOADD "buses" -76.99265963484487 38.87275545298483 "Bus A"
# Ruby example:
RedisClient.geoadd("buses", -76.99265963484487, 38.87275545298483, "Bus A")
In addition to adding and updating entries, you can of course remove them from the index. Because a Geo Set is a sorted set underneath, the ZREM command is used for this. ZREM takes the name of the index to delete records from and the IDs of the records to be deleted:
# Redis example:
ZREM buses "Bus A" "Bus B"
# Ruby example:
RedisClient.zrem("buses", "Bus A", "Bus B")
The geo index can also be deleted entirely. Since it is stored as a Redis key, the DEL command can be used:
# Redis example:
DEL buses
# Ruby example:
RedisClient.del("buses")
Calling DEL on a large set can be a bad idea, because it can block Redis for a long time. It is usually better to use UNLINK, the non-blocking delete, instead of DEL:
# Redis example:
UNLINK buses
# Ruby example:
RedisClient.unlink("buses")
Keep in mind that Redis supports key expiration. If you do not set an expiration time for an index, it never expires and keeps consuming memory. To prevent this, use the EXPIRE command, passing the name of the index and the time to live in seconds:
# Redis example:
EXPIRE buses 1000
# Ruby example:
RedisClient.expire("buses", 1000)
Redis uses a semi-lazy expiration mechanism: an expired index is not removed until it is read. If a read operation finds that the expiration time has passed, no result is returned and the object itself is deleted from storage. With this mechanism alone, an expired Geo Set that is never requested would stay in memory indefinitely.
However, Redis has a second level of expiration, which is active and random. Redis periodically samples random keys that have an expiration time set and deletes the ones that have already expired.
Unfortunately, Redis cannot directly expire individual records in the index; an expiration time applies only to the key as a whole. Such a feature has to be implemented on your own.
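One possible way to build per-record expiration is to keep a companion sorted set that stores an expiry timestamp for every member and to purge outdated members periodically. The sketch below assumes a client object that responds to `geoadd`, `zadd`, `zrangebyscore` and `zrem` with the argument order used by the redis-rb gem; the class name `ExpiringGeoIndex` and the `:expires` key suffix are hypothetical.

```ruby
# A sketch of per-member expiration for a Geo Set.
# `redis` is an injected client (assumed to behave like redis-rb).
class ExpiringGeoIndex
  def initialize(redis, key, ttl_seconds)
    @redis = redis
    @key = key
    @expiry_key = "#{key}:expires" # companion sorted set: member -> expiry timestamp
    @ttl = ttl_seconds
  end

  # Stores the location and records when this member should expire.
  def add(longitude, latitude, member, now: Time.now.to_i)
    @redis.geoadd(@key, longitude, latitude, member)
    @redis.zadd(@expiry_key, now + @ttl, member)
  end

  # Removes every member whose expiry time has passed and returns their IDs.
  # Intended to be called periodically, for example from a background job.
  def purge_expired(now: Time.now.to_i)
    expired = @redis.zrangebyscore(@expiry_key, "-inf", now)
    return [] if expired.empty?

    @redis.zrem(@key, expired)
    @redis.zrem(@expiry_key, expired)
    expired
  end
end
```

Note that the two writes in `add` and the two removals in `purge_expired` are not atomic in this sketch; in a real application you would probably wrap them in a transaction or a script.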
What about reading and searching by geospatial data?
There are several ways to read entries from an index. To get started, you can use the ZRANGE or ZSCAN commands, which read the entries of the underlying sorted set. For example, to return all entries in an index:
# Redis example:
ZRANGE buses 0 -1
# Ruby example:
RedisClient.zrange("buses", 0, -1)
With respect to geospatial data, there are two commands for getting the location of an entry from an index. The first, GEOPOS, returns the coordinates of the entry in the index:
# Redis example:
GEOPOS buses "Bus A"
# Ruby example:
RedisClient.geopos("buses", "Bus A")
The second command, GEOHASH, returns the coordinates of the entry encoded as a geohash:
# Redis example:
GEOHASH buses "Bus A"
# Ruby example:
RedisClient.geohash("buses", "Bus A")
To get the distance between two entries in an index, you can use the GEODIST command:
# Redis example (result in meters, the default):
GEODIST buses "Bus A" "Bus B"
# Ruby example (result in kilometers):
RedisClient.geodist("buses", "Bus A", "Bus B", "km")
By default, the result is returned in meters. You can specify other units of measurement by passing a fourth argument to the command: km for kilometers, m for meters, mi for miles, ft for feet.
To search the index, use the GEORADIUS and GEORADIUSBYMEMBER commands (in Redis versions before 6.2) or the GEOSEARCH command (in version 6.2 and later). GEORADIUS and GEORADIUSBYMEMBER accept the WITHDIST option (display results plus the distance from the specified point or record) and the WITHCOORD option (display results plus record coordinates), as well as the ASC and DESC sort options (sort by distance from the center, nearest first or farthest first):
# Redis examples (the returned values are illustrative):
GEORADIUS buses -73 40 200 km WITHDIST
# returns:
1) 1) "Bus A"
   2) "190.4424"
2) 1) "Bus B"
   2) "56.4413"
GEORADIUS buses -73 40 200 km WITHCOORD
# returns:
1) 1) "Bus A"
   2) 1) "-74.00020246342898"
      2) "40.717855101298305"
2) 1) "Bus B"
   2) 1) "-73.99472237472686"
      2) "40.725856700515855"
GEORADIUS buses -73 40 200 km WITHDIST WITHCOORD
# returns:
1) 1) "Bus A"
   2) "190.4424"
   3) 1) "-74.00020246342898"
      2) "40.717855101298305"
2) 1) "Bus B"
   2) "56.4413"
   3) 1) "-73.99472237472686"
      2) "40.725856700515855"
# Redis example:
GEORADIUSBYMEMBER buses "Bus A" 100 km
# returns (the queried member itself is included):
1) "Bus A"
2) "Bus B"
# Ruby example:
RedisClient.georadiusbymember("buses", "Bus A", 100, "km")
The GEOSEARCH command in newer versions of Redis has a similar syntax and does the same job. The command syntax looks like this:
# Redis examples:
GEOSEARCH buses FROMMEMBER "Bus A" BYRADIUS 100 km ASC WITHCOORD WITHDIST WITHHASH
# returns all entries within a 100 km radius of Bus A, with coordinates, distances and geohashes
GEOSEARCH buses FROMLONLAT -74.00020246342898 40.717855101298305 BYRADIUS 200 mi DESC COUNT 2
# returns at most 2 entries within 200 miles of the given center coordinates,
# sorted from the farthest to the closest
The simplicity of working with geospatial data in Redis not only makes it easy to handle large amounts of such data, but also lets you implement fairly complex processing on top of it. For example, querying for entries within a radius can help you implement a search for nearby points of interest that shows users only the choices closest to them. If your application uses geospatial data in any way, consider moving the complex calculations to Redis; it may make your application more efficient.