DEV Community

Cover image for Day 52 of #100DaysOfClickHouse: ClickHouse® Geospatial Functions for Location Data
Kanishga Subramani
Kanishga Subramani

Posted on

Day 52 of #100DaysOfClickHouse: ClickHouse® Geospatial Functions for Location Data

ClickHouse® Geospatial Functions for Location Data

Location data powers many of today's most widely used applications, from ride-sharing and food delivery platforms to logistics, fleet management, mapping services, and location-based analytics. As businesses collect massive amounts of geographical data, efficiently analyzing latitude and longitude coordinates becomes increasingly important.

ClickHouse® includes built-in geospatial functions that allow you to calculate distances, generate geohashes, perform spatial analysis, and work with geographical coordinates directly in SQL. This eliminates the need to export data to external GIS tools while taking advantage of ClickHouse®'s high-performance analytical engine.

In this guide, we'll explore the most commonly used ClickHouse® geospatial functions with practical SQL examples and real-world use cases.


What Are Geospatial Functions?

Geospatial functions perform calculations using geographical coordinates such as latitude and longitude.

These functions help answer questions like:

  • How far apart are two locations?
  • Which customers are within a specific radius?
  • Which delivery driver is nearest?
  • How can nearby locations be grouped?
  • How can location data be indexed efficiently?

These capabilities are widely used across mapping applications, transportation, logistics, business intelligence, IoT platforms, and location-aware services.


Sample Table

We'll use the following table throughout this guide.

CREATE TABLE delivery_locations
(
    DriverID UInt32,
    DriverName String,
    Latitude Float64,
    Longitude Float64
)
ENGINE = MergeTree
ORDER BY DriverID;
Enter fullscreen mode Exit fullscreen mode

Sample Data

DriverID DriverName Latitude Longitude
1 Alice 13.0827 80.2707
2 Bob 12.9716 77.5946
3 Charlie 17.3850 78.4867

Why Use Geospatial Functions?

Instead of exporting location data into specialized GIS software, ClickHouse® enables geographical analysis directly within SQL.

This approach provides several benefits:

  • Faster analytical queries
  • Real-time location processing
  • Reduced data movement
  • Simpler analytical workflows
  • Better scalability for large datasets

Common ClickHouse® Geospatial Functions

1. greatCircleDistance()

The greatCircleDistance() function calculates the shortest distance between two locations on the Earth's surface using a spherical Earth model.

Suppose a customer is located in Bengaluru. The following query calculates the distance from each driver to the customer's location.

SELECT
    DriverName,
    greatCircleDistance(
        Longitude,
        Latitude,
        77.5946,
        12.9716
    ) AS DistanceInMeters
FROM delivery_locations;
Enter fullscreen mode Exit fullscreen mode

Output

Driver Distance (m)
Bob 0
Alice 290172
Charlie 499786

Common Use Cases

  • Finding the nearest driver
  • Delivery distance calculations
  • Fleet management
  • Nearby store searches

2. geoDistance()

The geoDistance() function calculates distance using the Earth's ellipsoid model, providing greater accuracy than greatCircleDistance().

SELECT
    DriverName,
    geoDistance(
        Longitude,
        Latitude,
        77.5946,
        12.9716
    ) AS Distance
FROM delivery_locations;
Enter fullscreen mode Exit fullscreen mode

Output

Driver Distance (m)
Bob 0
Alice 290105
Charlie 499701

Common Use Cases

  • GPS applications
  • Navigation systems
  • Logistics
  • Route optimization

greatCircleDistance() vs geoDistance()

Function Earth Model Accuracy Best For
greatCircleDistance() Sphere High General analytics
geoDistance() Ellipsoid Very High GPS and navigation

If your application requires maximum geographical accuracy, such as navigation or mapping, geoDistance() is generally the preferred option.


3. geohashEncode()

The geohashEncode() function converts latitude and longitude coordinates into a Geohash string.

SELECT
    DriverName,
    geohashEncode(
        Longitude,
        Latitude,
        8
    ) AS Geohash
FROM delivery_locations;
Enter fullscreen mode Exit fullscreen mode

Output

Driver Geohash
Alice tdn4q6nt
Bob tdr1v9qh
Charlie tephh0ws

Nearby locations often generate similar Geohash values, making Geohashes extremely useful for grouping nearby locations and building efficient spatial indexes.

Common Use Cases

  • Spatial indexing
  • Nearby location searches
  • Location clustering
  • Efficient geographic partitioning

4. geohashDecode()

The geohashDecode() function converts a Geohash string back into its corresponding latitude and longitude coordinates.

SELECT
    geohashDecode('tdr1v9qh');
Enter fullscreen mode Exit fullscreen mode

Output

Latitude Longitude
12.9716 77.5946

5. pointInPolygon()

The pointInPolygon() function determines whether a geographical point lies inside a specified polygon.

It returns:

  • 1 if the point is inside the polygon
  • 0 if the point is outside the polygon

Suppose we want to verify whether each driver falls inside a predefined delivery zone.

SELECT
    DriverName,
    pointInPolygon(
        (Longitude, Latitude),
        [
            (80.20, 13.00),
            (80.35, 13.00),
            (80.35, 13.15),
            (80.20, 13.15)
        ]
    ) AS IsInsideZone
FROM delivery_locations;
Enter fullscreen mode Exit fullscreen mode

Output

Driver IsInsideZone
Alice 1
Bob 0
Charlie 0

Explanation

  • (Longitude, Latitude) represents the driver's location.
  • The list of coordinate pairs defines the delivery zone.
  • 1 indicates the driver is inside the polygon.
  • 0 indicates the driver is outside the polygon.

Common Use Cases

  • Geofencing
  • Delivery zone validation
  • Store coverage analysis
  • Service area verification
  • Asset tracking

6. Finding Nearby Drivers

The following example finds drivers located within a 300-kilometer radius of Bengaluru.

SELECT *
FROM
(
    SELECT
        DriverName,
        greatCircleDistance(
            Longitude,
            Latitude,
            77.5946,
            12.9716
        ) AS Distance
    FROM delivery_locations
)
WHERE Distance <= 300000;
Enter fullscreen mode Exit fullscreen mode

Output

Driver Distance (m)
Bob 0
Alice 290172

7. Sorting Locations by Distance

You can easily rank locations from nearest to farthest.

SELECT
    DriverName,
    greatCircleDistance(
        Longitude,
        Latitude,
        77.5946,
        12.9716
    ) AS Distance
FROM delivery_locations
ORDER BY Distance;
Enter fullscreen mode Exit fullscreen mode

Output

Driver Distance (m)
Bob 0
Alice 290172
Charlie 499786

Real-World Applications

ClickHouse® geospatial functions support a wide variety of location-based applications.

Industry Example
Ride Sharing Find the nearest driver
Logistics Delivery route optimization
Food Delivery Locate nearby restaurants
Retail Store locator
Healthcare Nearest hospital search
IoT Device location tracking
Fleet Management Vehicle monitoring

Performance Tips

To maximize performance when working with geographical data in ClickHouse®:

  • Store coordinates using the Float64 data type.
  • Apply filters before calculating distances whenever possible.
  • Choose efficient ORDER BY keys for MergeTree tables.
  • Use Geohashes for grouping nearby locations.
  • Avoid unnecessary distance calculations across very large datasets.
  • Pre-filter records using bounding boxes before running expensive distance calculations.

Conclusion

ClickHouse® provides a powerful set of built-in geospatial functions that make analyzing location data simple, efficient, and highly scalable. Whether you're calculating distances, finding nearby drivers, generating Geohashes, implementing geofencing, or building location-aware analytics, these SQL functions eliminate the need for external GIS processing.

Combined with ClickHouse®'s columnar storage engine and high-speed analytical capabilities, geospatial functions enable developers and data engineers to build scalable solutions for logistics, ride-sharing, fleet management, retail, IoT, healthcare, and many other industries that rely on geographical data.

As your datasets grow from millions to billions of records, ClickHouse® allows you to perform location-based analytics with exceptional speed while keeping your workflows entirely within SQL.

Top comments (0)