Dmitriy A. for Globalping

Originally published at blog.globalping.io

Geolocate any IP using latency

TLDR: I made a CLI tool that can resolve an IP address to a country, US state and even a city. https://github.com/jimaek/geolocation-tool
It works well and confirms ipinfo's findings.

Recently, I read how ipinfo finally proved what most technical people assumed: VPN providers don't actually maintain a crazy amount of infrastructure in hundreds of countries. They simply fake the IP geolocation by intentionally providing wrong location data to ARIN, RIPE, and Geo DB providers via geofeeds.

ipinfo achieved these results with an approach that is novel compared to other geo IP providers. Based on their blog and Hacker News comments, they built a large probe network and used it to trace and ping every (or most) IP addresses on the internet.

This latency and hop data, most likely combined with advanced algorithms and cross-referencing against other data, provides a reliable way to detect the physical location of an IP address without relying on the faked data available in public sources.

This is a very interesting approach that makes total sense, and I'm sure their clients appreciate it and heavily rely on it.

While I can't ping every single IP address on the internet from hundreds of locations just yet, I can do it for a limited subset using Globalping. So I decided to try it out, see if I could replicate their results, and build a small tool that allows anyone to do the same.

Globalping is an open-source, community-powered project that allows users to self-host container-based probes. These probes then become part of our public network, which allows anyone to use them to run network testing tools such as ping and traceroute.

[Image: map of the Globalping probe network]

At the moment, the network has more than 3000 probes, which in theory should be plenty to geolocate almost any IP address down to a country and even a US state level.

To automate and simplify this process, I made a little CLI tool using the globalping-ts library. My original idea was simple:

  1. Accept a single IP as input
  2. Ping it a few times per continent to select the continent
  3. Then ping the IP from many different probes on that continent
  4. Group and sort the results; the country with the lowest latency should be the correct one
  5. And as a bonus, repeat the same process for US states if the winning country was the US
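Step 4 is the core of the idea, so here is a minimal sketch of it. The `ProbeSample` shape and the `rankCountries` name are hypothetical and only used for illustration; the actual tool may structure its data differently.

```typescript
// Hypothetical shape: one latency sample per probe (not the tool's real types).
interface ProbeSample {
  country: string;   // country code of the probe, e.g. "PL"
  latencyMs: number; // latency measured from this probe to the target
}

// Keep the lowest latency seen per country, then sort countries ascending.
function rankCountries(samples: ProbeSample[]): Array<[string, number]> {
  const best = new Map<string, number>();
  for (const sample of samples) {
    const current = best.get(sample.country);
    if (current === undefined || sample.latencyMs < current) {
      best.set(sample.country, sample.latencyMs);
    }
  }
  return Array.from(best).sort((a, b) => a[1] - b[1]);
}

const ranked = rankCountries([
  { country: "DE", latencyMs: 13.42 },
  { country: "PL", latencyMs: 21.1 },
  { country: "PL", latencyMs: 7.29 },
  { country: "LT", latencyMs: 17.65 },
]);
console.log(ranked[0]); // the first entry is the best guess for the country
```

The same grouping works for continents, US states, or cities by swapping the key that the samples are grouped on.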

Essentially, what I had to do was simply create a few measurements and pass the location I needed using Globalping’s magic field, which would automatically figure out what I was looking for and select a few pseudo-random probes that fit the location and limit.
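To make this concrete, here is a rough sketch of what such a measurement request could look like. It only builds the request bodies and sends nothing. The field names (`type`, `target`, `locations`, `magic`, `limit`) follow my reading of the public Globalping REST API, `buildMeasurement` is a hypothetical helper, and the tool itself goes through the globalping-ts library, whose calls are not shown here. The target is a documentation-only placeholder address.

```typescript
// Assumed body shape for creating a measurement via the Globalping REST API.
interface MeasurementRequest {
  type: "ping" | "traceroute";
  target: string;
  locations: Array<{ magic: string }>; // free-form location, e.g. "Europe"
  limit: number;                       // how many probes to select
}

// Hypothetical helper: one traceroute measurement for one magic location.
function buildMeasurement(target: string, magic: string, limit: number): MeasurementRequest {
  return { type: "traceroute", target, locations: [{ magic }], limit };
}

// Phase 1 as described in the text: a few probes per continent.
const continents = ["Africa", "Asia", "Europe", "North America", "Oceania", "South America"];
const phase1 = continents.map((continent) => buildMeasurement("192.0.2.1", continent, 5));

console.log(JSON.stringify(phase1[2], null, 2));
```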

Initially, I used ping with 2 packets to run all measurements as fast as possible, but it soon became clear that this wasn’t a good idea, as most networks block ICMP traffic. Next, I tried switching to TCP-based ping, which required trying a few popular ports to get it to work. That turned out to be too complicated and unreliable, so I switched to traceroute.

It worked perfectly. Even though traceroute uses ICMP by default, it did not matter whether the target IP’s network allowed ICMP or not: I simply analyzed the latency of the last available hop. Even if you block ICMP, your upstream most likely allows it, and in most cases it’s located in the same country.

Of course, this means the resulting data is not 100% perfect. A better approach would be to analyze each IP using different methods, including TCP and UDP-based traceroute on different ports, and expand to the last few hops instead of just one. Maybe even try to figure out the location of the registered ASNs and use a weighting system in combination with public whois info in order to “vote” for the right location based on different inputs. Probably even mark low-certainty IPs to be retested with twice as many probes. (end of rant)
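Here is a small sketch of the "last available hop" idea. The `Hop` type is a simplified, hypothetical stand-in for a traceroute result, and taking the minimum RTT of that hop is my assumption; an average would work too.

```typescript
// Simplified, hypothetical hop shape (not the exact API response format).
interface Hop {
  address: string | null; // null when the hop did not reply
  rttsMs: number[];       // round-trip times for this hop, empty on timeout
}

// Walk the path backwards and return the lowest RTT of the last hop that replied.
function lastHopLatency(hops: Hop[]): number | null {
  for (let i = hops.length - 1; i >= 0; i--) {
    const rtts = hops[i].rttsMs;
    if (rtts.length > 0) {
      return Math.min(...rtts);
    }
  }
  return null; // no hop answered at all
}

// The target itself (last entry) drops ICMP, so its upstream hop is used instead.
const samplePath: Hop[] = [
  { address: "198.51.100.1", rttsMs: [1.2, 1.4] },
  { address: "198.51.100.9", rttsMs: [7.9, 7.3] },
  { address: null, rttsMs: [] },
];
console.log(lastHopLatency(samplePath));
```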

But that’s something for a commercial provider to figure out, which it seems they did.

For continent detection, I decided to use just 5 probes per continent, and the results were extremely accurate. This might be less effective for IPs right on the "border" between continents, where a higher number of probes would generate better results, but for this use case it was good enough.

My home IP in central Europe was too easy to detect:

Phase 1: Detecting continent...
  North America: 137.18 ms
  Europe: 32.39 ms
  Asia: 174.54 ms
  South America: 215.08 ms
  Oceania: 244.15 ms
  Africa: 156.83 ms

In phase 2, all we need to do is run a single measurement with the winning continent as the location and a higher limit. Initially, I started with 250 probes, which gave great accuracy.

Eventually, I decided to drop down to 50 as the default. Based on my tests, the results continued to look really good, and it would allow the tool to be run even without authentication, as the Globalping API allows 250 tests per hour per IP and 50 probes per measurement.

That said, I recommend registering for a free account at https://dash.globalping.io/ and authenticating with a token to get up to 500 tests per hour.

Note: If you need more tests than that, you can either host a probe to generate passive credits to be used as tests, or donate via GitHub Sponsors. We will automatically detect it and credit your account.
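As a rough budget check, here is how many tests a single run could consume with the numbers mentioned above. This assumes that every probe taking part in a measurement counts as one test, that phase 1 uses 5 probes on each of 6 continents, and that the city phase uses the full limit (an upper bound, since fewer probes may be available). `testsPerRun` is a hypothetical helper, not part of the tool.

```typescript
const CONTINENT_COUNT = 6;

// Estimate the tests consumed by one full run of all phases.
function testsPerRun(probesPerContinent: number, limit: number, isUS: boolean): number {
  const continentPhase = CONTINENT_COUNT * probesPerContinent;
  const countryPhase = limit;
  const statePhase = isUS ? limit : 0; // only runs when the country is the US
  const cityPhase = limit;             // upper bound
  return continentPhase + countryPhase + statePhase + cityPhase;
}

const usRun = testsPerRun(5, 50, true);     // 30 + 50 + 50 + 50 = 180
const otherRun = testsPerRun(5, 50, false); // 30 + 50 + 50 = 130

console.log(`US target: ${usRun} tests, other targets: ${otherRun} tests`);
console.log(`Full US runs per hour: ${Math.floor(250 / usRun)} unauthenticated, ${Math.floor(500 / usRun)} with a token`);
```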

Phase 2: Detecting country...
  Measuring from 50 probes...

  [████████████████████████████████████████] 100.0%   50/50 - Best: PL (7.29 ms)                    

Top 3 Locations:
─────────────────────────────────────────────────
  1.. Poland, EU                               7.29 ms
  2.. Germany, EU                              13.42 ms
  3.. Lithuania, EU                            17.65 ms

═══════════════════════════════════════════════════
                      SUMMARY
═══════════════════════════════════════════════════
  Location: Poland, EU
  Minimum Latency: 7.29 ms
  Confidence: Medium

Great, now we have a basic IP-to-country resolver that only takes a few seconds to provide a response, and I didn’t even have to understand or write any complicated math. Although I’m sure someone smarter could use a formula to geolocate IPs with even fewer probes and higher accuracy.

For phase 3, we want to resolve the US to a specific state or territory, just like ipinfo did, and luckily they even provided a few sample IPs and locations to benchmark against during testing.

Again, this was as simple as creating a new measurement with the USA as the location. I used 50 probes as the default limit and tested the NordVPN IP advertised as Bahamas but resolved to Miami by ipinfo.

Phase 3: Detecting US state...
  Measuring from 50 probes...

  [████████████████████████████████████████] 100.0%   50/50 - Best: FL (0.45 ms)                    

Top 3 Locations:
─────────────────────────────────────────────────
  1. Florida, USA                             0.45 ms
  2. South Carolina, USA                      12.23 ms
  3. Georgia, USA                             15.01 ms

═══════════════════════════════════════════════════
                      SUMMARY
═══════════════════════════════════════════════════
  Location: Florida, United States
  Minimum Latency: 0.45 ms
  Confidence: Very High
═══════════════════════════════════════════════════

The tool agrees: Florida is the correct location. But how accurate can this system be? Can we expand it to show the city too?

Let's make a new phase, which again, will simply set the resulting country or state as the location and extract the city of the probe with the lowest latency. Here, since there are too many possible cities and towns per state and country, I expect the accuracy to be low and only point to the closest major hub. But in theory, this should be more than enough for use cases like routing or performance debugging.

And here we go, the same result ipinfo got:

Phase 4: Detecting city...
  Measuring from 36 probes...

  [████████████████████████████████████████] 100.0%   36/36 - Best: Miami (0.00 ms)                 

Top 3 Locations:
─────────────────────────────────────────────────
  1. Miami, Florida, USA                      0.00 ms
  2. West Palm Beach, Florida, USA            4.36 ms
  3. Tampa, Florida, USA                      5.85 ms

═══════════════════════════════════════════════════
                      SUMMARY
═══════════════════════════════════════════════════
  Location: Miami, Florida, United States
  Minimum Latency: 0.00 ms
  Confidence: Very High
═══════════════════════════════════════════════════

The current results are good but could be better. The main problem is with how the magic field works: when setting, for example, 'Europe' as the location, it tries to spread the tests across all European probes but does not guarantee that every single country is going to be included.

This results in inconsistencies where a probe in the same country as the target IP was not selected, and so the tool assumes the IP is located in a different neighbouring country.

To fix this and make the results more consistent, you would need to change the selection logic and manually set every country per continent and US state. By passing the full list of countries/states to the Globalping API, you ensure that at least one probe in that location is going to be selected. Additionally, you fully control the number of probes per location, which is very important to control the accuracy.

For example, North America technically contains 43 countries and territories. This means you can't just set a limit of one probe per country, because that is not enough to properly understand the latency to the target IP from the disproportionately larger USA. A better limit would be around 200 probes for the USA, 20 for Canada, and 10 for Mexico.

But the goal of this tool was to use a minimum number of probes to allow unauthenticated users to test it out. The current approach works great: it is simple to implement, and it is very easy to control the accuracy by simply setting a higher limit of probes.
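A sketch of what that explicit selection could look like is shown below. It builds one location entry per country with its own probe limit, using the `country` and per-location `limit` fields as I understand them from the public Globalping REST API (where, as far as I know, per-location limits replace the global `limit`). `buildLocations` is a hypothetical helper, and the short country list is only a sample, not the full 43 entries.

```typescript
interface CountryLocation {
  country: string; // two-letter country code
  limit: number;   // probes to request in this country
}

// Give every listed country its own entry, with heavier weights where provided.
function buildLocations(
  countries: string[],
  weights: Record<string, number | undefined>,
  defaultLimit: number,
): CountryLocation[] {
  return countries.map((country) => {
    const weight = weights[country];
    return { country, limit: weight !== undefined ? weight : defaultLimit };
  });
}

// Weights from the text: 200 for the USA, 20 for Canada, 10 for Mexico.
const northAmerica = buildLocations(
  ["US", "CA", "MX", "BS", "CU"],
  { US: 200, CA: 20, MX: 10 },
  1,
);
const totalProbes = northAmerica.reduce((sum, location) => sum + location.limit, 0);
console.log(northAmerica, totalProbes);
```

Even this small sample already adds up to 232 probes, which is well beyond the 50 probes per measurement that unauthenticated users get.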

Overall, latency-based geolocation detection seems to be a great way to verify the location of any IP as long as you have enough vantage points. It will most likely fall apart in regions with minimal or no coverage.

The tool itself is open source and you can run it like this:

geolocate $IP

You can also use the --limit parameter to use more probes per phase. But be careful, as it applies the set value to all phases and this will very quickly eat through your limit. Check the full docs on GitHub.

Pull requests with improvements are welcome!

Feel free to email me if you need some free credits to play around with: d@globalping.io

And of course, consider hosting a probe. It’s as simple as running a container: https://github.com/jsdelivr/globalping-probe
