On the 2020/06/06 I finally made my first Python module.
The reason for the creation of this module is that I needed to perform certain calculations regarding distances.
I wanted to write an article that would cover the most well known ways transmitting data.
Namely, via streaming protocols.
I ended up logically talking about RabbitMQ and Kafka.
Although it was clear for me what I wanted to write about, I wasn't sure what should I base my article about.
An idea was to calculate distances in Km in different planets.
Of course, for this, we need pretty good knowledge of how coordinates work.
I ended up hastily implementing python code to give me those calculations.
At some point I did think I was going too far in my article and ended up settling to planet 🌎 Earth 🌍 and make something with trains and trucks.
The old Geo Library can be found there on path bl-demo-server/bl-core-service.
1. Introduction
What I want to demonstrate with this blog post is a few concepts about python, how I implemented the Haversine Forumla, what is this formula and where does it come from and finally how I made the package currently available on PyPI (Python Package Index)
.
1.1. Haversine Formula Origins
In the book, Heavenly Mathematics: The Forgotten Art of Spherical Trigonometry, Oxford: Princeton University Press, 2012, we can find very interesting references about the Haversine Theory.
It also contains the complete description of it.
For us, what's important is to understand its origins and what it is about.
The beginnings of this theory take us back to 1801.
At this time, Josef de Mendonza y Ríos, used it for the first time.
Josef de Mendonza y Rios, also known as José de Mendonza y Ríos , was a spanish astronomer and mathematician of the 18th century.
He made several publications about navigation and his work was focused on providing mathematical tools to calculate the position of a ship at sea using natural resources and observation instruments.
This position was to be measured in degrees and consisted of two dimensions: latitude and longitude.
José gained notoriety by being able to calculate these coordinates by using two altitudes of the Sun for the latitude and the distance of the moon to a celestial body for the longitude.
Using tables, Sailors and other members of crews at sea could easily and quickly find approximately where they were.
His work earned him an honorable place at the Royal Swedish Academy of Sciences.
However, the first specific publication regarding the Haversine formula only occured only 4 years later in 1805.
At that time James Andrew, published the first known Haversine tables in his ASTRONOMICAL and NAUTICAL tables with PRECEPTS for finding the LONGITUDE and LATITUDE of PLACES By Lunar Distances, Double Altitudes, &c, AND FOR SOLVING OTHER THE MOST USEFUL PROBLEMS IN PRACTICAL ASTRONOMY.
Finally, the actual term haversine
was invented by James Inman(1776–1859).
He was an English mathematician an professor at the Royal Naval College in Portsmouth UK.
James Inman was very busy in 1835 in the development of a new tale of haversines
.
This is where the term comes from.
The new table of haversines
, by James Inman, introduced a very easy way to calculate the distance between two points by using simple spherical trigonometry.
This college has been closed, and it was active between 1733 and 1837
As we can see through this very vey quick introduction to its history, the haversine formula exists for a long time.
It is a centuries old technology, but it's important to realize that although old, its is the basis of what we are about to see.
Not only that, but is part of the theory that makes today GPS (Global Positioning System) possible and with that, we can nowadays know how long the train will take to get to us.
We now know when the buses are coming.
We know how long it will take from A to B.
We even now can have an estimate so precise, that we don't even think about it and take it most of the times for granted.
Even when we have to go from A to B going through C to D and all the way to Z.
Almost everything is calculated wth extreme precision.
We are going to see that although we will consider Earth to be a big globe, all the calculations will still match.
Earth can also be described as a big potato that is almost round.
That small difference for all our distance calculations are negligible.
Unless of coure we need even more extremely precise data for scientific studies.
But that is another issue.
2. Theory
Let's first have a look at the meaning of the word Haversine
.
I didn't mention this in the introduction, just because this term involves quite a lot of mathematical explanation.
This is way it doesn't seem to belong to a historical backgroundf section.
However, it is precisely the end of that section where we will start this one.
From the studies of James Inman, we can finally understand what a Haversine
is.
in Heavenly Mathematics: The Forgotten Art of Spherical Trigonometry
From this image we can see that vers
is basically a versed sine
.
We can now make the correct formula:
vers 𝜭 = 1 - cos 𝜭 = 2 sin ² (𝜭/2)
Half of that is
hav 𝜭 = (1 - cos 𝜭)/2 = sin ² (𝜭/2)
And this is what a haversine
formula is about.
haversine
just means Half versed sine
.
That small distance become something very attractive for people at sea.
The main reason was that it would always result in a positive number.
Plus, a haversine
is very easily invertible and works well with small numbers.
Rounding numbers isn't really a problem using haversines
.
For our implementation we will use this base to develop different formulas.
In line with a good understanding of this problem, it is important that we become very much aware of what latitude and longitude actually mean.
2.1. Latitude (ɸ)
One line parallel to the equator has the same latitude on every of its points and its called a parallel
Latitude is a measurement in degrees of the angle formed from the equator line of the earth to the location we are determining.
The equator line seems like an obvious point to set the latitude at 0°.
This is because it cuts Earth in two halves, and it was already responsible for the separation between the Northen and Southern hemisphere.
The symbol for Latitude is ɸ.
2.2. Longitude (λ)
One line drawn from pole to pole has the same longitude and its called a meridian
Longitude is a measurement in degrees from the Prime Meridian to the location we are determining.
The Prime Meridian has 0° longitude and goes through very near the Royal Observatory, in Greenwich, England.
This has been determined by convention.
The symbol for Longitude is λ
3. Implementation
We are now going to get our hands on Python.
Before we continue we need to be aware of a few things about Python and the libraries we are going to use:
- By default, our functions are available in scope
public
- Functions starting with one _ are considered
protected
. We won't be using these for now - Functions starting with two __ are considered
private
. We will use these - sin, cos, tan, atan, and other geometrical functions of
python
'smath
use radians as arguments. This is why we will be using theradians
functions a lot. - We will be needing to calculate square roots (
sqrt
) and we will need thepi
constant a lot
For version 1.0.0 we will be using some functions within a Coord
class.
This makes functions easier to use and makes our code more elegant:
class Coord:
def __init__(self, lat, lon):
self.lat = lat
self.lon = lon
def delta_rads(self, lat, lon):
self.lat += lat
self.lon += lon
def distance_to_in_meters(self, coord, planet_radius_km=6371000):
meters = self.__distance_to(coord, planet_radius_km)
meters = round(meters, 3)
return meters
def distance_to_in_kilometers(self, coord, planet_radius_km=6371000):
meters = self.__distance_to(coord, planet_radius_km)
km = round(meters / 1000, 3)
return km
def __distance_to(self, coord, planet_radius_km):
lon1 = self.lon
lat1 = self.lat
lon2 = coord.lon
lat2 = coord.lat
phi_1 = radians(lat1)
phi_2 = radians(lat2)
delta_phi = radians(lat2 - lat1)
delta_lambda = radians(lon2 - lon1)
a = sin(delta_phi / 2.0) ** 2 + cos(phi_1) * cos(phi_2) * sin(delta_lambda / 2.0) ** 2
c = 2 * atan2(sqrt(a), sqrt(1 - a))
meters = planet_radius_km * c
return meters
def __repr__(self):
return "lat: " + str(self.lat) + " lon: " + str(self.lon)
Let's now break this class down into its subcomponents:
First a look at the constructor:
def __init__(self, lat, lon):
self.lat = lat
self.lon = lon
This is basically the way a coordinate is defined.
We setup the latitude and the longitude.
One of the things that may be important to move along from one coordinate to another place is a Δ (delta).
For this I've created a function for that:
def delta_rads(self, lat, lon):
self.lat += lat
self.lon += lon
So now we have can easily move step by step if, and only if, we know how many degrees we want to move.
Of course, normally we don't think about the Prime Meridian and we normally talk about kilometers or miles.
This is why I've also created this inner function:
def add_delta_in_km_to_coord(self, d_lat_km, d_lon_km, planet_radius_km=6371):
lat = self.lat
d_lat = (180 / pi) * (d_lat_km / float(planet_radius_km))
d_lon = (180 / pi) * (d_lon_km / float(planet_radius_km)) / cos(radians(lat))
self.delta_rads(d_lat, d_lon)
It's important to notice that we allow, in parameter planet_radius_km
, the radius of a planet.
Remember that this module, is supposed to be usable in different planets.
We first convert our deltas to degrees.
The latitude means that if we get a positive delta, we go North.
A negative one and we go South.
With that in mind we also want to respect the curvature of the earth.
This is why qe have the ratio of 180 degrees / pi
times ΔKlm / radius
.
In terms of longitude, we will be going from west to east for a delta positive number.
However, for longitude, the curvature varies.
Our parallel
line, depending on our position in the meridian
line where we are, has a different radius.
We compensate this in our formulat by dividing by cos(radians(lat))
.
This means that we convert our latitude to radians
so that we can factor our resut by its cos
(cosine).
Finally, we just call our previous function which receves Δ
's in degrees.
We now need to look at two parent functions:
def distance_to_in_meters(self, coord, planet_radius_km=6371000):
meters = self.__distance_to(coord, planet_radius_km)
meters = round(meters, 3)
return meters
def distance_to_in_kilometers(self, coord, planet_radius_km=6371000):
meters = self.__distance_to(coord, planet_radius_km)
km = round(meters / 1000, 3)
return km
We will make sure to give only three decimals in this first version for our calculations.
It is implicit in both functions that we are trying to get the distance between two coordinates.
This is our instance and one injected via params.
Also notice that we are also allowing for a planet_radius
.
The only difference is that this function accepts this parameter only in meters
.
We finally reach the subject of our article.
The Haversine formula
.
According to the literature, the Haversine formula is effectively the source of the formula to calculate distances between two points.
These points can be anywhere in a round planet.
Of course planets are more like giant "potatoes" than actual spheres, but again we need to remember that those humps are negligible.
2 r arcsin (√( sin ² ( (ɸ2 - ɸ2) / 2) + cos ɸ1 * cos ɸ2 sin ² ( (λ2 - λ1) / 2 )))
This formula is derivated from the formula we saw in our introduction.
In code terms this translates to:
def __distance_to(self, coord, planet_radius_km):
lon1 = self.lon
lat1 = self.lat
lon2 = coord.lon
lat2 = coord.lat
phi_1 = radians(lat1)
phi_2 = radians(lat2)
delta_phi = radians(lat2 - lat1)
delta_lambda = radians(lon2 - lon1)
h = sin(delta_phi / 2.0) ** 2 + cos(phi_1) * cos(phi_2) * sin(delta_lambda / 2.0) ** 2
arc_sin = atan2(sqrt(h), sqrt(1 - h))
d = 2 * arc_sin * planet_radius_km
return d
In our code we still find these utility methods
def create_west_random_point(center, radius):
degree = random.randint(-90, 90)
d_lat_km = cos(radians(degree)) * radius
d_lon_km = sin(radians(degree)) * radius
origin = Coord(center.lat, center.lon)
origin = add_delta_in_km_to_coord(origin, -d_lat_km, d_lon_km)
return origin
def create_east_random_point(center, radius):
degree = random.randint(-90, 90)
d_lat_km = cos(radians(degree)) * radius
d_lon_km = sin(radians(degree)) * radius
origin = Coord(center.lat, center.lon)
origin = add_delta_in_km_to_coord(origin, +d_lat_km, d_lon_km)
return origin
What these do is to create a random point in a limited square delimiting a circumference of a determined radius
and a certain ceter
Coord
.
4. Publishing a library
In order to publish a library I followed the rules determined by the example given in:
Python Package Index Sample Project
The important file for this is the setup.py
file:
# -*- coding: utf-8 -*-
from distutils.core import setup
with open("README.txt", "r") as fh:
long_description = fh.read()
setup(
long_description=long_description,
long_description_content_type="text/markdown",
name='geo_calculator',
package_dir={'': 'src'},
py_modules=["geo_calculator"],
version='1.0.0-SNAPSHOT',
description='Multi function Geo Location calculator',
author='João Esperancinha',
author_email='jofisaes@gmail.com',
url='http://joaofilipesabinoesperancinha.nl/main',
download_url='https://github.com/user/reponame/archive/v_01.tar.gz',
keywords=['geo', 'location', 'latitude', 'longitude'],
install_requires=[
# 'math',
# 'random',
],
classifiers=[
'Development Status :: 3 - Alpha',
'Intended Audience :: Developers',
'Topic :: Software Development :: Build Tools',
'License :: OSI Approved :: Apache Software License',
'Programming Language :: Python :: 3',
'Programming Language :: Python :: 3.4',
'Programming Language :: Python :: 3.5',
'Programming Language :: Python :: 3.6',
],
)
The first line # -*- coding: utf-8 -*-
is important if we are using uncommon characters.
In my own name, I do have one of these uncommon characters and that is the Ãẫ
of my name João
.
To configure descriptions we need:
long_description=long_description,
long_description_content_type="text/markdown",
Note, that although markdown
descriptions are supported, I couldn't get themm to work in a way that I like to see.
This is why I just ended up reading a simple Readme.txt file instead:
with open("README.txt", "r") as fh:
long_description = fh.read()
We then need to give some self-explanatory paramter:
name='geo_calculator',
package_dir={'': 'src'},
py_modules=["geo_calculator"],
version='1.0.0-SNAPSHOT',
description='Multi function Geo Location calculator',
author='João Esperancinha',
For the rest of the parameters for our module, it is important to consult the publishing documentation.
One parameter that it's important to give better attention is this:
classifiers=[
'Development Status :: 3 - Alpha',
'Intended Audience :: Developers',
'Topic :: Software Development :: Build Tools',
'License :: OSI Approved :: Apache Software License',
'Programming Language :: Python :: 3',
'Programming Language :: Python :: 3.4',
'Programming Language :: Python :: 3.5',
'Programming Language :: Python :: 3.6',
],
Here we are specifying a sort of datasheet of our module.
At the moment my module is still an alpha version.
Although its coming up good, I will still create better versions, faster and with more functionalities.
To get our module published we need to install twine
:
sudo pip3 install twine
Now that we have twine
, we can finally create our distribution:
python setup.py sdist
Finally we push this to our repository:
twine upload dist/*
If you are pushing your own module, be sure to have your test and production accounts created.
Note that we are going to perform tests locally without having to use any of these remote repos.
5. Running unit tests
Before sending a pacakge to any of these PyPI repos, we should make unit test first and make sure that everything works.
To do that we can install our package locally.
To do that we first need to create it.
As seen before we can create it usidn the sdist
switch:
python setup.py sdist
Now let's install it with command pip3
:
sudo pip3 install dist/geo_calculator-0.1.2.tar.gz
Finally we can run our unit tests with python3
:
python3 -m unittest geo_calculator_test.py
Now let's just have a quick look at how the unit tests are constructed.
We first create the unit test without any functions:
import unittest
from geo_calculator import Coord
class CoordTest(unittest.TestCase):
def setUp(self):
pass
We are setting up the unit test and then we continue.
Let's pick one example where we want to move 100Km Northeast:
def test_add_delta_in_km_to_coord_100_km(self):
coord = Coord(53.32055555555556, -1.7297222222222221)
coord.add_delta_in_km_to_coord(100, 100)
self.assertEqual(54.21987716147429, coord.lat)
self.assertEqual(-0.22417190360281203, coord.lon)
Now let's test the calculation of a distance from one coordinate to another:
def test_distance_to_in_meters_one_place(self):
coord1 = Coord(52.0673599, 5.1102121)
coord2 = Coord(52.08608282419939, 5.109284354540611)
self.assertEqual(2082.859, coord1.distance_to_in_meters(coord2))
self.assertEqual(2.083, coord1.distance_to_in_kilometers(coord2))
Have a look at the other tests. See if you can see what the coordinates refer to and what do they mean!
It will be fun and positive 😉!
6. Conclusion
With this post in my blog, I hope to have been able to help you to get a python package on PyPI.
It isn't really a complicated thing to do as you have seen.
We also see that the Haversine
formula is really something to be remembered.
It is everywhere you need to navigate it is so widely used.
Everything we know today in terms of navigation has a connection to these centuries old theories, which are important to remember forever.
The way to move forward to the future is always to remind ourselves of the lessons from the past.
Not only in Earth we see this.
In the future we may be looking at other planes and there establishing our GPS systems.
No matter how small or how big those planets are, the haversines
and the works of these three remarkable mathematicians will always be there to remind us that no matter how far we go into the future, everything is important.
Our past is our connection to our present which makes us look into the future.
One day we will need formulas that help us navigate through space with our mobile phones I guess.
In space, between planets there is no curvature, and we can measure distances directly.
We also measure distances in 4D.
At one point in time Earth is closer than ever from Mars and at one other point in time, the exact oppposite happens.
This article has been written with a lot of positivity just to provide you with some guidelines about how to create a public python module and to give you an introduction to the Haversine
formula.
If you want to install my module please run:
sudo pip3 install geo_calculator
And if you want to remove it:
pip3 uninstall geo_calculator
The module is availale at PyPI:
7. Resources
7.1. Books & Papers
7.2. Websites
- General Hannyngton's table of haversines by SAO/NASA Astrophysics Data System (ADS)
- Movable Type Scripts - Calculate distance, bearing and more between Latitude/Longitude points
- Latitude Wikipedia
- Longitude Wikipedia
- ASTRONOMICAL and NAUTICAL tables with PRECEPTS for finding the LONGITUDE and LATITUDE of PLACES By Lunar Distances, Double Altitudes, &c, AND FOR SOLVING OTHER THE MOST USEFUL PROBLEMS IN PRACTICAL ASTRONOMY - 1805
- Solar Altitude Angle Calculation
If you like algorithms, you'll probably find interesting to know a bit more about tail recursivity, how that led to tail call optimization and the current state of thing. I tell all about that in this video over here:
References
- Long Description
- Sample release project @ GitHub
- Setup.Py example
- Packages and modules > Contributing packages to the registry
- How to publish packages to npm (the way the industry does things)
- How to upload your python package to PyPi
- How to Publish an Open-Source Python Package to PyPI @ Real Python
- Using TestPyPI
Top comments (0)