In the past, I usually tried to fine-tune my application's code, either by writing better code or by using the most performant tools and techniques. But in several recent jobs, I've noticed that the performance issue was not in the code itself but "around" the application.
For example:
- Timeouts between the printers at customer locations and my server - in the APM, request processing took tens of milliseconds, so why a timeout?
- Long database query times - even though the query was optimized and ran in a few milliseconds when tested from a query client.
The issue was high latency between the client and the server, caused by a suboptimal geographic deployment of the servers.
For example, the database was deployed in the west region while the application server was in the east region. In another case I had, the app servers and the database were on two different continents (Europe and the U.S.).
Nowadays, this issue is all too common: provisioning servers is easy, teams try to keep cloud costs down, and developers are often separated from the people who maintain the cloud infrastructure.
Let's demonstrate the issue by exploring the latency differences between a few countries. We will use https://wondernetwork.com/pings, one of the best sites I've found in recent years for telling the story of why latency matters.
The website runs about 30 pings and shows the average; a ping measures the round-trip time between sender and receiver (client ↔ server).
So, establishing the first TCP connection between a client and a server (browser and application server, app server and database, etc.) takes about ping time × 1.5 to complete the 3-way handshake.
Then, each round trip of data transfer costs the ping time. A standard TCP packet carries up to about 1.5 KB, so in the worst case - one packet in flight at a time - passing 15 KB of data costs at least ping time × 10 packets, and that's assuming none of the packets are dropped along the way.
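To make the numbers concrete, here is a back-of-the-envelope sketch of this simplified worst-case model (the function name and parameters are mine, chosen for illustration): one handshake at 1.5 × RTT, then one round trip per ~1.5 KB segment.

```python
import math

def estimated_transfer_ms(payload_kb: float, rtt_ms: float,
                          mss_kb: float = 1.5) -> float:
    """Worst-case estimate: a 3-way handshake (1.5 x RTT) plus one
    round trip per TCP segment, assuming no packet loss and only a
    single segment in flight at a time."""
    handshake_ms = 1.5 * rtt_ms
    segments = math.ceil(payload_kb / mss_kb)
    return handshake_ms + segments * rtt_ms

# 15 KB over a 100 ms cross-continent link:
# handshake (150 ms) + 10 segments x 100 ms = 1150 ms.
print(estimated_transfer_ms(15, 100))  # 1150.0
```

Real TCP stacks send several segments per round trip once the congestion window grows, so this is an upper bound - but it shows how every extra round trip multiplies the base ping time.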
Now imagine how long it takes to transfer 100 shop-listing records, with their full catalog and item descriptions, between an app server in Europe and a database server in the U.S. Or how long it takes the browser to download your single-page application from your servers.
There are many ways to decrease your app's latency, which I will elaborate on in the following posts of this latency series, but to name a few:
- Aligning the deployment of your servers to the same region of your cloud provider
- Working with private networks inside your cloud provider
- Using a CDN for static content
- Keeping connections alive between sender and receiver
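The last point is worth quantifying. If every request opens a fresh TCP connection, you pay the 3-way handshake (~1.5 × ping time) again and again; a kept-alive connection pays it once. A rough sketch (the function and its parameters are my own illustration, using the same simplified round-trip arithmetic):

```python
def total_ms(num_requests: int, rtt_ms: float, reuse_connection: bool) -> float:
    """Each request costs one round trip; a fresh TCP connection adds
    a 3-way handshake (~1.5 x RTT) per request, while a kept-alive
    connection pays the handshake only once."""
    handshakes = 1 if reuse_connection else num_requests
    return handshakes * 1.5 * rtt_ms + num_requests * rtt_ms

# 20 requests over a 100 ms link:
print(total_ms(20, 100, reuse_connection=False))  # 5000.0 ms
print(total_ms(20, 100, reuse_connection=True))   # 2150.0 ms
```

In practice this is what HTTP keep-alive and database connection pools buy you: the handshake cost is amortized across many requests.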
This post was originally published on my newsletter, Percentile 99th. If you wish to learn about the ways to decrease latency and more about application performance, I write about it extensively there.