Dmytro Huz for AWS Community Builders

Posted on • Originally published at dmytrohuz.com

A Practical Guide to Time for Developers: Part 3 — How Computers Share Time

Intro

Every action film has that scene just before the military operation begins.

“Let’s sync our watches,” the captain says.

The idea is simple: if every stage of the plan depends on precise coordination, everyone involved has to act in sync and according to the same timeline.

A while ago, we started our journey with a practical goal: synchronizing the time of many computers. To get there, we first had to understand what time actually is and learn the basic glossary needed to speak the language of this problem and its solutions. In the first part (https://www.dmytrohuz.com/p/a-practical-guide-to-time-for-developers), we explored the foundations of time itself. In the second (https://www.dmytrohuz.com/p/a-practical-guide-to-time-for-developers-2ec), we looked at how time is kept and tracked inside a single computer. Now we are finally ready to move to the next step: how many computers share time with each other.

Keeping precise time across many computers is not unusual or exotic. In fact, the opposite is true. Distributed systems, industrial networks, telecom infrastructure, financial systems, and measurement environments often involve hundreds or thousands of devices that must stay synchronized within a clearly defined precision budget.

Let’s imagine a wind farm. Each turbine is around 120 meters tall and has a warning light at the top. To make the turbines visible to planes at night, the lights should blink every second. And to make the whole field clearly visible as one coordinated structure, those lights should blink simultaneously.

How can we make that happen?

The obvious answer is: the turbines need synchronized clocks.


But how can we keep hundreds or even thousands of devices in sync with that kind of accuracy?
Let’s see!

Just sync the clocks once?

Let’s start with the most obvious idea: set the same time on all clocks once, and the problem is solved.

Unfortunately, it does not work that way.

Every clock has physical behavior behind it. Its frequency is affected by things like oscillator quality, temperature, aging, and other environmental factors. As a result, every clock drifts in its own way. Some also exhibit short-term fluctuations, often described as wander. These effects cannot be fully eliminated, and in practice they mean that two clocks will slowly diverge even if they start perfectly aligned.

That turns synchronization from a one-time setup task into a continuous process.

Clocks do not just need to be set. They need to be kept aligned over time. In practice, that means measuring the difference between clocks again and again, then adjusting their time and, more importantly, their rate so that they do not immediately drift apart again.

You can see how quickly clocks with different rates and wander fall out of sync, even when they start at exactly the same time, in the interactive simulation I created for this exact scenario: https://dmytrohuzz.github.io/interactive_demo/clock_sync/index.html
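The same effect is easy to reproduce in a few lines. This is a minimal sketch, not a real oscillator model: the ppm rate errors below are illustrative assumptions, but they are in the range of ordinary quartz oscillators.

```python
# Sketch: two clocks set to the same time drift apart because their
# rates differ slightly. The ppm values are illustrative, not specs.
def clock_reading(true_seconds, rate_ppm):
    """Reading of a clock that runs fast or slow by rate_ppm parts per million."""
    return true_seconds * (1 + rate_ppm / 1_000_000)

# Two clocks, perfectly aligned at t = 0, with +50 ppm and -20 ppm rate errors
one_day = 24 * 3600
a = clock_reading(one_day, 50)
b = clock_reading(one_day, -20)

print(f"divergence after one day: {a - b:.3f} s")  # 70 ppm of a day ≈ 6 s
```

Even modest rate errors accumulate into seconds per day, which is why the correction loop has to run continuously.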

From setting time to synchronization

Once we accept that clocks drift, a one-time setup stops looking like a real solution. Time is not something you assign once. It is something you keep aligned.

In practice, synchronization is a feedback loop. A machine compares its local clock to some reference, estimates the difference, adjusts its own clock, and repeats the process again and again.

The difficult part is that machines cannot read each other’s clocks directly. They can only communicate over a network, and the network adds delay and uncertainty. So synchronization protocols work indirectly: they exchange messages with timestamps and use those timestamps to estimate the relationship between clocks.

At the center of that estimate are two questions:

  • how far apart are the clocks?
  • how much of the observed difference comes from network delay rather than clock error?

This sounds simple in theory, but the key idea only becomes clear once we walk through it step by step.

A basic synchronization exchange gives us four timestamps:

  • t1 — the client sends a request
  • t2 — the server receives that request
  • t3 — the server sends a response
  • t4 — the client receives the response


These four timestamps are the heart of the whole mechanism. Once this pattern becomes intuitive, the rest of the synchronization topic becomes much easier to follow.

Now imagine the exchange from the client’s point of view.

The client sends a request at local time t1 = 00:00.

Later, it receives the response at local time t4 = 00:04.

Inside that response, the server includes its own timestamps:

  • it received the request at t2 = 00:06
  • it sent the response at t3 = 00:06

At first glance, this looks strange. How can the server receive the request at 00:06 if the client sent it at 00:00, and the whole round trip took only four seconds on the client side?

The answer is simple: t1 and t2 do not belong to the same timeline.

The client clock and the server clock are different local views of time. What synchronization tries to estimate is the relation between those two timelines. In other words, it tries to answer this question:

If the client sees one moment as 00:00, what does the server call that same moment?

That relationship is what we call offset.

This is the most important insight in the whole topic: the difference t2 - t1 does not represent only network delay. It contains two things mixed together:

  • packet travel time
  • clock offset between client and server

A useful way to think about it is with time zones.

Imagine you leave one city at local time 00:00, travel to another city, and arrive when the local clock there shows 14:00. Then you immediately turn around and come back, arriving home when your original city’s clock shows 20:00.

Now suppose the travel time is the same in both directions.

The first leg, from your city to the other one, includes:

travel time + time-zone difference

The return leg includes:

travel time - time-zone difference

So if the outward journey appears shorter or longer than the return journey, that difference tells you something about the offset between the two local clocks.
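The analogy's own numbers can be worked out directly. Under the assumption of equal travel time in each direction and an instant turnaround, the sum of the two apparent legs isolates the travel time, and their difference isolates the time-zone offset:

```python
# Worked numbers from the travel analogy, assuming symmetric travel time
# and an instant turnaround in the other city.
outward = 14 - 0    # apparent hours on the outward leg: travel + tz_diff
inward  = 20 - 14   # apparent hours on the return leg:  travel - tz_diff

travel  = (outward + inward) / 2   # the symmetric part cancels the offset
tz_diff = (outward - inward) / 2   # the asymmetric part is the offset

print(travel, tz_diff)  # 10 hours of travel, 4 hours of time-zone offset
```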

This is exactly what synchronization protocols exploit.

Under the usual symmetric-delay assumption, the offset can be estimated as:

offset = ((t2 - t1) + (t3 - t4)) / 2

and the round-trip delay as:

delay = (t4 - t1) - (t3 - t2)

The first formula separates clock offset from the two directions of travel. The second removes the server’s processing time and leaves only the network round-trip time.
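Plugging in the client/server exchange from earlier (times in seconds past 00:00) shows how both values fall out. Note that t4 < t2 is not a contradiction, because the two values live on different clocks:

```python
# The client/server example from above, in seconds past 00:00.
t1 = 0   # client sends request  (client clock)
t2 = 6   # server receives it    (server clock)
t3 = 6   # server replies        (server clock)
t4 = 4   # client gets the reply (client clock)

offset = ((t2 - t1) + (t3 - t4)) / 2   # how far the server clock is ahead
delay  = (t4 - t1) - (t3 - t2)         # round trip minus server processing

print(offset, delay)  # server is 4 s ahead; the round trip took 4 s
```

With a 4-second offset and a 4-second round trip, the picture is consistent: the request left at client time 00:00, which the server calls 00:04, and arrived two seconds later at server time 00:06.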

So synchronization is not about directly copying time from one machine to another. It is about observing message exchanges, separating delay from clock difference, and then correcting the local clock based on that estimate.

That is the core idea behind the whole topic.

Feel free to play with the simulation here:

https://dmytrohuzz.github.io/interactive_demo/clock_sync/clock_sync_explained


NTP and PTP: two ways to synchronize clocks

Over time, two major protocol families became the standard answers to the synchronization problem: NTP and PTP.

Both solve the same core problem: a machine cannot read another machine’s clock directly, so it has to infer the difference by exchanging timestamped messages over a network. From those timestamps, it estimates clock offset and network delay, then adjusts the local clock toward a reference.

The difference is not the basic idea, but the precision target and the environment they are designed for.

NTP: practical synchronization for general systems

NTP — the Network Time Protocol — is the general-purpose approach. It is designed to keep clocks reasonably aligned across ordinary systems and ordinary networks.

Its main principle is simple: a client exchanges request and response messages with a time server, records timestamps on both sides, estimates round-trip delay and clock offset, and then gradually disciplines its own clock. It repeats this process continuously, using multiple measurements to smooth out noise and avoid reacting too aggressively to one bad sample.

That makes NTP a good fit for:
• logs and observability
• authentication and certificate validation
• scheduled jobs
• general wall-clock correctness across servers and infrastructure

NTP does not assume a perfect network. It is built for real environments, where delays vary, paths are not perfectly symmetric, and hosts are under changing load. Its strength is robustness, not extreme precision.
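One idea behind that robustness is how samples are filtered. NTP's clock filter keeps several recent measurements and prefers the one with the smallest round-trip delay, because a fast round trip leaves the least room for queueing noise to contaminate the offset estimate. A minimal sketch of that selection rule, with made-up sample values:

```python
# Sketch of NTP-style sample filtering: from several (offset, delay)
# measurements, trust the one with the smallest round-trip delay.
# The sample values below are invented for illustration.
samples = [
    (0.012, 0.040),   # (estimated offset s, round-trip delay s)
    (0.010, 0.008),   # fast round trip -> least queueing contamination
    (0.019, 0.120),   # congested round trip -> noisy offset
]

best_offset, best_delay = min(samples, key=lambda s: s[1])
print(f"use offset {best_offset * 1000:.1f} ms (delay {best_delay * 1000:.1f} ms)")
```

Real implementations combine this with additional statistics and gradual clock steering, but the minimum-delay heuristic is the core of the filter.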
[Interactive demo]

PTP: tighter synchronization for controlled environments

PTP — the Precision Time Protocol — targets systems where much tighter agreement between clocks is required.

Its principle is similar to NTP: devices exchange timing messages, estimate offset and delay, and adjust local clocks. But PTP is designed for local precision networks, where the entire timing path is treated more carefully. In practice, this often means hardware timestamping, PTP-aware switches, and a dedicated timing hierarchy built around a grandmaster clock distributing time to other devices.

PTP is commonly used in:
• industrial and automation systems
• telecom networks
• audio and video systems
• measurement systems
• finance
• power and substation environments

PTP is not just “a more accurate NTP.” It usually operates in a different class of environment, with tighter timing requirements and more deliberate infrastructure support.
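The underlying math is the same four-timestamp idea, just with PTP's message names: the master sends Sync (t1, received at t2), and the slave sends Delay_Req (t3, received at t4). A sketch using the IEEE 1588 conventions, with made-up nanosecond timestamps of the kind a hardware clock would produce:

```python
# Sketch of the PTP exchange (IEEE 1588 naming), with invented
# nanosecond timestamps for illustration.
t1 = 1_000_000_000   # master sends Sync      (master clock, ns)
t2 = 1_000_001_500   # slave receives Sync    (slave clock, ns)
t3 = 1_000_050_000   # slave sends Delay_Req  (slave clock, ns)
t4 = 1_000_050_500   # master gets Delay_Req  (master clock, ns)

mean_path_delay    = ((t2 - t1) + (t4 - t3)) // 2   # symmetric one-way delay
offset_from_master = ((t2 - t1) - (t4 - t3)) // 2   # slave clock minus master clock

print(offset_from_master, mean_path_delay)  # slave is 500 ns ahead; 1000 ns path
```

The formulas are algebraically equivalent to the NTP ones; the difference in achievable precision comes from where and how the timestamps are captured, which is the topic of the next section.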
[Interactive Demo]

Different tools for different timing budgets

So NTP and PTP are not really rivals. They are different engineering choices.

If the goal is to keep ordinary systems aligned to real time well enough for general infrastructure behavior, NTP is usually the right tool.

If the goal is to keep clocks tightly aligned in a local timing domain where timing quality directly affects correctness, event ordering, or measurement precision, PTP is often the better fit.

The key point is this: both protocols depend on timestamp exchange, but the quality of synchronization depends heavily on how those timestamps are produced.

And that leads to the next question: where exactly was the timestamp taken?

This is where timestamping location — in software or in hardware — starts to matter.

Why timestamp location changes everything

At this point, NTP and PTP may still look like protocol problems: exchange messages, estimate offset, correct the clock.

But in practice, a large part of synchronization quality depends on something more physical:

where exactly is the timestamp taken?


That matters because a packet does not appear in software at the exact moment it hits the wire. Between the real network event and the moment the operating system records a timestamp, the packet may pass through the NIC, driver, kernel, interrupt handling, scheduling, and software processing. Every one of those layers can add delay and variation.

So two timestamps may look equally precise as numbers while representing very different physical moments.

Software timestamping

With software timestamping, the timestamp is recorded somewhere in the software stack after the packet has already passed through part of the system.

That makes software timestamping widely available and easy to use, but it also means the measurement includes more uncertainty:
• interrupt latency
• kernel and driver delay
• scheduling effects
• queueing and system load

As a result, a software timestamp often reflects when the system handled the packet, not the exact moment the packet crossed the network interface.

Hardware timestamping

With hardware timestamping, the timestamp is recorded much closer to the real transmit or receive event, typically inside the NIC itself.

This removes a large part of the software-induced uncertainty and makes the measurement more stable and repeatable. The closer the timestamp is to the actual wire event, the more useful it becomes for precise synchronization.

That is one of the main reasons PTP can achieve much better accuracy in the right environment: not only because of the protocol itself, but because it is often paired with hardware timestamping and a more carefully controlled timing path.

So the practical precision limit is not defined by the protocol name alone. It depends on the full measurement path.

A good rule of thumb is simple:

the closer the timestamp is to the wire, the better the synchronization can be.

Summary

A single computer can keep time locally. A distributed system has a harder task: many machines must keep time together.

That is why simply setting clocks once is not enough. Real clocks drift, so synchronization has to be continuous. Protocols such as NTP and PTP address this by exchanging timestamped messages, estimating clock offset and network delay, and repeatedly steering local clocks toward a reference.

But protocol choice is only part of the story. In practice, synchronization quality also depends heavily on where timestamps are taken. A timestamp captured deep in software carries more uncertainty than one captured close to the physical network event.

So if this part was about the general idea of shared time — why it matters, why it is difficult, and how systems approach it — the next part will move from principle to implementation.

We will look at how Linux actually does this in practice: NICs, software and hardware timestamping, PHCs, and the tools that connect them into a real synchronization stack.
