Understanding the gap between actual performance and perceived latency
In previous parts, we explored how backend systems behave under load and how design decisions impact performance.
However, not all performance issues come from slow systems.
In many cases, the backend is fast, but the API still feels slow.
This difference comes from how latency is experienced, not just how it is measured.
API performance is not only about execution time. It is also about network behavior, data transfer, and how requests are structured.
Network latency matters
Every API call travels over a network.
Even if the backend processes a request quickly, the total time includes:
- travel time from client to server
- routing through multiple network hops
- return time for the response
This delay exists even when the backend is efficient.
For users located far from the server, or on unstable networks, this latency becomes noticeable.
As a result, a fast backend can still feel slow due to network distance and conditions.
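The effect can be put into numbers. Here is a minimal latency budget with purely illustrative figures (20 ms of backend processing, 60 ms each way on the network); real values depend on distance and network conditions:

```python
# Illustrative numbers only: a rough latency budget for one request.
server_processing_ms = 20   # the backend does its work quickly
one_way_network_ms = 60     # client far from the server, or a slow network

# The user waits for the trip out, the processing, and the trip back.
total_ms = one_way_network_ms + server_processing_ms + one_way_network_ms
backend_share = server_processing_ms / total_ms

print(f"perceived latency: {total_ms} ms, backend share: {backend_share:.0%}")
```

In this sketch the backend accounts for only a small fraction of what the user actually waits for.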
Payload size issues
The size of the response directly affects how long it takes to deliver.
Larger payloads require more time to:
- transfer over the network
- process on the client side
Even small increases in payload size can add noticeable delay, especially on slower connections.
Returning more data than necessary increases latency without improving functionality.
Efficient APIs focus on sending only what is required.
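To see what over-fetching costs on the wire, compare a hypothetical full record against only the fields a client actually needs. The field names and sizes here are invented for illustration:

```python
import json

# A hypothetical full record: the client only needs "id" and "name".
full = {
    "id": 42,
    "name": "Ada",
    "email": "ada@example.com",
    "bio": "x" * 2000,                       # long free-text field
    "audit_log": [{"event": "login"}] * 50,  # history the client never shows
}
trimmed = {key: full[key] for key in ("id", "name")}

full_bytes = len(json.dumps(full).encode("utf-8"))
trimmed_bytes = len(json.dumps(trimmed).encode("utf-8"))

print(f"full: {full_bytes} bytes, trimmed: {trimmed_bytes} bytes")
```

Every extra byte is transferred, parsed, and held in memory on the client, whether or not it is ever used.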
Too many API calls
Frontend applications often depend on multiple API calls.
Instead of one request, the system may perform several smaller requests to gather data.
For example:
- one call for user data
- another for related items
- another for additional details
Even if each call is fast, the total time adds up.
Sequential calls increase delay further, as each request waits for the previous one.
This creates the perception of a slow system, even when individual endpoints are efficient.
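The arithmetic above can be sketched with simulated calls. `fetch` and the 50 ms latency are stand-ins for a real client, not an actual API:

```python
import time

CALL_LATENCY_S = 0.05  # hypothetical 50 ms round trip per call

def fetch(resource: str) -> str:
    """Stand-in for one API call; time.sleep models the round trip."""
    time.sleep(CALL_LATENCY_S)
    return f"{resource}-data"

start = time.perf_counter()
results = [fetch(r) for r in ("user", "items", "details")]  # one call each
elapsed = time.perf_counter() - start

# Each call is "fast" (50 ms), but the user waits for the sum: ~150 ms.
print(f"{len(results)} calls took about {elapsed * 1000:.0f} ms")
```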
Serialization and deserialization cost
Data needs to be converted before it is sent and after it is received.
On the server:
- objects are serialized into formats like JSON
On the client:
- responses are parsed back into usable data
This process takes time.
The cost is small per request, but it compounds with large payloads or frequent calls, adding hidden overhead that performance evaluations often miss.
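A quick way to make this cost visible is to time the round trip through JSON for a large, hypothetical payload (50,000 small records here; the shape is made up for illustration):

```python
import json
import time

# A hypothetical large list response: 50,000 small records.
payload = [{"id": i, "name": f"item-{i}", "tags": ["a", "b"]} for i in range(50_000)]

start = time.perf_counter()
text = json.dumps(payload)      # server side: objects -> JSON string
serialize_s = time.perf_counter() - start

start = time.perf_counter()
parsed = json.loads(text)       # client side: JSON string -> usable data
parse_s = time.perf_counter() - start

print(f"serialize: {serialize_s * 1000:.1f} ms, parse: {parse_s * 1000:.1f} ms")
```

Neither step shows up in typical backend metrics, yet both sit on the user's critical path.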
Frontend rendering delays
API performance is often judged by how quickly users see results.
Even after the response arrives:
- data must be processed
- UI must be updated
- components must render
These steps add delay beyond the API response time.
From the user’s perspective, the system feels slow, even if the backend responded quickly.
Lack of parallelism in requests
When API calls are made sequentially, total latency increases.
Each request waits for the previous one to complete.
If multiple independent requests are needed, this approach wastes time.
Parallel execution can cut the total wait to roughly the time of the slowest call, but it is not always implemented, leaving unnecessary delay in response delivery.
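A minimal sketch of the difference, using asyncio.sleep as a stand-in for network calls (the resource names and 50 ms latency are illustrative):

```python
import asyncio
import time

async def fetch(resource: str, latency_s: float = 0.05) -> str:
    """Stand-in for one API call; asyncio.sleep models the round trip."""
    await asyncio.sleep(latency_s)
    return f"{resource}-data"

RESOURCES = ("user", "items", "details")

async def sequential() -> list[str]:
    # Each await blocks until the previous call finishes.
    return [await fetch(r) for r in RESOURCES]

async def parallel() -> list[str]:
    # gather() starts all calls at once; total wait ~ the slowest call.
    return list(await asyncio.gather(*(fetch(r) for r in RESOURCES)))

start = time.perf_counter()
asyncio.run(sequential())
seq = time.perf_counter() - start

start = time.perf_counter()
asyncio.run(parallel())
par = time.perf_counter() - start

print(f"sequential: {seq * 1000:.0f} ms, parallel: {par * 1000:.0f} ms")
```

This only helps when the requests are independent; calls that feed each other's inputs still have to run in order.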
Conclusion
API performance is not only about backend speed.
It is influenced by network latency, payload size, request patterns, and client-side processing.
A system can be technically fast but still feel slow to users.
Understanding this difference helps in designing APIs that are efficient not only in execution, but also in experience.
In the next part, we will explore load testing and why many systems fail to identify performance limits early.
Thanks for reading.
