APIVerve

Posted on • Originally published at blog.apiverve.com

API Latency: Why 200ms Feels Like Forever

Your API returns in 200 milliseconds. That's fast, right?

A fifth of a second. Barely perceptible.

Except it's not just the API. It's your server processing the request. The API call. Processing the response. Rendering the result. Now you're at 400ms. Add another API call because your page needs data from two sources. Now you're at 600ms.

Your user tapped a button and nothing happened for more than half a second.

They're already wondering if it's broken.

The Human Perception of Speed

Researchers have studied this for decades. The findings are consistent:

Under 100ms: Feels instantaneous. Users perceive cause and effect as immediate.

100-300ms: Noticeable but acceptable. Users feel the system is working.

300-1000ms: Sluggish. Users notice waiting. Focus starts to break.

Over 1 second: Disruptive. Users wonder if something went wrong. They consider clicking again.

Over 10 seconds: Users leave.

These thresholds aren't arbitrary. They're how human attention works. And your API latency is a direct input to where your application lands on this spectrum.

Latency Stacks

Here's what developers often miss: latency adds up.

Consider a typical page load:

Component                        Time
User's network to your server    50ms
Your server processing           20ms
API call #1                      180ms
API call #2                      150ms
Your server processing response  30ms
Your server to user              50ms
Browser rendering                40ms

Total: 520ms.

And that's if everything goes sequentially. If those API calls can happen in parallel, great. If they can't — if call #2 depends on data from call #1 — you're looking at even longer.

Each individual component seems fine. Nothing is slow. But together? Half a second delay on every interaction.
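The stacking above can be sketched as plain arithmetic. The component names and the assumption that the two API calls are fully independent are illustrative, not from any real system:

```python
# The page-load components from the table, in milliseconds.
components = {
    "network_in": 50,            # user's network to your server
    "server_processing": 20,
    "api_call_1": 180,
    "api_call_2": 150,
    "response_processing": 30,
    "network_out": 50,           # your server back to the user
    "rendering": 40,
}

# Sequentially, every step waits for the previous one.
sequential_total = sum(components.values())

# If the two API calls are independent and run in parallel,
# only the slower of the two contributes to the total.
parallel_total = sequential_total - min(
    components["api_call_1"], components["api_call_2"]
)

print(sequential_total)  # 520
print(parallel_total)    # 370
```

Parallelizing just those two calls cuts 150ms off the total without making any single component faster.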

The Percentile Problem

"Our API averages 100ms response time."

Cool. What's your p95? What's your p99?

Averages hide problems. If 90% of requests complete in 50ms and 10% take 600ms, your average is ~105ms. Looks great. But one in ten users is having a miserable experience.

Those slow requests hit your most engaged users hardest. With a 10% slow rate, a person clicking through five pages has a 1 − 0.9⁵ ≈ 41% chance of hitting at least one slow request — and real pages fire multiple requests each, so the true odds are worse. The person browsing casually might not notice.

You're degrading the experience for exactly the users you want to keep.
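Here is the averages-versus-percentiles gap made concrete, using the hypothetical 90/10 split above and only the Python standard library:

```python
import statistics

# Hypothetical sample: 90% of requests at 50ms, 10% at 600ms.
latencies = [50] * 90 + [600] * 10

mean = statistics.mean(latencies)  # looks healthy
ranked = sorted(latencies)
p95 = ranked[int(len(ranked) * 0.95)]  # what 1 in 20 users sees
p99 = ranked[int(len(ranked) * 0.99)]

print(mean)  # 105.0
print(p95)   # 600
print(p99)   # 600
```

The mean reports ~105ms while the p95 and p99 are twelve times worse — exactly the problem a single average hides.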

Why APIs Get Slow

Most API latency comes from a few common sources:

Cold starts. Serverless functions that haven't been called recently need to spin up. This can add hundreds of milliseconds to the first request. Your user who just happens to be the first visitor in 15 minutes gets a noticeably worse experience.

Database queries. The API code itself might be fast, but if the provider's database query is slow, you wait. Missing indexes, unoptimized queries, or just high database load all become your problem.

Geographic distance. Calling an API server in Virginia when your user is in Tokyo adds unavoidable physics. Light through fiber optic cables has a speed limit.

Processing time. Some operations are genuinely expensive. Image processing, complex calculations, or large data transformations take time no matter how optimized the code is.

Rate limiting. Some providers throttle requests by adding artificial delay instead of rejecting them. Your request isn't slow; it's being intentionally delayed.

Network congestion. The internet isn't always fast. Packets get lost, retransmits happen, and nobody can do anything about it.

What You Can Control

You can't fix the API provider's infrastructure. But you can work around it.

Caching. If the data doesn't change often, why fetch it fresh every time? Exchange rates don't change by the millisecond. Yesterday's IP geolocation data is probably still accurate.

Cache responses locally. Set appropriate TTLs. Serve from cache when you can, call the API when you must.
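A minimal TTL cache sketch, assuming nothing beyond the standard library. `fetch_fn` and `ttl_seconds` are illustrative names, not an API from the article:

```python
import time

class TTLCache:
    """Serve a cached value while fresh; call the API only when stale."""

    def __init__(self, fetch_fn, ttl_seconds):
        self.fetch_fn = fetch_fn      # callable that hits the real API
        self.ttl = ttl_seconds
        self._value = None
        self._fetched_at = None       # None means never fetched

    def get(self):
        stale = (
            self._fetched_at is None
            or time.monotonic() - self._fetched_at > self.ttl
        )
        if stale:
            self._value = self.fetch_fn()
            self._fetched_at = time.monotonic()
        return self._value
```

Usage might look like `rates = TTLCache(fetch_exchange_rates, ttl_seconds=3600)`: the first `get()` pays the API's latency, every call within the next hour is effectively free.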

Parallel requests. If you need data from three APIs, call them simultaneously. Three 150ms calls in parallel take ~150ms total (plus overhead). Run sequentially, they take 450ms.

This requires architectural thinking. Can you structure your code so calls don't depend on each other?
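One way to structure that in Python is `asyncio.gather`. This sketch simulates three 150ms calls with `asyncio.sleep` standing in for real HTTP requests:

```python
import asyncio
import time

async def fetch(name):
    await asyncio.sleep(0.15)  # pretend network + server time (150ms)
    return name

async def sequential():
    # Each await blocks the next call: ~450ms total.
    return [await fetch("a"), await fetch("b"), await fetch("c")]

async def parallel():
    # All three run concurrently: ~150ms total.
    return await asyncio.gather(fetch("a"), fetch("b"), fetch("c"))

start = time.monotonic()
results = asyncio.run(parallel())
print(f"parallel: {time.monotonic() - start:.2f}s")  # ~0.15s, not ~0.45s
```

This only works when the calls are independent; if call #2 needs call #1's response, you are back to sequential latency by necessity.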

Background fetching. For data you'll need soon, fetch it before you need it. Pre-load likely next actions. Update caches on a schedule rather than on-demand.

Graceful degradation. If the API is slow or down, can you show something useful anyway? Cached data, placeholder content, or a "loading" state that doesn't block the rest of the page.
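A rough sketch of that fallback pattern: try the live call with a short timeout and serve the last known value if it fails. `fetch_live` and the tuple return shape are assumptions for illustration:

```python
# Last successful response, kept so a slow/down API degrades to stale data.
_last_known = {"value": None}

def get_data(fetch_live, timeout_s=0.2):
    """Return (data, source) where source is 'live' or 'stale'."""
    try:
        fresh = fetch_live(timeout=timeout_s)  # may raise on timeout/error
        _last_known["value"] = fresh
        return fresh, "live"
    except Exception:
        # API slow or down: serve stale data instead of blocking the page.
        return _last_known["value"], "stale"
```

The caller can render stale data immediately (perhaps with a "last updated" hint) rather than showing the user a spinner tied to someone else's infrastructure.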

Choosing faster providers. Not all APIs are equal. Some invest in infrastructure, CDNs, and optimization. Some run on a single server in one location. The difference is noticeable.

The Mobile Reality

Everything above gets worse on mobile.

Mobile networks have higher latency than wired connections. 4G adds 50-100ms minimum. Spotty connections add packet loss and retransmits. Moving between cell towers causes delays.

That 200ms API call? On mobile, by the time the request gets to the server and the response gets back, you might be looking at 400ms just from network overhead.

Mobile users are also more impatient. They're often multitasking, distracted, or in motion. Every millisecond of delay increases the chance they'll switch to another app.

If your product has mobile users — and it probably does — your latency budget is even tighter than you thought.

Measuring What Matters

You should be tracking:

p50 (median): The typical user experience.

p95: What one in twenty users experiences. This catches meaningful slow outliers without noise from the extremes.

p99: What one in a hundred users experiences. For high-traffic sites, this is still thousands of users per day having a bad time.

Geographic breakdown: Are users in certain regions consistently slower? You might need regional infrastructure or different provider choices for different markets.

Time of day: Does performance degrade during peak hours? That points to capacity issues — yours or your provider's.

Error rates alongside latency: Sometimes slow requests become failed requests. Track them together.

The Latency Budget

Here's a useful exercise: work backwards from user expectations.

Target: 300ms total for an interaction to feel responsive.

Budget item                           Allocation
Network round-trip (user ↔ server)    100ms
Server-side processing                50ms
External API calls                    100ms
Rendering                             50ms

That leaves 100ms for API calls. If you're calling an API that averages 180ms, you're already over budget — before accounting for p95 or multiple calls.

This forces real prioritization. Can you call fewer APIs? Can you cache more aggressively? Can you switch to a faster provider?

The budget makes the tradeoffs explicit.
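The budget exercise can even live in code. This sketch encodes the example allocation above and flags any component that blows its share; the dictionary keys are illustrative:

```python
# The article's example 300ms budget, split by component (milliseconds).
BUDGET = {
    "network_round_trip": 100,
    "server_processing": 50,
    "external_api_calls": 100,
    "rendering": 50,
}
TARGET_MS = 300  # total for the interaction to feel responsive

def over_budget(measured):
    """Return the components whose measured latency exceeds their allocation."""
    return {k: v for k, v in measured.items() if v > BUDGET.get(k, 0)}

# The 180ms API from the example is already over its 100ms share.
print(over_budget({"external_api_calls": 180}))  # {'external_api_calls': 180}
```

Feeding real p95 numbers (not averages) into a check like this turns "the app feels slow" into a named component with a named overage.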

When Slow Is Acceptable

Not every millisecond needs optimizing. Some operations are user-initiated and can take longer:

  • Export/download features: Users expect generation time
  • Complex searches: Users accept that complexity takes time
  • One-time setup flows: Patience is higher when configuring something

The key is communication. A progress indicator, a "this might take a moment" message, or a background processing pattern can make waiting acceptable.

But for the core interactions — loading pages, submitting forms, clicking buttons — every millisecond counts.

The Competitive Angle

Fast isn't just about user experience. It's a competitive advantage.

Google has published data showing that slower pages rank lower in search results. Amazon famously found that every 100ms of latency cost them 1% in sales.

Your competitors who prioritize performance are providing a better experience. Users might not articulate why your app feels "clunky" and theirs feels "snappy," but they notice.

Speed is a feature. Treat it like one.

Making the Case

If you need to convince someone that latency matters:

Show comparisons. Throttle your dev environment to simulate slow API calls. Let people feel the difference between 100ms and 500ms.

Tie to metrics. If you can correlate latency with bounce rates, conversion rates, or retention, the argument makes itself.

Reference the research. Google, Amazon, and Walmart have all published studies quantifying the cost of latency. Borrow their credibility.

Start measuring. You can't improve what you don't measure. Set up real user monitoring and let the data tell the story.

The Bottom Line

200ms feels like nothing when you're measuring it. But users don't experience measurements. They experience interactions.

Fast APIs enable fast experiences. Slow APIs force you to work around them or accept degraded UX.

Every external call is a latency tax. Choose your providers carefully, architect for speed, and remember that milliseconds add up faster than you think.

Your users might not know what's making your app feel slow. But they'll know it feels slow.

Don't let them feel that.

Ready for APIs that won't slow you down? Explore the APIVerve catalog — optimized infrastructure and consistent, fast responses.

