Introduction
When I started building REST APIs, caching was never part of the plan. It was always something I added later, when users started complaining about slow responses or the database started struggling under load. I would drop in Redis, set a TTL that felt right, and move on. It worked for a while.
Then I watched a caching bug serve wrong prices to thousands of users for over an hour. Nobody caught it until customer support started getting flooded with calls. That was the moment I stopped treating caching as a quick fix and started giving it the attention it deserves.
After years of working on APIs across fintech, SaaS, and enterprise products, here are the REST API caching strategies I use and recommend in 2026.
Top 7 REST API Caching Strategies To Follow
The seven strategies for REST API caching that I have covered below are derived from real projects, real incidents, and decisions I wish I had made earlier. They are ordered from the simplest to implement to the most involved, so you can adopt them in stages rather than all at once.
1. Always Set Cache-Control Headers Before Touching Any Caching Tool
This is where I start on every project now. It is also the last place I looked when I had less experience.
Before a request reaches your application server, it passes through browsers, proxies, and CDNs. The Cache-Control header tells all of them exactly what to do with your response. For a product listing or reference data that does not change often, Cache-Control: public, max-age=3600 means an hour of requests never reaching your server at all. For user-specific data, Cache-Control: private, max-age=60 caches it in the browser only, so nothing leaks through a shared proxy.
What makes things messy is forgetting the Vary header. If your API returns different content based on language or encoding, you need to declare that. Without it, a CDN can serve a cached English response to a French-speaking user, and nobody can figure out why for hours.
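As a concrete illustration, here is a minimal, framework-agnostic sketch of those two header policies. The helper name is my own invention, not part of any library; the same dict of headers would be attached to a response in whatever framework you use:

```python
def cache_headers(*, public: bool, max_age: int, vary=()):
    """Build the response headers that direct browsers, proxies, and CDNs."""
    scope = "public" if public else "private"
    headers = {"Cache-Control": f"{scope}, max-age={max_age}"}
    if vary:
        # Declare every request header that changes the response body, so
        # shared caches key on it instead of serving the wrong variant.
        headers["Vary"] = ", ".join(vary)
    return headers

# Reference data: shared caches may hold it for an hour, keyed by language.
cache_headers(public=True, max_age=3600, vary=("Accept-Language",))

# User-specific data: browser-only, one minute, never stored by a shared proxy.
cache_headers(public=False, max_age=60)
```

The `vary` tuple is the part teams forget: anything that changes the body belongs in it.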
2. Use ETags So Clients Only Fetch Data When Something Actually Changed
I skipped ETags on most early projects because they felt like extra work for a small gain. That changed when I was working on a mobile API where the app polled a profile endpoint every 30 seconds, fetching the full response every single time, even when nothing had changed.
ETags fix this. Your server sends an ETag with the response, which is just a hash or version identifier for that data. The client stores it and sends it back with the next request. If nothing changed, your server returns a 304 Not Modified response with no body at all. That 8KB JSON response becomes a 200-byte header response, hundreds of times a day, per user.
Generating ETags is not complicated. Hash your response with SHA-256, take the first 16 characters, done. Or use the last modified timestamp if that is easier to pull from your data layer. Either works fine in practice.
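A sketch of the hash-based approach, assuming a JSON payload and nothing beyond Python's standard library (the function names are illustrative, not a real framework API):

```python
import hashlib
import json

def make_etag(payload: dict) -> str:
    """Hash the serialized body; the first 16 hex chars are plenty."""
    body = json.dumps(payload, sort_keys=True).encode()
    return hashlib.sha256(body).hexdigest()[:16]

def respond(payload: dict, if_none_match):
    """Return (status, body) the way a conditional-GET handler would."""
    etag = make_etag(payload)
    if if_none_match == etag:
        return 304, None       # unchanged: no body goes over the wire
    return 200, payload        # first fetch or changed data: full body

profile = {"id": 42, "name": "Ada"}
status, _ = respond(profile, None)                 # first poll: 200
status, _ = respond(profile, make_etag(profile))   # repeat poll: 304
```

Sorting the keys before hashing matters: two logically identical payloads must produce the same ETag.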
3. Use Redis as Your Application Cache, but Choose Your TTLs Carefully
Once HTTP-level caching stops being enough, teams reach for Redis, and for good reason: it handles the scenarios that headers cannot, such as expensive database queries, complex aggregations, or third-party APIs that you do not want to call on every request.
Where teams consistently go wrong is TTL selection. I have seen a 24-hour TTL on order status data that changes constantly, and a 30-second TTL on configuration data that barely changes at all. Before picking a number, ask yourself what the worst case is if this data is outdated, and what a cache miss will cost you at peak load. Those two questions give you a much better answer than gut feel.
Also, set a memory limit on your Redis instance and configure an eviction policy. LRU works for most cases. Without it, Redis quietly consumes memory until the server runs out, usually at the worst possible moment.
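The cache-aside pattern those TTL decisions feed into can be sketched like this. To keep the example self-contained I am using a tiny in-process class as a stand-in for Redis; in production you would call `GET`/`SETEX` on a real Redis client with the same shape:

```python
import time

class TTLCache:
    """Tiny in-process stand-in for Redis GET/SETEX, just for this sketch."""

    def __init__(self):
        self._store = {}                      # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]              # lazy expiry, as Redis does
            return None
        return value

    def setex(self, key, ttl_seconds, value):
        # Same argument order as the Redis SETEX command.
        self._store[key] = (value, time.monotonic() + ttl_seconds)

cache = TTLCache()

def get_order_status(order_id, fetch_from_db):
    # Short TTL: order status changes constantly, so staleness is costly
    # while a cache miss is cheap. Config data would get hours, not seconds.
    key = f"order:{order_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = fetch_from_db(order_id)
    cache.setex(key, 30, value)
    return value
```

The TTL lives next to the data it protects, which makes those two questions (staleness cost, miss cost) easy to revisit per key.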
4. Put a CDN in Front of Every Public Endpoint That Does Not Change Per User
If your API has endpoints that return the same data regardless of who is asking, a CDN belongs in front of them. Not because it is the popular choice, but because it is the most practical way to reduce latency for users who are far from your servers.
CDNs cache at edge locations around the world. A user in Nairobi gets a response from a nearby server instead of waiting for a round trip to your data center in Virginia. For read-heavy public APIs, this difference in response time is genuinely noticeable.
The directive that matters here is s-maxage. Browsers ignore it, but CDNs read it and cache accordingly. Setting Cache-Control: public, max-age=60, s-maxage=300 gives you independent control over both. Browsers treat the response as fresh for one minute while the CDN holds it for five. You reduce origin traffic without making users rely on old data too long.
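A trivial helper makes the split explicit in code; this is just string composition, but naming the two lifetimes separately keeps them from being confused in review:

```python
def cdn_cache_control(browser_max_age: int, cdn_max_age: int) -> str:
    """Compose a Cache-Control value with separate browser and CDN lifetimes.

    Browsers honor max-age and ignore s-maxage; shared caches such as
    CDNs prefer s-maxage when it is present.
    """
    return f"public, max-age={browser_max_age}, s-maxage={cdn_max_age}"

# Fresh for 1 minute in the browser, 5 minutes at the edge.
print(cdn_cache_control(60, 300))  # public, max-age=60, s-maxage=300
```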
5. Protect Your Database from Cache Stampedes Before They Happen
A cache stampede happens when a heavily used cache entry expires, and hundreds of requests hit your origin at the same time. All of them see a miss. All of them try to recompute the same data. Your database was not built for that.
I dealt with this on a SaaS platform where a popular endpoint's cache expiring during peak hours brought the database down for several minutes. After that, I started using probabilistic early expiration, also called the XFetch algorithm.
The idea is simple. Instead of all requests hitting an empty cache at the exact moment of expiry, individual requests start recomputing the value a little early, before it expires. The closer the entry gets to expiry, the higher the chance any given request will trigger a refresh. The work gets distributed across time instead of piling up at one moment.
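A minimal sketch of that decision, loosely following the XFetch formula (the parameter names are mine, and a real implementation would store the measured recomputation time alongside the cached value):

```python
import math
import random
import time

def should_refresh_early(*, expires_at, compute_time, beta=1.0, now=None):
    """Decide whether this request should recompute the cached value early.

    Probabilistic early expiration (XFetch): the closer we are to
    `expires_at`, the more likely any given request is to refresh.
    `compute_time` is roughly how long recomputation takes; `beta` > 1
    makes refreshes more eager.
    """
    now = time.monotonic() if now is None else now
    # -log(u) with u in (0, 1] is an exponential random sample, so the
    # typical early-refresh window is on the order of compute_time * beta
    # before expiry, and almost no request refreshes far ahead of time.
    gap = compute_time * beta * -math.log(1.0 - random.random())
    return now + gap >= expires_at

# On a cache hit: if should_refresh_early(...) is true, this one request
# recomputes and rewrites the entry; everyone else keeps reading the cache.
```

Because each request rolls its own random sample, exactly one or a few requests refresh early instead of hundreds piling up at expiry.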
Adding it to that platform eliminated the stampede problem entirely, and it is now one of the REST API caching strategies I apply by default on every high-traffic project.
6. Serve Stale Content Immediately While Refreshing It in the Background
Most developers have not used the stale-while-revalidate directive, and they are missing out on something genuinely useful.
Cache-Control: max-age=60, stale-while-revalidate=3600 tells a cache to serve a stale response immediately after expiry while fetching a fresh one in the background. The user gets a fast response every single time. The refresh happens without them waiting for it. The next request after the background update picks up the fresh data.
This works well for dashboards, activity feeds, recommendation lists, and any data where freshness matters but not down to the second. Cloudflare and Fastly support it natively. For application caches, you build the same behavior with a background refresh job. The concept is the same either way.
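For the application-cache version, here is one way the background refresh might look. This is a simplified in-process sketch of the idea, not production code (no retries, no error handling in the worker beyond clearing its guard):

```python
import threading
import time

class SWRCache:
    """Serve stale entries instantly and refresh them in the background."""

    def __init__(self, fetch, *, max_age, stale_while_revalidate):
        self._fetch = fetch
        self._max_age = max_age
        self._swr = stale_while_revalidate
        self._entries = {}            # key -> (value, fetched_at)
        self._refreshing = set()
        self._lock = threading.Lock()

    def get(self, key):
        now = time.monotonic()
        entry = self._entries.get(key)
        if entry is None:
            value = self._fetch(key)              # cold miss: caller waits once
            self._entries[key] = (value, now)
            return value
        value, fetched_at = entry
        age = now - fetched_at
        if age > self._max_age + self._swr:
            # Too stale even for the grace window: blocking fetch.
            value = self._fetch(key)
            self._entries[key] = (value, time.monotonic())
            return value
        if age > self._max_age:
            self._refresh_in_background(key)      # stale but servable
        return value

    def _refresh_in_background(self, key):
        with self._lock:
            if key in self._refreshing:
                return                            # one refresh per key at a time
            self._refreshing.add(key)

        def worker():
            try:
                self._entries[key] = (self._fetch(key), time.monotonic())
            finally:
                with self._lock:
                    self._refreshing.discard(key)

        threading.Thread(target=worker, daemon=True).start()
```

The caller never waits past the first cold miss, which is exactly the property the HTTP directive gives you at the CDN layer.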
7. Invalidate Your Cache Through Events So Writes Always Reflect Fresh Data
TTL-based expiration works, but it is a risky approach. If data changes one second after you cached it, users see the stale version until the TTL runs out. And, if you shorten the TTL to compensate, you lose most of the caching benefit.
Event-driven invalidation solves this. When a write happens, you emit an event, a consumer picks it up and deletes that cache entry, and the next read fetches fresh data from the database.
This means you can use long TTLs for steady-state reads while still making sure writes invalidate the cache immediately. Reads stay fast. Data stays correct after updates.
The only part you have to get right here is reliability. If your event consumer drops a message, you are serving stale data until the TTL eventually expires. So use a durable queue with at-least-once delivery, and keep the TTL as a safety net rather than your main strategy.
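The write-invalidate-refetch flow can be sketched with an in-process list standing in for the queue and dicts standing in for Redis and the database (in production the queue would be Kafka, SQS, or similar with at-least-once delivery):

```python
events = []                        # stand-in for the durable message queue
cache = {}                         # stand-in for Redis
db = {"product:1": {"price": 100}}  # stand-in for the database

def handle_write(key, new_value):
    """Write path: update the database, then emit an invalidation event."""
    db[key] = new_value
    events.append({"type": "invalidate", "key": key})

def consume():
    """Consumer: delete the cache entry so the next read refetches."""
    while events:
        event = events.pop(0)
        cache.pop(event["key"], None)

def read(key):
    """Read path: cache-aside; in real life the set would use a long TTL."""
    if key not in cache:
        cache[key] = db[key]
    return cache[key]

read("product:1")                           # warms the cache
handle_write("product:1", {"price": 120})   # write emits the event
consume()                                   # consumer drops the stale entry
print(read("product:1"))                    # {'price': 120}
```

Note that the write path never touches the cache directly; the consumer owns invalidation, which keeps the logic in one place when many services write to the same data.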
Conclusion
The most common mistake I see is teams reaching for Redis before they have even looked at their HTTP headers. The headers cost nothing, require no new infrastructure, and eliminate a significant amount of unnecessary traffic before it ever reaches your servers.
Start there. Add ETags to anything being polled frequently. Bring Redis in for endpoints that genuinely need application-level caching. Put a CDN in front of your public endpoints. Add stampede protection as your traffic grows. Move to event-driven invalidation when TTL expiration starts causing real consistency problems.
These strategies for REST API caching build on each other. You do not need all seven from day one. You need the right ones for where you are right now, and the rest as your system actually demands them.
That said, getting caching right across a REST API takes more than reading about it. The decisions around TTLs, invalidation logic, and cache topology are highly specific to your data, your traffic patterns, and your architecture. If your team is still figuring this out, it is worth it to hire REST API developers who have dealt with these problems across different systems; they can save you significant time and help you avoid painful production incidents.