Ezeana Micheal

Posted on May 22

How I Cut API Response Time from 2s to <100ms with Redis Caching

#webdev #programming #backend #redis

Early in my software development years, I had the opportunity to work with a company where I learned backend development. I worked on a system where I was responsible for building the APIs without senior guidance, just documentation, experimentation, and a lot of self-learning.

When I was learning to design DB models and RESTful APIs, that was it for me: connect everything, let data out via GET requests, let it in via POST requests, patch things up with PATCH requests, and delete via DELETE requests.

Basic CRUD operations, and I was fine with that back then. Eventually, I decided to explore the site myself, and damn, it was slow.

Every interaction came with a noticeable delay. So I stopped assuming things were fine and tested the APIs properly. Using Postman, I measured response times. Most endpoints were taking over 5 seconds.

The Problem

Every request was hitting the database directly, even when requesting the same data repeatedly.

The system wasn’t slow because the database was bad.

It was slow because it was doing the same work over and over again.

What Changed?

I did some research and found something powerful: caching, with Redis as the cache store.

What is Caching?

Caching stores frequently accessed data in memory (Redis keeps its data in RAM), which allows much faster retrieval than querying a database repeatedly. It also reduces database load: instead of hitting the database every time, the API hits the cache and returns the stored result.

Types of Caching (and Why Redis Fits Here)

Not all caching works the same way. In my case, I implemented application-level caching with Redis, but it helps to understand where it sits in the bigger picture.

1. Application-Level Caching

This is what I used. The application stores frequently accessed data in a fast in-memory store like Redis.

Instead of always asking the database:

API → Database → Response

We first check:

API → Redis → Database (only if needed)

This is the most common approach in backend systems because it gives full control over what gets cached and when it gets updated.

2. Database Caching

Some databases internally cache query results or use external layers to store repeated query outputs.

This reduces repeated expensive queries, but it is less flexible compared to Redis-based caching where you control the logic directly inside your API.

3. Distributed Caching

This is when caching is shared across multiple servers using systems like Redis Cluster or Memcached.

It becomes important when your application is no longer running on a single server, but across multiple services or microservices.

So, in my case, the request flow changed like this.

Before:

Client → API → DB (every request)

After:

Client → API → Redis → DB (only on cache miss)

Instead of querying the database every time, the API now checks Redis first. If the data exists, it returns immediately. If not, it fetches from the database, stores the result in Redis, and returns it.
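The read flow described above can be sketched roughly like this. It is a minimal illustration, not the code from my project: `InMemoryCache` is a stand-in so the sketch runs without a Redis server (a real redis-py client exposes `get` and `set` with the same basic shape), and `get_book`, `load_from_db`, and the `books:<id>` key format are names assumed for the example.

```python
import json


class InMemoryCache:
    """Stand-in for a Redis client so the sketch runs without a server."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def get_book(cache, book_id, load_from_db):
    """Return a book, checking the cache first and the database only on a miss."""
    key = f"books:{book_id}"  # assumed key format
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)  # cache hit: no database work
    book = load_from_db(book_id)  # cache miss: query the database
    cache.set(key, json.dumps(book))  # store the result for the next request
    return book


# Usage: record how often the hypothetical database loader is called.
db_calls = []


def load_from_db(book_id):
    db_calls.append(book_id)
    return {"id": book_id, "title": "Example"}


cache = InMemoryCache()
first = get_book(cache, 1, load_from_db)   # miss: goes to the database
second = get_book(cache, 1, load_from_db)  # hit: served from the cache
```

The second call never reaches the loader, which is the whole point: the repeated work is skipped.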

[Screenshot: API response time without caching]

[Screenshot: API response time with caching]

I implemented two main strategies:

Key-value caching
Cache invalidation via write

In my API, I implemented key-value caching for every GET request, i.e. GET /books/ and GET /books/:id.
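One way to picture key-value caching for those two endpoints is a small function that turns a request path into a cache key. This is only a sketch: the `cache_key` helper and the `books:all` / `books:<id>` naming are assumptions for illustration, not the exact keys I used.

```python
def cache_key(path: str) -> str:
    """Map a GET path such as /books/ or /books/42 to a cache key."""
    parts = [p for p in path.split("/") if p]
    if not parts:
        raise ValueError("path must name a resource")
    if len(parts) == 1:
        return f"{parts[0]}:all"  # list endpoint, e.g. /books/
    return ":".join(parts)        # detail endpoint, e.g. /books/42


list_key = cache_key("/books/")
detail_key = cache_key("/books/42")
```

Keeping key construction in one place helps the read path and the invalidation path agree on the same names.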

Eliminating First-Request Slowness (Cache Warm-Up)

One downside of caching is the initial delay when the cache is empty.

How did I solve that? With a script.

I added a cache warm-up script that runs on deployment and preloads frequently accessed data into Redis. That way, the system starts “warm,” and users don’t experience the initial latency.
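A warm-up script along those lines might look like the sketch below. The names are hypothetical: `warm_cache` and `load_all_books` are made up for the example, `InMemoryCache` stands in for a Redis client so the code runs without a server, and the key names are assumed.

```python
import json


class InMemoryCache:
    """Stand-in for a Redis client so the sketch runs without a server."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def warm_cache(cache, load_all_books):
    """Preload the list and each individual book; return how many were cached."""
    books = load_all_books()
    cache.set("books:all", json.dumps(books))  # list endpoint
    for book in books:
        cache.set(f"books:{book['id']}", json.dumps(book))  # detail endpoint
    return len(books)


# Usage: a hypothetical loader standing in for the real database query.
def load_all_books():
    return [{"id": 1, "title": "First"}, {"id": 2, "title": "Second"}]


cache = InMemoryCache()
warmed = warm_cache(cache, load_all_books)
```

Run once after deployment, this means the first real request already finds its data in the cache.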

At that point, the only time the system feels slow is immediately after cache invalidation.

But a cache does more harm than good if it is not invalidated properly, because writes come in too, through POST, PATCH, and PUT requests. So a cache invalidation strategy was needed.

I chose cache invalidation via write for two reasons:

The cached data is refreshed as soon as a PUT, PATCH, POST, or DELETE request comes in, and
The data does not change often. Because of the type of application we’re dealing with, it stays the same for a while, so giving each entry a time to live (TTL) felt like too much.
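Invalidation on write can be sketched as follows: save to the database first, then delete the affected keys so the next read repopulates them. As before, this is an assumption-laden illustration: `update_book` and `save_to_db` are hypothetical names, the keys are assumed, and `InMemoryCache` stands in for a Redis client (redis-py's `delete` also accepts several keys at once).

```python
import json


class InMemoryCache:
    """Stand-in for a Redis client so the sketch runs without a server."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def delete(self, *keys):
        for key in keys:
            self._data.pop(key, None)


def update_book(cache, book_id, changes, save_to_db):
    """Write to the database, then drop the cached entries that are now stale."""
    updated = save_to_db(book_id, changes)  # the database stays the source of truth
    cache.delete(f"books:{book_id}", "books:all")  # detail key and list key
    return updated


# Usage: start with stale entries, then apply a write.
cache = InMemoryCache()
cache.set("books:1", json.dumps({"id": 1, "title": "Old"}))
cache.set("books:all", json.dumps([{"id": 1, "title": "Old"}]))


def save_to_db(book_id, changes):
    return {"id": book_id, **changes}


updated = update_book(cache, 1, {"title": "New"}, save_to_db)
```

The list key is deleted along with the detail key because the list also contains the changed book.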

Trade-offs

Every system has trade-offs; there is no perfect system. These were the trade-offs of my decision:

First request after invalidation is slower
Cache keys must be managed carefully
Slight increase in system complexity

But the system is now significantly faster and more efficient.

PS: There are several optimization techniques in backend development; this article is just about caching.

Thanks for reading. What do you think about the approach? Read, like, and comment.

DEV Community