In the previous post, I wrote about RESP — the protocol layer that lets a Redis server understand commands coming over TCP.
But parsing a command is only the first step.
Once the server receives something like:
SET name Alice
and the RESP parser converts it into:
["SET", "name", "Alice"]
the next question is:
Where does this data actually live?
That is what this post is about.
In this part of my Redis clone, I built the in-memory storage layer that supports:
- Strings
- Lists
- Hashes
- Key expiry
- Type validation
- Lazy expiry on access
This is the layer where the server starts feeling like an actual database.
Why the Storage Layer Matters
At first, a Redis clone sounds like it could just be:
const store = {};
or:
const store = new Map();
And for the most basic version, that is true.
If all you want is:
SET name Alice
GET name
then a simple key-value map works.
But Redis is not just a plain key-value object.
Redis has data types.
A key can store a string.
Another key can store a list.
Another key can store a hash.
Some keys may expire after 10 seconds.
Some commands should only work on specific data types.
So the storage layer needs to track more than just:
key -> value
It needs to track:
key -> value
key -> type
key -> expiry time
That is what turns a simple JavaScript object into a Redis-like in-memory database.
The Basic Data Model
The simplest mental model for the store is:
database:
name -> Alice
queue -> [task1, task2, task3]
user:1 -> { name: Bob, age: 30 }
expiry:
session -> expires at timestamp
So internally, the system has two major responsibilities:
- Store the actual data.
- Track when keys should expire.
I kept these concerns separate.
The database module focuses on storing and retrieving values.
The expiry module focuses on TTL metadata.
This separation makes the design easier to reason about.
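As a rough sketch of that split (the names `database` and `expiry` are my own placeholders here, not necessarily what the repo uses), the two concerns can be two plain Maps:

```javascript
// Hypothetical layout: one Map for typed values, one Map for expiry timestamps.
const database = new Map(); // key -> { type, value }
const expiry = new Map();   // key -> absolute expiry time in ms since epoch

database.set("name", { type: "string", value: "Alice" });
database.set("queue", { type: "list", value: ["task1", "task2", "task3"] });
database.set("user:1", {
  type: "hash",
  value: new Map([["name", "Bob"], ["age", "30"]]),
});

// Only keys that have a TTL get an entry in the expiry Map.
database.set("session", { type: "string", value: "abc" });
expiry.set("session", Date.now() + 60 * 1000);
```

Keys without a TTL never appear in `expiry`, so a missing entry simply means "does not expire".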
Strings
Strings are the simplest Redis data type.
Example:
SET name Alice
GET name
The server stores:
name -> Alice
When the client runs:
GET name
the server checks:
- Does the key exist?
- Has the key expired?
- Is the key storing a string?
- If yes, return the value.
The response is encoded back in RESP format.
For example, if the value is Alice, the server returns:
$5\r\nAlice\r\n
If the key does not exist, it returns a null bulk string:
$-1\r\n
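Both replies can come from one small encoder. This is a minimal sketch assuming Node.js (for the global `Buffer`); `encodeBulkString` is a hypothetical helper name:

```javascript
// Encode a value as a RESP bulk string; null/undefined becomes the null bulk string.
function encodeBulkString(value) {
  if (value === null || value === undefined) {
    return "$-1\r\n";
  }
  // RESP lengths count bytes, not characters, so measure the UTF-8 size.
  const byteLength = Buffer.byteLength(value, "utf8");
  return `$${byteLength}\r\n${value}\r\n`;
}
```

Measuring bytes instead of `value.length` matters as soon as a value contains non-ASCII characters.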
The string data type is simple, but it becomes important because many other commands depend on the same storage rules:
- type checking
- expiry checking
- persistence
- replication
- response encoding
A simple GET command still passes through multiple layers of the system.
SET with Expiry
Redis allows setting keys with expiry:
SET session abc EX 60
This means:
Store session = abc
Expire it after 60 seconds
In my Redis clone, the value is stored in the main database, while the expiry timestamp is stored separately.
So conceptually:
database:
session -> abc
expiry:
session -> currentTime + 60 seconds
This separation makes expiry easier to manage.
The value does not need to know about its own expiry.
The expiry system handles that concern.
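Here is a simplified sketch of a SET handler along those lines. It only understands the `EX` option, `handleSet` and its injectable `now` parameter are hypothetical, and the error strings are modeled on Redis's wording rather than copied from the clone:

```javascript
const database = new Map(); // key -> { type, value }
const expiry = new Map();   // key -> absolute expiry time in ms

// args is everything after the command name, e.g. ["session", "abc", "EX", "60"].
function handleSet(args, now = Date.now()) {
  if (args.length !== 2 && args.length !== 4) {
    return "-ERR wrong number of arguments for 'set' command\r\n";
  }
  const [key, value, option, rawSeconds] = args;
  let expiresAt = null;
  if (option !== undefined) {
    if (option.toUpperCase() !== "EX") return "-ERR syntax error\r\n";
    const seconds = Number(rawSeconds);
    if (!Number.isInteger(seconds) || seconds <= 0) {
      return "-ERR invalid expire time in 'set' command\r\n";
    }
    expiresAt = now + seconds * 1000;
  }
  database.set(key, { type: "string", value });
  if (expiresAt === null) {
    expiry.delete(key); // a plain SET clears any TTL left by an earlier SET ... EX
  } else {
    expiry.set(key, expiresAt);
  }
  return "+OK\r\n";
}
```

Storing an absolute timestamp rather than the remaining seconds means nothing has to tick down; a later check only compares against the current time.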
Lazy Expiry
There are two common ways to expire keys:
- Active expiry
- Lazy expiry
Active expiry means a background process keeps scanning for expired keys and deletes them.
Lazy expiry means the server checks whether a key has expired only when someone tries to access it.
For example:
SET token abc EX 10
After 10 seconds, the key is expired.
But instead of deleting it immediately in the background, the server can wait until a client runs:
GET token
At that moment, the server checks:
Does token have an expiry timestamp?
Is the current time greater than the expiry timestamp?
If yes, delete token and return null.
So the flow becomes:
GET token
↓
check expiry
↓
expired?
↓
delete key + expiry metadata
↓
return null
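This approach keeps the system simpler, at the cost of memory: an expired key that is never read again stays in the store until something touches it.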
It also taught me an important database design idea:
Sometimes cleanup does not have to happen immediately. It just has to happen before the data is observed.
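The check-on-access flow above fits in a few lines. This is a sketch with hypothetical names (`expireIfNeeded`, `get`), and the `now` parameter exists only so the behavior can be tested without waiting:

```javascript
const database = new Map(); // key -> { type, value }
const expiry = new Map();   // key -> absolute expiry time in ms

// Returns true if the key had expired and was removed.
function expireIfNeeded(key, now = Date.now()) {
  const expiresAt = expiry.get(key);
  // No TTL, or the current time has not passed the timestamp yet: keep the key.
  if (expiresAt === undefined || now <= expiresAt) return false;
  database.delete(key);
  expiry.delete(key);
  return true;
}

function get(key, now = Date.now()) {
  expireIfNeeded(key, now); // lazy expiry happens right before the read
  const entry = database.get(key);
  return entry === undefined ? null : entry.value;
}
```

Every other read or write path would call the same `expireIfNeeded` first, so an expired value is never observed.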
Why Expiry Affects Everything
Expiry sounds like a small feature, but it touches many parts of the database.
For example:
GET
Before returning a key, the server must check whether it is expired.
DEL
When a key is deleted, its expiry metadata should also be removed.
FLUSHALL
When the database is cleared, the expiry store should also be cleared.
Persistence
If a key has expiry metadata, that information needs to be preserved or handled correctly during save/load.
Replication
If the master writes a key with expiry, the replica needs to receive the same write, including its expiry.
Sandbox
If a key has TTL, the UI should show the countdown clearly.
So expiry is not just an extra field.
It becomes a cross-cutting concern.
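For DEL and FLUSHALL, the rule is simply that both stores change together. A minimal sketch with hypothetical function names (it skips the lazy-expiry check for brevity):

```javascript
const database = new Map(); // key -> { type, value }
const expiry = new Map();   // key -> absolute expiry time in ms

// DEL: remove each key plus its TTL; return how many keys were present in the database.
function del(...keys) {
  let removed = 0;
  for (const key of keys) {
    if (database.delete(key)) removed += 1;
    expiry.delete(key); // never leave TTL metadata behind for a deleted key
  }
  return removed;
}

// FLUSHALL: clear values and TTL metadata together.
function flushAll() {
  database.clear();
  expiry.clear();
}
```

If DEL forgot the `expiry.delete` call, a later SET on the same key could inherit a stale TTL and vanish unexpectedly.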
Lists
The next data type I implemented was Lists.
Redis lists are useful for queues, stacks, timelines, and task buffers.
Supported commands include:
LPUSH queue a b c
RPUSH queue d e
LPOP queue
RPOP queue
LLEN queue
LRANGE queue 0 -1
A list is stored internally as an ordered array-like structure.
Example:
LPUSH queue task1
LPUSH queue task2
The list becomes:
queue -> [task2, task1]
This is because LPUSH inserts at the left end (the head) of the list.
If we run:
RPUSH queue task3
the list becomes:
queue -> [task2, task1, task3]
This was useful because it forced me to think about command semantics.
LPUSH and RPUSH sound similar, but they mutate different ends of the list.
LPOP and RPOP also remove from different ends.
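With a plain JavaScript array as the backing structure (an assumption for this sketch; index 0 is the left end), the four operations map onto `unshift`, `push`, `shift`, and `pop`:

```javascript
// Push each value onto the head in turn, so the last argument ends up first.
function lpush(list, ...values) {
  for (const value of values) list.unshift(value);
  return list.length;
}

// Append values to the tail in the order given.
function rpush(list, ...values) {
  list.push(...values);
  return list.length;
}

// Pops return null on an empty list instead of undefined.
function lpop(list) {
  return list.length === 0 ? null : list.shift();
}

function rpop(list) {
  return list.length === 0 ? null : list.pop();
}
```

One detail that is easy to miss: because values are pushed one at a time, `LPUSH queue a b c` leaves the list as `[c, b, a]`, not `[a, b, c]`.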
LRANGE and Index Handling
One of the more interesting list commands is:
LRANGE queue 0 -1
This means:
Return all elements from index 0 to the last element.
Redis supports negative indexes.
So:
LRANGE queue -2 -1
means:
Return the last two elements.
That means the clone needs to normalize indexes.
The server has to convert:
start = -2
stop = -1
into real array indexes based on the list length.
This small feature makes the implementation more realistic.
It is not just pushing and popping.
It is matching Redis-like behavior.
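The normalization can be sketched as below. The function assumes `start` and `stop` have already been parsed into integers, and `lrange` is a hypothetical helper operating on a plain array:

```javascript
// Return the elements between start and stop (both inclusive), Redis-style.
function lrange(list, start, stop) {
  const length = list.length;
  // Negative indexes count from the end: -1 is the last element.
  if (start < 0) start = Math.max(length + start, 0);
  if (stop < 0) stop = length + stop;
  // Out-of-range stop is clamped to the last element.
  if (stop >= length) stop = length - 1;
  if (start > stop || start >= length) return [];
  // Array.prototype.slice excludes its end index, so add 1 to include stop.
  return list.slice(start, stop + 1);
}
```

Out-of-range indexes are clamped or produce an empty result rather than an error, which is how `LRANGE queue 0 100` can safely return a three-element list.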
Hashes
Hashes are another important Redis data type.
They let you store field-value pairs under a single key.
Example:
HSET user:1 name Bob age 30
HGET user:1 name
HGETALL user:1
Conceptually:
user:1 -> {
name: Bob,
age: 30
}
This is useful for storing object-like data.
In the clone, hashes support commands like:
HSET user:1 name Bob age 30
HGET user:1 name
HDEL user:1 age
HGETALL user:1
HLEN user:1
HEXISTS user:1 name
The interesting part here was handling multiple fields in one command.
For example:
HSET user:1 name Bob age 30 city Delhi
This command contains multiple field-value pairs.
The command handler needs to validate that fields and values are paired correctly.
If there is a missing value, the command should return an error.
That kind of validation is what makes the command engine feel closer to a real Redis server.
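The pairing check is a simple arity test. A sketch, with a hypothetical `parseFieldValuePairs` helper and an error string modeled on Redis's wording:

```javascript
// args is everything after the key, e.g. ["name", "Bob", "age", "30"].
function parseFieldValuePairs(args) {
  // There must be at least one pair, and every field needs a value.
  if (args.length === 0 || args.length % 2 !== 0) {
    return { error: "ERR wrong number of arguments for 'hset' command" };
  }
  const pairs = [];
  for (let i = 0; i < args.length; i += 2) {
    pairs.push([args[i], args[i + 1]]);
  }
  return { pairs };
}
```

Validating before touching the store means a malformed HSET never leaves a hash half-updated.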
Type Checking
One of the most important parts of the storage layer is type checking.
Imagine this:
SET name Alice
LPUSH name Bob
This should not work.
The key name already stores a string.
A list command should not be allowed on it.
So the server must return a WRONGTYPE error instead of silently converting the value.
This means each key needs an associated type.
Conceptually:
name:
type: string
value: Alice
queue:
type: list
value: [task1, task2]
user:1:
type: hash
value: { name: Bob }
Before executing a command, the handler checks whether the key has the expected type.
For example:
GET expects string
LPUSH expects list
HGET expects hash
If the key does not exist, write commands such as LPUSH and HSET create it, while read commands such as GET return null.
If the key exists with the wrong type, the command returns an error.
This was one of the key differences between a simple map and a Redis-like store.
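One way to centralize the check is a single lookup helper that every write handler calls first. This is a sketch: `getOrCreate` is a hypothetical name, and the error text is the message real Redis uses for this case:

```javascript
const database = new Map(); // key -> { type, value }
const WRONGTYPE =
  "WRONGTYPE Operation against a key holding the wrong kind of value";

// Return the existing entry, create an empty one if the key is missing,
// or return an error if the key already holds a different type.
function getOrCreate(key, expectedType, makeEmpty) {
  let entry = database.get(key);
  if (entry === undefined) {
    entry = { type: expectedType, value: makeEmpty() };
    database.set(key, entry);
    return { entry };
  }
  if (entry.type !== expectedType) return { error: WRONGTYPE };
  return { entry };
}
```

An LPUSH handler would call `getOrCreate(key, "list", () => [])` and bail out with the error before mutating anything.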
Command Flow Example: SET
Let’s walk through what happens when a client sends:
SET name Alice EX 60
The flow looks like this:
RESP parser receives raw bytes
↓
Parser emits ["SET", "name", "Alice", "EX", "60"]
↓
Command router identifies SET
↓
SET handler validates arguments
↓
Database stores name = Alice
↓
Type is marked as string
↓
Expiry store records TTL
↓
AOF persistence records the write
↓
RDB snapshot is updated
↓
Replication layer can propagate the write
↓
Server returns +OK
So one command touches:
- protocol parsing
- command validation
- storage
- expiry
- persistence
- replication
- response encoding
That is why even “simple” Redis commands are great for learning systems design.
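The routing step in that flow can be sketched as a lookup table from command name to handler. Everything here is a simplified assumption (no expiry, persistence, or argument validation, and Node.js for `Buffer`); it only shows how a parsed array reaches a handler and comes back as a RESP reply:

```javascript
const store = new Map();

// Hypothetical handler table: command name -> function(args) returning a RESP reply.
const handlers = new Map([
  ["SET", ([key, value]) => {
    store.set(key, value);
    return "+OK\r\n";
  }],
  ["GET", ([key]) => {
    const value = store.get(key);
    if (value === undefined) return "$-1\r\n";
    return `$${Buffer.byteLength(value)}\r\n${value}\r\n`;
  }],
]);

// command is the parser's output, e.g. ["SET", "name", "Alice"].
function dispatch(command) {
  const [name, ...args] = command;
  const handler = handlers.get(name.toUpperCase()); // command names are case-insensitive
  if (handler === undefined) return `-ERR unknown command '${name}'\r\n`;
  return handler(args);
}
```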
Command Flow Example: LPUSH
Now take:
LPUSH queue task1 task2
The server does:
Parse command
↓
Check whether queue exists
↓
If it does not exist, create a new list
↓
If it exists, verify it is a list
↓
Push values to the left
↓
Persist the write
↓
Return the new list length
If queue does not exist, it is created.
If queue already stores a list, the command mutates it.
If queue stores a string or hash, the command returns a type error.
This type behavior keeps the database predictable.
Command Flow Example: HSET
For:
HSET user:1 name Bob age 30
the server does:
Parse command
↓
Check whether user:1 exists
↓
If not, create a hash
↓
Validate field-value pairs
↓
Set name = Bob
↓
Set age = 30
↓
Return number of new fields added
Hashes were interesting because they introduced nested storage.
A string stores one value.
A list stores an ordered sequence.
A hash stores a map inside the main map.
That means the storage layer becomes:
database map
↓
key
↓
hash map
↓
field -> value
This helped me understand why Redis data types are powerful.
They let you model different kinds of data without leaving the key-value model.
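The "map inside the main map" shape, together with the new-field count that HSET returns, might look like this. It is a sketch with a hypothetical `hset` function that takes already-validated `[field, value]` pairs:

```javascript
const database = new Map(); // key -> { type, value }

function hset(key, pairs) {
  let entry = database.get(key);
  if (entry === undefined) {
    // First write to this key: create the inner map.
    entry = { type: "hash", value: new Map() };
    database.set(key, entry);
  } else if (entry.type !== "hash") {
    return { error: "WRONGTYPE Operation against a key holding the wrong kind of value" };
  }
  let added = 0;
  for (const [field, value] of pairs) {
    // Only fields that did not exist before count toward the reply.
    if (!entry.value.has(field)) added += 1;
    entry.value.set(field, value);
  }
  return { added };
}
```

Overwriting an existing field still updates its value; it just does not increase the returned count.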
Storage and Persistence
The storage layer does not live alone.
Every successful write needs to be persisted.
For example:
SET name Alice
LPUSH queue task1
HSET user:1 name Bob
These commands should update memory, but they should also be recorded by the persistence system.
That way, if the server restarts, the database can be rebuilt.
This is where AOF and RDB connect to storage.
The storage layer owns the current state.
The persistence layer records or snapshots that state.
The relationship looks like:
Command handler
↓
Storage update
↓
AOF append
↓
RDB snapshot
This made the architecture cleaner because storage does not need to know how commands arrived.
It only needs to expose operations that command handlers can use.
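To illustrate the "update memory, then record the write" relationship, here is a toy append-only log. The array `aofLines` stands in for a file on disk, and JSON is used as the line format purely as an assumption for this sketch; the clone's actual AOF format may differ:

```javascript
const aofLines = []; // stand-in for lines appended to a file on disk

// Apply one write command to a store. Only SET and DEL are handled in this sketch.
function applyCommand(store, [name, key, value]) {
  if (name === "SET") store.set(key, value);
  if (name === "DEL") store.delete(key);
}

function executeWrite(store, command) {
  applyCommand(store, command);           // 1. update memory
  aofLines.push(JSON.stringify(command)); // 2. record the write
}

// Rebuild state after a restart by replaying every logged command in order.
function replay(lines) {
  const rebuilt = new Map();
  for (const line of lines) applyCommand(rebuilt, JSON.parse(line));
  return rebuilt;
}
```

Because replay goes through the same `applyCommand` as live traffic, the rebuilt store matches the original as long as every write was logged.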
Storage and Replication
Replication also depends on successful writes.
When a write happens on the master, it needs to be propagated to replicas.
So after a command updates storage, the server can send the same write command to replica connections.
The flow becomes:
Client writes to master
↓
Master updates in-memory store
↓
Master persists write
↓
Master sends command to replicas
↓
Replicas apply the same write
This is where command design matters.
If the command is represented clearly, it can be reused for:
- local execution
- persistence replay
- replication propagation
That was an important architectural learning.
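A toy version of that propagation, with replicas modeled as in-process objects instead of socket connections (all names here are hypothetical, and only SET is handled):

```javascript
// The same function serves local execution and replicated execution.
function applyCommand(store, [name, key, value]) {
  if (name === "SET") store.set(key, value);
}

// A replica here is just an object with its own store; a real one sits behind a socket.
function createReplica() {
  const store = new Map();
  return { store, receive: (command) => applyCommand(store, command) };
}

function createMaster(replicas) {
  const store = new Map();
  return {
    store,
    write(command) {
      applyCommand(store, command); // update the master first
      for (const replica of replicas) replica.receive(command); // then fan out
    },
  };
}
```

The point of the sketch is that the master forwards the command itself, not the resulting state, so replicas run exactly the code path the master ran.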
Storage and the React Sandbox
The React sandbox made the storage layer much easier to understand visually.
Instead of only seeing:
+OK
after running a command, the sandbox shows the internal changes.
For example:
SET name Alice EX 60
updates:
Database tab
Expiry tab
AOF log
RDB snapshot
For a list command:
LPUSH queue task1 task2
the database view shows the list changing.
For a hash command:
HSET user:1 name Bob age 30
the database view shows the hash fields.
This made the project more explainable.
It is one thing to say “the database changed.”
It is much better to see how it changed.
What I Learned
1. A key-value store is not always simple
At the beginning, I thought the storage layer would be the easiest part.
But once data types, expiry, type checking, persistence, and replication are added, it becomes much more interesting.
2. Types matter
Redis commands are simple because Redis is strict about what each key contains.
A list command should not work on a string.
A hash command should not work on a list.
That strictness makes the system predictable.
3. Expiry is a system-wide concern
TTL is not just attached to SET.
It affects reads, deletes, persistence, replication, and UI visualization.
4. Command semantics are important
Commands like LPUSH, RPUSH, LRANGE, HSET, and HGETALL all have small behavior details that need to be handled carefully.
5. Separating storage and expiry keeps the design clean
The database stores values.
The expiry module stores TTL metadata.
This makes both parts easier to test and reason about.
6. A database is a collection of layers
The in-memory store is only one layer.
It sits between protocol parsing, command execution, persistence, replication, and client responses.
Final Thought
Before building this, I thought of Redis mainly as:
key -> value
After building the storage layer, I started seeing it more like:
key -> typed value
key -> expiry metadata
commands -> validated mutations
writes -> persistence + replication
That shift matters.
A Redis-like database is not just a map.
It is a carefully designed system where every command has rules, every key has type behavior, and every write can affect persistence, replication, and expiry.
That is what made this part of the project so valuable.
In the next post, I’ll go deeper into persistence: how AOF and RDB work, why both exist, and what I learned while implementing them.
Repo:
https://github.com/Abhinov007/redis_clone
Live sandbox:
https://redis-clone.vercel.app/