Pravesh Sudha

for AWS Community Builders

Posted on Jun 1 • Edited on Jul 8

🤔 If Data Survives in Deployments, Why Do We Need StatefulSets?

#automation #beginners #devops #kubernetes

Hola Amigos 👋

Welcome to the second episode of K8s with Pravesh. If you're new to the series, check out the first episode where we learned how Deployments and Services work under the hood.

Most people believe that StatefulSets are needed because Deployments lose data. But is that actually true?

In today's blog, we'll answer this question by exploring StatefulSets vs Deployments through a practical hands-on demonstration.

🚀 Getting Started

Head over to my GitHub repository, fork it under your own GitHub account, and clone the code:

git clone https://github.com/<your-username>/K8s_with_Pravesh.git

Navigate to the project directory:

cd K8s_with_Pravesh/part-02-statefulsets/configs

Here you'll find:

Secrets and ConfigMaps
Deployment manifest
StatefulSet manifest

First, apply the Secrets and ConfigMaps. These contain the MySQL password, database name, and MySQL initialization script.

kubectl apply -f secrets-and-config.yml
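If you're curious what a file like this might hold, here is a rough sketch. It is not the actual secrets-and-config.yml from the repository: the resource names (mysql-secret, mysql-config, mysql-initdb), the key names, and the column types are all assumptions for illustration. Only the password, the crud_app database, and the users columns come from this post.

```shell
# Write an illustrative manifest to a scratch file (hypothetical names throughout).
cat > secrets-and-config-sketch.yml <<'EOF'
apiVersion: v1
kind: Secret
metadata:
  name: mysql-secret            # assumed name
type: Opaque
stringData:                     # stringData lets you write the value unencoded
  MYSQL_ROOT_PASSWORD: Pravesh
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: mysql-config            # assumed name
data:
  MYSQL_DATABASE: crud_app
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: mysql-initdb            # assumed name; would be mounted at /docker-entrypoint-initdb.d
data:
  init.sql: |
    CREATE DATABASE IF NOT EXISTS crud_app;
    USE crud_app;
    CREATE TABLE IF NOT EXISTS users (
      id INT AUTO_INCREMENT PRIMARY KEY,
      name VARCHAR(100),
      email VARCHAR(100),
      password VARCHAR(100)
    );
EOF
```

You could validate the sketch without touching the cluster by running kubectl apply --dry-run=client -f secrets-and-config-sketch.yml.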

🧪 Experiment 1: Deployment + PVC

Now let's apply our Deployment manifest.

This file contains:

Persistent Volume Claim (PVC)
Deployment
Service

Apply it using:

kubectl apply -f deployment.yml

Once the pod is running, we'll create a sample record inside MySQL, delete the pod, and verify whether the data survives.

Get the pod name:

kubectl get pods
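Since a Deployment's pod names carry random suffixes, copying the name by hand gets tedious. A small helper can look it up for you. This sketch assumes the pods carry the label app=mysql, which may not match the labels in deployment.yml, so adjust the selector if needed.

```shell
# Print the name of the first pod matching a label selector.
# The default selector app=mysql is an assumption; check the labels in your manifest.
mysql_pod() {
  kubectl get pods -l "${1:-app=mysql}" \
    -o jsonpath='{.items[0].metadata.name}'
}
```

With that in place, kubectl exec -it "$(mysql_pod)" -- mysql -uroot -p connects without you typing the suffix.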

Connect to MySQL:

kubectl exec -it mysql-XXXXXX-XXXX -- mysql -uroot -p

Enter the password:

Pravesh

(You can change this value in the secrets-and-config.yml file.)

Inside MySQL, run:

USE crud_app;

INSERT INTO users(name,email,password)
VALUES('Pravesh','pravesh@example.com','secret');

Exit MySQL:

exit

Delete the pod:

kubectl delete pod mysql-XXXXXX-XXXX

Wait for Kubernetes to create a replacement pod:

kubectl get pods
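If you'd rather not poll kubectl get pods by hand, kubectl wait can block until the replacement is ready. This is a minimal sketch: the app=mysql selector and the 120s timeout are assumptions, not values taken from the repository.

```shell
# Block until a pod matching the selector reports Ready, or give up after the timeout.
# Both defaults (app=mysql, 120s) are assumptions; pass your own as arguments.
wait_for_mysql() {
  kubectl wait --for=condition=Ready pod -l "${1:-app=mysql}" --timeout="${2:-120s}"
}
```

Call wait_for_mysql right after deleting the pod, and reconnect once it returns.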

Connect to the new pod:

kubectl exec -it mysql-XXXXXX-XXXX -- mysql -uroot -p

Run:

USE crud_app;

SELECT * FROM users;

You'll notice that the data is still there.

Wait... The Data Survived?

At this point, many people expect the data to disappear because we're using a Deployment.

However, the data survived.

So the obvious question becomes:

If data survives in a Deployment, why do we even need StatefulSets?

The answer is simple.

The data survived because the pod mounts a volume backed by a Persistent Volume Claim (PVC), and MySQL writes its data to that volume. The PVC preserved the data, not the Deployment.

In other words:

Pod
 ↓
PVC
 ↓
Persistent Storage

The pod can disappear and be recreated, but the storage remains intact, and the replacement pod simply mounts the same claim again.
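Here is roughly how that wiring looks in a manifest. Treat it as a sketch under assumptions rather than the repository's deployment.yml: the claim name mysql-pvc, the 1Gi size, the mysql:8.0 image, and the Secret name and key are all made up for illustration.

```shell
# Write an illustrative PVC + Deployment pair to a scratch file.
cat > deployment-sketch.yml <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pvc               # assumed name
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi              # assumed size
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - name: mysql
          image: mysql:8.0      # assumed image tag
          env:
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mysql-secret          # assumed Secret name
                  key: MYSQL_ROOT_PASSWORD    # assumed key
          volumeMounts:
            - name: data
              mountPath: /var/lib/mysql       # MySQL's default data directory
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: mysql-pvc              # every replacement pod mounts this same claim
EOF
```

The key line is claimName: the pod template points at a claim that lives outside the pod, so deleting the pod never touches it.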

🧪 Experiment 2: StatefulSet

Now let's perform the same experiment using a StatefulSet.

Apply the StatefulSet manifest:

kubectl apply -f statefulset.yml

Wait for the pod to start:

kubectl get pods

You'll notice that the pod has a predictable name:

mysql-0

Connect to MySQL:

kubectl exec -it mysql-0 -- mysql -uroot -p

Insert a record, just as before:

USE crud_app;

INSERT INTO users(name,email,password)
VALUES('Pravesh','pravesh@example.com','secret');

Exit MySQL:

exit

Delete the pod:

kubectl delete pod mysql-0

Wait for Kubernetes to recreate it:

kubectl get pods

Notice something interesting:

mysql-0

The pod name remains exactly the same.

Connect again:

kubectl exec -it mysql-0 -- mysql -uroot -p

Verify the data:

USE crud_app;

SELECT * FROM users;

The data survived once again.

🤨 So What Is the Real Difference?

Let's inspect the PVCs:

kubectl get pvc
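To see which claim each pod actually mounts, you can ask kubectl for a custom column view. This is a small sketch and the helper name pod_claims is my own. Pods that mount no PVC may show an empty or <none> value in the CLAIM column.

```shell
# List every pod in the current namespace next to the PVC(s) its volumes reference.
pod_claims() {
  kubectl get pods \
    -o custom-columns='POD:.metadata.name,CLAIM:.spec.volumes[*].persistentVolumeClaim.claimName'
}
```

Run pod_claims alongside kubectl get pvc to compare the Deployment pod and the StatefulSet pod side by side.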

With a StatefulSet that declares volumeClaimTemplates, each replica gets its own dedicated PVC, named after the template and the pod.

For example, with three replicas:

mysql-data-mysql-0
mysql-data-mysql-1
mysql-data-mysql-2

This creates a stable relationship between a pod and its storage.
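The per-replica claims come from the volumeClaimTemplates section of the StatefulSet. Below is a minimal sketch, not the repository's statefulset.yml: the template name mysql-data is inferred from the PVC names above, while the serviceName, image tag, and storage size are assumptions. The replicas: 3 setting is only there to show the naming; three independent MySQL pods do not form a replicated database on their own.

```shell
# Write an illustrative StatefulSet to a scratch file.
cat > statefulset-sketch.yml <<'EOF'
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql            # headless Service for pod DNS names (assumed name)
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - name: mysql
          image: mysql:8.0      # assumed image tag
          volumeMounts:
            - name: mysql-data
              mountPath: /var/lib/mysql
  volumeClaimTemplates:         # one PVC is created from this template per replica
    - metadata:
        name: mysql-data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi        # assumed size
EOF

# PVC names follow <template name>-<StatefulSet name>-<ordinal>.
pvc_name() { printf '%s-%s-%s\n' "$1" "$2" "$3"; }
for i in 0 1 2; do pvc_name mysql-data mysql "$i"; done
```

The loop prints the same three names listed above, which is why mysql-0 always finds its way back to mysql-data-mysql-0.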

StatefulSets manage a group of pods while maintaining a sticky identity for each pod. Unlike the pods of a Deployment, StatefulSet pods are not interchangeable.

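A StatefulSet gives each pod:

A stable hostname derived from its ordinal (mysql-0, mysql-1, and so on)
A stable network identity, provided through a headless Service
Its own persistent storage
A fixed place in the startup order, with termination in reverse order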

Even if a pod is rescheduled onto another node, it keeps its name, its DNS entry, and its PVC.
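The stable network identity takes the form of a per-pod DNS name under the StatefulSet's headless Service. Here is a tiny sketch of how that name is put together. The Service name mysql, the default namespace, and the cluster.local domain are assumptions, so swap in your own values.

```shell
# Build the stable DNS name of a StatefulSet pod:
#   <pod>.<headless service>.<namespace>.svc.<cluster domain>
# Namespace and cluster domain fall back to common defaults (assumptions).
pod_fqdn() {
  printf '%s.%s.%s.svc.%s\n' "$1" "$2" "${3:-default}" "${4:-cluster.local}"
}

pod_fqdn mysql-0 mysql
# prints: mysql-0.mysql.default.svc.cluster.local
```

Other pods can keep using that name across restarts, which is exactly what clustered databases rely on to find a specific peer.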

When Should You Use StatefulSets?

StatefulSets are valuable when your application requires:

Stable, unique network identifiers
Stable, persistent storage
Ordered, graceful deployment and scaling
Ordered rolling updates

Common examples include:

MySQL
PostgreSQL
Kafka
ZooKeeper
Redis Cluster
Elasticsearch

Deployment vs StatefulSet

Feature	Deployment	StatefulSet
Data Persistence	✅ With PVC	✅ With PVC
Stable Pod Name	❌	✅
Stable Network Identity	❌	✅
Dedicated Storage Per Replica	❌	✅
Ordered Startup/Shutdown	❌	✅

Conclusion

A common misconception is that StatefulSets exist because Deployments cannot persist data. As we saw in this blog, that isn't entirely true.

A Deployment can preserve data just fine when paired with a Persistent Volume Claim. The PVC is responsible for data persistence, not the Deployment itself.

The real strength of StatefulSets lies in providing stable identities, predictable networking, dedicated storage per replica, and ordered deployment behavior. These features make StatefulSets the ideal choice for databases and other stateful distributed systems.

If you're running stateless applications such as frontends, REST APIs, or other stateless microservices, Deployments are usually the right choice. But when your workloads require stable identities and per-replica persistent state, StatefulSets become essential.

I hope this hands-on comparison helped clarify the difference between Deployments and StatefulSets.

See you in the next episode of K8s with Pravesh! 🚀