MaxHuo

Posted on Jun 16

Most People Misunderstand Object Storage (Here’s the Mental Model That Actually Helps)

#cloud #rust #distributedsystems #s3

If you’ve used S3, MinIO, or any cloud storage API, it’s easy to assume object storage is just a “cloud folder system.”

That assumption is wrong, and it leads to confusion once you start working with distributed systems.

Object storage is not a file system.

It’s closer to a distributed key-value system with strong durability guarantees and a very specific access model.

Once you understand that shift, a lot of cloud infrastructure starts to make more sense.

The mental model most people start with

When people first see object storage, they imagine something like this:

/photos/cats.png
/photos/dogs.png

A hierarchical file system:

folders
subfolders
files inside directories

This is how traditional file systems like ext4 or NTFS work.

But object storage doesn’t actually work this way.

The actual model: key → object

Object storage is much simpler at its core:

key → value

Example:

key: photos/cats.png
value: <binary data>

There are no real folders.

“Folders” are just string prefixes in the key, used for organization.

That’s it.
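To make the prefix idea concrete, here is a minimal sketch in Python. It models a bucket as a plain dictionary; the names `bucket`, `keys_with_prefix`, and `top_level_names` are made up for illustration and do not come from any real storage API.

```python
# A bucket modeled as one flat mapping from key strings to bytes.
# The "/" characters carry no special meaning to the store itself.
bucket = {
    "photos/cats.png": b"<cat bytes>",
    "photos/dogs.png": b"<dog bytes>",
    "notes.txt": b"hello",
}


def keys_with_prefix(store, prefix):
    """Return the keys that start with `prefix`, in sorted order."""
    return sorted(k for k in store if k.startswith(prefix))


def top_level_names(store, delimiter="/"):
    """Group keys by their first segment to imitate a folder view."""
    names = set()
    for key in store:
        head, sep, _rest = key.partition(delimiter)
        # "photos/" for nested keys, the full key when there is no delimiter
        names.add(head + sep)
    return sorted(names)


photos = keys_with_prefix(bucket, "photos/")
folder_view = top_level_names(bucket)
```

Note that `"photos/"` never exists as an entry in `bucket`. The "folder" only appears when a client groups keys by a delimiter.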

Why this design exists

This model isn’t accidental. It solves real distributed system problems.

Traditional file systems struggle when you try to:

scale across many machines
replicate data reliably
handle partial failures
coordinate metadata changes at scale

Object storage avoids many of these problems by simplifying the model.

Instead of supporting complex file operations, it focuses on:

store object
retrieve object
delete object
list objects by prefix

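That is essentially the whole core API. Real services add extras such as metadata lookups and multipart uploads, but there is no rename, no append, and no in-place edit.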
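As a rough illustration of how small that surface is, the four operations fit in a toy in-memory class. `InMemoryObjectStore` and its method names are hypothetical; real systems expose these operations over HTTP and add durability, authentication, and pagination on top.

```python
class InMemoryObjectStore:
    """Hypothetical toy store exposing only the four core operations."""

    def __init__(self):
        self._objects = {}  # key (str) -> value (bytes)

    def put(self, key, data):
        # Store the whole object; there is no partial write or append.
        self._objects[key] = bytes(data)

    def get(self, key):
        # Retrieve the whole object, or raise KeyError if the key is absent.
        return self._objects[key]

    def delete(self, key):
        # Deleting a missing key is treated as a no-op in this sketch.
        self._objects.pop(key, None)

    def list(self, prefix=""):
        # Return matching keys in sorted order.
        return sorted(k for k in self._objects if k.startswith(prefix))


store = InMemoryObjectStore()
store.put("photos/cats.png", b"\x89PNG...")
store.put("logs/app.txt", b"started")
store.delete("logs/app.txt")
remaining = store.list()
```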

The most important design choice: immutability

In most object storage systems:

Objects are not modified in place.

If you “update” an object, what actually happens is:

upload a new object
replace the key pointer
old object becomes orphaned (eventually cleaned up, or kept as a previous version if versioning is enabled)
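The three steps above can be sketched as follows. This is a simplified model, not how any particular system stores data: `PointerSwapStore`, the integer blob ids, and `collect_garbage` are all assumptions made for the example.

```python
import itertools


class PointerSwapStore:
    """Hypothetical sketch: keys point at immutable blobs."""

    def __init__(self):
        self._blobs = {}     # blob id -> immutable bytes
        self._pointers = {}  # key -> blob id currently visible to readers
        self._next_id = itertools.count(1)

    def put(self, key, data):
        blob_id = next(self._next_id)
        self._blobs[blob_id] = bytes(data)  # 1. upload a new object
        self._pointers[key] = blob_id       # 2. replace the key pointer
        return blob_id

    def get(self, key):
        return self._blobs[self._pointers[key]]

    def collect_garbage(self):
        # 3. remove blobs that no key points at any more
        live = set(self._pointers.values())
        orphaned = [b for b in self._blobs if b not in live]
        for blob_id in orphaned:
            del self._blobs[blob_id]
        return orphaned


store = PointerSwapStore()
first = store.put("report.csv", b"v1")
second = store.put("report.csv", b"v2")
current = store.get("report.csv")
removed = store.collect_garbage()
```

A reader calling `get` sees either the old bytes or the new bytes in full, never a mix, because the only thing that changes is which blob the key points at.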

This is a huge shift from file systems.

Why this matters

Immutability makes distributed systems easier because:

no in-place edit conflicts: concurrent writers to the same key each produce a complete object, and one of them wins (commonly the last writer)
replication becomes simpler
caching becomes safer
failure recovery is easier to reason about
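The caching point is perhaps the easiest one to show in code. If every version of an object has its own identifier, a cache keyed by key plus identifier never has to be invalidated. In this sketch `version_tag` uses a SHA-256 digest as a stand-in identifier, and `VersionedCache` is a hypothetical helper; real services expose their own version ids or ETags.

```python
import hashlib


def version_tag(data):
    """Hypothetical version identifier: a SHA-256 digest of the bytes."""
    return hashlib.sha256(data).hexdigest()


class VersionedCache:
    """Cache keyed by (key, version tag); entries never need invalidation."""

    def __init__(self):
        self._entries = {}
        self.hits = 0

    def fetch(self, key, tag, load):
        # `load` is called only when this exact version is not cached yet.
        entry = (key, tag)
        if entry in self._entries:
            self.hits += 1
        else:
            self._entries[entry] = load()
        return self._entries[entry]


cache = VersionedCache()
v1, v2 = b"old bytes", b"new bytes"

a = cache.fetch("photos/cats.png", version_tag(v1), lambda: v1)
b = cache.fetch("photos/cats.png", version_tag(v1), lambda: v1)  # cache hit
c = cache.fetch("photos/cats.png", version_tag(v2), lambda: v2)  # new version
```

An "update" shows up as a new tag, so the stale entry is simply never asked for again.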

What object storage optimizes for

Object storage is not trying to be fast at small, latency-sensitive operations such as frequent tiny reads and writes.

It is optimized for:

high durability (data should not be lost)
horizontal scalability
large objects (MBs → TBs)
simple access patterns

This is why it works well for:

backups
media storage
data lakes
AI datasets
logs and archives

Why listing objects feels “expensive”

One confusing thing for newcomers:

Why is listing objects slower than expected?

Because there is no real directory structure.

To list:

photos/

Depending on the implementation, the system has to:

scan the key index for the requested range
match prefixes
aggregate results across storage nodes, usually one page at a time

This is a distributed query problem, not a simple filesystem lookup.
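Here is one way that query could look, as a sketch under strong assumptions: keys are spread across nodes, each node keeps its own keys in a sorted list, and a coordinator merges the per-node results into a page. The functions `list_on_node` and `list_across_nodes` and the `max_keys` parameter are invented for the example; real systems index and paginate in their own ways.

```python
import bisect
import heapq
import itertools


def list_on_node(sorted_keys, prefix):
    """Range-scan one node's sorted key index for keys starting with `prefix`."""
    start = bisect.bisect_left(sorted_keys, prefix)
    matches = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # sorted order means no later key can match
        matches.append(key)
    return matches


def list_across_nodes(nodes, prefix, max_keys=1000):
    """Merge per-node results into one sorted page of at most `max_keys`."""
    per_node = [list_on_node(keys, prefix) for keys in nodes]
    merged = heapq.merge(*per_node)
    return list(itertools.islice(merged, max_keys))


# Three hypothetical storage nodes, each holding part of the key space.
nodes = [
    sorted(["photos/cats.png", "videos/intro.mp4"]),
    sorted(["photos/dogs.png", "docs/readme.txt"]),
    sorted(["photos/birds.png"]),
]

page = list_across_nodes(nodes, "photos/")
first_two = list_across_nodes(nodes, "photos/", max_keys=2)
```

Even in this toy version, one listing touches every node and needs a merge step. A directory lookup on a local file system needs neither.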

Where systems like RustFS fit in

While studying object storage systems, I’ve been looking at designs like RustFS, an open-source distributed object storage system built in Rust.

It becomes easier to understand such systems once you realize:

metadata is the hardest part, not storage
consistency decisions matter more than raw throughput
failure handling defines system behavior more than normal execution paths

We’ll go deeper into these ideas in the next parts.

Key takeaway

If you remember just one thing:

Object storage is not a folder system. It is a distributed key-value system optimized for durability and scale, not navigation and mutation.

Next in this series

In Part 2, I’ll break down:
“Why distributed storage systems need metadata engines (and why they are the hardest part)”

DEV Community