DEV Community

From Buckets to File Systems: Making Amazon S3 Feel Like Home (Without Breaking It)


The Reality Behind “Just Put It in S3”
Let’s be honest: S3 is amazing… until it isn’t.

If you’ve spent any time working in cloud environments, you’ve likely heard the phrase: “Let’s just move it to S3.” And to be fair, Amazon S3 is one of the most powerful and widely adopted services in the cloud. It offers virtually unlimited scale, extremely high durability, and a cost model that makes storing massive amounts of data both practical and predictable.

But this is also where things tend to go wrong.

As a Technical Account Manager, I’ve seen this pattern repeat itself across different customers and industries. Teams migrate workloads to S3 expecting it to behave like the file systems they’ve used for years. At first, everything seems fine, but soon enough issues start creeping in. Applications behave unpredictably, scripts fail in subtle ways, performance doesn’t match expectations, and costs begin to climb in ways that are difficult to explain.

None of this happens because S3 is flawed. It happens because it is being used with the wrong assumptions.

- Teams want cheap, durable storage
- They move to Amazon S3
- Then reality hits: “Why can’t I just mount it?” “Why doesn’t ls work properly?” “Why is my app breaking?”

That’s because S3 is not a file system — it’s object storage. And that difference matters more than most teams expect.

The Mental Model Shift (That Everyone Skips)
S3 works on a fundamentally different model. At its core, S3 is not a file system. It is object storage, and that distinction is not just technical nuance; it changes how data is stored, accessed, and managed.

Unlike traditional file systems, S3 does not have real directories, even though it may appear that way in consoles and tooling. What looks like a folder structure is simply a naming convention built on prefixes. There is no native concept of file locking, and it does not follow POSIX standards that many applications rely on implicitly. Even operations that seem simple, such as renaming files or listing directories, behave very differently under the hood.
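The flat-keyspace point above can be sketched without touching AWS at all. A minimal sketch, using a hypothetical list of keys: the grouping below mimics what `list_objects_v2` does when you pass `Delimiter="/"`, which is the only reason "folders" appear at all.

```python
# S3 keys are plain strings in a flat namespace. The "directories" you
# see in the console are derived on the fly by grouping keys on a
# delimiter. These bucket contents are made up for illustration.
keys = [
    "logs/2024/app.log",
    "logs/2024/db.log",
    "logs/2025/app.log",
    "reports/q1.csv",
]

def list_prefix(keys, prefix="", delimiter="/"):
    """Split flat keys under `prefix` into objects and 'common prefixes',
    the way list_objects_v2 does with a Delimiter parameter."""
    objects, common = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter becomes a "folder".
            common.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common)

objs, prefixes = list_prefix(keys, prefix="logs/")
# prefixes -> ['logs/2024/', 'logs/2025/']; these "folders" exist only
# because the delimiter grouping creates them, not because S3 stores them.
```

Delete every key under `logs/2024/` and the "folder" vanishes with them, which is exactly why tools that expect real directories get confused.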

The challenge is that most applications were not designed with this model in mind. They expect a mounted path, predictable file operations, and consistent low-latency access. When those expectations collide with the reality of object storage, friction is inevitable.

Bridging the Gap Without Changing the Foundation
Enter: S3 as a File System (Finally… Kind Of)
AWS introduced ways to bridge the gap — making S3 behave more like a file system.

Mountpoint for S3 (Cloud-Native Approach)
- High-performance access directly to S3
- Designed for throughput-heavy workloads
- Works well for: ML training, data processing pipelines, batch jobs

S3 File Gateway (Hybrid / Enterprise Friendly)
- Uses standard protocols (NFS / SMB)
- Adds: local caching, low-latency access, seamless integration with existing apps

Think of it as:

“Make S3 look like a NAS… but backed by cloud-scale storage”

Both approaches are valuable, but they are not magic solutions. They do not turn S3 into a traditional file system; they simply provide a more familiar interface for interacting with it.

Where Things Still Break
Even with these abstractions in place, certain limitations remain, and this is where many teams get caught off guard.

S3 does not suddenly gain full file system semantics. Operations that depend on strong consistency guarantees, file locking, or rapid metadata changes can still behave differently than expected. Performance is also highly dependent on how the data is accessed. Workloads that involve large, sequential reads tend to perform exceptionally well, while those that rely on frequent, small, random writes often struggle.
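One concrete instance of "simple operations behave differently": S3 has no rename primitive, so SDKs and tools emulate a rename as copy-then-delete. A minimal sketch, modelling a bucket as a plain dict (the keys and contents are made up):

```python
# In S3, "renaming" an object means CopyObject followed by DeleteObject.
# This toy model uses a dict in place of a bucket to show the two-step
# pattern, and why "renaming" a large object re-writes every byte.
bucket = {"raw/data.csv": b"col1,col2\n1,2\n"}

def rename(bucket, old_key, new_key):
    bucket[new_key] = bucket[old_key]  # CopyObject: a full new write of the data
    del bucket[old_key]                # DeleteObject: only then remove the source

rename(bucket, "raw/data.csv", "processed/data.csv")
```

Note the window where both keys exist: a concurrent lister can see the object twice (or, with the steps reversed, not at all), which is one reason directory-rename-heavy workloads struggle on object storage.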

Cost is another area where misunderstandings surface. With S3, you are not only paying for storage but also for how you interact with that data. Every request, whether it is a read, write, or list operation, contributes to the overall cost. Inefficient access patterns can quickly turn what seemed like a cost-effective solution into an unexpectedly expensive one.
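A back-of-envelope sketch of the request-cost point. The per-request prices below are illustrative assumptions, not current S3 pricing (check the pricing page for your region); the shape of the math is what matters: request count, not just stored bytes, drives the bill.

```python
# Illustrative, assumed prices; real S3 pricing varies by region and tier.
PUT_PER_1K = 0.005   # assumed USD per 1,000 PUT/COPY/POST/LIST requests
GET_PER_1K = 0.0004  # assumed USD per 1,000 GET requests

def request_cost(puts, gets):
    """Request charges only; storage and data transfer are separate."""
    return puts / 1000 * PUT_PER_1K + gets / 1000 * GET_PER_1K

# The same dataset written as 10 million tiny objects versus
# 10 thousand larger objects:
small_objects = request_cost(puts=10_000_000, gets=0)  # about $50 in PUTs alone
large_objects = request_cost(puts=10_000, gets=0)      # about $0.05
```

A thousand-fold difference in request spend for identical bytes stored is exactly the kind of surprise that shows up when chatty, small-file access patterns are pointed at S3.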

What Works in Practice
A Strong Fit for Data-Driven Workloads
In practice, S3 shines in scenarios where its strengths align with the workload. Data lakes, analytics platforms, and machine learning pipelines are all excellent examples. These use cases benefit from S3’s scalability, durability, and ability to handle large volumes of data efficiently.

Use Carefully with Traditional Patterns
Challenges start to appear when S3 is used as a direct replacement for traditional shared storage. Applications that expect low-latency file access or rely heavily on small file operations may require additional design considerations or alternative services.

A Better Way to Approach It
The most valuable shift is not technical; it is conceptual.

Instead of asking how to make S3 behave like a file system, it is more effective to ask whether the workload actually needs a file system in the first place. In many cases, what appears to be a requirement is simply a legacy assumption carried over from on-premises environments.

When workloads are redesigned to align with object storage principles, the benefits of S3 become far more apparent. Complexity is reduced, performance improves, and costs become easier to manage. When that redesign is not possible, using the right abstraction layer thoughtfully can still provide a viable path forward.

TL;DR
- S3 is object storage — not a file system
- Mountpoint for S3 and S3 File Gateway make it feel like one
- Great for throughput-heavy workloads
- Dangerous if you treat it like traditional low-latency shared storage
- The real skill: choosing the right abstraction for the workload
