Mathew Pregasen

Why S3 Performance Limits Matter — and How Archil Solves Them

Many enterprises rely on AWS S3 as the backbone of their data storage strategy because of its immense scalability, global reach, and extreme durability measured in eleven nines. Everything from audit logs and backups to machine learning datasets often ends up living on S3.

But S3 is not a file system, it's an object store—an important difference.

This means that S3 wasn’t designed to handle low-latency, high-frequency access or POSIX-style workloads. It’s missing crucial file system features like atomic renames, file locking, shared caching, and sub-millisecond response times. Even though it’s a common practice, treating S3 like a traditional file system often leads to performance bottlenecks, unpredictable behavior, and the need for engineering workarounds.

As data volumes increase and concurrency requirements become more demanding, developers need the durability of S3 paired with the speed and ease of a local file system, without the complexity of managing it all.

Today, we’ll dive into the main performance limitations of S3 and examine how a service like Archil addresses these challenges, enabling performant, cloud-native workloads.


S3 Explained: Capabilities and Misconceptions

So why doesn’t S3 perform well as a file system? Let’s first take a look at its initial purpose and the use cases it was designed to support.

What S3 Is Designed For

Amazon S3 is a globally distributed object storage service built for extreme scale and exceptional durability. Its primary features are:

  • Immutable Object Storage — Data is stored as immutable “objects” (up to 5 TiB each) in a flat structure, each with a unique identifier. To update data, a new object is created rather than modifying the existing one.
  • Strong Durability and Availability — Replicates data automatically across multiple Availability Zones (AZs) and uses erasure coding to achieve 99.999999999% (eleven nines) durability and 99.99% availability.
  • Unlimited Scalability — Horizontal partitioning across distributed nodes based on key prefixes to handle trillions of objects and exabytes of data without manual intervention.
  • API-Driven Access — Exposes a RESTful HTTP interface and SDKs, instead of POSIX calls, for easy integration across languages and platforms.
  • Strong Consistency — Guarantees read-after-write consistency for all PUT and DELETE operations, so updates are immediately visible.

These features make S3 perfect for write-once, read-many use cases such as data-lake partitions, archival backups, or immutable machine learning training datasets.

In such scenarios, the emphasis is on high durability, availability, and scale, not fast random access or full POSIX file-system features.

Common Misconceptions About S3

With that being said, S3 is often used incorrectly due to common misunderstandings:

  1. “S3 is a POSIX File System” — S3 does not support POSIX semantics. For starters, it lacks 1) atomic renames, 2) file locking, 3) symbolic links, and 4) directory inodes. Applications that depend on these features are prone to failure or unexpected behavior. To compensate, developers have to build complex coordination layers, custom lock services, and copy-delete hacks, which inevitably undermine performance.
  2. “FUSE Adapters Provide Native Semantics” — While tools like s3fs and Mountpoint for S3 let you mount a bucket, they don’t guarantee genuine filesystem behavior. They buffer operations locally and replay them asynchronously, which can cause timeouts, stale reads, out-of-order writes, and caching errors under concurrent access.
  3. “Metadata Operations Are Inexpensive” — Although each individual LIST, GET Bucket, and object-metadata call may seem cheap, these operations add up quickly: each carries API-call overhead and can trigger rate throttling. They traverse distributed indexes and are not meant for high-frequency use.
  4. “Throughput and IOPS Scale Linearly Without Effort” — S3 imposes rate limits per prefix and throughput restrictions per connection. Without implementing prefix sharding and parallel streams, exceeding these thresholds can lead to throttling, higher latencies, and request failures.
  5. “Latency is Negligible” — In reality, object access latencies can vary significantly. If you need fine-grained, random access, then latency can be vastly greater than that of local or block storage.
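To make the first misconception concrete, here is a minimal sketch (written against a boto3-style client; the bucket and key names are placeholders) of the copy-then-delete "rename" that developers end up building on S3. Unlike POSIX rename(), it is not atomic: there is a window in which both objects exist, and a crash inside that window leaves the bucket inconsistent.

```python
def s3_rename(client, bucket: str, src_key: str, dst_key: str) -> None:
    """Emulate rename() on S3 via server-side copy followed by delete.

    NOT atomic: if the process dies between the two calls, both the
    old and the new key remain in the bucket.
    """
    client.copy_object(
        Bucket=bucket,
        Key=dst_key,
        CopySource={"Bucket": bucket, "Key": src_key},
    )
    # Failure window: a crash here leaves two copies of the data.
    client.delete_object(Bucket=bucket, Key=src_key)
```

With a real client this would be `boto3.client("s3")`; note also that single-call `copy_object` has a size ceiling, beyond which a multipart copy is required.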

Such misunderstandings demonstrate why using S3 as a file system is fundamentally an anti-pattern—it’s exactly why solutions like Archil exist. Next, let’s see the architectural limitations of S3 that lead to these issues.


Core S3 Performance Limitations

a. Prefix Partition Limits

S3 relies on prefix-based partitioning to scale both object storage and request handling. Each distinct prefix in a bucket acts as a separate data shard, with S3 allocating storage and I/O resources per shard.

Because of this, AWS enforces strict per-prefix request limits: 3,500 PUT/POST/DELETE and 5,500 GET/HEAD operations per second per prefix. If an application funnels all its traffic through a single prefix, it will rapidly hit these limits and face throttling, regardless of overall bucket capacity or the number of concurrent clients.

To prevent this bottleneck, developers need to implement key-naming strategies such as hashing or time-based prefixes to distribute requests across partitions.

This does, however, introduce additional complexity as developers must build custom logic for prefix distribution. On top of that, read and list operations often require scanning multiple pseudo-directories to rebuild the complete dataset.
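A hash-based key-naming strategy can be sketched in a few lines (the shard count and key layout here are illustrative choices, not an AWS requirement). Note the second function: once keys are sharded, listing the full dataset means scanning every shard prefix.

```python
import hashlib

def sharded_key(key: str, shards: int = 16) -> str:
    """Prepend a stable, hash-derived shard prefix so requests spread
    across S3's per-prefix partitions instead of piling onto one."""
    digest = hashlib.md5(key.encode()).hexdigest()
    shard = int(digest, 16) % shards
    return f"{shard:02x}/{key}"

def all_prefixes(shards: int = 16) -> list[str]:
    """Every shard prefix; a full listing must scan all of them."""
    return [f"{i:02x}/" for i in range(shards)]
```

Because the prefix is derived from a hash of the key itself, any client can recompute an object's location without a lookup table—but range scans and lexicographic listings across the original key space are lost.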

b. Per-Connection Throughput Caps

Each TCP connection to S3 is capped at roughly 80 MiB/s, regardless of the EC2 instance’s network capability or EBS throughput. S3 enforces these limits by regulating connection handoffs and buffer sizes so that resources are shared fairly and the system stays stable for all tenants. This approach causes:

  1. Single-Stream Bottleneck: Even on a 100 Gbps instance, a single GET or PUT request transfers at roughly 80 MiB/s. For larger objects, clients split transfers into multipart parts, but each part is still subject to the per-connection cap.
  2. Client-Side Parallelism Required: To overcome this limitation, applications must open several simultaneous connections and coordinate them. For tasks needing 1 GiB/s, this usually means managing at least 13 parallel streams (~80 MiB/s per stream), plus thread pools, retry logic, and back-pressure handling.
  3. Operational Complexity: Setting up efficient concurrent connections adds considerable engineering overhead:
    1. Synchronization of part writes and reads.
    2. Error Handling for failed streams.
    3. Load Balancing to prevent overloading any prefixes.
    4. Monitoring Performance to identify and recover from partial-throttle events.
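The parallelism described above can be sketched with ranged reads and a thread pool. This is a simplified model, not a drop-in S3 client: the `fetch_range` callable is injected, and with boto3 it would issue a GET with a `Range: bytes=start-end` header (boto3's own TransferConfig automates a production version of this).

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_download(fetch_range, total_size: int,
                      part_size: int = 8 * 1024 * 1024,
                      workers: int = 13) -> bytes:
    """Fetch an object as concurrent byte-range parts.

    fetch_range(start, end) must return the bytes of the inclusive
    range [start, end]. 13 workers at ~80 MiB/s per connection is
    the rough math behind a ~1 GiB/s aggregate target.
    """
    ranges = [(off, min(off + part_size, total_size) - 1)
              for off in range(0, total_size, part_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves order, so parts reassemble correctly.
        parts = pool.map(lambda r: fetch_range(*r), ranges)
    return b"".join(parts)
```

Even this toy version hints at the engineering overhead listed above: a real implementation still needs per-part retries, failure handling, and back-pressure so slow parts don't exhaust memory.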

💡 Monitoring and observability are essential, especially with systems like S3 that may hit hidden limits. Platforms like Mezmo can help by tracking latency patterns, highlighting throttling events, and sending alerts for unusual activity. With proactive monitoring, you can catch bottlenecks before they affect performance.

These per-connection ceilings force developers to create custom multiplexing layers, adding complexity and making their system more prone to failure.

c. Latency and IOPS

S3 operations introduce 10–100 ms of round-trip delay per request, far slower than local NVMe or even the sub-millisecond latencies of networked block storage. The delay comes from HTTP API processing, authentication, and multi-AZ replication. High-frequency small-object reads or metadata queries accumulate this delay and noticeably slow random-access workflows.

S3’s performance is also limited by API rate caps and network capacity. Unlike block storage, you cannot simply dial up IOPS in a setting; instead, you must distribute requests across multiple prefixes or open parallel connections. High-I/O tasks can quickly hit these limits, leading to throttling or elevated error rates.
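A back-of-envelope calculation shows how per-request latency and per-prefix rate caps compound for small-object workloads. The figures below are illustrative, using a mid-range 20 ms round trip and the 5,500 GET/s per-prefix limit.

```python
# Illustrative math: why many small GETs against one prefix are slow.
objects = 100_000          # e.g. small files read in one ML training epoch
rtt_s = 0.020              # ~20 ms round trip per GET (mid-range estimate)
get_cap = 5_500            # per-prefix GET/HEAD requests per second

serial_time = objects * rtt_s    # one request at a time: latency-bound
floor_time = objects / get_cap   # best case on ONE prefix: rate-cap-bound

print(f"serial reads:     {serial_time:,.0f} s (~{serial_time / 60:.0f} min)")
print(f"one-prefix floor: {floor_time:.1f} s even with perfect parallelism")
```

Parallelism removes the latency term but not the rate cap; only spreading keys across prefixes (or fronting S3 with a cache) raises that floor.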

d. Lack of POSIX Semantics

S3 is not a POSIX-compliant file system. It uses a flat object storage model accessible via HTTPS APIs, lacking the hierarchical structure and system-level primitives expected by applications. It thus omits essential POSIX features, including:

  • File Locking: Without flock() or fcntl(), concurrent systems can’t coordinate writes or avoid race conditions.
  • Atomic Renames: The rename() operation isn’t available. Renaming requires copying the object and then deleting the original.
  • Symbolic Links: S3 does not support inodes or links; each object is standalone, identified by its unique key.
  • Random Writes: Because objects are immutable, you can’t modify a specific byte range in place. To update, the entire object must be re-uploaded (or use multipart uploads for larger objects).
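The last bullet is easy to underestimate: because objects are immutable, "writing" even a few bytes means a full read-modify-write cycle. A minimal sketch against a boto3-style client (bucket and key are placeholders) makes the cost explicit.

```python
def patch_object(client, bucket: str, key: str,
                 offset: int, data: bytes) -> None:
    """Emulate a random write on S3.

    Objects are immutable, so patching a byte range means downloading
    the whole object, editing it in memory, and re-uploading all of it.
    A 1 KiB change to a 10 GiB object still moves ~20 GiB of data.
    """
    body = client.get_object(Bucket=bucket, Key=key)["Body"].read()
    patched = body[:offset] + data + body[offset + len(data):]
    client.put_object(Bucket=bucket, Key=key, Body=patched)
```

Note there is also no isolation here: two clients patching the same object concurrently will silently overwrite each other, since S3 offers no flock()-style coordination.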

Applications designed for POSIX semantics, especially data-processing tools, may exhibit unpredictable behavior on S3.

Without point-in-time consistency, locks, or atomic directory operations, workflows encounter data corruption, dropped files, and subtle errors. This fundamental mismatch makes S3 unsuitable for workloads that rely on true filesystem behavior.

Real-World Impact on Workloads

These limitations of S3 can, and do, lead to performance bottlenecks.

For example, ML training jobs that handle thousands of small files face high per-request latency and prefix throttling, often wasting compute resources. ETL pipelines must use custom staging and lock services to compensate for S3’s lack of atomic operations. POSIX-dependent tools and research workflows often face race conditions and missed errors. Teams using spot or ephemeral instances have to build local caches or synchronization layers, which can cause startup delays and increase the risk of stale data.


Why Archil Exists: Closing the Gap Between S3 and POSIX

S3 is a go-to choice for its scalability, durability, and effortless integrations within the cloud ecosystem. It is pay-per-use, has enormous capacity, and is natively supported in data pipelines.

As usage increases, so do the challenges: throttled prefixes, slow metadata retrieval, the absence of POSIX functionality, and limited connection throughput. These aren’t exceptions—they’re everyday hurdles for teams working on advanced ML pipelines, real-time applications, and complex ETL workflows.

To support these teams, Archil was created: to connect S3’s object storage model with the POSIX-compliant file systems that developers are accustomed to.

What Archil Does: File System Performance, Backed by S3

With Archil, your S3 buckets become high-performance, POSIX-compliant local file systems. As a fully-managed, durable, high-speed caching layer, it sits between your compute environment and object storage to deliver fast, consistent access to large datasets without extra infrastructure overhead or capacity planning.

Built for Performance: Low Latency, High Throughput, Zero Tuning

Applications can adopt Archil without codebase changes via an encrypted NFSv3 connection. Archil maps each file operation to the correct S3 API call, while a centralized cache manages both data and metadata. This creates a smooth, high-performance file system experience backed by S3, without the typical drawbacks.

  • Latency: Reads and writes returned from the cache are near-instant. In the event of a cache miss, Archil retrieves the object from S3 in 10-30 ms, faster than fetching from S3 directly.
  • Throughput & IOPS: By default, each file system provides up to 10 Gbps and 10,000 IOPS (higher tiers are available upon request).
  • POSIX Compliance: Archil offers complete support for file locking, renaming, symbolic links, and random writes—your applications work just like they would on a local filesystem, while still leveraging the scale, durability, and cost benefits of S3.

S3 Alone vs. S3 via Archil

When applications need low latency, concurrent access, or full POSIX compliance, the constraints of S3 become increasingly evident. The table below compares the direct use of S3 alone with the addition of Archil, illustrating where each approach excels:

Feature                      | Raw S3                                              | S3 via Archil
-----------------------------|-----------------------------------------------------|---------------------------------------------------
IOPS Scaling                 | Limited by prefix structure & client-side logic     | 10,000 IOPS out of the box (scalable)
Infrastructure Overhead      | Requires custom retries, parallelism, staging logic | Fully managed, no provisioning
Directory Operations         | Flat namespace, costly list calls                   | Fast metadata cache, true directory behavior
Concurrent Access            | No atomic coordination                              | Safe concurrent reads/writes with built-in locking
Object Format Compatibility  | Native                                              | Native (no custom block format required)
Cross-Instance Cache         | No                                                  | Yes, with a shared cache accessible by all clients
Write Syncing                | Immediate, but expensive                            | Asynchronous, batched, cost-optimized
Mount Support                | No native file system interface                     | NFSv3 with TLS encryption
Data Availability Pre-Sync   | Depends on S3 sync delay                            | 99.999% durability pre-sync

Raw S3 vs. Archil: Choosing the Best Storage Layer for POSIX, ML, and Real-Time Workloads

S3 has cemented itself as a pillar of cloud storage and modern system architecture. It shines as a scalable, cost-efficient object store, making it ideal for static archives, logs, and cloud-native analytics that work within the object storage model.

When your workflow requires file-system semantics and fast performance, the very strengths of S3 can become a burden. Prefix limits, per-connection throughput caps, and the lack of POSIX support complicate development and force workarounds.

Archil addresses this need by adding high-performance caching, full POSIX support, and easy integration, without the need for infrastructure, code refactoring, or specialized tooling.

Stick with S3 when object storage is sufficient. But turn to Archil when your cloud workloads need low-latency access, traditional file semantics, and the scalability of S3.
