
Architecture Layers That S3 Files Eliminates — and Creates

On April 7, 2026, AWS made Amazon S3 Files generally available. It lets you mount S3 buckets as NFS v4.1/v4.2 file systems from EC2, EKS, ECS, and Lambda.

There are already plenty of setup guides and first-look posts. This article focuses on something different: what becomes unnecessary and what becomes possible in your architecture.

If you use S3 regularly and are wondering "this sounds big, but how does it actually affect my architecture?" — this is for you.

The Problem S3 Files Is Solving

Let's start with a shared understanding.

Say an ML team needs to preprocess training data. The raw data is in S3. They want to use pandas. While pd.read_csv("s3://my-bucket/data.csv") works, under the hood the s3fs/botocore stack issues GET requests and loads the data into memory. Writing results back requires a PUT. This is fundamentally different from open("./data.csv").

At scale, this becomes an architectural problem. Many organizations operate pipelines like this:

Copy from S3 to EFS/EBS, process, write results back to S3. This "middle copy layer" exists solely to bridge the I/O model gap between object storage and file systems. Maintaining sync scripts, managing consistency during copies, and provisioning EFS — all of this overhead comes from that gap.

S3 Files aims to eliminate this gap entirely.

From the application's perspective, S3 data appears as a local directory. pd.read_csv("/mnt/s3files/data.csv") reads from S3 behind the scenes, and df.to_csv("/mnt/s3files/result.csv") automatically commits changes back.
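In code, the whole workflow collapses to local file I/O. A minimal sketch — the mount path is illustrative, and summarize is a made-up helper (written to take the mount directory as a parameter so it runs against any directory):

```python
import pandas as pd

def summarize(mount_dir: str) -> None:
    """Read a CSV from the mount, derive a column, write the result back.

    mount_dir stands in for an S3 Files mount point such as /mnt/s3files;
    it is a parameter here so the same code works against any directory.
    The written file becomes an S3 PUT in a subsequent commit window.
    """
    df = pd.read_csv(f"{mount_dir}/data.csv")
    df["total"] = df["price"] * df["qty"]
    df.to_csv(f"{mount_dir}/result.csv", index=False)
```

No S3 client, no credentials plumbing in application code — the NFS mount carries all of it.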

The full technical overview is in the official documentation.

Why This Isn't Just Another Mount Feature

If "mount S3" sounds familiar, you might be thinking of Mountpoint for Amazon S3 or Google Cloud's Cloud Storage FUSE (gcsfuse). S3 Files has a fundamentally different architecture.

The Difference from FUSE-Based Tools

FUSE-based tools emulate file system behavior on top of S3's API. In Mountpoint for Amazon S3, for example, overwriting a file means deleting the old object and PUTting a new one. Partial file writes — a basic file system operation — aren't supported. Directories don't actually exist, leading to inconsistencies with empty directories.

S3 Files doesn't emulate. It connects EFS (Elastic File System), a real NFS file system, to S3. The file system side provides real NFS semantics, and the S3 side remains real S3 objects. Two distinct systems coexist with an explicit synchronization layer between them.

This matters in practice: appending to a WAL (Write-Ahead Log) or editing part of a config file works with byte-level writes on the file system side, periodically synced to S3 as whole objects. With FUSE, these operations require re-PUTting the entire object.
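For instance, a WAL append on the mounted side is an ordinary O_APPEND write — a sketch, with a hypothetical path and a made-up helper name:

```python
def append_record(wal_path: str, record: bytes) -> None:
    """Append one record to a WAL file on the mount.

    Mode "ab" opens the file with O_APPEND, so each write lands at the
    current end of file -- the partial-write operation that FUSE-based
    tools cannot express without re-uploading the entire object.
    """
    with open(wal_path, "ab") as f:
        f.write(record + b"\n")
```

On an S3 Files mount, the grown file is later synced to S3 as a whole object; the application never issues a PUT.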

What "Stage and Commit" Actually Does

Andy Warfield, VP and Distinguished Engineer at AWS, describes the sync model as "stage and commit" in his post on All Things Distributed, explicitly noting it's "a term borrowed from version control systems like git" (official documentation uses "synchronization" instead).

File system changes are like working directory changes in Git. They aren't immediately reflected in S3 — instead, they're batched and committed as S3 PUTs approximately every 60 seconds. In the other direction, when objects are updated in S3 (e.g., via PutObject from another application), the official documentation states changes are reflected in the file system "typically within seconds." DevelopersIO's hands-on testing measured approximately 30 seconds.

Amazon S3 Files Goes GA — Mounting S3 Buckets as a File System, Compared with EFS | DevelopersIO

Amazon S3 Files, launched in April 2026, is a new service that makes S3 buckets mountable via NFS v4.2. It is available from EC2/Lambda/EKS/ECS and lets existing legacy applications use S3 without code changes.

If both sides modify the same file simultaneously, S3 is the source of truth. The file system version is moved to a lost+found directory, with a CloudWatch metric indicating the conflict.

This is a deliberate tradeoff: not a real-time shared file system, but one that tolerates tens of seconds of delay in exchange for preserving both file and object semantics without compromise.

According to Warfield's post, the team initially tried to make the boundary between files and objects invisible, but every approach forced unacceptable compromises on one side or the other. They ultimately decided to make the boundary itself an explicit, well-designed feature. His post is essential reading for understanding the "why" behind S3 Files.

S3 Files and the changing face of S3 | All Things Distributed

Andy Warfield writes about the hard-won lessons dealing with data friction that led to S3 Files


Architecture Layers That Disappear

Here's the core of this article: what specific architectural patterns does S3 Files make unnecessary?

1. S3 → EFS/EBS Staging Pipelines

Consider a daily retraining pipeline for a recommendation model. Purchase logs accumulate in S3, and preprocessing involves data cleansing → feature generation → format conversion.

Previously, every time an EC2 instance or SageMaker Processing Job started, it first downloaded data from S3 to EBS. For 100GB of training data, depending on instance network bandwidth, the download alone took several minutes. After processing, results were uploaded back to S3, and the EBS volume was cleaned up. Of the four steps — download → process → upload → cleanup — only "process" is the actual work.

With S3 Files, you mount the S3 prefix (e.g., s3://ml-data/purchase-logs/) and your processing script reads and writes /mnt/s3files/purchase-logs/ directly. Download, upload, and cleanup steps disappear.

Note: if a downstream job needs to read results via the S3 API immediately, the ~60-second commit delay matters. If both jobs use the same mount point, this isn't an issue. For S3 API consumers, design around S3 event notifications or explicit waits.
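An "explicit wait" can be as simple as a bounded poll. A sketch — wait_until is a made-up helper, and in practice the check callable would wrap something like a boto3 head_object call in try/except:

```python
import time

def wait_until(check, timeout_s: float = 120.0, interval_s: float = 5.0) -> bool:
    """Poll until check() returns True or the timeout expires.

    Intended for waiting out S3 Files' commit delay before an S3-API
    consumer reads a freshly written object; the 120 s default
    comfortably covers the ~60 s commit window.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

S3 event notifications remain the cleaner trigger when the downstream job can be event-driven; polling is the fallback for synchronous pipelines.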

2. Lambda's "/tmp Download" Pattern

Consider a Lambda function that generates thumbnails when images are uploaded to S3. The traditional implementation:

# Traditional: Download → Process → Upload
import boto3
from PIL import Image

s3 = boto3.client('s3')

def handler(event, context):
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = event['Records'][0]['s3']['object']['key']

    download_path = f'/tmp/{key.split("/")[-1]}'
    s3.download_file(bucket, key, download_path)

    img = Image.open(download_path)
    img.thumbnail((128, 128))
    thumb_path = f'/tmp/thumb_{key.split("/")[-1]}'
    img.save(thumb_path)

    s3.upload_file(thumb_path, bucket, f'thumbnails/{key}')

With S3 Files mounted:

# S3 Files: Operate directly on mounted paths
from PIL import Image

def handler(event, context):
    key = event['Records'][0]['s3']['object']['key']

    img = Image.open(f'/mnt/s3files/{key}')
    img.thumbnail((128, 128))
    img.save(f'/mnt/s3files/thumbnails/{key}')

You don't even need to import boto3. The same code you'd write for local development works as-is.

Beyond code simplicity, Lambda functions are freed from /tmp capacity constraints (default 512MB, max 10GB). For functions referencing multi-GB ML models, cold start download time directly impacted latency. S3 Files pre-fetches files below a configurable threshold (default 128KB) alongside metadata, and fetches larger files on demand. Warfield calls this "lazy hydration" in his post — you can start working immediately even with millions of objects in the bucket.

3. Self-Managed EFS + S3 Sync

If your organization uses S3 as a data lake but needs EFS for real-time processing or interactive analysis, you likely have DataSync, Step Functions, or cron scripts bridging the two. Maintaining this sync logic — detecting new objects, identifying diffs, retry on failure, consistency during sync, cleanup of stale EFS files — is a significant operational burden.

S3 Files replaces this with managed synchronization. Per the official documentation, import from S3 runs at up to 2,400 objects/second, and export to S3 uses ~60-second batch windows. Unused file data is automatically evicted from the file system cache (configurable from 1 to 365 days, default 30) but never deleted from S3. File system storage costs scale with your active working set, not your total dataset.
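For contrast, here is the kind of diff logic those hand-rolled sync scripts carry. needs_copy is a made-up example — and even it mishandles an edge case (multipart-upload ETags aren't plain MD5s), which is exactly the sort of thing managed synchronization absorbs:

```python
import hashlib
import pathlib

def needs_copy(s3_etag: str, local: pathlib.Path) -> bool:
    """Diff check from a typical hand-rolled S3-to-EFS sync script:
    copy when the local file is missing or its MD5 differs from the
    object's ETag. Breaks on multipart-upload ETags, silently
    re-copying those objects on every run.
    """
    if not local.exists():
        return True
    return hashlib.md5(local.read_bytes()).hexdigest() != s3_etag.strip('"')
```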

4. Adapter Layers for Legacy Applications

Log aggregation tools watching /var/log/, build systems reading from /src/, config management tools writing to /etc/ — these applications assume open() / read() / write() and rewriting them for the S3 SDK is often impractical.

Previously, "put files on EFS, back up to S3 as needed" was the pragmatic solution. S3 Files lets you keep S3 as primary storage while applications access it via NFS mount. POSIX permissions and file locking (flock) are supported, making migration possible with a mount point change and zero code changes.

New Architecture Patterns

What becomes practically feasible for the first time?

1. Two-Tier Read Optimization

S3 Files uses a two-tier architecture internally. The first tier, "high-performance storage," caches small, frequently accessed files with sub-millisecond to single-digit millisecond latency per the official documentation. The second tier is S3 itself — reads of 1MB or larger are streamed directly from S3 even if data is cached locally, because S3 is optimized for throughput. Notably, these large reads incur only S3 GET request costs with no file system access charge.

Official performance specifications:

- Max read throughput per client: 3 GiB/s
- Aggregate read throughput per file system: terabytes per second
- Max read IOPS per file system: 250,000
- Aggregate write throughput per file system: 1–5 GiB/s (varies by region)
- Max write IOPS per file system: 50,000

For context: EBS gp3 provides 125 MiB/s baseline throughput, scalable to 1,000 MiB/s with additional provisioning. io2 Block Express maxes out at 4 GB/s. S3 Files delivers comparable read throughput without any volume provisioning.

From spec values alone: reading a 100GB dataset sequentially takes ~13 minutes at gp3 default (125 MiB/s) versus ~33 seconds at S3 Files maximum (3 GiB/s). Actual throughput depends on workload and instance type, but the order-of-magnitude difference matters. And since 1MB+ reads are billed at S3 GET rates only, heavy sequential reads essentially incur no file system charges.
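The arithmetic behind those figures, treating the 100GB dataset as 100 GiB (an assumption; sequential_read_seconds is a made-up helper):

```python
def sequential_read_seconds(dataset_gib: float, throughput_mib_s: float) -> float:
    """Time to stream a dataset sequentially at a given throughput."""
    return dataset_gib * 1024 / throughput_mib_s

gp3_default = sequential_read_seconds(100, 125)       # ~819 s, about 13.7 minutes
s3files_max = sequential_read_seconds(100, 3 * 1024)  # ~33 s
```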

2. Large Reference Data in Lambda

Previously, Lambda functions using large reference data had three options: container images with embedded models (max 10GB, rebuild on every model update), EFS mounts (requires VPC, tends to increase cold starts), or S3 downloads to /tmp (max 10GB, download time added to cold starts). S3 Files is a fourth option: mount the S3 prefix, read model files via the file system. Model updates require only an S3 upload — no Lambda redeployment needed.
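A common shape for that fourth option: cache the model read so only cold starts (or the first warm invocation) touch storage. A sketch — the default path is hypothetical, and load_model is a made-up helper:

```python
from functools import lru_cache

@lru_cache(maxsize=1)
def load_model(path: str = "/mnt/s3files/models/current.bin") -> bytes:
    """Read the model once per container; later invocations in the same
    warm container reuse the cached bytes instead of re-reading the mount."""
    with open(path, "rb") as f:
        return f.read()
```

Swapping the model is then an S3 upload plus whatever cache-busting policy you choose (e.g., a new key per version).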

Unlike EFS mounts, the backend is your standard S3 bucket, so S3-native features like versioning, lifecycle policies, and cross-region replication work as-is.

3. AI Agent Access to S3 Data

Coding agents like Claude Code, Codex, Kiro, and Cursor use file system operations as their primary data access method: ls to list files, cat to read, editor to modify and save. It's the Unix toolchain.

Of course, agents can access S3 through other means — running aws cli commands, calling S3 APIs via MCP servers or Skills/Powers, generating boto3 code. But all of these are indirect compared to file operations and add reasoning steps. To search S3 logs, a file system lets you write grep -r "ERROR" /mnt/s3files/logs/ in one line, while the S3 API requires listing objects, downloading individually, and searching locally.
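To make the contrast concrete, here is roughly what the API-side equivalent of that one-line grep looks like. grep_s3 is a made-up helper built from standard boto3 calls (list_objects_v2 pagination plus per-object GetObject):

```python
def grep_s3(s3, bucket: str, prefix: str, needle: str):
    """S3-API equivalent of `grep -r needle` over a prefix: list every
    object, download each one, and search line by line on the client."""
    hits = []
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            body = s3.get_object(Bucket=bucket, Key=obj["Key"])["Body"].read()
            for lineno, line in enumerate(body.decode(errors="replace").splitlines(), 1):
                if needle in line:
                    hits.append((obj["Key"], lineno, line))
    return hits
```

Every step here is an extra decision an agent has to reason about; on a mount, the same search is a single shell command.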

With S3 Files mounting the bucket, this indirection disappears. To the agent, S3 data is just another directory under /mnt/s3files/.

Warfield's post describes AWS engineering teams using Kiro and Claude Code hitting the problem of agent context windows compacting and losing session state. With S3 Files, agents write investigation notes and task summaries to shared directories, and other agents read them. When sessions end, state persists on the file system for the next session.

File locking (flock) supports mutual exclusion across agents and processes. However, S3 API access bypasses file locks — if you write from both the file system and S3 API simultaneously, locking won't protect you.
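A sketch of that mutual exclusion from Python — append_note is a made-up helper; fcntl.flock issues the advisory lock, which only coordinates file-system-side writers:

```python
import fcntl
import os

def append_note(path: str, note: str) -> None:
    """Append a shared agent note under an exclusive flock.

    The lock serializes writers on the file-system side (other agents
    and processes on the mount); it does NOT protect against concurrent
    S3 API writes, which bypass advisory locks entirely.
    """
    with open(path, "a", encoding="utf-8") as f:
        fcntl.flock(f.fileno(), fcntl.LOCK_EX)
        try:
            f.write(note.rstrip("\n") + "\n")
            f.flush()
            os.fsync(f.fileno())
        finally:
            fcntl.flock(f.fileno(), fcntl.LOCK_UN)
```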

Constraints and Decision Criteria

S3 Files isn't universal. Key constraints to evaluate:

Commit Interval: ~60 Seconds (By Design)

Writes take ~60 seconds to appear as S3 objects. If job B reads via S3 API immediately after job A writes via the file system, job B may see stale data.

This isn't just a limitation — it's a cost optimization. Per the official documentation, consecutive writes to the same file are aggregated within the 60-second window and committed as a single S3 PUT, reducing S3 request costs and versioning storage overhead.

Sync throughput per the official performance specification: S3 → file system at up to 2,400 objects/s and 700 MB/s; file system → S3 at up to 800 files/s and 2,700 MB/s.

No "commit now" API exists at GA. Warfield mentions this as an area for future improvement. Workarounds: pass data between jobs via the file system (same mount point), or trigger downstream jobs via S3 event notifications.

Rename Costs

S3 has no native rename. File system renames are implemented as copy + delete internally. Per the official performance specification, renaming a directory of 100,000 files completes instantly on the file system, but takes several minutes to reflect in the S3 bucket. During that window, the file system shows the new path while S3 still has the old keys. The S3-side request costs are also non-trivial: the 100K CopyObject requests are billed at PUT-request rates (DeleteObject requests themselves are free).

Buckets Exceeding 50 Million Objects

Warfield's post warns about mounting buckets with more than 50 million objects (this figure doesn't currently appear on the official quotas page). Consider mounting a specific prefix to narrow the scope.

VPC Requirement

Mount targets live inside a VPC. Lambda functions and EC2 instances must connect from subnets in the same AZ as the mount target. Per the official documentation, supported compute services are EC2, Lambda, EKS, and ECS. On-premises or cross-cloud resources are not in the supported list.

Namespace Incompatibilities

Some S3 object keys can't be represented as POSIX filenames: keys ending with /, keys containing POSIX-invalid characters, or path components exceeding 255 bytes. See the official quotas page for the full list.
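A rough pre-flight check for key compatibility — an approximation of the documented rules, not the official validator, and posix_compatible is a made-up helper:

```python
def posix_compatible(key: str) -> bool:
    """Heuristic check that an S3 key maps cleanly to a POSIX path.

    Rejects keys ending in '/', keys containing NUL bytes, and path
    components that are empty or longer than 255 bytes. Consult the
    official quotas page for the authoritative rule set.
    """
    if key.endswith("/") or "\x00" in key:
        return False
    return all(0 < len(part.encode("utf-8")) <= 255 for part in key.split("/"))
```

Running a check like this over a bucket inventory before mounting surfaces incompatible keys ahead of time instead of via runtime events.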

This is intentional. Per Warfield's post, the team chose to pass through the vast majority of keys that work in both worlds and emit events for incompatible ones rather than silently converting them.

Versioning Required

S3 Files requires S3 bucket versioning. For existing buckets, evaluate the storage cost impact (old versions are retained) and compatibility with existing lifecycle rules.

Decision Flowchart

What to Review in Your Existing Architecture

First, inventory your pipelines for "copy from S3, process, write back to S3" patterns. Batch processing and ML preprocessing pipelines with EBS/EFS staging layers are prime candidates for replacement.

Second, consider how storage choices change for new projects. "Put it in S3 now, access it as a file system later" is now a viable strategy, reducing the urgency of early "object vs. file system" decisions.

Third, audit Lambda functions that explicitly download to / upload from /tmp. Functions handling large reference data or sharing data across invocations are worth evaluating.

S3 started 20 years ago as an object store. With Tables, Vectors, and now Files, it has expanded how data can be accessed. S3 Files removes one more architectural constraint imposed by storage choices. It won't apply to every workload, but for organizations where "the data is in S3 but the tools need a file system" — and that's a lot of organizations — the impact is significant.
