You've written this code before. An S3 event fires, your Lambda function wakes up, and the first thing it does is download a file to /tmp. Process ...
Mounting S3 as a local file system on Lambda with S3 Files is a great solution for avoiding the /tmp juggling act. I'm curious about the VPC setup, though. It seems to add complexity to something meant to simplify things. How do you manage the increased cold start times with the VPC requirement? If you're getting ready for system design interviews, PracHub has some good question banks that really reflect what interviewers ask, unlike trying to piece things together from blog posts.
This pattern is going to land hard for code-analysis and document-processing agents: the "/tmp tax" is a real and unromantic cost that nobody talks about until they've written their fifth boto3 download-process-upload-cleanup wrapper. Two operational caveats worth raising for anyone considering this for production agents though:
First, S3 Files semantics are not POSIX in the way agents tend to assume. If your orchestrator and your two reviewer Lambdas all "write" to the same path concurrently, you don't get atomic last-writer-wins; you get S3's eventually-consistent object semantics underneath a filesystem-shaped API. For your security/style reviewer split where outputs are distinct paths this is fine — but the moment you have two agents updating a shared state file (e.g., a tasks.json), you need an explicit lease/lock or you'll get torn writes.
Second, the cold-start angle deserves measurement, not a one-line "no longer has the penalty" claim. VPC + S3 Files + Bedrock SDK + Strands runtime is a meaningful cold-start surface; for orchestrators that fan out 10+ agent invocations on demand, p99 is still where the user pain lives.
Genuinely excited about this primitive though — it makes the "agents as sandboxed processes with a shared working directory" mental model finally cheap enough to use.
Appreciate the thoughtful follow-up — both points are worth addressing.
On consistency: S3 Files provides full NFS v4.2 file system semantics — including read-after-write consistency, file locking, and POSIX permissions. It's not raw S3 eventual consistency exposed through a filesystem-shaped API. When a Lambda closes a file handle, the data commits to the high-performance storage layer and is visible to other clients on their next open (close-to-open semantics). Advisory locks via flock/fcntl are supported at the NFS layer.
That said, your instinct is right in spirit: concurrent writes to the same path from multiple Lambda invocations still need coordination. NFS close-to-open means "visible after close" — if two agents have the same file open simultaneously, last-close wins. For the architecture in the post, the agents write to distinct paths by design (each agent owns its output file), so this isn't an issue. But if you were building a shared tasks.json pattern, you'd want either distinct files per agent (append-only log) or advisory locks. I'd argue the better design is to keep the orchestrator as the single writer for shared state — which is what durable functions give you naturally.
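For the shared tasks.json case, the advisory-lock option might look roughly like the sketch below. It assumes flock works on the mount as described above and that every writer goes through the same helper; `update_shared_json` and its `mutate` callback are hypothetical names for illustration, not part of any AWS or S3 Files API.

```python
import fcntl
import json
import os


def update_shared_json(path, mutate):
    """Read-modify-write a JSON file while holding an exclusive advisory lock."""
    # Open (creating if needed) without truncating, so existing content is
    # untouched until the lock is held.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks until other holders release
        try:
            raw = f.read()
            state = json.loads(raw) if raw else {}
            mutate(state)  # caller edits the dict in place
            f.seek(0)
            f.truncate()
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
    return state
```

Advisory locks only protect against writers that also take the lock, so a single agent that opens the file directly can still clobber it. That is one reason the single-writer orchestrator design is simpler to reason about.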
On cold starts: Fair challenge — I shouldn't hand-wave it. VPC-attached Lambda cold starts lost the ~10s ENI-attach penalty back in 2019 with Hyperplane networking, but the total runtime surface you're describing is real: VPC network setup + NFS mount + Bedrock SDK init + Strands runtime. I haven't published p99 numbers for this specific stack yet — that's a good follow-up post. What I can say: the NFS mount itself is fast once mount targets exist (sub-100ms in my testing), and Bedrock SDK initialization is dominated by the first inference call, not the client instantiation. SnapStart or provisioned concurrency would eliminate most of the tail, but that's a cost/latency tradeoff worth quantifying.
The "agents as sandboxed processes with a shared working directory" framing is exactly the mental model. Glad it resonates.
Shameless plug, but since cold starts came up: I recently benchmarked modern Lambda cold starts across runtimes with multi-concurrency and SnapStart in the mix. The results might surprise you if your mental model is still calibrated to 2021-era numbers: Cold Starts Are Dead
S3 Files eliminating the /tmp tax is a big deal for multi-agent workloads. The download-process-upload ceremony has always been the awkward part of running AI pipelines on Lambda — agents need shared state and intermediate artifacts, not isolated blob operations. With a mounted filesystem, agent-to-agent handoff becomes just file writes, which maps much better to how local agent frameworks already work. The interesting follow-up is whether this changes the cold start calculus for heavier ML workloads.
The /tmp tax and IAM gotchas are real. One thing I'd add: when agents run in environments like Lambda, keeping track of what happened across partial failures gets harder fast.
For production agents, I'd want a run record that captures: task input, tool calls made, args/results, retries, and final artifact — regardless of whether the Lambda run succeeded or timed out. Without that, debugging a failed run in a stateless environment turns into guessing.
This is the angle we're building Armorer around: a local control plane for operating agents, not another agent framework.
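As a rough illustration of the run record described above, here is a minimal sketch using plain dataclasses. The field names (`run_id`, `task_input`, `tool_calls`, `final_artifact`) are assumptions chosen to mirror the list in the comment; this is not Armorer's schema or any framework's API.

```python
import json
import time
from dataclasses import asdict, dataclass, field
from typing import Any, Optional


@dataclass
class ToolCall:
    name: str
    args: dict
    result: Any = None
    error: Optional[str] = None
    attempt: int = 1  # 1 for the first try, 2+ for retries


@dataclass
class RunRecord:
    run_id: str
    task_input: dict
    tool_calls: list = field(default_factory=list)
    status: str = "running"  # stays "running" if the function is killed mid-run
    final_artifact: Optional[str] = None
    started_at: float = field(default_factory=time.time)

    def record_call(self, name, args, result=None, error=None, attempt=1):
        self.tool_calls.append(ToolCall(name, args, result, error, attempt))

    def to_json(self):
        # default=str keeps non-JSON-native results from raising.
        return json.dumps(asdict(self), default=str)
```

The record only helps with timeouts if it is persisted after every tool call rather than once at the end, so that a killed invocation leaves behind a record whose status is still "running".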
I love it!
In my case, this is exactly why the project uses durable functions as the orchestration layer — they are the run record.
Every step in a durable function is checkpointed: task input, each agent invocation, tool calls, results, retries, and final output. If a Lambda times out or fails mid-run, the execution history is already persisted. You pick up exactly where you left off — no guessing, no reconstruction. And worst case, the full execution history is there in the console for manual debugging — you can see every step, what it received, what it returned, and where it broke.
So the observability problem you're describing is real for raw Lambda agents (fire-and-forget invocations with no coordination layer). But once you put a durable orchestrator in front, that audit trail comes built in. The orchestrator is the control plane.
That said, I get the angle — there's value in runtime-agnostic tooling that works across frameworks, not just AWS-native. Different problem surface.
ran into this exact problem - agents writing partial state to /tmp that the next invocation couldn't see. ended up routing through S3 manually, which was a mess. how does the mounted bucket handle concurrent writes from parallel runs?
The /tmp tax is real. I've spent way too many hours writing boilerplate download/process/upload patterns in Lambda functions. The S3 Files mount approach is a huge quality-of-life improvement.
The multi-agent workspace use case you described is particularly interesting. One thing I'd be curious about is how you handle write conflicts when multiple agents are working on the same files simultaneously. With the /tmp approach, each invocation had its own isolated copy, so conflicts weren't a concern. But with a shared mount, two agents could potentially write to the same file at the same time. Do you use any file-level locking, or is the orchestration layer (Step Functions or similar) responsible for serializing writes?
Also, the VPC requirement is worth calling out more. For teams already running in VPC, it's a non-issue. But for those with simpler architectures, adding a VPC + NAT gateway just for S3 Files could be a significant operational overhead. I've seen some teams use EFS directly with Lambda for similar shared-state patterns, and the VPC requirement was always the main friction point. It's good that AWS is making this easier, but it's still something to factor into the architecture decision.
Interesting article. The file API is the easy sell, but the real design question is the consistency model.
If writes from Lambda reach S3 within minutes and S3 changes appear on the mount within seconds, then a shared workspace across concurrent functions is not the same thing as a normal local file system. It is closer to a cached coordination layer with lag. That matters a lot for the agent pattern you describe.
For this kind of workflow I would want very explicit ownership rules:
one writer per path
immutable inputs where possible
append-only or versioned outputs
a done marker or manifest file instead of assuming directory state is current
Otherwise it is easy to get stale reads, clobbered writes, or an agent consuming partial output from another step.
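The one-writer-per-path and done-marker rules above can be sketched as follows. This assumes rename on the mount replaces the target atomically, which is worth verifying for S3 Files specifically; `publish_output`, `read_if_done`, and the `.done` suffix are hypothetical names for illustration.

```python
import json
import os
import tempfile


def publish_output(path, data):
    """Write `data` to `path` via temp file + rename, then drop a done marker."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # readers should never see a half-written `path`
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    # The marker is written last; its presence means `path` is complete.
    with open(path + ".done", "w") as marker:
        marker.write("ok")


def read_if_done(path):
    """Return the parsed output only if the writer has published its marker."""
    if not os.path.exists(path + ".done"):
        return None
    with open(path) as f:
        return json.load(f)
```

A consumer polls `read_if_done` instead of listing the directory, so a stale or partial directory view shows up as "not ready yet" rather than as corrupt input.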
The other thing worth calling out is where this is a simplification versus where it is just moving complexity. It clearly removes a lot of /tmp boilerplate. But in exchange you take on VPC setup, mount targets, access points, IAM wrinkles, and a file system abstraction over an object store. That trade can still be good, but it is not free.
I also think readers should be careful about assuming POSIX-like semantics mean POSIX-like behavior under concurrency. For single-function or low-contention pipelines this looks great. For parallel workers sharing a repo-sized workspace, I would want some benchmarks and failure mode testing before treating the mount as the source of truth.
Still, this is a useful post because it focuses on the part people will actually care about in practice: does this make Lambda code less annoying to write. In many cases it probably does.
That /tmp tax really hits when dealing with large, transient files. For us, that's often voice data. When a user speaks 'kaaichal' (Tamil for 'fever') to GoDavaii, we're not just moving text; it's audio streams, processed by agents.
Eliminating the explicit s3.download_file for intermediate steps, where agents might pass chunks, streamlines...
The persistent-filesystem-on-Lambda story finally closes a gap I've been working around with S3-as-tmp for years. The honest cost of "stateless functions + remote storage for everything" was never the storage bill; it was the boilerplate to checkpoint partial work and the inevitable bug where step 4 silently re-ran step 2's side effects.
One thing I'd love more clarity on: cold-start behavior when the FS is non-trivial. If an agent has been writing intermediate artifacts to /mnt for a long-running task and the function gets reaped, does the next invocation get a warm view of that filesystem, or are you reattaching from a backing store on the first hit? The latency profile on that matters a lot for whether you can build a "resume where you left off" pattern, vs. needing a separate orchestrator to track which steps completed. Either way, this is the right direction — agents need somewhere to put their scratch work that isn't a database row.
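If the next invocation does see the previous one's files, a "resume where you left off" pattern without a separate orchestrator could look something like this sketch. Whether the mount actually gives a current view after a function is reaped is exactly the open question above, so treat that as an assumption; `run_steps` and the one-JSON-file-per-step layout are hypothetical.

```python
import json
import os


def run_steps(workdir, steps):
    """Run named steps in order, skipping any whose result file already exists."""
    os.makedirs(workdir, exist_ok=True)
    results = {}
    for name, fn in steps:
        result_path = os.path.join(workdir, f"{name}.json")
        if os.path.exists(result_path):
            # Completed in an earlier invocation: reuse it, do not re-run.
            with open(result_path) as f:
                results[name] = json.load(f)
            continue
        value = fn(results)  # each step sees the results of earlier steps
        tmp_path = result_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(value, f)
        os.replace(tmp_path, result_path)  # result appears only once complete
        results[name] = value
    return results
```

Because a step is skipped whenever its result file exists, a re-invocation does not repeat earlier steps' side effects, at the cost of requiring step results to be JSON-serializable.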
The shared S3 workspace approach is clever because it makes the coordination layer boring, which is exactly what you want. The gap I keep hitting with multi-agent Lambda setups is the networking piece. Once agents need to talk to each other across accounts or back to a local instance, you're wiring up auth and NAT from scratch per agent. Pilot Protocol (pilotprotocol.network) handles this. Agents get a persistent virtual address and encrypted tunnel regardless of which network they're on. Makes the communication layer feel as boring as the file access layer you described.
Putting a real filesystem under an agent changes the failure modes more than the capabilities. The interesting part is not what the agent can now write to disk; it is what happens when two invocations race on the same file and you discover the agent never had a concurrency model in its head. Lambda's previous statelessness was actually a feature for this exact reason. Worth thinking through which agent workloads benefit from persistence vs which were quietly relying on amnesia.