Big news for serverless developers: AWS Lambda’s default response streaming limit has jumped from 20 MB to 200 MB.
That’s a 10× improvement, and it means fewer workarounds, less infrastructure, and simpler code.
What Changed?
Previously, if your Lambda needed to return more than 20 MB (e.g., PDFs, images, analytics dumps, AI output), you had to:
- Split or chunk the payload,
- Compress it, or
- Upload it to S3 and return a presigned URL.
That meant more code, higher latency, and extra moving parts.
Now:
You can stream up to 200 MB directly from Lambda to your client.
No S3 handoff. No chunking. Just stream data as it’s ready.
Quick Facts
| Limit Type | New | Old | Applies To |
|---|---|---|---|
| Response size (streaming) | 200 MB | 20 MB | Node.js managed & custom runtimes |
| Response size (classic) | 6 MB | 6 MB | Buffered responses only |
| Input payload size | 6 MB | 6 MB | Still capped |
Note: The 6 MB input limit still exists for requests into Lambda.
For uploads, use S3 presigned URLs or direct integrations.
Why This Matters
This change simplifies architectures for:
- Media Delivery — Serve full podcast episodes, videos, and big PDFs right from Lambda.
- Generative AI — Stream long text, images, or audio without waiting for completion.
- Data-heavy APIs — Deliver large CSVs, reports, or datasets without extra handling.
Because it’s streaming, users start receiving data immediately — improving Time-to-First-Byte (TTFB) and responsiveness.
Developer Tips
Use Streaming When:
- Responses are large and sequential.
- Processing produces output progressively (AI or analytics).

Mind the Input Limit:
- For uploads over 6 MB, use S3 presigned URLs or an API Gateway → S3 integration.

Runtime Support:
- Works out of the box in Node.js managed runtimes.
- Other languages require a custom runtime.
In Summary
AWS Lambda’s new 200 MB response streaming means:
- Less code
- Lower latency
- Fewer AWS services to manage
For most big-response use cases, Lambda can now be a direct delivery engine — no buckets, no detours.
Tip: Combine with CloudFront or edge-optimized APIs for blazing global performance.