You have files sitting in Cloudflare R2 and a user just clicked "Download All." Now what?
R2 doesn't have a built-in "zip these objects" operation, so you have to figure it out yourself. After building a file-processing API that has archived 550K+ files and 10TB+ of data on R2, here are the three approaches I've found — each with very different trade-offs.
Approach 1: Download Locally, ZIP, Re-upload
The most obvious approach. Pull files down, zip them on your server, upload the archive back to R2 (or serve it directly).
```javascript
// Pseudocode — assumes `r2` is a client (e.g. an S3-compatible SDK)
// configured for your R2 bucket
const files = await Promise.all(
  keys.map(key => r2.get(key)) // every object gets buffered in memory
);
const zip = new JSZip();
files.forEach(f => zip.file(f.name, f.body));
const archive = await zip.generateAsync({ type: 'nodebuffer' });
// Serve or upload the archive
```
Pros:
- Simple to implement
- Works with any ZIP library (JSZip, archiver, etc.)
- Full control over the ZIP structure
Cons:
- Memory killer. You're buffering everything. 100 files × 50MB = 5GB in RAM
- Slow. Download all → process → upload/serve is sequential
- Egress costs if your server isn't on Cloudflare. Pulling files out of R2 is free (R2 has zero egress fees), but once the ZIP lives on your AWS/GCP server, serving it to the user (or uploading it anywhere) means paying that provider's egress fees
- Scaling requires extra infrastructure. You could run this on Fargate or similar, but you'd need to build the queue, orchestration, and autoscaling yourself — it's no longer "just zip these files"
Best for: Small archives (< 100MB total), infrequent use, prototyping.
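To make the egress con concrete, here's a back-of-the-envelope calculation. The $0.09/GB rate is an assumed ballpark for a typical cloud provider's first egress tier, not a quoted price; substitute your provider's actual pricing.

```javascript
// Rough monthly egress cost for serving ZIP archives from a non-Cloudflare
// server. EGRESS_PER_GB is an assumed ballpark rate, not a quoted price.
const EGRESS_PER_GB = 0.09;

function monthlyEgressCost(archiveSizeGB, downloadsPerMonth) {
  return archiveSizeGB * downloadsPerMonth * EGRESS_PER_GB;
}

// 5GB archives downloaded 1,000 times a month: about $450/month in egress
// alone, a line item that's $0 when everything stays on Cloudflare's network
console.log(monthlyEgressCost(5, 1000));
```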
Approach 2: Stream a ZIP in a Cloudflare Worker
Instead of buffering, you can construct a ZIP archive as a stream directly inside a Cloudflare Worker. The Worker fetches each file from R2 and pipes it into a ZIP stream that goes straight to the client.
```javascript
// Simplified concept — writeLocalFileHeader, crc32, writeDataDescriptor, and
// writeCentralDirectory are sketched helpers (each returns bytes written)
export default {
  async fetch(request, env, ctx) {
    const keys = ['file1.pdf', 'file2.jpg', 'file3.csv'];
    const { readable, writable } = new TransformStream();
    const writer = writable.getWriter();
    const entries = []; // per-file CRC, size, and offset for the central directory

    // Write asynchronously and return the Response first: the stream is
    // backpressured, so awaiting all the writes before returning would deadlock
    ctx.waitUntil((async () => {
      let offset = 0;
      for (const key of keys) {
        const obj = await env.BUCKET.get(key);
        const headerOffset = offset;
        // Sizes are unknown at this point, so the header sets general-purpose
        // flag bit 3 (sizes/CRC deferred to a trailing data descriptor)
        offset += await writeLocalFileHeader(writer, key);
        // Stream file data through, computing CRC32 on the fly
        const reader = obj.body.getReader();
        let crc = 0;
        let size = 0;
        while (true) {
          const { done, value } = await reader.read();
          if (done) break;
          crc = crc32(crc, value);
          size += value.byteLength;
          await writer.write(value);
          offset += value.byteLength;
        }
        offset += await writeDataDescriptor(writer, crc, size);
        entries.push({ key, crc, size, offset: headerOffset });
      }
      // Write central directory at the end
      await writeCentralDirectory(writer, entries);
      await writer.close();
    })());

    return new Response(readable, {
      headers: { 'Content-Type': 'application/zip' }
    });
  }
};
```
Pros:
- Constant memory usage. Only one file chunk in memory at a time
- Zero egress fees. R2 → Worker → client, all within Cloudflare's network
- Streaming. Client starts downloading immediately, no waiting for the full archive
- Horizontal scaling. Workers handle many concurrent requests naturally — high throughput isn't a problem
Cons:
- You have to implement ZIP yourself. Local file headers, data descriptors, CRC32, central directory, ZIP64 extensions — it's a lot of spec to get right
- 128MB Worker memory limit. You can't cheat with buffering even if you wanted to
- Per-archive size is limited. Workers have a 15-minute wall-clock limit and a subrequest cap per invocation, so large archives (tens of GB+) won't complete in a single run. Working around this means building a checkpoint/resume system: serializing CRC32 state mid-computation, byte offsets, and multipart upload state, then resuming exactly where you left off
- Error handling is brutal. If file #500 of 1000 fails mid-stream, you've already sent 499 files to the client. The HTTP response is in-flight — you can't restart or send an error code. The client just gets a truncated ZIP
- CRC32 must be computed on-the-fly since you can't seek back to update headers (Data Descriptors solve this, but not all ZIP readers support them well)
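That last con is worth dwelling on: the `crc32(crc, value)` call in the Worker example has to be incremental, folding each chunk into a running checksum without ever holding the whole file. A minimal table-based sketch of the CRC-32 variant ZIP uses (polynomial 0xEDB88320):

```javascript
// Incremental CRC-32 as required by the ZIP format (polynomial 0xEDB88320).
// Call repeatedly with each chunk; start with crc = 0.
const CRC_TABLE = new Uint32Array(256);
for (let n = 0; n < 256; n++) {
  let c = n;
  for (let k = 0; k < 8; k++) {
    c = c & 1 ? 0xEDB88320 ^ (c >>> 1) : c >>> 1;
  }
  CRC_TABLE[n] = c;
}

function crc32(crc, chunk) {
  let c = ~crc >>> 0; // undo the final inversion to resume mid-stream
  for (let i = 0; i < chunk.length; i++) {
    c = CRC_TABLE[(c ^ chunk[i]) & 0xff] ^ (c >>> 8);
  }
  return ~c >>> 0;
}
```

Feeding it chunk-by-chunk yields the same value as hashing the whole file at once, which is exactly what makes the streaming approach possible.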
Best for: Production systems that need to handle large archives at scale, if you're willing to invest in the implementation.
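To give a feel for the byte-level work "implement ZIP yourself" implies, here is one of the simpler records: the data descriptor written after each file's data when sizes weren't known up front. This sketch includes the optional (but widely expected) signature and assumes sizes fit in 32 bits; past ~4GB you need the ZIP64 form with 64-bit size fields.

```javascript
// Data descriptor: trails each file's data when general-purpose flag bit 3 is
// set. All fields are little-endian, per the ZIP specification.
function buildDataDescriptor(crc, compressedSize, uncompressedSize) {
  const view = new DataView(new ArrayBuffer(16));
  view.setUint32(0, 0x08074b50, true);        // optional signature "PK\x07\x08"
  view.setUint32(4, crc, true);               // CRC-32 of the uncompressed data
  view.setUint32(8, compressedSize, true);    // compressed size
  view.setUint32(12, uncompressedSize, true); // uncompressed size
  return new Uint8Array(view.buffer);
}
```

Every record in the format (local file headers, central directory entries, the end-of-central-directory record) is built the same way, just with more fields and more interactions between them, and that's where the weeks go.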
Approach 3: Use a ZIP API Service
Instead of building and maintaining streaming ZIP infrastructure yourself, use an API that handles it for you. You send a list of R2 URLs (or presigned URLs), and get back a ZIP.
```shell
curl -X POST https://api.eazip.io/jobs \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "files": [
      { "url": "https://your-bucket.r2.dev/file1.pdf" },
      { "url": "https://your-bucket.r2.dev/file2.jpg" },
      { "url": "https://your-bucket.r2.dev/file3.csv" }
    ]
  }'
```
The service handles streaming, CRC32, ZIP64, error recovery — everything from Approach 2 — so you don't have to.
Pros:
- One API call. No ZIP implementation to build or maintain
- Handles edge cases you don't want to think about (ZIP64 for large files, Data Descriptors, checkpoint/resume for failures)
- Zero egress if the service also runs on Cloudflare's network
- Scales to 5,000+ files per archive, up to 50GB
Cons:
- Third-party dependency. You're relying on an external service
- Cost. Free tier exists but large-scale usage has costs
- Less control over ZIP structure details
Best for: Teams that want ZIP functionality without building ZIP infrastructure. Ship in an afternoon instead of a sprint.
Full disclosure: I built Eazip because I went through Approach 2 myself and realized most teams shouldn't have to.
Comparison
| | Download + ZIP | Stream in Worker | ZIP API |
|---|---|---|---|
| Memory | O(total size) | O(chunk size) | N/A |
| Egress cost | Depends on server location | $0 | $0 |
| Max archive size | Limited by server RAM/disk | Limited by wall clock (15 min) | 50GB |
| Implementation time | Hours | Weeks | Minutes |
| Maintenance | Low | High (ZIP spec edge cases) | None |
| Error recovery | Easy (retry all) | Hard (mid-stream failures) | Built-in |
Which Should You Pick?
Prototyping or small files? → Approach 1. Just use JSZip and move on.
Production with scale requirements? → Approach 2 if you want full control and have the engineering bandwidth, or Approach 3 if you'd rather ship the feature and focus on your core product.
The reality is that ZIP is a deceptively complex format. What starts as "just zip these files" turns into weeks of handling CRC32 streaming, ZIP64 thresholds, Data Descriptor compatibility, and mid-stream error recovery. I learned this the hard way after archiving 10TB+ of files.
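The ZIP64 thresholds mentioned above are specific numbers: classic ZIP stores sizes and offsets in 32-bit fields and entry counts in 16-bit fields, with the maximum values (0xFFFFFFFF and 0xFFFF) reserved as "look in the ZIP64 extra field" sentinels. A sketch of the check, as an illustration of where the format flips over:

```javascript
// Classic ZIP field limits. Reaching either sentinel value forces the archive
// into ZIP64 records (8-byte sizes/offsets, 8-byte entry counts).
const MAX_UINT32 = 0xFFFFFFFF; // ~4GB size/offset limit
const MAX_UINT16 = 0xFFFF;     // 65,535 entry limit

function needsZip64(entryCount, largestFileBytes, totalArchiveBytes) {
  return entryCount >= MAX_UINT16 ||
         largestFileBytes >= MAX_UINT32 ||
         totalArchiveBytes >= MAX_UINT32; // offsets would exceed 32 bits
}
```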
What's your approach? Have you tried something different? Let me know in the comments.