<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ryan</title>
    <description>The latest articles on DEV Community by Ryan (@ryan_e200dd10ede43c8fc2e4).</description>
    <link>https://dev.to/ryan_e200dd10ede43c8fc2e4</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3818554%2F2d7b5e3d-0c0a-48a3-9ac8-ddbb9458f478.png</url>
      <title>DEV Community: Ryan</title>
      <link>https://dev.to/ryan_e200dd10ede43c8fc2e4</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ryan_e200dd10ede43c8fc2e4"/>
    <language>en</language>
    <item>
      <title>How to Add 'Download All as ZIP' to Your SaaS in 30 Minutes</title>
      <dc:creator>Ryan</dc:creator>
      <pubDate>Tue, 17 Mar 2026 05:36:59 +0000</pubDate>
      <link>https://dev.to/ryan_e200dd10ede43c8fc2e4/how-to-add-download-all-as-zip-to-your-saas-in-30-minutes-5a</link>
      <guid>https://dev.to/ryan_e200dd10ede43c8fc2e4/how-to-add-download-all-as-zip-to-your-saas-in-30-minutes-5a</guid>
      <description>&lt;p&gt;Your users uploaded 200 photos. Now they want to download them all. What do you do?&lt;/p&gt;

&lt;p&gt;The naive approach — loop through files, zip them on your server, serve the result — falls apart fast. Memory spikes with large files. Egress fees add up. You need temp storage, cleanup jobs, and error handling for partial failures.&lt;/p&gt;

&lt;p&gt;I hit this exact wall building a file-sharing service that's now processed &lt;strong&gt;550K+ files and 10TB+&lt;/strong&gt; of archives. After weeks of wrestling with ZIP64, streaming, and Cloudflare Workers' 128MB memory limit, I turned my solution into an API. Here's how you can skip that pain entirely.&lt;/p&gt;




&lt;h2&gt;The 30-minute version&lt;/h2&gt;

&lt;h3&gt;Step 1: Get an API key&lt;/h3&gt;

&lt;p&gt;Sign up at &lt;a href="https://eazip.io?utm_source=devto&amp;amp;utm_medium=article&amp;amp;utm_campaign=download-all" rel="noopener noreferrer"&gt;eazip.io&lt;/a&gt; (free tier, no credit card). Grab your API key from the dashboard.&lt;/p&gt;

&lt;h3&gt;Step 2: Collect your file URLs&lt;/h3&gt;

&lt;p&gt;You already have these — they're in your database. They can be S3 presigned URLs, R2 public URLs, or any HTTPS endpoint that returns a file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;fileUrls&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;SELECT file_url, original_filename FROM uploads WHERE project_id = ?&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;projectId&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Step 3: One API call&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://api.eazip.io/jobs&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;X-API-Key&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;EAZIP_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;files&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;fileUrls&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;f&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;file_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;original_filename&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;})),&lt;/span&gt;
  &lt;span class="p"&gt;}),&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;job_id&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Step 4: Poll for completion and redirect&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;waitForZip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;jobId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;MAX_ATTEMPTS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;MAX_ATTEMPTS&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="s2"&gt;`https://api.eazip.io/jobs/&lt;/span&gt;&lt;span class="se"&gt;\$&lt;/span&gt;&lt;span class="s2"&gt;{jobId}&lt;/span&gt;&lt;span class="se"&gt;\`&lt;/span&gt;&lt;span class="s2"&gt;, {
      headers: { 'X-API-Key': process.env.EAZIP_API_KEY },
    });
    const { job } = await res.json();

    if (job.status === 'completed') return job.download_url;
    if (job.status === 'failed') throw new Error('ZIP job failed');

    await new Promise(r =&amp;gt; setTimeout(r, 2000)); // wait 2s
  }
  throw new Error('ZIP job timed out');
}

// In your route handler:
const downloadUrl = await waitForZip(job_id);
res.redirect(downloadUrl);
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Your users click "Download All" and get a ZIP a few seconds later.&lt;/p&gt;




&lt;h2&gt;What you didn't have to build&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ZIP64 support&lt;/strong&gt; — files over 4GB just work&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Streaming&lt;/strong&gt; — constant memory regardless of archive size&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Error recovery&lt;/strong&gt; — if file #500 fails, the job retries from a checkpoint&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Temp storage cleanup&lt;/strong&gt; — signed URLs expire automatically&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Egress optimization&lt;/strong&gt; — zero egress fees on Cloudflare's network&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;When this makes sense&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SaaS with user-uploaded files&lt;/strong&gt; — "Download all attachments" in a support ticket, bulk photo export from a gallery&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;E-commerce&lt;/strong&gt; — product image packs, digital goods delivery&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Internal tools&lt;/strong&gt; — compliance teams exporting 6 months of audit logs, database backups&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;When it doesn't&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;If you only serve a handful of small files — just zip them in memory&lt;/li&gt;
&lt;li&gt;If you need real-time streaming to the browser — this is async (job → download URL)&lt;/li&gt;
&lt;li&gt;If you need custom compression settings — eazip uses STORE (no compression) for speed&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;Try it&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://eazip.io?utm_source=devto&amp;amp;utm_medium=article&amp;amp;utm_campaign=download-all" rel="noopener noreferrer"&gt;eazip.io&lt;/a&gt; — free tier: 60 GB-days/month, no credit card.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Building an export feature? I'd love to hear about your use case — drop a comment or reach out.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>tutorial</category>
      <category>javascript</category>
      <category>node</category>
    </item>
    <item>
      <title>3 Ways to ZIP Files Stored on Cloudflare R2</title>
      <dc:creator>Ryan</dc:creator>
      <pubDate>Sun, 15 Mar 2026 05:05:18 +0000</pubDate>
      <link>https://dev.to/ryan_e200dd10ede43c8fc2e4/3-ways-to-zip-files-stored-on-cloudflare-r2-2b7d</link>
      <guid>https://dev.to/ryan_e200dd10ede43c8fc2e4/3-ways-to-zip-files-stored-on-cloudflare-r2-2b7d</guid>
      <description>&lt;p&gt;You have files sitting in Cloudflare R2 and a user just clicked "Download All." Now what?&lt;/p&gt;

&lt;p&gt;R2 doesn't have a built-in "zip these objects" operation. You need to figure it out yourself. After building a file processing API that has archived &lt;strong&gt;550K+ files and 10TB+&lt;/strong&gt; on R2, here are the three approaches I've found — each with very different trade-offs.&lt;/p&gt;




&lt;h2&gt;Approach 1: Pull to a Server and Use the &lt;code&gt;zip&lt;/code&gt; Command&lt;/h2&gt;

&lt;p&gt;The most straightforward approach. Spin up a container (Fargate, Cloud Run, EC2, etc.), pull the files from R2, and run the good old &lt;code&gt;zip&lt;/code&gt; command.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Pull files from R2 and zip them&lt;/span&gt;
aws s3 &lt;span class="nb"&gt;sync &lt;/span&gt;s3://your-r2-bucket/files/ /tmp/files/ &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--endpoint-url&lt;/span&gt; https://&amp;lt;account-id&amp;gt;.r2.cloudflarestorage.com
zip &lt;span class="nt"&gt;-r&lt;/span&gt; /tmp/archive.zip /tmp/files/
&lt;span class="c"&gt;# Upload the archive back to R2 or serve it directly&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dead simple.&lt;/strong&gt; &lt;code&gt;zip&lt;/code&gt; is battle-tested and handles everything — compression, large files, edge cases&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No ZIP implementation needed.&lt;/strong&gt; You're not writing any ZIP logic yourself&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Full control&lt;/strong&gt; over the server environment, compression level, file structure&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Disk and memory bound.&lt;/strong&gt; You need enough disk space to hold all the files + the archive. For large archives (10GB+), this means provisioning beefy instances&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Egress costs if your server isn't on Cloudflare.&lt;/strong&gt; Pulling files from R2 is free (R2 has zero egress fees), but once the ZIP lives on your AWS/GCP server, &lt;em&gt;serving it to the user&lt;/em&gt; or &lt;em&gt;uploading it back to R2&lt;/em&gt; means paying your cloud provider's egress fees&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Infrastructure overhead.&lt;/strong&gt; You need to manage containers, queues, autoscaling, and cleanup. It's no longer "just zip these files" — it's a whole pipeline&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Not real-time.&lt;/strong&gt; The user has to wait for the entire download + zip + upload cycle before they can start downloading&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Batch processing, internal tooling, or when you already have server infrastructure and egress costs aren't a concern.&lt;/p&gt;




&lt;h2&gt;Approach 2: Stream a ZIP in a Cloudflare Worker&lt;/h2&gt;

&lt;p&gt;Instead of pulling files to a server, you can stream a ZIP archive directly from a Cloudflare Worker. Libraries like &lt;a href="https://stuk.github.io/jszip/" rel="noopener noreferrer"&gt;JSZip&lt;/a&gt; and &lt;a href="https://github.com/101arrowz/fflate" rel="noopener noreferrer"&gt;fflate&lt;/a&gt; support streaming, so you can pipe R2 objects through them without buffering entire files.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Using a streaming ZIP library in a Worker&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;ZipWriter&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;some-streaming-zip-lib&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;keys&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;file1.pdf&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;file2.jpg&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;file3.csv&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;readable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;writable&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;TransformStream&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;zipWriter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ZipWriter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;writable&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;key&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;obj&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;BUCKET&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;zipWriter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addStream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;zipWriter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;})();&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;readable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/zip&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This works well for simple cases. But things get complicated fast when you need production-level reliability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Constant memory usage.&lt;/strong&gt; Only one file chunk in memory at a time&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero egress fees.&lt;/strong&gt; R2 → Worker → client, all within Cloudflare's network&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Streaming.&lt;/strong&gt; Client starts downloading immediately, no waiting for the full archive&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Horizontal scaling.&lt;/strong&gt; Workers handle many concurrent requests naturally — high throughput isn't a problem&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Per-archive size is limited.&lt;/strong&gt; Workers have a 15-minute wall clock limit and a subrequest cap per invocation, so large archives (tens of GB+) won't complete in a single run&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Error handling is brutal.&lt;/strong&gt; If file #500 of 1000 fails mid-stream, you've already sent 499 files to the client. The HTTP response is in-flight — you can't restart or send an error code. The client just gets a truncated ZIP&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Checkpoint/resume requires a custom ZIP implementation.&lt;/strong&gt; To work around the wall clock limit, you'd need to serialize mid-stream state — CRC32 computations, byte offsets, multipart upload progress — and resume exactly where you left off. At that point, off-the-shelf libraries won't cut it, and you're deep in the ZIP spec implementing local file headers, data descriptors, central directory, and ZIP64 extensions yourself&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Small-to-medium archives where the 15-minute wall clock limit isn't a concern. For anything larger, you'll need either serious engineering investment or a different approach.&lt;/p&gt;




&lt;h2&gt;Approach 3: Use a ZIP API Service&lt;/h2&gt;

&lt;p&gt;Instead of building and maintaining streaming ZIP infrastructure yourself, use an API that handles it for you. You send a list of R2 URLs (or presigned URLs), and get back a ZIP.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://api.eazip.io/jobs &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer YOUR_API_KEY"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "files": [
      { "url": "https://your-bucket.r2.dev/file1.pdf" },
      { "url": "https://your-bucket.r2.dev/file2.jpg" },
      { "url": "https://your-bucket.r2.dev/file3.csv" }
    ]
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The service handles streaming, CRC32, ZIP64, error recovery, and checkpoint/resume — all the hard parts — so you don't have to.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;One API call.&lt;/strong&gt; No ZIP implementation to build or maintain&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Handles edge cases&lt;/strong&gt; you don't want to think about (ZIP64 for large files, Data Descriptors, checkpoint/resume for failures)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero egress&lt;/strong&gt; if the service also runs on Cloudflare's network&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Scales to 5,000+ files per archive, up to 50GB&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Third-party dependency.&lt;/strong&gt; You're relying on an external service&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost.&lt;/strong&gt; Free tier exists but large-scale usage has costs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Less control&lt;/strong&gt; over ZIP structure details&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Teams that want ZIP functionality without building ZIP infrastructure. Ship in an afternoon instead of a sprint.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Full disclosure: I built &lt;a href="https://eazip.io?utm_source=devto&amp;amp;utm_medium=article&amp;amp;utm_campaign=3ways-r2" rel="noopener noreferrer"&gt;Eazip&lt;/a&gt; because I went through Approach 2 myself and realized most teams shouldn't have to.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;Comparison&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Server + zip&lt;/th&gt;
&lt;th&gt;Stream in Worker&lt;/th&gt;
&lt;th&gt;ZIP API&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Memory&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;O(total size)&lt;/td&gt;
&lt;td&gt;O(chunk size)&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Egress cost&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Depends on server location&lt;/td&gt;
&lt;td&gt;$0&lt;/td&gt;
&lt;td&gt;$0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Max archive size&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Limited by disk&lt;/td&gt;
&lt;td&gt;Limited by wall clock (15 min)&lt;/td&gt;
&lt;td&gt;50GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Implementation time&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Hours&lt;/td&gt;
&lt;td&gt;Hours–Weeks&lt;/td&gt;
&lt;td&gt;Minutes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Maintenance&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Medium (infra)&lt;/td&gt;
&lt;td&gt;High (ZIP spec edge cases)&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Error recovery&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Easy (retry all)&lt;/td&gt;
&lt;td&gt;Hard (mid-stream failures)&lt;/td&gt;
&lt;td&gt;Built-in&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;Which Should You Pick?&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Have server infrastructure and don't mind egress costs?&lt;/strong&gt; → Approach 1. Pull the files, run &lt;code&gt;zip&lt;/code&gt;, and move on. Just keep in mind that egress fees add up fast at scale — especially if you're on AWS or GCP.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Want to stay on Cloudflare's network?&lt;/strong&gt; → Approach 2 works great for small-to-medium archives. But once you hit the wall clock limit or need error recovery, complexity escalates quickly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Want to ship the feature and move on?&lt;/strong&gt; → Approach 3. One API call, zero infrastructure, zero egress. You can be done in an afternoon.&lt;/p&gt;

&lt;p&gt;The reality is that ZIP archiving looks simple until it isn't. What starts as "just zip these files" turns into managing disk space, egress bills, wall clock limits, or mid-stream error recovery — depending on which approach you choose. I learned this the hard way after archiving 10TB+ of files.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;What's your approach? Have you tried something different? Let me know in the comments.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>cloudflare</category>
      <category>r2</category>
      <category>webdev</category>
      <category>serverless</category>
    </item>
    <item>
      <title>Zero Egress: Why I Chose Cloudflare Workers + R2 Over AWS for a File Processing API</title>
      <dc:creator>Ryan</dc:creator>
      <pubDate>Thu, 12 Mar 2026 08:24:10 +0000</pubDate>
      <link>https://dev.to/ryan_e200dd10ede43c8fc2e4/zero-egress-why-i-chose-cloudflare-workers-r2-over-aws-for-a-file-processing-api-2640</link>
      <guid>https://dev.to/ryan_e200dd10ede43c8fc2e4/zero-egress-why-i-chose-cloudflare-workers-r2-over-aws-for-a-file-processing-api-2640</guid>
      <description>&lt;p&gt;Most cloud cost calculators focus on compute and storage. They rarely mention the line item that dominated my bill: &lt;strong&gt;egress fees&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;I run &lt;a href="https://eazip.io?utm_source=devto&amp;amp;utm_medium=article&amp;amp;utm_campaign=zero-egress" rel="noopener noreferrer"&gt;eazip.io&lt;/a&gt;, an API that takes an array of URLs and returns a ZIP archive. The service fetches remote files, streams them into a ZIP64 archive, and stores the result for download. In production, it has processed 550K+ files totaling over 10 TB of archived data.&lt;/p&gt;

&lt;p&gt;When I was choosing where to build this, I modeled the costs across AWS, GCP, Azure, and Cloudflare. The results changed my entire architecture.&lt;/p&gt;

&lt;h2&gt;The Workload&lt;/h2&gt;

&lt;p&gt;Here's what happens on every API call:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Fetch&lt;/strong&gt; remote files (1 to 5,000 URLs per job)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stream&lt;/strong&gt; them into a ZIP64 archive (up to 50 GB per job)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Store&lt;/strong&gt; the archive for the customer to download&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Serve&lt;/strong&gt; the download (this is where egress hits)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The critical insight: every byte of the final archive gets transferred &lt;strong&gt;twice&lt;/strong&gt; — once during creation (fetch → process → store) and once during download (store → customer). For a file processing API, egress isn't a rounding error. It's the dominant cost.&lt;/p&gt;

&lt;h2&gt;Egress Pricing Comparison&lt;/h2&gt;

&lt;p&gt;Here's what the major providers charge for data transfer out to the internet (as of early 2026, standard pricing tiers):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Provider&lt;/th&gt;
&lt;th&gt;Egress Cost (per GB)&lt;/th&gt;
&lt;th&gt;Free Tier&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AWS S3&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$0.09/GB (first 10 TB)&lt;/td&gt;
&lt;td&gt;100 GB/month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GCP Cloud Storage&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$0.12/GB (first 1 TB)&lt;/td&gt;
&lt;td&gt;Free tier egress limited&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Azure Blob Storage&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$0.087/GB&lt;/td&gt;
&lt;td&gt;5 GB/month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cloudflare R2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0.00/GB&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Unlimited&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That's not a typo. R2 charges zero egress. You pay only for storage ($0.015/GB/month) and operations (Class A: $4.50/million, Class B: $0.36/million).&lt;/p&gt;
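&lt;p&gt;Those unit prices are easy to sanity-check. A quick sketch of the R2 side of the bill (a back-of-the-envelope helper using the prices quoted above, not an official calculator):&lt;/p&gt;

```javascript
// R2 unit prices as quoted above
const STORAGE_PER_GB_MONTH = 0.015;
const CLASS_A_PER_MILLION = 4.5;   // writes
const CLASS_B_PER_MILLION = 0.36;  // reads

function r2MonthlyCost(storedGB, classAOps, classBOps) {
  // Storage plus operations; note there is no egress term at all.
  return storedGB * STORAGE_PER_GB_MONTH
    + (classAOps / 1e6) * CLASS_A_PER_MILLION
    + (classBOps / 1e6) * CLASS_B_PER_MILLION;
}

console.log(r2MonthlyCost(100, 0, 0).toFixed(2)); // "1.50"
```

&lt;p&gt;The function is missing the variable that dominates every other provider's bill: bytes out.&lt;/p&gt;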

&lt;h2&gt;Scenario 1: The Single Large Job&lt;/h2&gt;

&lt;p&gt;Let's model a realistic scenario: a customer creates a 10 GB ZIP archive and downloads it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AWS S3:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Storage: negligible (temporary)&lt;/li&gt;
&lt;li&gt;Egress: 10 GB × $0.09 = &lt;strong&gt;$0.90&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;If they download it 10 times (sharing with team): &lt;strong&gt;$9.00&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cloudflare R2:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Storage: negligible&lt;/li&gt;
&lt;li&gt;Egress: 10 GB × $0.00 = &lt;strong&gt;$0.00&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;10 downloads: still &lt;strong&gt;$0.00&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One job, one customer, $9 difference. Now multiply that across hundreds of customers.&lt;/p&gt;

&lt;h2&gt;Scenario 2: Monthly Sustained Usage&lt;/h2&gt;

&lt;p&gt;A more realistic model: 1,000 jobs per month, average 1 GB per archive, each downloaded twice.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Cost Component&lt;/th&gt;
&lt;th&gt;AWS S3&lt;/th&gt;
&lt;th&gt;Cloudflare R2&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Storage (temp, ~100 GB avg)&lt;/td&gt;
&lt;td&gt;$2.30&lt;/td&gt;
&lt;td&gt;$1.50&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Egress (2 TB out)&lt;/td&gt;
&lt;td&gt;$180.00&lt;/td&gt;
&lt;td&gt;$0.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Operations&lt;/td&gt;
&lt;td&gt;~$0.50&lt;/td&gt;
&lt;td&gt;~$2.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Monthly Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$182.80&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$3.50&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Egress is &lt;strong&gt;98.5%&lt;/strong&gt; of the AWS bill. The actual compute and storage costs are almost identical between providers. The entire cost difference comes from one line item.&lt;/p&gt;

&lt;h2&gt;
  
  
  Scenario 3: What Happens at Scale
&lt;/h2&gt;

&lt;p&gt;This is where it gets scary. Let's say the service grows to 10,000 jobs/month (20 TB egress):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scale&lt;/th&gt;
&lt;th&gt;AWS Egress&lt;/th&gt;
&lt;th&gt;R2 Egress&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;2 TB/month&lt;/td&gt;
&lt;td&gt;$180&lt;/td&gt;
&lt;td&gt;$0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;20 TB/month&lt;/td&gt;
&lt;td&gt;$1,740&lt;/td&gt;
&lt;td&gt;$0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;200 TB/month&lt;/td&gt;
&lt;td&gt;$16,200&lt;/td&gt;
&lt;td&gt;$0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;On AWS, egress grows linearly with success. Every new customer, every additional download directly increases your largest cost. On R2, success doesn't punish you.&lt;/p&gt;

&lt;p&gt;For an API business charging $9-29/month, an egress bill of $1,740 makes the unit economics impossible.&lt;/p&gt;

&lt;h2&gt;
  
  
  "Just Put a CDN in Front of S3"
&lt;/h2&gt;

&lt;p&gt;This is the first objection everyone raises. And for many workloads, it's the right answer. But for a ZIP archive API, it doesn't work.&lt;/p&gt;

&lt;p&gt;Every archive is &lt;strong&gt;unique&lt;/strong&gt;. Customer A's ZIP contains different files than Customer B's. The cache hit rate for unique, per-customer archives is effectively &lt;strong&gt;zero&lt;/strong&gt;. CloudFront, Fastly, or any CDN will just pass through to S3, and you still pay S3 egress to the CDN.&lt;/p&gt;

&lt;p&gt;CDN caching works when many users request the same content. When every response is unique — which is the case for any file processing, transformation, or generation API — the CDN is just an expensive proxy.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Hidden Costs You Don't See Coming
&lt;/h2&gt;

&lt;p&gt;Egress isn't the only surprise on an AWS bill for this kind of workload:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;NAT Gateway fees&lt;/strong&gt;: If your processing runs in a VPC (ECS, Lambda in a VPC), data passing through a NAT Gateway costs $0.045/GB on top of regular egress. For a service processing terabytes, that's another 50% stacked on the $0.09/GB egress rate.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cross-region transfer&lt;/strong&gt;: If your compute is in us-east-1 but your customer's files are elsewhere, you pay $0.01-0.02/GB for inter-region transfer — before any processing even begins.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Temporary storage I/O&lt;/strong&gt;: EBS volumes have I/O costs. If you're writing temporary files during ZIP creation, those IOPS add up.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lambda execution time&lt;/strong&gt;: A 10 GB ZIP takes minutes to create. At Lambda pricing, long-running file processing jobs are expensive. And Lambda has a 15-minute timeout, so you need Step Functions or ECS for large jobs.&lt;/p&gt;

&lt;p&gt;On Cloudflare:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Workers → R2: &lt;strong&gt;free&lt;/strong&gt; (same network)&lt;/li&gt;
&lt;li&gt;R2 → customer: &lt;strong&gt;free&lt;/strong&gt; (zero egress)&lt;/li&gt;
&lt;li&gt;No NAT Gateway (Workers run at edge)&lt;/li&gt;
&lt;li&gt;No cross-region transfer (R2 is globally distributed)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Architecture That Emerged
&lt;/h2&gt;

&lt;p&gt;The zero-egress model didn't just save money — it shaped the architecture:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Customer Request
      ↓
  [Cloudflare Worker]
      ↓ (fetch remote files)
  [Stream → ZIP64]
      ↓ (multipart upload, free)
  [Cloudflare R2]
      ↓ (download, free)
  Customer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every data transfer in this pipeline is free. The only costs are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Workers compute&lt;/strong&gt;: $0.30 per million requests + CPU time&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;R2 storage&lt;/strong&gt;: $0.015/GB/month&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;R2 operations&lt;/strong&gt;: pennies per thousand&lt;/li&gt;
&lt;/ul&gt;
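Those line items can be sketched as a quick cost model. This uses the published rates quoted above; the function name and the simplified billing (no included free allowances, no CPU-time component) are my own illustration, not Cloudflare's billing logic:

```typescript
// Rough Cloudflare-side monthly cost model (a sketch using the
// article's quoted rates; request/CPU billing is simplified).
function cloudflareMonthlyCost(opts: {
  requestsMillions: number;
  storageGB: number;
  classAOpsMillions: number; // writes, lists
  classBOpsMillions: number; // reads
}): number {
  return (
    opts.requestsMillions * 0.3 +   // Workers requests, $0.30/million
    opts.storageGB * 0.015 +        // R2 storage, $0.015/GB-month
    opts.classAOpsMillions * 4.5 +  // R2 Class A, $4.50/million
    opts.classBOpsMillions * 0.36   // R2 Class B, $0.36/million
  );
}

// e.g. 1M requests, 100 GB stored, 0.5M writes, 2M reads:
console.log(
  cloudflareMonthlyCost({
    requestsMillions: 1,
    storageGB: 100,
    classAOpsMillions: 0.5,
    classBOpsMillions: 2,
  }).toFixed(2)
); // → "4.77"
```

Note what's missing from the model entirely: an egress term.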

&lt;p&gt;Total monthly bill for 550K+ files processed and 10 TB+ archived: &lt;strong&gt;single-digit dollars&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The same workload on AWS would cost an estimated $200-400/month minimum, scaling to $1,000+ as usage grows. And that's before the NAT Gateway surprise.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Tradeoffs
&lt;/h2&gt;

&lt;p&gt;Cloudflare Workers aren't free of constraints. I want to be honest about what you give up:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;128 MB memory limit&lt;/strong&gt;: You can't buffer large files in memory. I had to build a fully streaming ZIP64 implementation. (I wrote about the technical details in my &lt;a href="https://dev.to/ryan_e200dd10ede43c8fc2e4/how-i-built-streaming-zip64-on-cloudflare-workers-128mb-ram-no-filesystem-3aaf"&gt;previous article&lt;/a&gt;.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CPU time limits&lt;/strong&gt;: Workers get seconds of CPU time, not minutes. For large jobs, I built a checkpoint/resume system that serializes state to R2 and picks up where it left off.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No filesystem&lt;/strong&gt;: V8 isolates don't have &lt;code&gt;fs&lt;/code&gt;. Everything must be streamed or held in memory.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Vendor lock-in&lt;/strong&gt;: Your code is tied to Cloudflare's runtime. Workers are not standard Node.js — they're a subset of Web APIs. Migrating to AWS Lambda would require significant rewriting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Smaller ecosystem&lt;/strong&gt;: Fewer libraries, fewer examples, smaller community compared to AWS Lambda.&lt;/p&gt;

&lt;p&gt;For my use case — a data-heavy API where egress dominates costs — these tradeoffs were worth it. For a CRUD app with minimal data transfer, the cost difference wouldn't matter, and AWS/GCP's richer ecosystem might be more valuable.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Decision Framework
&lt;/h2&gt;

&lt;p&gt;Here's how I'd think about it for your workload:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Choose Cloudflare Workers + R2 when:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Data transfer is a significant portion of your workload&lt;/li&gt;
&lt;li&gt;You serve large files or many downloads&lt;/li&gt;
&lt;li&gt;Each response is unique (low CDN cache hit rate)&lt;/li&gt;
&lt;li&gt;You need global low-latency without multi-region setup&lt;/li&gt;
&lt;li&gt;Cost predictability matters (no surprise egress bills)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Stay with AWS/GCP when:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You need specific managed services (RDS, SQS, etc.)&lt;/li&gt;
&lt;li&gt;Your workload is compute-heavy, not data-heavy&lt;/li&gt;
&lt;li&gt;Egress is a small fraction of total cost&lt;/li&gt;
&lt;li&gt;You need the mature ecosystem and tooling&lt;/li&gt;
&lt;li&gt;Your team already has deep AWS/GCP expertise&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Quick Cost Calculator
&lt;/h2&gt;

&lt;p&gt;For a rough estimate of your potential savings, here's a simple formula:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Monthly egress savings = Monthly data out (GB) × $0.09

If you also use NAT Gateway:
Add: Monthly data through NAT (GB) × $0.045
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If that number is more than your compute costs, egress is your dominant expense and it's worth evaluating R2.&lt;/p&gt;
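The formula above is trivial to script. Here's a sketch using the rates from this article; the function name is just for illustration:

```typescript
// Estimated monthly savings from moving egress-heavy traffic to
// zero-egress storage. Rates: S3 internet egress $0.09/GB (first
// 10 TB tier), NAT Gateway data processing $0.045/GB.
function egressSavings(dataOutGB: number, natGatewayGB = 0): number {
  const S3_EGRESS_PER_GB = 0.09;
  const NAT_PER_GB = 0.045;
  return dataOutGB * S3_EGRESS_PER_GB + natGatewayGB * NAT_PER_GB;
}

// Scenario 2 from above: 2 TB out per month, no NAT Gateway.
console.log(egressSavings(2000)); // → 180
```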

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;p&gt;After 6+ months in production:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;550K+ files processed&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;10 TB+ total data archived&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Monthly infrastructure cost: single-digit dollars&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Estimated AWS equivalent: $200-400/month&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Zero egress bills, ever&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The service handles jobs ranging from 1 file to 5,000 files, with archives up to 50 GB. The checkpoint/resume system means large jobs complete reliably despite Worker timeout limits. And the cost stays flat regardless of how many times customers download their archives.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;If you're building something that needs to create ZIP archives from remote URLs, &lt;a href="https://eazip.io?utm_source=devto&amp;amp;utm_medium=article&amp;amp;utm_campaign=zero-egress" rel="noopener noreferrer"&gt;eazip.io&lt;/a&gt; wraps this entire architecture into a single API call. Free tier available, no credit card required.&lt;/p&gt;

&lt;p&gt;But even if you don't need a ZIP API — if you're building any data-heavy service, model your egress costs before choosing a provider. It might be the most important line item on your bill.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Questions about the cost analysis or the Cloudflare Workers architecture? Drop a comment — happy to share more specific numbers.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>webdev</category>
      <category>cloudflarechallenge</category>
    </item>
    <item>
      <title>How I Built Streaming ZIP64 on Cloudflare Workers (128MB RAM, No Filesystem)</title>
      <dc:creator>Ryan</dc:creator>
      <pubDate>Wed, 11 Mar 2026 14:08:08 +0000</pubDate>
      <link>https://dev.to/ryan_e200dd10ede43c8fc2e4/how-i-built-streaming-zip64-on-cloudflare-workers-128mb-ram-no-filesystem-3aaf</link>
      <guid>https://dev.to/ryan_e200dd10ede43c8fc2e4/how-i-built-streaming-zip64-on-cloudflare-workers-128mb-ram-no-filesystem-3aaf</guid>
      <description>&lt;p&gt;I needed to ZIP 1,000+ files totaling 10 GB stored in Cloudflare R2. The catch: I had to do it on Cloudflare Workers -- 128 MB memory, no filesystem, no long-running processes. Every existing solution I tried failed. So I built my own streaming ZIP64 archiver from scratch.&lt;/p&gt;

&lt;p&gt;This is the story of how that archiver became &lt;a href="https://eazip.io?utm_source=devto&amp;amp;utm_medium=article&amp;amp;utm_campaign=zip64" rel="noopener noreferrer"&gt;eazip.io&lt;/a&gt;, and the technical decisions behind it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;I run a file-sharing service. Users upload files to Cloudflare R2, and sometimes they want to download hundreds or thousands of files at once. The obvious answer is a ZIP file. The not-so-obvious part is where to create it.&lt;/p&gt;

&lt;p&gt;The constraints were brutal:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;128 MB memory limit&lt;/strong&gt; per Worker invocation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No filesystem&lt;/strong&gt; -- Workers run in V8 isolates, not containers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Subrequest limits&lt;/strong&gt; -- every &lt;code&gt;fetch()&lt;/code&gt;, R2 read, and database call counts toward a per-invocation cap&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CPU time limits&lt;/strong&gt; -- you get seconds, not minutes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Files up to 4 GB+&lt;/strong&gt; -- meaning ZIP64 is mandatory, not optional&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And one more thing: I wanted to avoid egress fees entirely. If I sent R2 data through AWS ECS to build the ZIP, I'd pay AWS egress on every byte. Cloudflare Workers talking to Cloudflare R2 costs nothing in bandwidth. That economic constraint shaped the entire architecture.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Existing Solutions Failed
&lt;/h2&gt;

&lt;p&gt;I evaluated every ZIP library I could find for the Workers environment.&lt;/p&gt;

&lt;h3&gt;
  
  
  zip.js -- Almost Perfect
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://github.com/gildas-lormeau/zip.js" rel="noopener noreferrer"&gt;zip.js&lt;/a&gt; supports ZIP64 and streaming. It looked ideal. But internally it chains &lt;code&gt;TransformStream&lt;/code&gt; instances using &lt;code&gt;pipeTo()&lt;/code&gt;, and Cloudflare's &lt;code&gt;workerd&lt;/code&gt; runtime doesn't implement inter-TransformStream &lt;code&gt;pipeTo()&lt;/code&gt;. I tried shimming &lt;code&gt;globalThis.TransformStream&lt;/code&gt; with &lt;code&gt;IdentityTransformStream&lt;/code&gt; and manually implementing &lt;code&gt;pipeTo()&lt;/code&gt; in a wrapper. Memory usage exploded. The fundamental issue is that Workers' &lt;code&gt;TransformStream&lt;/code&gt; lacks backpressure propagation across chained transforms.&lt;/p&gt;

&lt;h3&gt;
  
  
  JSZip -- Wrong Model
&lt;/h3&gt;

&lt;p&gt;JSZip buffers the entire archive in memory before producing output. With a 128 MB limit and 10 GB of input files, that's a non-starter. It also doesn't support ZIP64 writes.&lt;/p&gt;

&lt;h3&gt;
  
  
  fflate -- No ZIP64
&lt;/h3&gt;

&lt;p&gt;fflate is fast and Workers-compatible, but it doesn't support ZIP64. Any archive over 4 GB (or with entries over 4 GB) is out.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Decision
&lt;/h3&gt;

&lt;p&gt;None of them worked. I needed to write a ZIP64 archiver from scratch, designed specifically for the Workers constraint model.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture: Streaming ZIP64 Without a Filesystem
&lt;/h2&gt;

&lt;p&gt;The key insight is that the ZIP format is sequential enough to stream -- if you make the right tradeoffs.&lt;/p&gt;

&lt;h3&gt;
  
  
  How ZIP Files Work (The Short Version)
&lt;/h3&gt;

&lt;p&gt;A ZIP file is structured like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Local File Header + File Data] x N
[Central Directory]
[End of Central Directory]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Central Directory at the end is an index referencing all files. This is why ZIP files can be read without scanning the entire archive -- readers jump to the end first.&lt;/p&gt;

&lt;p&gt;For our purposes, this structure is actually a gift: we can &lt;strong&gt;stream Local File Headers and file data sequentially&lt;/strong&gt;, accumulate metadata in memory, then write the Central Directory at the end. The metadata per file is small (filename, offset, CRC32, sizes), so even 5,000 files fit comfortably in memory.&lt;/p&gt;
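To sanity-check that claim, here's a back-of-envelope sketch. The `EntryMeta` field names are illustrative assumptions (the real ZIP64 code uses `bigint` offsets), not the service's actual schema:

```typescript
// Per-entry metadata kept in memory while streaming.
interface EntryMeta {
  filename: string;
  localHeaderOffset: number; // bigint in the real ZIP64 code
  crc32: number;
  compressedSize: number;
  uncompressedSize: number;
}

// Back-of-envelope: what do 5,000 entries cost in memory?
const entries: EntryMeta[] = Array.from({ length: 5000 }, (_, i) => ({
  filename: `photos/IMG_${String(i).padStart(5, "0")}.jpg`,
  localHeaderOffset: i * 2_000_000,
  crc32: 0xdeadbeef,
  compressedSize: 2_000_000,
  uncompressedSize: 2_000_000,
}));

const mb = JSON.stringify(entries).length / (1024 * 1024);
console.log(`~${mb.toFixed(2)} MB of metadata for 5,000 entries`);
```

It comes out well under a megabyte as JSON, so the accumulated metadata is a rounding error next to the 128 MB limit.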

&lt;h3&gt;
  
  
  Data Descriptors: Write Now, Fill In Later
&lt;/h3&gt;

&lt;p&gt;Normally, a Local File Header must contain the CRC32 and compressed size &lt;em&gt;before&lt;/em&gt; the file data. That means you'd need to read the entire file first to compute these values, then write the header. With a 4 GB file, that's obviously impossible in 128 MB of RAM.&lt;/p&gt;

&lt;p&gt;The solution: &lt;strong&gt;Data Descriptors&lt;/strong&gt; (GPBF bit 3). This ZIP feature lets you write placeholder values in the header and append the real CRC32 and sizes &lt;em&gt;after&lt;/em&gt; the file data:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Local File Header (CRC=0, size=0)]  -- write immediately
[File Data stream]                    -- stream through
[Data Descriptor (real CRC, size)]   -- write after streaming
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here's the conceptual flow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nf"&gt;streamZipEntry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="nx"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;sourceStream&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ReadableStream&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nb"&gt;Uint8Array&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;AsyncGenerator&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nb"&gt;Uint8Array&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Write local file header with zeros for CRC/size&lt;/span&gt;
  &lt;span class="c1"&gt;// GPBF bit 3 signals "data descriptor follows"&lt;/span&gt;
  &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="nf"&gt;buildLocalFileHeader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;gpbf&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mh"&gt;0x0008&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;crc32&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;compressedSize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;uncompressedSize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="c1"&gt;// Stream file data, computing CRC32 on the fly&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;crc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="nx"&gt;n&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;sourceStream&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getReader&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;done&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;done&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nx"&gt;crc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;updateCrc32&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;crc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;size&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nc"&gt;BigInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;byteLength&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Pass through without buffering&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// Write ZIP64 data descriptor with real values&lt;/span&gt;
  &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="nf"&gt;buildZip64DataDescriptor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;crc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;size&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We use STORE mode (no compression). The data passes through the Worker chunk by chunk, with no buffering beyond the current chunk, so peak memory stays well under 128 MB regardless of file size.&lt;/p&gt;
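The `updateCrc32` helper referenced above can be the standard table-driven CRC-32. Here's a minimal sketch; note that the running state between calls is a single uint32, which is exactly what makes checkpointing it trivial:

```typescript
// Standard CRC-32 (the polynomial ZIP uses), table-driven.
const CRC_TABLE = new Uint32Array(256);
for (let n = 0; n < 256; n++) {
  let c = n;
  for (let k = 0; k < 8; k++) {
    c = c & 1 ? 0xedb88320 ^ (c >>> 1) : c >>> 1;
  }
  CRC_TABLE[n] = c;
}

// Incremental update: feed chunks as they arrive. `crc` starts at 0,
// and the returned value can be passed back in for the next chunk —
// or serialized into a checkpoint and resumed in another invocation.
function updateCrc32(crc: number, chunk: Uint8Array): number {
  let c = ~crc >>> 0; // invert to the internal representation
  for (let i = 0; i < chunk.length; i++) {
    c = CRC_TABLE[(c ^ chunk[i]) & 0xff] ^ (c >>> 8);
  }
  return ~c >>> 0; // invert back so calls compose
}
```

As a sanity check, `updateCrc32(0, bytes("123456789"))` yields `0xcbf43926`, the standard CRC-32 check value, whether you feed the bytes in one chunk or several.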

&lt;h3&gt;
  
  
  CRC32 Mid-Computation Serialization
&lt;/h3&gt;

&lt;p&gt;CRC32 is computed incrementally as data flows through. But what happens when a Worker hits its CPU time limit mid-file?&lt;/p&gt;

&lt;p&gt;CRC32 state is just a 32-bit integer. We serialize it along with the byte offset and resume from exactly where we left off:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;ZipCheckpoint&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// ZIP state&lt;/span&gt;
  &lt;span class="nl"&gt;currentFileIndex&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;currentFileOffset&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;bigint&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;crc32State&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Just a uint32&lt;/span&gt;
  &lt;span class="nl"&gt;entryMetadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;EntryMeta&lt;/span&gt;&lt;span class="p"&gt;[];&lt;/span&gt;

  &lt;span class="c1"&gt;// R2 multipart state&lt;/span&gt;
  &lt;span class="nl"&gt;uploadId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;uploadedParts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;partNumber&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;etag&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="p"&gt;}[];&lt;/span&gt;
  &lt;span class="nl"&gt;tailBuffer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Uint8Array&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// &amp;lt; 5 MB leftover&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This checkpoint gets serialized to R2 as JSON (with the tail buffer stored separately). When a new Worker invocation picks up the job, it deserializes the checkpoint and resumes streaming from the exact byte where the previous invocation stopped.&lt;/p&gt;

&lt;h3&gt;
  
  
  5 MB Boundary Buffering for R2 Multipart
&lt;/h3&gt;

&lt;p&gt;R2 (and S3) multipart uploads require each part to be at least 5 MB (except the last one). But streaming ZIP data doesn't naturally align to 5 MB boundaries. A Local File Header might be 120 bytes. A tiny file might be 2 KB.&lt;/p&gt;

&lt;p&gt;The solution: a tail buffer. We accumulate bytes until we hit 5 MB, then flush a part:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;flushToR2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Uint8Array&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;MultipartState&lt;/span&gt;
&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;void&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Append to tail buffer&lt;/span&gt;
  &lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tail&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;concat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tail&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// Flush when we have enough for an R2 part&lt;/span&gt;
  &lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tail&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;byteLength&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nx"&gt;PART_SIZE_MIN&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;part&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tail&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;PART_SIZE_MIN&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tail&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tail&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;PART_SIZE_MIN&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;uploaded&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;r2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;uploadPart&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;uploadId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;nextPartNumber&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="nx"&gt;part&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;uploadedParts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;partNumber&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;nextPartNumber&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;etag&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;uploaded&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;etag&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;nextPartNumber&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The tail buffer is always under 5 MB, so memory stays bounded.&lt;/p&gt;
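The `concat` helper assumed by `flushToR2` is straightforward. This is a sketch; a production version might pool buffers to reduce allocation churn:

```typescript
// Join two byte arrays into one (used to grow the tail buffer).
function concat(a: Uint8Array, b: Uint8Array): Uint8Array {
  const out = new Uint8Array(a.byteLength + b.byteLength);
  out.set(a, 0);
  out.set(b, a.byteLength);
  return out;
}
```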

&lt;h3&gt;
  
  
  Checkpoint/Resume: State Serialization to R2
&lt;/h3&gt;

&lt;p&gt;Workers can die at any point -- CPU limit, subrequest limit, or infrastructure issues. The system is designed around this assumption.&lt;/p&gt;

&lt;p&gt;The "Checkpoint Authoritative" model:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Stream data, flush R2 parts, accumulate metadata&lt;/li&gt;
&lt;li&gt;At a safe stopping point (e.g., every 128 MB of data processed), serialize the full state to R2&lt;/li&gt;
&lt;li&gt;Update the database with the checkpoint reference&lt;/li&gt;
&lt;li&gt;If the Worker dies before a checkpoint, we replay from the last checkpoint&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The checkpoint is the single source of truth. Any R2 parts uploaded after the last checkpoint are treated as non-existent on resume. This makes the system crash-safe without distributed transactions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Worker Instance 1:
  [stream 128MB] → checkpoint A → [stream 128MB] → checkpoint B → [dies]

Worker Instance 2 (resume):
  [load checkpoint B] → [stream from where B left off] → ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A Cron-based monitor watches for stalled jobs (Worker died without updating status) and spawns new Worker instances to resume them.&lt;/p&gt;
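The stall check itself reduces to a small predicate. This is a sketch of the idea only — the row shape, field names, and two-minute threshold are my assumptions, not the service's actual schema:

```typescript
// A job row as the cron monitor might see it (illustrative).
interface JobRow {
  status: "pending" | "running" | "complete" | "failed";
  lastProgressAt: number; // epoch ms, bumped at each checkpoint
}

// A job is stalled if it claims to be running but hasn't
// checkpointed within the timeout window.
function isStalled(job: JobRow, now: number, timeoutMs = 120_000): boolean {
  return job.status === "running" && now - job.lastProgressAt > timeoutMs;
}
```

A job flagged as stalled gets a fresh Worker invocation, which loads the last checkpoint from R2 and resumes streaming.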

&lt;h3&gt;
  
  
  Central Directory: The Finale
&lt;/h3&gt;

&lt;p&gt;After all files are streamed, we write the Central Directory and ZIP64 End of Central Directory records using the metadata we accumulated:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nf"&gt;finalize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="nx"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;EntryMeta&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;AsyncGenerator&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nb"&gt;Uint8Array&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cdOffset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;currentOffset&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;entry&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="nf"&gt;buildCentralDirectoryHeader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cdSize&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;currentOffset&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;cdOffset&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="c1"&gt;// ZIP64 End of Central Directory&lt;/span&gt;
  &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="nf"&gt;buildZip64EndOfCentralDirectory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;cdSize&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;cdOffset&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="nf"&gt;buildZip64Locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;cdOffset&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;cdSize&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="nf"&gt;buildEndOfCentralDirectory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;cdSize&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;cdOffset&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
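&lt;p&gt;As a concrete example, here is a sketch of what one of those record builders could look like -- the classic 22-byte End of Central Directory record per the ZIP specification (APPNOTE), not the service's actual code. Fields that overflow their 16- or 32-bit slots are set to the 0xFFFF/0xFFFFFFFF sentinels, telling readers to fall back to the ZIP64 records written just before:&lt;/p&gt;

```typescript
// Sketch of the classic End of Central Directory record (22 bytes, no comment).
// Overflowing fields get sentinel values; the real numbers live in the
// ZIP64 EOCD record that precedes this one.
function buildEndOfCentralDirectory(
  entryCount: number,
  cdSize: number,
  cdOffset: number
): Uint8Array {
  const buf = new Uint8Array(22);
  const view = new DataView(buf.buffer);
  const count16 = entryCount > 0xffff ? 0xffff : entryCount;
  view.setUint32(0, 0x06054b50, true); // EOCD signature
  view.setUint16(4, 0, true);          // number of this disk
  view.setUint16(6, 0, true);          // disk where the CD starts
  view.setUint16(8, count16, true);    // entries on this disk
  view.setUint16(10, count16, true);   // total entries
  view.setUint32(12, cdSize > 0xffffffff ? 0xffffffff : cdSize, true);     // CD size
  view.setUint32(16, cdOffset > 0xffffffff ? 0xffffffff : cdOffset, true); // CD offset
  view.setUint16(20, 0, true);         // comment length
  return buf;
}
```

All fields are little-endian, as the ZIP format requires.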



&lt;p&gt;The whole ZIP never exists in memory at once. It flows from source URLs, through the Worker, into R2 parts, and gets assembled into a complete file -- all without ever exceeding a few megabytes of RAM.&lt;/p&gt;

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;p&gt;This architecture has been running in production as the backend for &lt;a href="https://eazip.io?utm_source=devto&amp;amp;utm_medium=article&amp;amp;utm_campaign=zip64" rel="noopener noreferrer"&gt;eazip.io&lt;/a&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;550K+ files processed&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;10 TB+ total data archived&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Up to 5,000 files and 50 GB per job&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero egress fees&lt;/strong&gt; (Workers + R2 = no bandwidth charges)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Peak memory well under 128 MB&lt;/strong&gt; regardless of job size&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The checkpoint/resume mechanism means jobs survive Worker restarts gracefully. A 50 GB ZIP that takes many Worker invocations to complete will checkpoint and resume automatically, with no manual intervention.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The ZIP format is more streaming-friendly than it looks.&lt;/strong&gt; Data Descriptors were designed in 1993 for tape drives, but they solve exactly the same problem we have in serverless: you can't seek backward.&lt;/p&gt;
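&lt;p&gt;Concretely, a ZIP64 Data Descriptor is just 24 bytes written &lt;em&gt;after&lt;/em&gt; each file's data, carrying the CRC-32 and sizes you only know once the bytes have streamed past. A sketch per the ZIP specification (the builder name is mine, not from the article's codebase):&lt;/p&gt;

```typescript
// ZIP64 Data Descriptor: signature, CRC-32, then 8-byte sizes.
// Emitted after the file data, so nothing upstream needs patching --
// the same trick that served tape drives serves serverless streams.
function buildZip64DataDescriptor(
  crc32: number,
  compressedSize: number,
  uncompressedSize: number
): Uint8Array {
  const buf = new Uint8Array(24);
  const view = new DataView(buf.buffer);
  view.setUint32(0, 0x08074b50, true);                // descriptor signature
  view.setUint32(4, crc32 >>> 0, true);               // CRC-32 of the file data
  view.setBigUint64(8, BigInt(compressedSize), true); // compressed size (ZIP64: 8 bytes)
  view.setBigUint64(16, BigInt(uncompressedSize), true);
  return buf;
}
```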

&lt;p&gt;&lt;strong&gt;Serializable state is the key to serverless resilience.&lt;/strong&gt; If you can serialize your entire computation state to a few kilobytes of JSON plus a small buffer, you can resume anywhere, on any instance, after any failure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Constraints breed creativity.&lt;/strong&gt; I never would have built this architecture if I had a 16 GB VM with a filesystem. The Workers limitations forced a design that's actually more resilient and cost-efficient than the traditional approach.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Egress fees are a hidden tax on cloud architectures.&lt;/strong&gt; Keeping compute and storage on the same provider's network isn't just a performance optimization -- it's an economic one. A 10 GB ZIP downloaded 100 times would cost roughly $90 in AWS egress. On Workers + R2, it costs $0.&lt;/p&gt;
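&lt;p&gt;The arithmetic behind that figure, assuming AWS's roughly $0.09/GB internet egress tier:&lt;/p&gt;

```typescript
// Back-of-envelope egress cost: a 10 GB archive downloaded 100 times.
const archiveGb = 10;
const downloads = 100;
const awsEgressPerGb = 0.09; // approximate on-demand internet egress rate
const awsCost = archiveGb * downloads * awsEgressPerGb; // ~$90
const r2Cost = 0;            // Workers + R2 charge no egress bandwidth
```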

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;If you need to create ZIP files from remote URLs without managing servers, temp files, or cleanup jobs, &lt;a href="https://eazip.io?utm_source=devto&amp;amp;utm_medium=article&amp;amp;utm_campaign=zip64" rel="noopener noreferrer"&gt;eazip.io&lt;/a&gt; wraps all of this into a single API call:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://api.eazip.io/jobs &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"X-API-Key: &lt;/span&gt;&lt;span class="nv"&gt;$API_KEY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "files": [
      {"url": "https://cdn.example.com/report-q1.pdf"},
      {"url": "https://cdn.example.com/report-q2.pdf", "filename": "Q2.pdf"}
    ]
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There's a free tier (no credit card required) and the &lt;a href="https://eazip.io/getting-started/quickstart/?utm_source=devto&amp;amp;utm_medium=article&amp;amp;utm_campaign=zip64" rel="noopener noreferrer"&gt;documentation&lt;/a&gt; covers the full API.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Have questions about the ZIP format internals or the Workers architecture? Drop a comment -- happy to go deeper on any part of this.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>serverless</category>
      <category>webdev</category>
      <category>architecture</category>
    </item>
  </channel>
</rss>
