DEV Community

Polliog
Your API Responses Are 40x Larger Than They Need to Be

I was profiling a production API last year when I noticed something that should have been obvious from the start: the response body for a simple list endpoint was 2.4MB. The actual useful data? About 60KB.

The rest was a mix of unused fields, redundant nesting, and no compression. It had been that way since day one, and nobody had noticed because on localhost it's fast enough that it doesn't register.

This is not a rare situation. It's the default.

The Three Ways APIs Bloat Their Responses

1. No Compression

This is the easiest win and the most commonly skipped.

HTTP has supported gzip compression since 1999. Brotli has been in all major browsers since 2017. Most APIs don't enable either by default.

# Check if your API compresses responses
curl -s -o /dev/null -w "%{size_download}" https://api.example.com/users
curl -s -o /dev/null -w "%{size_download}" -H "Accept-Encoding: gzip" https://api.example.com/users

If both numbers are the same, you're not compressing.

The fix in Node.js with Fastify:

import fastifyCompress from "@fastify/compress";

await app.register(fastifyCompress, {
  global: true,
  encodings: ["br", "gzip", "deflate"],
  threshold: 1024, // don't compress responses under 1KB
});

And with Express:

import compression from "compression";

app.use(compression({
  level: 6,
  threshold: 1024,
  filter: (req, res) => {
    if (req.headers["x-no-compression"]) return false;
    return compression.filter(req, res);
  },
}));

Real numbers from a list endpoint returning 500 records:

Format            Size
No compression    2.4MB
gzip              280KB
brotli            210KB

That's an 8-11x reduction with zero changes to your data model, zero changes to clients, and about 10 lines of code.

Note: Handling compression in Node.js works well for most setups. At large scale, offloading it to your reverse proxy (NGINX) or CDN (Cloudflare) saves CPU cycles since Node.js is single-threaded and compression is CPU-intensive. If you're already behind a proxy, check whether it's compressing for you before adding it in Node.js too.

2. Over-fetching

Every ORM makes it trivial to return entire database rows. Most codebases do exactly that.

// This returns every column in the users table
const users = await db.query("SELECT * FROM users");

// Including: password_hash, internal_flags, created_at, updated_at,
// deleted_at, last_ip, raw_oauth_payload, internal_notes...

The fix is not complicated - it just requires being deliberate:

// Return only what the client actually needs
const users = await db.query(`
  SELECT id, name, email, avatar_url, role
  FROM users
  WHERE organization_id = $1
  ORDER BY created_at DESC
`, [orgId]);

If you're using an ORM like Prisma, use select explicitly:

const users = await prisma.user.findMany({
  where: { organizationId },
  select: {
    id: true,
    name: true,
    email: true,
    avatarUrl: true,
    role: true,
  },
});

The temptation to use SELECT * or skip the select clause is real because it saves two minutes of typing. The cost is paid on every request, by every client, forever.
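You can quantify the cost without touching the database: serialize a representative row both ways and compare. A sketch with illustrative field names (substitute a real row from your own table):

```javascript
// Compare the serialized size of a full row vs. only the fields the
// client needs. Column names here are hypothetical examples.
const fullRow = {
  id: 42,
  name: "Ada",
  email: "ada@example.com",
  avatar_url: "https://cdn.example.com/a/42.png",
  role: "member",
  password_hash: "$2b$12$abcdefghijklmnopqrstuv",
  internal_flags: { beta: true, migrated: false },
  created_at: "2025-01-01T00:00:00Z",
  updated_at: "2025-06-01T00:00:00Z",
  deleted_at: null,
  last_ip: "203.0.113.7",
  raw_oauth_payload: { provider: "github", scopes: ["user", "repo"] },
  internal_notes: "churn risk, contacted 2025-05",
};

// Whitelist fields in; never blacklist fields out. A blacklist silently
// leaks every column added later.
const pick = (row, keys) =>
  Object.fromEntries(keys.map((k) => [k, row[k]]));

const slim = pick(fullRow, ["id", "name", "email", "avatar_url", "role"]);

const fullSize = Buffer.byteLength(JSON.stringify(fullRow));
const slimSize = Buffer.byteLength(JSON.stringify(slim));
console.log(`full: ${fullSize} B, slim: ${slimSize} B`);
```

Multiply the difference by every row in every list response and it stops looking like a rounding error. The whitelist-over-blacklist choice also means a newly added sensitive column stays private by default.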

3. Redundant Nesting and Metadata

This one is subtler. It's the pattern where every response wraps data in a consistent envelope, which is fine, but the envelope carries metadata that nobody uses.

{
  "success": true,
  "status": 200,
  "message": "OK",
  "timestamp": "2026-04-02T10:00:00Z",
  "requestId": "abc-123",
  "version": "v1",
  "data": {
    "items": [...],
    "meta": {
      "total": 1250,
      "page": 1,
      "perPage": 20,
      "lastPage": 63,
      "from": 1,
      "to": 20,
      "currentPage": 1,
      "hasNextPage": true,
      "hasPrevPage": false
    }
  }
}

success, status, message, and version duplicate what HTTP already tells the client through the status line and headers. currentPage is page renamed, from and to are derivable from page and perPage, and hasPrevPage is just page > 1.

A cleaner version:

{
  "data": [...],
  "pagination": {
    "total": 1250,
    "page": 1,
    "perPage": 20,
    "hasNext": true
  }
}

Less noise, same information, smaller payload.
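Every field the lean envelope drops is recomputable on the client from total, page, and perPage. A hypothetical client-side helper makes the point concrete:

```javascript
// All the dropped envelope fields, derived from the three values the
// lean response keeps. Hypothetical helper - adapt names to your client.
function derivePagination({ total, page, perPage }) {
  return {
    lastPage: Math.ceil(total / perPage),
    from: (page - 1) * perPage + 1,
    to: Math.min(page * perPage, total),
    hasPrev: page > 1,
    hasNext: page * perPage < total,
  };
}

const d = derivePagination({ total: 1250, page: 1, perPage: 20 });
// Matches the verbose envelope above: lastPage 63, from 1, to 20,
// hasPrev false, hasNext true - all without shipping the fields.
```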

Keyset Pagination vs. Offset Pagination

While we're talking about list endpoints, there's a related issue worth covering.

Offset pagination (LIMIT 20 OFFSET 200) requires the database to count and skip rows. On large tables this gets slow fast, and it also makes total count queries expensive.

-- This scans and counts the entire table
SELECT COUNT(*) FROM logs WHERE organization_id = $1;

-- On a table with 50M rows this can take 2-5 seconds

Keyset pagination avoids both problems:

// Instead of page/offset, use the last seen ID.
// Request one extra row to check if there's a next page.
const logs = await db.query(`
  SELECT id, timestamp, level, message
  FROM logs
  WHERE organization_id = $1
    AND id < $2
  ORDER BY id DESC
  LIMIT $3
`, [orgId, lastSeenId, pageSize + 1]);

// If we got more than pageSize rows, there's a next page.
// Trim the extra row before sending.
const hasNext = logs.length > pageSize;
const items = logs.slice(0, pageSize);

You lose the ability to jump to arbitrary pages, which matters for some UIs but not for most. You gain: fast queries at any depth, no COUNT(*) needed, and stable pagination even when rows are being inserted concurrently.
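One common refinement, not required by the query above: expose the last seen position as an opaque cursor rather than a raw ID, so clients treat it as a token instead of constructing their own. A minimal sketch using base64url (Node 16+):

```javascript
// Wrap the keyset position in an opaque cursor. The JSON wrapper leaves
// room to add more sort keys later without breaking clients.
const encodeCursor = (lastId) =>
  Buffer.from(JSON.stringify({ id: lastId })).toString("base64url");

const decodeCursor = (cursor) =>
  JSON.parse(Buffer.from(cursor, "base64url").toString("utf8")).id;

// Server puts this in the response; client echoes it back as ?cursor=...
const cursor = encodeCursor(18734);
const lastSeenId = decodeCursor(cursor);
```

The round trip is lossless, and because the cursor is opaque you're free to change its internals (add a timestamp tiebreaker, sign it) without a client-facing API change.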

Conditional Responses with ETags

If your data doesn't change between requests, the client shouldn't have to download it again.

import { createHash } from "crypto";

// In your route handler
const data = await fetchData(orgId);

// If your data has an updated_at field, use that instead of hashing
// the full payload - much cheaper on large responses.
// const etag = createHash("md5").update(`${data.id}:${data.updatedAt}`).digest("hex");

const etag = createHash("md5")
  .update(JSON.stringify(data))
  .digest("hex");

const clientEtag = req.headers["if-none-match"];

if (clientEtag === etag) {
  return res.status(304).send(); // Not Modified - zero payload
}

res
  .header("ETag", etag)
  .header("Cache-Control", "private, max-age=0, must-revalidate")
  .send(data);

One caveat: hashing the full JSON.stringify(data) on every request is expensive if your payload is large. If the data has an updated_at timestamp in the database, derive the ETag from that instead - hash(id + updated_at) is a constant-time operation regardless of payload size and avoids blocking the event loop.

For list endpoints where data changes frequently, this won't help much. For configuration endpoints, user profile data, or static reference data, a 304 response is massively cheaper than resending the same payload on every poll.

Putting It Together

The combined impact on that 2.4MB endpoint:

Change                         Size     Reduction
Baseline                       2.4MB    -
+ brotli compression           210KB    91%
+ select only needed fields    58KB     97.5%
+ clean response envelope      55KB     97.7%
+ ETag (no change)             0KB      100%

The 40x number in the title is real. Most of it comes from compression alone - the rest is just discipline.

None of these changes require a rewrite. They don't break existing clients. They're additive. The only cost is a bit of attention to defaults that most frameworks don't set correctly out of the box.

Start with compression. It takes 10 minutes and the numbers will surprise you.


Found something I got wrong, or a pattern that works better for your stack? Drop it in the comments.
