Sweepbase
My Next.js 15 aggregator runs on a CSV file instead of a database

I spent six months building Sweepbase, a comparison site for 139 crypto debit and credit cards. The usual advice was to pick a managed Postgres or a document store. I ended up shipping it on a single CSV file checked into the repo. The site has been live for six months and the architecture has not caused a single incident. Here is how the pipeline actually works.

The dataset shape

Each row has 35 columns: issuer name, region, network, fees, cashback rules, custody model, and so on. The file is about 280 KB on disk. All 139 rows parse in under 10 ms on cold start. A linear scan is faster than any index I could build.

Writes happen maybe twice a week when an issuer changes a fee structure. There are no user-generated records. No admin dashboard in production. Every update is a git commit.

The pipeline

The runtime contract is Zod 4. PapaParse reads the raw CSV into loose record objects. Zod validates each row, coerces strings into typed fields, and drops invalid rows with a build-time warning.

```typescript
// lib/card-schema.ts
import { z } from "zod";

export const CardSchema = z.object({
  issuer: z.string(),
  region: z.enum(["USA", "EU", "UK", "LATAM", "APAC", "AFRICA", "MENA", "AU"]),
  network: z.enum(["Visa", "Mastercard", "Other"]),
  cashbackPct: z.coerce.number().min(0).max(15).default(0),
  fxFeePct: z.coerce.number().min(0).max(5).default(0),
  // z.coerce.boolean() runs Boolean() on the input, so the CSV string
  // "false" would coerce to true. Map the string explicitly instead.
  selfCustody: z.preprocess((v) => v === "true" || v === true, z.boolean()),
  // 25 more fields
});

export type Card = z.infer<typeof CardSchema>;
```

Every downstream file imports Card and gets full autocomplete plus compile-time checks. When I add a column, the type, the filter, the UI, and the JSON-LD generator update in one commit.
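That single-commit workflow is possible because every "query" is just a typed linear scan over the in-memory array. The helper functions and the trimmed-down `Card` shape below are my own illustration, not code from the repo:

```typescript
// Trimmed-down stand-in for the full 35-field Card type, for illustration.
type Card = {
  issuer: string;
  region: "USA" | "EU" | "UK" | "LATAM" | "APAC" | "AFRICA" | "MENA" | "AU";
  cashbackPct: number;
};

// With 139 rows, a filter or sort over the whole array is effectively free.
function byRegion(cards: Card[], region: Card["region"]): Card[] {
  return cards.filter((c) => c.region === region);
}

function topCashback(cards: Card[], n: number): Card[] {
  // Copy before sorting so the shared in-memory array is never mutated.
  return [...cards].sort((a, b) => b.cashbackPct - a.cashbackPct).slice(0, n);
}

const sample: Card[] = [
  { issuer: "A", region: "EU", cashbackPct: 2 },
  { issuer: "B", region: "USA", cashbackPct: 5 },
  { issuer: "C", region: "EU", cashbackPct: 4 },
];

console.log(byRegion(sample, "EU").length); // 2
console.log(topCashback(sample, 1)[0].issuer); // "B"
```

If a new column is added to the schema, the `Card` type changes and every helper that touches the removed or renamed field fails to compile, which is the whole point of the Zod-first pipeline.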

Caching at two layers

I wrap the parse function in React.cache() so the same logical request never re-parses the file.

```typescript
// lib/data.ts
import { cache } from "react";
import fs from "node:fs";
import path from "node:path";
import Papa from "papaparse";
import { CardSchema, type Card } from "./card-schema";

export const getCards = cache((): Card[] => {
  // Resolve from the project root so the path survives deployment.
  const raw = fs.readFileSync(path.join(process.cwd(), "data.csv"), "utf-8");
  const parsed = Papa.parse<Record<string, string>>(raw, {
    header: true,
    skipEmptyLines: true,
  });
  return parsed.data.flatMap((row) => {
    const result = CardSchema.safeParse(row);
    if (!result.success) {
      // Invalid rows are dropped; this warning surfaces in the build log.
      console.warn(`Dropping invalid row: ${result.error.issues[0]?.message}`);
      return [];
    }
    return [result.data];
  });
});
```

Above that sits ISR with a one-hour revalidation window. The first request after the window expires pays the parse cost; every subsequent request within the hour reads pre-rendered HTML from the Vercel edge. I have never had to reach for Redis or an extra CDN layer, because the CDN is already where Next.js puts the output.
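In the App Router, that upper layer is a single export. A sketch of the setup (the route name and render body here are my assumptions, not the real page):

```typescript
// app/cards/page.tsx — sketch; actual route and markup are assumptions.
export const revalidate = 3600; // re-render this route at most once per hour

import { getCards } from "@/lib/data";

export default function CardsPage() {
  // First request after the window pays the ~10 ms parse; React.cache
  // dedupes any further getCards() calls within the same render.
  const cards = getCards();
  // ...render the comparison table from `cards`
  return null;
}
```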

What the setup does not solve

Real-time features and multi-user write paths are out. There is no per-user favorites list. Email signups go to Resend, and Report Error submissions go through a rate-limited API route that forwards to my inbox, also via Resend. None of this needs a database because the data lives somewhere else.
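The rate limiter on that route can be as small as a sliding window in a module-level Map. This is a sketch of the pattern rather than the production code, and the limits are made up; state lives per server instance, which is acceptable at this traffic level:

```typescript
// Sliding-window rate limiter keyed by client IP. Per-instance state:
// fine for a low-traffic form, not for anything that must be strict.
const hits = new Map<string, number[]>();

function rateLimited(ip: string, limit = 5, windowMs = 60_000): boolean {
  const now = Date.now();
  // Keep only the timestamps still inside the window, then record this hit.
  const recent = (hits.get(ip) ?? []).filter((t) => now - t < windowMs);
  recent.push(now);
  hits.set(ip, recent);
  return recent.length > limit;
}

// In the route handler: return a 429 when rateLimited(ip) is true,
// otherwise forward the submission via resend.emails.send(...).
```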

If I ever need to accept user-generated records, I will add Postgres for that feature alone and keep the card catalog in the CSV. The two concerns do not have to share infrastructure.

The cost

Vercel Free tier covers the traffic. Sentry is on the free developer plan. The only paid line item is the domain renewal. A managed Postgres would have been about fifteen dollars a month and half a day of ops work every time the provider rotates connection strings, with zero features the users would notice.

Links

The live site is at https://sweepbase.net. The fee calculator sits at https://sweepbase.net/calculator.
