I shipped a 141-row crypto card comparison site on a public CSV instead of a database back in February, and I want to write down what I have learned three months in. The earlier posts covered why I picked a CSV (why a CSV beats a database for this) and what I would do differently on the architecture side (six lessons learned from shipping a Next.js 15 + CSV side project). This is the operational version.
What broke
ISR cache stayed stale longer than I expected. Setting revalidate = 86400 on card detail pages felt safe in dev. In production, when I edited the CSV and pushed, the new content took up to 24 hours to surface on cold pages, because ISR only regenerates a page when it receives traffic after the window expires. I added a /api/revalidate webhook that I hit from a small script after every CSV change. That fixed the lag, but it adds a step I forget half the time.
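The webhook is just a route handler around revalidatePath. A minimal sketch, assuming the App Router; REVALIDATE_SECRET and the /cards/[slug] route are my placeholders for whatever the real endpoint uses:

```ts
// app/api/revalidate/route.ts — simplified sketch, not the production route.
import { revalidatePath } from 'next/cache';
import { NextResponse } from 'next/server';

export async function POST(request: Request) {
  const secret = new URL(request.url).searchParams.get('secret');
  if (secret !== process.env.REVALIDATE_SECRET) {
    return NextResponse.json({ error: 'invalid secret' }, { status: 401 });
  }
  // Bust the catalog page and every cached card detail page
  revalidatePath('/');
  revalidatePath('/cards/[slug]', 'page');
  return NextResponse.json({ revalidated: true });
}
```

The post-push script is then a single curl -X POST against that route with the secret as a query parameter.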
PapaParse parsing in a Server Component blew up once when a column contained a comma inside quoted text and the quoting was wrong. Zod validation caught the malformed row, but I had 20 minutes of "is the entire site broken" panic before I read my own logs. Lesson: always log the failing row before throwing.
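In code terms the fix was a one-line reorder: log the row, then throw. A sketch of the parse step with that lesson applied (CardSchema here is a stand-in for the real schema):

```ts
// parse-cards.ts — sketch of the CSV parse with row-level logging.
import Papa from 'papaparse';
import { z } from 'zod';

const CardSchema = z.object({ name: z.string(), issuer: z.string() });

export function parseCards(csv: string) {
  const { data, errors } = Papa.parse<Record<string, string>>(csv, {
    header: true,
    skipEmptyLines: true,
  });
  // PapaParse reports quoting problems here instead of throwing
  if (errors.length > 0) console.error('CSV parse errors:', errors);

  return data.map((row, i) => {
    const result = CardSchema.safeParse(row);
    if (!result.success) {
      // Log the offending row BEFORE throwing; +2 accounts for the
      // header line and 1-based line numbers in the raw file.
      console.error(`Row ${i + 2} failed validation:`, row, result.error.flatten());
      throw new Error(`Invalid card row at CSV line ${i + 2}`);
    }
    return result.data;
  });
}
```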
The image proxy started getting rate-limited upstream. I serve card images via /api/image-proxy with a 7-day cache. About six weeks in, I noticed Google Drive had started throttling requests from Vercel egress IPs: cache hit rate dropped, latency went up. I now host all new card images locally as .webp and only fall back to Drive for legacy entries.
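The proxy itself is nothing exotic. A simplified sketch with the 7-day cache expressed as an s-maxage header (the query-param API is my assumption, and a real version should allowlist upstream hosts rather than fetching arbitrary URLs):

```ts
// app/api/image-proxy/route.ts — simplified sketch, not the production route.
export async function GET(request: Request) {
  const url = new URL(request.url).searchParams.get('url');
  if (!url) return new Response('missing url', { status: 400 });

  const upstream = await fetch(url);
  if (!upstream.ok) return new Response('upstream error', { status: 502 });

  return new Response(upstream.body, {
    headers: {
      'Content-Type': upstream.headers.get('Content-Type') ?? 'image/webp',
      // 7-day CDN cache: the upstream host is hit at most ~once a week
      // per image — until that host starts throttling the misses.
      'Cache-Control': 'public, s-maxage=604800, stale-while-revalidate=86400',
    },
  });
}
```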
What did not break
The catalog itself. 141 rows in a CSV is below any threshold where you actually need a database. Greps are instant in CI, the file diffs cleanly in PRs, and contributors can read it without a SQL client. I have not regretted this once.
Filter functions as predicates. Every category on the site is a single function (card: Card) => boolean in one file. When I needed new categories (Brazil, USDC, self-custody), each was a one-line export. Reading a meta post on the editorial layer of a comparison site made me realize this was the architectural choice that did the most to make editorial work feel cheap.
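For a concrete picture, this is the shape of the pattern. The Card fields below are invented (in the real code the Card type comes from the Zod schema described next; it is inlined here to keep the snippet self-contained):

```ts
// filters.ts — one predicate per category; fields are illustrative.
type Card = {
  countries: string[];
  stablecoins: string[];
  custody: 'self' | 'custodial';
};

export type CardFilter = (card: Card) => boolean;

// Adding a category is one line: write the predicate, export it.
export const brazil: CardFilter = (c) => c.countries.includes('BR');
export const usdc: CardFilter = (c) => c.stablecoins.includes('USDC');
export const selfCustody: CardFilter = (c) => c.custody === 'self';
```

A category page then just runs cards.filter(predicate) over the parsed CSV.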
Zod schemas as the source of truth. Card type, validation, and defaults all live in one place. I have refactored the card model three times now, and each migration was trivial because the schema was the contract.
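A trimmed-down sketch of what "schema as contract" looks like here; the fields are invented, but the coerce/transform layer is where CSV strings become typed values:

```ts
// schema.ts — minimal sketch; the real card model has more fields.
import { z } from 'zod';

export const CardSchema = z.object({
  name: z.string().min(1),
  issuer: z.string().min(1),
  // CSV cells arrive as strings, so coerce numerics and default blanks
  cashbackPct: z.coerce.number().default(0),
  // "BR|US"-style multi-value cells become arrays at the boundary
  countries: z.string().default('').transform((s) => (s ? s.split('|') : [])),
});

export type Card = z.infer<typeof CardSchema>;
```

Refactoring the model means editing this one object; every consumer typed against Card fails to compile until it catches up.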
What I would copy on a new project
Start with a CSV. Move to a database only when you have evidence the CSV is the bottleneck. For three months of traffic and 141 rows, mine never was.
If you want the live result, the site is at sweepbase.net and the comparison methodology piece is on Telegraph. There is also a follow-up note on the founder-pitch lens that complements this operational view.