Frankline Yombih Yombih

Posted on Jun 21

I Optimized a 1.7M Row Postgres Database to Run at 66ms on a 6GB RAM Instance. Here's How

#postgres #node #nextjs #architecture

Most database optimization advice online is useless because it assumes you have a massive enterprise budget and an endless pool of server resources.

I don't.

I built LemonKnows—an independent consumer safety and food intelligence platform tracking over 1.7 million products, global government recalls, and consumer complaints. Because I value staying completely self-funded and lean, I threw the entire platform onto a single Oracle Cloud Free Tier 6GB ARM instance.

When you shove 1.7 million dense records into Postgres on a 6GB box, native query patterns don't just slow down—they completely collapse. At one point, heavy category aggregations and string searches were triggering brutal sequential table scans that took up to 49 seconds.

Today, those exact pages assemble in 66ms.

Here is the exact technical war story of how I re-engineered the stack, eliminated database bottlenecks, and built a zero-cost AI translation architecture that scales.

1. The Death of COUNT(*) and the stats_cache Fix

If you are running SELECT COUNT(*) on tables with millions of rows to power your pagination or dashboard counters, you are actively killing your database. PostgreSQL handles MVCC (Multi-Version Concurrency Control) by physically scanning rows to see if they are visible to the current transaction. On 1.7M rows, the Postgres planner will choke.

The Fix:

I stopped asking Postgres to count things in real-time. Instead, I built a lightweight stats_cache lookup table.

Every time a background data ingest runs or a status changes, it updates the pre-aggregated counters in stats_cache. When a user hits a main listing or category view, the application executes a rapid 2ms primary key lookup on the cache table instead of executing a multi-second table scan.

[sql]
-- Stop doing this:
SELECT COUNT(*) FROM products WHERE category = 'automotive' AND status = 'active';

-- Do this:
SELECT total_count FROM stats_cache WHERE key = 'products_automotive_active' LIMIT 1;
[/sql]

2. Composite Indexing vs. Full Table Scans

When a single query filters by category, checks a safety status, and sorts by a reliability score, separate b-tree indexes on individual columns are practically worthless. Postgres has to pick one index, scan it, and then filter the remaining millions of records in memory.

I dropped the single-column clutter and deployed a highly specific composite index tailored directly to our heavy query filters:

[sql]
CREATE INDEX idx_products_cat_status_score
ON products (category, status, score DESC);
[/sql]

3. The Zero-Cost LLM Multi-Language Proxy

LemonKnows serves global consumer data, meaning the interface and heavy text data must exist across 7 distinct languages (English, French, Spanish, German, Italian, Portuguese, and Dutch).

If you hook an LLM API directly into your user-facing request stream to handle translations on the fly, you will end up with two things: terrible latency and an absolute nightmare of an API bill.

The Architecture:

I built a server-side translation proxy that leverages Gemini 2.5 Flash Lite as the translation engine, backed by a local lookup table called ugc_translations.

Here is the data loop:

A request comes in for a specific text string or complaint breakdown in German.

The backend generates a SHA-256 hash of the English source text.

It performs a rapid lookup on ugc_translations where hash = SHA256 and lang = 'de'.

Cache Hit: The localized string is returned instantly from our local storage. Zero API cost, sub-5ms lookup.

Cache Miss: The string is piped to Gemini via a background worker, translated, immediately written to the database cache, and returned.

This means we pay for a translation exactly once. The moment a product page is translated, it is effectively free to serve to international traffic forever.

4. Turning Empty States into Secure UGC Channels

When managing millions of product records, dealing with missing imagery is a massive UX bottleneck. I don't believe in leaving empty, ugly placeholder gaps, nor do I believe in manually uploading a million photos.

I built a client-side validated, user-generated content (UGC) loop that safely crowd-sources the platform’s visual gap:

Step 1: User encounters a product page missing an image.

Step 2: User clicks the auth-gated "Add Image" action button.

Step 3: User inputs an image URL which undergoes strict client-side verification.

Step 4: Validated submissions are written straight to the product_image_suggestions workflow table.

The Security Pipeline:

The DB Constraint: To prevent users from spamming a single product, the product_image_suggestions table has a strict composite unique index: UNIQUE(user_id, product_id). The database enforces a strict limit of one pending suggestion per user, per product.

The Validation Gate: The modal doesn't just accept any raw string. The client-side script validates that the URL resolves to a genuine image format with a live preview before allowing submission.

The Admin Pipeline: Suggestions land in a pending state. When I'm ready to review, a lean, automated admin dashboard fires a PATCH /products/image-suggestions/:id with { "status": "approved" }. The system instantly migrates the record and maps it to the product live, keeping the core products table perfectly untainted by unverified user inputs.

The Takeaway

You don't need a massive distributed cluster or a complex network of microservices to handle millions of records. You just need to respect your database, stop making it count things it already knows, cache your expensive AI operations aggressively, and build clean, structured loops to let your users help you scale.

Top comments (1)

Frankline Yombih Yombih • Jun 21

There is still a lot of work to be done, so please let me know what you think.