DEV Community

mamoru kubokawa

How I auto-enrich a brand database with AI on cache miss (Lovable + Claude API)

When you need to populate a niche database, you usually face two ugly options:

  1. Manually seed thousands of rows (impossible for niche data like Japanese wholesale suppliers)
  2. Force users to enter everything (terrible UX, dead-on-arrival)

Last week I shipped a third option in 30 minutes with Lovable: let the database grow itself.

Every search that misses the cache triggers a Claude API call that generates a structured entry and saves it. The next user who runs the same search gets an instant hit.

Here's the exact pattern.

The pattern in 4 lines

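async function search(query) {
  // Cache hit: return the stored entry without calling the AI.
  if (await db.has(query)) return db.get(query);
  // Cache miss: ask the AI for a structured entry.
  const entry = await aiGenerate(query);
  // Save only real entries; the prompt returns null for unknown brands.
  if (entry) await db.save(entry);
  return entry;
}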

That's the whole thing. The magic is in what happens to the database over time.
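
To make the pattern concrete, here is a minimal runnable sketch. It assumes an in-memory Map in place of the real database and takes the AI call as an injected function; createStore, normalizeKey, and createSearch are hypothetical names for illustration, not the code behind the demo.

```javascript
// Hypothetical in-memory stand-in for the database; a real app would use a persistent store.
function createStore() {
  const rows = new Map();
  return {
    has: async (key) => rows.has(key),
    get: async (key) => rows.get(key),
    save: async (key, entry) => { rows.set(key, entry); },
    size: () => rows.size,
  };
}

// Assumption: queries that differ only in case or spacing should share one cache key.
function normalizeKey(query) {
  return query.trim().toLowerCase().replace(/\s+/g, " ");
}

// aiGenerate is injected so the pattern can be exercised without a network call.
function createSearch(store, aiGenerate) {
  return async function search(query) {
    const key = normalizeKey(query);
    if (await store.has(key)) return store.get(key); // cache hit
    const entry = await aiGenerate(query);           // cache miss
    if (entry !== null) await store.save(key, entry); // never store refusals
    return entry;
  };
}
```

Normalizing the key is a design choice worth making early: without it, "Example Brand" and "example brand" each pay for their own AI call. Note that this sketch does not cache null results, so repeated searches for a nonexistent brand would each trigger a new call.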

Why this beats alternatives

Seed-only DBs require domain expertise upfront. For my Japan Brand Finder, that meant cold-calling Tsubame-Sanjo metalworkers — months of effort before launching.

User-fed DBs have a chicken-and-egg problem. Empty DB → no value → no users → no entries.

Cache-miss enrichment sidesteps both:

  • Launch with 20 seed entries (1 hour)
  • AI fills the long tail as users search
  • Every miss makes the DB better for the next user
  • Cost grows at most linearly with usage, because only cache misses call the API (predictable)

The prompt that actually worked

The hard part isn't the pattern. It's getting AI to produce structured, useful entries instead of generic Wikipedia summaries.

What worked for me (Japan Brand Finder context):

You are filling a database row for a Japanese manufacturer.
The user searched: "[QUERY]"

Generate a JSON object:
- name_en: English brand name
- name_jp: Japanese name (kanji or kana)
- category: from this list [...]
- hq_location: city, prefecture
- english_support: "good" | "limited" | "none"
- business_culture_notes: 1-2 sentences

If the brand doesn't exist, return null. Don't invent.

Two key tricks:

  1. Asking for a JSON object with named fields and allowed values forces structure (no rambling output)
  2. "Return null if the brand doesn't exist" gives the AI permission to refuse

The second one cut hallucination by ~80% in my testing.
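
The prompt alone does not guarantee well-formed output, so it can help to validate the reply before saving it. A minimal sketch, assuming the model's reply arrives as a raw string; parseEntry is a hypothetical helper, and because the category list is elided in the prompt above, the sketch only checks that category is a non-empty string.

```javascript
// Allowed values copied from the prompt's english_support field.
const ENGLISH_SUPPORT = new Set(["good", "limited", "none"]);
// Fields the prompt asks for as free text.
const REQUIRED_STRINGS = [
  "name_en",
  "name_jp",
  "category",
  "hq_location",
  "business_culture_notes",
];

// Returns the parsed entry, or null when the model refused or the reply is unusable.
function parseEntry(rawText) {
  let data;
  try {
    data = JSON.parse(rawText.trim());
  } catch {
    return null; // reply was not valid JSON
  }
  // A literal null means the model declined to invent a brand.
  if (data === null || typeof data !== "object" || Array.isArray(data)) return null;
  for (const field of REQUIRED_STRINGS) {
    if (typeof data[field] !== "string" || data[field].trim() === "") return null;
  }
  if (!ENGLISH_SUPPORT.has(data.english_support)) return null;
  return data;
}
```

Treating a malformed reply the same way as a refusal keeps bad rows out of the database; whether to retry the call in that case is a separate decision.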

Economics

  • Per search: ~$0.005 with Claude Sonnet
  • Per 1,000 searches: ~$5 if every search misses the cache
  • DB grows: ~700 unique entries (cache hit ratio improves over time)

After month 2, ~70% of searches hit the cache, so the AI cost per search drops while the DB's value compounds.
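
The arithmetic behind these numbers can be sketched as a small helper. It assumes the ~$0.005 per-call figure above and that only cache misses call the API; estimateAiCost is a hypothetical name.

```javascript
// Estimated AI spend for a batch of searches: only cache misses call the API.
function estimateAiCost(searches, cacheHitRatio, costPerCall = 0.005) {
  if (cacheHitRatio < 0 || cacheHitRatio > 1) {
    throw new RangeError("cacheHitRatio must be between 0 and 1");
  }
  const misses = Math.round(searches * (1 - cacheHitRatio));
  return misses * costPerCall;
}

console.log(estimateAiCost(1000, 0).toFixed(2));   // "5.00": cold cache, every search misses
console.log(estimateAiCost(1000, 0.7).toFixed(2)); // "1.50": ~70% of searches hit the cache
```

At a 70% hit ratio, the same 1,000 searches cost roughly $1.50 instead of $5.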

What I'd improve

  1. Verification batch job — weekly re-check generated entries against external sources
  2. User flagging — one-click report for wrong entries
  3. Quality tiers — mark "AI-generated" vs "human-verified"
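
One way these three ideas could fit together is a little provenance metadata on each row. This is a sketch under assumptions, not something the current app does: the field names (tier, last_checked_at, flags) and the one-week threshold are hypothetical.

```javascript
const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

// Quality tiers: wrap a freshly generated entry with provenance metadata.
function tagGenerated(entry, now = Date.now()) {
  return { ...entry, tier: "ai_generated", last_checked_at: now, flags: 0 };
}

// User flagging: a one-click report increments the counter on a copy of the row.
function flagEntry(row) {
  return { ...row, flags: row.flags + 1 };
}

// A human review promotes the row and clears outstanding flags.
function markVerified(row, now = Date.now()) {
  return { ...row, tier: "human_verified", last_checked_at: now, flags: 0 };
}

// Verification batch job: pick anything flagged, or AI-generated and unchecked for a week.
function needsReview(row, now = Date.now()) {
  if (row.flags > 0) return true;
  return row.tier === "ai_generated" && now - row.last_checked_at >= WEEK_MS;
}
```

A weekly job could then filter all rows through needsReview and re-check only that subset.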

Try it yourself

If you have any niche directory idea (suppliers, restaurants, courses), this pattern unlocks it.

Demo: https://japanbrandfinder.lovable.app/

Twitter: @tokidigitaljp

What would you use the cache-miss enrichment pattern for?

Top comments (1)

foxck016077

This is the article that anchored our cache-miss conversation today, so let me drop a concrete adjacency you might find useful: I'm running the same enrichment-on-miss pattern but in an inbox-triage Actor, and the failure mode is different from yours.

Your enrichment runs once-per-novel-key and caches forever — domain knowledge entries don't decay. Mine runs once-per-Friday-per-mailbox and the "cache" is actually a stale snapshot the moment the mailbox receives a reply. So the unit economics flip: where you can amortize the Claude call cost across all future hits of the same brand query, I have to re-pay it weekly because last week's HOT thread is this week's WARM. Same pattern, different decay profile.

What that means for the "free first slow hit, paid fast cache" model you outlined in your reply: it works in your shape because patience pays for itself. In a triage-style product (stale snapshots), the "fast cache" doesn't exist as a separate billable thing — every run is effectively a fresh enrichment.

The 4-line pattern code you wrote nails it. The interesting hidden constraint is what counts as a cache hit. For your DB, exact-key match works. For inbox triage, two threads with identical sender + similar subject + 3 days apart should still be classified as separate enrichments. I ended up using (thread_id, last_message_at_bucket) as the cache key, which means most "hits" are actually misses by another name.

If you ever extend the pattern to time-decaying data (price comps, availability), the same trap waits. Worth flagging up front in the post so a reader doesn't try to reuse the pattern for the wrong shape.
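
The (thread_id, last_message_at_bucket) key described above could be sketched like this. The bucket size is an assumption (the comment does not state one), buckets here are aligned to UTC days, and triageCacheKey is a hypothetical name.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Build a cache key that goes stale on its own: the same thread lands in a new
// bucket once its latest message crosses a bucket boundary.
function triageCacheKey(threadId, lastMessageAt, bucketMs = DAY_MS) {
  const timestamp = new Date(lastMessageAt).getTime();
  if (Number.isNaN(timestamp)) {
    throw new TypeError("lastMessageAt is not a valid date");
  }
  const bucket = Math.floor(timestamp / bucketMs);
  return `${threadId}:${bucket}`;
}
```

With an exact-match store, a new reply that moves the thread into a later bucket produces a different key, so the old enrichment is simply never looked up again. For data that does not decay, such as the brand entries in the post, the plain query key is enough.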

(Day 16 receipts also moved up a tick after our exchange — Day 16 dev.to article got the +51 reader spike today; full breakdown in my latest checkpoint if useful.)