Basmin Shaik

Posted on Apr 12

The Fallback That Saved Our Demo and Why I Almost Didn't Build It

#webdev #agents #hindsight #automation

Twenty minutes before our demo, Hindsight returned zero memories.
Not an error. Not a crash. Just an empty array, silently, every time. The agent had nothing to work with and was generating responses based on no context at all. Generic, useless, exactly what we'd spent two weeks trying to avoid.

We were fine. Because I'd built a fallback.

What Retrospect Does

Retrospect is a personal decision memory agent. Users log real decisions and outcomes. Hindsight retains every one as a semantic memory and recalls the most relevant ones when the agent is asked a question. Gemini 2.5 Flash reasons over the recalled memories to give personalised advice.

My contribution was infrastructure — auth, database, API routes, deployment. The unglamorous work that has to be right for everything else to function. And the fallback systems that nobody thinks about until they need them.

Why Fallbacks Get Skipped

When you're building fast, fallbacks feel like debt. The happy path works, the demo is tomorrow, you'll add error handling later. This is almost always wrong.

External dependencies fail in ways you don't predict. APIs return unexpected shapes. Network timeouts happen at the worst possible moments. The question isn't whether your Hindsight integration will fail at some point — it's whether you've built something that keeps working when it does.

I built two fallbacks. One for development, one for production.

The Development Fallback: Mock Memory Store

The Hindsight API key lived in an environment variable. During early development, half the team didn't have it set up. Every time someone ran the app locally without the key, the Hindsight client would fail to initialise and the entire agent flow would break.

I added a null-return path and an in-memory fallback:

// src/lib/hindsight.ts

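// Cached client so the SDK is imported and constructed only once.
// Typed loosely here; swap in the SDK's own client type if it exports one.
let hindsightClient: any = null;

async function getHindsight() {
  if (hindsightClient) return hindsightClient;

  const apiKey = process.env.HINDSIGHT_API_KEY;
  if (!apiKey) {
    console.warn("HINDSIGHT_API_KEY not set; using mock memory store");
    return null; // triggers fallback
  }

  try {
    // Dynamic import so a missing SDK degrades to the fallback instead of crashing at load time
    const { Hindsight } = await import("@vectorize-io/hindsight");
    hindsightClient = new Hindsight({ apiKey });
    return hindsightClient;
  } catch (e) {
    console.warn("Hindsight SDK not available", e);
    return null;
  }
}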

// In-memory fallback store
const memoryStore: MemoryResult[] = [];

export async function recallMemories(params: RecallParams): Promise<MemoryResult[]> {
  const client = await getHindsight();
  if (client) {
    // Real Hindsight recall. Note that params.filter is not forwarded on this path.
    return await client.recall({ query: params.query, topK: params.topK });
  }

  // Fallback: keyword matching against the in-memory store.
  // Tokenise the query once and drop empty tokens left by stray whitespace.
  const words = params.query.toLowerCase().split(/\s+/).filter(Boolean);
  const userId = params.filter?.userId;
  return memoryStore
    .filter(m => (userId ? m.metadata.userId === userId : true))
    .map(m => {
      const content = m.content.toLowerCase();
      const matchCount = words.filter(w => content.includes(w)).length;
      // Score is the fraction of query words found in the memory text (0 for an empty query).
      return { ...m, score: words.length ? matchCount / words.length : 0 };
    })
    .sort((a, b) => (b.score || 0) - (a.score || 0))
    .slice(0, params.topK);
}

Keyword matching is not semantic recall. It's a pale imitation. But it was accurate enough to develop and test the full UI flow, debug the chat interface, and validate the response format — without needing a live API key. It also caught two edge cases in the filtering logic that I would have missed if I'd only ever tested against the real Hindsight client.
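
To make the scoring concrete, here is a small self-contained sketch of the same keyword-overlap idea. The `Memory` type, the function names, and the sample data are assumptions for illustration and are not taken from Retrospect's code. As one possible refinement, it also drops memories that match none of the query words.

```typescript
// Hypothetical shape for illustration; the real MemoryResult type may differ.
interface Memory {
  content: string;
  score?: number;
}

// Fraction of query words that appear (as substrings) in the memory text.
function keywordScore(query: string, content: string): number {
  const words = query.toLowerCase().split(/\s+/).filter(Boolean);
  if (words.length === 0) return 0;
  const haystack = content.toLowerCase();
  const hits = words.filter((w) => haystack.includes(w)).length;
  return hits / words.length;
}

// Rank memories by keyword overlap, dropping those that match nothing.
function rankByKeywords(query: string, memories: Memory[], topK: number): Memory[] {
  return memories
    .map((m) => ({ ...m, score: keywordScore(query, m.content) }))
    .filter((m) => m.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// Made-up sample data.
const sample: Memory[] = [
  { content: "Decision: took the job offer. Outcome: good." },
  { content: "Decision: skipped the gym. Outcome: regret." },
  { content: "Decision: declined a second job offer. Outcome: mixed." },
];

// Both job-related memories match "job" and "offer"; the gym memory is dropped.
const ranked = rankByKeywords("job offer salary", sample, 2);
console.log(ranked.map((m) => m.content));
```

The limits of keyword matching show up immediately: a query about "compensation" would return nothing from this sample, even though two of the memories are clearly relevant. That is the gap semantic recall is meant to close.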

The Production Fallback: SQLite Rescue

The second fallback is in the ask route. If Hindsight recall returns nothing — empty array, timeout, any failure — the route falls back to the most recent SQLite decisions:

// api/decisions/ask/route.ts

let memories: any[] = [];
try {
  memories = await recallMemories({
    query: question,
    topK: 5,
    filter: { userId: payload.userId },
  });
} catch (e) {
  console.warn("Hindsight recall failed, using DB fallback:", e);
}

// If Hindsight returned nothing, use recent DB decisions
if (memories.length === 0) {
  const db = getDb();
  const decisions = db
    .prepare("SELECT * FROM decisions WHERE user_id = ? ORDER BY date DESC LIMIT 5")
    .all(payload.userId) as any[];

  memories = decisions.map((d) => ({
    content: `Decision: ${d.decision}. Category: ${d.category}. Outcome: ${d.outcome}. Result: ${d.result}. Date: ${d.date}`,
    metadata: { category: d.category, outcome: d.outcome, date: d.date, decisionId: d.id },
  }));
}

This is what saved the demo. Hindsight returned an empty array — we still don't know exactly why, possibly a timing issue with the API — and the route silently switched to SQLite. The context wasn't semantically retrieved, so the advice was slightly less targeted, but it was real context from the user's actual history. The agent kept working. Nobody watching the demo noticed anything wrong.
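
A catch block only helps if a hung request eventually rejects. One way to guarantee that is to race the recall against a timer. The sketch below is an assumption about how that could be done, not the code that ran in the demo; `withTimeout`, `recallOrFallback`, and the time budget passed in are all hypothetical names and values.

```typescript
// Reject if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Run `primary` under a time budget. On rejection, timeout, or an empty
// result, return whatever `fallback` produces instead.
async function recallOrFallback<T>(
  primary: () => Promise<T[]>,
  fallback: () => T[],
  ms: number,
): Promise<T[]> {
  try {
    const result = await withTimeout(primary(), ms);
    if (result.length > 0) return result;
  } catch (e) {
    console.warn("primary recall failed, using fallback:", e);
  }
  return fallback();
}
```

In the ask route, `primary` would wrap the `recallMemories` call and `fallback` would wrap the SQLite query, so all three failure modes (empty array, timeout, thrown error) end up on the same path.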

What the App Looked Like Running

This is the dashboard after 17 decisions had been logged. The database layer is what makes all of this possible: decisions are stored reliably, recall events are logged, and user sessions are managed. The "Agent Intelligence" card is powered by a Hindsight recall that worked perfectly here. But if it hadn't, the SQLite fallback would have caught it.

Every submission hits the log route: SQLite write first, then Hindsight retain, both in the same request. I wrapped the Hindsight retain in its own try/catch so that a failure there is non-fatal: the decision is still saved to SQLite and the user doesn't lose data. The agent just has slightly less to work with until the next sync.
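
That ordering can be sketched as follows. The `saveToDb` and `retainInHindsight` parameters are hypothetical stand-ins for the real SQLite write and Hindsight call, injected so the example runs on its own. It illustrates the pattern rather than reproducing the actual route.

```typescript
interface Decision {
  decision: string;
  outcome: string;
}

interface LogResult {
  id: number;        // row id returned by the durable store
  retained: boolean; // false when the memory-layer call failed
}

async function logDecision(
  d: Decision,
  saveToDb: (d: Decision) => number,
  retainInHindsight: (d: Decision, id: number) => Promise<void>,
): Promise<LogResult> {
  // 1. Durable write first. If this throws, the request fails, as it should.
  const id = saveToDb(d);

  // 2. Memory-layer write second. A failure here is logged, not propagated.
  try {
    await retainInHindsight(d, id);
    return { id, retained: true };
  } catch (e) {
    console.warn("retain failed; decision is still saved:", e);
    return { id, retained: false };
  }
}
```

Returning a `retained` flag is a design choice in this sketch: it gives the caller a way to record which rows still need to be pushed to the memory layer later.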

The Recall Events Table

One thing I added that wasn't in the original spec was a recall_events table:

CREATE TABLE recall_events (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  user_id INTEGER NOT NULL,
  query TEXT NOT NULL,
  recalled_count INTEGER NOT NULL,
  created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

Every time the agent recalls memories, we log the query and how many memories came back. This powers the "Patterns Found" stat on the dashboard and gives visibility into how often the memory layer is actually being used — versus how often it's falling back to SQLite.

It also gave us the data to spot the demo issue in the first place. Recalled count of zero, repeatedly, for the same user — that's a signal something is wrong, not just a one-off.
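
A check for that signal could look like the sketch below. It works on rows already read from `recall_events`; the function name and the streak threshold are assumptions for illustration, and the column names follow the table definition above. Timestamps are compared as strings, which works for SQLite's `CURRENT_TIMESTAMP` format (`YYYY-MM-DD HH:MM:SS`).

```typescript
// Mirrors the relevant recall_events columns.
interface RecallEvent {
  user_id: number;
  recalled_count: number;
  created_at: string; // "YYYY-MM-DD HH:MM:SS", sortable as text
}

// Return ids of users whose most recent `minStreak` recalls all came back empty.
function usersWithZeroStreak(events: RecallEvent[], minStreak: number): number[] {
  // Group events by user.
  const byUser: Record<number, RecallEvent[]> = {};
  for (const e of events) {
    const list = byUser[e.user_id] || [];
    list.push(e);
    byUser[e.user_id] = list;
  }

  const flagged: number[] = [];
  for (const key of Object.keys(byUser)) {
    const userId = Number(key);
    // Newest first, then keep only the last `minStreak` events.
    const latest = (byUser[userId] || [])
      .slice()
      .sort((a, b) => (a.created_at < b.created_at ? 1 : a.created_at > b.created_at ? -1 : 0))
      .slice(0, minStreak);
    if (latest.length === minStreak && latest.every((e) => e.recalled_count === 0)) {
      flagged.push(userId);
    }
  }
  return flagged;
}
```

A single empty recall is not flagged unless `minStreak` is 1, which keeps a genuinely new user with no history from looking like an outage.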

What I'd Do Differently

Write the fallback the same day you write the feature. I built the mock memory store on day two, after the first time someone couldn't run the app locally. It should have been day one. The cost is an hour of work. The benefit is never losing a day to a blocked teammate.

Make external dependency failures non-fatal. Every call to an external API — Hindsight, Gemini, anything — should be wrapped in a try/catch with a defined fallback behaviour. Not a crash, not an empty response, but a defined degraded mode. Users should never see an error because a third-party API had a bad moment.

Log everything that touches your memory layer. The recall events table is the most useful debugging tool I built. Without it, "Hindsight returned zero memories" is invisible. With it, it's a query away. Instrument your agent memory layer from the start — you will need that data.

Test with the fallback explicitly. Unset the API key. Force the mock store. Make sure the app still works end-to-end. Then restore the key and test again. Both paths need to be tested, not just the happy path.

The demo worked. Not because nothing went wrong — something did go wrong. Because we'd built something that kept working anyway.

That's the job.
