SANSKRITI

Posted on Apr 12

I built an AI that remembers every production incident. Here's what changed.

#agents #ai #showdev #sre

It's 2 am. Production is down.

You paste the error into ChatGPT. It tells you to check your connection string and make sure the database is running.

Not helpful. However, it's also not ChatGPT's fault — it has no idea that this same error caused your payment service to fail three times last month, that the fix took 12 minutes each time, or that your infrastructure team keeps accidentally reverting it.

AI doesn't fail at incidents because it's unintelligent. It fails because it's amnesiac.

So I built something with memory.

WHAT I BUILT

SentinelAI is an incident response agent that learns from every incident your team encounters. It is powered by Hindsight (Vectorize's persistent memory layer), and the core loop is simple:

1. Paste an error or alert into the UI
2. The agent retrieves semantically similar past incidents from memory
3. It scores recurrence risk: pattern or one-off?
4. It generates a structured report: root cause, fix steps, and a time-to-resolve estimate
5. You log what worked, and the agent stores that for next time
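
The retrieval step in that loop can be sketched with plain cosine similarity. This is a toy illustration, not SentinelAI's code: in the real project Hindsight handles storage and semantic search, whereas here `embed` is a hypothetical bag-of-words stand-in for a real embedding model, and `recall` is a made-up helper name.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recall(query: str, incidents: list[str], top_k: int = 3) -> list[tuple[float, str]]:
    """Return the top_k past incidents most similar to the query, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(text)), text) for text in incidents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Synthetic incidents, in the spirit of the seeded demo data.
past = [
    "ConnectionPoolTimeoutError on payment API after infra update",
    "OOMKilled pod in checkout service",
    "TLS certificate expired on gateway",
]
hits = recall("ConnectionPoolTimeoutError payment API", past, top_k=1)
```

A word-count vector only matches on shared tokens, so it would miss paraphrased errors that a learned embedding would catch. It is here only to make the ranking logic concrete.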

No generic advice. No rediscovering things you already know. Just institutional memory, on demand.

THE DEMO MOMENT THAT MADE IT REAL

I seeded the agent with 8 synthetic incidents — connection pool errors, OOMKills, cert expirations. Then I submitted the same ConnectionPoolTimeoutError that had appeared three times already.

HIGH RECURRENCE RISK — seen 3x in 21 days on payment API
Similarity score: 0.94
Root cause: RDS max_connections reverted after each infra update
Fix: re-deploy pgbouncer and set max_connections=500
TTR estimate: ~12 min

That's not a chatbot answer. That's your senior SRE, the one who remembers everything, without being paged.
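
One way a recurrence label like the one in that report could be derived is to count sufficiently similar incidents inside a recent time window. The sketch below is an assumption, not the project's actual scoring: the 0.8 similarity threshold, the 30-day window, and the HIGH/MEDIUM/LOW cut-offs are illustrative values, and `Match` and `recurrence_risk` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Match:
    similarity: float  # cosine similarity to the new error, 0..1
    days_ago: int      # how long ago the past incident occurred


def recurrence_risk(matches: list[Match], min_similarity: float = 0.8,
                    window_days: int = 30) -> tuple[str, int]:
    """Label recurrence risk by counting close matches inside the window.

    Thresholds are illustrative assumptions, not values from SentinelAI.
    """
    recent = [m for m in matches
              if m.similarity >= min_similarity and m.days_ago <= window_days]
    count = len(recent)
    if count >= 3:
        label = "HIGH"
    elif count >= 1:
        label = "MEDIUM"
    else:
        label = "LOW"
    return label, count


# Three close matches in the window plus one unrelated incident.
matches = [Match(0.94, 2), Match(0.91, 9), Match(0.90, 21), Match(0.35, 5)]
label, count = recurrence_risk(matches)
print(f"{label} RECURRENCE RISK: seen {count}x in the last 30 days")
```

Separating the similarity threshold from the count threshold keeps "is this the same problem?" distinct from "does it keep happening?", which is the pattern-or-one-off question the report answers.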

WHAT MADE IT TECHNICALLY INTERESTING

The real insight is what you inject into the prompt. A bare prompt gets you generic answers. A prompt enriched with semantically recalled past incidents gives you specific, high-confidence recommendations.

Same model. Same base prompt. Dramatically different output quality. Memory is the unlock — not the model.
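
To make the bare-versus-enriched difference concrete, here is a minimal sketch of prompt assembly. The prompt wording, the `build_prompt` name, and the dictionary keys (`summary`, `fix`, `ttr_min`, `similarity`) are assumptions for illustration, not the project's actual schema.

```python
def build_prompt(error_text: str, recalled: list[dict]) -> str:
    """Assemble an LLM prompt, injecting recalled incidents when available."""
    lines = [
        "You are an incident response assistant.",
        "Produce: root cause, fix steps, and a time-to-resolve estimate.",
        "",
    ]
    if recalled:
        # Memory block: only present when something similar was found.
        lines.append("Similar past incidents from memory:")
        for i, inc in enumerate(recalled, start=1):
            lines.append(
                f"{i}. [{inc['similarity']:.2f}] {inc['summary']} "
                f"| fix: {inc['fix']} | resolved in {inc['ttr_min']} min"
            )
        lines.append("")
    lines.append("New error:")
    lines.append(error_text)
    return "\n".join(lines)


error = "ConnectionPoolTimeoutError: could not acquire connection"
bare = build_prompt(error, [])
enriched = build_prompt(error, [{
    "similarity": 0.94,
    "summary": "ConnectionPoolTimeoutError on payment API",
    "fix": "re-deploy pgbouncer and set max_connections=500",
    "ttr_min": 12,
}])
```

The base instructions are identical in both prompts. The only difference is the memory block, which is what lets the model name a specific fix instead of guessing.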

The stack: Hindsight handles persistent storage and semantic search. Cosine similarity between the new error and recalled incidents feeds the recurrence score. Groq (qwen3-32b) generates the structured report. FastAPI and a single-page UI keep the whole thing deployable in minutes. A /resolve endpoint writes successful fixes back to memory, so the agent improves every time it's used.
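
The write-back idea behind /resolve can be sketched without any web framework. This is a simplified stand-in, not the actual endpoint: the real project routes this through FastAPI and Hindsight, while here `memory` is a plain in-process dictionary, and `resolve` and `ttr_estimate` are hypothetical helper names. Averaging past resolution times is also an assumed way to produce the time-to-resolve estimate.

```python
from statistics import mean
from typing import Optional

# In-memory stand-in for the persistent memory layer (hypothetical structure).
memory: dict[str, list[dict]] = {}


def resolve(signature: str, fix: str, minutes_to_resolve: float) -> None:
    """Record a fix that worked so later lookups can reuse it."""
    memory.setdefault(signature, []).append(
        {"fix": fix, "minutes": minutes_to_resolve}
    )


def ttr_estimate(signature: str) -> Optional[float]:
    """Average resolution time over stored fixes, or None if never seen."""
    records = memory.get(signature)
    if not records:
        return None
    return mean(r["minutes"] for r in records)


# Each logged resolution sharpens the next estimate.
for minutes in (11, 12, 13):
    resolve("ConnectionPoolTimeoutError",
            "re-deploy pgbouncer and set max_connections=500", minutes)

estimate = ttr_estimate("ConnectionPoolTimeoutError")
unseen = ttr_estimate("CertExpiredError")
```

Every call to `resolve` changes what the next lookup returns. That is the property that separates this from a stateless prompt.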

THREE THINGS I LEARNED BUILDING THIS

Stateless AI is a toy. Agents with memory are tools.

The test is simple: is your tenth interaction meaningfully better than your first? If not, you haven't built an agent — you've built a fancy autocomplete.

Scope discipline wins hackathons.

I could have added Slack bots, trend dashboards, and multi-agent routing. Instead, I built one thing that works perfectly: paste an error, get intelligence. Judges reward polish over complexity every time.

The best AI products solve for the moment of highest stress.

2 am, production down, everyone watching — that's when your tool either earns trust or loses it forever. Design for that moment first.

THE BIGGER PICTURE

SRE teams at mid-size companies lose thousands of engineering hours each year reinvestigating the same incidents. Post-mortems get written, filed in Notion, and never read again. Junior engineers get paged on Friday night for P1s they've never seen before, with no context and no runbook.

SentinelAI doesn't replace the SRE. It briefs them in seconds, with the full institutional memory of every past incident, including ones that happened before they joined the team.

That's the problem I want to keep building toward.

GitHub: https://github.com/S-anskriti/SentinelAI--Incident-intelligence-agent
Stack: Python + FastAPI, Hindsight by Vectorize, Groq (qwen3-32b), vanilla HTML/CSS

Built for the AI Agents That Learn Using Hindsight hackathon by Vectorize.

