Most people assume that to give an AI agent memory, you need a backend, a database, maybe some vector embeddings, and a whole lot of DevOps.
But if you're building quickly, you don’t always need a full memory stack. In this post, I’ll show you how I built AI agents that can "remember" previous interactions without using a real database.
🧠 The Problem: Stateless Agents = Dumb Agents
When you call the OpenAI or Anthropic APIs, every request is stateless by default.
Your agent forgets everything the moment the function ends.
Which means:
- Your AI assistant won’t recall previous user preferences
- Sales agents forget client context
- Support bots loop on questions they already answered
🧩 The Hack: Context Injection + Persistent Memory via API
Instead of plugging in a database or Pinecone-style vector store, I used this flow:
- Store lightweight memory in a Make.com webhook or Airtable
- Inject the memory back into the prompt before each request
- Let the AI think it "remembers" - even though it's just replaying context
It’s not true memory. It’s simulated memory. But it works.
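Before the step-by-step, here's the whole loop in miniature. This is a hedged sketch: an in-memory dict stands in for the real store, the model call is stubbed out (the real versions follow in the steps below), and every name in it is illustrative:

```python
# Toy version of the loop. A dict stands in for the real store
# (Airtable / Make.com), and the LLM call is stubbed; Step 2 shows the real one.

memory_store: dict[str, dict] = {}  # user_id -> key facts

def format_context(facts: dict) -> str:
    """Render stored facts as plain prompt text to prepend to the request."""
    if not facts:
        return "This is a new user. No prior context."
    return "\n".join(f"{key}: {value}" for key, value in facts.items())

def handle_turn(user_id: str, user_input: str) -> str:
    facts = memory_store.get(user_id, {})                # fetch memory
    prompt = format_context(facts) + "\n\nUser: " + user_input  # inject it
    reply = f"(model reply to: {prompt!r})"              # stand-in for the stateless API call
    memory_store[user_id] = {**facts, "last_message": user_input}  # write back key facts
    return reply

print(handle_turn("92823", "Can you recommend a smart contract auditor?"))
```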
⚙️ How It Works (Step-by-Step)
Step 1: Set Up a Webhook or Minimal Storage
Use Make.com, the Google Sheets API, or Airtable to log key-value pairs:
```json
{
  "user_id": "92823",
  "name": "Dylan",
  "last_topic": "Smart contract audit",
  "notes": "Prefers non-custodial tools"
}
```
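If Airtable is your store, logging that record is a single REST call against its standard records endpoint. A minimal sketch using the `requests` library; the token, base ID, and table name are placeholders you'd swap for your own:

```python
import requests

AIRTABLE_TOKEN = "pat_xxx"          # placeholder: your Airtable personal access token
BASE_ID = "appXXXXXXXXXXXXXX"       # placeholder: your base ID
TABLE = "AgentMemory"               # placeholder: your table name

def save_memory(record: dict) -> str:
    """Create a memory row in Airtable and return its record ID."""
    resp = requests.post(
        f"https://api.airtable.com/v0/{BASE_ID}/{TABLE}",
        headers={"Authorization": f"Bearer {AIRTABLE_TOKEN}"},
        json={"fields": record},
    )
    resp.raise_for_status()
    return resp.json()["id"]

record_id = save_memory({
    "user_id": "92823",
    "name": "Dylan",
    "last_topic": "Smart contract audit",
    "notes": "Prefers non-custodial tools",
})
```

Hang on to the returned record ID; you'll need it to update the row later.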
Step 2: On Each New Prompt, Fetch & Format Context
Before calling OpenAI, inject this:
```text
You are talking to Dylan.
Last time, you discussed smart contract audits.
He prefers non-custodial tools.
Continue the conversation as if you remember this.
```
Now add the new user input below that and send it all in the same API call.
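Put together, the fetch-and-inject step looks roughly like this with the official `openai` Python client. The model name is an assumption, and the `memory` dict is whatever your store returned for this user:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def build_context(memory: dict) -> str:
    """Turn the stored key facts into the 'fake memory' preamble."""
    return (
        f"You are talking to {memory['name']}.\n"
        f"Last time, you discussed {memory['last_topic']}.\n"
        f"Notes: {memory['notes']}\n"
        "Continue the conversation as if you remember this."
    )

def ask_agent(memory: dict, user_input: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any chat-capable model works here
        messages=[
            {"role": "system", "content": build_context(memory)},  # injected "memory"
            {"role": "user", "content": user_input},               # the new input
        ],
    )
    return response.choices[0].message.content

memory = {
    "name": "Dylan",
    "last_topic": "smart contract audits",
    "notes": "Prefers non-custodial tools",
}
print(ask_agent(memory, "Any tool recommendations for the audit?"))
```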
Step 3: Update Memory After Each Turn
After each message, update your lightweight store.
Don’t store the whole convo - just key facts or intent:
```json
{ "last_topic": "interested in Bedrock vs ChatGPT Agents" }
```
You’re creating progressive memory, not full transcripts.
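On the Airtable side, that update is one `PATCH` against the existing row. Another sketch, reusing the same placeholder credentials as the create step; the record ID is whatever Airtable returned when you first saved the row:

```python
import requests

AIRTABLE_TOKEN = "pat_xxx"          # placeholder
BASE_ID = "appXXXXXXXXXXXXXX"       # placeholder
TABLE = "AgentMemory"               # placeholder

def update_memory(record_id: str, fields: dict) -> None:
    """PATCH only the changed keys; untouched fields are left alone."""
    resp = requests.patch(
        f"https://api.airtable.com/v0/{BASE_ID}/{TABLE}/{record_id}",
        headers={"Authorization": f"Bearer {AIRTABLE_TOKEN}"},
        json={"fields": fields},
    )
    resp.raise_for_status()

update_memory("recXXXXXXXXXXXXXX",  # placeholder record ID from the create step
              {"last_topic": "interested in Bedrock vs ChatGPT Agents"})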
Bonus: Structured Prompts + Slot-Filling
Use structured prompting like:
```text
Extract and save the following from this conversation:
- Name
- Area of interest
- Any tools or brands mentioned
```
Then use that output to update your “memory” store.
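To automate that step, one option (an assumption on my part, not the only way) is OpenAI's JSON mode, which forces the reply to be valid JSON you can merge straight into your store:

```python
import json
from openai import OpenAI

client = OpenAI()

EXTRACTION_PROMPT = (
    "Extract the following from this conversation and reply as JSON with keys "
    "'name', 'area_of_interest', and 'tools_mentioned':\n\n{conversation}"
)

def extract_slots(conversation: str) -> dict:
    """Ask the model for structured facts, then parse the guaranteed-JSON reply."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption
        response_format={"type": "json_object"},  # JSON mode: output is valid JSON
        messages=[{
            "role": "user",
            "content": EXTRACTION_PROMPT.format(conversation=conversation),
        }],
    )
    return json.loads(response.choices[0].message.content)

slots = extract_slots("User: Hi, I'm Dylan. I'm comparing Bedrock and ChatGPT Agents.")
# e.g. {"name": "Dylan", "area_of_interest": "AI agents",
#       "tools_mentioned": ["Bedrock", "ChatGPT Agents"]}
```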
✅ When This Works Well
- Internal tools (sales agents, support bots, onboarding flows)
- Low-cost AI automations
- MVPs and proof of concept demos
- Scenarios with short-term memory needs
❌ When It Doesn’t
- Long conversations with complex back-and-forth
- When retrieval needs context across multiple use cases
- If multiple users share the same agent instance
- Security-sensitive applications where hallucination = risk

In those cases, use a proper vector DB (like Weaviate, Pinecone, or even Postgres + pgvector).
Final Thought
Sometimes you don’t need a full stack. Sometimes, a good prompt and a clever context injection flow is all it takes. You’ll be surprised how far this “fake memory” gets you.