Most local AI assistants forget everything once the conversation ends.
While experimenting with locally hosted LLMs, I wanted to solve that
problem by giving my assistant persistent memory.
On 16 March 2026, I worked on improving the architecture and
reliability of my local AI assistant project. The main focus was:
- Adding persistent memory
- Integrating PostgreSQL
- Improving project structure
- Running multiple models locally
This article walks through what I built and what I learned.
The Problem: Local AI Assistants Have No Memory
When you run models locally using tools like Ollama, they respond
based only on the current prompt.
They don't remember:
- Your preferences
- Previous conversations
- Important user information
To solve this, I implemented a memory system backed by PostgreSQL.
Designing a Memory Storage System
The idea was simple:
If the user explicitly asks the assistant to remember something, the
system should store that information.
Instead of storing entire conversations, I designed a trigger-based
memory detection system.
Trigger Words
The assistant watches for these keywords:
- remember
- store
- save
If a user message contains one of these triggers, the system extracts
and stores the important information.
Memory Extraction Process
The system follows this pipeline:
- Detect trigger word in user input
- Remove the trigger word
- Clean the remaining text
- Ask the model to convert it into a concise fact
- Store it in the database
Example
User Input
Remember that I prefer Python for backend development.
Stored Memory
User prefers Python for backend development.
This ensures the database contains clean, structured facts instead of
raw conversation logs.
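The pipeline above can be sketched in Python. The trigger list matches the article; the regex cleaning rules and the `summarize` callable (which stands in for the local LLM call) are illustrative assumptions, not the project's exact code.

```python
import re

# Trigger words that signal the user wants something remembered.
TRIGGERS = ("remember", "store", "save")

def detect_trigger(text):
    """Return the first trigger word found in the message, else None."""
    lowered = text.lower()
    for trigger in TRIGGERS:
        if re.search(rf"\b{trigger}\b", lowered):
            return trigger
    return None

def clean_input(text, trigger):
    """Remove the trigger word (and a leading 'that') and tidy whitespace."""
    cleaned = re.sub(rf"\b{trigger}\b", "", text, count=1, flags=re.IGNORECASE)
    cleaned = re.sub(r"^\s*that\b", "", cleaned.strip(), flags=re.IGNORECASE)
    return cleaned.strip()

def extract_memory(user_input, summarize):
    """Full pipeline: detect trigger, clean the text, ask the model for a fact.

    `summarize` is any callable that turns cleaned text into a concise
    fact; in the real project this would be a local LLM call.
    """
    trigger = detect_trigger(user_input)
    if trigger is None:
        return None
    return summarize(clean_input(user_input, trigger))
```

For the example above, `clean_input("Remember that I prefer Python for backend development.", "remember")` yields `"I prefer Python for backend development."`, which the model then rewrites into the third-person fact that gets stored.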
PostgreSQL Integration
To store memories persistently, I integrated PostgreSQL with the
assistant.
Three core database functions were implemented:
- store_memory()
- get_memories()
- clear_whole_database()
Using PostgreSQL ensures that memories remain available even after
restarting the assistant.
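A minimal sketch of those three functions is shown below. The real project talks to PostgreSQL through a driver such as psycopg2; SQLite is used here only so the sketch is self-contained, since both expose the same DB-API shape (PostgreSQL would use `%s` placeholders instead of `?`, and `SERIAL` instead of `INTEGER PRIMARY KEY`). The table name and schema are assumptions.

```python
import sqlite3

def init_db(conn):
    """Create the memories table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS memories ("
        "id INTEGER PRIMARY KEY, fact TEXT NOT NULL)"
    )
    conn.commit()

def store_memory(conn, fact):
    """Insert one extracted fact into the memories table."""
    conn.execute("INSERT INTO memories (fact) VALUES (?)", (fact,))
    conn.commit()

def get_memories(conn):
    """Return all stored facts, oldest first."""
    rows = conn.execute("SELECT fact FROM memories ORDER BY id").fetchall()
    return [row[0] for row in rows]

def clear_whole_database(conn):
    """Delete every stored memory."""
    conn.execute("DELETE FROM memories")
    conn.commit()
```

Because the facts live in a real database rather than in process memory, they survive assistant restarts.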
Improving Reliability with Error Handling
AI systems interacting with databases can fail for many reasons.
To make the assistant more stable, I wrapped the memory storage logic
inside a try/except block.
Benefits:
- Prevents application crashes
- Logs errors properly
- Allows the conversation to continue
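The wrapping can look something like this; the helper name and the `store` callable are illustrative, not the project's exact API.

```python
import logging

logger = logging.getLogger(__name__)

def safe_store_memory(store, fact):
    """Attempt to store a fact without letting a DB error crash the chat loop.

    `store` is any callable that persists the fact, e.g. a wrapper
    around the store_memory() database function.
    """
    try:
        store(fact)
        return True
    except Exception:
        # Log the full traceback, then let the conversation continue.
        logger.exception("Failed to store memory: %r", fact)
        return False
```

Returning a boolean lets the caller tell the user whether the memory was actually saved instead of failing silently.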
Implementing a Centralized Logging System
Originally, the project printed logs directly to the terminal.
As the project grew, this became messy and hard to debug.
I implemented a centralized logging configuration.
Logging Structure
Log files are written to:
logs/
Logging configuration lives in:
config/logging_config.py
Advantages:
- Cleaner terminal output
- Persistent logs for debugging
- Easier monitoring of system behavior
Running Local LLMs with Ollama
The assistant runs multiple models locally using Ollama.
Model Stack
| Model | Purpose |
| --- | --- |
| Llama3 | General conversation and reasoning |
| DeepSeek-Coder | Programming and technical questions |
| Phi3 | Lightweight fallback model |
This setup allows the assistant to choose the most suitable model
depending on the task.
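A sketch of that task-based routing. The model tags match the stack above; the keyword heuristic, the `low_memory` flag, and the commented Ollama call are illustrative assumptions rather than the project's exact logic.

```python
# Keywords that suggest a programming question (illustrative list).
CODE_KEYWORDS = ("code", "python", "function", "bug", "error", "compile")

def pick_model(prompt, low_memory=False):
    """Choose an Ollama model tag based on the prompt and available resources."""
    if low_memory:
        return "phi3"  # lightweight fallback
    if any(word in prompt.lower() for word in CODE_KEYWORDS):
        return "deepseek-coder"  # programming and technical questions
    return "llama3"  # general conversation and reasoning

# The chosen tag would then be passed to Ollama, e.g.:
# import ollama
# reply = ollama.chat(model=pick_model(prompt),
#                     messages=[{"role": "user", "content": prompt}])
```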
Refactoring the Project Structure
As the project expanded, the codebase needed better organization.
Updated Project Structure
offline_chat_project/
├── app/
│   └── main.py
├── ai/
├── database/
├── config/
│   └── logging_config.py
├── logs/
├── project_logs/
├── scripts/
├── .env
├── requirements.txt
└── README.md
Key Improvements
- Separated AI logic into the ai/ module
- Isolated database operations inside database/
- Added centralized logging configuration
- Organized logs and project documentation
Version Control Strategy
All database and memory-related work was developed in a dedicated
feature branch.
feature/database_store
Using feature branches helps keep the main branch stable while
developing new functionality.
Lessons Learned
Small Models Can Be Unreliable
Smaller models sometimes generate inconsistent structured outputs.
When building memory systems, it's important to validate the extracted
data before storing it.
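One way to validate model output before it reaches the database is a simple gate like the one below. The thresholds and refusal markers are illustrative assumptions; tune them for whichever small model you run.

```python
def is_storable_fact(fact):
    """Reject outputs that are unlikely to be clean, single-sentence facts."""
    fact = fact.strip()
    if not (10 <= len(fact) <= 300):
        # Too short to be meaningful, or too long to be a concise fact.
        return False
    if fact.count(".") > 2:
        # Probably several sentences instead of one fact.
        return False
    refusal_markers = ("as an ai", "i cannot", "sorry")
    if any(marker in fact.lower() for marker in refusal_markers):
        # The model refused or rambled instead of extracting a fact.
        return False
    return True
```

Only facts that pass this check would be handed to the storage function; everything else is discarded or retried.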
Memory Systems Need Filtering
Without proper filtering, the assistant might store irrelevant or
incorrect information.
The system should only store long-term meaningful facts.
Good Project Structure Matters
As projects grow, maintaining clean architecture becomes critical.
Separating modules early prevents major refactoring later.
What's Next
Planned improvements include:
- Injecting stored memories into prompts
- Adding commands like show memories
- Implementing intelligent model routing
- Improving memory filtering
The goal is to make the assistant behave more like a personalized AI
system rather than a stateless chatbot.
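Of the planned items, memory injection is straightforward to sketch: prepend the stored facts to the prompt before sending it to the model. The prompt format here is an assumption, not a finalized design.

```python
def build_prompt(user_message, memories):
    """Prepend stored facts to the user's message as context for the model."""
    if not memories:
        return user_message
    context = "\n".join(f"- {fact}" for fact in memories)
    return (
        "Known facts about the user:\n"
        f"{context}\n\n"
        f"User message: {user_message}"
    )
```

The `memories` list would come straight from `get_memories()`, closing the loop between storage and recall.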
Final Thoughts
Building a local AI assistant with persistent memory is an
interesting engineering challenge.
Combining:
- PostgreSQL
- Local LLMs
- Modular architecture
- Structured memory storage
brings us closer to creating personal AI systems that truly remember
users.
I am tracking all of the updates in the ProjectLogs folder of my GitHub repo.
For code and updates, check out my GitHub repository.