Most local AI assistants forget everything once the conversation ends.
While experimenting with locally hosted LLMs, I wanted to solve that
problem by giving my assistant persistent memory.
On 16 March 2026, I worked on improving the architecture and
reliability of my local AI assistant project. The main focus was:
- Adding persistent memory
- Integrating PostgreSQL
- Improving project structure
- Running multiple models locally
This article walks through what I built and what I learned.
The Problem: Local AI Assistants Have No Memory
When you run models locally using tools like Ollama, they respond
based only on the current prompt.
They don't remember:
- Your preferences
- Previous conversations
- Important user information
To solve this, I implemented a memory system backed by PostgreSQL.
Designing a Memory Storage System
The idea was simple:
If the user explicitly asks the assistant to remember something, the
system should store that information.
Instead of storing entire conversations, I designed a trigger-based
memory detection system.
Trigger Words
The assistant watches for these keywords:
- remember
- store
- save
If a user message contains one of these triggers, the system extracts
and stores the important information.
Memory Extraction Process
The system follows this pipeline:
- Detect trigger word in user input
- Remove the trigger word
- Clean the remaining text
- Ask the model to convert it into a concise fact
- Store it in the database
Example
User Input
Remember that I prefer Python for backend development.
Stored Memory
User prefers Python for backend development.
This ensures the database contains clean, structured facts instead of
raw conversation logs.
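The pipeline above can be sketched in Python. The trigger list matches the article; the regex cleaning rules and the `summarize` callable (which stands in for the local LLM call) are illustrative assumptions, not the project's exact code.

```python
import re

# Trigger words that signal the user wants something remembered.
TRIGGERS = ("remember", "store", "save")

def detect_trigger(text):
    """Return the first trigger word found in the message, else None."""
    lowered = text.lower()
    for trigger in TRIGGERS:
        if re.search(rf"\b{trigger}\b", lowered):
            return trigger
    return None

def clean_input(text, trigger):
    """Remove the trigger word (and a leading 'that') and tidy whitespace."""
    cleaned = re.sub(rf"\b{trigger}\b", "", text, count=1, flags=re.IGNORECASE)
    cleaned = re.sub(r"^\s*that\b", "", cleaned.strip(), flags=re.IGNORECASE)
    return cleaned.strip()

def extract_memory(user_input, summarize):
    """Full pipeline: detect trigger, clean the text, ask the model for a fact.

    `summarize` is any callable that turns cleaned text into a concise
    fact; in the real project this would be a local LLM call.
    """
    trigger = detect_trigger(user_input)
    if trigger is None:
        return None
    return summarize(clean_input(user_input, trigger))
```

For the example above, `clean_input("Remember that I prefer Python for backend development.", "remember")` yields `"I prefer Python for backend development."`, which the model then rewrites into the third-person fact that gets stored.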
PostgreSQL Integration
To store memories persistently, I integrated PostgreSQL with the
assistant.
Three core database functions were implemented:
- store_memory()
- get_memories()
- clear_whole_database()
Using PostgreSQL ensures that memories remain available even after
restarting the assistant.
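A minimal sketch of those three functions is shown below. The real project talks to PostgreSQL through a driver such as psycopg2; SQLite is used here only so the sketch is self-contained, since both expose the same DB-API shape (PostgreSQL would use `%s` placeholders instead of `?`, and `SERIAL` instead of `INTEGER PRIMARY KEY`). The table name and schema are assumptions.

```python
import sqlite3

def init_db(conn):
    """Create the memories table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS memories ("
        "id INTEGER PRIMARY KEY, fact TEXT NOT NULL)"
    )
    conn.commit()

def store_memory(conn, fact):
    """Insert one extracted fact into the memories table."""
    conn.execute("INSERT INTO memories (fact) VALUES (?)", (fact,))
    conn.commit()

def get_memories(conn):
    """Return all stored facts, oldest first."""
    rows = conn.execute("SELECT fact FROM memories ORDER BY id").fetchall()
    return [row[0] for row in rows]

def clear_whole_database(conn):
    """Delete every stored memory."""
    conn.execute("DELETE FROM memories")
    conn.commit()
```

Because the facts live in a real database rather than in process memory, they survive assistant restarts.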
Improving Reliability with Error Handling
AI systems interacting with databases can fail for many reasons.
To make the assistant more stable, I wrapped the memory storage logic
inside a try/except block.
Benefits:
- Prevents application crashes
- Logs errors properly
- Allows the conversation to continue
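The wrapping can look something like this; the helper name and the `store` callable are illustrative, not the project's exact API.

```python
import logging

logger = logging.getLogger(__name__)

def safe_store_memory(store, fact):
    """Attempt to store a fact without letting a DB error crash the chat loop.

    `store` is any callable that persists the fact, e.g. a wrapper
    around the store_memory() database function.
    """
    try:
        store(fact)
        return True
    except Exception:
        # Log the full traceback, then let the conversation continue.
        logger.exception("Failed to store memory: %r", fact)
        return False
```

Returning a boolean lets the caller tell the user whether the memory was actually saved instead of failing silently.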
Implementing a Centralized Logging System
Originally, the project printed logs directly to the terminal.
As the project grew, this became messy and hard to debug.
I implemented a centralized logging configuration.
Logging Structure
Log files are written to:
logs/
Logging configuration lives in:
config/logging_config.py
Advantages:
- Cleaner terminal output
- Persistent logs for debugging
- Easier monitoring of system behavior
Running Local LLMs with Ollama
The assistant runs multiple models locally using Ollama.
Model Stack
| Model | Purpose |
| --- | --- |
| Llama3 | General conversation and reasoning |
| DeepSeek-Coder | Programming and technical questions |
| Phi3 | Lightweight fallback model |
This setup allows the assistant to choose the most suitable model
depending on the task.
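A sketch of that task-based routing. The model tags match the stack above; the keyword heuristic, the `low_memory` flag, and the commented Ollama call are illustrative assumptions rather than the project's exact logic.

```python
# Keywords that suggest a programming question (illustrative list).
CODE_KEYWORDS = ("code", "python", "function", "bug", "error", "compile")

def pick_model(prompt, low_memory=False):
    """Choose an Ollama model tag based on the prompt and available resources."""
    if low_memory:
        return "phi3"  # lightweight fallback
    if any(word in prompt.lower() for word in CODE_KEYWORDS):
        return "deepseek-coder"  # programming and technical questions
    return "llama3"  # general conversation and reasoning

# The chosen tag would then be passed to Ollama, e.g.:
# import ollama
# reply = ollama.chat(model=pick_model(prompt),
#                     messages=[{"role": "user", "content": prompt}])
```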
Refactoring the Project Structure
As the project expanded, the codebase needed better organization.
Updated Project Structure
offline_chat_project/
├── app/
│   └── main.py
├── ai/
├── database/
├── config/
│   └── logging_config.py
├── logs/
├── project_logs/
├── scripts/
├── .env
├── requirements.txt
└── README.md
Key Improvements
- Separated AI logic into the ai/ module
- Isolated database operations inside database/
- Added centralized logging configuration
- Organized logs and project documentation
Version Control Strategy
All database and memory-related work was developed in a dedicated
feature branch.
feature/database_store
Using feature branches helps keep the main branch stable while
developing new functionality.
Lessons Learned
Small Models Can Be Unreliable
Smaller models sometimes generate inconsistent structured outputs.
When building memory systems, it's important to validate the extracted
data before storing it.
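One way to validate model output before it reaches the database is a simple gate like the one below. The thresholds and refusal markers are illustrative assumptions; tune them for whichever small model you run.

```python
def is_storable_fact(fact):
    """Reject outputs that are unlikely to be clean, single-sentence facts."""
    fact = fact.strip()
    if not (10 <= len(fact) <= 300):
        # Too short to be meaningful, or too long to be a concise fact.
        return False
    if fact.count(".") > 2:
        # Probably several sentences instead of one fact.
        return False
    refusal_markers = ("as an ai", "i cannot", "sorry")
    if any(marker in fact.lower() for marker in refusal_markers):
        # The model refused or rambled instead of extracting a fact.
        return False
    return True
```

Only facts that pass this check would be handed to the storage function; everything else is discarded or retried.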
Memory Systems Need Filtering
Without proper filtering, the assistant might store irrelevant or
incorrect information.
The system should only store long-term meaningful facts.
Good Project Structure Matters
As projects grow, maintaining clean architecture becomes critical.
Separating modules early prevents major refactoring later.
What's Next
Planned improvements include:
- Injecting stored memories into prompts
- Adding commands like show memories
- Implementing intelligent model routing
- Improving memory filtering
The goal is to make the assistant behave more like a personalized AI
system rather than a stateless chatbot.
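Of the planned items, memory injection is straightforward to sketch: prepend the stored facts to the prompt before sending it to the model. The prompt format here is an assumption, not a finalized design.

```python
def build_prompt(user_message, memories):
    """Prepend stored facts to the user's message as context for the model."""
    if not memories:
        return user_message
    context = "\n".join(f"- {fact}" for fact in memories)
    return (
        "Known facts about the user:\n"
        f"{context}\n\n"
        f"User message: {user_message}"
    )
```

The `memories` list would come straight from `get_memories()`, closing the loop between storage and recall.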
Final Thoughts
Building a local AI assistant with persistent memory is an
interesting engineering challenge.
Combining:
- PostgreSQL
- Local LLMs
- Modular architecture
- Structured memory storage
brings us closer to creating personal AI systems that truly remember
users.
I am tracking all of the updates in the ProjectLogs folder of my GitHub repo.
For code and updates, check out my GitHub repository.