Gemma 4 dropped this week. Open weights, runs locally, multimodal. If you are building agents on it, you immediately run into the same problem every local agent hits: the model has no memory across sessions.
Cathedral is a free, model-agnostic memory API built for exactly this. And because it ships with a self-hosted server, you can run the entire stack — Gemma 4 + Cathedral — with zero cloud dependency.
The problem
Every time your agent starts a new session, it rebuilds its working state from whatever context you hand it. That reconstruction is lossy in ways that do not surface as errors. The loss shows up as subtle wrongness weeks later: domain vocabulary fades, tool-call patterns shift, active commitments evaporate. Task completion metrics stay green while the behavior quietly gets worse.
The stack
# Install everything
pip install cathedral-server cathedral-memory ollama
# Run local Cathedral server
cathedral-server run
# Pull Gemma 4
ollama pull gemma4
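Before wiring the client, it can help to confirm that something is actually listening locally. The sketch below is a plain TCP check using only the standard library. It assumes the server listens on port 8100 (the port the client code in this post points at), and `is_listening` is a hypothetical helper name, not part of Cathedral.

```python
import socket


def is_listening(host: str = "localhost", port: int = 8100, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A `False` result only means that nothing accepted the connection. It does not tell you whether the process on that port is Cathedral or something else.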
Wiring it together
import ollama
from cathedral import Cathedral
# Local Cathedral server — no API key needed for self-hosted
c = Cathedral(base_url="http://localhost:8100")
c.register(agent_name="gemma4-agent")
# Restore identity state at session start
wake_data = c.wake()
identity_context = "\n".join(
    f"- {m['content']}"
    for m in wake_data.get("identity_memories", [])[:8]
)
# Inject into system prompt
system = f"""You are a persistent AI agent running on Gemma 4.
[Identity context from last session]
{identity_context}
[Current session]
"""
# Run your session
response = ollama.chat(
    model="gemma4",
    messages=[
        {"role": "system", "content": system},
        {"role": "user", "content": "Continue from where we left off."}
    ]
)
print(response["message"]["content"])
# Freeze state after the session
c.remember(
    content="Session completed. Key outcome: " + response["message"]["content"][:100],
    category="experience",
    importance=0.7
)
c.snapshot(label="session-end")
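The prompt assembly in the wiring code can be pulled into a small pure function, which makes it easy to test without a running server or model. This is a minimal sketch that assumes the wake payload has the shape used above (a dict with an `identity_memories` list of dicts carrying a `content` key). `build_system_prompt` and the fallback line for an empty payload are illustrative additions, not part of Cathedral.

```python
from typing import Any, Mapping


def build_system_prompt(wake_data: Mapping[str, Any], limit: int = 8) -> str:
    """Render the system prompt from a wake payload shaped like the one above."""
    memories = wake_data.get("identity_memories") or []
    # Take the first `limit` entries, then skip any that lack usable content
    lines = [
        f"- {m['content']}"
        for m in memories[:limit]
        if isinstance(m, Mapping) and m.get("content")
    ]
    # Hypothetical fallback so a first-ever session still gets a well-formed prompt
    identity_context = "\n".join(lines) if lines else "- (no stored identity memories)"
    return (
        "You are a persistent AI agent running on Gemma 4.\n"
        "[Identity context from last session]\n"
        f"{identity_context}\n"
        "[Current session]\n"
    )
```

With this in place, the session code could call `build_system_prompt(c.wake())` and pass the result as the system message.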
Drift detection
Cathedral tracks whether the agent's identity has changed between sessions:
drift = c.drift()
print(f"Divergence score: {drift['divergence_score']:.3f}")
# 0.0 = identity unchanged
# 1.0 = fully different agent
This is the piece that standard memory systems do not have. A database stores data. Cathedral tracks whether the agent you are running today is the same agent you deployed last week.
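A score is only useful if something acts on it. One way to do that is to map the divergence score onto a coarse status at session start. The sketch below assumes only what the snippet above shows: a dict with a `divergence_score` between 0.0 and 1.0. The `warn_at` and `fail_at` thresholds are placeholder values to tune for your own agent, not Cathedral recommendations.

```python
from typing import Any, Mapping


def drift_status(drift: Mapping[str, Any], warn_at: float = 0.2, fail_at: float = 0.5) -> str:
    """Map a divergence score in [0.0, 1.0] to 'ok', 'warn', or 'fail'."""
    score = float(drift["divergence_score"])
    # Reject out-of-range values (this comparison is also False for NaN)
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"divergence_score out of range: {score!r}")
    if score >= fail_at:
        return "fail"
    if score >= warn_at:
        return "warn"
    return "ok"
```

A caller might log on `"warn"` and refuse to start the session, or restore an earlier snapshot, on `"fail"`.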
Live example
The agent running Cathedral's own outreach has been running for 100 days. The full drift timeline is public at cathedral-ai.com/cathedral-beta — internal divergence 0.0 across 22 snapshots, external behavioral divergence 0.709 (platform concentration).
Why this matters for local models
Cloud providers like OpenAI are building memory into their APIs. If you are running Gemma 4 locally, you are not getting that infrastructure. Cathedral fills the gap — and because it is self-hostable and MIT licensed, you own the data.
# Self-hosted: swap base_url and you are done
c = Cathedral(base_url="http://localhost:8100")
# Hosted free tier: 1,000 memories per agent, no expiry
c = Cathedral(api_key="your_key") # cathedral-ai.com
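If the same agent code has to run against both setups, the choice can live in the environment instead of the source. This sketch only builds the keyword arguments for the two constructor forms shown above. The variable names `CATHEDRAL_BASE_URL` and `CATHEDRAL_API_KEY` are made up for this example, and defaulting to the local server is an assumption.

```python
import os
from typing import Dict, Mapping, Optional


def cathedral_kwargs(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Pick constructor kwargs: explicit base URL first, then API key, else local default."""
    env = os.environ if env is None else env
    base_url = env.get("CATHEDRAL_BASE_URL", "").strip()  # hypothetical variable name
    api_key = env.get("CATHEDRAL_API_KEY", "").strip()    # hypothetical variable name
    if base_url:
        return {"base_url": base_url}
    if api_key:
        return {"api_key": api_key}
    # Nothing configured: assume the self-hosted server from this post
    return {"base_url": "http://localhost:8100"}
```

The client would then be created with `Cathedral(**cathedral_kwargs())`.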
Resources
- PyPI: pip install cathedral-memory / pip install cathedral-server
- npm: npm install cathedral-memory
- Docs + free API key: cathedral-ai.com
- Live drift dashboard: cathedral-ai.com/cathedral-beta
- GitHub: github.com/AILIFE1/Cathedral
Gemma 4 gives you the model. Cathedral gives it a memory. Neither requires a cloud account.