How MemoraEU Cannot Read Your Memories — Even If We Wanted To
Zero-knowledge architecture of a sovereign AI memory layer
The question nobody asks enough
When Claude, ChatGPT, or Gemini "remembers" something, where does it go? To Anthropic's, OpenAI's, or Google's servers. In plaintext. Potentially used to fine-tune future models. Subject to the CLOUD Act if the company is American.
That's the trade-off we implicitly accept in exchange for convenience.
MemoraEU makes a different bet: the server must never be able to read your data. Not as a policy. As an irreversible technical constraint. This post explains how we get there — and why it's harder than it sounds when you still want semantic search to work.
The architecture in one sentence
Content is encrypted on your machine before leaving your machine. The key never leaves your machine. The server stores opaque blobs and floating-point vectors.
That's it. Everything else is implementation.
Key derivation: PBKDF2-HMAC-SHA256
You configure two environment variables in your MCP server:
MEMORAEU_SECRET=your-long-unique-passphrase
MEMORAEU_SALT=one-salt-per-installation
At startup, a single derivation operation:
from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC
from cryptography.hazmat.primitives import hashes
kdf = PBKDF2HMAC(
    algorithm=hashes.SHA256(),
    length=32,           # AES-256 → 32 bytes
    salt=salt_bytes,
    iterations=100_000,  # NIST SP 800-132 recommends ≥ 1,000; we go well beyond
)
key = kdf.derive(password.encode())
100,000 iterations of HMAC-SHA256: enough to make brute-forcing a long passphrase prohibitively expensive, without noticeably slowing down startup (< 200 ms on a modern laptop).
The derived key is kept in RAM for the duration of the session. It is never written to disk, never transmitted, never logged.
Encryption: AES-256-GCM
import os, base64
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
def encrypt(plaintext: str, key: bytes) -> str:
    nonce = os.urandom(12)  # 96-bit, random per message
    ciphertext = AESGCM(key).encrypt(nonce, plaintext.encode("utf-8"), None)
    return base64.b64encode(nonce + ciphertext).decode("ascii")
The format of the blob stored server-side:
base64( nonce[12 bytes] | ciphertext | auth_tag[16 bytes] )
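To make that byte layout concrete, here is a hypothetical helper that splits a blob back into its parts (purely illustrative; the server never does this, and `split_blob` is not part of the actual codebase):

```python
import base64

def split_blob(blob_b64: str) -> tuple[bytes, bytes, bytes]:
    """Split a stored blob into (nonce, ciphertext, auth_tag)."""
    raw = base64.b64decode(blob_b64)
    # first 12 bytes: nonce; last 16 bytes: GCM authentication tag
    return raw[:12], raw[12:-16], raw[-16:]
```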
Three key properties of GCM:
- Confidentiality — without the key, the ciphertext is indistinguishable from random noise
- Integrity — the 16-byte authentication tag detects any modification of the ciphertext (authenticated encryption)
- Unique nonce per message — identical content produces different blobs on every encryption
What the server sees: a base64 string. What it can infer: the approximate size of the original content (± a few bytes). Nothing else.
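On the client, the inverse operation is a minimal companion to `encrypt` above (a sketch: the `cryptography` library raises `InvalidTag` from `AESGCM.decrypt` if the blob was modified, which is how integrity failures surface):

```python
import base64
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def decrypt(blob_b64: str, key: bytes) -> str:
    raw = base64.b64decode(blob_b64)
    # first 12 bytes are the nonce; the rest is ciphertext + 16-byte GCM tag
    nonce, ciphertext = raw[:12], raw[12:]
    # raises cryptography.exceptions.InvalidTag on any tampering
    return AESGCM(key).decrypt(nonce, ciphertext, None).decode("utf-8")
```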
The real challenge: searching encrypted data
Encrypting and storing is easy. But an AI memory without search is useless. And semantic search requires understanding the meaning of content — something a server cannot do on ciphertext.
The naive solution would be to decrypt server-side to compute the embedding. Obviously we don't do that.
Our approach: embeddings are computed before encryption, on your machine.
Plaintext
│
├─► Mistral Embed (local) ──► float[1024] vector ──► Qdrant (server)
│
└─► AES-256-GCM ──► opaque blob ──► PostgreSQL (server)
When you store "I use ESP32-S3 with UART on GPIO21":
- The plaintext goes to the Mistral Embed API (from your machine, via your Mistral API key)
- Mistral returns a 1024-dimensional vector representing the semantics
- The text is encrypted locally
- Only the vector and the encrypted blob travel to our servers
When you search "UART wiring on my board":
- The query is turned into a vector (same process, local)
- Qdrant performs a cosine similarity search in the vector space
- The matching blobs are returned
- They are decrypted locally before being shown to Claude
What the server can do: find the N nearest vectors to a query. It knows that two memories are "semantically close" without knowing what they say.
What the server cannot do: read the content, understand the topic, infer anything beyond the vector structure.
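The similarity the server computes is pure geometry. A toy cosine similarity makes the point: this is everything Qdrant learns about two memories, an angle between two vectors:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Angle-based closeness of two vectors; 1.0 = same direction, 0.0 = orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```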
Zero-knowledge deduplication
Before storing a new memory, we check whether it already exists — without ever comparing plaintext:
DEDUP_SKIP_THRESHOLD = 0.94 # exact duplicate → reject storage
DEDUP_WARN_THRESHOLD = 0.85 # very similar → warn but store
response = await api_post("/memories/search-by-vector", {
    "vector": embedding,  # vector computed locally
    "limit": 1,
    "threshold": DEDUP_WARN_THRESHOLD,
})
The comparison happens entirely in vector space. If the cosine similarity score exceeds 0.94, it's an exact duplicate: we reject the storage and return the existing ID. Between 0.85 and 0.94: we inform the user but store anyway.
Result: zero plaintext transmitted for deduplication.
Smart compression (optional)
If MISTRAL_API_KEY is configured, the MCP server compresses long memories before encrypting them:
Raw text (> 300 chars)
  │
  └─► Mistral (local): "summarize in 1-3 sentences"
        │
        └─► Compressed text ──► Embed ──► Encrypt ──► Store
Compression happens before encryption, on plaintext, on your machine. What goes to the Mistral API is your raw text — but it's your Mistral key, on your infrastructure, and Mistral does not store prompts by default. What goes to our servers is always encrypted.
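The pipeline can be sketched with the three stages injected as callables (a sketch under assumptions: `COMPRESS_MIN_CHARS` and the function names are invented here, based only on the "> 300 chars" figure in the diagram above):

```python
COMPRESS_MIN_CHARS = 300  # hypothetical threshold, from the diagram above

def prepare_content(text, summarize, embed, encrypt):
    """Compress (if long), then embed and encrypt, all client-side.

    summarize/embed/encrypt stand in for the Mistral chat call,
    Mistral Embed, and AES-256-GCM respectively.
    """
    if len(text) > COMPRESS_MIN_CHARS:
        text = summarize(text)  # plaintext summarized locally, pre-encryption
    # only these two values ever travel to the server
    return embed(text), encrypt(text)
```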
What the server actually sees
In the database, a memory looks like this:
{
  "id": "mem_01HVKX9...",
  "content": "dGhpcyBpcyBub3QgcmVhZGFibGUgYXQgYWxs...",
  "category": "hardware",
  "embedding": [0.0234, -0.1823, 0.0091, ...],
  "created_at": "2026-04-25T09:14:00Z"
}
content is a base64 blob that cannot be decrypted without the key. embedding is a vector that captures semantics but not literal content. category is assigned locally by the LLM before encryption — it's the only readable metadata, and it's intentionally generic ("hardware", "personal", "project"…).
The threat model
This scenario is covered: our servers are compromised. An attacker retrieves the entire database and the Qdrant vectors. They see base64 blobs and coordinates in a 1024-dimensional space. Without your passphrase, there's nothing they can do. Even we can't.
This scenario is not covered: your machine is compromised. If an attacker has access to your local environment, they can read MEMORAEU_SECRET from your .env or intercept content before encryption. No zero-knowledge architecture can protect against client-side compromise — this is a fundamental limitation, not specific to MemoraEU.
This scenario is partially covered: passphrase reuse. If you use the same passphrase across multiple installations, compromising one machine affects all others. A different MEMORAEU_SALT per installation mitigates this risk.
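The effect of a per-installation salt is easy to demonstrate with the standard library's PBKDF2 (shown here with `hashlib` for self-containment, using the same parameters as the derivation above): the same passphrase yields unrelated keys under different salts.

```python
import hashlib

def derive_key(passphrase: str, salt: str) -> bytes:
    # same parameters as the PBKDF2HMAC setup above: SHA-256, 100k iterations, 32 bytes
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt.encode(), 100_000, dklen=32)
```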
Cryptographic roadmap
The current v1 uses a fixed salt per installation (stored in .env). This is pragmatic but imperfect:
- Phase 2: unique salt per user, stored server-side (the server provides the salt, not the key — this doesn't break zero-knowledge)
- Phase 3: per-memory-pair encryption to prevent temporal correlation
- Phase 4: HSM support for enterprise deployments (key in hardware, never in RAM)
Why this matters now
LLMs are becoming permanent assistants. They will know more and more about you — your projects, your decisions, your family, your health, your finances. Where that memory is stored and who can access it is a question of personal sovereignty, not just product preference.
Zero-knowledge is not a marketing argument. It's an architectural constraint we impose on ourselves so we are never in the position of having to choose between our commercial interests and your privacy.
MemoraEU is open source. The encryption code is available on GitHub.
Technical questions: contact@memoraeu.com