DEV Community

zahraarmantech
zahraarmantech

Posted on

Stop choosing between smart search and private data

A few months ago I built a way to search documents by meaning while keeping the embeddings hidden — even from the server doing the search. I called it ZATRON.

The obvious question everyone (including me) kept asking was: does it actually hide anything, or does it just look scrambled?

Scrambled-looking isn't the same as secure. So instead of trusting a correlation number, I did the thing that actually scares me: I trained a neural network to break it.

This post is the honest write-up — including the part where I tried hard to make the attack win.

The setup

Standard semantic search stores embeddings as plain vectors. Anyone with database access can cluster them by topic and infer content without reading a word. ZATRON transforms each embedding into a modular barcode: project onto PCA channels, quantize, add a per-document keyed mask, and keep only residues modulo a set of primes. You compare barcodes in modular space; the original embedding is never reconstructed.

Retrieval still works — 98% of cosine quality on 626K MSMARCO passages. The question is whether the barcodes leak.

Why a correlation number wasn't enough

My first security check was a Spearman correlation between barcode distance and true similarity. It came out near zero (ρ ≈ 0.05). Good — but a low linear correlation only rules out a simple attacker. A neural network doesn't need linearity. It can learn whatever structure is there.

So the real test: give a neural network every advantage and see if it can recover similarity from the barcodes.

The threat model (making the attacker strong on purpose)

I used a known-plaintext attacker — the strongest realistic setting:

  • It sees all the stored barcodes.
  • It also gets 80,000 document pairs with their true cosine similarities (as if a chunk of plaintext leaked).
  • It trains a model — a linear probe and a 3-layer MLP — to predict the similarity of unseen pairs from per-prime circular-difference features.
  • Train and test pairs share no anchor documents, so it can't just memorize.

And the part that makes the result trustworthy: I ran the identical attack on the unprotected quantized signals as a control. If the attack can't break those, the attack is too weak and the test means nothing.

The result

On 50,000 MSMARCO passages, 100,000 labeled pairs:

Input the attacker sees Linear probe MLP (3-layer)
Unprotected signals (control) ρ = 0.79, AUC = 0.985 ρ = 0.90, AUC = 0.999
ZATRON barcodes ρ = 0.00, AUC = 0.498 ρ = 0.00, AUC = 0.505

The same network that recovers similarity from unprotected signals almost perfectly (AUC 0.999) gets exactly chance level on the barcodes — with 80,000 labeled pairs to learn from. AUC 0.50 is a coin flip.

It learned nothing.

I also put it head-to-head with the classic baseline

"8x faster than FHE" is a weak flex — everyone knows FHE is slow. The fairer comparison is ASPE (Wong et al., SIGMOD 2009), the classic encrypted-kNN scheme. ASPE preserves scalar products exactly, so retrieval is perfect — but that same property means any observer can read similarities straight off the ciphertexts.

ASPE (SIGMOD '09) ZATRON
Retrieval recall@10 (strict) 100% 81%
Observer reads similarity directly ρ = +0.87 ρ = −0.06
Learned attack (MLP) ρ = +0.91, AUC = 0.99 ρ = +0.01, AUC = 0.52

ASPE buys perfect recall with total leakage. ZATRON gives up a margin on the strictest retrieval metric and leaks nothing — to a direct observer or a trained network.

What I'm NOT claiming

Honesty is the whole point, so the limits:

  • This is the observer threat model. A key holder computing many pairwise distances can still partially recover geometry via MDS (ρ ≈ 0.35) — that's inherent to any distance-preserving scheme, FHE included.
  • It is a randomized privacy-preserving encoding, not a reversible cipher, and not yet independently audited by a cryptographer. That's the right bar before anyone calls it production-grade.
  • The strict recall metric here (full top-10 set overlap) is harder than the top-1-in-top-10 number I quote elsewhere. Same system, stricter ruler.

Try it / break it

Everything is reproducible:

pip install zatron
Enter fullscreen mode Exit fullscreen mode

The attack and the ASPE comparison are in the repo as runnable scripts (benchmarks/). If you can make the neural attack win — train it longer, give it more pairs, better features — I genuinely want to see it. Finding the weakness is the point.

I'd rather have someone break this now than after I've claimed too much.

Top comments (0)