Hendry Yeh

Posted on Apr 15

How we built zero-knowledge end-to-end encryption for a mobile AI coding agent companion

#security #cryptography #privacy #ai

CodeVibe is a mobile companion for AI coding agents — Claude Code, Gemini CLI, and Codex CLI — that lets you approve file edits and send prompts from your phone while your agent runs on your computer. Our backend never sees your prompts, your code, or your responses in plaintext. This post walks through the protocol that makes that true, and what it actually does (and doesn't) protect you against.

We use four boring, well-audited building blocks:

AES-256-GCM for content encryption
ECDH P-256 for key agreement between devices
HKDF-SHA256 to derive symmetric keys from ECDH shared secrets
Per-device session key wrapping so every device you own can independently decrypt every session, without the backend ever handling a plaintext key

Boring is the point. Novel cryptography is how you ship CVEs. The interesting part isn't the primitives — it's how they fit together for a multi-device coding workflow where you sign in once and want your iPhone, iPad, and Android tablet to all just work.

The problem with "E2E encrypted"

"End-to-end encrypted" has become a marketing word. A lot of tools claim it. Not many mean what you'd expect them to mean.

For a tool that sits between your AI coding agent and your phone, "encrypted in transit" is table stakes — that's just TLS. "Encrypted at rest in our database" is also fine but doesn't help you much: if our backend gets breached, or if a court order lands on our desk, "at rest" encryption means "we have the key sitting right here next to the ciphertext."

The phrase you actually want is zero-knowledge. It means the operator (us) has no technical path to decrypt your data. Not "we don't look." Not "we promise not to look." We literally cannot look. There is no master key, no escrow, no recovery path that routes through our servers. If you lose every device you own, the data is gone — and that tradeoff is the whole point.

This matters more for an AI coding companion than for a chat app, because the data flowing through an AI coding session is some of the most sensitive content on your machine:

Your proprietary source code
AWS credentials, API tokens, database passwords the agent happens to grep
Whatever your agent reads from .env files during a task
The prompts you typed, which often contain confidential product strategy
Error messages, stack traces, logs

If you're piping that through a backend that can read it, you haven't actually solved the "I want to be productive on my phone without leaking secrets" problem. You've just moved the leak.

The threat model

Before the protocol, the threat model. Every encryption post that skips this section is hand-waving.

What we protect against:

Full backend breach. If an attacker dumps our entire backend — database, logs, serverless state, everything — they get ciphertext events and per-device encrypted key blobs. No plaintext prompts, no plaintext code, no master key to unwrap anything.
Lawful compulsion. If we're compelled to hand over data, we can only hand over what we have: ciphertext. Key material sufficient to decrypt it is not on our infrastructure and never was.
Man-in-the-middle. Between the desktop plugin and our backend, and between our backend and the mobile app, all traffic is double-wrapped: TLS for transport, plus the application-layer encryption described below. MITMing TLS doesn't unlock anything.
Cross-user leakage. A compromised session or event belonging to one user tells an attacker nothing about any other user's data, because each session key is generated independently and wrapped under keys derived from device-held private keys the attacker doesn't have.

What we don't protect against (and neither does anyone else):

Device compromise. If your iPhone is owned by malware that can read your Keychain, your session keys are readable. No cryptosystem survives the decryption endpoint being owned.
Screen-capture malware. Once a message is decrypted and rendered on your phone, any process that can screen-record can see it. This is a platform-level concern, not a protocol-level one.
Lost devices with no recovery. If you lose every device that has a private key for a given session, the session's historical messages are unrecoverable. There is no "forgot password" option. This is a feature, not a bug: any recovery path that let us restore your keys would destroy the zero-knowledge property.
Your AI coding agent itself. If Claude Code, Gemini CLI, or Codex CLI leak your prompts or code upstream to their own provider, that's outside our protocol. We encrypt what moves through CodeVibe; we can't encrypt what the agent decides to do with its own inputs.

Being explicit about the second list is part of earning the trust we're asking for. A zero-knowledge claim that doesn't come with an enumerated list of non-protections is marketing, not engineering.

The primitives (and why we picked boring ones)

AES-256-GCM. Authenticated symmetric encryption. 256-bit keys. Standard, well-audited, hardware-accelerated on every platform we ship on (iOS CryptoKit, Android javax.crypto, Node crypto). The authentication tag catches tampering. The nonce is freshly generated per message, because GCM's guarantees depend on never reusing a nonce under the same key. One thing AES-GCM does not provide is replay protection: if an attacker captures a ciphertext and re-submits it later, it will authenticate and decrypt again. CodeVibe relies on authenticated backend writes and normal event-ordering semantics at the application layer rather than a dedicated cryptographic anti-replay protocol. If protocol-level replay defense is part of your threat model, that gap is real and worth knowing about.
ECDH over P-256 (secp256r1). Elliptic-curve Diffie-Hellman for key agreement. We picked P-256 because every platform's first-class crypto library supports it natively: iOS Security framework / CryptoKit, Android javax.crypto / Keystore, Node crypto.createECDH. Using a curve every platform already blesses means we inherit the platforms' audited implementations instead of writing our own.
HKDF-SHA256. A raw ECDH shared secret is not a symmetric key — it's material that a proper key derivation function turns into one. HKDF-SHA256 (RFC 5869) is the standard answer. It also gives us a cheap way to bind derived keys to context strings, which reduces the blast radius if we ever reuse a shared secret for more than one thing.
SHA-256 for hashing wherever else a hash is needed.

You'll notice there's nothing novel in that list. That's deliberate. The attack surface of a cryptosystem is dominated by implementation bugs, not by primitive choice. Shipping with boring primitives on well-audited libraries is the move. Rolling your own protocol, or reaching for Curve25519 when P-256 already does the job through each platform's native library, adds implementation risk without adding safety for this design.
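To make the AES-256-GCM properties above concrete, here is a minimal sketch using Node's built-in crypto module. The `SealedEvent` shape and the function names are illustrative assumptions, not CodeVibe's actual wire format. It shows three things: a fresh 96-bit nonce per message, tamper detection through the authentication tag, and the fact that a replayed ciphertext decrypts again without complaint.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical wire format for one encrypted event (an assumption for illustration).
interface SealedEvent {
  nonce: Buffer;      // 96-bit nonce, freshly generated for every message
  ciphertext: Buffer;
  tag: Buffer;        // 128-bit GCM authentication tag
}

function sealEvent(sessionKey: Buffer, plaintext: string): SealedEvent {
  const nonce = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", sessionKey, nonce);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return { nonce, ciphertext, tag: cipher.getAuthTag() };
}

function openEvent(sessionKey: Buffer, sealed: SealedEvent): string {
  const decipher = createDecipheriv("aes-256-gcm", sessionKey, sealed.nonce);
  decipher.setAuthTag(sealed.tag);
  // final() throws if the tag does not verify, which is how tampering is caught.
  return Buffer.concat([decipher.update(sealed.ciphertext), decipher.final()]).toString("utf8");
}

const sessionKey = randomBytes(32); // 256-bit key
const sealed = sealEvent(sessionKey, "approve edit: src/index.ts");

// Replay: the same ciphertext decrypts successfully every time it is presented.
const first = openEvent(sessionKey, sealed);
const replayed = openEvent(sessionKey, sealed);

// Tampering: flip one bit of a copy of the ciphertext and authentication fails.
const tampered: SealedEvent = { ...sealed, ciphertext: Buffer.from(sealed.ciphertext) };
tampered.ciphertext.writeUInt8(tampered.ciphertext.readUInt8(0) ^ 0x01, 0);
let tamperDetected = false;
try {
  openEvent(sessionKey, tampered);
} catch {
  tamperDetected = true;
}
```

The replay behavior is why any deduplication or ordering check has to live above the cipher, at the application layer.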

The data flow

Here is what happens when your desktop plugin sends an event to your phone.

                 +------------------------+
                 |   desktop plugin       |
                 |                        |
                 |  event plaintext       |
                 |        |               |
                 |        v               |
                 |  AES-256-GCM encrypt   |
                 |  (session key)         |
                 |        |               |
                 |        v               |
                 |  ciphertext +          |
                 |  nonce + tag           |
                 +-----------|------------+
                             |
                             v
                 +------------------------+
                 |   our backend          |
                 |                        |
                 |   stores:              |
                 |   - event ciphertext   |
                 |   - one encrypted      |
                 |     session-key blob   |
                 |     per device         |
                 |                        |
                 |   plaintext session    |
                 |   key: never present   |
                 +-----------|------------+
                             |
                             v
                 +------------------------+
                 |   mobile app           |
                 |                        |
                 |  retrieve ciphertext   |
                 |  fetch encrypted       |
                 |  session key for this  |
                 |  device                |
                 |        |               |
                 |        v               |
                 |  ECDH(device private,  |
                 |       sender public)   |
                 |       -> shared secret |
                 |        |               |
                 |        v               |
                 |  HKDF-SHA256           |
                 |       -> unwrap key    |
                 |        |               |
                 |        v               |
                 |  AES-256-GCM unwrap    |
                 |  session key, then     |
                 |  decrypt event         |
                 +------------------------+

The critical property: at no point does the backend hold a plaintext session key. It stores ciphertext events, and alongside each session an array of session-key blobs (one per recipient device), each wrapped under a key derived via ECDH with that device's public key. The "we literally cannot decrypt" claim is visible in the data model: there's no field, column, or attribute anywhere in the backend that could contain a plaintext key, because we never write one.
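One way to picture that data model is as a set of record types. The sketch below is an assumption about shape only: the field names and base64 encoding are hypothetical, and the real schema may differ. What it illustrates is that a session record can carry everything a device needs without any field that could hold a plaintext key.

```typescript
// Hypothetical record shapes for what the backend stores (field names are illustrative).
interface WrappedKeyEntry {
  deviceId: string;
  encryptedSessionKey: string; // base64: the session key, wrapped for this one device
}

interface StoredEvent {
  eventId: string;
  nonce: string;      // base64
  ciphertext: string; // base64
  tag: string;        // base64
}

interface SessionRecord {
  sessionId: string;
  wrappedKeys: WrappedKeyEntry[]; // one entry per recipient device
  events: StoredEvent[];
  // Deliberately no field that could hold a plaintext session key.
}

// The first thing a device does when loading a session: find its own wrapped key.
function findWrappedKey(session: SessionRecord, deviceId: string): WrappedKeyEntry | undefined {
  return session.wrappedKeys.find((entry) => entry.deviceId === deviceId);
}

const exampleSession: SessionRecord = {
  sessionId: "session-123",
  wrappedKeys: [
    { deviceId: "iphone", encryptedSessionKey: "d3JhcHBlZC1mb3ItaXBob25l" },
    { deviceId: "ipad", encryptedSessionKey: "d3JhcHBlZC1mb3ItaXBhZA==" },
  ],
  events: [],
};

const mine = findWrappedKey(exampleSession, "ipad");
const unknown = findWrappedKey(exampleSession, "android-tablet");
```

A device with no entry in `wrappedKeys` can list the session but has nothing to unwrap.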

Per-device session key wrapping

This is the mechanic that makes zero-knowledge actually work for a multi-device account. It's also the part most "E2E encrypted" tools get wrong, either by centralizing on a single shared secret or by requiring a clunky per-device pairing dance.

Here's how we handle it:

Each device generates an ECDH key pair on first use. iOS, Android, and the desktop plugin each generate their own P-256 key pair locally. The private half never leaves the device. The public half gets registered with our backend — we store the public key, we do not store the private key.
When a new session is created, the creator's plugin generates a fresh 256-bit random session key. This is the key that will actually encrypt all content in the session.
The session key is then wrapped (encrypted) once per recipient device. For every device that should be able to read this session (your iPhone, your iPad, your Android tablet, the same laptop that created it), the creator fetches the device's registered public key, uses ECDH to derive a shared secret, runs that secret through HKDF-SHA256 to get a wrapping key, and wraps the session key with AES-GCM. The output is a (deviceId, encryptedSessionKey) pair.
The backend stores an array of these wrapped key blobs alongside the session record. It cannot unwrap any of them. It does not hold any ECDH private key. It's holding a pile of ciphertext and acting as a postbox.
When a device loads a session, it looks up its own (deviceId, encryptedSessionKey) entry, uses its local private key to derive the shared secret, unwraps the session key, and then uses the session key to decrypt the actual message content.

The subtle thing here: the session key itself is never transmitted in plaintext. It's generated on one device, wrapped independently for each recipient device, and the wrapped blobs are the only form of it the wire ever sees. No shared passwords, no server-side key derivation, no "trust on first use" that breaks if you mistype your pair code.
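The steps above can be sketched end to end with Node's built-in crypto module. This is an illustration under stated assumptions, not the production code: the HKDF info string, the empty salt, the use of the session ID as context, and the idea that the recipient learns the creator's public key from the device registry are all hypothetical choices made for the example.

```typescript
import { createCipheriv, createDecipheriv, createECDH, hkdfSync, randomBytes } from "node:crypto";

type Ecdh = ReturnType<typeof createECDH>;

// Each device generates its own P-256 key pair; only publicKey is ever registered.
function newDeviceKeyPair(): { ecdh: Ecdh; publicKey: Buffer } {
  const ecdh = createECDH("prime256v1"); // OpenSSL's name for P-256 / secp256r1
  return { ecdh, publicKey: ecdh.generateKeys() };
}

// ECDH shared secret -> HKDF-SHA256 -> 256-bit wrapping key.
// The info string is an assumed context label that binds the key to its purpose.
function deriveWrapKey(own: Ecdh, peerPublicKey: Buffer, sessionId: string): Buffer {
  const sharedSecret = own.computeSecret(peerPublicKey);
  const info = Buffer.from(`example/session-key-wrap/${sessionId}`, "utf8");
  return Buffer.from(hkdfSync("sha256", sharedSecret, Buffer.alloc(0), info, 32));
}

interface WrappedKey { deviceId: string; nonce: Buffer; ciphertext: Buffer; tag: Buffer; }

function wrapSessionKey(creator: Ecdh, recipientPublicKey: Buffer, deviceId: string,
                        sessionId: string, sessionKey: Buffer): WrappedKey {
  const nonce = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", deriveWrapKey(creator, recipientPublicKey, sessionId), nonce);
  const ciphertext = Buffer.concat([cipher.update(sessionKey), cipher.final()]);
  return { deviceId, nonce, ciphertext, tag: cipher.getAuthTag() };
}

function unwrapSessionKey(recipient: Ecdh, creatorPublicKey: Buffer,
                          sessionId: string, wrapped: WrappedKey): Buffer {
  const decipher = createDecipheriv("aes-256-gcm", deriveWrapKey(recipient, creatorPublicKey, sessionId), wrapped.nonce);
  decipher.setAuthTag(wrapped.tag);
  return Buffer.concat([decipher.update(wrapped.ciphertext), decipher.final()]);
}

// The creator (laptop) wraps one fresh session key for two recipient devices.
const laptop = newDeviceKeyPair();
const iphone = newDeviceKeyPair();
const ipad = newDeviceKeyPair();
const sessionId = "session-123";
const sessionKey = randomBytes(32);

const forIphone = wrapSessionKey(laptop.ecdh, iphone.publicKey, "iphone", sessionId, sessionKey);
const forIpad = wrapSessionKey(laptop.ecdh, ipad.publicKey, "ipad", sessionId, sessionKey);

// The iPhone recovers the session key from its own blob only.
const recovered = unwrapSessionKey(iphone.ecdh, laptop.publicKey, sessionId, forIphone);
let otherBlobRejected = false;
try {
  unwrapSessionKey(iphone.ecdh, laptop.publicKey, sessionId, forIpad);
} catch {
  otherBlobRejected = true;
}
```

Because each blob is authenticated under a key only one device pair can derive, a device handed the wrong blob fails closed instead of producing a garbage key.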

Key storage across platforms

An encryption protocol is only as strong as where it parks the private keys. This is also the section where product blog posts typically overstate what they actually do, so here's the honest, implementation-accurate version.

Device ECDH private keys live in OS-provided secret storage on each platform. The specific story is different everywhere, and worth being specific about:

iOS. The ECDH private key is generated locally and stored as a generic-password item in the iOS Keychain, with accessibility set to kSecAttrAccessibleAfterFirstUnlock. In practice that means the key is encrypted at rest by the OS and unavailable until the user unlocks the device after reboot. It is not generated inside the Secure Enclave, and it is not stored as a non-exportable SecKey object — the key material is held as serialized bytes inside a Keychain item. You get the Keychain's at-rest encryption and access control, not hardware-backed, non-exportable key material. We made that tradeoff so the same key format works across iOS, Android, and the desktop plugin; if you need stronger-than-Keychain guarantees, you need hardware-bound key storage at every endpoint, which isn't portable.
Android. The ECDH private key is stored inside EncryptedSharedPreferences, which encrypts its contents under a master key managed by Jetpack Security. That master key is itself backed by the Android Keystore (hardware-backed on devices with a TEE, StrongBox-backed on newer Pixels), so the key that encrypts your ECDH key does live in hardware. The ECDH key itself is serialized bytes sitting inside the encrypted prefs blob. Net effect: stronger at-rest protection than the iOS case, but still not a non-exportable Keystore KeyPair.
Desktop plugin on macOS, Windows, and Linux-with-libsecret. The desktop plugin's device key is stored via keytar, which wraps the native OS secret store on each platform: macOS Keychain, Windows Credential Manager, or Linux libsecret (GNOME Keyring / KWallet) where available. This is the same layer that many other desktop and command-line tools rely on to hold credentials on a developer's machine.
Desktop plugin on headless Linux, WSL Ubuntu without a keyring daemon, Docker, and CI runners. These environments don't have a running secret-service daemon, so keytar can't store anything. Rather than silently downgrading to plaintext, the plugin refuses to run unless the user opts in with an environment variable:

  CODEVIBE_ALLOW_FILE_KEYCHAIN=1

When that variable is set, the device key is written to ~/.codevibe/<service>.json with directory mode 0700 and file mode 0600. This is the same trust level as ~/.ssh/id_rsa, ~/.aws/credentials, and any other CLI tool's on-disk credentials — good enough to protect against other non-root users on the same machine, weaker than an OS-native secret store against a root compromise or a rogue process running as your user. We require the explicit opt-in so the downgrade is never silent, and the backend selection is frozen at startup so a transient keyring hiccup can't silently flip the plugin onto the weaker store mid-session.
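The opt-in behavior described above could look roughly like the following sketch. Only the environment variable name, the ~/.codevibe/<service>.json location, and the 0700/0600 modes come from the description; the function name, the JSON layout, and the error message are assumptions. The demo writes into a temporary directory instead of the real home directory, and the mode checks assume a POSIX filesystem.

```typescript
import { chmodSync, mkdirSync, mkdtempSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: write the device key to disk only after explicit opt-in.
// baseDir stands in for ~/.codevibe and service for the <service> file name.
function storeKeyInFile(
  baseDir: string,
  service: string,
  keyMaterial: string,
  env: Record<string, string | undefined> = process.env,
): string {
  if (env["CODEVIBE_ALLOW_FILE_KEYCHAIN"] !== "1") {
    // Refuse instead of silently downgrading to file storage.
    throw new Error("No OS secret store available; set CODEVIBE_ALLOW_FILE_KEYCHAIN=1 to opt in");
  }
  mkdirSync(baseDir, { recursive: true, mode: 0o700 });
  chmodSync(baseDir, 0o700); // enforce the mode even if the directory already existed
  const filePath = join(baseDir, `${service}.json`);
  writeFileSync(filePath, JSON.stringify({ key: keyMaterial }), { mode: 0o600 });
  chmodSync(filePath, 0o600); // enforce the mode even if the file already existed
  return filePath;
}

// Demo in a throwaway directory.
const demoDir = join(mkdtempSync(join(tmpdir(), "codevibe-demo-")), ".codevibe");

let refusedWithoutOptIn = false;
try {
  storeKeyInFile(demoDir, "device-key", "example-key-bytes", {});
} catch {
  refusedWithoutOptIn = true;
}

const written = storeKeyInFile(demoDir, "device-key", "example-key-bytes", {
  CODEVIBE_ALLOW_FILE_KEYCHAIN: "1",
});
const fileMode = statSync(written).mode & 0o777;
const dirMode = statSync(demoDir).mode & 0o777;
```

Calling chmod after creation matters because the mode passed at creation time is filtered through the process umask and is ignored when the path already exists.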

That last section is the kind of thing marketing copy would hide and engineers deserve to see. If your threat model demands hardware-bound key storage at every endpoint — SIM-stealing defense, nation-state adversary, compliance regime — the honest answer is that this protocol does not deliver that, and no cross-platform companion app built on top of boring primitives can. Within the "portable, multi-platform, doesn't lock users out of headless environments" design space, this is the sharpest tradeoff point. We'd rather name it than dodge it.

Adding a new device — how multi-device sync actually works

One of the properties we promised earlier is that you sign in once per device and every session just works. That's a meaningful claim to deliver without breaking zero-knowledge, because the naive solution — "derive a key from the user's password" — reintroduces a single point of failure at the auth layer. Here's how we actually handle it.

The happy path (new session after mobile is registered):

You sign in on a new device (iOS or Android) using the same OAuth account (Apple or Google) you used on your computer.
The new device generates its own ECDH key pair locally and registers the public key with the backend. The private half never leaves the device.
From this point forward, every new coding session you start automatically includes the new device. When your plugin creates a session, it fetches the current list of your registered device public keys, generates a fresh 256-bit session key, wraps that session key once per device via ECDH + HKDF-SHA256 + AES-GCM, and stores the resulting (deviceId, encryptedSessionKey) blobs alongside the session. Every device on your account — old and new — can then independently decrypt every message in that session.

The harder case (session was created before the new device registered):

Sessions that already existed before a new device was registered are trickier. Their wrapped-key array was populated at creation time and doesn't include an entry for the new device, because the new device's public key didn't exist yet. Without extra work, the new device can see the session in the list but can't decrypt its contents.

The fix: when you add a new device to your account, any desktop plugin that's currently online observes the registration via a realtime notification from the backend, wraps each active session's session key under the new device's public key using ECDH + HKDF-SHA256 + AES-GCM, and appends the wrapped blob to that session's key-distribution list. In our testing this is typically sub-second from "you sign in on the new device" to "the new device can read existing messages."

For the case where the plugin was offline while you added the new device (say, you installed the desktop plugin, ran a session, closed it, then installed the mobile app), the plugin runs the same re-key pass on the next session resume as a catch-up. Sessions that missed the realtime re-key because the plugin was offline still get healed the next time you start claude.
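The catch-up pass amounts to a set difference between registered devices and the devices each session already has a blob for. Here is a sketch of that logic, assuming hypothetical record shapes and an injected `wrap` callback that stands in for the real unwrap-then-rewrap step (ECDH + HKDF-SHA256 + AES-GCM).

```typescript
interface WrappedKeyEntry { deviceId: string; encryptedSessionKey: string; }
interface Device { deviceId: string; publicKey: string; }
interface Session { sessionId: string; wrappedKeys: WrappedKeyEntry[]; }

// Hypothetical callback: unwraps the session key locally and re-wraps it for `device`.
type WrapFn = (session: Session, device: Device) => string;

// Appends a wrapped key for every registered device a session does not yet cover.
// Returns how many entries were added.
function rekeyPass(sessions: Session[], devices: Device[], wrap: WrapFn): number {
  let added = 0;
  for (const session of sessions) {
    const covered = new Set(session.wrappedKeys.map((entry) => entry.deviceId));
    for (const device of devices) {
      if (covered.has(device.deviceId)) continue; // membership is checked by deviceId only
      session.wrappedKeys.push({
        deviceId: device.deviceId,
        encryptedSessionKey: wrap(session, device),
      });
      added += 1;
    }
  }
  return added;
}

// Stub wrapper for the demo; a real one would produce ciphertext.
const stubWrap: WrapFn = (session, device) => `wrapped(${session.sessionId} for ${device.deviceId})`;

const devices: Device[] = [
  { deviceId: "laptop", publicKey: "pk-laptop" },
  { deviceId: "iphone", publicKey: "pk-iphone" }, // registered after the sessions were created
];
const sessions: Session[] = [
  { sessionId: "s1", wrappedKeys: [{ deviceId: "laptop", encryptedSessionKey: "blob-1" }] },
  { sessionId: "s2", wrappedKeys: [{ deviceId: "laptop", encryptedSessionKey: "blob-2" }] },
];

const addedFirstPass = rekeyPass(sessions, devices, stubWrap);
const addedSecondPass = rekeyPass(sessions, devices, stubWrap);
```

The pass is idempotent, so running it both on the realtime notification and again on session resume is harmless.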

One honest limitation to name: if a device re-registers with a rotated public key on the same deviceId during a window when the plugin happens to be disconnected, the catch-up path doesn't detect the rotation — it compares by deviceId membership, not by public key. No known impact on current supported clients: iOS, Android, and the desktop plugin all rotate the deviceId together with the public key on keychain wipes, so rotation-without-deviceId-change doesn't happen in practice. The clean fix is to include a fingerprint of the wrapping public key alongside each entry so the catch-up can detect rotation; that's tracked for the next release.
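For illustration, the fingerprint-based fix could take a shape like the one below. This is a guess at a possible design, not the tracked implementation: the `wrappingKeyFingerprint` field, the choice of hex-encoded SHA-256 over the raw public key bytes, and the function names are all assumptions.

```typescript
import { createHash } from "node:crypto";

// Fingerprint of a public key: hex-encoded SHA-256 of its bytes (an assumed format).
function fingerprint(publicKey: Buffer): string {
  return createHash("sha256").update(publicKey).digest("hex");
}

interface WrappedKeyEntry {
  deviceId: string;
  encryptedSessionKey: string;
  wrappingKeyFingerprint: string; // hypothetical extra field recorded at wrap time
}

interface Device { deviceId: string; publicKey: Buffer; }

// A device needs a fresh wrap if it has no entry, or if its entry was
// wrapped under a public key that is no longer the registered one.
function needsRewrap(entries: WrappedKeyEntry[], device: Device): boolean {
  const entry = entries.find((e) => e.deviceId === device.deviceId);
  return entry === undefined || entry.wrappingKeyFingerprint !== fingerprint(device.publicKey);
}

// Placeholder bytes standing in for real P-256 public keys.
const oldKey = Buffer.from("old-public-key-bytes");
const rotatedKey = Buffer.from("rotated-public-key-bytes");

const entries: WrappedKeyEntry[] = [
  { deviceId: "iphone", encryptedSessionKey: "blob", wrappingKeyFingerprint: fingerprint(oldKey) },
];

const unchanged = needsRewrap(entries, { deviceId: "iphone", publicKey: oldKey });
const rotated = needsRewrap(entries, { deviceId: "iphone", publicKey: rotatedKey });
const missing = needsRewrap(entries, { deviceId: "ipad", publicKey: oldKey });
```

A deviceId-only membership check would treat the rotated case as already covered, which is exactly the gap described above.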

One edge case that applies to the zero-knowledge guarantee itself: if you replace all your devices simultaneously with no existing device left online to re-wrap existing session keys, older sessions' content is unrecoverable. This is the no-key-escrow property in action. It's the cost of the zero-knowledge guarantee, and there is no workaround that doesn't break the guarantee.

What we deliberately don't do

To make the "what we protect" list meaningful, here's the "what we deliberately don't do" list:

No key escrow. We don't hold a copy of your session keys "in case you need recovery." Recovery-from-server would be the end of zero-knowledge.
No plaintext logs. Our logging pipeline scrubs content. It records event IDs, timestamps, sizes, and error types — never plaintext content.
No telemetry that would let us reconstruct content. Our analytics don't see prompt text or agent output.
No support backdoor. There is no internal tool that lets a CodeVibe engineer read your sessions. The engineer would need your device and your unlock.
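The "no plaintext logs" rule is easiest to enforce with an allow-list: build each log record from a fixed set of metadata fields and never pass the event body through. A minimal sketch, assuming hypothetical `EncryptedEvent` and `LogRecord` shapes:

```typescript
// Hypothetical shapes; the real pipeline's fields may differ.
interface EncryptedEvent {
  eventId: string;
  timestamp: string;
  ciphertext: Buffer;
}

interface LogRecord {
  eventId: string;
  timestamp: string;
  sizeBytes: number;
  errorType?: string;
}

// Builds a log record from allow-listed metadata only.
function toLogRecord(event: EncryptedEvent, error?: Error): LogRecord {
  const record: LogRecord = {
    eventId: event.eventId,
    timestamp: event.timestamp,
    sizeBytes: event.ciphertext.length,
  };
  if (error !== undefined) {
    // Record the error type only; error.message could quote content.
    record.errorType = error.name;
  }
  return record;
}

const event: EncryptedEvent = {
  eventId: "evt-42",
  timestamp: "2025-01-01T00:00:00Z",
  ciphertext: Buffer.from("opaque-ciphertext-bytes"),
};

const logged = toLogRecord(event, new TypeError("failed to parse: SECRET_TOKEN=abc123"));
const serialized = JSON.stringify(logged);
```

An allow-list fails safe: a new field added to the event type never reaches the logs unless someone deliberately adds it to the record.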

Boring is a feature

If there's one thing we want you to take away from this post, it's that the protocol described above is not novel. It's the same general family of per-device key distribution used by modern E2E systems — Signal's multi-device sender/session key setup, 1Password's vault key wrapping across devices, and a handful of other mature systems that all independently reached similar shapes because the problem has a narrow set of good answers. The primitives are boring. The shape is standard. The audit surface is small.

That's the selling point. A novel protocol would get more engagement on Hacker News; a boring one is what you actually want sitting between your AI coding agent and your phone.

If you want to see the whole thing in action, CodeVibe is live on the App Store and Google Play, with a single-command installer for the desktop plugin that works on macOS, Linux, and WSL Ubuntu:

curl -fsSL https://quantiya.ai/codevibe/install.sh | bash

You sign in once on your computer, once on your phone, and every future coding session shows up on both. Free tier covers real usage.

Try CodeVibe →

If you spot something in this writeup that doesn't add up, or a threat vector we missed, please tell us — hendry@quantiya.ai. Engineering critique is the kind of feedback we actually want.
