Doszhan Mengaliyev

Posted on • Originally published at zentline.com

How We Added E2E Encryption on Top of a Local-First Architecture

I used to track my finances in an app. Salary, loans, small transfers, all of it. At some point I got curious whether the team behind it could actually see those numbers in their database. So I wrote them and asked. They never replied.

That stuck with me. When we started building Finsight, I did not want our users in that same spot, wondering what happens to their data on someone else's server. So privacy went into the architecture from day one instead of being bolted on before launch.

The goal was simple: data should be inaccessible not because the team promises to behave, but because technically the team cannot read it even if they wanted to.

HTTPS Is Not End-to-End

HTTPS protects the connection between the phone and the server. That is necessary, but it only covers the wire. Once the request lands on the server, the data sits there in plain text. If a transaction amount of 530 arrives, the server sees 530, period.

Encrypting the database on the server does not help either. It protects against stolen disks and leaked backups, but while the server is running it decrypts data on the fly and reads it like anything else. Inside the application layer it is still plaintext.

So at some point the server sees balances, notes, and amounts anyway. We wanted to eliminate that moment entirely.

Where We Drew the Line

In a previous article we covered the move to local-first. The short version: every user has their own SQLite database on their device. The UI reads and writes directly to it, so everything opens instantly and works offline. PowerSync runs in the background and keeps the local database in sync with the server.

That solved the performance problem. But it raised another one: data now travels constantly between the device and the server, and anything that arrives there in plain form can be read by whoever controls the server.

Our answer was to encrypt sensitive fields directly on the device before PowerSync sends anything. What reaches the server is already scrambled. Only a client holding the key can decrypt it. The server never sees the key.

input  --> plaintext into local SQLite
       --> encryption on the client
       --> ciphertext into the sync table
       --> PowerSync carries ciphertext to the server
       --> another device receives ciphertext
       --> client decrypts locally
       --> UI reads normal plaintext again

For PowerSync these are just rows. It does not need to understand what is inside them.

Two Tables Instead of One

If we encrypt data before sending it, the UI has a problem: it needs the values in readable form. You cannot render a transaction list from scrambled strings, and a monthly expense filter would not work on them either.

So for every sensitive entity we maintain two tables.

One is local, with plain values, and the UI reads from it. In PowerSync these tables are marked as localOnly and never leave the device. The second is the sync table, with the same records but sensitive fields already encrypted. PowerSync moves this one between the device and the server.

For an account the schema looks like this:

import { column, Table } from '@powersync/web';

// Local table with plain values, never leaves the device
export const accounts = new Table(
  {
    organization_id: column.text,
    name: column.text,    // plaintext
    balance: column.real, // regular number
    currency_id: column.text,
    // ...
  },
  { localOnly: true }
);

// Sync table with the same fields, but name and balance hold ciphertext
export const accountsEncrypt = new Table({
  organization_id: column.text,
  name: column.text,    // encrypted
  balance: column.text, // text instead of real because it holds ciphertext
  currency_id: column.text,
  // ...
});

The same pair exists for every sensitive entity:

accounts        --> accounts_encrypt
transactions    --> transactions_encrypt
debts           --> debts_encrypt
loans           --> loans_encrypt
loan_payments   --> loan_payments_encrypt

We do not encrypt entire rows, only specific fields: amounts, balances, notes, lender names, interest rates. Things like dates, user and organization IDs, and relations between records stay visible. Without them the server cannot route rows to the right devices.

That is a tradeoff, and there is more on it below.
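
Field-level encryption of this kind can be sketched as a small mapping function: copy the row, transform only the listed fields, leave everything else as it is. The helper name `toSyncRow`, the field list, and the injected `encryptValue` callback are assumptions for illustration, not Finsight's actual code. The callback is injected so the sketch stays independent of any particular cipher.

```javascript
// Hypothetical field list for the accounts pair. The article names
// name and balance as the encrypted fields of an account.
const ACCOUNT_SENSITIVE_FIELDS = ['name', 'balance'];

// Returns a copy of `row` in which every sensitive field has been passed
// through `encryptValue`. Routing fields (IDs, relations) are copied as-is.
function toSyncRow(row, sensitiveFields, encryptValue) {
  const out = { ...row };
  for (const field of sensitiveFields) {
    // Keep null/undefined untouched so "no value" is not turned into the text "null"
    if (out[field] !== null && out[field] !== undefined) {
      out[field] = encryptValue(String(out[field]));
    }
  }
  return out;
}

// Usage with a placeholder transform. This is NOT encryption; it only marks
// which fields would be replaced by ciphertext.
const plainRow = { id: 'a1', organization_id: 'org1', name: 'Cash', balance: 530, currency_id: 'usd' };
const syncRow = toSyncRow(plainRow, ACCOUNT_SENSITIVE_FIELDS, (text) => `enc(${text})`);
console.log(syncRow);
```

Note that the number 530 is converted to text before the transform, which matches the sync table declaring balance as text rather than real.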

How It Works in Code

When a user creates an account, the UI writes a row to the plain accounts table. No encryption yet, no sync. Just a normal insert into the local database.

The database itself is a PowerSyncDatabase instance. On initialization it gets a schema with all the tables and a filename where SQLite will store the data on the device:

import { PowerSyncDatabase } from '@powersync/web';
import { AppSchema } from './schema'; // hypothetical path to the module that exports the schema

const db = new PowerSyncDatabase({
  database: { dbFilename: 'finsight.db' },
  schema: AppSchema, // all tables, both local-only and sync
});

From there it gets interesting. PowerSync lets us subscribe to changes and reports which tables just changed. We use that as a trigger to run the crypto layer:

// Fires every time one of the tables changes
db.onChange({
  onChange: async ({ changedTables }) => {
    for (const table of changedTables) {
      await handleTableChange(table);
    }
  }
});

The handleTableChange function is a simple router. It looks at the table name and decides which direction to go: encrypt freshly written data before it gets sent, or decrypt incoming data from the server so the UI can display it.

async function handleTableChange(table) {
  // User changed something locally, encrypt and write to the sync copy
  if (table === 'accounts')     return encryptAccounts();
  if (table === 'transactions') return encryptTransactions();

  // Encrypted row arrived from the server, decrypt for the UI
  if (table === 'accounts_encrypt')     return decryptAccounts();
  if (table === 'transactions_encrypt') return decryptTransactions();

  // ...debts, loans, payments follow the same pattern
}

Both directions are needed. The first makes sure data gets encrypted before PowerSync picks it up. The second makes sure anything that arrived from another device gets turned back into readable numbers and strings so the UI can show it right away.
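
Since every pair follows the same naming rule, the routing decision can also be expressed as data instead of a chain of if statements. The sketch below is one possible shape, not the code Finsight runs: `routeChange` and its return object are hypothetical, and only the table names and the _encrypt suffix come from the list above.

```javascript
// Plain table names from the article. Each has a sync twin with an _encrypt suffix.
const PLAIN_TABLES = ['accounts', 'transactions', 'debts', 'loans', 'loan_payments'];
const SUFFIX = '_encrypt';

// Returns { direction, source, target } for a changed table,
// or null when the table is not part of the crypto pipeline.
function routeChange(table) {
  if (PLAIN_TABLES.includes(table)) {
    // Local edit: encrypt and write to the sync copy
    return { direction: 'encrypt', source: table, target: table + SUFFIX };
  }
  if (table.endsWith(SUFFIX)) {
    const plain = table.slice(0, -SUFFIX.length);
    if (PLAIN_TABLES.includes(plain)) {
      // Row arrived from the server: decrypt into the local copy
      return { direction: 'decrypt', source: table, target: plain };
    }
  }
  return null;
}

console.log(routeChange('accounts'));
console.log(routeChange('loans_encrypt'));
```

One thing any pipeline of this shape has to handle: each direction writes to the other table, and that write is itself a change event. The article does not describe how Finsight breaks that cycle; skipping rows whose decrypted content already matches the target is one common approach.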

After that, PowerSync looks only at the encrypted tables and sends accumulated changes to the server in batches. On the backend they map to regular Django models. The Transaction.amount field is declared as TextField rather than a number because it holds ciphertext. The server stores the string, returns it on request, and cannot read what is inside.

Keys

Everything depends on a key the server does not know. If the server had it, e2e would be pointless.

We use a two-layer scheme. Each organization has a main key called the DEK (data encryption key). It is what actually encrypts the financial fields. The DEK is a random sequence of bytes that no one types or memorizes.

The DEK itself is also encrypted, wrapped by another key called the KEK (key encryption key). The KEK is derived from the user's secret key, a passphrase they set specifically for data encryption and separate from their account password. The derivation uses Argon2id from libsodium, an algorithm designed to make passphrase brute-forcing computationally expensive.

secret key + salt  -->  KEK
random DEK         -->  encrypted DEK (wrapped by KEK)
DEK                -->  encrypted financial fields

The server stores only the encrypted DEK and its metadata. It never sees the raw DEK, the KEK, or the secret key.

The upside is that changing the secret key is cheap. Instead of re-encrypting every transaction and account, we just unwrap the DEK with the old KEK and re-wrap it with a new one derived from the new key. The DEK stays the same, so all the encrypted data stays untouched.

The downside is that without the secret key and without a backup, the data is gone. The server has no plaintext copy to recover from. No support ticket fixes this, because the server genuinely does not have the key.

What Gets Harder

The encrypt() call itself takes three lines. The complexity lives around it.

Validation. Before e2e, most business rules ran on the server. An amount came in, the server checked it against the balance and the constraints. Now the server sees a string like eyJhbGciOiJBMjU2R0NN... and cannot say anything meaningful about it. Some checks moved to the client. The server still handles access control, relational integrity, quotas, and structure, but it can no longer verify what is actually inside an encrypted field.
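
A client-side check of this kind might look like the sketch below. The rules, the field names, and the function `validateTransaction` are invented for illustration; the article only says that some checks moved to the client.

```javascript
// Hypothetical rule set. Runs on the device before the row is encrypted,
// because the server can no longer inspect the amount.
function validateTransaction(input, account) {
  const errors = [];
  if (typeof input.amount !== 'number' || !Number.isFinite(input.amount)) {
    errors.push('amount must be a finite number');
  } else if (input.amount <= 0) {
    errors.push('amount must be positive');
  } else if (input.type === 'expense' && input.amount > account.balance) {
    errors.push('amount exceeds account balance');
  }
  return errors;
}

console.log(validateTransaction({ type: 'expense', amount: 530 }, { balance: 1000 }));
console.log(validateTransaction({ type: 'expense', amount: 1530 }, { balance: 1000 }));
```

A check like this protects honest clients from mistakes. It is not a security boundary, since a modified client can skip it and the server has no way to notice.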

Debugging. When a user reports a discrepancy, it used to take two minutes to look up the row in PostgreSQL. Now those fields are base64 strings unreadable without the user's key. Reproducing the issue means working on the client: inspecting local SQLite, checking the upload queue, looking at sync state. The server database stopped being the place where you look for answers.

Logging. The server should never see plaintext. But the client sees it constantly, before encryption and after decryption. Client logging has to be designed deliberately so a routine log line does not accidentally capture an amount or a note.
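
One way to make that deliberate is to route every log call through a redaction step. The sketch below assumes a fixed list of sensitive key names; the list and the helpers `redact` and `logEvent` are hypothetical and would need to match the real schema.

```javascript
// Hypothetical key names. Anything listed here is replaced before logging.
const SENSITIVE_KEYS = new Set(['amount', 'balance', 'note', 'name', 'interest_rate', 'lender_name']);

// Returns a deep copy of `value` with sensitive keys masked at every nesting level.
function redact(value) {
  if (Array.isArray(value)) return value.map(redact);
  if (value !== null && typeof value === 'object') {
    const out = {};
    for (const [key, inner] of Object.entries(value)) {
      out[key] = SENSITIVE_KEYS.has(key) ? '[redacted]' : redact(inner);
    }
    return out;
  }
  return value;
}

// The only logging entry point: context objects never reach the console unredacted.
function logEvent(message, context = {}) {
  console.log(message, JSON.stringify(redact(context)));
}

logEvent('transaction saved', { id: 't1', amount: 530, meta: { note: 'rent' } });
```

A key-based filter only catches values logged as structured context. Amounts interpolated into the message string itself would still slip through, so the message should stay a fixed label.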

Device load on first sync. When a user signs into a fresh device, the full history has to come down and every single row needs to be decrypted locally. The server cannot help with that part. Early on we hit real bugs from this: rows arriving in batches, decryption running in parallel, race conditions and UI stalls while everything was still catching up.
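
A common way to avoid overlapping decryption passes is to serialize them: run at most one pass at a time and collapse any triggers that arrive in the meantime into a single follow-up pass. The sketch below shows that pattern in isolation. It is not the fix Finsight shipped, which the article does not describe, and `createSerialRunner` is a hypothetical helper.

```javascript
// Wraps an async task so that it never runs concurrently with itself.
// Triggers received while a run is in progress are merged into one rerun.
function createSerialRunner(task) {
  let running = false;
  let rerun = false;
  return async function trigger() {
    if (running) {
      rerun = true; // remember that more work arrived, then return immediately
      return;
    }
    running = true;
    try {
      do {
        rerun = false;
        await task();
      } while (rerun);
    } finally {
      running = false;
    }
  };
}

// Example wiring (decryptAccounts is the function from the router above):
// const decryptAccountsSerial = createSerialRunner(decryptAccounts);
```

With a wrapper like this, ten change events arriving during one long decryption pass result in exactly one additional pass afterwards instead of ten parallel ones. Callers whose trigger was merged get their promise resolved right away, so they should not assume the work is finished when it resolves.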

What the Server Still Sees

E2e does not mean the server sees nothing. It still sees a lot of metadata, and that is worth saying plainly.

The server knows who the user is and which organization they belong to, the IDs of every record, the fact that transactions and accounts and debts exist, their dates and types, relations between records, creation and update timestamps, and the size of each encrypted value.

Metadata alone can say quite a bit. The server may not know the transaction amount, but it can see that health-related transactions tripled this month. Or that a new currency appeared in the account list.

So I do not call e2e full invisibility. The heaviest financial values — amounts, balances, rates, and notes — do not reach the server in plain form. But the shape of the data, who has what and when, is visible to the server. That is the actual line we drew, nothing more.


E2e in production is not a weekend project. It shifts how you think about the backend, the client, validation, and debugging. The backend knows less. The client takes on more responsibility.

If your product does not handle sensitive data, this complexity is probably not worth it. But when a finance app holds a real piece of someone's life, it becomes clear why you would build it this way. That is what we did in Finsight.

Top comments (16)

Super Funicular

Strong agreement on the threat model — "data should be inaccessible because the team technically cannot read it, not because they promise to behave" is exactly the right framing. The two-tables-with-localOnly pattern is clean.

One angle that pushes it further: if the sync target is just another device on the same LAN (phone ↔ laptop, two phones on the same wifi), you can drop the sync engine entirely and run an embedded HTTP server inside the app. The server stays local, never opens a port to the internet, and the only thing that leaves the device is a response to a request from a device that already knows the LAN IP.

We took that route building Background Camera RemoteStream, a recording app that streams the live camera feed to a browser on the same network. No PowerSync, no encrypted sync table, no cloud backend at all. The data doesn't need encryption at rest because it never gets handed to anyone we don't control: play.google.com/store/apps/details...

Honest tradeoff: this only works when "another device on the LAN" is an acceptable sync target. The moment users need device-to-device sync across cellular networks, PowerSync + your two-table pattern is the right move and the LAN-only model breaks.

Quick question on the encryption side: how are you handling key rotation when a user changes their master password? Re-encrypting every row in accounts_encrypt/transactions_encrypt seems expensive — do you wrap a data key with the password-derived key and just re-wrap the data key, or actually re-encrypt the ciphertext?

Doszhan Mengaliyev

Thanks, that’s a fair distinction.

LAN-only sync is a valid model when “same Wi-Fi only” is acceptable.
On key rotation: yes, this is the KEK/DEK approach described in the “Keys” section. The domain data is encrypted with a DEK, and the DEK is wrapped with a KEK derived from the master password.

So when the master password changes, we only re-wrap the DEK with the new KEK. We do not re-encrypt every row.

Privacy.Fish

The part I like most here is that you call out what stays visible. A lot of E2EE writeups stop at “server can’t read the fields”, but the remaining metadata model is where the real product tradeoffs live: dates, org IDs, row relationships, sync timing, deleted/created patterns, etc.

For finance data, even unencrypted relations can say quite a bit. A burst of rows every payday, recurring vendor-shaped records, or org membership changes may not reveal amounts, but they still leak behavior. Not saying that makes the design wrong — it’s usually the practical split — just that it’s worth documenting as explicitly as the key-recovery downside.

We’re wrestling with a similar shape at privacy.fish for mail: keep more state local, reduce what the provider can retain/read, but be honest that routing metadata and operational logs do not disappear by magic. The hard part is making the privacy boundary understandable without turning every feature into a threat-model lecture.

Super Funicular

Same observation hits the camera-app side. Even if you encrypt the video files, the upload-burst pattern leaks "someone's home / someone just left" — bursts at 6pm, dead at 9am, that's a routine. Cloud-relay security cameras pretend the privacy story stops at "AES-256 in transit/at rest," but the metadata model (file count per hour, average size, geolocated POP) is a behavior log.

The way Background Camera RemoteStream sidesteps this is by structurally lacking the relay: footage is stored on the phone, viewed over LAN through the device's own embedded web server. There is no upstream traffic shape to analyze because there is no upstream. The cost of that design is exactly the one you're naming — it has to be understandable without a threat-model lecture. Our short form is "if your Wi-Fi is off, the data can't leave the building." Users get that one.

What we still leak: the local recording schedule itself (file-system timestamps on the device, if someone has physical access). And the LAN viewer is trustable-by-WiFi, which is a much weaker assumption on a coffee-shop network than at home — so we say "use it on your own Wi-Fi" out loud in the README, which is the camera-app equivalent of "this is the privacy boundary, drawn explicitly."

Curious what shape you land on for routing metadata at privacy.fish — selective batching to flatten timing signatures, or accept-and-document the leak? We treated it as accept-and-document (file-system timestamps are user-visible anyway), but mail is asymmetric (recipients matter), so the same answer probably doesn't fit.

For anyone landing here from a camera-app angle: play.google.com/store/apps/details...

Privacy.Fish

Good question. For privacy.fish, I think the honest answer is mostly accept-and-document.

Email has metadata that cannot be wished away: when mail arrives, when a client connects, what recipient domain the server has to deliver to, and the source IP/port records we are legally required to retain. We can reduce retained server-side mail, avoid webmail/tracking surfaces, support Tor/VPN/onion access, and keep storage local-first — but we should not imply that encrypted mail makes routing metadata disappear.

Batching could help a narrow timing-analysis threat model, but it also makes email less useful and still does not hide the social graph from the mail system itself. So I’d rather draw the privacy boundary clearly than sell a cleaner story than email can honestly support.

Super Funicular

Comment deleted

Privacy.Fish

Mostly it de-risks expectations with technical users, and only partly with everyone else.

The useful effect is that it gives people a sentence to anchor on: “we reduce the provider-side data, but email still has routing metadata.” That stops some overclaiming early, especially with people who already know SMTP. For less technical users, “privacy-first mail” still tends to get heard as “private in every way,” so the boundary has to be repeated in plain-language places: onboarding, docs, support answers, and marketing.

Super Funicular

That four-surface repetition — onboarding, docs, support, marketing — matches what I keep relearning on the camera-app side. "Privacy-first camera" gets heard as "no one can ever see anything," but the residual surface is real: the LAN-broadcast hostname when the local web server is on, app-list fingerprinting if anyone has device access, the SD card itself if the phone is physically taken.

The hardest place to repeat the boundary is the in-app settings screen. That's where users actually make the threat-model decision (turn LAN streaming on or off), and it's also where most apps default to silent toggles with no plain-language consequence text. We're trying a "what this exposes" line under each toggle now (in Background Camera RemoteStream — play.google.com/store/apps/details?id=com.superfunicular.digicam), which feels closer to the docs/onboarding repetition you described but lives at the point of decision instead of at the point of acquisition. Probably not enough on its own, but it stops the toggle-then-forget failure mode.

Privacy.Fish

That “what this exposes” line is exactly the kind of UX I wish more privacy tools had. Not a modal, not a legal wall, just the consequence sitting next to the switch while the user still has context.

The LAN streaming example is a good one because it is not scary in every threat model. It is just not invisible.

Super Funicular

"Just not invisible" captures the whole design rule in four words. We do the same in the camera app — the LAN stream URL renders right under the toggle (http://192.168.1.x:8080), so "who can see this" is one glance from the switch instead of three taps deep in a settings tree.

The harder UX call was the phone-to-phone case across networks where you can't show a URL at all. We landed on a 60-second pairing code with a visible TTL countdown — the countdown does the "not invisible" work that the URL did before. Users treat a ticking 60-second code very differently than they treat a permanent room link, and we got fewer "wait who else has this" support questions after we added the timer than after we added the code.

Curious about your side: on the mail flow, do you ever surface recipient TLS state (or the lack of it) in the compose UI itself, or is that always relegated to settings? Feels like the same pattern — the moment the user is making the decision is when the consequence needs to be in view.

(Context for anyone wandering in from search: Background Camera RemoteStream is the camera app — play.google.com/store/apps/details?id=com.superfunicular.digicam — the URL-under-toggle screenshot is on the Play listing.)

Privacy.Fish

That 60-second pairing code is a great example of the same rule: the countdown is doing the UX work without turning into a warning wall.

On the mail side, I’d want recipient TLS state surfaced at the point of sending, not buried in settings. For privacy.fish, the rule we describe is: if the recipient’s mail server can’t prove itself with a valid TLS certificate, the client should stop and ask whether to cancel or send anyway. That’s the moment the user is actually choosing between deliverability and transport assurance.

I’d frame it less as “danger” and more as “what changed before this leaves the device”: the message can still be delivered, but this hop no longer has the same server-identity guarantee. That matches the article’s broader point too: the server/client boundary has to be explicit, because otherwise people hear “encrypted/private” as a much cleaner story than the system can honestly support.

Super Funicular

"What changed before this leaves the device" — that's the line. Worth borrowing.

The camera app has a structurally identical hop: when LAN streaming is toggled on, the device starts responding to GET requests from the local network. We don't render that as a warning either. We render it as the URL itself, sitting under the toggle in monospace, with a one-line "this is the address that will respond to requests now." Same rule: the user is choosing between reach (the daily-driver phone on the same Wi-Fi can view it) and exposure (anything else on the same Wi-Fi can also try to). The "what changed" is the GET-receivable surface, not the secrecy of the video.

The interesting parallel is what each surface refuses to do for the user. Mail clients with no TLS UI silently degrade to plaintext SMTP; cloud camera apps with no transport UI silently treat the upload-to-vendor-cloud as the only option. Both elide a server-identity decision the user actually has standing in. The countdown / URL-under-toggle / TLS-prompt pattern is the same fix in three skins: don't elide; render the consequence at the point of decision.

For anyone wandering in: I'm building the camera app at play.google.com/store/apps/details... — the local-only, no-account architecture that the original article and this thread keep gesturing at as a category.

Kobie Botha

Ah nice, the two tables approach 🫡

You might be interested in our High-Performance Diffs, have you checked those out? docs.powersync.com/client-sdks/hig...

Doszhan Mengaliyev

Thanks for the pointer, @kobie ! I really appreciate it.
Looks like this feature shipped right after we built our onChange-based pipeline, so we completely missed it. It seems very relevant to the two-table approach.
Will definitely try it out.

Super Funicular

Our back-and-forth on Doszhan's E2E thread stuck with me — your 'accept-and-document' take, that structural SMTP leaks (recipient domain, source IP, social-graph at the mail-server layer) can't be wished away, so you'd rather draw the privacy boundary honestly than sell false batching mitigations. I just drafted a piece on the same tension one layer over: when AI agents dispatch real-world tasks to humans, 'proof the work happened' defaults to surveillance, and I think the honest move is minimum legible proof, owned by the worker — basically your framing applied to physical verification. Would love your read, and if it resonates I'd happily fold your angle in or co-sign: The Carbon Layer. Two privacy-first builders comparing notes.

play.google.com/store/apps/details?id=com.superfunicular.digicam
