One of our client's schedules got too large for Automerge. I was running a cron job daily just to keep the schedule alive. That's when I knew we had a real problem.
**The Background**
We build scheduling software for film and TV production. Offline availability and real-time collaboration are not nice-to-haves — they are the product. We chose Automerge early because it was purpose-built for exactly that. It worked beautifully. Until it didn't.
**The Bandaid**
The fix was a cron job. Simple enough in theory — spin up an Automerge instance for each schedule, check the ops count, and if it crossed a threshold, strip the history. Pull the document JSON, create a fresh Automerge document from that snapshot, update the access records, update the report records. Done.
I even let users set their own maintenance window so they wouldn't get kicked mid-session. It was simple. It worked. And I already knew it wasn't a real solution.
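The whole job fits in a page of pseudo-code. This is a from-memory sketch, not our production cron job: the toy `{ state, ops }` document and the `OPS_THRESHOLD` value stand in for the real Automerge calls (`Automerge.load`, `Automerge.getAllChanges`, `Automerge.from`) and our real cutoff.

```javascript
// Minimal model of the nightly compaction job. A document is
// { state, ops }, where `ops` is the accumulated edit history.
const OPS_THRESHOLD = 10_000; // hypothetical cutoff, not our real number

function applyChange(doc, change) {
  // Every edit appends to history, even if it only touches one field.
  return { state: { ...doc.state, ...change }, ops: [...doc.ops, change] };
}

function compact(doc) {
  if (doc.ops.length < OPS_THRESHOLD) return doc;
  // Snapshot the current JSON and start a fresh document from it,
  // discarding the entire edit history.
  const snapshot = JSON.parse(JSON.stringify(doc.state));
  return { state: snapshot, ops: [{ init: snapshot }] };
}

// Simulate a busy schedule: many small edits, then the cron run.
let doc = { state: {}, ops: [] };
for (let i = 0; i < 12_000; i++) {
  doc = applyChange(doc, { [`scene_${i % 50}`]: { shootDay: i % 90 } });
}
console.log(doc.ops.length); // 12000 before compaction
doc = compact(doc);
console.log(doc.ops.length); // 1 after compaction
```

The catch is visible even in the toy version: compaction only resets the counter, it doesn't stop the history from growing again the next day.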
**When The Bandaid Broke**
A team working a six-month production schedule needed the cron job by week two. The data was expanding faster than we could compress it. Then the cron job itself failed — the history was so large I had to write a Rust tool just to extract the document. Four gigabytes of CRDT history. For one schedule.
That was the moment I stopped looking for a better bandaid, and started looking for a real long-term solution.
**The Realization**
Automerge V2 compiles to WebAssembly, and WebAssembly has a hard 4GB memory limit — wasm32 uses 32-bit addressing, so linear memory tops out at 2^32 bytes. There's nothing you can do about that at the library level. So how do you solve the unsolvable? You stop trying to solve it and ask why you're using the library in the first place.
Do I actually need CRDTs?
In film and TV scheduling, all users are working from the same script. The scenes are defined. The cast is defined. All the details come from the script. The entire job of the scheduling software is to take something that already exists and make it scheduleable. We're not collaborating on text. We're not expecting two people to type different characters into the same field simultaneously.
CRDTs solve character level conflicts. We didn't have character level conflicts.
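The distinction is easy to demonstrate. For field-granularity data, something as simple as a timestamped last-writer-wins merge resolves concurrent edits without keeping any history at all. This is a toy illustration of the idea, not PowerSync's internals or our production merge logic:

```javascript
// Field-level last-writer-wins: each field carries the timestamp of its
// last edit, and merging two replicas keeps the newer value per field.
function setField(doc, field, value, ts) {
  return { ...doc, [field]: { value, ts } };
}

function merge(a, b) {
  const out = { ...a };
  for (const [field, entry] of Object.entries(b)) {
    if (!out[field] || entry.ts > out[field].ts) out[field] = entry;
  }
  return out;
}

// Two users edit the same scene offline, touching different fields.
const base = setField({}, "castIds", [3, 7], 1);
const alice = setField(base, "shootDay", 12, 5);
const bob = setField(base, "location", "Stage B", 6);

const merged = merge(alice, bob);
console.log(merged.shootDay.value, merged.location.value); // 12 "Stage B"
```

No op log, no tombstones, no unbounded growth — because the unit of conflict is the field, not the character.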
Once I saw that clearly, the question changed from "how do I fix Automerge" to "what do I actually need, and what becomes possible if I'm not constrained by CRDTs?"
That question led me to PowerSync.
**The Decision**
The CTO had chosen Automerge. I wasn't going to challenge that based on a question my brother had asked in a technical interview two years earlier. I was just an engineer.
Then the CTO left. The other engineers left or were let go. It was just me.
There was no one left to defer to. Which meant I had no excuse not to question everything.
**Why PowerSync**
I evaluated five options: PowerSync, Yjs, staying on Automerge (waiting for V3), ElectricSQL, and OrbitDB with IPFS.
Staying on Automerge wasn't really an option: we had no idea when V3 would be released and production-ready, and we needed a solution yesterday.
OrbitDB with IPFS shared Automerge's peer-to-peer, eventually consistent model, so it fit our existing mental model. But I realized that model itself was wrong for an authoritative scheduling system.
Yjs was still CRDT-based, meaning we could keep relying on the system's inherent conflict resolution. But migrating from Automerge's JSON structure to Yjs's modular types like Y.Text and Y.Map would have meant significant UI rewrites on top of the sync-layer changes. It might have solved the problem, but it would have taken too long to implement, test, and ship reliably — and the core constraint was a migration I could handle alone while still keeping a crumbling system running.
ElectricSQL was also on my radar. The architecture was compelling, but it hadn't reached v1 yet and Postgres sync wasn't available. It did remind me of PowerSync, the project my brother was involved in: the basic architecture was the same, save to a local database and "sync" changes to the backend using database replication.
PowerSync had the highest migration complexity on paper. But it had one critical advantage the others didn't — the data could be loaded into the exact same shape as the Automerge objects we'd been using. That meant we could migrate incrementally, running both systems in parallel behind a feature flag, rather than doing a big bang rewrite.
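Incremental migration can look something like the following. The names (`loadFromAutomerge`, `loadFromPowerSync`, the flag shape) are hypothetical, and the backends are stubs — the point is only that both loaders return the same document shape, so callers never know which system served them:

```javascript
// Dual-backend read path behind a per-schedule feature flag.
// Both loaders must return the shape the UI already expects
// (the shape of the old Automerge documents).
function makeScheduleLoader({ loadFromAutomerge, loadFromPowerSync, flags }) {
  return function loadSchedule(scheduleId) {
    return flags.has(scheduleId)
      ? loadFromPowerSync(scheduleId)
      : loadFromAutomerge(scheduleId);
  };
}

// Stub backends standing in for the real sync layers.
const loader = makeScheduleLoader({
  loadFromAutomerge: (id) => ({ id, scenes: [], source: "automerge" }),
  loadFromPowerSync: (id) => ({ id, scenes: [], source: "powersync" }),
  flags: new Set(["sched-42"]), // schedules already migrated
});

console.log(loader("sched-42").source); // powersync
console.log(loader("sched-7").source);  // automerge
```

Flipping a schedule over is then just adding its ID to the flag set, and rolling back is removing it — no big-bang cutover.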
And then there was the unexpected upside. PowerSync isn't just a sync engine. It's a full SQLite database on the client. Once we were migrated, we weren't just solving the memory problem — we were syncing user settings, feature flags, access records. Things we'd been handling separately. The scope of what became possible was larger than the problem we started with.
**What I'd Tell Another Developer**
It depends on your actual sync granularity. If you're syncing text — collaborative documents, rich editing — Automerge is genuinely impressive and the V3 improvements are real. Use it.
But if you're syncing discrete fields, properties, structured data — ask yourself the question my brother asked before you commit: are you syncing per character or per field? The answer probably determines your architecture.
I wish someone had asked me that question before I started.