Alex Benny

From Spaghetti Code to the Lazarus Protocol

How We Built an Offline-First Engineering App with Flutter & SQLite

I didn’t start by wanting to build software. I started by needing reliable systems where the internet fails.

This is a technical post-mortem on building a production offline-first engineering app under real-world constraints: unreliable networks, low-end devices, strict budgets, and legal responsibility.

Why Offline-First Was Non-Negotiable

K-First started with a simple observation:

Construction sites don’t have reliable internet, but engineers still need reliable data.

Most apps assume:

network first

cache later

sync as an optimization

That model breaks down when:

connectivity drops mid-form

background processes are killed

devices reboot unexpectedly

data loss has legal or financial consequences

So the core constraint from day one was clear:

The app must function correctly even if the network never comes back.

Offline-first wasn’t a feature.
It was the architecture.

V1: The “Zero-Burn” Monolith (December 2025)

The first version of K-First had one brutal constraint:

Zero burn.

No servers

No paid infrastructure

No background sync unless a user explicitly paid

The Initial Architecture (and the Trap)

To move fast, we built a single large controller—effectively a God Class—that handled:

navigation

state

database access

UI orchestration

All project data was eagerly loaded into memory at app startup.

On paper, it worked.
In reality, it created three serious problems.

  1. Memory Pressure

Large projects meant large in-memory state.
Low-end Android devices didn’t appreciate that.

  2. “Passive Builder” Navigation

Forms returned data using:

Navigator.pop(result)

This failed when:

Android killed background activities

users deep-linked back into the app

the process restarted mid-flow

  3. The “Ghost Data” Bug

Users would save logs…
…and later discover they never actually persisted.

That’s unacceptable for an engineering logbook.

By late December, the monolith was already collapsing under real usage.

Phase 2: The Lazarus Refactor (January 2026)

We stopped feature work and initiated what we internally called Operation Clean House.

The goal wasn’t elegance.
The goal was survivability.

Breaking the God Class

We migrated to a strict MVVM structure:

Repository layer for persistence

ViewModels for state and lifecycle safety

UI reduced to pure rendering

State stopped flowing through navigation.
Navigation stopped being a data transport.
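A minimal sketch of that separation, in plain Dart with no Flutter dependency. All class names here (LogEntry, LogRepository, LogFormViewModel, LogListViewModel) are hypothetical, and the in-memory repository only stands in for a SQLite-backed one. The point it illustrates: the form persists through the repository, and the list rebuilds from the repository, so nothing depends on a route result surviving a process restart.

```dart
/// A single log entry. Field names are illustrative only.
class LogEntry {
  const LogEntry(this.id, this.text);
  final String id;
  final String text;
}

/// Persistence boundary. A real implementation would sit on top of SQLite.
abstract class LogRepository {
  Future<void> save(LogEntry entry);
  Future<List<LogEntry>> loadAll();
}

/// In-memory stand-in so the sketch runs without a database.
class InMemoryLogRepository implements LogRepository {
  final Map<String, LogEntry> _rows = {};

  @override
  Future<void> save(LogEntry entry) async {
    _rows[entry.id] = entry;
  }

  @override
  Future<List<LogEntry>> loadAll() async => _rows.values.toList();
}

/// Form ViewModel: the write completes before the UI is allowed to navigate.
class LogFormViewModel {
  LogFormViewModel(this._repo);
  final LogRepository _repo;

  Future<void> submit(String id, String text) async {
    await _repo.save(LogEntry(id, text));
  }
}

/// List ViewModel: state is always rebuilt from the repository, so it does
/// not matter whether the form's navigation result ever arrived.
class LogListViewModel {
  LogListViewModel(this._repo);
  final LogRepository _repo;
  List<LogEntry> entries = const [];

  Future<void> refresh() async {
    entries = await _repo.loadAll();
  }
}
```

With this shape, the screen that opened the form can simply call refresh() when it becomes visible again, whether it was resumed normally or recreated after the process was killed.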

The Lazarus Protocol: Self-Healing Local Storage

The biggest risk in an offline-first app is silent database corruption.

crashes happen

battery pulls happen

OEMs do weird things

So we introduced what we call the Lazarus Protocol:

Every critical database write is checkpointed

On startup, the app validates the SQLite file

If corruption is detected:

the database is quarantined

the last known-good backup is restored

The user is informed, but never left with an empty app

The principle was simple:

A partially correct logbook is better than a wiped one.
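The startup half of that flow could look roughly like the sketch below. It is an illustration under stated assumptions, not the app's actual code: validateOrRestore, StartupOutcome and the isHealthy hook are hypothetical names, and the health check is injected rather than implemented. With SQLite, one plausible check is to open the file and run PRAGMA integrity_check; with an encrypted database that check also needs the correct key. The sketch ignores SQLite side files such as the write-ahead log.

```dart
import 'dart:io';

enum StartupOutcome { healthy, restoredFromBackup, noBackupAvailable }

/// Validates [db] at startup and restores [backup] if validation fails.
///
/// [isHealthy] is a hypothetical hook supplied by the caller, for example a
/// function that runs `PRAGMA integrity_check` and expects the result "ok".
Future<StartupOutcome> validateOrRestore({
  required File db,
  required File backup,
  required Future<bool> Function(File db) isHealthy,
}) async {
  // A missing file means a fresh install, which needs no recovery.
  if (!await db.exists() || await isHealthy(db)) {
    return StartupOutcome.healthy;
  }

  // Quarantine instead of deleting, so the damaged file can be inspected.
  final stamp = DateTime.now().millisecondsSinceEpoch;
  await db.rename('${db.path}.quarantined-$stamp');

  if (!await backup.exists()) {
    return StartupOutcome.noBackupAvailable;
  }

  // Copy rather than move, so the known-good backup survives a second failure.
  await backup.copy(db.path);
  return StartupOutcome.restoredFromBackup;
}
```

The caller can use the returned outcome to decide what to tell the user, for example showing a notice after restoredFromBackup.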

Android 15 and the 16KB Page Size Wall

In January 2026, Google Play rejected our builds.

The reason had nothing to do with Flutter.

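Android 15 added support for devices that use 16KB memory pages, and Google Play requires apps targeting Android 15 or later to ship native libraries that are compatible with that page size.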
Our encryption stack was incompatible.

The Fix

Forced upgrade of sqflite_sqlcipher

Migration safety nets to prevent existing users from being locked out

Careful handling of encrypted database headers

This reinforced a painful truth:

Mobile platforms are not stable targets. They are moving ground.

If you build offline-first, you inherit that responsibility.

The Samsung “Zombie Key” Incident (S23 / S24)

This was the most dangerous bug we’ve encountered so far.

The Symptom

Samsung users updated the app and saw:

“0 Projects”

No crash.
No error.
Just empty state.

The Root Cause

Samsung’s hardware-backed keystore (Knox) is sometimes not ready during cold start.

Our app:

requested the encryption key

received a freshly generated key instead of the stored one

attempted to open the existing database

failed silently

We called this a Zombie Key:

valid

real

completely wrong

The Fix: The Samsung Patience Protocol

Instead of assuming storage is instant, we implemented:

retry loops

exponential backoff

up to ~7.5 seconds of patience

Only after exhausting retries do we treat the app as a fresh install.

Lesson learned:
Never assume hardware security modules wake up on time.
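A retry loop of that kind can be sketched in plain Dart as follows. The function name readKeyWithPatience, the readKey callback and the delay schedule are all assumptions made for illustration. The only figure taken from the text above is the total of about 7.5 seconds, which the defaults reach with waits of 0.5, 1, 2 and 4 seconds.

```dart
/// Retries [readKey] with exponential backoff before giving up.
///
/// Returns the stored key, or null if every attempt came back empty. Only a
/// null result from this function should be treated as a fresh install.
Future<String?> readKeyWithPatience(
  Future<String?> Function() readKey, {
  // Assumed schedule: 0.5 s + 1 s + 2 s + 4 s = 7.5 s of waiting in total.
  Duration initialDelay = const Duration(milliseconds: 500),
  int maxRetries = 4,
  // Injectable so tests do not have to sleep for real.
  Future<void> Function(Duration) wait = _defaultWait,
}) async {
  var delay = initialDelay;
  for (var attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      final key = await readKey();
      if (key != null) return key;
    } on Exception catch (_) {
      // Keystore not ready yet: treat it like a miss and keep waiting.
    }
    if (attempt == maxRetries) break;
    await wait(delay);
    delay *= 2;
  }
  return null;
}

Future<void> _defaultWait(Duration d) => Future<void>.delayed(d);
```

The important design choice is that key generation stays outside this function. A new key is only created by the caller after the loop has returned null.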

The Split-Brain Problem and the Unified Core Decision

Originally, we planned:

sqflite for free users (offline only)

PowerSync for paid users (sync enabled)

It looked clever.
It was a maintenance nightmare.

Two engines meant:

double migrations

double testing

double failure modes

The Pivot

We chose a Unified Core:

PowerSync everywhere

SQLite as the single source of truth

Sync toggled by capability, not architecture

Free users run PowerSync in offline-only mode.
Paid users simply enable connect().

This eliminated an entire class of future migrations.
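To make "capability, not architecture" concrete, here is a small sketch. Entitlements and startSyncIfEntitled are hypothetical names, and the connect callback is a stand-in for whatever call enables sync on the shared database (the connect() mentioned above). Every user goes through the same open path, and the only branch is whether that callback runs.

```dart
/// What the current user is allowed to do. Hypothetical shape.
class Entitlements {
  const Entitlements({required this.syncEnabled});
  final bool syncEnabled;
}

/// Runs [connect] only for users entitled to sync.
///
/// Returns true if sync was started. The local database is opened the same
/// way in both cases, so there is a single schema and migration path.
Future<bool> startSyncIfEntitled({
  required Entitlements entitlements,
  required Future<void> Function() connect,
}) async {
  if (!entitlements.syncEnabled) {
    return false; // Offline-only: the local database keeps working as is.
  }
  await connect();
  return true;
}
```

Upgrading a free user then becomes a matter of flipping the entitlement and calling this function again, with no database migration involved.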

Engineering Ethics: Trust Over Features

During compliance review, we identified calculators whose results depended on:

subjective land values

inconsistent local rules

We removed them.

Not because we couldn’t implement them —
but because shipping legally risky math is irresponsible.

We also implemented:

global disclaimers

consent-gated analytics (DPDP Act)

hard kill switches where confidence was insufficient

Engineering responsibility doesn’t end at correctness.
It includes consequences.

The Final Stack (Early 2026)

Mobile

Flutter (Dart)

MVVM architecture

SQLite + SQLCipher

Argon2id key derivation

Firebase (Crashlytics, Auth)

Web

Astro (SSG, zero-JS default)

Tailwind CSS

React islands (calculators only)

Motion One (mechanical animations)

Vercel (CI/CD)

Consent-first analytics loading

What This Journey Taught Me

Offline-first changes everything: storage is not an optimization, it is the product

Hardware is unpredictable, especially when security modules are involved

Architecture debt compounds faster than feature debt

Removing features can be a sign of maturity

Trust is the most expensive thing to lose — and the hardest to earn back

Closing

This architecture now powers K-First, an offline-first engineering logbook built for real site conditions where reliability matters more than polish.

If you’re building tools for the physical world:

Assume failure first — and design so your users never pay for it.
