How We Built an Offline-First Engineering App with Flutter & SQLite
I didn’t start by wanting to build software. I started by needing reliable systems where the internet fails.
This is a technical post-mortem on building a production offline-first engineering app under real-world constraints: unreliable networks, low-end devices, strict budgets, and legal responsibility.
Why Offline-First Was Non-Negotiable
K-First started with a simple observation:
Construction sites don’t have reliable internet, but engineers still need reliable data.
Most apps assume:
- network first
- cache later
- sync as an optimization
That model breaks down when:
- connectivity drops mid-form
- background processes are killed
- devices reboot unexpectedly
- data loss has legal or financial consequences
So the core constraint from day one was clear:
The app must function correctly even if the network never comes back.
Offline-first wasn’t a feature.
It was the architecture.
V1: The “Zero-Burn” Monolith (December 2025)
The first version of K-First had one brutal constraint:
Zero burn.
- No servers
- No paid infrastructure
- No background sync unless a user explicitly paid
The Initial Architecture (and the Trap)
To move fast, we built a single large controller—effectively a God Class—that handled:
- navigation
- state
- database access
- UI orchestration
All project data was eagerly loaded into memory at app startup.
On paper, it worked.
In reality, it created three serious problems.
- Memory Pressure
Large projects meant large in-memory state.
Low-end Android devices didn’t appreciate that.
- “Passive Builder” Navigation
Forms returned data using:
Navigator.pop(result)
This failed when:
- Android killed background activities
- users deep-linked back into the app
- the process restarted mid-flow
- The “Ghost Data” Bug
Users would save logs and later discover the entries had never actually been persisted.
That’s unacceptable for an engineering logbook.
By late December, the monolith was already collapsing under real usage.
Phase 2: The Lazarus Refactor (January 2026)
We stopped feature work and initiated what we internally called Operation Clean House.
The goal wasn’t elegance.
The goal was survivability.
Breaking the God Class
We migrated to a strict MVVM structure:
- Repository layer for persistence
- ViewModels for state and lifecycle safety
- UI reduced to pure rendering
State stopped flowing through navigation.
Navigation stopped being a data transport.
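A minimal sketch of that split in plain Dart, assuming hypothetical names (`LogEntry`, `LogRepository`, `LogFormViewModel`, `LogListViewModel`) that do not come from the K-First codebase. The form writes through the repository and the list re-reads from it, so nothing has to travel back through `Navigator.pop(result)`:

```dart
/// Hypothetical log entry; the fields are illustrative only.
class LogEntry {
  const LogEntry(this.id, this.text);
  final String id;
  final String text;
}

/// Persistence boundary. A real implementation would wrap SQLite.
abstract class LogRepository {
  Future<void> save(LogEntry entry);
  Future<List<LogEntry>> loadAll();
}

/// In-memory stand-in so the sketch runs without a database.
class InMemoryLogRepository implements LogRepository {
  final Map<String, LogEntry> _rows = {};

  @override
  Future<void> save(LogEntry entry) async {
    _rows[entry.id] = entry;
  }

  @override
  Future<List<LogEntry>> loadAll() async => _rows.values.toList();
}

/// Form screen: writes through the repository and returns nothing
/// via navigation.
class LogFormViewModel {
  LogFormViewModel(this._repo);
  final LogRepository _repo;

  Future<void> submit(LogEntry entry) => _repo.save(entry);
}

/// List screen: re-reads from storage, so it does not depend on a
/// navigation result arriving.
class LogListViewModel {
  LogListViewModel(this._repo);
  final LogRepository _repo;
  List<LogEntry> entries = const [];

  Future<void> refresh() async {
    entries = await _repo.loadAll();
  }
}
```

With this shape the form can close with a bare `Navigator.pop()`. If the process is killed mid-flow, the list screen rebuilds its state from the repository on the next launch.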
The Lazarus Protocol: Self-Healing Local Storage
The biggest risk in an offline-first app is silent database corruption.
- crashes happen
- battery pulls happen
- OEMs do weird things
So we introduced what we call the Lazarus Protocol:
- Every critical database write is checkpointed
- On startup, the app validates the SQLite file
- If corruption is detected:
  - the database is quarantined
  - the last known-good backup is restored
- The user is informed, but never left with an empty app
The principle was simple:
A partially correct logbook is better than a wiped one.
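The startup half of that flow could be sketched as follows. This is an illustration under assumptions, not the app's actual code: `recoverOnStartup`, `HealthCheck`, and `StartupResult` are hypothetical names, and the validation step is injected as a callback (in a real app it might open the file and run SQLite's `PRAGMA integrity_check`). How the backup file gets produced after each critical write is left out.

```dart
import 'dart:io';

/// Returns true when the database file at [path] passes validation.
/// Hypothetical hook: the caller decides how to validate the file.
typedef HealthCheck = Future<bool> Function(String path);

enum StartupResult { healthy, restoredFromBackup, noUsableBackup }

Future<StartupResult> recoverOnStartup({
  required String dbPath,
  required String backupPath,
  required HealthCheck isHealthy,
}) async {
  final db = File(dbPath);

  // A missing file means a fresh install: nothing to validate yet.
  if (!await db.exists()) return StartupResult.healthy;
  if (await isHealthy(dbPath)) return StartupResult.healthy;

  // Quarantine instead of deleting, so the damaged file can still be
  // inspected or partially recovered later.
  final stamp = DateTime.now().millisecondsSinceEpoch;
  await db.rename('$dbPath.corrupt-$stamp');

  // Only restore a backup that itself passes validation.
  final backup = File(backupPath);
  if (await backup.exists() && await isHealthy(backupPath)) {
    await backup.copy(dbPath);
    return StartupResult.restoredFromBackup;
  }
  return StartupResult.noUsableBackup;
}
```

The returned value is what lets the UI inform the user: `restoredFromBackup` and `noUsableBackup` both warrant a visible message, while `healthy` needs none.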
Android 15 and the 16KB Page Size Wall
In January 2026, Google Play rejected our builds.
The reason had nothing to do with Flutter.
Android 15 added support for 16 KB memory pages, and Google Play requires apps that target Android 15 or higher to ship native libraries compatible with 16 KB page sizes.
Our encryption stack was incompatible.
The Fix
- Forced upgrade of sqflite_sqlcipher
- Migration safety nets to prevent existing users from being locked out
- Careful handling of encrypted database headers
This reinforced a painful truth:
Mobile platforms are not stable targets. They are moving ground.
If you build offline-first, you inherit that responsibility.
The Samsung “Zombie Key” Incident (S23 / S24)
This was the most dangerous bug we’ve encountered so far.
The Symptom
Samsung users updated the app and saw:
“0 Projects”
No crash.
No error.
Just empty state.
The Root Cause
Samsung’s hardware-backed keystore (Knox) is sometimes not ready during cold start.
Our app:
- requested the encryption key
- received a newly generated key instead of the stored one
- attempted to open the existing database with it
- failed silently
We called this a Zombie Key:
- valid
- real
- completely wrong
The Fix: The Samsung Patience Protocol
Instead of assuming storage is instant, we implemented:
- retry loops
- exponential backoff
- up to ~7.5 seconds of patience
Only after exhausting retries do we treat the app as a fresh install.
Lesson learned:
Never assume hardware security modules wake up on time.
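A retry loop with exponential backoff along those lines might look like the sketch below. The names (`readKeyWithPatience`, `KeyReader`, `Sleep`) are hypothetical, and the schedule is an assumption chosen to match the ~7.5 second budget: waits of 0.5 s, 1 s, 2 s, and 4 s between five attempts. The post does not state the actual delays.

```dart
import 'dart:async';

/// Reads the stored database key, or returns null if the keystore is
/// not ready yet. Hypothetical hook: the real call would go through
/// whatever secure-storage plugin the app uses.
typedef KeyReader = Future<String?> Function();

/// Injectable delay, so the loop can be tested without real waiting.
typedef Sleep = Future<void> Function(Duration);

Future<void> _defaultSleep(Duration d) => Future<void>.delayed(d);

Future<String?> readKeyWithPatience(
  KeyReader readKey, {
  Duration initialDelay = const Duration(milliseconds: 500),
  int maxRetries = 4, // assumed: 0.5 + 1 + 2 + 4 = 7.5 s of waiting
  Sleep sleep = _defaultSleep,
}) async {
  var delay = initialDelay;
  for (var attempt = 0; ; attempt++) {
    try {
      final key = await readKey();
      if (key != null) return key;
    } on Exception catch (_) {
      // Treat keystore exceptions like "not ready yet" and keep waiting.
    }
    if (attempt == maxRetries) return null; // retries exhausted
    await sleep(delay);
    delay *= 2; // exponential backoff
  }
}
```

A `null` result is the only point at which the caller would fall through to the fresh-install path. As a further guard (a suggestion, not something described above), the caller could also refuse to generate a new key while a database file already exists on disk.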
The Split-Brain Problem and the Unified Core Decision
Originally, we planned:
- sqflite for free users (offline only)
- PowerSync for paid users (sync enabled)
It looked clever.
It was a maintenance nightmare.
Two engines meant:
- double migrations
- double testing
- double failure modes
The Pivot
We chose a Unified Core:
- PowerSync everywhere
- SQLite as the single source of truth
- Sync toggled by capability, not architecture
Free users run PowerSync in offline-only mode.
Paid users simply enable connect().
This eliminated an entire class of future migrations.
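The "capability, not architecture" idea can be sketched with a stand-in interface. `LocalFirstDb`, `Entitlements`, `startDatabase`, and `RecordingDb` are hypothetical names for illustration and are not PowerSync's API; the only thing taken from the text above is that every user opens the same engine and paid users additionally call `connect()`.

```dart
/// Minimal stand-in for a local-first database engine.
/// NOT PowerSync's real interface; it only models an optional connect().
abstract class LocalFirstDb {
  Future<void> open();
  Future<void> connect();
}

/// Hypothetical description of what the current user is allowed to do.
class Entitlements {
  const Entitlements({required this.canSync});
  final bool canSync;
}

/// One startup path for everyone: same engine, same schema, same
/// migrations. Sync is an extra step, not a different database.
Future<void> startDatabase(LocalFirstDb db, Entitlements user) async {
  await db.open();
  if (user.canSync) {
    await db.connect();
  }
}

/// Fake engine that records calls, so the sketch runs on its own.
class RecordingDb implements LocalFirstDb {
  final List<String> calls = [];

  @override
  Future<void> open() async {
    calls.add('open');
  }

  @override
  Future<void> connect() async {
    calls.add('connect');
  }
}
```

Because the branch sits after `open()`, upgrading a free user to paid only changes whether `connect()` runs; the local data and its migrations stay untouched.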
Engineering Ethics: Trust Over Features
During compliance review, we identified calculators whose results depended on:
- subjective land values
- inconsistent local rules
We removed them.
Not because we couldn’t implement them, but because shipping legally risky math is irresponsible.
We also implemented:
- global disclaimers
- consent-gated analytics (for India’s DPDP Act)
- hard kill switches where confidence was insufficient
Engineering responsibility doesn’t end at correctness.
It includes consequences.
The Final Stack (Early 2026)
Mobile
- Flutter (Dart)
- MVVM architecture
- SQLite + SQLCipher
- Argon2id key derivation
- Firebase (Crashlytics, Auth)
Web
- Astro (SSG, zero-JS default)
- Tailwind CSS
- React islands (calculators only)
- Motion One (mechanical animations)
- Vercel (CI/CD)
- Consent-first analytics loading
What This Journey Taught Me
- Offline-first changes everything. Storage is not an optimization; it is the product.
- Hardware is unpredictable, especially when security modules are involved.
- Architecture debt compounds faster than feature debt.
- Removing features can be a sign of maturity.
- Trust is the most expensive thing to lose, and the hardest to earn back.
Closing
This architecture now powers K-First, an offline-first engineering logbook built for real site conditions where reliability matters more than polish.
If you’re building tools for the physical world:
Assume failure first — and design so your users never pay for it.