Originally published at dileeparanawake.com
When I began building LittleSteps AI - a full-stack, authenticated LLM chat for new parents - I assumed the database schema should come first.
I was wrong.
The behaviour of the system’s surfaces - the UI, auth layer, and OpenAI provider - revealed the real rules the schema needed to uphold long before any abstract modelling exercise could.
Surfaces are the concrete touchpoints you can actually see and test. They expose real behaviour and constraints, which makes them a far more reliable foundation for modelling than guessing at tables.
This post is a case study in how a surface-first approach helped me design a smaller, clearer, more reliable schema that matched the product instead of my assumptions.
1. The Problem I Needed to Solve
LittleSteps AI had to support:
- authenticated Google users
- persistent chat threads
- ordered messages
- thread/URL navigation
- OpenAI-powered prompts and responses
Core problem: The real difficulty wasn’t building a chat UI - it was deciding what a minimal schema looked like, and whether the backend or frontend should drive the design.
As I was thinking about schemas (and feeling a little lost), two things became clearer when I mapped user flows and built a quick minimal frontend:
(1) The user experience had a fixed shape:
A predictable flow: sidebar → thread list → thread rename → message list → new message appended.(2) The behaviour of the UI wasn’t optional:
It imposed concrete rules the backend had to respect: Those behavioural constraints turned out to be far more informative than starting with tables and guessing relationships.-
(3) The model provider exposed strict sequencing and message structure:
The OpenAI request–response cycle surfaced rules around ordering, role classification, and token metadata - constraints that the schema would eventually need to include.
-
(4) The auth layer brought its own structural boundaries:
BetterAuth came with a working user/session/account schema and strict ownership hierarchies. Instead of reinventing identity, these boundaries became another “surface” that shaped the final data model.
By mapping early UI flows and observing the behaviour of the auth layer and model provider, the actual ‘surfaces’ of the system became clear.
2. Constraints: What Made This Non-Trivial
The behaviour of the UI, the auth layer, and the OpenAI provider surfaced constraints that the schema had to include:
Deterministic ordering
Messages must appear exactly in the order the user expects - no races, no timestamp drift.Non-guessable identifiers
Thread IDs appear in the browser URL and must be safe to share.Strict ownership
Only the creator of a thread should ever access it.No orphaned entities
A thread delete should cascade messages; a message must never “float”.GDPR-aligned deletion
Removing a user should remove data across auth sessions, accounts, threads, and messages.Room for growth without premature complexity
Token usage, analytics, or rate limits might come later-but shouldn’t distort the minimal model now.Atomic prompt–response cycles
Each user prompt produces a tightly coupled pair of messages (user → assistant), which must be persisted together in the correct order.
Together, these constraints narrowed the viable designs. Once the rules were visible, only a small, clean schema actually fit them.
3. Why Surface-First (Not Schema-First) Led to a Better Model
My core insight wasn’t “UI-first”.
It was surface-first: using the concrete surfaces of the system (UI, auth layer, and model provider etc) to reveal the concrete behaviours and rules the schema must design for.
To do that, my mental model is:
Surface → Rules → Schema
(Surface (behaviours) → Rules (invariants) → Schema)
This model works because the behaviour of a system’s surfaces always constrains what the schema must guarantee.
The behavioural inputs came from three places:
a) The UI
- What does the sidebar need to load?
- How must threads appear in order?
- How do we navigate after creating a new message?
- Where does rename write to?
b) The OpenAI model provider
Its request-response cycle surfaces constraints around:
- atomic prompt+response writes
- message role classification
- token metadata
- error handling paths
c) The auth layer (BetterAuth)
It defines ownership, session guarantees, and the edges of who can access what.
What this revealed
Answering those questions produced a set of invariants:
- Threads belong to exactly one user
- Messages belong to exactly one thread
- Messages require stable ordering
- Thread titles must be constrained
- URLs must use non-enumerable IDs
- Delete-user must delete everything cleanly
Once the invariants were clear, the schema became straightforward to design.
Why I didn’t start with schema-first
Schema-first would have forced:
- guessing relationships before understanding flows
- modelling features that didn’t exist
- encoding constraints that didn’t reflect behaviour
- hiding edge cases the UI and provider would later surface
Schema-first is low resolution-it shows what entities might exist, but it doesn’t show how they behave.
Surface-first revealed the dynamics that the schema needed to preserve.
Each surface exposed a different part of the real behaviour: the UI revealed navigation and ordering, the model provider revealed atomic cycles, and the auth layer revealed ownership boundaries.
4. The Data Model
Combining UI flow, auth guarantees, and OpenAI’s request–response lifecycle, the schema settled into four tables:
Each part of the schema maps directly back to the constraints surfaced earlier. The UI revealed ordering and navigation behaviour, which drove the need for explicit sequences and non‑enumerable IDs. The model provider revealed atomic prompt–response cycles and role structures, which shaped the message table. The auth layer exposed strict ownership boundaries, which defined the thread → user relationship and cascade rules. By tying each schema choice to a rule revealed by a surface, the final model became smaller, more explicit, and directly aligned with real product behaviour.
User, Session/Account, Thread, Message.
User (id TEXT PK, email UNIQUE, name, createdAt, updatedAt)
├─ Session (id TEXT PK, token UNIQUE, expiresAt, userId FK → User.id ON DELETE CASCADE)
├─ Account (id TEXT PK, accountId, providerId, userId FK → User.id ON DELETE CASCADE)
└─ Thread (id UUID PK, userId FK → User.id ON DELETE CASCADE, title VARCHAR(60), createdAt, updatedAt)
└─ Message (id UUID PK, threadId FK → Thread.id ON DELETE CASCADE,
sequence INT, role ENUM[system|user|assistant], content TEXT,
createdAt, promptTokens?, completionTokens?, totalTokens?,
UNIQUE(threadId, sequence))
What This Structure Guarantees
1. Clear ownership
A single userId foreign key on Thread encodes all access rules.
The API only needs one join to enforce permissions. This directly satisfies the ownership boundaries revealed by the auth surface.
2. No orphaned data
ON DELETE CASCADE removes dependent rows automatically.
This makes GDPR “delete my data” flows trivial. This addresses the data‑lifecycle rules exposed by both the UI surface and GDPR deletion requirements.
3. Deterministic message ordering
Timestamps are unreliable under async workloads and parallel writes.
Instead, the schema uses:
- a single source of truth:
sequence - DB-level protection:
UNIQUE(threadId, sequence)
This guarantees consistent ordering for:
- UI rendering
- API consumers
- DB-backed tests (Vitest)
- debugging race conditions
Deterministic ordering emerged as a key requirement from several surfaces. I hadn’t anticipated it early on, but it became clear very quickly that relying on timestamps would produce inconsistencies. This solves the ordering and atomicity constraints exposed by both the UI and the model‑provider surfaces.
4. Safe URLs
UUIDs make thread/message identifiers:
- non-enumerable
- shareable
- URL-safe
- predictable in routing
No need for obfuscation or extra indirection. This satisfies the routing and non‑enumerability requirements imposed by the UI surface.
5. Practical constraints exposed by the UI surface
The UI forced the schema to adopt:
- title length limits
- stable sequence ordering
- UUID routing
- role validation (
system | user | assistant)
Every constraint is tied to real behaviour-not speculation. These schema constraints originate directly from the behavioural limits surfaced in the UI.
5. Trade-offs I Made to Ship Faster
1. Some rules (invariants) live in app logic, not the DB
Ownership checks and sequence assignment happen in the application layer.
This keeps the schema simple but shifts responsibility to well-tested code.2. Single-user threads only
Supporting shared threads would require membership tables and more complex ACLs.
Out of scope for MVP - so the model stayed intentionally simple.3. No configurable system prompts yet
I excluded per-thread system messages; future UX changes will require schema migrations.
This avoided premature abstraction.-
4. Shared Postgres instance during tests
Vitest runs tests concurrently, so I used one Postgres container for all tests.
DB-backed tests validated:- thread creation
- message append
- ordering guarantees
- deletion behaviour
Per-test isolated DBs would reduce async testing bugs and improve test isolation - but at the cost of slower iteration. I chose a shared database first to keep feedback loops fast.
5. MVP focus over performance/security
I explicitly avoided deep optimisation or hardening (RLS, rate-limits, encryption, extra indexes) until the product stabilises.
6. Closing Thoughts: Surface-First Schema Design
The biggest lesson from LittleSteps was clear:
Schema design becomes clearer when you start from real behaviour, not tables.
By grounding everything in UI flows, auth rules, and provider interactions, the backend became:
- smaller and more explicit
- easier to reason about
- easier to test
- aligned with real product constraints
- capable of evolving without rewrites
The process wasn’t “UI-first” so much as surface-first - starting from the external surfaces of the system to understand the rules it must uphold.
I appreciate this is more difficult in a production system with large volumes of existing data, but for early product stages it enables fast iteration and clearer modelling.
If you’re building a full-stack app - especially one for quick prototypes / fast itteration / with chat-like interactions / LLM workflows - try this pattern:
Surface → Rules → Schema
Model the real flows first.
Let the schema fall out of them.
You may find that the architecture becomes simpler, more reliable, and easier to extend.

Top comments (0)