Dileepa Ranawake

Posted on Nov 20 • Originally published at dileeparanawake.com

Surface-First Schema Design

#schema #webdev #database #softwaredevelopment

Originally published at dileeparanawake.com

When I began building LittleSteps AI - a full-stack, authenticated LLM chat for new parents - I assumed the database schema should come first.

I was wrong.

The behaviour of the system’s surfaces - the UI, auth layer, and OpenAI provider - revealed the real rules the schema needed to uphold long before any abstract modelling exercise could.

Surfaces are the concrete touchpoints you can actually see and test. They expose real behaviour and constraints, which makes them a far more reliable foundation for modelling than guessing at tables.

This post is a case study in how a surface-first approach helped me design a smaller, clearer, more reliable schema that matched the product instead of my assumptions.

1. The Problem I Needed to Solve

LittleSteps AI had to support:

authenticated Google users
persistent chat threads
ordered messages
thread/URL navigation
OpenAI-powered prompts and responses

Core problem: The real difficulty wasn’t building a chat UI - it was deciding what a minimal schema looked like, and whether the backend or frontend should drive the design.

As I was thinking about schemas (and feeling a little lost), two things became clearer when I mapped user flows and built a quick minimal frontend:

(1) The user experience had a fixed shape:
A predictable flow: sidebar → thread list → thread rename → message list → new message appended.
(2) The behaviour of the UI wasn’t optional:
It imposed concrete rules the backend had to respect: Those behavioural constraints turned out to be far more informative than starting with tables and guessing relationships.
(3) The model provider exposed strict sequencing and message structure:

The OpenAI request–response cycle surfaced rules around ordering, role classification, and token metadata - constraints that the schema would eventually need to include.
(4) The auth layer brought its own structural boundaries:

BetterAuth came with a working user/session/account schema and strict ownership hierarchies. Instead of reinventing identity, these boundaries became another “surface” that shaped the final data model.

By mapping early UI flows and observing the behaviour of the auth layer and model provider, the actual ‘surfaces’ of the system became clear.

2. Constraints: What Made This Non-Trivial

The behaviour of the UI, the auth layer, and the OpenAI provider surfaced constraints that the schema had to include:

Deterministic ordering

Messages must appear exactly in the order the user expects - no races, no timestamp drift.
Non-guessable identifiers

Thread IDs appear in the browser URL and must be safe to share.
Strict ownership

Only the creator of a thread should ever access it.
No orphaned entities

A thread delete should cascade messages; a message must never “float”.
GDPR-aligned deletion

Removing a user should remove data across auth sessions, accounts, threads, and messages.
Room for growth without premature complexity

Token usage, analytics, or rate limits might come later-but shouldn’t distort the minimal model now.
Atomic prompt–response cycles

Each user prompt produces a tightly coupled pair of messages (user → assistant), which must be persisted together in the correct order.

Together, these constraints narrowed the viable designs. Once the rules were visible, only a small, clean schema actually fit them.

3. Why Surface-First (Not Schema-First) Led to a Better Model

My core insight wasn’t “UI-first”.

It was surface-first: using the concrete surfaces of the system (UI, auth layer, and model provider etc) to reveal the concrete behaviours and rules the schema must design for.

To do that, my mental model is:

Surface → Rules → Schema

(Surface (behaviours) → Rules (invariants) → Schema)

This model works because the behaviour of a system’s surfaces always constrains what the schema must guarantee.

The behavioural inputs came from three places:

a) The UI

What does the sidebar need to load?
How must threads appear in order?
How do we navigate after creating a new message?
Where does rename write to?

b) The OpenAI model provider

Its request-response cycle surfaces constraints around:

atomic prompt+response writes
message role classification
token metadata
error handling paths

c) The auth layer (BetterAuth)

It defines ownership, session guarantees, and the edges of who can access what.

What this revealed

Answering those questions produced a set of invariants:

Threads belong to exactly one user
Messages belong to exactly one thread
Messages require stable ordering
Thread titles must be constrained
URLs must use non-enumerable IDs
Delete-user must delete everything cleanly

Once the invariants were clear, the schema became straightforward to design.

Why I didn’t start with schema-first

Schema-first would have forced:

guessing relationships before understanding flows
modelling features that didn’t exist
encoding constraints that didn’t reflect behaviour
hiding edge cases the UI and provider would later surface

Schema-first is low resolution-it shows what entities might exist, but it doesn’t show how they behave.

Surface-first revealed the dynamics that the schema needed to preserve.

Each surface exposed a different part of the real behaviour: the UI revealed navigation and ordering, the model provider revealed atomic cycles, and the auth layer revealed ownership boundaries.

4. The Data Model

Combining UI flow, auth guarantees, and OpenAI’s request–response lifecycle, the schema settled into four tables:

Each part of the schema maps directly back to the constraints surfaced earlier. The UI revealed ordering and navigation behaviour, which drove the need for explicit sequences and non‑enumerable IDs. The model provider revealed atomic prompt–response cycles and role structures, which shaped the message table. The auth layer exposed strict ownership boundaries, which defined the thread → user relationship and cascade rules. By tying each schema choice to a rule revealed by a surface, the final model became smaller, more explicit, and directly aligned with real product behaviour.

User, Session/Account, Thread, Message.

User (id TEXT PK, email UNIQUE, name, createdAt, updatedAt)
 ├─ Session (id TEXT PK, token UNIQUE, expiresAt, userId FK → User.id ON DELETE CASCADE)
 ├─ Account (id TEXT PK, accountId, providerId, userId FK → User.id ON DELETE CASCADE)
 └─ Thread (id UUID PK, userId FK → User.id ON DELETE CASCADE, title VARCHAR(60), createdAt, updatedAt)
     └─ Message (id UUID PK, threadId FK → Thread.id ON DELETE CASCADE,
                 sequence INT, role ENUM[system|user|assistant], content TEXT,
                 createdAt, promptTokens?, completionTokens?, totalTokens?,
                 UNIQUE(threadId, sequence))

What This Structure Guarantees

1. Clear ownership

A single userId foreign key on Thread encodes all access rules.

The API only needs one join to enforce permissions. This directly satisfies the ownership boundaries revealed by the auth surface.

2. No orphaned data

ON DELETE CASCADE removes dependent rows automatically.

This makes GDPR “delete my data” flows trivial. This addresses the data‑lifecycle rules exposed by both the UI surface and GDPR deletion requirements.

3. Deterministic message ordering

Timestamps are unreliable under async workloads and parallel writes.

Instead, the schema uses:

a single source of truth: sequence
DB-level protection: UNIQUE(threadId, sequence)

This guarantees consistent ordering for:

UI rendering
API consumers
DB-backed tests (Vitest)
debugging race conditions

Deterministic ordering emerged as a key requirement from several surfaces. I hadn’t anticipated it early on, but it became clear very quickly that relying on timestamps would produce inconsistencies. This solves the ordering and atomicity constraints exposed by both the UI and the model‑provider surfaces.

4. Safe URLs

UUIDs make thread/message identifiers:

non-enumerable
shareable
URL-safe
predictable in routing

No need for obfuscation or extra indirection. This satisfies the routing and non‑enumerability requirements imposed by the UI surface.

5. Practical constraints exposed by the UI surface

The UI forced the schema to adopt:

title length limits
stable sequence ordering
UUID routing
role validation (system | user | assistant)

Every constraint is tied to real behaviour-not speculation. These schema constraints originate directly from the behavioural limits surfaced in the UI.

5. Trade-offs I Made to Ship Faster

1. Some rules (invariants) live in app logic, not the DB

Ownership checks and sequence assignment happen in the application layer.

This keeps the schema simple but shifts responsibility to well-tested code.
2. Single-user threads only

Supporting shared threads would require membership tables and more complex ACLs.

Out of scope for MVP - so the model stayed intentionally simple.
3. No configurable system prompts yet

I excluded per-thread system messages; future UX changes will require schema migrations.

This avoided premature abstraction.
4. Shared Postgres instance during tests

Vitest runs tests concurrently, so I used one Postgres container for all tests.

DB-backed tests validated:
- thread creation
- message append
- ordering guarantees
- deletion behaviour
Per-test isolated DBs would reduce async testing bugs and improve test isolation - but at the cost of slower iteration. I chose a shared database first to keep feedback loops fast.
5. MVP focus over performance/security

I explicitly avoided deep optimisation or hardening (RLS, rate-limits, encryption, extra indexes) until the product stabilises.

6. Closing Thoughts: Surface-First Schema Design

The biggest lesson from LittleSteps was clear:

Schema design becomes clearer when you start from real behaviour, not tables.

By grounding everything in UI flows, auth rules, and provider interactions, the backend became:

smaller and more explicit
easier to reason about
easier to test
aligned with real product constraints
capable of evolving without rewrites

The process wasn’t “UI-first” so much as surface-first - starting from the external surfaces of the system to understand the rules it must uphold.

I appreciate this is more difficult in a production system with large volumes of existing data, but for early product stages it enables fast iteration and clearer modelling.

If you’re building a full-stack app - especially one for quick prototypes / fast itteration / with chat-like interactions / LLM workflows - try this pattern:

Surface → Rules → Schema

Model the real flows first.

Let the schema fall out of them.

You may find that the architecture becomes simpler, more reliable, and easier to extend.

DEV Community

Surface-First Schema Design

1. The Problem I Needed to Solve

2. Constraints: What Made This Non-Trivial

3. Why Surface-First (Not Schema-First) Led to a Better Model

Surface → Rules → Schema

The behavioural inputs came from three places:

What this revealed

Why I didn’t start with schema-first

4. The Data Model

What This Structure Guarantees

1. Clear ownership

2. No orphaned data

3. Deterministic message ordering

4. Safe URLs

5. Practical constraints exposed by the UI surface

5. Trade-offs I Made to Ship Faster

6. Closing Thoughts: Surface-First Schema Design

Surface → Rules → Schema

Top comments (0)