DEV Community

Hannu Varjoranta

I keep writing the same Pydantic-asyncpg wrapper. So I published it.

I've built this thin Pydantic-over-asyncpg layer too many times now.

Every project ends up with the same shape: a Row class for typed tables, raw SQL for the joins that don't fit into typed CRUD, transactions, no ORM magic. The fourth time I caught myself starting over, I published it. It's called etchdb. Alpha on PyPI, v0.5 as of this post.

This isn't a "we built our own ORM" story. It's the opposite -- it's about the smallest layer that closes the gap between raw asyncpg and the Pydantic models you already have, with raw SQL as a first-class escape valve when CRUD isn't enough.

The problem with the existing options

I've shipped production code with most of them.

SQLAlchemy is the default answer for a reason. It's also heavy, opinionated, and leaks at the seams the moment you reach for pgvector or PostGIS or any Postgres extension that ORMs don't know about. Async support is now native in 2.0, but the session lifecycle is still a thing you have to think about, and the typed model layer doesn't quite line up with how Pydantic feels in the rest of the codebase.

Tortoise, Piccolo, SQLModel -- better stories, all of them. SQLModel especially is close to what I want. But they all bring their own model class layered on top of (or instead of) Pydantic, and they all want to own more of the query surface than I want to give up.

Raw asyncpg is what I keep coming back to. It's fast, the API mirrors how I think about SQL, and it doesn't try to be clever. The only problem: every project ends up writing the same 80-line wrapper to bridge query results into Pydantic models. And that wrapper grows: insert with RETURNING *, partial updates that don't clobber unset columns, transactional helpers, eventually some kind of typed get(Model, id=...).

After watching myself write that same code four times, I gave it a name and pushed it to PyPI.

What etchdb looks like

Two user-facing types: DB and Row. That's it.

```python
from etchdb import DB, Row

class User(Row):
    __table__ = "users"
    id: int | None = None    # leave unset and the DB allocates it
    name: str
    email: str | None = None

db = await DB.from_url("postgresql+asyncpg://user@host/db")

alice = await db.insert(User(name="Alice"))    # alice.id is now populated by the DB
user = await db.get(User, id=alice.id)         # one row or None
users = await db.query(User, name="Alice")     # list of rows
await db.update(User(id=alice.id, name="Alice B"))   # partial: email is preserved
await db.delete(alice)
```

insert only emits the columns you actually set, so an unset id lets the database allocate one (SERIAL or INTEGER PRIMARY KEY); the returned Row reflects the DB's view via RETURNING *, so server-defaults like id and created_at are populated in place. update does the same partial-emit thing: a column you didn't touch keeps its current value rather than being clobbered.

That covers most of my day-to-day CRUD. When it's not enough, raw SQL is right there:

```python
# Typed-result raw SQL (covers most joins)
users = await db.fetch_models(User, """
    SELECT u.* FROM users u JOIN orders o ON o.user_id = u.id
    WHERE o.created_at > $1
""", since)

# Untyped raw SQL (mirrors asyncpg's vocabulary)
rows = await db.fetch("SELECT count(*) FROM events WHERE site_id = $1", site_id)
val = await db.fetchval("SELECT count(*) FROM users")
n = await db.execute("UPDATE users SET active = false WHERE id = $1", uid)
```

db.execute returns the affected-row count as int, normalised across asyncpg, psycopg, and aiosqlite. That used to be the one thing that bit me every time I swapped drivers -- asyncpg returns "UPDATE 5" strings, psycopg returns a cursor, aiosqlite discards it entirely.
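Under the hood, the asyncpg half of that normalisation amounts to parsing the command tag. A hedged sketch -- `rowcount_from_status` is a hypothetical helper, but the tag formats are real asyncpg behaviour:

```python
def rowcount_from_status(status: str) -> int:
    # asyncpg's execute() returns a command tag such as "UPDATE 5",
    # "DELETE 0", or "INSERT 0 1"; the affected-row count, when
    # present, is the last whitespace-separated token.
    tail = status.rsplit(" ", 1)[-1]
    return int(tail) if tail.isdigit() else 0

rowcount_from_status("UPDATE 5")      # -> 5
rowcount_from_status("INSERT 0 1")    # -> 1
rowcount_from_status("CREATE TABLE")  # -> 0
```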

The features that grew out of adoption

I won't list everything. A few that came from actually using it against real code:

Atomic multi-tenant scoping via where= on update / delete:

```python
await db.update(
    User(id=alice.id, name="Alice B"),
    where={"user_id": current_user_id},   # AND'd onto the PK in one statement
)
```

The check and the update run in the same SQL statement, so there's no TOCTOU window between "is this user allowed to edit?" and the write.

IS NULL filters that actually find rows. This is a correctness fix that hit me on the first real project -- db.get(User, deleted_at=None) used to silently match zero rows because field = NULL is never true in SQL. Now it emits WHERE deleted_at IS NULL.

IN (...) via list values:

```python
recent = await db.query(User, id=[1, 5, 7])    # WHERE id IN ($1, $2, $3)
```
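Together, the `None` and list special cases amount to a small filter compiler. A sketch of the idea with asyncpg-style `$n` placeholders -- `compile_filters` is illustrative, not etchdb's internals:

```python
def compile_filters(filters: dict) -> tuple[str, list]:
    # Turn keyword filters into a WHERE clause, special-casing
    # None (IS NULL) and lists (IN (...)) so they actually match rows.
    clauses, params = [], []
    for col, value in filters.items():
        if value is None:
            clauses.append(f"{col} IS NULL")
        elif isinstance(value, list):
            slots = ", ".join(f"${len(params) + i + 1}" for i in range(len(value)))
            clauses.append(f"{col} IN ({slots})")
            params.extend(value)
        else:
            params.append(value)
            clauses.append(f"{col} = ${len(params)}")
    return " AND ".join(clauses), params

compile_filters({"deleted_at": None, "id": [1, 5, 7]})
# -> ("deleted_at IS NULL AND id IN ($1, $2, $3)", [1, 5, 7])
```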

Atomic increments and DB-side timestamps via sentinels:

```python
from etchdb import Inc, Now

await db.update(Counter.patch(id=1, n=Inc()))            # n = n + 1, atomic
await db.update(Article.patch(id=1, updated_at=Now()))   # CURRENT_TIMESTAMP
```

on_conflict="upsert" for create-or-update:

```python
await db.insert(
    User(id=1, name="alice", email="a@x"),
    on_conflict="upsert",
)
```

Typed exceptions across drivers, so the same except IntegrityError works against asyncpg, psycopg, and aiosqlite. The original driver exception is preserved as __cause__.
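The pattern behind that translation is simple: catch at the driver boundary, re-raise a shared class with the original chained on. A sketch -- the class names in the set are real driver exceptions, but the mapping itself is illustrative, not etchdb's internals:

```python
class EtchError(Exception): ...
class IntegrityError(EtchError): ...

# Driver exception class names that signal a constraint violation,
# matched by name so no driver needs to be importable here:
#   asyncpg.UniqueViolationError, psycopg.errors.UniqueViolation,
#   sqlite3.IntegrityError (raised by aiosqlite)
_INTEGRITY_NAMES = {"UniqueViolationError", "UniqueViolation", "IntegrityError"}

def wrap(exc: BaseException) -> BaseException:
    if type(exc).__name__ in _INTEGRITY_NAMES:
        wrapped = IntegrityError(str(exc))
        wrapped.__cause__ = exc   # preserve the driver's original error
        return wrapped
    return exc
```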

Inspectable SQL -- every typed op exposes its (sql, params) without executing:

```python
q = db.compose("get", User, id=1)
print(q.sql)     # SELECT id, name, email FROM users WHERE id = $1
print(q.params)  # [1]
```

I lean on this constantly when debugging or when an agent is generating code against the library -- being able to print what etchdb is about to send to the database closes a lot of "why isn't this working" loops.

Migrations: forward-only, file-based, paranoid

v0.5's headline feature. I needed a migration story for the projects I was building with etchdb, and Alembic is the wrong size for projects that only need forward-only schema changes.

Drop .sql files into a directory; filenames sort to apply order:

```
migrations/
  0001_create_users.sql
  0002_add_email_index.sql
  0003_add_articles.sql
```

```sql
-- migrations/0001_create_users.sql
CREATE TABLE users (
    id          SERIAL PRIMARY KEY,
    name        TEXT NOT NULL,
    email       TEXT UNIQUE,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX users_email_idx ON users (email);
```

```python
applied = await db.migrate("migrations/")
status = await db.migration_status("migrations/")
```

Each migration runs in its own implicit transaction on Postgres. Don't write BEGIN / COMMIT / ROLLBACK in the file -- the runner owns transaction control and rejects files that try to take it back. For DDL that won't run inside a transaction (CREATE INDEX CONCURRENTLY), put -- etchdb:no-transaction on the first non-blank line.
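Honouring that directive only requires looking at the first non-blank line of the file. A sketch -- `wants_transaction` is a hypothetical name, not etchdb's API:

```python
def wants_transaction(sql: str) -> bool:
    # The runner wraps each file in a transaction unless the first
    # non-blank line opts out with the no-transaction directive.
    for line in sql.splitlines():
        if line.strip():
            return line.strip() != "-- etchdb:no-transaction"
    return True
```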

The thing I cared most about getting right: strict consistency. Tracking lives in a _etchdb_migrations table with (filename, checksum, applied_at). The runner refuses to operate when state is inconsistent: an applied file's content has changed (drift), or an applied filename is no longer in the directory (disappearance). Both fail loudly with the recovery command in the error message. Silent continuation with unknown state is exactly what a forward-only tool exists to prevent.
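The drift and disappearance checks reduce to comparing recorded checksums against what's on disk. A rough sketch, assuming the tracking table has already been read into a filename-to-checksum dict -- `check_consistency` is illustrative, not etchdb's internals:

```python
import hashlib
from pathlib import Path

def checksum(sql: str) -> str:
    return hashlib.sha256(sql.encode()).hexdigest()

def check_consistency(migrations_dir: Path, applied: dict[str, str]) -> None:
    # `applied` maps filename -> checksum recorded when it was run.
    on_disk = {p.name: checksum(p.read_text()) for p in migrations_dir.glob("*.sql")}
    for name, recorded in applied.items():
        if name not in on_disk:
            raise RuntimeError(f"{name} was applied but has disappeared from {migrations_dir}")
        if on_disk[name] != recorded:
            raise RuntimeError(f"{name} changed after it was applied (checksum drift)")
```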

If you need autogenerate, branching, or rollback, Alembic / dbmate / sqitch still slot in fine -- etchdb's helper covers the simple forward-only case without dragging those in.

Designed for AI-assisted development

This is the part I think matters most for where Python is right now.

Predictable verbs (db.get, db.query, db.insert, db.update, db.delete, db.migrate). No metaclass magic. No implicit context vars. No lazy loading. Every typed operation produces inspectable SQL. The vocabulary mirrors asyncpg's, which is already in every model's training data.

The result is that an LLM writing application code against etchdb gets it right on the first attempt much more often than against the big ORMs, because there's nothing surprising to learn. The bug class I hit constantly with SQLAlchemy (the model thinks one thing, the session thinks another, the cursor returned something else) just doesn't exist when the layer is this thin.

I've been building etchdb the same way -- Claude Code as the primary development assistant, Codex for review between releases, occasional Gemini for a third opinion. Five versions on PyPI in five days, mostly driven by what fell out when I adopted v0.1 against an existing project and watched where the friction was. Each gap fed the next release.

It's still alpha. Surface keeps tightening between versions. But the core (DB + Row) has stayed identical since v0.1, and I don't expect that to change.

Where it lives

GitHub: varjoranta / etchdb

Minimal async DB layer for Python. Typed CRUD over Pydantic. Raw SQL when you need it.

Status

Alpha. v0.5.0 on PyPI. Built in public from day one; expect tightening between alpha releases.

Example

```python
from etchdb import DB, Row

class User(Row):
    __table__ = "users"
    id: int | None = None             # leave unset and the DB allocates it (SERIAL / INTEGER PK)
    name: str
    email: str | None = None

# Connect (driver inferred from URL scheme)
#   postgresql+asyncpg://...   asyncpg + Postgres
#   postgresql+psycopg://...   psycopg3 + Postgres
#   sqlite+aiosqlite:///...    aiosqlite + SQLite
db = await DB.from_url("postgresql+asyncpg://user@host/db")

# Typed CRUD
alice = await db.insert(User(name="Alice"))           # alice.id is now populated by the DB
user = await db.get(User, id=alice.id)                # one row
```

```shell
pip install etchdb[asyncpg]    # asyncpg + Postgres
pip install etchdb[psycopg]    # psycopg3 + Postgres
pip install etchdb[sqlite]     # aiosqlite + SQLite
pip install etchdb[all]        # everything
```

Postgres is the primary target, with SQLite as a secondary backend (mostly for tests and embedded use). Drivers are optional extras so the top-level namespace depends only on Pydantic.

If you've also been writing the same thin asyncpg+Pydantic glue layer in every project, give it a shot. Issues and feedback welcome on GitHub.
