Originally published at qiita.com

[Side A] SQLite's Shared In-Memory DB Disappears — and `deserialize()` Won't Save You

From the Author:
D-MemFS was featured in Python Weekly Issue #737 (March 19, 2026) under Interesting Projects, Tools and Libraries. Being picked up by one of the most widely-read Python newsletters confirmed that in-memory I/O bottlenecks and memory management are truly universal challenges for developers everywhere. This series is my response to that interest.

🧭 About this Series: The Two Sides of Development

In Japan, I publish this series across two distinct platforms to serve different developer needs. To provide the complete picture here on Dev.to, I've brought them together as two "Sides":

  • Side A (Practical / originally on Qiita): Focuses on the "How". Implementation details, benchmarks, and concrete solutions for practical use cases.
  • Side B (Philosophy / originally on Zenn): Focuses on the "Why". The development war stories, design decisions, and how I collaborated with AI through Specification-Driven Development (SDD).

Introduction

I recently released D-MemFS — a pure Python in-memory virtual file system with zero external dependencies. While integrating it into an application backend, an idea struck me: could D-MemFS help overcome a well-known weakness of SQLite's in-memory database (:memory:)?

It turned out there was an unexpected trap waiting for me.

First: The Limits of :memory: and cache=shared

SQLite's :memory: is a convenient way to use a database entirely in-process without touching the filesystem — perfect for testing and prototyping. However, it has one critical restriction: it cannot be shared across sessions (connections).

The cache=shared option exists to solve this. With a named shared URI like file:my_db?mode=memory&cache=shared, multiple connections in the same process can attach to the same named in-memory database, and you can even keep several such named databases alive side by side. It's a genuinely useful pattern.
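To see the pattern in action, here is a minimal sketch (the DB name `demo_db` and the `users` schema are illustrative): two connections in the same process attach to the same named shared in-memory database, and data written through one is visible through the other.

```python
import sqlite3

# Two connections to the same named shared in-memory DB
a = sqlite3.connect("file:demo_db?mode=memory&cache=shared", uri=True)
b = sqlite3.connect("file:demo_db?mode=memory&cache=shared", uri=True)

a.execute("CREATE TABLE users (id INTEGER, name TEXT)")
a.execute("INSERT INTO users VALUES (1, 'Alice')")
a.commit()  # release the write lock so other connections can read

# The second connection sees the same data
print(b.execute("SELECT * FROM users").fetchall())
# → [(1, 'Alice')]
```

Note that shared-cache connections use table-level locking, so the `commit()` matters: until the writer commits, readers on other connections would be blocked.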

But even with cache=shared, an unavoidable volatility rule applies — just like with plain :memory::

In-memory DB volatility rule:
Even with shared caching enabled, the data vanishes without a trace the moment the last connection closes.

To prevent this data loss, some systems keep a dummy connection open permanently — a functional but inelegant hack.
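Both the volatility rule and the keep-alive hack are easy to demonstrate. In this sketch (`volatile_db` is an illustrative name), the database evaporates the moment its only connection closes, and a reconnect finds a brand-new, empty DB:

```python
import sqlite3

URI = "file:volatile_db?mode=memory&cache=shared"

conn = sqlite3.connect(URI, uri=True)
conn.execute("CREATE TABLE t (x INTEGER)")
conn.commit()
conn.close()  # last connection closes: the DB vanishes here

# Reconnecting creates a fresh, empty database under the same name
fresh = sqlite3.connect(URI, uri=True)
tables = fresh.execute(
    "SELECT name FROM sqlite_master WHERE type='table'"
).fetchall()
print(tables)
# → []

# The classic hack: hold a dummy connection open so the DB survives
keepalive = sqlite3.connect(URI, uri=True)
```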

My thought: what if we snapshot the database state just before the connection closes, and restore it cleanly when needed? This is where Python 3.11's serialize() / deserialize() and D-MemFS enter the picture.

But that's exactly where the trap was hiding.

"Easy with deserialize(), Right?" — Wrong.

Python 3.11+ sqlite3.Connection objects provide serialize() and deserialize(), which allow saving and restoring database state as raw bytes — no disk I/O required. I wrote what seemed like an obvious solution:

  1. Snapshot: Call conn.serialize() to capture state as bytes, then save to D-MemFS.
  2. Restore: Read from D-MemFS, reconnect using the shared cache URI, call deserialize().

```python
import sqlite3

# snapshot_bytes holds the result of an earlier conn.serialize() call

# Reconnect using the shared cache URI and load the data
new_conn = sqlite3.connect("file:my_db?mode=memory&cache=shared", uri=True)
new_conn.deserialize(snapshot_bytes)

# Looks good! Data restored!
print(new_conn.execute("SELECT * FROM users").fetchall())
# → [(1, 'Alice'), (2, 'Bob')]
```

So far, so good. "That was easy," I thought.

But the moment a worker thread (or another request) tries to connect to the same shared DB, it blows up.

```python
# Connect to the same shared DB from elsewhere
worker_conn = sqlite3.connect("file:my_db?mode=memory&cache=shared", uri=True)
worker_conn.execute("SELECT * FROM users").fetchall()

# 💥 sqlite3.OperationalError: no such table: users
```

"Wait — why? Where's the table?"

new_conn could read the data just fine. But worker_conn, connecting to the exact same URI, sees nothing. I did a double-take.

The Tricky Behavior of deserialize()

Digging into the cause, I found a subtle but significant behavior of deserialize():

  • When deserialize() is called, the pager (SQLite's internal storage layer) behind that connection is swapped out for a completely private, independent in-memory DB containing the loaded data.
  • In other words, the moment deserialize() runs, that connection silently detaches from the shared cache.

The connection that performed the restore holds the data. But any new connection joins the original shared cache — which is now empty — and gets no such table. That was the trap.

The Workaround: Using the backup() API

deserialize() cannot write directly into a shared cache connection. But we still need to restore from bytes. The solution: use a "courier".

Using SQLite's long-standing backup() API, the steps are:

  1. Read: Retrieve the snapshot bytes from D-MemFS.
  2. Load into courier: Create a temporary :memory: connection and deserialize() the data into it.
  3. Copy to target: Use backup() to transfer everything from the courier into the real shared cache connection.
  4. Discard: Close the courier.

```python
# The real shared cache DB
shared_conn = sqlite3.connect("file:my_db?mode=memory&cache=shared", uri=True)

if snapshot_bytes:
    # Prepare the courier (temporary DB) and deserialize into it
    temp_conn = sqlite3.connect(":memory:")
    temp_conn.deserialize(snapshot_bytes)

    # Copy everything from the courier to the target
    temp_conn.backup(shared_conn)
    temp_conn.close()  # Courier has served its purpose
```

This slightly unglamorous workaround successfully restores state into a shared cache without corrupting it, across Python 3.11 through 3.14.

Honestly: Isn't a Plain Dict Good Enough?

Having explained all of this, the sharp-eyed reader is probably thinking:

"If you're just stashing serialized bytes, why bother with a virtual filesystem? A global Dict[str, bytes] would be simpler and faster."

You're absolutely right.

A plain dict requires no external library and keeps the code simple. A management class wrapping it would be even cleaner. The advantage of a memory filesystem is thin here.

So why did I use D-MemFS? Honestly — personal preference.

| Comparison | Global Dict | D-MemFS |
| --- | --- | --- |
| Speed / memory efficiency | ⭕ Faster and lighter | 🔺 FS overhead exists |
| State management model | 🔺 Key-string-based self-management | ⭕ Intuitive paths like `/snapshots/v1.db` |
| Snapshot size limit | 🔺 Must implement yourself | ⭕ Hard quota prevents unbounded growth / OOM |
| Versioning / export | 🔺 Mapping and serialization is tedious | ⭕ Export an entire directory tree in one call |

In real application development, "using SQLite" usually implies eventual persistence — writing to a physical file at a checkpoint. When that moment arrives, managing raw bytes under dict keys feels mismatched with SQLite's mental model. Having a virtual path like /snapshots/v1.db — even if it only exists in memory — fits far more naturally. And D-MemFS's hard quota guarantees that unbounded snapshot growth can never cause an OOM kill.

Added as an official use case
The deserialize() pitfall and backup() workaround were interesting enough to formalize. I wrapped this pattern into an MFSSQLiteStore class and added it to D-MemFS's official README and examples.

Even if you manage snapshots with a plain dict, the SQLite deserialize() / backup() trick itself is worth keeping in the back of your mind.

And if managing a virtual filesystem on top of a database sounds oddly appealing — give D-MemFS a try.


🔗 Links & Resources

If you find this project interesting, a ⭐ on GitHub would be the best way to support my work!
