To learn more about database replication and distributed systems, I designed VeloSync: an offline-first vector database synchronization engine written in pure Python. I used AI to assist in writing and organizing the codebase.
The system runs a local SQLite store on edge devices, performs vector search locally, and bidirectionally synchronizes updates with a FastAPI master server over HTTP.
System Architecture & Features
-
Local Storage & Search: Saves AI vector embeddings locally in SQLite as packed float64 BLOBs (via the Python
structmodule) and performs exact Cosine Similarity search offline. - Replication Log: Tracks every local edit using a Write-Ahead Log (WAL) and Log Sequence Numbers (LSN).
- FastAPI Server: Ingests push payloads over HTTP, verifies integrity checksums, and uses version vector clocks to causally resolve concurrent edit conflicts.
- Log Compaction: Compacts offline changes locally before replication to minimize network bandwidth.
- Echo Suppression: Ensures devices do not receive their own pushes back during pull cycles.
Building this provided hands-on experience with database transactions, version vector clocks, and preventing replication loops in distributed databases.
The source code and system architecture details are available on my GitHub:
Abhishektegur
/
velosync-engine
A zero-dependency Python database sync engine that replicates vector embeddings between SQLite on edge devices and a central FastAPI server.
VeloSync ⚡
VeloSync is a zero-dependency, offline-first vector database synchronization engine written in Python. It is designed for edge environments (mobile phones, IoT nodes, embedded devices) that need to perform fast local vector searches on SQLite and synchronize mutations bidirectionally with a central cloud database over HTTP.
System Architecture
+----------------------------------+ +----------------------------------+
| Device A (edge-phone-01) | | Device B (edge-tablet-07) |
+----------------------------------+ +----------------------------------+
| SQLite DB | | Logical WAL log | | SQLite DB | | Logical WAL log |
+-----------+ +-----------------+ +-----------+ +-----------------+
\ / \ /
(HTTPSyncEngine) (HTTPSyncEngine)
\ / \ /
v v v v
POST /sync (Replicate) POST /sync (Replicate)
POST /ack (Commit) POST /ack (Commit)
\ / \ /
\ / / /
v v
+-------------------------------------------------------------+
| FastAPI HTTP Server |
+-------------------------------------------------------------+
| Master Store (server_store.db) |
+-------------------------------------------------------------+
Key Components
-
Local Edge Storage (
client.py)- A SQLite file database on the device…
I would appreciate any feedback on the design decisions, SQLite integration, or suggestions on what database internals or distributed algorithms I should study next.
Top comments (0)