# Swap Vector Databases by Changing a URL: Meet vectorwrap

**tl;dr** vectorwrap lets you migrate from an in-memory SQLite prototype to a production pgvector cluster by changing only the connection URL – all your insert & search code stays the same.
## The Problem
Every vector database ships its own client, table schema, and API quirks.
A side-project that starts on DuckDB or SQLite turns into a week-long rewrite when you graduate to Postgres or MySQL HeatWave.
Wouldn't it be nicer if the only thing you had to edit was the connection string?
Enter vectorwrap.
## Why bother? (The migration tax)
I built a toy semantic-search feature for a Flask app on in-memory SQLite.
When the project graduated to staging we needed:
- Durable storage ➜ PostgreSQL + pgvector
- Production-grade indexing ➜ HNSW
- Simple CI tests ➜ DuckDB (fast + no server)
Three back-ends meant three client libraries, three table DDLs, and dozens of `if db == "postgres": …` branches. That's when vectorwrap was born.
## 30-second Quick Start

```bash
pip install "vectorwrap[all]"   # pgvector, HeatWave, SQLite-VSS, DuckDB-VSS
```

```python
from vectorwrap import VectorDB

def embed(txt):
    return [0.1] * 768  # plug your own embeddings here

db = VectorDB("sqlite:///:memory:")                  # 1️⃣ prototype
db.create_collection("docs", 768)
db.upsert("docs", 1, embed("hello world"), {"lang": "en"})
print(db.query("docs", embed("hello"), top_k=1))

db = VectorDB("postgresql://u:p@localhost/vectors")  # 2️⃣ prod swap
print(db.query("docs", embed("hello"), top_k=1))
```
No new imports, no table DDL, just the URL.
VectorDB is a thin façade that parses the URL scheme and dispatches to a backend driver:
```text
VectorDB("postgresql://…") ─┬─> PgVectorDriver
                            ├─> MySQLVectorDriver
                            ├─> SQLiteVSSDriver
                            └─> DuckDBVSSDriver
```
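The dispatch step can be sketched in a few lines. The driver names below come from the diagram above, but the registry dict and the `resolve_driver` helper are hypothetical, not vectorwrap's actual internals:

```python
from urllib.parse import urlparse

# Hypothetical scheme → driver registry mirroring the diagram above.
_DRIVERS = {
    "postgresql": "PgVectorDriver",
    "mysql": "MySQLVectorDriver",
    "sqlite": "SQLiteVSSDriver",
    "duckdb": "DuckDBVSSDriver",
}

def resolve_driver(url: str) -> str:
    """Pick a backend driver from the URL scheme alone."""
    scheme = urlparse(url).scheme
    try:
        return _DRIVERS[scheme]
    except KeyError:
        raise ValueError(f"unsupported backend scheme: {scheme!r}")

print(resolve_driver("postgresql://u:p@localhost/vectors"))  # PgVectorDriver
print(resolve_driver("sqlite:///:memory:"))                  # SQLiteVSSDriver
```

Because everything after the scheme is handled by the driver itself, swapping back-ends never touches calling code.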
## Features

### Unified API

Each driver implements four primitives:

```python
create_collection(name, dim)
upsert(name, id, vector, metadata)
query(name, query_vector, top_k=5, filter=None)
close()
```
### Metadata filter translator

Filters are expressed as pure Python dicts:

```python
{"category": "phone", "price": {"$lt": 1000}}
```
The translator turns that into:
| Backend | Generated SQL / predicate |
|---|---|
| Postgres | `WHERE metadata->>'category'='phone' AND (metadata->>'price')::float < 1000` |
| SQLite-VSS | fetch N × top_k rows, then filter in Python (no JSON query support) |
| DuckDB | `WHERE metadata->>'category'='phone' AND CAST(metadata->>'price' AS DOUBLE) < 1000` |
| MySQL (HeatWave) | `WHERE JSON_EXTRACT(metadata,'$.category')='phone' AND JSON_EXTRACT(metadata,'$.price') < 1000` |
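To illustrate the Postgres row of that table, here is a simplified translator sketch. It is not vectorwrap's actual code: the real translator must bind parameters instead of interpolating strings (SQL injection!) and supports more operators:

```python
def to_postgres_where(filt: dict) -> str:
    """Sketch: turn a Mongo-style filter dict into a Postgres WHERE clause.
    Real code must use bound parameters, never string interpolation."""
    ops = {"$lt": "<", "$lte": "<=", "$gt": ">", "$gte": ">="}
    clauses = []
    for key, cond in filt.items():
        if isinstance(cond, dict):          # operator form, e.g. {"$lt": 1000}
            for op, value in cond.items():
                clauses.append(f"(metadata->>'{key}')::float {ops[op]} {value}")
        else:                               # plain equality
            clauses.append(f"metadata->>'{key}' = '{cond}'")
    return "WHERE " + " AND ".join(clauses)

print(to_postgres_where({"category": "phone", "price": {"$lt": 1000}}))
# WHERE metadata->>'category' = 'phone' AND (metadata->>'price')::float < 1000
```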
### Smart defaults

- First insert auto-creates an HNSW index (where supported).
- Queries with `top_k > 100` fall back to brute-force search, since HNSW recall degrades at large `k`.
- DuckDB/SQLite run synchronously by default; the Postgres/MySQL drivers also expose awaitable methods via `VectorDBAsync`.
### Zero-config tests

`VectorDB("sqlite:///:memory:")` + pytest gives you near-instant unit tests that run in CI without Docker.
### Extensibility hook

Every driver inherits from `BaseVectorDriver`; adding Qdrant is roughly a 200-line PR (ping me!).
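To give a feel for the contract, here is a rough sketch of a new backend. The abstract signatures are inferred from the four primitives listed above, and the brute-force in-memory driver is a toy for illustration, not part of vectorwrap:

```python
import math
from abc import ABC, abstractmethod

class BaseVectorDriver(ABC):
    """Inferred contract: the four primitives every backend implements."""
    @abstractmethod
    def create_collection(self, name, dim): ...
    @abstractmethod
    def upsert(self, name, id, vector, metadata): ...
    @abstractmethod
    def query(self, name, query_vector, top_k=5, filter=None): ...
    @abstractmethod
    def close(self): ...

class InMemoryDriver(BaseVectorDriver):
    """Toy backend: exact brute-force nearest neighbours over a dict."""
    def __init__(self):
        self._collections = {}

    def create_collection(self, name, dim):
        self._collections[name] = {"dim": dim, "rows": {}}

    def upsert(self, name, id, vector, metadata):
        self._collections[name]["rows"][id] = (vector, metadata)

    def query(self, name, query_vector, top_k=5, filter=None):
        hits = []
        for id_, (vec, meta) in self._collections[name]["rows"].items():
            if filter and any(meta.get(k) != v for k, v in filter.items()):
                continue  # drop rows that fail the equality filter
            hits.append((math.dist(query_vector, vec), id_, meta))
        hits.sort(key=lambda h: h[0])        # nearest first (Euclidean)
        return [(id_, meta) for _, id_, meta in hits[:top_k]]

    def close(self):
        self._collections.clear()

drv = InMemoryDriver()
drv.create_collection("docs", 3)
drv.upsert("docs", 1, [0.0, 0.0, 1.0], {"lang": "en"})
drv.upsert("docs", 2, [1.0, 1.0, 1.0], {"lang": "de"})
print(drv.query("docs", [0.1, 0.0, 1.0], top_k=1))  # [(1, {'lang': 'en'})]
```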
## Benchmark (single CPU, 5k × 768 vectors, HNSW, M=16, ef=100)
| Backend | QPS ↑ | p50 latency | Notes |
|---|---|---|---|
| PostgreSQL 16 + pgvector | 210 | 4.7 ms | Unix-socket on local SSD |
| DuckDB 0.10.2 | 197 | 5.0 ms | In-proc; no network hop |
| MySQL 8.2 HeatWave | 153 | 6.3 ms | TCP localhost |
| SQLite-VSS | 138 | 6.9 ms | HNSW in shared library |
Full Jupyter notebook + raw CSV in `/bench`.
## Advanced usage

### Async in FastAPI
```python
from fastapi import FastAPI
from vectorwrap.asyncio import VectorDBAsync

app = FastAPI()
db = VectorDBAsync("postgresql://u:p@host/db")  # `asyncpg` under the hood

@app.post("/search")
async def search(q: str):
    results = await db.query("docs", embed(q), top_k=10)  # embed() from the quick start
    return {"hits": results}
```
### Running LangChain on top
```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores.vectorwrap import VectorWrapStore
from vectorwrap import VectorDB

store = VectorWrapStore(VectorDB("duckdb:///demo.duckdb"), embeddings=OpenAIEmbeddings())
retriever = store.as_retriever(search_kwargs={"filter": {"lang": "en"}})
```
### Migrating data between DBs

```python
src = VectorDB("sqlite:///legacy.db")
dest = VectorDB("postgresql://u:p@host/newdb")

for id_, vec, meta in src.scan("docs"):
    dest.upsert("docs", id_, vec, meta)
```

Under 10 lines, no external ETL.
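Batch upserts (`upsert_many`) are still on the roadmap, but for large collections you can already chunk the scan yourself. The `chunked` helper below is generic Python, not part of the vectorwrap API:

```python
from itertools import islice

def chunked(iterable, size):
    """Yield successive lists of up to `size` items from any iterable."""
    it = iter(iterable)
    while batch := list(islice(it, size)):
        yield batch

# e.g. for batch in chunked(src.scan("docs"), 500): ... upsert each batch
print(list(chunked(range(7), 3)))  # [[0, 1, 2], [3, 4, 5], [6]]
```

Chunking keeps memory flat and makes it easy to add per-batch retry or progress logging during a migration.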
## Future roadmap
- Redis & Elasticsearch adapters
- Batch upserts (`upsert_many`)
## Contributing

```bash
git clone https://github.com/mihirahuja1/vectorwrap
cd vectorwrap
poetry install
pytest -q
```
- Good-first-issue labels → metadata filter edge cases, docstrings
- Code style → Black, ruff, mypy (CI-enforced)
- Discussion → GitHub Discussions or #vectorwrap on the pgvector Slack
## Try it

- GitHub: https://github.com/mihirahuja1/vectorwrap
- PyPI: `pip install vectorwrap`
- Colab playground: https://colab.research.google.com/github/mihirahuja1/vectorwrap/blob/HEAD/examples/demo_notebook.ipynb
If vectorwrap saves you a migration headache, feel free to leave a comment!