
KOLOG B Josias Yannick

Posted on • Originally published at kologojosias.com

9,000+ Downloads in 2 Weeks: I Just Built and Published

Two weeks ago, I published Embex to PyPI and npm.

Today: 9,000+ downloads (7K Python, 2K Node.js).

I made one LinkedIn post after publishing. That's it. It didn't even get likes or comments.

Here's what I built.

What is Embex?

A universal ORM for vector databases. One API that works across 7 different databases.

The problem:

Every vector database has a completely different API.

Pinecone:

index.upsert(vectors=[(id, values, metadata)])
results = index.query(vector=query, top_k=5)

Qdrant:

client.upsert(collection_name=name, points=points)
results = client.search(collection_name=name, query_vector=query, limit=5)

Weaviate:

client.data_object.create(data_object, class_name)
results = client.query.get(class_name).with_near_vector(query).do()

Switching from Pinecone to Qdrant means rewriting your entire data layer.

With Embex:

# Works with ANY provider
client = await EmbexClient.new_async(provider="lancedb", url="./data")
await client.insert("products", vectors)
results = await client.search("products", vector=query, top_k=5)

# Switch to Qdrant? Change ONE line:
client = await EmbexClient.new_async(provider="qdrant", url="http://localhost:6333")

Same code. Zero vendor lock-in.

Why I Built It

I needed to test different vector databases for a project. Writing separate implementations for each one seemed wasteful.

So I built one API that works with all of them.
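Under the hood, a universal API like this usually comes down to the adapter pattern: every provider implements the same small interface, and the client just dispatches to whichever backend you asked for. Here is a minimal sketch of that idea in pure Python, with illustrative names (`VectorStore`, `InMemoryStore`, `make_client`) that are not Embex's actual internals:

```python
from abc import ABC, abstractmethod

class VectorStore(ABC):
    """Common interface every provider adapter must implement."""
    @abstractmethod
    def insert(self, collection: str, vectors: list) -> None: ...
    @abstractmethod
    def search(self, collection: str, vector: list, top_k: int) -> list: ...

class InMemoryStore(VectorStore):
    """Stand-in for a real provider adapter (Qdrant, Pinecone, ...)."""
    def __init__(self):
        self.data = {}

    def insert(self, collection, vectors):
        self.data.setdefault(collection, []).extend(vectors)

    def search(self, collection, vector, top_k):
        # Toy scoring: rank by squared Euclidean distance to the query.
        def dist(item):
            return sum((a - b) ** 2 for a, b in zip(item["vector"], vector))
        return sorted(self.data.get(collection, []), key=dist)[:top_k]

# One registry entry per supported database.
PROVIDERS = {"memory": InMemoryStore}

def make_client(provider: str) -> VectorStore:
    # Swapping databases means changing one string, not your data layer.
    return PROVIDERS[provider]()
```

Application code only ever talks to the `VectorStore` interface, which is what makes the one-line provider switch possible.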

The Tech Stack

Core:

  • Rust (performance-critical operations)
  • PyO3 (Python bindings)
  • Napi-rs (Node.js bindings)
  • SIMD instructions (vector math acceleration)

Why Rust?

Vector operations are CPU-intensive. Rust + SIMD is ~4x faster than pure Python/JavaScript for normalization, similarity calculations, and filtering.
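To see what kind of work that speedup applies to, here is the same hot loop in pure Python: normalization plus cosine similarity, the per-vector math a SIMD-enabled Rust core can vectorize. This is only an illustration of the workload; the ~4x figure above is the author's, not measured here:

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_similarity(a, b):
    # For unit vectors, cosine similarity reduces to a plain dot product.
    return sum(x * y for x, y in zip(a, b))

a = normalize([3.0, 4.0])
b = normalize([4.0, 3.0])
score = cosine_similarity(a, b)  # 0.96 for these two vectors
```

In pure Python, each multiply-add is an interpreted bytecode step; SIMD lets the CPU do several lanes of that arithmetic per instruction, which is where the gap comes from.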

Supported Databases:

  • LanceDB (embedded, file-based)
  • Qdrant (managed/self-hosted)
  • Pinecone (managed)
  • Chroma (embedded/server)
  • PgVector (PostgreSQL extension)
  • Milvus (self-hosted/cloud)
  • Weaviate (managed/self-hosted)

The Launch

Published to PyPI and npm. Made one LinkedIn post. Went back to building.

Downloads started coming in. 9,000+ in two weeks.

I don't know where the traffic came from. PyPI/npm search, probably. Maybe GitHub. I haven't looked at analytics closely.

What I Noticed

23% of downloads are Node.js. I expected mostly Python. Apparently people are building with JavaScript too.

Multi-language support matters. Supporting both Python and Node.js from day one expanded reach.

Example Usage

Python:

from embex import EmbexClient, Vector
from sentence_transformers import SentenceTransformer

client = await EmbexClient.new_async("lancedb", "./data")
model = SentenceTransformer('all-MiniLM-L6-v2')

await client.create_collection("docs", dimension=384)

vectors = [Vector(
    id="1",
    vector=model.encode("your text").tolist(),
    metadata={"text": "your text"}
)]
await client.insert("docs", vectors)

results = await client.search(
    "docs",
    vector=model.encode("query").tolist(),
    top_k=5
)

Node.js:

const { EmbexClient } = require('@bridgerust/embex');
const { pipeline } = require('@xenova/transformers');

const embedder = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
const client = await EmbexClient.new({ provider: 'qdrant', url: 'http://localhost:6333' });

const docVectors = await Promise.all(documents.map(async doc => ({
  id: doc.id,
  // The embedder is async and returns a Tensor; await it and take its data.
  vector: Array.from((await embedder(doc.content, { pooling: 'mean', normalize: true })).data),
  metadata: { title: doc.title }
})));
await client.insert('docs', docVectors);

const queryVector = Array.from((await embedder('search query', { pooling: 'mean', normalize: true })).data);
const results = await client.search('docs', queryVector, { top_k: 10 });

What's Next

Short term:

  • Hybrid search (vector + keyword)
  • Performance optimizations
  • More examples

Medium term:

  • Elasticsearch/OpenSearch support
  • Redis vector support
  • Migration utilities
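On the hybrid search item: one common way to combine a vector ranking with a keyword ranking is reciprocal rank fusion (RRF), which merges ordered result lists without needing their scores to be comparable. This is a generic sketch of the technique, not a statement of how Embex will implement it:

```python
def rrf(rankings, k=60):
    """Fuse several ordered lists of doc ids into one ranking.

    Each document scores 1 / (k + rank) per list it appears in;
    k=60 is the constant commonly used in the RRF literature.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["a", "b", "c"]   # from similarity search
keyword_hits = ["b", "d", "a"]  # from keyword/BM25 search
fused = rrf([vector_hits, keyword_hits])  # "b" and "a" rise to the top
```

Because RRF only needs ranks, it works even when the vector backend and the keyword backend score on completely different scales.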

Try It

Python:

pip install embex lancedb sentence-transformers

Node.js:

npm install @bridgerust/embex lancedb @xenova/transformers

Quick test:

import asyncio
from embex import EmbexClient

async def main():
    client = await EmbexClient.new_async('lancedb', './data')
    await client.create_collection('test', dimension=384)
    print('✅ Works')

asyncio.run(main())

Links:


That's it. I built something I needed. Published it. People downloaded it.

Read the full story on my blog: kologojosias.com/embex-9k-downloads

Still building.
