Pinecone costs $70/month for 1M vectors. LanceDB stores them on disk for free. Embedded, serverless, and built for AI.
What Is LanceDB?
LanceDB is an open-source vector database built on the Lance columnar format. It's embedded (like SQLite) — no server, no Docker, no infrastructure.
import lancedb
import numpy as np
# Open database (creates directory if needed)
db = lancedb.connect("./my-lancedb")
# Create a table with vectors
data = [
    {"text": "The cat sat on the mat", "vector": np.random.rand(384).tolist()},
    {"text": "Dogs are great pets", "vector": np.random.rand(384).tolist()},
    {"text": "Machine learning is fun", "vector": np.random.rand(384).tolist()},
]
table = db.create_table("documents", data)
# Search
query_vector = np.random.rand(384).tolist()
results = table.search(query_vector).limit(5).to_pandas()
print(results[["text", "_distance"]])
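To make the `_distance` column less of a black box, here is a minimal pure-Python sketch of what a nearest-neighbor search does conceptually: score every row against the query and keep the closest ones. It assumes a plain Euclidean (L2) distance; the metric and scaling LanceDB actually reports may differ, and `l2_distance` and `brute_force_search` are hypothetical helper names, not part of LanceDB.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def brute_force_search(rows, query, limit=5):
    """Return the `limit` rows closest to `query`, nearest first."""
    scored = [
        {"text": row["text"], "_distance": l2_distance(row["vector"], query)}
        for row in rows
    ]
    scored.sort(key=lambda row: row["_distance"])
    return scored[:limit]

# Tiny 2-dimensional toy data instead of 384-dimensional embeddings
rows = [
    {"text": "a", "vector": [0.0, 0.0]},
    {"text": "b", "vector": [1.0, 0.0]},
    {"text": "c", "vector": [3.0, 4.0]},
]
hits = brute_force_search(rows, [0.9, 0.1], limit=2)
print([hit["text"] for hit in hits])
```

A real vector database avoids scanning every row by using an index, but the result has the same shape: rows ordered by distance, smallest first.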
With Embeddings (Auto-Generate Vectors)
from lancedb.embeddings import get_registry
from lancedb.pydantic import LanceModel, Vector
# Use sentence-transformers for embeddings
embedder = get_registry().get("sentence-transformers").create()
class Document(LanceModel):
    text: str = embedder.SourceField()  # column the embedder reads from
    vector: Vector(embedder.ndims()) = embedder.VectorField()  # column it writes to
# Vectors are generated automatically!
table = db.create_table("docs", schema=Document)
table.add([
    {"text": "The cat sat on the mat"},
    {"text": "Machine learning is transforming industries"},
    {"text": "Python is a great programming language"},
])
# Search with text (auto-embedded)
results = table.search("artificial intelligence").limit(3).to_pandas()
Full-Text Search + Vector Search
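# Hybrid search needs a full-text index on the text column first
table.create_fts_index("text")
# Hybrid search: combines vector similarity with keyword matching
results = (
    table.search("machine learning", query_type="hybrid")
    .limit(10)
    .to_pandas()
)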
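Hybrid search has to merge two ranked lists, one from the vector index and one from the keyword index. A common way to do that is reciprocal rank fusion (RRF). The sketch below is a standalone illustration of the idea, not LanceDB's internal code; the function name, the `k=60` constant, and the document IDs are assumptions for the example, and the reranker LanceDB applies by default may differ by version.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of IDs into one, best first.

    Each list contributes 1 / (k + rank) per item, so items that rank
    well in more than one list rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)

# Hypothetical results from the two searches
vector_hits = ["doc2", "doc1", "doc3"]
keyword_hits = ["doc1", "doc4", "doc2"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Documents that appear in both lists (`doc1`, `doc2`) end up ahead of documents that only one search found, which is the behavior you usually want from a hybrid query.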
Why LanceDB
| Feature | Pinecone | LanceDB |
|---|---|---|
| Hosting | Managed ($70+/mo) | Embedded (free) |
| Setup | Account, API key | pip install lancedb |
| Storage | Cloud only | Local disk, S3, GCS |
| Full-text search | No | Yes |
| SQL queries | No | Yes |
| Versioning | No | Built-in (Lance format) |
- Zero infrastructure — no server to run, like SQLite for vectors
- Disk-based — handle billions of vectors without $1000s in RAM
- Multi-modal — store images, text, video alongside vectors
- Versioning — Lance format supports time-travel queries
pip install lancedb
Building AI search? Check out my AI tools or email spinov001@gmail.com.