Milvus is an open-source vector database built for AI applications. It stores, indexes, and searches billions of embedding vectors with millisecond latency, and is used by 1,000+ organizations including Salesforce, PayPal, and Shopee.
## Why Milvus?
- Purpose-built — designed for vector search from the ground up
- Billion-scale — handles 1B+ vectors with consistent performance
- Multi-index — IVF, HNSW, DiskANN, GPU indexes
- Hybrid search — combine vector similarity with scalar filtering
- Cloud-native — Kubernetes-native, scales horizontally
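At its core, vector search ranks stored embeddings by their similarity to a query embedding. A minimal brute-force sketch in plain Python makes the idea concrete (cosine similarity over a toy corpus; Milvus itself uses ANN indexes such as HNSW precisely to avoid this O(n) scan):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product over the product of the vectors' norms."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def brute_force_search(query: list[float], docs: dict[str, list[float]], limit: int = 2):
    """Score every stored vector against the query -- the linear scan ANN indexes avoid."""
    scored = [(name, cosine(query, vec)) for name, vec in docs.items()]
    return sorted(scored, key=lambda t: t[1], reverse=True)[:limit]

docs = {
    "rag-intro":  [0.9, 0.1, 0.0],
    "vector-dbs": [0.7, 0.6, 0.2],
    "cooking":    [0.0, 0.1, 0.9],
}
print(brute_force_search([1.0, 0.0, 0.0], docs))
```

The names here are illustrative, not a Milvus API; the point is only what "similarity search" computes before indexing enters the picture.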
## Quick Start

```bash
# Docker (standalone)
docker run -d --name milvus \
  -p 19530:19530 -p 9091:9091 \
  milvusdb/milvus:latest standalone

# Or Milvus Lite (embedded, for dev)
pip install pymilvus
# Uses SQLite-based local storage
```
## Python SDK

```python
from pymilvus import MilvusClient
import numpy as np

# Connect to a running server, or pass a local file path
# (e.g. MilvusClient("milvus_demo.db")) for Milvus Lite in local dev
client = MilvusClient(uri="http://localhost:19530")

# Create collection (quick setup: auto schema with "id" and "vector" fields)
client.create_collection(
    collection_name="articles",
    dimension=768,  # must match your embedding model's output size
)

# Insert vectors (extra keys like "title" are stored as scalar fields)
data = [
    {"id": 1, "vector": np.random.rand(768).tolist(), "title": "Introduction to RAG", "category": "AI"},
    {"id": 2, "vector": np.random.rand(768).tolist(), "title": "Vector Databases Explained", "category": "Database"},
    {"id": 3, "vector": np.random.rand(768).tolist(), "title": "Building Search with Milvus", "category": "AI"},
]
client.insert(collection_name="articles", data=data)

# Similarity search
query_vector = np.random.rand(768).tolist()
results = client.search(
    collection_name="articles",
    data=[query_vector],
    limit=5,
    output_fields=["title", "category"],
)
for hits in results:
    for hit in hits:
        print(f"{hit['entity']['title']} — score: {hit['distance']:.4f}")

# Filtered search (hybrid: vector similarity + scalar predicate)
results = client.search(
    collection_name="articles",
    data=[query_vector],
    filter='category == "AI"',
    limit=5,
    output_fields=["title"],
)
```
## REST API

```bash
BASE="http://localhost:9091/api/v1"

# List collections
curl $BASE/collections

# Collection info
curl -X POST $BASE/collection \
  -H 'Content-Type: application/json' \
  -d '{"collectionName": "articles"}'

# Insert
curl -X POST $BASE/entities \
  -H 'Content-Type: application/json' \
  -d '{
    "collectionName": "articles",
    "data": [{"id": 4, "vector": [...], "title": "New Article"}]
  }'

# Search
curl -X POST $BASE/search \
  -H 'Content-Type: application/json' \
  -d '{
    "collectionName": "articles",
    "vector": [...],
    "limit": 5,
    "outputFields": ["title", "category"]
  }'
```
## RAG Pipeline Example

```python
from openai import OpenAI
from pymilvus import MilvusClient

openai = OpenAI()
milvus = MilvusClient(uri="http://localhost:19530")

def embed(text: str) -> list[float]:
    """Embed a single text with OpenAI (text-embedding-3-small returns 1536 dims)."""
    response = openai.embeddings.create(input=text, model="text-embedding-3-small")
    return response.data[0].embedding

def search(query: str, top_k: int = 5) -> list[dict]:
    """Retrieve the top_k most similar chunks from the knowledge_base collection."""
    query_vec = embed(query)
    results = milvus.search(
        collection_name="knowledge_base",
        data=[query_vec],
        limit=top_k,
        output_fields=["text", "source"],
    )
    return [hit["entity"] for hit in results[0]]

def answer(question: str) -> str:
    """Answer a question grounded in the retrieved context."""
    context = search(question)
    context_text = "\n".join(doc["text"] for doc in context)
    response = openai.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": f"Answer using this context:\n{context_text}"},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content
```
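The pipeline above assumes a `knowledge_base` collection already populated with text chunks and their embeddings. Ingestion usually starts by splitting documents into overlapping chunks before embedding each one; a minimal sketch (the chunker and its parameters are illustrative helpers, not part of pymilvus):

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows, overlapping neighbors by `overlap`."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# Each chunk would then be embedded and inserted, e.g.:
# milvus.insert(collection_name="knowledge_base",
#               data=[{"id": i, "vector": embed(c), "text": c, "source": "doc.md"}
#                     for i, c in enumerate(chunk_text(document))])
chunks = chunk_text("a" * 1200, chunk_size=500, overlap=50)
print(len(chunks))
```

Character-based chunking is the simplest option; token- or sentence-aware splitters generally retrieve better but need an extra dependency.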
## Key Features
| Feature | Details |
|---|---|
| Scale | Billions of vectors |
| Indexes | IVF, HNSW, DiskANN, GPU |
| Search | ANN, range, hybrid |
| Filtering | Scalar + vector combined |
| Storage | Memory, disk, tiered |
| Deployment | Standalone, cluster, cloud |
## Resources
Building AI applications? Check my Apify actors or email spinov001@gmail.com.