DEV Community

Alex Spinov

Chroma Has a Free API — Here's How to Build AI Apps with the Simplest Vector Database

Why Chroma?

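Chroma is an open-source vector database built to be simple to start with. In Python it runs embedded inside your app's process: no server setup, no Docker, no infrastructure. Just pip install and go. The JavaScript client works differently: it connects to a Chroma server instead of running embedded.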

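Chroma is free and open source under the Apache 2.0 license. Managed hosting is offered through Chroma Cloud; check the Chroma site for current availability and pricing.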

Getting Started

Python (Embedded — Zero Setup)

pip install chromadb

import chromadb

client = chromadb.Client()  # In-memory
# Or persistent:
# client = chromadb.PersistentClient(path="./chroma_data")

# Create a collection; with no embedding function given, Chroma uses its
# default model (all-MiniLM-L6-v2), which is downloaded on first use
collection = client.create_collection(name="articles")

# Add documents — Chroma embeds them automatically!
collection.add(
    documents=[
        "Machine learning is transforming how we build software",
        "React Server Components change the way we think about rendering",
        "Docker containers simplify deployment and scaling",
        "GraphQL provides a flexible alternative to REST APIs",
        "Rust's ownership model prevents memory bugs at compile time"
    ],
    ids=["ml-1", "react-1", "docker-1", "graphql-1", "rust-1"],
    metadatas=[
        {"category": "AI", "author": "Alice"},
        {"category": "Frontend", "author": "Bob"},
        {"category": "DevOps", "author": "Charlie"},
        {"category": "API", "author": "Alice"},
        {"category": "Systems", "author": "Diana"}
    ]
)

# Semantic query — finds by meaning!
results = collection.query(
    query_texts=["artificial intelligence and neural networks"],
    n_results=3
)

for doc, meta, dist in zip(results['documents'][0], results['metadatas'][0], results['distances'][0]):
    print(f"[{meta['category']}] {doc[:60]}... (distance: {dist:.3f})")

# Filtered query
results = collection.query(
    query_texts=["modern web development"],
    where={"category": "Frontend"},
    n_results=3
)

# Update
collection.update(
    ids=["ml-1"],
    documents=["Deep learning and neural networks are revolutionizing AI"],
    metadatas=[{"category": "AI", "author": "Alice", "updated": True}]
)
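To make the query results above less opaque, here is a minimal pure-Python sketch of what a vector query does conceptually: filter by metadata, rank the stored vectors by distance to the query vector, and return the closest ids in the same nested-list shape that the loop above indexes with [0]. The three-dimensional vectors, the toy_query helper, and the choice of squared Euclidean distance are illustrative assumptions, not Chroma's internals. Real embeddings have hundreds of dimensions, and Chroma searches an approximate index rather than scanning every record.

```python
from typing import Optional

# Hypothetical 3-d "embeddings"; real models produce hundreds of dimensions.
STORE = {
    "ml-1": ([0.9, 0.1, 0.0], {"category": "AI"}),
    "react-1": ([0.1, 0.9, 0.1], {"category": "Frontend"}),
    "docker-1": ([0.0, 0.2, 0.9], {"category": "DevOps"}),
}


def squared_l2(a, b):
    """Squared Euclidean distance: smaller means more similar."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def toy_query(query_vectors, n_results=2, where: Optional[dict] = None):
    """Brute-force nearest-neighbour search over STORE.

    Returns one inner list per query vector, mirroring the nested
    results['ids'][0] / results['distances'][0] shape.
    """
    ids_out, dists_out = [], []
    for q in query_vectors:
        candidates = [
            (squared_l2(q, vec), doc_id)
            for doc_id, (vec, meta) in STORE.items()
            # Equality filter only: every key in `where` must match.
            if where is None or all(meta.get(k) == v for k, v in where.items())
        ]
        candidates.sort()  # ascending distance, closest first
        top = candidates[:n_results]
        ids_out.append([doc_id for _, doc_id in top])
        dists_out.append([dist for dist, _ in top])
    return {"ids": ids_out, "distances": dists_out}


results = toy_query([[1.0, 0.0, 0.0]], n_results=2)
print(results["ids"][0])   # ['ml-1', 'react-1']

filtered = toy_query([[1.0, 0.0, 0.0]], where={"category": "Frontend"})
print(filtered["ids"][0])  # ['react-1']
```

Note that the filtered call asks for two results but gets one back, because only one record passes the metadata filter.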

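JavaScript (Requires a Running Chroma Server)

Unlike the Python package, the JavaScript client does not run embedded. It connects over HTTP to a Chroma server, which you can start locally with: chroma run --path ./chroma_data (the chroma command ships with the Python package). Install the client with npm install chromadb. Depending on the client version, the default embedding function may need an extra npm package.

import { ChromaClient } from "chromadb";

// Connects to a Chroma server (default: http://localhost:8000)
const client = new ChromaClient();
// getOrCreateCollection avoids an error when the collection already exists
const collection = await client.getOrCreateCollection({ name: "docs" });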

// Add documents
await collection.add({
  ids: ["doc1", "doc2", "doc3"],
  documents: [
    "How to deploy a Next.js app to Vercel",
    "Building REST APIs with Express.js",
    "Introduction to TypeScript generics"
  ],
  metadatas: [
    { topic: "deployment" },
    { topic: "backend" },
    { topic: "typescript" }
  ]
});

// Query
const results = await collection.query({
  queryTexts: ["hosting web applications"],
  nResults: 2
});

console.log(results.documents);

RAG (Retrieval Augmented Generation)

import chromadb
import openai

# 1. Store your knowledge base in Chroma
client = chromadb.PersistentClient(path="./knowledge")
kb = client.get_or_create_collection("knowledge_base")

# Add your docs (upsert keeps reruns idempotent, since the store persists on disk)
kb.upsert(
    documents=["Your company docs...", "Product specs...", "FAQ..."],
    ids=["doc1", "doc2", "doc3"]
)

# 2. Query Chroma for relevant context
def ask(question):
    results = kb.query(query_texts=[question], n_results=3)
    context = "\n".join(results['documents'][0])

    # 3. Feed context to LLM (needs openai>=1.0 and OPENAI_API_KEY set in the environment)
    response = openai.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": f"Answer based on this context:\n{context}"},
            {"role": "user", "content": question}
        ]
    )
    return response.choices[0].message.content

print(ask("What are the product specifications?"))
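The placeholder documents above are short, but real knowledge-base files are usually too long to embed as a single record. One common approach, sketched here as an assumption rather than anything Chroma prescribes, is to split each file into overlapping chunks and give every chunk a stable id before storing it. The chunk_text and make_records helpers, the chunk sizes, and the id scheme are all hypothetical.

```python
def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into windows of `size` characters that overlap by `overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reached the end of the text
    return chunks


def make_records(doc_id: str, text: str, **kwargs):
    """Build parallel lists of ids, chunk texts, and metadata for one document."""
    chunks = chunk_text(text, **kwargs)
    ids = [f"{doc_id}-chunk-{i}" for i in range(len(chunks))]
    metadatas = [{"source": doc_id, "chunk": i} for i in range(len(chunks))]
    return ids, chunks, metadatas


# A 500-character stand-in for a real document.
ids, chunks, metadatas = make_records("faq", "x" * 500, size=200, overlap=50)
print(ids)  # ['faq-chunk-0', 'faq-chunk-1', 'faq-chunk-2']
```

The three lists line up with the ids, documents, and metadatas arguments used when adding records to a collection. Character-based splitting is the simplest option; splitting on sentences or headings usually retrieves cleaner context.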

Chroma vs Alternatives

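| Feature | Chroma | Pinecone | Qdrant |
|---|---|---|---|
| Embedded mode | Yes (Python) | No | Python client local mode only |
| Setup required | None | Account | Docker (for the server) |
| Auto-embedding | Yes | No | No |
| License | Apache 2.0 | Proprietary | Apache 2.0 |
| Best for | Prototyping + RAG | Production | Production |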

Need to scrape data for your AI app? I build production-ready scrapers. Check out my Apify actors or email spinov001@gmail.com for custom data pipelines.

Building RAG apps? What's your vector DB choice? Share below!
