
Atlas Whoff

Vector Databases Explained: Embeddings, Similarity Search, and When to Use Them

What Is a Vector Database?

A regular database stores text and numbers. You search by exact match or range.

A vector database stores meaning. You search by semantic similarity.

Regular DB: SELECT * FROM docs WHERE title LIKE '%authentication%'
Vector DB:  Find documents semantically similar to "how do users log in"

The second query returns results about login, OAuth, sessions, passwords, JWTs—even if none of them contain the word "authentication."

How It Works

Text → Embedding model → High-dimensional vector → Store in vector DB

import OpenAI from 'openai';

const openai = new OpenAI();

async function embed(text: string): Promise<number[]> {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: text,
  });
  return response.data[0].embedding; // 1536-dimensional vector
}

const docVector = await embed('JWT tokens expire after 24 hours by default');
// [0.023, -0.041, 0.087, ... 1536 numbers]

const queryVector = await embed('how long does authentication last');
// [0.019, -0.038, 0.091, ...] — similar direction in space

// Cosine similarity between these vectors ≈ 0.92 (very similar)
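That similarity score is typically cosine similarity: the cosine of the angle between two vectors. Here's a minimal sketch in plain TypeScript (no library assumed) of what the database computes for you under the hood:

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|), ranges from -1 to 1.
// Values near 1 mean the vectors point the same direction (similar meaning);
// values near 0 mean they're unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

cosineSimilarity([1, 2, 3], [2, 4, 6]); // ≈ 1 (same direction)
cosineSimilarity([1, 0], [0, 1]);       // 0 (orthogonal, unrelated)
```

A vector DB does the same comparison, but against millions of stored vectors at once, using an index instead of a linear scan.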

pgvector: Vectors in PostgreSQL

You don't need a separate vector DB if you're already on PostgreSQL:

-- Enable extension
CREATE EXTENSION vector;

-- Table with vector column
CREATE TABLE documents (
  id        UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  content   TEXT NOT NULL,
  embedding vector(1536),  -- OpenAI text-embedding-3-small dimension
  metadata  JSONB
);

-- Index for fast similarity search
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);
Then store and query from application code (the `db` here is a Prisma client, using its `$executeRaw`/`$queryRaw` tagged templates):
// Store document with embedding
async function storeDocument(content: string, metadata: object) {
  const embedding = await embed(content);

  await db.$executeRaw`
    INSERT INTO documents (content, embedding, metadata)
    VALUES (${content}, ${JSON.stringify(embedding)}::vector, ${JSON.stringify(metadata)}::jsonb)
  `;
}

// Semantic search
async function search(query: string, limit = 5) {
  const queryEmbedding = await embed(query);

  const results = await db.$queryRaw<Array<{content: string, metadata: object, similarity: number}>>`
    SELECT content, metadata,
      1 - (embedding <=> ${JSON.stringify(queryEmbedding)}::vector) AS similarity
    FROM documents
    ORDER BY embedding <=> ${JSON.stringify(queryEmbedding)}::vector
    LIMIT ${limit}
  `;

  return results;
}
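One practical detail before storing: long documents should be split into chunks first, since embedding models have input limits and smaller chunks retrieve more precisely. A minimal fixed-size chunker with overlap (the sizes here are illustrative defaults to tune, not from any standard):

```typescript
// Split text into overlapping chunks so context isn't lost at chunk
// boundaries. Sizes are measured in characters for simplicity;
// production chunkers usually count tokens instead.
function chunkText(text: string, chunkSize = 1000, overlap = 200): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    start += chunkSize - overlap;
  }
  return chunks;
}
```

Each chunk then goes through `storeDocument(chunk, metadata)` as its own row, so a search can surface the one relevant paragraph rather than a whole document.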

Dedicated Vector Databases

Pinecone (managed, serverless):

import { Pinecone } from '@pinecone-database/pinecone';

const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
const index = pc.index('my-docs');

// Upsert
await index.upsert([{
  id: 'doc-1',
  values: embedding,
  metadata: { title: 'Authentication Guide', url: '/docs/auth' },
}]);

// Query
const results = await index.query({
  vector: queryEmbedding,
  topK: 5,
  includeMetadata: true,
});

Qdrant (open source, self-hosted):

import { QdrantClient } from '@qdrant/js-client-rest';

const client = new QdrantClient({ url: 'http://localhost:6333' });

// Search
const results = await client.search('documents', {
  vector: queryEmbedding,
  limit: 5,
  with_payload: true,
});

RAG: Retrieval-Augmented Generation

The main use case—give LLMs access to your documents:

async function answerQuestion(question: string): Promise<string> {
  // 1. Find relevant documents
  const relevantDocs = await search(question, 3);

  // 2. Build context
  const context = relevantDocs
    .map(doc => doc.content)
    .join('\n\n');

  // 3. Ask LLM with context
  const response = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [
      {
        role: 'system',
        content: `Answer questions using only the provided context. If the answer isn't in the context, say so.\n\nContext:\n${context}`,
      },
      { role: 'user', content: question },
    ],
  });

  return response.choices[0].message.content!;
}
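One refinement worth adding: a top-k query always returns k results, even when nothing in the corpus is actually relevant. Filtering on a minimum similarity score before building the context keeps noise out of the prompt. A small sketch (the 0.7 threshold is an illustrative value to tune against your own data, not a standard):

```typescript
interface SearchResult {
  content: string;
  similarity: number;
}

// Keep only results above a minimum cosine similarity; better to return
// an empty context than to feed the LLM weak, misleading matches.
function filterRelevant(results: SearchResult[], minSimilarity = 0.7): SearchResult[] {
  return results.filter(r => r.similarity >= minSimilarity);
}
```

When everything is filtered out, the context string is empty and the system prompt's instruction kicks in: the model says the answer isn't in the context instead of guessing.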

When to Use Vector Search

Good fit:

  • Semantic document search
  • Recommendation systems
  • RAG (AI with your docs)
  • Duplicate detection
  • Code search by functionality

Not needed:

  • Exact match lookups (use SQL)
  • Structured filtering (use SQL)
  • Simple keyword search (use Postgres full-text)

Start with pgvector. It handles millions of vectors with good performance. Switch to Pinecone/Qdrant when you need billions of vectors or specialized features.


AI features with vector search and RAG built in: Whoff Agents MCP Security Scanner uses embeddings for intelligent threat detection.
