This post was originally published on Rivestack.
Most pgvector tutorials are written for Python. That makes sense: Python dominates the AI tooling ecosystem and most RAG frameworks are Python-first. But a large share of production web applications are built with Node.js, and those applications need vector search too. If you are building an API with Express, a full-stack app with Next.js, or a background worker with plain Node.js, this guide has everything you need to get pgvector working.
We will cover connecting with node-postgres, inserting and querying embeddings, building an HNSW index, and wiring it all into a functional RAG pipeline. At the end, there is also a section on using Drizzle ORM for TypeScript projects that prefer an ORM layer.
What You Need Before You Start
You need:
- A PostgreSQL database (14 or newer) with the vector extension enabled
- Node.js 18 or newer
- An OpenAI API key for generating embeddings (or any other embedding provider)
If you are running PostgreSQL locally, install pgvector from the pgvector GitHub repo. On Rivestack, the extension is pre-installed and enabled by default on every database. Connect to your database and run:
CREATE EXTENSION IF NOT EXISTS vector;
Installing the Node.js Packages
npm install pg pgvector openai
npm install --save-dev @types/pg typescript
Connecting to PostgreSQL
import pg from 'pg';
import { toSql } from 'pgvector';
const { Pool } = pg;
const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
});
Using a Pool rather than a single Client is important for any application that handles concurrent requests: the pool hands each query an idle connection instead of queueing everything behind one.
Creating the Schema
async function createSchema(): Promise<void> {
  await pool.query(`
    CREATE TABLE IF NOT EXISTS documents (
      id BIGSERIAL PRIMARY KEY,
      content TEXT NOT NULL,
      source TEXT,
      embedding VECTOR(1536)
    )
  `);
}
The dimension (1536) matches OpenAI's text-embedding-3-small model. Adjust based on the model you use.
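An insert will fail at the database level if a vector's length does not match the declared dimension, so it can help to validate earlier, where the error message can name the model. A minimal guard (the helper name and message are my own, not part of any library):

```typescript
// Throws if an embedding's length does not match the column's declared dimension.
function assertDimensions(embedding: number[], expected: number): number[] {
  if (embedding.length !== expected) {
    throw new Error(
      `Expected a ${expected}-dimensional embedding, got ${embedding.length}`
    );
  }
  return embedding;
}
```

Call it on the result of embed() before handing the vector to toSql, so a model/schema mismatch fails with a readable message instead of a database error.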
Generating and Inserting Embeddings
import OpenAI from 'openai';
import { toSql } from 'pgvector';
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
async function embed(text: string): Promise<number[]> {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: text,
  });
  return response.data[0].embedding;
}

async function insertDocument(content: string, source?: string): Promise<void> {
  const embedding = await embed(content);
  await pool.query(
    'INSERT INTO documents (content, source, embedding) VALUES ($1, $2, $3)',
    [content, source ?? null, toSql(embedding)]
  );
}
The toSql call converts the JavaScript number[] into the string format PostgreSQL expects. Without it, the query will fail with a type error.
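The conversion itself is simple: pgvector's vector input format is a bracketed, comma-separated list such as [0.1,0.2,0.3]. A hand-rolled equivalent, for illustration only (prefer the library's toSql in real code):

```typescript
// Formats a JavaScript number[] as a pgvector literal like '[1,2,3]'.
function toVectorLiteral(embedding: number[]): string {
  return `[${embedding.join(',')}]`;
}

// toVectorLiteral([1, 2, 3]) → '[1,2,3]'
```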
Querying by Vector Similarity
interface SearchResult {
  id: number;
  content: string;
  source: string | null;
  similarity: number;
}

async function search(query: string, limit = 5): Promise<SearchResult[]> {
  const queryEmbedding = await embed(query);
  const result = await pool.query<SearchResult>(
    `SELECT id, content, source, 1 - (embedding <=> $1) AS similarity
     FROM documents
     ORDER BY embedding <=> $1
     LIMIT $2`,
    [toSql(queryEmbedding), limit]
  );
  return result.rows;
}
pgvector provides three distance operators for the vector type: <=> (cosine distance), <-> (Euclidean/L2 distance), and <#> (negative inner product). For text embeddings, cosine distance is almost always the right choice.
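To make the similarity math concrete: the expression 1 - (embedding <=> $1) in the queries above recovers cosine similarity from cosine distance. The same computation in plain TypeScript, purely for illustration:

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|). pgvector's <=> operator
// returns cosine *distance*, which is 1 minus this value.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```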
Adding an HNSW Index
Without an index, PostgreSQL does a full sequential scan for each similarity query. This is fine for small datasets but becomes a bottleneck at scale.
CREATE INDEX CONCURRENTLY documents_embedding_idx
ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
At query time, tune the recall versus speed tradeoff with hnsw.ef_search:
async function searchWithHighRecall(query: string, limit = 5) {
  const queryEmbedding = await embed(query);
  const client = await pool.connect();
  try {
    // SET LOCAL only takes effect inside an explicit transaction
    await client.query('BEGIN');
    await client.query('SET LOCAL hnsw.ef_search = 100');
    const result = await client.query(
      'SELECT id, content, 1 - (embedding <=> $1) AS similarity FROM documents ORDER BY embedding <=> $1 LIMIT $2',
      [toSql(queryEmbedding), limit]
    );
    await client.query('COMMIT');
    return result.rows;
  } catch (err) {
    await client.query('ROLLBACK');
    throw err;
  } finally {
    client.release();
  }
}
Use SET LOCAL (not SET) to scope the setting to the current transaction. Note that SET LOCAL is a no-op outside an explicit transaction block, and a plain SET would persist on the pooled connection and affect unrelated queries.
Building a Complete RAG Pipeline
async function retrieve(query: string, k = 5): Promise<string[]> {
  const qEmbed = await embed(query);
  const result = await pool.query(
    'SELECT content FROM documents ORDER BY embedding <=> $1 LIMIT $2',
    [toSql(qEmbed), k]
  );
  return result.rows.map((r) => r.content as string);
}

async function answer(question: string): Promise<string> {
  const docs = await retrieve(question);
  const context = docs.join('\n\n');
  const completion = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [
      { role: 'system', content: 'Answer questions using only the provided context.' },
      { role: 'user', content: `Context:\n${context}\n\nQuestion: ${question}` },
    ],
  });
  return completion.choices[0].message.content ?? '';
}
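One step the pipeline above glosses over is chunking: long documents should be split before embedding, both because embedding models have input limits and because retrieval works better on focused passages. A naive fixed-size chunker with overlap (the sizes are arbitrary defaults, and a production system would prefer to split on sentence or paragraph boundaries):

```typescript
// Splits text into fixed-size chunks with overlap between neighbors,
// so content cut at a chunk boundary still appears intact in one chunk.
function chunkText(text: string, size = 800, overlap = 100): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break;
    start += size - overlap;
  }
  return chunks;
}
```

Then call insertDocument once per chunk rather than once per document.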
Metadata Filtering
One advantage of pgvector over standalone vector databases is standard SQL WHERE clauses:
async function searchBySource(query: string, source: string, limit = 5) {
  const queryEmbedding = await embed(query);
  const result = await pool.query(
    `SELECT id, content, source, 1 - (embedding <=> $1) AS similarity
     FROM documents
     WHERE source = $2
     ORDER BY embedding <=> $1
     LIMIT $3`,
    [toSql(queryEmbedding), source, limit]
  );
  return result.rows;
}
Next.js Connection Pool Pattern
For Next.js, avoid creating a new pool on every request:
// lib/db.ts
import pg from 'pg';
declare global {
  var pgPool: pg.Pool | undefined;
}

export const pool =
  globalThis.pgPool ??
  new pg.Pool({ connectionString: process.env.DATABASE_URL });

if (process.env.NODE_ENV !== 'production') {
  globalThis.pgPool = pool;
}
The globalThis pattern prevents Next.js hot reload from creating a new pool on every file change.
Common Mistakes
Forgetting toSql on insert. You will get invalid input syntax for type vector. Always call toSql(embedding) before passing it to pool.query.
Using SET hnsw.ef_search without LOCAL. Connection pools reuse sessions, so earlier settings can bleed into later queries. Always use SET LOCAL inside a transaction.
Creating the HNSW index before bulk loading. Load your data first, then build the index: inserting rows one at a time through an existing HNSW index is much slower than indexing after a bulk load. (Unlike IVFFlat, HNSW index quality does not depend on the data present at build time, so this is a speed concern rather than a recall concern.)
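Using Drizzle ORM
For TypeScript projects that prefer an ORM, as mentioned in the introduction, recent drizzle-orm releases (0.31+) ship a pgvector column type and distance helpers. The following is a sketch, not a drop-in implementation — verify the exact API against the drizzle-orm version you install:

```typescript
import pg from 'pg';
import { drizzle } from 'drizzle-orm/node-postgres';
import { pgTable, bigserial, text, vector } from 'drizzle-orm/pg-core';
import { cosineDistance, desc, sql } from 'drizzle-orm';

// Mirrors the documents table created earlier with raw SQL.
const documents = pgTable('documents', {
  id: bigserial('id', { mode: 'number' }).primaryKey(),
  content: text('content').notNull(),
  source: text('source'),
  embedding: vector('embedding', { dimensions: 1536 }),
});

const pool = new pg.Pool({ connectionString: process.env.DATABASE_URL });
const db = drizzle(pool);

// Same query as the raw-SQL search(): order by cosine distance,
// return 1 - distance as a similarity score.
async function searchDrizzle(queryEmbedding: number[], limit = 5) {
  const similarity = sql<number>`1 - (${cosineDistance(documents.embedding, queryEmbedding)})`;
  return db
    .select({ id: documents.id, content: documents.content, similarity })
    .from(documents)
    .orderBy(desc(similarity))
    .limit(limit);
}
```

You still generate embeddings with the same embed() helper; only the query layer changes.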
Where to Go From Here
The next steps: add hybrid search combining vector and full-text, add row-level security for multi-tenant apps, or check the HNSW tuning guide for production performance.
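If you pursue hybrid search, the usual pattern is to run the vector query and a full-text (tsvector) query separately, then merge the two ranked lists. Reciprocal rank fusion is a common, simple merge step; this is a sketch, and the constant 60 is just the conventional default from the RRF literature:

```typescript
// Reciprocal rank fusion: each document's score is the sum of 1 / (k + rank)
// over every ranked list it appears in. Higher combined score ranks first.
function reciprocalRankFusion(rankings: number[][], k = 60): Map<number, number> {
  const scores = new Map<number, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  return scores;
}
```

Feed it the id lists from the vector search and the full-text search, then sort the map entries by score descending to get the fused ranking.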
The operational side — keeping pgvector updated, managing NVMe storage, connection pooling — is where self-hosted PostgreSQL gets complicated. Rivestack provides managed PostgreSQL with pgvector pre-installed and connection pooling built in.