
Yasser B.

Posted on • Originally published at rivestack.io

pgvector with Node.js: Build Semantic Search on PostgreSQL


Most pgvector tutorials are written for Python. That makes sense: Python dominates the AI tooling ecosystem and most RAG frameworks are Python-first. But a large share of production web applications are built with Node.js, and those applications need vector search too. If you are building an API with Express, a full-stack app with Next.js, or a background worker with plain Node.js, this guide has everything you need to get pgvector working.

We will cover connecting with node-postgres, inserting and querying embeddings, building an HNSW index, and wiring it all into a functional RAG pipeline. At the end, there is also a section on using Drizzle ORM for TypeScript projects that prefer an ORM layer.

What You Need Before You Start

You need:

  • A PostgreSQL database (14 or newer) with the vector extension enabled
  • Node.js 18 or newer
  • An OpenAI API key for generating embeddings (or any other embedding provider)

If you are running PostgreSQL locally, install pgvector from the pgvector GitHub repo. On Rivestack, the extension is pre-installed and enabled by default on every database. Connect to your database and run:

CREATE EXTENSION IF NOT EXISTS vector;

Installing the Node.js Packages

npm install pg pgvector openai
npm install --save-dev @types/pg typescript

Connecting to PostgreSQL

import pg from 'pg';
import { fromSql, toSql } from 'pgvector';

const { Pool } = pg;

const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
});

Use a Pool rather than a single Client for any application that handles concurrent requests: the pool reuses open connections instead of paying the connection handshake on every query, and it hands out a dedicated connection per query so concurrent requests do not interleave on one session.

Creating the Schema

async function createSchema(): Promise<void> {
  await pool.query(`
    CREATE TABLE IF NOT EXISTS documents (
      id        BIGSERIAL PRIMARY KEY,
      content   TEXT NOT NULL,
      source    TEXT,
      embedding VECTOR(1536)
    )
  `);
}

The dimension (1536) matches OpenAI's text-embedding-3-small model. Adjust based on the model you use.
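A mismatch between the model's output dimension and the column's declared dimension only surfaces as a database error at insert time. A cheap guard catches it earlier with a clearer message; assertDimension below is a hypothetical helper for illustration, not part of pgvector:

```typescript
// Hypothetical helper: verify an embedding matches the column's declared
// dimension before it reaches PostgreSQL, so a model mix-up fails fast.
const EMBEDDING_DIM = 1536; // must match VECTOR(1536) in the schema

function assertDimension(embedding: number[], expected = EMBEDDING_DIM): number[] {
  if (embedding.length !== expected) {
    throw new Error(
      `Embedding has ${embedding.length} dimensions, expected ${expected}. ` +
      'Did the embedding model change?'
    );
  }
  return embedding;
}
```

Calling it on the result of embed() before inserting turns a cryptic SQL error into an explicit one.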

Generating and Inserting Embeddings

import OpenAI from 'openai';
import { toSql } from 'pgvector';

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

async function embed(text: string): Promise<number[]> {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: text,
  });
  return response.data[0].embedding;
}

async function insertDocument(content: string, source?: string): Promise<void> {
  const embedding = await embed(content);
  await pool.query(
    'INSERT INTO documents (content, source, embedding) VALUES ($1, $2, $3)',
    [content, source ?? null, toSql(embedding)]
  );
}

The toSql call converts the JavaScript number[] into the string format PostgreSQL expects. Without it, the query will fail with a type error.
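The format itself is simple: the vector's text representation is just the components, comma-separated, in square brackets. The hand-rolled serializer below is a sketch of that format for illustration, not pgvector's actual implementation:

```typescript
// Sketch of the wire format: PostgreSQL's vector type accepts text input
// like '[0.1,0.2,0.3]', which is what toSql produces from a number[].
function toVectorLiteral(embedding: number[]): string {
  return `[${embedding.join(',')}]`;
}

toVectorLiteral([0.1, 0.2, 0.3]); // '[0.1,0.2,0.3]'
```

This is also why passing a raw number[] fails: node-postgres serializes JavaScript arrays as PostgreSQL arrays ({0.1,0.2,0.3}), which the vector type rejects.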

Querying by Vector Similarity

interface SearchResult {
  id: number;
  content: string;
  source: string | null;
  similarity: number;
}

async function search(query: string, limit = 5): Promise<SearchResult[]> {
  const queryEmbedding = await embed(query);

  const result = await pool.query<SearchResult>(
    `SELECT id, content, source, 1 - (embedding <=> $1) AS similarity
     FROM documents
     ORDER BY embedding <=> $1
     LIMIT $2`,
    [toSql(queryEmbedding), limit]
  );

  return result.rows;
}

pgvector provides three distance operators: <=> (cosine distance), <-> (L2/Euclidean distance), and <#> (negative inner product; negated so that smaller always means closer). For text embeddings, cosine distance is almost always the right choice.
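To make the 1 - (embedding <=> $1) expression concrete, here is cosine distance computed in plain TypeScript. This illustrates what the operator returns, not how pgvector implements it:

```typescript
// Cosine distance = 1 - cosine similarity. It is 0 for vectors pointing in
// the same direction and 1 for orthogonal vectors, so the query's
// `1 - (embedding <=> $1)` recovers similarity (higher = more similar).
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

cosineDistance([1, 0], [1, 0]); // 0 (same direction)
cosineDistance([1, 0], [0, 1]); // 1 (orthogonal)
```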

Adding an HNSW Index

Without an index, PostgreSQL does a full sequential scan for each similarity query. This is fine for small datasets but becomes a bottleneck at scale.

CREATE INDEX CONCURRENTLY documents_embedding_idx
ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

At query time, tune the recall versus speed tradeoff with hnsw.ef_search:

async function searchWithHighRecall(query: string, limit = 5) {
  const queryEmbedding = await embed(query);
  const client = await pool.connect();
  try {
    // SET LOCAL only takes effect inside a transaction block,
    // so wrap the query in BEGIN/COMMIT.
    await client.query('BEGIN');
    await client.query('SET LOCAL hnsw.ef_search = 100');
    const result = await client.query(
      'SELECT id, content, 1 - (embedding <=> $1) AS similarity FROM documents ORDER BY embedding <=> $1 LIMIT $2',
      [toSql(queryEmbedding), limit]
    );
    await client.query('COMMIT');
    return result.rows;
  } catch (err) {
    await client.query('ROLLBACK');
    throw err;
  } finally {
    client.release();
  }
}

Use SET LOCAL inside a transaction, not plain SET: it scopes the setting to that transaction, and outside a transaction block SET LOCAL has no effect at all (PostgreSQL only emits a warning). Plain SET persists for the session, which leaks through connection pools.

Building a Complete RAG Pipeline

async function retrieve(query: string, k = 5): Promise<string[]> {
  const qEmbed = await embed(query);
  const result = await pool.query(
    'SELECT content FROM documents ORDER BY embedding <=> $1 LIMIT $2',
    [toSql(qEmbed), k]
  );
  return result.rows.map((r) => r.content as string);
}

async function answer(question: string): Promise<string> {
  const docs = await retrieve(question);
  const context = docs.join('\n\n');

  const completion = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [
      { role: 'system', content: 'Answer questions using only the provided context.' },
      { role: 'user', content: `Context:\n${context}\n\nQuestion: ${question}` },
    ],
  });

  return completion.choices[0].message.content ?? '';
}
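Real documents are usually too long to embed in one call, so a common preprocessing step, sketched here as an assumed addition to the pipeline above, is splitting text into overlapping chunks before calling insertDocument:

```typescript
// Split text into fixed-size chunks with overlap, so context that spans a
// chunk boundary still appears intact in at least one chunk.
function chunkText(text: string, chunkSize = 800, overlap = 100): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
    start += chunkSize - overlap; // step forward, keeping `overlap` chars shared
  }
  return chunks;
}
```

Each chunk is then embedded and inserted individually, e.g. for (const chunk of chunkText(doc)) await insertDocument(chunk, source). Character-based splitting is the simplest option; splitting on sentence or paragraph boundaries usually retrieves better.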

Metadata Filtering

One advantage of pgvector over standalone vector databases is standard SQL WHERE clauses:

async function searchBySource(query: string, source: string, limit = 5) {
  const queryEmbedding = await embed(query);
  const result = await pool.query(
    `SELECT id, content, source, 1 - (embedding <=> $1) AS similarity
     FROM documents
     WHERE source = $2
     ORDER BY embedding <=> $1
     LIMIT $3`,
    [toSql(queryEmbedding), source, limit]
  );
  return result.rows;
}

Next.js Connection Pool Pattern

For Next.js, avoid creating a new pool on every request:

// lib/db.ts
import pg from 'pg';

declare global {
  var pgPool: pg.Pool | undefined;
}

export const pool =
  globalThis.pgPool ??
  new pg.Pool({ connectionString: process.env.DATABASE_URL });

if (process.env.NODE_ENV !== 'production') {
  globalThis.pgPool = pool;
}

The globalThis pattern prevents Next.js hot reload from creating a new pool on every file change.

Common Mistakes

Forgetting toSql on insert. You will get invalid input syntax for type vector. Always call toSql(embedding) before passing it to pool.query.

Using SET hnsw.ef_search without LOCAL. Connection pools reuse sessions, so earlier settings can bleed into later queries. Always use SET LOCAL inside a transaction.

Creating the HNSW index before loading data. Load your data first, then build the index: a bulk build after loading is substantially faster than paying the HNSW insertion cost row by row. (With IVFFlat the order matters even more, because the cluster centroids are computed from whatever data exists at build time, so indexing an empty table hurts recall.)

Where to Go From Here

The next steps: add hybrid search combining vector and full-text, add row-level security for multi-tenant apps, or check the HNSW tuning guide for production performance.
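For the hybrid-search direction, a simple way to combine a vector ranking with a full-text ranking is Reciprocal Rank Fusion. The helper below is a minimal sketch, assuming both searches return document ids in rank order:

```typescript
// Reciprocal Rank Fusion: score each document as the sum of 1 / (k + rank)
// over every result list it appears in, then sort by combined score.
// k = 60 is the conventional damping constant from the original RRF paper.
function reciprocalRankFusion(rankings: number[][], k = 60): number[] {
  const scores = new Map<number, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Document 2 ranks first in both lists, so it wins the fused ranking.
reciprocalRankFusion([[2, 1, 3], [2, 3, 4]]); // [2, 3, 1, 4]
```

Rank fusion works on ids alone, so it sidesteps the problem of comparing cosine similarities against ts_rank scores directly.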

The operational side — keeping pgvector updated, managing NVMe storage, connection pooling — is where self-hosted PostgreSQL gets complicated. Rivestack provides managed PostgreSQL with pgvector pre-installed and connection pooling built in.
