Yasser B.

Posted on • Originally published at rivestack.io

pgvector with LangChain: Build a RAG Pipeline on PostgreSQL

LangChain has a vectorstore abstraction that lets you swap out the underlying vector database without rewriting your application logic. Swap Chroma for Pinecone, Pinecone for Weaviate, whatever. In theory, it's clean. In practice, most teams end up staying with whatever they picked first, because migration is never as simple as swapping a class name.

So the decision matters. And if you're building on PostgreSQL, the answer is almost always: use pgvector. Your embeddings live in the same database as your users, documents, and application state. No sync pipeline. No eventual consistency. Full SQL.

This guide walks through the LangChain PGVector integration from scratch, including document loading, embedding, similarity search, metadata filtering, and wiring it into a working retrieval chain.

Why pgvector Over a Dedicated Vectorstore

Before the code, let's be direct about the tradeoff.

Dedicated vectorstores like Pinecone are fast and scale to billions of vectors without you thinking about infrastructure. If you're building something where vectors are the entire product, they're a reasonable choice.

But most applications aren't like that. You have users. You have documents. You have metadata. You have access control logic. When your vectors live in Pinecone and everything else lives in PostgreSQL, you've just created a sync problem. A document gets deleted from Postgres, but its embedding stays in Pinecone. Want to filter by user_id? Now you implement that in a separate system. Want a transaction that inserts a document and its embedding atomically? You can't have one.

pgvector collapses this. Everything lives in one place. JOINs work. Transactions work. Your existing backup strategy, monitoring setup, and connection pooler all work with zero changes.
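The transaction point deserves a concrete example. Here is a hedged sketch of what one-database storage makes trivial; the `documents` and `document_embeddings` tables are hypothetical, and this bypasses LangChain entirely, serving only to illustrate the transaction boundary:

```python
from sqlalchemy import text

def insert_document_atomically(engine, doc_id, content, embedding):
    """Insert a document row and its embedding in one transaction:
    either both writes commit together, or neither does."""
    with engine.begin() as conn:  # opens a transaction, rolls back on error
        conn.execute(
            text("INSERT INTO documents (id, content) VALUES (:id, :content)"),
            {"id": doc_id, "content": content},
        )
        conn.execute(
            text("INSERT INTO document_embeddings (doc_id, embedding) "
                 "VALUES (:id, :emb)"),
            {"id": doc_id, "emb": str(embedding)},
        )
```

With vectors in a separate store, the second write happens outside the transaction, and a crash between the two leaves an orphan on one side.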

Setting Up

You'll need:

  • PostgreSQL with the vector extension enabled
  • Python 3.9+
  • An OpenAI API key (or any LangChain-compatible embeddings model)

Install dependencies:

pip install langchain langchain-openai langchain-postgres psycopg

The langchain-postgres package is the current home for the PGVector integration. On a managed PostgreSQL service with pgvector pre-installed, just run this once:

CREATE EXTENSION IF NOT EXISTS vector;

Connecting LangChain to pgvector

The PGVector class takes a connection string and an embeddings object:

from langchain_postgres import PGVector
from langchain_openai import OpenAIEmbeddings

connection_string = "postgresql+psycopg://user:password@localhost:5432/mydb"
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

vectorstore = PGVector(
    embeddings=embeddings,
    collection_name="documents",
    connection=connection_string,
    use_jsonb=True,
)

Always set use_jsonb=True — it stores metadata as JSONB, unlocking proper metadata filtering. LangChain creates the required tables automatically on first use.

Loading and Embedding Documents

from langchain_core.documents import Document

docs = [
    Document(
        page_content="PostgreSQL supports ACID transactions and is MVCC-based.",
        metadata={"source": "postgres-intro", "category": "database"}
    ),
    Document(
        page_content="pgvector adds vector similarity search using HNSW and IVFFlat indexes.",
        metadata={"source": "pgvector-intro", "category": "extensions"}
    ),
]

vectorstore.add_documents(docs)

For real use cases, split long documents first:

from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
split_docs = splitter.split_documents(raw_docs)
vectorstore.add_documents(split_docs)

Similarity Search

results = vectorstore.similarity_search("How does pgvector index work?", k=4)

results_with_scores = vectorstore.similarity_search_with_score("pgvector index", k=4)
for doc, score in results_with_scores:
    print(f"Score: {score:.4f} | {doc.page_content[:80]}")

Lower scores mean higher similarity when using cosine distance.
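With the default cosine distance strategy, converting a raw distance into a "higher is better" similarity number is a one-liner. A sketch, where the 0.75 cutoff is an invented example to tune for your own corpus:

```python
def cosine_distance_to_similarity(distance: float) -> float:
    """pgvector's cosine distance lies in [0, 2]; similarity is 1 - distance."""
    return 1.0 - distance

def filter_by_similarity(results_with_scores, min_similarity=0.75):
    """Keep only (doc, score) pairs at or above the similarity cutoff."""
    return [
        (doc, score)
        for doc, score in results_with_scores
        if cosine_distance_to_similarity(score) >= min_similarity
    ]
```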

Metadata Filtering

# Filter by a single field
results = vectorstore.similarity_search(
    "database indexing",
    k=4,
    filter={"category": "database"}
)

# Multi-tenant: scope to a specific user
results = vectorstore.similarity_search(
    "vector search",
    k=4,
    filter={"user_id": "user_123"}
)

With use_jsonb=True, this translates to a JSONB containment query that can be indexed.
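Beyond exact-match dicts, langchain-postgres also accepts operator-style filters ($eq, $in, $and, and so on). A sketch of a multi-tenant filter builder; the helper name is mine, not part of the library:

```python
def tenant_filter(user_id, categories):
    """Build a langchain-postgres filter: this user AND any of these categories."""
    return {
        "$and": [
            {"user_id": {"$eq": user_id}},
            {"category": {"$in": list(categories)}},
        ]
    }

# Passed straight through as the filter= argument, e.g.:
# vectorstore.similarity_search("vector search", k=4,
#                               filter=tenant_filter("user_123", ["database"]))
```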

MMR Search for Diverse Results

Maximum Marginal Relevance avoids returning five chunks that all say the same thing:

results = vectorstore.max_marginal_relevance_search(
    "PostgreSQL extensions",
    k=4,
    fetch_k=20,
    lambda_mult=0.5  # 0 = max diversity, 1 = max relevance
)
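For intuition about what lambda_mult actually trades off, here is a minimal pure-Python sketch of greedy MMR selection over pre-normalized embedding vectors. This is the scoring rule, not LangChain's actual implementation:

```python
def mmr_select(query_vec, candidates, k=4, lambda_mult=0.5):
    """Greedily pick candidate indices maximizing
    lambda * relevance - (1 - lambda) * redundancy,
    where redundancy is the max similarity to anything already picked.
    Assumes unit-normalized vectors, so dot product equals cosine similarity."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    selected, remaining = [], list(range(len(candidates)))
    while remaining and len(selected) < k:
        best, best_score = None, float("-inf")
        for i in remaining:
            relevance = dot(query_vec, candidates[i])
            redundancy = max(
                (dot(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
        remaining.remove(best)
    return selected
```

At lambda_mult=1.0 this degenerates to plain top-k by relevance; lowering it penalizes near-duplicates of chunks already selected.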

Building a Retrieval Chain

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

retriever = vectorstore.as_retriever(
    search_type="mmr",
    search_kwargs={"k": 4, "fetch_k": 20}
)

prompt = ChatPromptTemplate.from_template("""
Answer the question based only on the following context:

{context}

Question: {question}
""")

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

chain = (
    {"context": retriever | (lambda docs: "\n\n".join(d.page_content for d in docs)),
     "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

answer = chain.invoke("What indexes does pgvector support?")

For multi-tenant RAG, scope the retriever per authenticated user:

retriever = vectorstore.as_retriever(
    search_kwargs={"k": 4, "filter": {"user_id": current_user_id}}
)

HNSW Indexing for Production

By default, pgvector uses exact k-NN search. For datasets over 100k vectors, add an HNSW index:

CREATE INDEX ON langchain_pg_embedding
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

-- At query time:
SET hnsw.ef_search = 100;

LangChain won't create this index automatically. Run it once after initial load.
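If you'd rather keep that one-off migration in Python next to the rest of your setup code, here is a sketch, assuming the default langchain_pg_embedding table name; the helper names are mine:

```python
from sqlalchemy import text

def hnsw_index_ddl(table="langchain_pg_embedding", m=16, ef_construction=64):
    """Build the one-off HNSW index statement shown above."""
    return (
        f"CREATE INDEX IF NOT EXISTS idx_{table}_hnsw ON {table} "
        f"USING hnsw (embedding vector_cosine_ops) "
        f"WITH (m = {m}, ef_construction = {ef_construction})"
    )

def create_hnsw_index(engine):
    """Run once after the initial load; the index DDL commits with the block."""
    with engine.begin() as conn:
        conn.execute(text(hnsw_index_ddl()))
```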

Connection Pooling

Configure an engine with a pool instead of relying on per-instance connections:

from sqlalchemy import create_engine

engine = create_engine(
    connection_string,
    pool_size=10,
    max_overflow=20,
    pool_pre_ping=True,
)

vectorstore = PGVector(
    embeddings=embeddings,
    collection_name="documents",
    connection=engine,
    use_jsonb=True,
)

Async Usage

from sqlalchemy.ext.asyncio import create_async_engine

# psycopg 3 supports asyncio, so the same connection string works here
async_engine = create_async_engine(connection_string)

vectorstore = PGVector(
    embeddings=embeddings,
    collection_name="documents",
    connection=async_engine,
    use_jsonb=True,
    async_mode=True,
)

results = await vectorstore.asimilarity_search("your query", k=4)
answer = await chain.ainvoke("What indexes does pgvector support?")

Document Upsert and Deletion

# Upsert by ID
vectorstore.add_documents(updated_docs, ids=["doc_id_1", "doc_id_2"])

# Delete by ID
vectorstore.delete(ids=["doc_id_1"])

Summary

pgvector with LangChain is production-ready. The langchain-postgres package handles embeddings storage, similarity search, MMR, and metadata filtering. The underlying database is PostgreSQL, so transactions, JOINs, and your existing operational tooling all work.

Key things to get right: use use_jsonb=True, create an HNSW index before your dataset grows large, configure a connection pool at the engine level, and scope retrievers to the authenticated user for multi-tenant workloads.

If you want a managed PostgreSQL setup with pgvector pre-installed, PgBouncer included, and backups handled: check out Rivestack.

