Allan Roberto

Turning PostgreSQL Into a Vector Database with Docker

To store and query embeddings, we need a database capable of handling vector similarity search.

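A practical solution is PostgreSQL with the pgvector extension.

pgvector adds a vector column type and distance operators, so PostgreSQL can store embeddings and run similarity queries in plain SQL. On larger tables, an optional HNSW or IVFFlat index keeps those queries fast.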


Docker Compose Setup

Create a docker-compose.yml file. The admin/admin credentials below are meant for local development only.

services:
  postgres:
    image: pgvector/pgvector:pg16
    container_name: vector-db
    environment:
      POSTGRES_DB: vectordb
      POSTGRES_USER: admin
      POSTGRES_PASSWORD: admin
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data:

Run:

docker compose up -d

Enable the Extension

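Open a psql session inside the container (for example, docker compose exec postgres psql -U admin -d vectordb) and run:

CREATE EXTENSION IF NOT EXISTS vector;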

Creating a Vector Table

CREATE TABLE document_embedding (
    id BIGSERIAL PRIMARY KEY,
    document_id BIGINT,
    chunk_text TEXT,
    embedding VECTOR(1536)
);

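The declared size must match the output of your embedding model. 1536 is a common choice: it is the output size of OpenAI's text-embedding-ada-002 and text-embedding-3-small, for example. pgvector rejects any insert whose vector length differs from the declared dimension.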
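From Java, one way to write a row into this table is to render the embedding as pgvector's text form ([v1,v2,...]) and let PostgreSQL cast it. The sketch below is an assumption, not code from this series: VectorLiterals, toVectorLiteral, and insertChunk are hypothetical names, and insertChunk needs the PostgreSQL JDBC driver on the classpath plus a live connection, so treat it as untested against your setup.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.StringJoiner;

final class VectorLiterals {
    // Must match the VECTOR(1536) column of document_embedding.
    static final int DIMENSIONS = 1536;

    // The text parameter is cast to the vector type on the server side.
    static final String INSERT_SQL =
        "INSERT INTO document_embedding (document_id, chunk_text, embedding) "
            + "VALUES (?, ?, CAST(? AS vector))";

    // Hypothetical helper: renders a float[] as pgvector text input, e.g. [0.5,1.0,-2.0].
    static String toVectorLiteral(float[] embedding, int expectedDimensions) {
        if (embedding.length != expectedDimensions) {
            throw new IllegalArgumentException(
                "expected " + expectedDimensions + " dimensions, got " + embedding.length);
        }
        StringJoiner joiner = new StringJoiner(",", "[", "]");
        for (float value : embedding) {
            // pgvector does not accept NaN or infinite components.
            if (!Float.isFinite(value)) {
                throw new IllegalArgumentException("embedding contains NaN or an infinite value");
            }
            joiner.add(Float.toString(value));
        }
        return joiner.toString();
    }

    // Hypothetical helper: inserts one chunk using plain JDBC.
    static void insertChunk(Connection conn, long documentId, String chunkText, float[] embedding)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            ps.setLong(1, documentId);
            ps.setString(2, chunkText);
            ps.setString(3, toVectorLiteral(embedding, DIMENSIONS));
            ps.executeUpdate();
        }
    }
}
```

Checking the length before the round trip gives a clearer error than waiting for the database to reject the row, and the same literal format can be bound as the query vector in a similarity search.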


Similarity Search

Example query (the vector literal is abbreviated here; a real query vector must contain all 1536 values):

SELECT chunk_text
FROM document_embedding
ORDER BY embedding <-> '[0.123, 0.443, ...]'
LIMIT 5;

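The <-> operator computes the Euclidean (L2) distance between two vectors, so ORDER BY ... LIMIT 5 returns the five closest chunks. pgvector also provides <=> for cosine distance and <#> for negative inner product.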
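To make the arithmetic behind the query concrete, here is a minimal in-memory sketch of what the database computes: Euclidean distance (what <-> returns), cosine distance (what pgvector's <=> operator returns), and a brute-force equivalent of ORDER BY ... LIMIT k. The class and method names are hypothetical, and this illustrates the math only, not how pgvector is implemented.

```java
import java.util.Arrays;
import java.util.Comparator;

final class DistanceSketch {
    // Euclidean (L2) distance: square root of the summed squared differences.
    static double l2Distance(float[] a, float[] b) {
        requireSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = (double) a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Cosine distance: 1 minus the cosine of the angle between the vectors.
    static double cosineDistance(float[] a, float[] b) {
        requireSameLength(a, b);
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        return 1.0 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Brute-force version of "ORDER BY embedding <-> query LIMIT k":
    // returns the indexes of the k rows closest to the query vector.
    static int[] nearest(float[][] rows, float[] query, int k) {
        Integer[] indexes = new Integer[rows.length];
        for (int i = 0; i < rows.length; i++) {
            indexes[i] = i;
        }
        Arrays.sort(indexes,
            Comparator.comparingDouble((Integer i) -> l2Distance(rows[i], query)));
        return Arrays.stream(indexes).limit(k).mapToInt(Integer::intValue).toArray();
    }

    private static void requireSameLength(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
    }
}
```

This is the same work a sequential scan does: compare the query against every row, then sort. It is fine for small tables, and it is also why an approximate index becomes worthwhile as the table grows.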


Why Use PostgreSQL?

Advantages:

  • production-ready database
  • familiar SQL
  • easy Docker setup
  • integrates well with Java
  • avoids introducing new infrastructure

Sequence

  1. Meaning: How Data Vectorization Powers AI
  2. Turning PostgreSQL Into a Vector Database with Docker
  3. Indexing Knowledge Base Content with Spring Boot and pgvector
  4. Building Semantic Search with Spring Boot, PostgreSQL, and pgvector (RAG Retrieval)

Project Here