DEV Community

Allan Roberto


Turning PostgreSQL Into a Vector Database with Docker

To store and query embeddings, we need a database capable of handling vector similarity search.

A practical solution is using PostgreSQL with the pgvector extension.

pgvector adds a vector column type to PostgreSQL, along with distance operators and index types for similarity queries.


Docker Compose Setup

Create a docker-compose.yml.

services:
  postgres:
    image: pgvector/pgvector:pg16
    container_name: vector-db
    environment:
      POSTGRES_DB: vectordb
      POSTGRES_USER: admin
      POSTGRES_PASSWORD: admin
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data:

Run:

docker compose up -d
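Once the container is up, clients need a connection URL built from the values in the compose file. The sketch below derives both the libpq-style URI and the JDBC URL from those settings. The helper names `libpq_url` and `jdbc_url` are hypothetical, and `localhost` assumes you connect from the Docker host through the published port.

```python
from urllib.parse import quote

# Values copied from the docker-compose.yml in this article.
DB_HOST = "localhost"  # assumption: connecting from the Docker host
DB_PORT = 5432
DB_NAME = "vectordb"
DB_USER = "admin"
DB_PASSWORD = "admin"


def libpq_url(user, password, host, port, dbname):
    """Build a postgresql:// URI, percent-encoding the credentials."""
    user_enc = quote(user, safe="")
    password_enc = quote(password, safe="")
    return f"postgresql://{user_enc}:{password_enc}@{host}:{port}/{dbname}"


def jdbc_url(host, port, dbname):
    """Build the JDBC URL; user and password are usually passed separately."""
    return f"jdbc:postgresql://{host}:{port}/{dbname}"


if __name__ == "__main__":
    print(libpq_url(DB_USER, DB_PASSWORD, DB_HOST, DB_PORT, DB_NAME))
    print(jdbc_url(DB_HOST, DB_PORT, DB_NAME))
```

Percent-encoding the credentials matters as soon as the password contains characters such as `@` or `/`, which would otherwise break URI parsing.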

Enable the Extension

Open a psql session in the container (docker exec -it vector-db psql -U admin -d vectordb) and run:

CREATE EXTENSION IF NOT EXISTS vector;

Creating a Vector Table

CREATE TABLE document_embedding (
    id BIGSERIAL PRIMARY KEY,
    document_id BIGINT,
    chunk_text TEXT,
    embedding VECTOR(1536)
);

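1536 is the embedding size of several widely used models, for example OpenAI's text-embedding-ada-002 and text-embedding-3-small. The VECTOR(n) dimension must match the output size of the model you use, because pgvector rejects vectors of a different length.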
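When sending an embedding to PostgreSQL as text, pgvector expects a bracketed, comma-separated list such as `[0.1,0.2,0.3]`. The following is a minimal sketch of a helper that checks the dimension before building that literal. The function name `to_vector_literal` is hypothetical, and the default of 1536 assumes the table definition above.

```python
import math

EMBEDDING_DIM = 1536  # must match VECTOR(1536) in the table definition


def to_vector_literal(values, dim=EMBEDDING_DIM):
    """Format a sequence of numbers as pgvector's text form, e.g. "[0.1,0.2]"."""
    if len(values) != dim:
        raise ValueError(f"expected {dim} dimensions, got {len(values)}")
    floats = [float(v) for v in values]
    # Fail early on values that are not finite numbers.
    if not all(math.isfinite(v) for v in floats):
        raise ValueError("embedding contains NaN or infinity")
    return "[" + ",".join(repr(v) for v in floats) + "]"


if __name__ == "__main__":
    # Toy 3-dimensional example; real embeddings would use dim=1536.
    print(to_vector_literal([0.123, 0.443, 0.5], dim=3))
```

The resulting string can be passed as a query parameter and cast with `::vector` on the SQL side, which keeps the validation in application code and the distance math in the database.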


Similarity Search

Example query:

SELECT chunk_text
FROM document_embedding
ORDER BY embedding <-> '[0.123, 0.443, ...]'
LIMIT 5;

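The <-> operator computes the Euclidean (L2) distance between two vectors, so the query returns the five chunks closest to the query vector. pgvector also provides <=> for cosine distance and <#> for negative inner product. The vector literal is abbreviated here; a real query must supply all 1536 values.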
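To make the ordering concrete, here is a small in-memory sketch of what `ORDER BY embedding <-> query LIMIT 5` computes, given that `<->` is Euclidean (L2) distance in pgvector. The function names and the toy 2-dimensional rows are illustrative assumptions; the database does the same ranking over 1536-dimensional vectors.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query, rows, limit=5):
    """Return chunk texts ordered by ascending distance to the query.

    rows is a list of (chunk_text, embedding) pairs.
    """
    ranked = sorted(rows, key=lambda row: l2_distance(row[1], query))
    return [text for text, _ in ranked[:limit]]


if __name__ == "__main__":
    rows = [
        ("far", [10.0, 10.0]),
        ("near", [1.0, 0.0]),
        ("mid", [3.0, 4.0]),
    ]
    print(nearest([0.0, 0.0], rows, limit=2))
```

This version scans every row, which is also what PostgreSQL does for the query above when the embedding column has no vector index.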


Why Use PostgreSQL?

Advantages:

  • production-ready database
  • familiar SQL
  • easy Docker setup
  • integrates well with Java
  • avoids introducing new infrastructure

Next Article

Now that the database is ready, the next step is indexing a knowledge base using Spring Boot.
Project Here
