Stop Syncing Elasticsearch: Native Hybrid Search with Spring AI and Pgvector sparsevec

#java #systemdesign #ai #llm


Spin up a separate Elasticsearch cluster just for keyword search alongside your Postgres database, and you pay for it in synchronization lag and infrastructure overhead. pgvector has shipped a sparsevec type since version 0.7.0, so in 2026 you can run hybrid dense-sparse search natively inside PostgreSQL and query it from a Spring AI application.

If you're prepping for interviews, I've been building javalld.com — real machine coding problems with full execution traces.

Why Most Developers Get This Wrong

Maintaining dual-database architectures: Running Postgres + Elasticsearch requires fragile Outbox patterns or CDC tools (like Debezium) just to keep search indexes in sync.
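Ignoring sparsevec types: Storing sparse embeddings (like SPLADE) in dense vector columns. A SPLADE vector has one dimension per vocabulary token (about 30,000 for BERT-based models) and is almost entirely zeros, so a dense column stores mostly padding and exceeds the 2,000-dimension limit pgvector puts on indexed vector columns.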
Client-side merging: Fetching separate results for keyword and semantic queries and merging them in Java heap memory instead of offloading Reciprocal Rank Fusion (RRF) to the database.
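
Reciprocal Rank Fusion itself is only a few lines of arithmetic, which is part of why it moves into SQL so easily. As a reference for what the fused score means, here is a minimal Java sketch. The class name, document ids, and rank lists are made up for illustration, and k = 60 is the conventional constant. It illustrates the formula only; it is not a recommendation to merge in application code.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RrfSketch {
    static final int K = 60; // smoothing constant, the conventional RRF default

    // Fuses ranked id lists: every list adds 1 / (K + rank) to a document's score,
    // where rank is the 1-based position of the document in that list.
    static Map<Long, Double> fuse(List<List<Long>> rankedLists) {
        Map<Long, Double> scores = new LinkedHashMap<>();
        for (List<Long> ranked : rankedLists) {
            for (int i = 0; i < ranked.size(); i++) {
                scores.merge(ranked.get(i), 1.0 / (K + i + 1), Double::sum);
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        List<Long> dense = List.of(7L, 3L, 9L);  // hypothetical semantic ranking
        List<Long> sparse = List.of(3L, 5L, 7L); // hypothetical keyword ranking
        fuse(List.of(dense, sparse)).entrySet().stream()
            .sorted(Map.Entry.<Long, Double>comparingByValue().reversed())
            .forEach(e -> System.out.printf("%d -> %.5f%n", e.getKey(), e.getValue()));
    }
}
```

Document 3 comes out on top because it sits near the head of both lists (1/62 + 1/61), ahead of document 7 (1/61 + 1/63). Documents that appear in only one list still get a score, just a smaller one.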

The Right Way

Consolidate your RAG pipeline into a single Postgres instance using pgvector's sparsevec for SPLADE/BM25 sparse vectors and vector for dense embeddings, queried via Spring AI.

Dual-Embedding Generation: Generate dense embeddings (e.g., text-embedding-3-small) through Spring AI's EmbeddingModel and sparse embeddings (SPLADE) through your own encoder component, both in the same ingestion pipeline.
Single-Table Storage: Store both embeddings in the same PostgreSQL table using vector and sparsevec column types.
In-Database RRF: Execute hybrid search as a single SQL query in which each branch pulls its top candidates through its own HNSW index and Reciprocal Rank Fusion merges the two ranked lists.
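
The single-table layout can be sketched as the DDL below, held in a Java constant so it can be run from a migration or a test. The table and column names follow the query in this article. The index names and the two dimensions are assumptions: 1536 is the default output size of text-embedding-3-small, and 30522 is the BERT WordPiece vocabulary that common SPLADE checkpoints use. Note that pgvector's HNSW index accepts sparsevec values with up to 1,000 non-zero elements, so very dense SPLADE outputs may need pruning.

```java
public final class HybridSchema {
    private HybridSchema() {}

    // Assumed dimensions: 1536 (text-embedding-3-small) and 30522 (SPLADE / BERT vocabulary).
    // Adjust both to the models you actually run.
    public static final String DDL = """
        CREATE EXTENSION IF NOT EXISTS vector;

        CREATE TABLE IF NOT EXISTS document (
            id         bigserial PRIMARY KEY,
            content    text NOT NULL,
            embedding  vector(1536),
            sparse_emb sparsevec(30522)
        );

        -- One HNSW index per embedding column; the operator class must match
        -- the distance operator used in the query (<=> is cosine distance).
        CREATE INDEX IF NOT EXISTS document_embedding_hnsw
            ON document USING hnsw (embedding vector_cosine_ops);
        CREATE INDEX IF NOT EXISTS document_sparse_emb_hnsw
            ON document USING hnsw (sparse_emb sparsevec_cosine_ops);
        """;

    public static void main(String[] args) {
        System.out.println(DDL);
    }
}
```

Keeping both columns on the same row is what lets a single INSERT, inside one transaction, make a document searchable on both the dense and the sparse side.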

Show Me The Code (or Example)

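// Native RRF hybrid search with Spring Data JPA and pgvector.
// Each branch uses ORDER BY distance + LIMIT so the planner can use that column's HNSW index;
// :candidates (an added parameter, e.g. 50) caps how many rows each branch contributes.
// Returning the entity assumes Document maps only id and content; otherwise use a projection.
@Query(value = """
    WITH dense AS (
        SELECT id, row_number() OVER (ORDER BY dist) AS rank
        FROM (SELECT id, embedding <=> cast(:denseQuery AS vector) AS dist
              FROM document ORDER BY dist LIMIT :candidates) d),
    sparse AS (
        SELECT id, row_number() OVER (ORDER BY dist) AS rank
        FROM (SELECT id, sparse_emb <=> cast(:sparseQuery AS sparsevec) AS dist
              FROM document ORDER BY dist LIMIT :candidates) s)
    SELECT doc.id, doc.content
    FROM dense FULL OUTER JOIN sparse ON dense.id = sparse.id
    JOIN document doc ON doc.id = COALESCE(dense.id, sparse.id)
    ORDER BY COALESCE(1.0 / (60 + dense.rank), 0) + COALESCE(1.0 / (60 + sparse.rank), 0) DESC
    LIMIT :limit
    """, nativeQuery = true)
List<Document> hybridSearch(@Param("denseQuery") String dense, @Param("sparseQuery") String sparse, @Param("candidates") int candidates, @Param("limit") int limit);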
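
The repository method takes both query vectors as strings and casts them in SQL, so the caller has to produce pgvector's text formats: `[v1,v2,...]` for vector and `{index:value,...}/dimensions` for sparsevec, where sparse indices start at 1. The helper below is a minimal sketch of that conversion. The class and method names are hypothetical, and it assumes the sparse encoder returns a map from 0-based token id to weight.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class PgvectorLiterals {
    // Dense text format: [v1,v2,...]
    static String toVectorLiteral(float[] values) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < values.length; i++) {
            if (i > 0) sb.append(',');
            sb.append(values[i]);
        }
        return sb.append(']').toString();
    }

    // Sparse text format: {index:value,...}/dimensions, with 1-based indices.
    // zeroBasedWeights maps a 0-based token id to its weight (an assumed encoder output shape);
    // zero weights are dropped and entries are emitted in ascending index order.
    static String toSparsevecLiteral(Map<Integer, Float> zeroBasedWeights, int dimensions) {
        String body = new TreeMap<Integer, Float>(zeroBasedWeights).entrySet().stream()
            .filter(e -> e.getValue() != 0f)
            .map(e -> (e.getKey() + 1) + ":" + e.getValue())
            .collect(Collectors.joining(","));
        return "{" + body + "}/" + dimensions;
    }

    public static void main(String[] args) {
        System.out.println(toVectorLiteral(new float[] {0.25f, -1f}));
        System.out.println(toSparsevecLiteral(Map.of(0, 1.5f, 4, 0.5f), 30522));
    }
}
```

The two resulting strings are what get bound to the denseQuery and sparseQuery parameters. The dimension after the slash has to match the declared size of the sparsevec column.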

Key Takeaways

Kill the Sync Lag: Eliminating Elasticsearch means zero-lag document indexing; your ACID transactions cover both relational data and vector embeddings.
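Leverage sparsevec: Pgvector's native sparsevec type stores only the non-zero entries of a high-dimensional SPLADE vector, so storage scales with the number of active tokens rather than the vocabulary size. Keep vectors under 1,000 non-zero elements if you want them HNSW-indexed.
Spring AI Integration: Use Spring AI's EmbeddingModel for the dense side and plug in your own SPLADE encoder for the sparse side, so both embeddings are generated in one pipeline before persisting.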