Franck Pachot

Posted on Jun 29

Hybrid Search (Full-Text and Vector Similarity) in HorizonDB

#horizondb #postgres #sql #database

You may have read about Hybrid search for Azure HorizonDB. It is presented as combining BM25 full‑text and vector similarity in a single query. But how are they actually combined? The execution plan answers that.

In this post, I use a small synthetic product catalog to ensure the entire demo is reproducible. The text is sufficiently realistic for BM25 queries, and the embeddings are deterministic synthetic vectors, allowing you to run the full script without needing an embedding model. If you have azure_openai.create_embeddings() configured, you can substitute the synthetic embedding function with actual embeddings.

I will test two separate queries, followed by three methods of combining BM25 with vector retrieval:

cascade (BM25 → vector)
cascade (vector → BM25)
hybrid (parallel + fusion)

These are not interchangeable: they represent distinct trade-offs between recall and performance.

Setup

This example uses:

pg_textsearch for BM25 full-text search
pgvector for the vector type and distance operator
pg_diskann for the vector index, when available

The extensions must be listed in azure.extensions to enable CREATE EXTENSION.

In addition, pg_textsearch must be loaded on startup:

Here is my HorizonDB configuration:

postgres=> select version();
                                                        version
-----------------------------------------------------------------------------------------------------------------------
 PostgreSQL 17.9 (Azure HorizonDB (70f3b593ec7)(release)) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 13.2.0, 64-bit

postgres=> show azure.extensions;

             azure.extensions
------------------------------------------
 pg_diskann,vector,pg_textsearch,azure_ai

postgres=> show shared_preload_libraries;

                               shared_preload_libraries
---------------------------------------------------------------------------------------
 pg_textsearch,azure,orion_storage,pg_availability,pg_qs,pgms_stats,pgms_wait_sampling

I've set up the extension features, functions, data types, and operators using CREATE EXTENSIONS and created a schema for this demo:


create extension if not exists vector;
create extension if not exists pg_textsearch;
create extension if not exists pg_diskann;

drop schema if exists hybrid_demo cascade;
create schema hybrid_demo;

set search_path = hybrid_demo, public, pgfts;

For the demo: a deterministic embedding function

I should use azure_openai.create_embeddings(), as I mentioned in the previous blog post. However, for this demo, I opted to create a fake embedding function that maps text to a simple vector(16) based on product keywords. This isn't an embedding model, but it ensures that execution plans are reproducible without relying on a model. It also simplifies the concept of embeddings with a basic, small-dimension LIKE-style semantic vector. The purpose is to demonstrate a search query using both a real text search and a vector-based semantic search.

create or replace function demo_embedding(txt text)
returns vector(16)
language sql
immutable
parallel safe
as $$
select (
  '[' ||
  concat_ws(',',
    case when txt ~* '(mid|century)'       then '1.0' else '0.0' end,
    case when txt ~* '(modern|minimalist)' then '1.0' else '0.0' end,
    case when txt ~* '(chair|seat)'        then '1.0' else '0.0' end,
    case when txt ~* '(table|desk)'        then '1.0' else '0.0' end,
    case when txt ~* '(sofa|couch)'        then '1.0' else '0.0' end,
    case when txt ~* '(lamp|light)'        then '1.0' else '0.0' end,
    case when txt ~* '(wood|walnut|oak)'   then '1.0' else '0.0' end,
    case when txt ~* '(metal|steel)'       then '1.0' else '0.0' end,
    case when txt ~* '(leather)'           then '1.0' else '0.0' end,
    case when txt ~* '(fabric|linen)'      then '1.0' else '0.0' end,
    case when txt ~* '(industrial)'        then '1.0' else '0.0' end,
    case when txt ~* '(scandinavian)'      then '1.0' else '0.0' end,
    case when txt ~* '(office)'            then '1.0' else '0.0' end,
    case when txt ~* '(dining)'            then '1.0' else '0.0' end,
    case when txt ~* '(classic|vintage)'   then '1.0' else '0.0' end,
    case when txt ~* '(outdoor|garden)'    then '1.0' else '0.0' end
  ) ||
  ']'
)::vector;
$$;

Using a straightforward CASE that searches for specific words, I create the semantic vector for a text. This method employs vector similarity solely for demonstration, without utilizing a model. You can envision AI models doing the same but with thousands of dimensions driven by large language models (LLMs) rather than keywords.

Table with text and embeddings

I created a product catalog table with structured fields, text, and a single vector column:

create table products (
  product_id bigint primary key,
  category   text not null,
  brand      text not null,
  price      numeric(10,2) not null,
  title      text not null,
  description text not null,
  search_text text generated always as (
    title || ' ' || description
  ) stored,
  embedding vector(16) not null
);

I loaded a few hundred thousand rows. The data is synthetic but intentionally patterned: products have categories, materials, styles, and terms that are useful for both BM25 and vector search:

insert into products ( product_id, category, brand, price, title, description, embedding)
with generated as (
  select
    g as product_id,
    (array[ 'chair', 'table', 'sofa', 'lamp', 'desk', 'shelf' ])[1 + (g % 6)] as category,
    (array[ 'Contoso', 'Fabrikam', 'Northwind', 'AdventureWorks', 'Wingtip', 'Tailspin' ])[1 + (g % 6)] as brand,
    (array[ 'mid-century modern', 'industrial', 'scandinavian', 'classic', 'minimalist', 'outdoor' ])[1 + (g % 6)] as style,
    (array[ 'walnut wood', 'black metal', 'oak wood', 'leather', 'fabric', 'brushed steel' ])[1 + ((g / 7) % 6)] as material, (25 + (g % 500))::numeric(10,2) as price
  from generate_series(1, 200000) as g
)
select product_id, category, brand, price,
  initcap(style || ' ' || material || ' ' || category) as title,
  concat( style, ' ', category, ' by ', brand, ' with ', material, '. Designed for ',
    case
      when category in ('chair', 'sofa') then 'living room seating'
      when category in ('table', 'desk') then 'home office and dining'
      when category = 'lamp' then 'warm interior lighting'
      else 'storage and display'
    end,
    '. Product code ', product_id, '.'
  ) as description,
  demo_embedding(
    concat_ws(' ', style, material, category, brand)
  ) as embedding
from generated;

I checked a few rows:

postgres=>
 select product_id, title, category, brand, price, search_text, embedding
from products where price >= 42
order by product_id
limit 10;

 product_id |               title               | category |     brand      | price |                                                               search_text                                                               |             embedding
------------+-----------------------------------+----------+----------------+-------+-----------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------
         17 | Outdoor Oak Wood Shelf            | shelf    | Tailspin       | 42.00 | Outdoor Oak Wood Shelf outdoor shelf by Tailspin with oak wood. Designed for storage and display. Product code 17.                      | [0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1]
         18 | Mid-Century Modern Oak Wood Chair | chair    | Contoso        | 43.00 | Mid-Century Modern Oak Wood Chair mid-century modern chair by Contoso with oak wood. Designed for living room seating. Product code 18. | [1,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]
         19 | Industrial Oak Wood Table         | table    | Fabrikam       | 44.00 | Industrial Oak Wood Table industrial table by Fabrikam with oak wood. Designed for home office and dining. Product code 19.             | [0,0,0,1,0,0,1,0,0,0,1,0,0,0,0,0]
         20 | Scandinavian Oak Wood Sofa        | sofa     | Northwind      | 45.00 | Scandinavian Oak Wood Sofa scandinavian sofa by Northwind with oak wood. Designed for living room seating. Product code 20.             | [0,0,0,0,1,0,1,0,0,0,0,1,0,0,0,0]
         21 | Classic Leather Lamp              | lamp     | AdventureWorks | 46.00 | Classic Leather Lamp classic lamp by AdventureWorks with leather. Designed for warm interior lighting. Product code 21.                 | [0,0,0,0,0,1,0,0,1,0,0,0,0,0,1,0]
         22 | Minimalist Leather Desk           | desk     | Wingtip        | 47.00 | Minimalist Leather Desk minimalist desk by Wingtip with leather. Designed for home office and dining. Product code 22.                  | [0,1,0,1,0,0,0,0,1,0,0,0,0,0,0,0]
         23 | Outdoor Leather Shelf             | shelf    | Tailspin       | 48.00 | Outdoor Leather Shelf outdoor shelf by Tailspin with leather. Designed for storage and display. Product code 23.                        | [0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1]
         24 | Mid-Century Modern Leather Chair  | chair    | Contoso        | 49.00 | Mid-Century Modern Leather Chair mid-century modern chair by Contoso with leather. Designed for living room seating. Product code 24.   | [1,1,1,0,0,0,0,0,1,0,0,0,0,0,0,0]
         25 | Industrial Leather Table          | table    | Fabrikam       | 50.00 | Industrial Leather Table industrial table by Fabrikam with leather. Designed for home office and dining. Product code 25.               | [0,0,0,1,0,0,0,0,1,0,1,0,0,0,0,0]
         26 | Scandinavian Leather Sofa         | sofa     | Northwind      | 51.00 | Scandinavian Leather Sofa scandinavian sofa by Northwind with leather. Designed for living room seating. Product code 26.               | [0,0,0,0,1,0,0,0,1,0,0,1,0,0,0,0]


(10 rows)

I tested some searches on search_text and embedding and started indexing those columns.

Create the indexes (BM25 and DiskANN)

First, I created the BM25 full-text index. HorizonDB’s BM25 full-text search brings BM25 ranking into PostgreSQL without a separate Elasticsearch/OpenSearch Search service. It uses the open-source extension pg_textsearch:


postgres=> create index products_bm25_idx
            on products
            using bm25 (search_text)
            with (text_config = 'english')
;

NOTICE:  BM25 index build started for relation products_bm25_idx
NOTICE:  Using text search configuration: english
NOTICE:  Using index options: k1=1.20, b=0.75
NOTICE:  parallel index build: launched 2 of 2 requested workers

NOTICE:  BM25 index build completed: 200000 documents, avg_length=16.17

CREATE INDEX

Then I created the vector index using DiskANN and cosine similarity:


postgres=> create index products_embedding_diskann_idx
            on products
            using diskann (embedding vector_cosine_ops)
;

CREATE INDEX

I gathered the statistics:


postgres=> vacuum analyze products;

ANALYZE

My data set is ready for queries.

Query 1: BM25 only

I started with a keyword search, using to_bm25query() to define the BM25 query. Ranking uses BM25 and is performed with the <@> operator. Top-k queries use this operator in ORDER BY ... LIMIT.

BM25 is implemented as an index-backed operator that must be bound to a specific index. This is why prepared statements require explicitly naming the index:


postgres=> prepare query1 (text, int) as
select
  p.product_id,
  p.title,
  p.category,
  p.brand,
  p.price
from products p
order by p.search_text <@> to_bm25query( $1 , 'products_bm25_idx' )
limit $2;

postgres=> explain (analyze, buffers, verbose, costs off) 
           execute query1 ('mid century modern wooden chair', 10)
;

                                                                 QUERY PLAN                                                          
---------------------------------------------------------------------------------------------------------------------------------------------
 Limit (actual time=0.268..0.442 rows=10 loops=1)
   Output: product_id, title, category, brand, price, ((search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query))
   Buffers: shared hit=582
   ->  Index Scan using products_bm25_idx on hybrid_demo.products p (actual time=0.267..0.440 rows=10 loops=1)
         Output: product_id, title, category, brand, price, (search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query)
         Order By: (p.search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query)
         Buffers: shared hit=582
 Query Identifier: -4837396746836655100
 Planning:
   Buffers: shared hit=1
 Planning Time: 0.116 ms
 Execution Time: 0.460 ms

(12 rows)

The Index Scan returns the Top-10 result ('rows=10') directly in ranking order (Order By).

Lexical retrieval is good for exact words, rare terms, product codes, and anything where the user expects the same token to appear in the document.

Query 2: ANN only

I further explored the semantic aspect by performing similarity search using the cosine distance operator (<=>) for vectors:

postgres=> prepare query2 (text, int) as
select
  p.product_id,
  p.title,
  p.category,
  p.brand,
  p.price,
  p.embedding <=> demo_embedding($1) as distance
from products p
order by p.embedding <=> demo_embedding($1)
limit $2;

postgres=> explain (analyze, buffers, verbose, costs off)
execute query2 ('modern wooden chair', 10);

                                                         QUERY PLAN                                                     
----------------------------------------------------------------------------------------------------------------------------
 Limit (actual time=0.374..0.399 rows=10 loops=1)
   Output: product_id, title, category, brand, price, ((embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector))
   Buffers: shared hit=562
   ->  Index Scan using products_embedding_diskann_idx on hybrid_demo.products p (actual time=0.374..0.397 rows=10 loops=1)
         Output: product_id, title, category, brand, price, (embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector)
         Order By: (p.embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector)
         Buffers: shared hit=562
 Query Identifier: 2112846290524187050
 Planning:
   Buffers: shared hit=1
 Planning Time: 0.275 ms
 Execution Time: 0.414 ms

(12 rows)

The Index Scan returns the Top-10 result (rows=10) directly in ranking order (Order By).

The key distinction is that it doesn't search for words. Instead, it finds rows near the query vector. Using the synthetic vector function, this refers to rows that share similar dimensions. With actual embeddings, it translates to model similarity.

Query 3: BM25 first, then vector search

A straightforward approach is to use BM25 as a preliminary filter before vector search. Here, the vector index is bypassed, and an accurate nearest neighbors search is performed on the candidates selected by BM25:


postgres=> prepare query3 (text, text, int, int) as
with
bm25_candidates as (
  select *
  from products p
  order by p.search_text <@> to_bm25query($1, 'products_bm25_idx')
  limit $3
)
select
  p.product_id, p.title, p.category, p.brand, p.price
from bm25_candidates p
order by p.embedding <=> demo_embedding($2)
limit $4;

postgres=> explain (analyze, buffers, verbose, costs off)
execute query3 (
  'mid century modern wooden chair',
  'modern wooden chair',
  5000,
  10
);

                                                                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit (actual time=97.529..97.531 rows=10 loops=1)
   Output: p.product_id, p.title, p.category, p.brand, p.price, ((p.embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector))
   Buffers: shared hit=29482
   ->  Sort (actual time=97.528..97.529 rows=10 loops=1)
         Output: p.product_id, p.title, p.category, p.brand, p.price, ((p.embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector))
         Sort Key: ((p.embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector))
         Sort Method: top-N heapsort  Memory: 26kB
         Buffers: shared hit=29482
         ->  Subquery Scan on p (actual time=3.312..96.508 rows=5000 loops=1)
               Output: p.product_id, p.title, p.category, p.brand, p.price, (p.embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector)
               Buffers: shared hit=29482
               ->  Limit (actual time=3.310..95.687 rows=5000 loops=1)
                     Output: p_1.product_id, p_1.category, p_1.brand, p_1.price, p_1.title, NULL::text, NULL::text, p_1.embedding, ((p_1.search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query))
                     Buffers: shared hit=29482
                     ->  Index Scan using products_bm25_idx on hybrid_demo.products p_1 (actual time=3.309..95.361 rows=5000 loops=1)
                           Output: p_1.product_id, p_1.category, p_1.brand, p_1.price, p_1.title, NULL::text, NULL::text, p_1.embedding, (p_1.search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query)
                           Order By: (p_1.search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query)
                           Buffers: shared hit=29482
 Query Identifier: 2456399702791328288
 Planning:
   Buffers: shared hit=1
 Planning Time: 0.328 ms
 Execution Time: 97.560 ms
(23 rows)

The execution plan features a two-phase pipeline:

Phase 1 (BM25): An index scan on products_bm25_idx retrieves the top 5000 candidates sorted by BM25 score (~95 ms).
Phase 2 (vector): A top‑N heapsort calculates exact vector distances on these 5000 rows (~2 ms).

This method uses BM25 to generate candidates, followed by vector ranking. It performs well when the query includes strong keywords, but it is not a hybrid search: BM25 sets recall, and the vector step only reranks.

Therefore, a sufficiently large candidate set (LIMIT $3) is necessary to avoid missing relevant results. Increasing this limit boosts recall but also raises latency, as most of the time is spent in the BM25 top‑K retrieval.

Query 4: ANN first, then text search

An alternative approach is to utilize the vector index (ANN) followed by text search as a post-filter. Here, the BM25 index is not used, and text ranking is only applied to the candidates identified by the vector search:


postgres=> prepare query4 (text, text, int, int) as
with
ann_candidates as (
  select *
  from products p
  order by p.embedding <=> demo_embedding($1)
  limit $3
)
select
  p.product_id, p.title, p.category, p.brand, p.price
from ann_candidates p
order by p.search_text <@> to_bm25query($2, 'products_bm25_idx')
limit $4;

postgres=> explain (analyze, buffers, verbose, costs off)
execute query4 (
  'modern wooden chair',
  'mid century modern wooden chair',
  5000,
  10
);

                                                                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit (actual time=250.982..250.984 rows=10 loops=1)
   Output: p.product_id, p.title, p.category, p.brand, p.price, ((p.search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query))
   Buffers: shared hit=74975
   ->  Sort (actual time=250.981..250.982 rows=10 loops=1)
         Output: p.product_id, p.title, p.category, p.brand, p.price, ((p.search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query))
         Sort Key: ((p.search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query))
         Sort Method: top-N heapsort  Memory: 26kB
         Buffers: shared hit=74975
         ->  Subquery Scan on p (actual time=26.789..249.873 rows=5000 loops=1)
               Output: p.product_id, p.title, p.category, p.brand, p.price, (p.search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query)
               Buffers: shared hit=74975
               ->  Limit (actual time=26.681..153.452 rows=5000 loops=1)
                     Output: p_1.product_id, p_1.category, p_1.brand, p_1.price, p_1.title, NULL::text, p_1.search_text, NULL::vector(16), ((p_1.embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector))
                     Buffers: shared hit=59825
                     ->  Index Scan using products_embedding_diskann_idx on hybrid_demo.products p_1 (actual time=26.681..153.072 rows=5000 loops=1)
                           Output: p_1.product_id, p_1.category, p_1.brand, p_1.price, p_1.title, NULL::text, p_1.search_text, NULL::vector(16), (p_1.embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector)
                           Order By: (p_1.embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector)
                           Buffers: shared hit=59825
 Query Identifier: 778851517376612713
 Planning:
   Buffers: shared hit=1
 Planning Time: 0.311 ms
 Execution Time: 251.015 ms

(23 rows)

The execution plan displays the same two-phase pipeline but in reverse order:

Phase 1 (vector ANN): The vector index (DiskANN) retrieves the 5000 nearest neighbors based on embedding similarity, which is quick and index-driven.
- Phase 2 (BM25): A top‑N heapsort calculates BM25 scores for those 5000 entries. Since ranking occurs after the CTE materialization, no BM25 index is used.

This setup is not a hybrid search. Instead, the vector phase defines recall, and BM25 only fine-tunes the ranking.

Query 5: hybrid search with RRF

A hybrid search merges candidates from both search types. I run a query that performs these steps:

Executes a text search and retains the top 50 results.
Performs a vector search and keeps the top 50.
Ranks the small candidate lists.
Merges the candidate sets.
Uses Reciprocal Rank Fusion (RRF).
Outputs a single ranked result set.

The full query defines each step as a CTE in a WITH clause. While the query planner typically inlines these by default, I added MATERIALIZE for the demo to visualize each search and ranking stage separately:


postgres=> prepare query5 (text, text, int, int) as
with
bm25 as (
  select
    p.product_id,
    p.search_text <@> to_bm25query($1, 'products_bm25_idx') as score
  from products p
  order by p.search_text <@> to_bm25query($1, 'products_bm25_idx')
  limit $3
),
bm25_ranked as (
  select
    product_id,
    row_number() over (order by score) as bm25_rank
  from bm25
),
vec as (
  select
    p.product_id,
    p.embedding <=> demo_embedding($2) as distance
  from products p
  order by p.embedding <=> demo_embedding($2)
  limit $3
),
vec_ranked as (
  select
    product_id,
    row_number() over (order by distance) as vec_rank
  from vec
),
candidates as (
  select product_id from bm25_ranked
  union
  select product_id from vec_ranked
)
select
  p.product_id,
  p.title,
  p.category,
  p.brand,
  p.price,
  b.bm25_rank,
  v.vec_rank,
  (
    1.0 / (60 + coalesce(b.bm25_rank, 1000))
    +
    1.0 / (60 + coalesce(v.vec_rank, 1000))
  ) as rrf_score
from candidates c
join products p using (product_id)
left join bm25_ranked b using (product_id)
left join vec_ranked  v using (product_id)
order by rrf_score desc
limit $4;

postgres=> explain (analyze, buffers, verbose, costs off)
execute query5 (
  'mid century modern wooden chair',
  'modern wooden chair',
  50,
  10
);

                                                                                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit (actual time=84.994..85.001 rows=10 loops=1)
   Output: p.product_id, p.title, p.category, p.brand, p.price, b.bm25_rank, v.vec_rank, (((1.0 / ((60 + COALESCE(b.bm25_rank, '1000'::bigint)))::numeric) + (1.0 / ((60 + COALESCE(v.vec_rank, '1000'::bigint)))::numeric)))
   Buffers: shared hit=34010
   CTE bm25
     ->  Limit (actual time=0.260..1.180 rows=50 loops=1)
           Output: p_1.product_id, ((p_1.search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query))
           Buffers: shared hit=803
           ->  Index Scan using products_bm25_idx on hybrid_demo.products p_1 (actual time=0.260..1.176 rows=50 loops=1)
                 Output: p_1.product_id, (p_1.search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query)
                 Order By: (p_1.search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query)
                 Buffers: shared hit=803
   CTE bm25_ranked
     ->  WindowAgg (actual time=0.263..1.202 rows=50 loops=1)
           Output: bm25.product_id, row_number() OVER (?), bm25.score
           Buffers: shared hit=803
           ->  CTE Scan on bm25 (actual time=0.261..1.189 rows=50 loops=1)
                 Output: bm25.score, bm25.product_id
                 Buffers: shared hit=803
   CTE vec
     ->  Limit (actual time=26.758..83.367 rows=50 loops=1)
           Output: p_2.product_id, ((p_2.embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector))
           Buffers: shared hit=32807
           ->  Index Scan using products_embedding_diskann_idx on hybrid_demo.products p_2 (actual time=26.757..83.360 rows=50 loops=1)
                 Output: p_2.product_id, (p_2.embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector)
                 Order By: (p_2.embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector)
                 Buffers: shared hit=32807
   CTE vec_ranked
     ->  WindowAgg (actual time=26.764..83.400 rows=50 loops=1)
           Output: vec.product_id, row_number() OVER (?), vec.distance
           Buffers: shared hit=32807
           ->  CTE Scan on vec (actual time=26.760..83.382 rows=50 loops=1)
                 Output: vec.distance, vec.product_id
                 Buffers: shared hit=32807
   ->  Sort (actual time=84.993..84.996 rows=10 loops=1)
         Output: p.product_id, p.title, p.category, p.brand, p.price, b.bm25_rank, v.vec_rank, (((1.0 / ((60 + COALESCE(b.bm25_rank, '1000'::bigint)))::numeric) + (1.0 / ((60 + COALESCE(v.vec_rank, '1000'::bigint)))::numeric)))
         Sort Key: (((1.0 / ((60 + COALESCE(b.bm25_rank, '1000'::bigint)))::numeric) + (1.0 / ((60 + COALESCE(v.vec_rank, '1000'::bigint)))::numeric))) DESC
         Sort Method: top-N heapsort  Memory: 27kB
         Buffers: shared hit=34010
         ->  Hash Left Join (actual time=84.714..84.944 rows=100 loops=1)
               Output: p.product_id, p.title, p.category, p.brand, p.price, b.bm25_rank, v.vec_rank, ((1.0 / ((60 + COALESCE(b.bm25_rank, '1000'::bigint)))::numeric) + (1.0 / ((60 + COALESCE(v.vec_rank, '1000'::bigint)))::numeric))
               Hash Cond: (bm25_ranked.product_id = v.product_id)
               Buffers: shared hit=34010
               ->  Hash Left Join (actual time=84.698..84.887 rows=100 loops=1)
                     Output: bm25_ranked.product_id, p.product_id, p.title, p.category, p.brand, p.price, b.bm25_rank
                     Hash Cond: (bm25_ranked.product_id = b.product_id)
                     Buffers: shared hit=34010
                     ->  Nested Loop (actual time=84.679..84.855 rows=100 loops=1)
                           Output: bm25_ranked.product_id, p.product_id, p.title, p.category, p.brand, p.price
                           Inner Unique: true
                           Buffers: shared hit=34010
                           ->  HashAggregate (actual time=84.659..84.668 rows=100 loops=1)
                                 Output: bm25_ranked.product_id
                                 Group Key: bm25_ranked.product_id
                                 Batches: 1  Memory Usage: 24kB
                                 Buffers: shared hit=33610
                                 ->  Append (actual time=0.265..84.634 rows=100 loops=1)
                                       Buffers: shared hit=33610
                                       ->  CTE Scan on bm25_ranked (actual time=0.265..1.212 rows=50 loops=1)
                                             Output: bm25_ranked.product_id
                                             Buffers: shared hit=803
                                       ->  CTE Scan on vec_ranked (actual time=26.765..83.413 rows=50 loops=1)
                                             Output: vec_ranked.product_id
                                             Buffers: shared hit=32807
                           ->  Index Scan using products_pkey on hybrid_demo.products p (actual time=0.002..0.002 rows=1 loops=100)
                                 Output: p.product_id, p.category, p.brand, p.price, p.title, p.description, p.search_text, p.embedding
                                 Index Cond: (p.product_id = bm25_ranked.product_id)
                                 Buffers: shared hit=400
                     ->  Hash (actual time=0.011..0.011 rows=50 loops=1)
                           Output: b.bm25_rank, b.product_id
                           Buckets: 1024  Batches: 1  Memory Usage: 11kB
                           ->  CTE Scan on bm25_ranked b (actual time=0.001..0.006 rows=50 loops=1)
                                 Output: b.bm25_rank, b.product_id
               ->  Hash (actual time=0.009..0.009 rows=50 loops=1)
                     Output: v.vec_rank, v.product_id
                     Buckets: 1024  Batches: 1  Memory Usage: 11kB
                     ->  CTE Scan on vec_ranked v (actual time=0.000..0.004 rows=50 loops=1)
                           Output: v.vec_rank, v.product_id
 Query Identifier: 6796811579565214126
 Planning:
   Buffers: shared hit=2
 Planning Time: 0.515 ms
 Execution Time: 85.087 ms
(82 rows)

Here is how the hybrid search with PostgreSQL extensions (pg_textsearch for BM25, pg_diskann for vectors) works in this query:

BM25 branch: Index Scan using products_bm25_idx → 50 rows in ~1 ms
Vector branch: Index Scan using products_embedding_diskann_idx → 50 rows in ~80 ms
Fusion layer: 100 candidate IDs from the two branches (Append), looked up by primary key (Index Scan products_pkey via Nested Loop), joined with their ranks (Hash Join), then a final top‑N sort on the RRF score.

RRF, or Reciprocal Rank Fusion, is a straightforward method for combining multiple rankings. It uses a constant (usually 60) to smooth out rank differences, ensuring that an item must perform well across several lists rather than just one to stand out. Since RRF operates on ranks rather than raw scores, it sidesteps the issue of comparing different scoring systems, such as BM25 and vector scores, which are on different scales.

Without MATERIALIZE, the query shape is:

 Limit
   CTE bm25_ranked
     ->  WindowAgg
           ->  Subquery Scan on bm25
                 ->  Limit
                       ->  Index Scan using products_bm25_idx on products p_1
                             Order By: (search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query)
   CTE vec_ranked
     ->  WindowAgg
           ->  Subquery Scan on vec
                 ->  Limit
                       ->  Index Scan using products_embedding_diskann_idx on products p_2
                             Order By: (embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector)
   ->  Sort
         Sort Key: (((1.0 / ((60 + COALESCE(b.bm25_rank, '1000'::bigint)))::numeric) + (1.0 / ((60 + COALESCE(v.vec_rank, '1000'::bigint)))::numeric))) DESC
         ->  Hash Left Join
               Hash Cond: (bm25_ranked.product_id = v.product_id)
               ->  Hash Left Join
                     Hash Cond: (bm25_ranked.product_id = b.product_id)
                     ->  Nested Loop
                           ->  HashAggregate
                                 Group Key: bm25_ranked.product_id
                                 ->  Append
                                       ->  CTE Scan on bm25_ranked
                                       ->  CTE Scan on vec_ranked
                           ->  Index Scan using products_pkey on products p
                                 Index Cond: (product_id = bm25_ranked.product_id)
                     ->  Hash
                           ->  CTE Scan on bm25_ranked b
               ->  Hash
                     ->  CTE Scan on vec_ranked v

This is a hybrid search method in HorizonDB that involves two top‑k retrievals and a ranking layer expressed in SQL. No single hybrid index scan is used. PostgreSQL employs different access methods for text and vector search, with fusion occurring above these scans.

This approach offers a database‑friendly way to implement hybrid search that is explicit and easy to debug: you can see both branches in the plan (CTE), adjust the candidate size (LIMIT 50), balance BM25 versus vector scores (rrf_score), which can be swapped with a reranker, and inspect each candidate list separately, including the number of rows and buffers read.

In HorizonDB, hybrid search isn't about merging indexes into a single search engine. It involves correctly combining multiple top‑k retrievals within a SQL statement.

Conclusion

Hybrid search in HorizonDB isn't a new index type or access method. It's a query pattern created with PostgreSQL extensions.

Execution plans clarify this: BM25 and vector search operate as separate index scans, each with its own ranking system. PostgreSQL doesn’t combine them into a single operator. Instead, it performs independent top‑k retrievals and merges the results within SQL.

From the examples, three patterns emerge:

BM25 → vector: lexical search defines recall, vector search reranks
Vector → BM25: semantic search defines recall, text search reranks
BM25 + vector (RRF): both enhance recall, and their results are fused

Of these, only the last is truly hybrid. The first two are cascades in which one stage generates candidates, and the other refines them.

This distinction affects both relevance and performance. In cascade queries, recall is constrained by the first stage: anything not retrieved at that stage is lost. In hybrid queries, both branches contribute, boosting recall but requiring two index scans.

Plan analysis also shows where time is spent. In my examples, nearly all the cost is in the top‑k retrieval (BM25 or ANN), while reranking is less expensive. The primary parameter to tune is candidate size, not the scoring method.

A practical point: RRF uses ranks instead of raw scores to avoid normalization issues between BM25 and vector metrics.

Hybrid search in PostgreSQL is explicit and modular: separate retrievals combined within SQL. This makes execution plans clear and trade-offs transparent.

DEV Community