You may have read about Hybrid search for Azure HorizonDB. It is presented as combining BM25 full‑text and vector similarity in a single query. But how are they actually combined? The execution plan answers that.
In this post, I use a small synthetic product catalog to ensure the entire demo is reproducible. The text is sufficiently realistic for BM25 queries, and the embeddings are deterministic synthetic vectors, allowing you to run the full script without needing an embedding model. If you have azure_openai.create_embeddings() configured, you can substitute the synthetic embedding function with actual embeddings.
I will test two separate queries, followed by three methods of combining BM25 with vector retrieval:
- cascade (BM25 → vector)
- cascade (vector → BM25)
- hybrid (parallel + fusion)
These are not interchangeable; they represent distinct trade-offs between recall and performance.
Setup
This example uses:
-
pg_textsearchfor BM25 full-text search -
pgvectorfor the vector type and distance operator -
pg_diskannfor the vector index, when available
The extensions must be listed in azure.extensions to enable CREATE EXTENSION.
In addition, pg_textsearch must be loaded on startup:
Here is my HorizonDB configuration:
postgres=> select version();
version
-----------------------------------------------------------------------------------------------------------------------
PostgreSQL 17.9 (Azure HorizonDB (70f3b593ec7)(release)) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 13.2.0, 64-bit
postgres=> show azure.extensions;
azure.extensions
------------------------------------------
pg_diskann,vector,pg_textsearch,azure_ai
postgres=> show shared_preload_libraries;
shared_preload_libraries
---------------------------------------------------------------------------------------
pg_textsearch,azure,orion_storage,pg_availability,pg_qs,pgms_stats,pgms_wait_sampling
I've set up the extension features, functions, data types, and operators using CREATE EXTENSIONS and created a schema for this demo:
create extension if not exists vector;
create extension if not exists pg_textsearch;
create extension if not exists pg_diskann;
drop schema if exists hybrid_demo cascade;
create schema hybrid_demo;
set search_path = hybrid_demo, public, pgfts;
For the demo: a deterministic embedding function
I should use azure_openai.create_embeddings(), as I mentioned in the previous blog post. However, for this demo, I opted to create a fake embedding function that maps text to a simple vector(16) based on product keywords. This isn't an embedding model, but it ensures that execution plans are reproducible without relying on a model. It also simplifies the concept of embeddings with a basic, small-dimension LIKE-style semantic vector. The purpose is to demonstrate a search query using both a real text search and a vector-based semantic search.
create or replace function demo_embedding(txt text)
returns vector(16)
language sql
immutable
parallel safe
as $$
select (
'[' ||
concat_ws(',',
case when txt ~* '(mid|century)' then '1.0' else '0.0' end,
case when txt ~* '(modern|minimalist)' then '1.0' else '0.0' end,
case when txt ~* '(chair|seat)' then '1.0' else '0.0' end,
case when txt ~* '(table|desk)' then '1.0' else '0.0' end,
case when txt ~* '(sofa|couch)' then '1.0' else '0.0' end,
case when txt ~* '(lamp|light)' then '1.0' else '0.0' end,
case when txt ~* '(wood|walnut|oak)' then '1.0' else '0.0' end,
case when txt ~* '(metal|steel)' then '1.0' else '0.0' end,
case when txt ~* '(leather)' then '1.0' else '0.0' end,
case when txt ~* '(fabric|linen)' then '1.0' else '0.0' end,
case when txt ~* '(industrial)' then '1.0' else '0.0' end,
case when txt ~* '(scandinavian)' then '1.0' else '0.0' end,
case when txt ~* '(office)' then '1.0' else '0.0' end,
case when txt ~* '(dining)' then '1.0' else '0.0' end,
case when txt ~* '(classic|vintage)' then '1.0' else '0.0' end,
case when txt ~* '(outdoor|garden)' then '1.0' else '0.0' end
) ||
']'
)::vector;
$$;
Using a straightforward CASE that searches for specific words, I create the semantic vector for a text. This method employs vector similarity solely for demonstration, without utilizing a model. You can envision AI models doing the same but with thousands of dimensions driven by large language models (LLMs) rather than keywords.
Table with text and embeddings
I created a product catalog table with structured fields, text, and a single vector column:
create table products (
product_id bigint primary key,
category text not null,
brand text not null,
price numeric(10,2) not null,
title text not null,
description text not null,
search_text text generated always as (
title || ' ' || description
) stored,
embedding vector(16) not null
);
I loaded a few hundred thousand rows. The data is synthetic but intentionally patterned: products have categories, materials, styles, and terms that are useful for both BM25 and vector search:
insert into products ( product_id, category, brand, price, title, description, embedding)
with generated as (
select
g as product_id,
(array[ 'chair', 'table', 'sofa', 'lamp', 'desk', 'shelf' ])[1 + (g % 6)] as category,
(array[ 'Contoso', 'Fabrikam', 'Northwind', 'AdventureWorks', 'Wingtip', 'Tailspin' ])[1 + (g % 6)] as brand,
(array[ 'mid-century modern', 'industrial', 'scandinavian', 'classic', 'minimalist', 'outdoor' ])[1 + (g % 6)] as style,
(array[ 'walnut wood', 'black metal', 'oak wood', 'leather', 'fabric', 'brushed steel' ])[1 + ((g / 7) % 6)] as material, (25 + (g % 500))::numeric(10,2) as price
from generate_series(1, 200000) as g
)
select product_id, category, brand, price,
initcap(style || ' ' || material || ' ' || category) as title,
concat( style, ' ', category, ' by ', brand, ' with ', material, '. Designed for ',
case
when category in ('chair', 'sofa') then 'living room seating'
when category in ('table', 'desk') then 'home office and dining'
when category = 'lamp' then 'warm interior lighting'
else 'storage and display'
end,
'. Product code ', product_id, '.'
) as description,
demo_embedding(
concat_ws(' ', style, material, category, brand)
) as embedding
from generated;
I checked a few rows:
postgres=>
select product_id, title, category, brand, price, search_text, embedding
from products where price >= 42
order by product_id
limit 10;
product_id | title | category | brand | price | search_text | embedding
------------+-----------------------------------+----------+----------------+-------+-----------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------
17 | Outdoor Oak Wood Shelf | shelf | Tailspin | 42.00 | Outdoor Oak Wood Shelf outdoor shelf by Tailspin with oak wood. Designed for storage and display. Product code 17. | [0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1]
18 | Mid-Century Modern Oak Wood Chair | chair | Contoso | 43.00 | Mid-Century Modern Oak Wood Chair mid-century modern chair by Contoso with oak wood. Designed for living room seating. Product code 18. | [1,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]
19 | Industrial Oak Wood Table | table | Fabrikam | 44.00 | Industrial Oak Wood Table industrial table by Fabrikam with oak wood. Designed for home office and dining. Product code 19. | [0,0,0,1,0,0,1,0,0,0,1,0,0,0,0,0]
20 | Scandinavian Oak Wood Sofa | sofa | Northwind | 45.00 | Scandinavian Oak Wood Sofa scandinavian sofa by Northwind with oak wood. Designed for living room seating. Product code 20. | [0,0,0,0,1,0,1,0,0,0,0,1,0,0,0,0]
21 | Classic Leather Lamp | lamp | AdventureWorks | 46.00 | Classic Leather Lamp classic lamp by AdventureWorks with leather. Designed for warm interior lighting. Product code 21. | [0,0,0,0,0,1,0,0,1,0,0,0,0,0,1,0]
22 | Minimalist Leather Desk | desk | Wingtip | 47.00 | Minimalist Leather Desk minimalist desk by Wingtip with leather. Designed for home office and dining. Product code 22. | [0,1,0,1,0,0,0,0,1,0,0,0,0,0,0,0]
23 | Outdoor Leather Shelf | shelf | Tailspin | 48.00 | Outdoor Leather Shelf outdoor shelf by Tailspin with leather. Designed for storage and display. Product code 23. | [0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1]
24 | Mid-Century Modern Leather Chair | chair | Contoso | 49.00 | Mid-Century Modern Leather Chair mid-century modern chair by Contoso with leather. Designed for living room seating. Product code 24. | [1,1,1,0,0,0,0,0,1,0,0,0,0,0,0,0]
25 | Industrial Leather Table | table | Fabrikam | 50.00 | Industrial Leather Table industrial table by Fabrikam with leather. Designed for home office and dining. Product code 25. | [0,0,0,1,0,0,0,0,1,0,1,0,0,0,0,0]
26 | Scandinavian Leather Sofa | sofa | Northwind | 51.00 | Scandinavian Leather Sofa scandinavian sofa by Northwind with leather. Designed for living room seating. Product code 26. | [0,0,0,0,1,0,0,0,1,0,0,1,0,0,0,0]
(10 rows)
I tested some searches on search_text and embedding and started indexing those columns.
Create the indexes (BM25 and DiskANN)
First, I created the BM25 full-text index. HorizonDB’s BM25 full-text search brings BM25 ranking into PostgreSQL without a separate Elasticsearch/OpenSearch Search service. It uses the open-source extension pg_textsearch:
postgres=> create index products_bm25_idx
on products
using bm25 (search_text)
with (text_config = 'english')
;
NOTICE: BM25 index build started for relation products_bm25_idx
NOTICE: Using text search configuration: english
NOTICE: Using index options: k1=1.20, b=0.75
NOTICE: parallel index build: launched 2 of 2 requested workers
NOTICE: BM25 index build completed: 200000 documents, avg_length=16.17
CREATE INDEX
Then I created the vector index using DiskANN and cosine similarity:
postgres=> create index products_embedding_diskann_idx
on products
using diskann (embedding vector_cosine_ops)
;
CREATE INDEX
I gathered the statistics:
postgres=> vacuum analyze products;
ANALYZE
My data set is ready for queries.
Query 1: BM25 only
I started with a keyword search, using to_bm25query() to define the BM25 query. Ranking uses BM25 and is performed with the <@> operator. Top-k queries use this operator in ORDER BY ... LIMIT.
BM25 is implemented as an index-backed operator that must be bound to a specific index. This is why prepared statements require explicitly naming the index:
postgres=> prepare query1 (text, int) as
select
p.product_id,
p.title,
p.category,
p.brand,
p.price
from products p
order by p.search_text <@> to_bm25query( $1 , 'products_bm25_idx' )
limit $2;
postgres=> explain (analyze, buffers, verbose, costs off)
execute query1 ('mid century modern wooden chair', 10)
;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
Limit (actual time=0.268..0.442 rows=10 loops=1)
Output: product_id, title, category, brand, price, ((search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query))
Buffers: shared hit=582
-> Index Scan using products_bm25_idx on hybrid_demo.products p (actual time=0.267..0.440 rows=10 loops=1)
Output: product_id, title, category, brand, price, (search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query)
Order By: (p.search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query)
Buffers: shared hit=582
Query Identifier: -4837396746836655100
Planning:
Buffers: shared hit=1
Planning Time: 0.116 ms
Execution Time: 0.460 ms
(12 rows)
The Index Scan returns the Top-10 result ('rows=10') directly in ranking order (Order By).
Lexical retrieval is good for exact words, rare terms, product codes, and anything where the user expects the same token to appear in the document.
Query 2: ANN only
I further explored the semantic aspect by performing similarity search using the cosine distance operator (<=>) for vectors:
postgres=> prepare query2 (text, int) as
select
p.product_id,
p.title,
p.category,
p.brand,
p.price,
p.embedding <=> demo_embedding($1) as distance
from products p
order by p.embedding <=> demo_embedding($1)
limit $2;
postgres=> explain (analyze, buffers, verbose, costs off)
execute query2 ('modern wooden chair', 10);
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Limit (actual time=0.374..0.399 rows=10 loops=1)
Output: product_id, title, category, brand, price, ((embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector))
Buffers: shared hit=562
-> Index Scan using products_embedding_diskann_idx on hybrid_demo.products p (actual time=0.374..0.397 rows=10 loops=1)
Output: product_id, title, category, brand, price, (embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector)
Order By: (p.embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector)
Buffers: shared hit=562
Query Identifier: 2112846290524187050
Planning:
Buffers: shared hit=1
Planning Time: 0.275 ms
Execution Time: 0.414 ms
(12 rows)
The Index Scan returns the Top-10 result (rows=10) directly in ranking order (Order By).
The key distinction is that it doesn't search for words. Instead, it finds rows near the query vector. Using the synthetic vector function, this refers to rows that share similar dimensions. With actual embeddings, it translates to model similarity.
Query 3: BM25 first, then vector search
A straightforward approach is to use BM25 as a preliminary filter before vector search. Here, the vector index is bypassed, and an accurate nearest neighbors search is performed on the candidates selected by BM25:
postgres=> prepare query3 (text, text, int, int) as
with
bm25_candidates as (
select *
from products p
order by p.search_text <@> to_bm25query($1, 'products_bm25_idx')
limit $3
)
select
p.product_id, p.title, p.category, p.brand, p.price
from bm25_candidates p
order by p.embedding <=> demo_embedding($2)
limit $4;
postgres=> explain (analyze, buffers, verbose, costs off)
execute query3 (
'mid century modern wooden chair',
'modern wooden chair',
5000,
10
);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (actual time=97.529..97.531 rows=10 loops=1)
Output: p.product_id, p.title, p.category, p.brand, p.price, ((p.embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector))
Buffers: shared hit=29482
-> Sort (actual time=97.528..97.529 rows=10 loops=1)
Output: p.product_id, p.title, p.category, p.brand, p.price, ((p.embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector))
Sort Key: ((p.embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector))
Sort Method: top-N heapsort Memory: 26kB
Buffers: shared hit=29482
-> Subquery Scan on p (actual time=3.312..96.508 rows=5000 loops=1)
Output: p.product_id, p.title, p.category, p.brand, p.price, (p.embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector)
Buffers: shared hit=29482
-> Limit (actual time=3.310..95.687 rows=5000 loops=1)
Output: p_1.product_id, p_1.category, p_1.brand, p_1.price, p_1.title, NULL::text, NULL::text, p_1.embedding, ((p_1.search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query))
Buffers: shared hit=29482
-> Index Scan using products_bm25_idx on hybrid_demo.products p_1 (actual time=3.309..95.361 rows=5000 loops=1)
Output: p_1.product_id, p_1.category, p_1.brand, p_1.price, p_1.title, NULL::text, NULL::text, p_1.embedding, (p_1.search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query)
Order By: (p_1.search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query)
Buffers: shared hit=29482
Query Identifier: 2456399702791328288
Planning:
Buffers: shared hit=1
Planning Time: 0.328 ms
Execution Time: 97.560 ms
(23 rows)
The execution plan features a two-phase pipeline:
- Phase 1 (BM25): An index scan on
products_bm25_idxretrieves the top 5000 candidates sorted by BM25 score (~95 ms). - Phase 2 (vector): A top‑N heapsort calculates exact vector distances on these 5000 rows (~2 ms).
This method uses BM25 to generate candidates, followed by vector ranking. It performs well when the query includes strong keywords, but it is not a hybrid search: BM25 sets recall, and the vector step only reranks.
Therefore, a sufficiently large candidate set (LIMIT $3) is necessary to avoid missing relevant results. Increasing this limit boosts recall but also raises latency, as most of the time is spent in the BM25 top‑K retrieval.
Query 4: ANN first, then text search
An alternative approach is to utilize the vector index (ANN) followed by text search as a post-filter. Here, the BM25 index is not used, and text ranking is only applied to the candidates identified by the vector search:
postgres=> prepare query4 (text, text, int, int) as
with
ann_candidates as (
select *
from products p
order by p.embedding <=> demo_embedding($1)
limit $3
)
select
p.product_id, p.title, p.category, p.brand, p.price
from ann_candidates p
order by p.search_text <@> to_bm25query($2, 'products_bm25_idx')
limit $4;
postgres=> explain (analyze, buffers, verbose, costs off)
execute query4 (
'modern wooden chair',
'mid century modern wooden chair',
5000,
10
);
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (actual time=250.982..250.984 rows=10 loops=1)
Output: p.product_id, p.title, p.category, p.brand, p.price, ((p.search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query))
Buffers: shared hit=74975
-> Sort (actual time=250.981..250.982 rows=10 loops=1)
Output: p.product_id, p.title, p.category, p.brand, p.price, ((p.search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query))
Sort Key: ((p.search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query))
Sort Method: top-N heapsort Memory: 26kB
Buffers: shared hit=74975
-> Subquery Scan on p (actual time=26.789..249.873 rows=5000 loops=1)
Output: p.product_id, p.title, p.category, p.brand, p.price, (p.search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query)
Buffers: shared hit=74975
-> Limit (actual time=26.681..153.452 rows=5000 loops=1)
Output: p_1.product_id, p_1.category, p_1.brand, p_1.price, p_1.title, NULL::text, p_1.search_text, NULL::vector(16), ((p_1.embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector))
Buffers: shared hit=59825
-> Index Scan using products_embedding_diskann_idx on hybrid_demo.products p_1 (actual time=26.681..153.072 rows=5000 loops=1)
Output: p_1.product_id, p_1.category, p_1.brand, p_1.price, p_1.title, NULL::text, p_1.search_text, NULL::vector(16), (p_1.embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector)
Order By: (p_1.embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector)
Buffers: shared hit=59825
Query Identifier: 778851517376612713
Planning:
Buffers: shared hit=1
Planning Time: 0.311 ms
Execution Time: 251.015 ms
(23 rows)
The execution plan displays the same two-phase pipeline but in reverse order:
- Phase 1 (vector ANN): The vector index (DiskANN) retrieves the 5000 nearest neighbors based on embedding similarity, which is quick and index-driven.
- Phase 2 (BM25): A top‑N heapsort calculates BM25 scores for those 5000 entries. Since ranking occurs after the CTE materialization, no BM25 index is used.
This setup is not a hybrid search. Instead, the vector phase defines recall, and BM25 only fine-tunes the ranking.
Query 5: hybrid search with RRF
A hybrid search merges candidates from both search types. I run a query that performs these steps:
- Executes a text search and retains the top 50 results.
- Performs a vector search and keeps the top 50.
- Ranks the small candidate lists.
- Merges the candidate sets.
- Uses Reciprocal Rank Fusion (RRF).
- Outputs a single ranked result set.
The full query defines each step as a CTE in a WITH clause. While the query planner typically inlines these by default, I added MATERIALIZE for the demo to visualize each search and ranking stage separately:
postgres=> prepare query5 (text, text, int, int) as
with
bm25 as (
select
p.product_id,
p.search_text <@> to_bm25query($1, 'products_bm25_idx') as score
from products p
order by p.search_text <@> to_bm25query($1, 'products_bm25_idx')
limit $3
),
bm25_ranked as (
select
product_id,
row_number() over (order by score) as bm25_rank
from bm25
),
vec as (
select
p.product_id,
p.embedding <=> demo_embedding($2) as distance
from products p
order by p.embedding <=> demo_embedding($2)
limit $3
),
vec_ranked as (
select
product_id,
row_number() over (order by distance) as vec_rank
from vec
),
candidates as (
select product_id from bm25_ranked
union
select product_id from vec_ranked
)
select
p.product_id,
p.title,
p.category,
p.brand,
p.price,
b.bm25_rank,
v.vec_rank,
(
1.0 / (60 + coalesce(b.bm25_rank, 1000))
+
1.0 / (60 + coalesce(v.vec_rank, 1000))
) as rrf_score
from candidates c
join products p using (product_id)
left join bm25_ranked b using (product_id)
left join vec_ranked v using (product_id)
order by rrf_score desc
limit $4;
postgres=> explain (analyze, buffers, verbose, costs off)
execute query5 (
'mid century modern wooden chair',
'modern wooden chair',
50,
10
);
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (actual time=84.994..85.001 rows=10 loops=1)
Output: p.product_id, p.title, p.category, p.brand, p.price, b.bm25_rank, v.vec_rank, (((1.0 / ((60 + COALESCE(b.bm25_rank, '1000'::bigint)))::numeric) + (1.0 / ((60 + COALESCE(v.vec_rank, '1000'::bigint)))::numeric)))
Buffers: shared hit=34010
CTE bm25
-> Limit (actual time=0.260..1.180 rows=50 loops=1)
Output: p_1.product_id, ((p_1.search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query))
Buffers: shared hit=803
-> Index Scan using products_bm25_idx on hybrid_demo.products p_1 (actual time=0.260..1.176 rows=50 loops=1)
Output: p_1.product_id, (p_1.search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query)
Order By: (p_1.search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query)
Buffers: shared hit=803
CTE bm25_ranked
-> WindowAgg (actual time=0.263..1.202 rows=50 loops=1)
Output: bm25.product_id, row_number() OVER (?), bm25.score
Buffers: shared hit=803
-> CTE Scan on bm25 (actual time=0.261..1.189 rows=50 loops=1)
Output: bm25.score, bm25.product_id
Buffers: shared hit=803
CTE vec
-> Limit (actual time=26.758..83.367 rows=50 loops=1)
Output: p_2.product_id, ((p_2.embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector))
Buffers: shared hit=32807
-> Index Scan using products_embedding_diskann_idx on hybrid_demo.products p_2 (actual time=26.757..83.360 rows=50 loops=1)
Output: p_2.product_id, (p_2.embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector)
Order By: (p_2.embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector)
Buffers: shared hit=32807
CTE vec_ranked
-> WindowAgg (actual time=26.764..83.400 rows=50 loops=1)
Output: vec.product_id, row_number() OVER (?), vec.distance
Buffers: shared hit=32807
-> CTE Scan on vec (actual time=26.760..83.382 rows=50 loops=1)
Output: vec.distance, vec.product_id
Buffers: shared hit=32807
-> Sort (actual time=84.993..84.996 rows=10 loops=1)
Output: p.product_id, p.title, p.category, p.brand, p.price, b.bm25_rank, v.vec_rank, (((1.0 / ((60 + COALESCE(b.bm25_rank, '1000'::bigint)))::numeric) + (1.0 / ((60 + COALESCE(v.vec_rank, '1000'::bigint)))::numeric)))
Sort Key: (((1.0 / ((60 + COALESCE(b.bm25_rank, '1000'::bigint)))::numeric) + (1.0 / ((60 + COALESCE(v.vec_rank, '1000'::bigint)))::numeric))) DESC
Sort Method: top-N heapsort Memory: 27kB
Buffers: shared hit=34010
-> Hash Left Join (actual time=84.714..84.944 rows=100 loops=1)
Output: p.product_id, p.title, p.category, p.brand, p.price, b.bm25_rank, v.vec_rank, ((1.0 / ((60 + COALESCE(b.bm25_rank, '1000'::bigint)))::numeric) + (1.0 / ((60 + COALESCE(v.vec_rank, '1000'::bigint)))::numeric))
Hash Cond: (bm25_ranked.product_id = v.product_id)
Buffers: shared hit=34010
-> Hash Left Join (actual time=84.698..84.887 rows=100 loops=1)
Output: bm25_ranked.product_id, p.product_id, p.title, p.category, p.brand, p.price, b.bm25_rank
Hash Cond: (bm25_ranked.product_id = b.product_id)
Buffers: shared hit=34010
-> Nested Loop (actual time=84.679..84.855 rows=100 loops=1)
Output: bm25_ranked.product_id, p.product_id, p.title, p.category, p.brand, p.price
Inner Unique: true
Buffers: shared hit=34010
-> HashAggregate (actual time=84.659..84.668 rows=100 loops=1)
Output: bm25_ranked.product_id
Group Key: bm25_ranked.product_id
Batches: 1 Memory Usage: 24kB
Buffers: shared hit=33610
-> Append (actual time=0.265..84.634 rows=100 loops=1)
Buffers: shared hit=33610
-> CTE Scan on bm25_ranked (actual time=0.265..1.212 rows=50 loops=1)
Output: bm25_ranked.product_id
Buffers: shared hit=803
-> CTE Scan on vec_ranked (actual time=26.765..83.413 rows=50 loops=1)
Output: vec_ranked.product_id
Buffers: shared hit=32807
-> Index Scan using products_pkey on hybrid_demo.products p (actual time=0.002..0.002 rows=1 loops=100)
Output: p.product_id, p.category, p.brand, p.price, p.title, p.description, p.search_text, p.embedding
Index Cond: (p.product_id = bm25_ranked.product_id)
Buffers: shared hit=400
-> Hash (actual time=0.011..0.011 rows=50 loops=1)
Output: b.bm25_rank, b.product_id
Buckets: 1024 Batches: 1 Memory Usage: 11kB
-> CTE Scan on bm25_ranked b (actual time=0.001..0.006 rows=50 loops=1)
Output: b.bm25_rank, b.product_id
-> Hash (actual time=0.009..0.009 rows=50 loops=1)
Output: v.vec_rank, v.product_id
Buckets: 1024 Batches: 1 Memory Usage: 11kB
-> CTE Scan on vec_ranked v (actual time=0.000..0.004 rows=50 loops=1)
Output: v.vec_rank, v.product_id
Query Identifier: 6796811579565214126
Planning:
Buffers: shared hit=2
Planning Time: 0.515 ms
Execution Time: 85.087 ms
(82 rows)
Here is how the hybrid search with PostgreSQL extensions (pg_textsearch for BM25, pg_diskann for vectors) works in this query:
- BM25 branch:
Index Scan using products_bm25_idx→ 50 rows in ~1 ms - Vector branch:
Index Scan using products_embedding_diskann_idx→ 50 rows in ~80 ms - Fusion layer: 100 candidate IDs from the two branches (
Append), looked up by primary key (Index Scan products_pkeyviaNested Loop), joined with their ranks (Hash Join), then a final top‑N sort on the RRF score.
RRF, or Reciprocal Rank Fusion, is a straightforward method for combining multiple rankings. It uses a constant (usually 60) to smooth out rank differences, ensuring that an item must perform well across several lists rather than just one to stand out. Since RRF operates on ranks rather than raw scores, it sidesteps the issue of comparing different scoring systems, such as BM25 and vector scores, which are on different scales.
Without MATERIALIZE, the query shape is:
Limit
CTE bm25_ranked
-> WindowAgg
-> Subquery Scan on bm25
-> Limit
-> Index Scan using products_bm25_idx on products p_1
Order By: (search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query)
CTE vec_ranked
-> WindowAgg
-> Subquery Scan on vec
-> Limit
-> Index Scan using products_embedding_diskann_idx on products p_2
Order By: (embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector)
-> Sort
Sort Key: (((1.0 / ((60 + COALESCE(b.bm25_rank, '1000'::bigint)))::numeric) + (1.0 / ((60 + COALESCE(v.vec_rank, '1000'::bigint)))::numeric))) DESC
-> Hash Left Join
Hash Cond: (bm25_ranked.product_id = v.product_id)
-> Hash Left Join
Hash Cond: (bm25_ranked.product_id = b.product_id)
-> Nested Loop
-> HashAggregate
Group Key: bm25_ranked.product_id
-> Append
-> CTE Scan on bm25_ranked
-> CTE Scan on vec_ranked
-> Index Scan using products_pkey on products p
Index Cond: (product_id = bm25_ranked.product_id)
-> Hash
-> CTE Scan on bm25_ranked b
-> Hash
-> CTE Scan on vec_ranked v
This is a hybrid search method in HorizonDB that involves two top‑k retrievals and a ranking layer expressed in SQL. No single hybrid index scan is used. PostgreSQL employs different access methods for text and vector search, with fusion occurring above these scans.
This approach offers a database‑friendly way to implement hybrid search that is explicit and easy to debug: you can see both branches in the plan (CTE), adjust the candidate size (LIMIT 50), balance BM25 versus vector scores (rrf_score), which can be swapped with a reranker, and inspect each candidate list separately, including the number of rows and buffers read.
In HorizonDB, hybrid search isn't about merging indexes into a single search engine. It involves correctly combining multiple top‑k retrievals within a SQL statement.
Conclusion
Hybrid search in HorizonDB isn't a new index type or access method. It's a query pattern created with PostgreSQL extensions.
Execution plans clarify this: BM25 and vector search operate as separate index scans, each with its own ranking system. PostgreSQL doesn’t combine them into a single operator. Instead, it performs independent top‑k retrievals and merges the results within SQL.
From the examples, three patterns emerge:
- BM25 → vector: lexical search defines recall, vector search reranks
- Vector → BM25: semantic search defines recall, text search reranks
- BM25 + vector (RRF): both enhance recall, and their results are fused
Of these, only the last is truly hybrid. The first two are cascades in which one stage generates candidates, and the other refines them.
This distinction affects both relevance and performance. In cascade queries, recall is constrained by the first stage: anything not retrieved at that stage is lost. In hybrid queries, both branches contribute, boosting recall but requiring two index scans.
Plan analysis also shows where time is spent. In my examples, nearly all the cost is in the top‑k retrieval (BM25 or ANN), while reranking is less expensive. The primary parameter to tune is candidate size, not the scoring method.
A practical point: RRF uses ranks instead of raw scores to avoid normalization issues between BM25 and vector metrics.
Hybrid search in PostgreSQL is explicit and modular: separate retrievals combined within SQL. This makes execution plans clear and trade-offs transparent.


Top comments (0)