<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Sergey Nikolaev</title>
    <description>The latest articles on DEV Community by Sergey Nikolaev (@sanikolaev).</description>
    <link>https://dev.to/sanikolaev</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F363352%2F6f7a2da7-fa00-47f5-aaca-a007b1d43350.jpeg</url>
      <title>DEV Community: Sergey Nikolaev</title>
      <link>https://dev.to/sanikolaev</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sanikolaev"/>
    <language>en</language>
    <item>
      <title>The Evolution of 'More Like This'</title>
      <dc:creator>Sergey Nikolaev</dc:creator>
      <pubDate>Tue, 02 Jun 2026 04:01:20 +0000</pubDate>
      <link>https://dev.to/sanikolaev/the-evolution-of-more-like-this-5225</link>
      <guid>https://dev.to/sanikolaev/the-evolution-of-more-like-this-5225</guid>
      <description>&lt;p&gt;In many search scenarios, the user does not start from an empty query box, but from an existing result.&lt;/p&gt;

&lt;p&gt;A user opens an article and wants to find related material. A buyer views a product card and looks for close alternatives. A support engineer investigates an incident and wants to see earlier cases with the same symptoms. In all these situations, the user already has a relevant document to start from.&lt;/p&gt;

&lt;p&gt;This scenario is traditionally called &lt;strong&gt;More Like This (MLT)&lt;/strong&gt;: a function for finding documents similar to the selected one. In this article, MLT means search that starts from a known document, not from a newly typed query.&lt;/p&gt;

&lt;p&gt;The classic MLT approach, or similar-document search, compared documents by the words they share. Modern implementations increasingly use embeddings: numerical representations of documents. A search index stores embeddings as vectors, and the search system can find documents with close vector representations.&lt;/p&gt;

&lt;h2&gt;
  
  
  Short glossary
&lt;/h2&gt;

&lt;p&gt;To avoid repeating definitions throughout the article, here are the main terms:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Term&lt;/th&gt;
&lt;th&gt;Meaning in this article&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;More Like This (MLT)&lt;/td&gt;
&lt;td&gt;search for documents similar to an already selected document&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;embedding&lt;/td&gt;
&lt;td&gt;a numerical representation of text, a product, an image, or another object&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;embedding vector&lt;/td&gt;
&lt;td&gt;an embedding stored in the index as a vector, used to find similar objects by vector proximity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;KNN, nearest-neighbor search&lt;/td&gt;
&lt;td&gt;search for nearest neighbors, meaning objects with close vectors&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ANN, approximate nearest neighbors&lt;/td&gt;
&lt;td&gt;approximate nearest-neighbor search; it speeds up KNN on large datasets without scanning every vector&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RAG, Retrieval-Augmented Generation&lt;/td&gt;
&lt;td&gt;an approach where the search system retrieves context for a generative model&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;hybrid search&lt;/td&gt;
&lt;td&gt;combining full-text search and vector search in one scenario&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;reranking&lt;/td&gt;
&lt;td&gt;an additional sorting step for already retrieved candidates using a more precise model or rule&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  What classic More Like This did
&lt;/h2&gt;

&lt;p&gt;Classic MLT was lexical. It answered a simple question: which documents use similar important words?&lt;/p&gt;

&lt;p&gt;The process usually looked like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The search system took the source document.&lt;/li&gt;
&lt;li&gt;It analyzed its text.&lt;/li&gt;
&lt;li&gt;It selected informative terms.&lt;/li&gt;
&lt;li&gt;It built a query from those terms.&lt;/li&gt;
&lt;li&gt;It searched for documents with a similar set of words.&lt;/li&gt;
&lt;li&gt;It returned a list of similar documents.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Internally, this used familiar full-text search mechanisms: TF-IDF or BM25, term frequency, stopwords, field boosts, and document-frequency limits. That is why older MLT implementations exposed parameters such as &lt;code&gt;min_term_freq&lt;/code&gt;, &lt;code&gt;min_doc_freq&lt;/code&gt;, &lt;code&gt;max_doc_freq&lt;/code&gt;, and &lt;code&gt;max_query_terms&lt;/code&gt;.&lt;/p&gt;
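
&lt;p&gt;The term-selection step can be illustrated with a minimal Python sketch. It is not the code of any particular engine: the function names, the tiny stopword list, and the TF-IDF formula are assumptions made for illustration, and only the parameter names mirror the ones mentioned above.&lt;/p&gt;

```python
import math
import re
from collections import Counter

# Hypothetical stopword list, kept tiny for the example.
STOPWORDS = {"in", "the", "with", "after"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def select_mlt_terms(source, corpus, min_term_freq=1, min_doc_freq=1,
                     max_doc_freq=None, max_query_terms=5):
    """Pick the most informative terms of the source text by TF-IDF weight."""
    term_freq = Counter(tokenize(source))
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(tokenize(doc)))

    scored = []
    for term, tf in term_freq.items():
        df = doc_freq[term]
        if term in STOPWORDS:
            continue
        if tf >= min_term_freq and df >= min_doc_freq:
            if max_doc_freq is None or max_doc_freq >= df:
                idf = math.log(1 + len(corpus) / (1 + df))
                scored.append((tf * idf, term))
    # Highest weight first; ties are broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:max_query_terms]]


corpus = [
    "memory leak in the worker process after upgrade",
    "worker process crashes with ERR_404 after upgrade",
    "billing page shows the wrong currency",
]
terms = select_mlt_terms(corpus[0], corpus, max_query_terms=3)
print(terms)  # ['leak', 'memory', 'process']
```

&lt;p&gt;The selected terms would then be combined into an ordinary full-text query. Rare terms such as &lt;code&gt;leak&lt;/code&gt; rank above terms that appear in several documents, which is the intuition behind the document-frequency limits.&lt;/p&gt;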

&lt;p&gt;This was not just an interface element, but a full search mechanism. MLT was used for related articles and products, duplicate detection, support-ticket matching, legal search, patent research, and internal knowledge bases.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where the lexical approach is still strong
&lt;/h2&gt;

&lt;p&gt;Lexical MLT works well when specific words, identifiers, and stable formulations matter.&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;error codes;&lt;/li&gt;
&lt;li&gt;product SKUs;&lt;/li&gt;
&lt;li&gt;part numbers;&lt;/li&gt;
&lt;li&gt;function names;&lt;/li&gt;
&lt;li&gt;stack traces;&lt;/li&gt;
&lt;li&gt;legal wording;&lt;/li&gt;
&lt;li&gt;nearly identical product or ticket descriptions.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The reason is that exact matching is critical here. If two incident reports contain the same error code or the same stack trace, full-text search sees a direct match. For example, when searching tickets with the code &lt;code&gt;ERR_404&lt;/code&gt;, lexical MLT quickly finds every mention of that code, while vector search may return tickets that describe similar but not identical problems.&lt;/p&gt;

&lt;p&gt;Lexical MLT had another advantage: it was cheap to run. The inverted index is already in the search engine. The analyzers are already configured. Ranking already works. There is no need to deploy separate search infrastructure just to support a “find similar” feature.&lt;/p&gt;

&lt;p&gt;The limitation is also clear. If two documents describe the same thing in different words, lexical MLT may fail to connect them. Synonyms work unevenly. Paraphrases are harder. Cross-lingual similarity is usually unavailable. For example, &lt;code&gt;memory leak&lt;/code&gt; and &lt;code&gt;unbounded heap growth&lt;/code&gt; may describe the same problem, but a standard analyzer sees different tokens.&lt;/p&gt;

&lt;p&gt;Lexical MLT efficiently finds documents with matching or similar wording. Semantic search helps when the meaning matches, not the words.&lt;/p&gt;

&lt;h2&gt;
  
  
  What embeddings change
&lt;/h2&gt;

&lt;p&gt;Using &lt;a href="https://manticoresearch.com/blog/vector-search-deep-dive/" rel="noopener noreferrer"&gt;embeddings&lt;/a&gt; — numerical representations of documents — changes the comparison principle: instead of words, the system compares vector representations.&lt;/p&gt;

&lt;p&gt;A document no longer has to be represented only as a set of weighted terms. It can be stored as a dense vector. Nearby vectors usually correspond to documents that are similar in meaning, even if they are written in different words.&lt;/p&gt;

&lt;p&gt;The lexical approach looks for matches by words and terms, while embedding search looks at the proximity of document vector representations. The first approach is optimal for exact matches such as error codes and SKUs. The second finds semantically close documents, even when they are phrased differently.&lt;/p&gt;

&lt;p&gt;This expands the scope of this kind of search. You can compare not only articles, but also products, images, code fragments, user events, or context fragments in a RAG system. In RAG, the search system first retrieves relevant context, and then the generative model uses that context to produce an answer.&lt;/p&gt;

&lt;p&gt;Lexical search does not disappear. Exact error codes, SKUs, names, statute references, and near duplicates are still better handled lexically. That is why production systems often use &lt;a href="https://manticoresearch.com/blog/hybrid-search/" rel="noopener noreferrer"&gt;hybrid search&lt;/a&gt;: full-text search provides exact matches, vector search adds results by meaning, filters constrain the search space, and reranking refines the final order.&lt;/p&gt;

&lt;p&gt;As shown in our &lt;a href="https://manticoresearch.com/blog/lexical-search-vs-vector-search/" rel="noopener noreferrer"&gt;comparison of lexical and vector search&lt;/a&gt;, the former wins on exact matches, while the latter improves coverage of semantic relationships.&lt;/p&gt;

&lt;h2&gt;
  
  
  MLT as lookup by a vector from the index
&lt;/h2&gt;

&lt;p&gt;If a vector representation has already been computed for a document and stored in the index, modern MLT reduces to four steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Take the source document.&lt;/li&gt;
&lt;li&gt;Retrieve its precomputed vector representation from the index.&lt;/li&gt;
&lt;li&gt;Find the nearest vectors.&lt;/li&gt;
&lt;li&gt;Return the documents those vectors belong to.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is still More Like This: the user starts from one document and gets related results. Only the comparison method changes. Instead of extracting terms, the search system uses the vector representation of the source document.&lt;/p&gt;
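
&lt;p&gt;The four steps above can be sketched in plain Python. This is a brute-force illustration under stated assumptions, not how a search engine implements it: the in-memory &lt;code&gt;index&lt;/code&gt; dictionary, the function names, the choice of cosine distance, and the decision to exclude the source document from the results are all hypothetical.&lt;/p&gt;

```python
import math


def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def more_like_this(index, source_id, k):
    """Return the ids of the k stored documents closest to the source."""
    source_vector = index[source_id]  # the vector is already in the index
    scored = [
        (cosine_distance(source_vector, vector), doc_id)
        for doc_id, vector in index.items()
        if doc_id != source_id  # assumption: skip the source document itself
    ]
    scored.sort()  # smallest distance first
    return [doc_id for _, doc_id in scored[:k]]


# Toy index: document id mapped to a precomputed 3-dimensional embedding.
index = {
    123: [0.9, 0.1, 0.0],
    124: [0.8, 0.2, 0.1],
    125: [0.0, 0.1, 0.9],
    126: [0.7, 0.3, 0.0],
}
print(more_like_this(index, 123, 2))  # [124, 126]
```

&lt;p&gt;The caller passes only a document ID and never handles the vector. The sketch scans every stored vector, which is exactly the work an ANN index avoids on large datasets.&lt;/p&gt;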

&lt;p&gt;In Manticore Search, this operation can be performed directly at the search-engine level: the query specifies the ID of the source document, and Manticore takes its embedding vector from the index and runs KNN search. The application does not need to fetch the vector separately, serialize hundreds or thousands of numbers, and send them back in a second request.&lt;/p&gt;

&lt;p&gt;A minimal SQL example looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;knn_dist&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;123&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, &lt;code&gt;embedding&lt;/code&gt; is the field with the precomputed embedding vector, &lt;code&gt;123&lt;/code&gt; is the ID of the source document, and &lt;code&gt;10&lt;/code&gt; is the number of nearest documents to return. The &lt;code&gt;knn_dist()&lt;/code&gt; function returns the distance between vectors: a smaller value means greater semantic proximity to the source document. The same operation can be performed through the HTTP JSON API; the search logic does not change. The application passes the document ID, and Manticore performs lookup using that document’s vector from the index.&lt;/p&gt;

&lt;p&gt;For large datasets, KNN is usually implemented through an ANN index. This speeds up search through approximate computation and avoids scanning every vector. For the user, the important part is not the internal structure of the index, but the result: quickly finding documents that are close to the source in meaning.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why search is better handled in the engine
&lt;/h2&gt;

&lt;p&gt;You can implement this scenario in the application: first fetch the document, then extract its vector, then send a separate KNN query, and then combine the result with filters.&lt;/p&gt;

&lt;p&gt;That approach makes the system architecture more complex. The application has to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;pass the vector between services;&lt;/li&gt;
&lt;li&gt;prevent the vector from being written to logs accidentally;&lt;/li&gt;
&lt;li&gt;check the embedding model version;&lt;/li&gt;
&lt;li&gt;keep data synchronized with the main index;&lt;/li&gt;
&lt;li&gt;apply the same filters used in normal search.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When the search system performs the lookup itself, the path is shorter:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The application passes the ID of the source document.&lt;/li&gt;
&lt;li&gt;The search system finds the precomputed vector representation in the index.&lt;/li&gt;
&lt;li&gt;The search system runs nearest-neighbor search (KNN) or its approximate variant (ANN).&lt;/li&gt;
&lt;li&gt;The search system returns the found documents with the same access filters and metadata.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Benefits of this approach:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;fewer inter-service requests from the application;&lt;/li&gt;
&lt;li&gt;large vectors do not have to be sent through external APIs;&lt;/li&gt;
&lt;li&gt;filters stay close to search;&lt;/li&gt;
&lt;li&gt;the result is easier to reproduce and debug;&lt;/li&gt;
&lt;li&gt;the application does not need an additional layer for similarity calculation — everything runs inside the search engine.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This will not fix poor embeddings or remove the need to tune ranking. But it reduces the number of interacting components in the search chain, which makes the system easier to maintain.&lt;/p&gt;

&lt;h2&gt;
  
  
  Practical examples and the evolution of MLT
&lt;/h2&gt;

&lt;p&gt;Search from an existing object is especially useful when the user has already found a relevant starting point.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Source object&lt;/th&gt;
&lt;th&gt;What to find&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;td&gt;ticket with an error&lt;/td&gt;
&lt;td&gt;past tickets with similar symptoms and related fixes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Catalog&lt;/td&gt;
&lt;td&gt;product card&lt;/td&gt;
&lt;td&gt;close alternatives, similar models, or products from the same category&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RAG&lt;/td&gt;
&lt;td&gt;relevant fragment already found by the first search&lt;/td&gt;
&lt;td&gt;context expansion: neighboring sections of the same document, related documentation fragments, or similar discussions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Developer tools&lt;/td&gt;
&lt;td&gt;stack trace, diff, or bug description&lt;/td&gt;
&lt;td&gt;related code changes, discussions, and past incidents&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;In these examples, there is no need to type a new query manually. The system uses the source object as a reference point and finds documents similar to it lexically, semantically, or by both criteria.&lt;/p&gt;

&lt;p&gt;In the context of RAG, this is not about the primary search by the user’s query, but about subsequent context selection: the system has already found a relevant fragment and uses it as the reference object to collect surrounding context. This is useful when one fragment is too narrow: nearby content may include a term definition, a configuration example, a related discussion, or a neighboring section of the same guide.&lt;/p&gt;

&lt;p&gt;In systems with personalization or AI agents, the system may also draw on the user’s search-query history, the context of previous interactions, or saved working notes. It is important to define explicitly which of these sources are used, so that it stays clear which data participates in retrieval and why a result is considered similar.&lt;/p&gt;

&lt;p&gt;The evolution of MLT can be described like this:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Period&lt;/th&gt;
&lt;th&gt;What changed&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;2000s&lt;/td&gt;
&lt;td&gt;MLT mostly relied on lexical analysis, TF-IDF, BM25, and term overlap.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2010s&lt;/td&gt;
&lt;td&gt;Word2Vec and GloVe appeared and became widely used, making it possible to build semantic embeddings of words and texts.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Early 2020s&lt;/td&gt;
&lt;td&gt;FAISS (open-sourced in 2017) and similar ANN libraries became widely adopted, making it possible to run vector search efficiently even on very large datasets.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mid-2020s&lt;/td&gt;
&lt;td&gt;RAG, recommendations, and search from an existing object made lookup by stored vectors a common product scenario.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The evolution of MLT is a shift from lexical comparison to matching document vector representations. But the practical request stayed the same: find documents relevant to the source result.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to keep in mind
&lt;/h2&gt;

&lt;p&gt;Semantic MLT does not replace all search engineering.&lt;/p&gt;

&lt;p&gt;Production systems still need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;exact search for identifiers, error codes, and other strict matches;&lt;/li&gt;
&lt;li&gt;embedding model metadata and versioning;&lt;/li&gt;
&lt;li&gt;ACL filters: rules for document access by roles or users;&lt;/li&gt;
&lt;li&gt;tenant filters: data isolation between customers or workspaces;&lt;/li&gt;
&lt;li&gt;hybrid search when both meaning and exact matches matter;&lt;/li&gt;
&lt;li&gt;reranking when result order is critical;&lt;/li&gt;
&lt;li&gt;search-quality monitoring: precision and recall metrics, false-positive frequency, and missed relevant documents caused by ANN-index approximation errors.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Lexical MLT can miss documents that use different words. Vector search sometimes returns overly broad results, or false positives, and can miss relevant documents because of the approximate nature of ANN indexes. That is why the quality of this kind of search should be evaluated on real queries and real data.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;More Like This has moved from purely lexical search to hybrid solutions that combine lexical, vector, and filtering mechanisms.&lt;/p&gt;

&lt;p&gt;The core concept remains the same: the user selects a source document, and the system finds materials relevant to it, taking both lexical and semantic similarity into account.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>algorithms</category>
      <category>machinelearning</category>
      <category>nlp</category>
    </item>
    <item>
      <title>KNN early termination in Manticore Search</title>
      <dc:creator>Sergey Nikolaev</dc:creator>
      <pubDate>Mon, 01 Jun 2026 09:51:55 +0000</pubDate>
      <link>https://dev.to/sanikolaev/knn-early-termination-in-manticore-search-4fa5</link>
      <guid>https://dev.to/sanikolaev/knn-early-termination-in-manticore-search-4fa5</guid>
      <description>&lt;p&gt;Modern search engines do more than match keywords. When you search for "cozy mystery set in Paris" and get results for "atmospheric detective novel in France", that's vector search at work: documents and queries are converted into lists of numbers, called embeddings, and the search engine finds the documents whose numbers are closest to the query's.&lt;/p&gt;

&lt;p&gt;Manticore Search supports this natively. Under the hood, it uses a data structure called HNSW: a graph that connects nearby vectors, so it can find nearest neighbors quickly without scanning every document. That makes vector search fast enough to run on millions of documents in milliseconds.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;But HNSW has an inefficiency. Early in the traversal, almost every distance computation finds a better candidate than the ones already in the result set.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;As the search goes on, those improvements become rarer, but the algorithm keeps traversing the graph until it exhausts its exploration budget. By that point, the result set has often already converged, and the remaining work does little or nothing to improve it. Early termination fixes this by detecting that point and stopping early.&lt;/p&gt;

&lt;p&gt;The effect becomes more noticeable as &lt;code&gt;k&lt;/code&gt; grows, where &lt;code&gt;k&lt;/code&gt; is the number of nearest neighbors the query asks Manticore to return. Returning more neighbors requires more graph exploration, and much of that extra work happens after the result set has already stabilized. That also makes early termination more valuable, because it has more unnecessary work to cut.&lt;/p&gt;

&lt;p&gt;The effect gets more pronounced with &lt;a href="https://manual.manticoresearch.com/Searching/KNN#Vector-quantization" rel="noopener noreferrer"&gt;vector quantization&lt;/a&gt;. Quantization compresses stored vectors to save memory, which slightly lowers search precision. To recover it, Manticore uses &lt;a href="https://manual.manticoresearch.com/Searching/KNN#KNN-vector-search" rel="noopener noreferrer"&gt;oversampling&lt;/a&gt;: by default it fetches 3x as many candidates as requested, then rescores them using the original full-precision vectors. As a result, HNSW explores many more candidates per query. Large &lt;code&gt;k&lt;/code&gt; values often come from this kind of candidate expansion: an application may ask the vector index for hundreds or thousands of candidates, then rescore, rerank, or filter them down to a much smaller final result set to improve recall and precision. That raises latency, and early termination helps win some of it back.&lt;/p&gt;

&lt;p&gt;The waste is measurable. Benchmarks on a 1M-vector dataset show that with &lt;code&gt;k=60&lt;/code&gt;, the value that results from the default result limit under the default 3x oversampling, early termination reduces distance computations to about 65% of the full search. At &lt;code&gt;k=1000&lt;/code&gt;, computations drop to about 30% of the full search, and at &lt;code&gt;k=10000&lt;/code&gt; to about 20%. The search converges long before the exploration budget runs out, and the savings grow with &lt;code&gt;k&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Early termination lets Manticore detect this convergence and stop. The algorithm was designed with a specific precision target: lose no more than 2-4% of result set precision compared to a full HNSW search.&lt;/p&gt;

&lt;h2&gt;
  
  
  How it works
&lt;/h2&gt;

&lt;p&gt;The algorithm tracks a simple signal, the discovery rate: the fraction of distance computations that actually improve the result set.&lt;/p&gt;

&lt;p&gt;Each time a new node's distance is computed, one of two things happens: either it's good enough to enter the heap - the priority queue that holds the current best candidate neighbors - or it's worse than everything already there and gets discarded. Entering the heap counts as a "discovery." Early in the search, discoveries are frequent - the heap is filling up and most candidates are useful. As the search progresses and the heap saturates with good results, discoveries become rare. Most new distance computations just confirm that the algorithm has already found the best candidates.&lt;/p&gt;

&lt;p&gt;Manticore monitors this transition. After each round of neighbor expansion, it computes the discovery rate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;discovery_rate &lt;span class="o"&gt;=&lt;/span&gt; new_candidates_collected / distances_computed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If this rate stays below a threshold for several rounds in a row, the search stops. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The idea is simple: if the algorithm keeps computing distances but nothing improves the result, the search has converged.&lt;/p&gt;
&lt;/blockquote&gt;
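
&lt;p&gt;As a small worked example of the formula above, the following Python sketch computes the discovery rate for a few rounds. The per-round counts are invented for illustration and do not come from a real traversal.&lt;/p&gt;

```python
def discovery_rate(new_candidates_collected, distances_computed):
    """Fraction of one round's distance computations that entered the heap."""
    if distances_computed == 0:
        return 0.0
    return new_candidates_collected / distances_computed


# (candidates that entered the heap, distances computed) for successive rounds
rounds = [(14, 16), (6, 16), (1, 16), (0, 16)]
rates = [discovery_rate(found, computed) for found, computed in rounds]
print(rates)  # [0.875, 0.375, 0.0625, 0.0]
```

&lt;p&gt;The rates fall as the heap saturates: the first round improves the result set with almost every computation, the last one with none.&lt;/p&gt;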

&lt;h2&gt;
  
  
  The threshold: quantile-based adaptation
&lt;/h2&gt;

&lt;p&gt;That raises the obvious next question: what threshold should count as "low"? A fixed threshold wouldn't work well - different datasets and different regions of the same dataset have wildly different discovery rate distributions. What counts as "low" depends on context.&lt;/p&gt;

&lt;p&gt;Manticore uses a quantile-based adaptive threshold. Instead of comparing the discovery rate against a fixed number, it continuously estimates a low percentile of the discovery rates observed in recent rounds (the 20th percentile, or the 14th percentile for L2 distance) and uses that as the baseline. This keeps the method lightweight while letting it adapt to different datasets and different regions of the graph.&lt;/p&gt;

&lt;p&gt;In other words, the threshold adapts to the local search pattern. If the algorithm enters a sparse region of the graph, the threshold drops and avoids stopping too early. If it enters a richer region, the threshold rises.&lt;/p&gt;
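
&lt;p&gt;One way to picture an adaptive threshold of this kind is a sliding window with a nearest-rank percentile. The sketch below is only an assumption about how such an estimator could look: the class name, the window size, and the nearest-rank method are hypothetical, and the estimator Manticore actually uses may differ.&lt;/p&gt;

```python
import math
from collections import deque


class QuantileThreshold:
    """Keeps recent discovery rates and reports a low percentile of them."""

    def __init__(self, quantile=0.20, window=32):
        self.quantile = quantile
        self.recent = deque(maxlen=window)  # old rounds fall out automatically

    def update(self, rate):
        self.recent.append(rate)

    def value(self):
        if not self.recent:
            return 0.0
        ordered = sorted(self.recent)
        # Nearest-rank percentile over the current window.
        rank = max(1, math.ceil(self.quantile * len(ordered)))
        return ordered[rank - 1]


def threshold_after(rates, quantile=0.20):
    """Feed a sequence of rates and return the resulting threshold."""
    tracker = QuantileThreshold(quantile=quantile)
    for rate in rates:
        tracker.update(rate)
    return tracker.value()


rich_region = [0.50, 0.40, 0.45, 0.35, 0.30]
sparse_region = [0.10, 0.08, 0.12, 0.05, 0.09]
print(threshold_after(rich_region), threshold_after(sparse_region))  # 0.3 0.05
```

&lt;p&gt;With the same quantile, the sparse region produces a much lower threshold than the rich one, so a rate that would count as low in a rich region is still treated as normal in a sparse one.&lt;/p&gt;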

&lt;h2&gt;
  
  
  Patience: how many bad rounds before stopping
&lt;/h2&gt;

&lt;p&gt;The threshold alone does not decide when to stop, though. A single round with a low discovery rate is not enough to declare convergence: it could just be a temporary dip before the search finds a better path. Manticore therefore uses a "patience counter" that requires multiple consecutive bad rounds before terminating.&lt;/p&gt;

&lt;p&gt;The patience value scales inversely with &lt;code&gt;ef&lt;/code&gt;, the HNSW exploration factor that controls how many candidates the search keeps exploring. For example, patience ranges from 9 at low &lt;code&gt;ef&lt;/code&gt; values down to 6 at very high &lt;code&gt;ef&lt;/code&gt;. Larger &lt;code&gt;ef&lt;/code&gt; values mean more total rounds, so even with lower patience the algorithm has seen more evidence before deciding to stop. The counter resets to zero whenever a round has a healthy discovery rate, so a single good round restarts the patience window. This prevents the algorithm from stopping during a temporary plateau that leads to a productive region of the graph.&lt;/p&gt;
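
&lt;p&gt;The reset behavior is easy to see in a short Python sketch. The class name and the fixed threshold are hypothetical, and the patience of 3 is chosen only to keep the example short; the values described above range from 6 to 9.&lt;/p&gt;

```python
class PatienceCounter:
    """Counts consecutive rounds whose discovery rate is below the threshold."""

    def __init__(self, patience):
        self.patience = patience
        self.bad_rounds = 0

    def observe(self, rate, threshold):
        """Record one round and return True when the search should stop."""
        if threshold > rate:
            self.bad_rounds += 1
        else:
            self.bad_rounds = 0  # a healthy round restarts the patience window
        return self.bad_rounds >= self.patience


counter = PatienceCounter(patience=3)
rates = [0.02, 0.01, 0.30, 0.02, 0.01, 0.02]
decisions = [counter.observe(rate, threshold=0.05) for rate in rates]
print(decisions)  # [False, False, False, False, False, True]
```

&lt;p&gt;The third round is healthy, so the two bad rounds before it are forgotten and the search stops only after three further bad rounds in a row.&lt;/p&gt;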

&lt;h2&gt;
  
  
  Warm-up phase
&lt;/h2&gt;

&lt;p&gt;The algorithm ignores the termination signal while the heap is still filling up, meaning fewer than &lt;code&gt;ef&lt;/code&gt; candidates have been collected. During this phase, discovery rates are artificially high because almost everything enters the heap, so the signal is not useful. Early termination only starts once the heap is full and new candidates must replace existing ones.&lt;/p&gt;

&lt;h2&gt;
  
  
  Benchmark results
&lt;/h2&gt;

&lt;p&gt;The quantile thresholds were tuned to keep precision loss within 2–4%. They were tuned separately for L2 and cosine/IP distance metrics, and validated across both &lt;a href="https://manual.manticoresearch.com/Searching/KNN#Vector-quantization" rel="noopener noreferrer"&gt;quantized and non-quantized&lt;/a&gt; data.&lt;/p&gt;

&lt;p&gt;The following benchmarks were run on the &lt;a href="https://huggingface.co/datasets/KShivendu/dbpedia-entities-openai-1M" rel="noopener noreferrer"&gt;dbpedia-entities&lt;/a&gt; dataset (1M vectors, 768 dimensions) on a machine with 8 physical cores / 16 logical cores.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"Precision" here means the fraction of true k-nearest neighbors that appear in the result set (with fixed k, this is the same as recall@k).&lt;/li&gt;
&lt;li&gt;"Precision ratio" is the precision of HNSW with early termination ("ET") divided by precision without it (1.0 means no precision loss). &lt;/li&gt;
&lt;li&gt;"Visit ratio" is the fraction of distance computations performed compared to full HNSW search (lower is better). &lt;/li&gt;
&lt;/ul&gt;
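
&lt;p&gt;The three metrics defined above can be written down as a short Python sketch. The function names and the toy ID lists are assumptions for illustration and are not benchmark data.&lt;/p&gt;

```python
def recall_at_k(returned_ids, true_ids):
    """Fraction of the true nearest neighbors present in the returned set."""
    true_set = set(true_ids)
    return len(true_set.intersection(returned_ids)) / len(true_set)


def precision_ratio(et_ids, full_ids, true_ids):
    """Precision with early termination divided by precision without it."""
    return recall_at_k(et_ids, true_ids) / recall_at_k(full_ids, true_ids)


def visit_ratio(et_distance_computations, full_distance_computations):
    """Share of the full search's distance computations that ET performed."""
    return et_distance_computations / full_distance_computations


true_neighbors = [1, 2, 3, 4]   # exact k nearest neighbors, k = 4
full_search = [1, 2, 3, 4]      # HNSW without early termination
early_stopped = [1, 2, 3, 9]    # HNSW with early termination, one miss

print(precision_ratio(early_stopped, full_search, true_neighbors))  # 0.75
print(visit_ratio(250, 1000))                                       # 0.25
```

&lt;p&gt;A precision ratio close to 1.0 combined with a low visit ratio is the desired outcome: nearly the same results for a fraction of the work.&lt;/p&gt;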

&lt;p&gt;&lt;a href="https://manual.manticoresearch.com/Searching/KNN#KNN-vector-search" rel="noopener noreferrer"&gt;Oversampling and rescoring&lt;/a&gt; were disabled to isolate the effect of early termination on raw HNSW traversal.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh7unr6rkgltv8ws2y2sr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh7unr6rkgltv8ws2y2sr.png" alt=" " width="800" height="461"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The green line on the chart (precision) stays almost flat across all &lt;code&gt;k&lt;/code&gt; values, with precision ratio remaining above 0.97 throughout the benchmark. Meanwhile, the orange line (visit ratio) drops steeply. At &lt;code&gt;k=100&lt;/code&gt;, early termination cuts distance computations nearly in half. At &lt;code&gt;k=1000&lt;/code&gt;, it saves about 70%, and at &lt;code&gt;k=10000&lt;/code&gt; about 80%.&lt;/p&gt;

&lt;p&gt;At &lt;code&gt;k &amp;lt;= 10&lt;/code&gt;, early termination is disabled because the search is already cheap and the savings are too small to justify any precision loss. The savings grow with &lt;code&gt;k&lt;/code&gt;, because larger result sets lead to more rounds of neighbor expansion and more chances to detect convergence early.&lt;/p&gt;

&lt;h2&gt;
  
  
  Performance under concurrent load
&lt;/h2&gt;

&lt;p&gt;The benchmarks above show that early termination cuts distance computations a lot while preserving precision. But what does that mean for actual query latency, especially under concurrent load? The chart below shows latency ratios (ET / no ET) at 1, 8, and 16 concurrent threads on the same dbpedia dataset:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx61vv2acc8ytv5jd7dsa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx61vv2acc8ytv5jd7dsa.png" alt=" " width="800" height="461"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At &lt;code&gt;k=1000&lt;/code&gt;, early termination reduces distance computations by 71% (ratio 0.29). The latency improvement depends on how many threads are running at the same time:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;1 thread:&lt;/strong&gt; 24% faster (ratio 0.76)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;8 threads:&lt;/strong&gt; 45% faster (ratio 0.55)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;16 threads:&lt;/strong&gt; 48% faster (ratio 0.52)&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;The distance computation savings stay the same regardless of thread count, but the latency benefit nearly doubles from 1 to 16 threads.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The main reason is lower pressure on the CPU memory system. Each distance computation pulls vector data and graph links into cache. When several threads run HNSW traversal at the same time, they compete for shared cache and memory bandwidth. Doing fewer distance computations per query reduces memory traffic, keeps each thread’s working set smaller, and lowers cache churn between queries. As a result, each thread finishes faster and interferes less with the others.&lt;/p&gt;

&lt;p&gt;Single-thread benchmarks understate the benefit of early termination. Under production-like concurrent load, the percentage latency reduction is roughly twice as large.&lt;/p&gt;
&lt;h2&gt;
  
  
  When early termination kicks in (and when it doesn't)
&lt;/h2&gt;

&lt;p&gt;Early termination is enabled by default and works on both quantized and non-quantized vector data. It is automatically disabled when &lt;code&gt;k &amp;lt;= 10&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The benefit grows with the effective exploration budget, which is &lt;code&gt;max(ef, k)&lt;/code&gt;. Since hnswlib uses this internally as the number of candidates it keeps in play, larger &lt;code&gt;k&lt;/code&gt; means more candidates, more rounds, and more chances to detect convergence.&lt;/p&gt;

&lt;p&gt;Quantized vectors are typically used with rescoring and oversampling (both enabled by default) to recover precision lost to quantization. Oversampling (default 3x) multiplies the effective &lt;code&gt;k&lt;/code&gt; passed to HNSW. For example, a query with &lt;code&gt;k=100&lt;/code&gt; uses 300 candidates internally when oversampling is 3x. That larger search budget gives early termination more room to detect convergence and stop early. Since the performance benefit of early termination grows with &lt;code&gt;k&lt;/code&gt;, oversampling pushes queries into the range where the savings are largest.&lt;/p&gt;
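
&lt;p&gt;The oversampling arithmetic can be made concrete with a query. This is a sketch, not a verified example: it assumes the &lt;code&gt;knn()&lt;/code&gt; form that takes an explicit &lt;code&gt;k&lt;/code&gt;, plus the &lt;code&gt;rescore&lt;/code&gt; and &lt;code&gt;oversampling&lt;/code&gt; options as described in Manticore's KNN documentation. The table, column, and vector are the placeholders used in the Syntax section:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Assumed syntax: explicit k=100 with the rescore and oversampling options.
-- With 3x oversampling, the HNSW search works with 100 * 3 = 300 candidates,
-- which is the budget early termination gets to work with.
SELECT id, knn_dist()
FROM products
WHERE knn(embedding, 100, (0.12, 0.45, 0.78, 0.33), { rescore=1, oversampling=3.0 });
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Since rescoring and 3x oversampling are already the defaults for quantized vectors, spelling the options out mainly makes the effective search budget visible in the query itself.&lt;/p&gt;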
&lt;h2&gt;
  
  
  Syntax
&lt;/h2&gt;

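&lt;p&gt;Early termination is on by default. To disable it, set &lt;code&gt;early_termination&lt;/code&gt; to &lt;code&gt;0&lt;/code&gt; in SQL or &lt;code&gt;false&lt;/code&gt; in JSON:&lt;/p&gt;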

&lt;p&gt;&lt;strong&gt;SQL:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- default: early termination enabled&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;knn_dist&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;45&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;78&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;33&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

&lt;span class="c1"&gt;-- explicitly disable early termination&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;knn_dist&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;45&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;78&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;33&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;early_termination&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;-- combine with other KNN options&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;knn_dist&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;45&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;78&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;33&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;ef&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;early_termination&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;JSON:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;POST&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/search&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"table"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"products"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"knn"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"field"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"embedding"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.45&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.78&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.33&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"early_termination"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  When to disable it
&lt;/h2&gt;

&lt;p&gt;There are a few scenarios where you might want to turn early termination off:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Maximum precision is critical.&lt;/strong&gt; Early termination trades a small amount of recall for speed. If your application requires the absolute best recall that HNSW can provide at a given &lt;code&gt;ef&lt;/code&gt;, disable it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Small k values (&amp;lt;= 30).&lt;/strong&gt; The algorithm auto-disables for &lt;code&gt;k &amp;lt;= 10&lt;/code&gt;, but even for &lt;code&gt;k&lt;/code&gt; between 11 and 30, the performance benefit is modest. If you notice any recall difference in this range, disabling early termination costs little in latency.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Benchmarking HNSW recall.&lt;/strong&gt; If you are measuring HNSW recall, you probably want deterministic behavior without adaptive shortcuts. Disable early termination to get a clean baseline.&lt;/li&gt;
&lt;/ul&gt;
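
&lt;p&gt;For the benchmarking case, one possible approach is to run the same query twice with an identical &lt;code&gt;ef&lt;/code&gt;, once with defaults and once with early termination disabled, and compare both result sets against your ground truth. The sketch below reuses the placeholder table and vector from the Syntax section; how the results are compared is left to your benchmarking harness:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- run A: default behavior (early termination enabled)
SELECT id, knn_dist()
FROM products
WHERE knn(embedding, (0.12, 0.45, 0.78, 0.33), { ef=200 });

-- run B: baseline with the same ef, early termination off
SELECT id, knn_dist()
FROM products
WHERE knn(embedding, (0.12, 0.45, 0.78, 0.33), { ef=200, early_termination=0 });
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Keeping &lt;code&gt;ef&lt;/code&gt; fixed across both runs means early termination is the only variable, so any recall difference can be attributed to it.&lt;/p&gt;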

&lt;h2&gt;
  
  
  How it relates to other KNN optimizations
&lt;/h2&gt;

&lt;p&gt;Early termination is one of several optimizations that Manticore applies to KNN search. It works independently of and stacks with the others:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://manticoresearch.com/blog/knn-prefiltering/" rel="noopener noreferrer"&gt;Prefiltering&lt;/a&gt; reduces wasted work by skipping filtered-out documents during HNSW traversal. Early termination reduces wasted work by stopping the traversal once the result set has converged. They solve different problems and work well together.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://manticoresearch.com/blog/quantization/#why-oversampling--rescoring-matters" rel="noopener noreferrer"&gt;Oversampling&lt;/a&gt; retrieves more candidates than &lt;code&gt;k&lt;/code&gt; to improve recall after rescoring. Early termination can reduce the cost of that expanded search by stopping once enough good candidates have been found.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://manticoresearch.com/blog/quantization/#why-oversampling--rescoring-matters" rel="noopener noreferrer"&gt;Rescoring&lt;/a&gt; recalculates distances using full-precision vectors after the initial search with quantized vectors. Early termination operates during the initial quantized search phase, reducing the number of candidates evaluated before rescoring kicks in.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automatic brute-force fallback&lt;/strong&gt; skips HNSW entirely when a linear scan is cheaper. Early termination only applies when HNSW is actually used.&lt;/li&gt;
&lt;/ul&gt;
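
&lt;p&gt;As a rough sketch of how these pieces can appear in one query, the example below combines a KNN search with an attribute filter. It rests on assumptions: &lt;code&gt;category_id&lt;/code&gt; is a hypothetical attribute, and combining &lt;code&gt;knn()&lt;/code&gt; with a filter through &lt;code&gt;AND&lt;/code&gt; follows Manticore's KNN documentation rather than anything verified here. Early termination is left at its default:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- category_id is a hypothetical attribute used only for illustration
SELECT id, knn_dist()
FROM products
WHERE knn(embedding, (0.12, 0.45, 0.78, 0.33), { ef=200 })
  AND category_id = 5;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Whether the filter is applied during traversal or afterwards, and whether HNSW is used at all instead of the brute-force fallback, is decided by the engine, not by the query text.&lt;/p&gt;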

</description>
      <category>algorithms</category>
      <category>database</category>
      <category>machinelearning</category>
      <category>performance</category>
    </item>
    <item>
      <title>How to Make xt850 Match xt 850</title>
      <dc:creator>Sergey Nikolaev</dc:creator>
      <pubDate>Fri, 08 May 2026 05:30:14 +0000</pubDate>
      <link>https://dev.to/sanikolaev/how-to-make-xt850-match-xt-850-o15</link>
      <guid>https://dev.to/sanikolaev/how-to-make-xt850-match-xt-850-o15</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;Since version &lt;code&gt;23.0.0&lt;/code&gt;, Manticore can make searches like &lt;code&gt;xt850&lt;/code&gt; match &lt;code&gt;xt 850&lt;/code&gt; using &lt;a href="https://manual.manticoresearch.com/dev/Creating_a_table/NLP_and_tokenization/Low-level_tokenization#bigram_delimiter" rel="noopener noreferrer"&gt;bigram_delimiter&lt;/a&gt; together with digit-aware &lt;a href="https://manual.manticoresearch.com/dev/Creating_a_table/NLP_and_tokenization/Low-level_tokenization#bigram_index" rel="noopener noreferrer"&gt;bigram_index&lt;/a&gt; modes.&lt;/p&gt;

&lt;p&gt;This solves a common tokenization mismatch in product search, where users remove spaces from model names but the source data stores them as separate tokens.&lt;/p&gt;

&lt;h2&gt;
  
  
  Assumptions and verification
&lt;/h2&gt;

&lt;p&gt;This article assumes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;RT tables created with SQL examples exactly as shown&lt;/li&gt;
&lt;li&gt;default tokenization unless the example explicitly changes a setting&lt;/li&gt;
&lt;li&gt;ASCII digits in model names, because &lt;code&gt;second_numeric&lt;/code&gt; and &lt;code&gt;second_has_digit&lt;/code&gt; are digit-aware modes built around &lt;code&gt;0-9&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All SQL examples and expected outputs in this article were verified against a real Manticore &lt;code&gt;23.0.0&lt;/code&gt; instance before publishing, using fresh tables created from scratch for each scenario.&lt;/p&gt;

&lt;h2&gt;
  
  
  The broader search problem
&lt;/h2&gt;

&lt;p&gt;Imagine a catalog containing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;xt 850 action camera&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;iphone 5se battery case&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;canon eos 80d body&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;thinkpad x1 carbon&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now imagine users searching for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;xt850&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;iphone5se&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;eos80d&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;thinkpadx1&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;From the user's point of view, these should obviously match. From the engine's point of view, they often do not, because the indexed text is tokenized as separate terms.&lt;/p&gt;

&lt;p&gt;Search systems usually attack that mismatch in one of four ways:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;index prefixes or infixes&lt;/li&gt;
&lt;li&gt;add custom normalization rules&lt;/li&gt;
&lt;li&gt;duplicate content into alternate normalized fields&lt;/li&gt;
&lt;li&gt;index adjacent token pairs and optionally store glued variants too&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Manticore's newer bigram functionality is a structured way to do the fourth option without awkward field duplication.&lt;/p&gt;

&lt;h2&gt;
  
  
  Baseline: why &lt;code&gt;xt850&lt;/code&gt; fails by default
&lt;/h2&gt;

&lt;p&gt;Here is the problem in its simplest form:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;DROP&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="n"&gt;bi_default_demo&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;bi_default_demo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;bi_default_demo&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'xt 850 action camera'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;bi_default_demo&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'xt850'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected result:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;Empty&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why does this fail?&lt;/p&gt;

&lt;p&gt;Because the document is indexed as two separate tokens, &lt;code&gt;xt&lt;/code&gt; and &lt;code&gt;850&lt;/code&gt;, while the query is a single token, &lt;code&gt;xt850&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;By default, Manticore does not assume that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;xt850&lt;/code&gt; should be split into &lt;code&gt;xt&lt;/code&gt; + &lt;code&gt;850&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;or &lt;code&gt;xt&lt;/code&gt; + &lt;code&gt;850&lt;/code&gt; should also be searchable as &lt;code&gt;xt850&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So this is not really a typo-tolerance problem or a phrase problem. It is a tokenization mismatch: the index sees two tokens, while the query provides one.&lt;/p&gt;
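
&lt;p&gt;For contrast, a spaced query against the same table is tokenized into &lt;code&gt;xt&lt;/code&gt; and &lt;code&gt;850&lt;/code&gt;, the same two tokens the index holds, so it should find the document. This query is an illustrative addition under default tokenization and is not among the article's verified examples:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- same baseline table, but the query keeps the space
SELECT id, title FROM bi_default_demo WHERE MATCH('xt 850');
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;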

&lt;p&gt;That is the gap the newer bigram settings are designed to close. They let Manticore index selected adjacent token pairs in a form that can also match glued queries.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why bigrams help here
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://manual.manticoresearch.com/dev/Creating_a_table/NLP_and_tokenization/Low-level_tokenization#bigram_index" rel="noopener noreferrer"&gt;bigram_index&lt;/a&gt; can help with both &lt;a href="https://dev.to/blog/how-to-speed-up-phrase-search-with-bigram-index/"&gt;phrase acceleration&lt;/a&gt; and model-name matching. This article focuses on the &lt;code&gt;xt 850&lt;/code&gt; vs &lt;code&gt;xt850&lt;/code&gt; problem.&lt;/p&gt;

&lt;p&gt;The key idea is simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;detect adjacent token pairs that look like model names&lt;/li&gt;
&lt;li&gt;store those pairs in a glued form too&lt;/li&gt;
&lt;li&gt;let queries such as &lt;code&gt;xt850&lt;/code&gt;, &lt;code&gt;iphone5se&lt;/code&gt;, or &lt;code&gt;thinkpadx1&lt;/code&gt; hit the spaced text&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is where &lt;a href="https://manual.manticoresearch.com/dev/Creating_a_table/NLP_and_tokenization/Low-level_tokenization#bigram_delimiter" rel="noopener noreferrer"&gt;bigram_delimiter&lt;/a&gt; matters.&lt;/p&gt;

&lt;h2&gt;
  
  
  A note about &lt;a href="https://manual.manticoresearch.com/dev/Creating_a_table/NLP_and_tokenization/Low-level_tokenization#bigram_delimiter" rel="noopener noreferrer"&gt;bigram_delimiter&lt;/a&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;bigram_index&lt;/code&gt; decides which adjacent pairs are eligible.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;bigram_delimiter&lt;/code&gt; decides how eligible bigrams are stored:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;true&lt;/code&gt;: internal delimited token only&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;none&lt;/code&gt;: glued token only, such as &lt;code&gt;galaxy24&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;both&lt;/code&gt;: both forms&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The practical difference is easiest to understand from the query side:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;with &lt;code&gt;true&lt;/code&gt;, Manticore keeps the internal bigram form used for phrase optimization, but it does not keep the glued user-facing form, so a query like &lt;code&gt;xt850&lt;/code&gt; will not match &lt;code&gt;xt 850&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;with &lt;code&gt;none&lt;/code&gt;, Manticore keeps only the glued form, so &lt;code&gt;xt850&lt;/code&gt; can match &lt;code&gt;xt 850&lt;/code&gt;, but you are leaning entirely on the glued representation for those pairs&lt;/li&gt;
&lt;li&gt;with &lt;code&gt;both&lt;/code&gt;, Manticore keeps both the internal bigram representation and the glued form, so &lt;code&gt;xt850&lt;/code&gt; can match &lt;code&gt;xt 850&lt;/code&gt; without giving up ordinary phrase behavior&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For this use case, &lt;code&gt;both&lt;/code&gt; is usually the safer default because it covers the user-visible problem directly while keeping behavior less surprising for normal phrase queries and mixed workloads.&lt;/p&gt;
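
&lt;p&gt;To try the &lt;code&gt;none&lt;/code&gt; variant yourself, a minimal sketch could look like the following. The table name &lt;code&gt;bi_delimiter_none_demo&lt;/code&gt; is hypothetical, the &lt;code&gt;second_numeric&lt;/code&gt; mode is explained in the next section, and this scenario is not among the article's verified examples, so treat the outcome as something to check rather than a guaranteed result:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;DROP TABLE IF EXISTS bi_delimiter_none_demo;

-- hypothetical demo table: glued form only
CREATE TABLE bi_delimiter_none_demo(title text)
  bigram_index='second_numeric'
  bigram_delimiter='none';

INSERT INTO bi_delimiter_none_demo VALUES
  (1,'xt 850 action camera');

-- per the description above, the glued form should let this query reach 'xt 850'
SELECT id, title FROM bi_delimiter_none_demo WHERE MATCH('xt850');
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Swapping &lt;code&gt;'none'&lt;/code&gt; for &lt;code&gt;'true'&lt;/code&gt; in the same script is a quick way to see the difference described in the list above.&lt;/p&gt;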

&lt;h2&gt;
  
  
  Mode 1: &lt;code&gt;second_numeric&lt;/code&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="py"&gt;bigram_index&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;second_numeric&lt;/span&gt;
&lt;span class="py"&gt;bigram_delimiter&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;both&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This mode is aimed at model names where the second token is purely numeric.&lt;/p&gt;

&lt;p&gt;That is common in product catalogs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;xt 850&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;galaxy 24&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;playstation 5&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;pixel 8&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The idea is simple: users often search these as glued terms such as &lt;code&gt;xt850&lt;/code&gt;, &lt;code&gt;galaxy24&lt;/code&gt;, or &lt;code&gt;playstation5&lt;/code&gt;, even though the source text stores them with a space.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;second_numeric&lt;/code&gt; stores the pair only when the second token consists entirely of ASCII digits.&lt;/p&gt;

&lt;p&gt;Use it when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;you have product generations and numbered models&lt;/li&gt;
&lt;li&gt;users often remove spaces in search&lt;/li&gt;
&lt;li&gt;the second token is usually just digits&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Example
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;DROP&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="n"&gt;bi_second_numeric_demo&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;bi_second_numeric_demo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;bigram_index&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'second_numeric'&lt;/span&gt;
  &lt;span class="n"&gt;bigram_delimiter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'both'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;bi_second_numeric_demo&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'xt 850 action camera'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'galaxy 24 ultra'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'playstation 5 slim'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'iphone 5se case'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'canon eos 80d body'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'thinkpad x1 carbon'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then test the queries one by one:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;bi_second_numeric_demo&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'xt850'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;------+----------------------+&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;   &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;                &lt;span class="o"&gt;|&lt;/span&gt;
&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;------+----------------------+&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt;    &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;xt&lt;/span&gt; &lt;span class="mi"&gt;850&lt;/span&gt; &lt;span class="n"&gt;action&lt;/span&gt; &lt;span class="n"&gt;camera&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;
&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;------+----------------------+&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;bi_second_numeric_demo&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'galaxy24'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;------+-----------------+&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;   &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;           &lt;span class="o"&gt;|&lt;/span&gt;
&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;------+-----------------+&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt;    &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;galaxy&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt; &lt;span class="n"&gt;ultra&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;
&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;------+-----------------+&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;bi_second_numeric_demo&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'playstation5'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;------+--------------------+&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;   &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;              &lt;span class="o"&gt;|&lt;/span&gt;
&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;------+--------------------+&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt;    &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;playstation&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="n"&gt;slim&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;
&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;------+--------------------+&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;bi_second_numeric_demo&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'iphone5se'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;Empty&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;bi_second_numeric_demo&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'eos80d'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;Empty&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;bi_second_numeric_demo&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'thinkpadx1'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;Empty&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That boundary is the whole point of the mode:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;850&lt;/code&gt;, &lt;code&gt;24&lt;/code&gt;, and &lt;code&gt;5&lt;/code&gt; qualify&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;5se&lt;/code&gt;, &lt;code&gt;80d&lt;/code&gt;, and &lt;code&gt;x1&lt;/code&gt; do not&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Mode 2: &lt;code&gt;second_has_digit&lt;/code&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="py"&gt;bigram_index&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;second_has_digit&lt;/span&gt;
&lt;span class="py"&gt;bigram_delimiter&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;both&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This mode is the more flexible sibling of &lt;code&gt;second_numeric&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;It stores the pair when the second token contains at least one ASCII digit. That makes it a much better fit for real product catalogs, where model identifiers are often mixed alphanumeric strings:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;xt 850&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;iphone 5se&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;eos 80d&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;thinkpad x1&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Use it when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;your model names mix letters and digits&lt;/li&gt;
&lt;li&gt;users frequently remove spaces in their searches&lt;/li&gt;
&lt;li&gt;you want catalog-friendly matching without indexing every pair in the table&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Example
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;DROP&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="n"&gt;bi_second_has_digit_demo&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;bi_second_has_digit_demo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;bigram_index&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'second_has_digit'&lt;/span&gt;
  &lt;span class="n"&gt;bigram_delimiter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'both'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;bi_second_has_digit_demo&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'xt 850 action camera'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'galaxy 24 ultra'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'playstation 5 slim'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'iphone 5se case'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'canon eos 80d body'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'thinkpad x1 carbon'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'kindle paperwhite signature'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then test the queries one by one:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;bi_second_has_digit_demo&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'xt850'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;------+----------------------+&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;   &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;                &lt;span class="o"&gt;|&lt;/span&gt;
&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;------+----------------------+&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt;    &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;xt&lt;/span&gt; &lt;span class="mi"&gt;850&lt;/span&gt; &lt;span class="n"&gt;action&lt;/span&gt; &lt;span class="n"&gt;camera&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;
&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;------+----------------------+&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;bi_second_has_digit_demo&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'galaxy24'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;------+-----------------+&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;   &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;           &lt;span class="o"&gt;|&lt;/span&gt;
&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;------+-----------------+&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt;    &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;galaxy&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt; &lt;span class="n"&gt;ultra&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;
&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;------+-----------------+&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;bi_second_has_digit_demo&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'iphone5se'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;------+---------------------+&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;   &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;               &lt;span class="o"&gt;|&lt;/span&gt;
&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;------+---------------------+&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt;    &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;iphone&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="n"&gt;se&lt;/span&gt; &lt;span class="k"&gt;case&lt;/span&gt;     &lt;span class="o"&gt;|&lt;/span&gt;
&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;------+---------------------+&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;bi_second_has_digit_demo&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'eos80d'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;------+---------------------+&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;   &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;               &lt;span class="o"&gt;|&lt;/span&gt;
&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;------+---------------------+&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt;    &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;canon&lt;/span&gt; &lt;span class="n"&gt;eos&lt;/span&gt; &lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt;  &lt;span class="o"&gt;|&lt;/span&gt;
&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;------+---------------------+&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;bi_second_has_digit_demo&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'thinkpadx1'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;------+---------------------+&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;   &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;               &lt;span class="o"&gt;|&lt;/span&gt;
&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;------+---------------------+&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt;    &lt;span class="mi"&gt;6&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;thinkpad&lt;/span&gt; &lt;span class="n"&gt;x1&lt;/span&gt; &lt;span class="n"&gt;carbon&lt;/span&gt;  &lt;span class="o"&gt;|&lt;/span&gt;
&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;------+---------------------+&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;bi_second_has_digit_demo&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'kindlesignature'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;Empty&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



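&lt;p&gt;The last query returns nothing: row 7 has no token containing a digit, so no pair from it was stored. Compared with &lt;code&gt;second_numeric&lt;/code&gt;, this mode is often the better fit for mixed model identifiers, because real catalog data frequently includes forms like &lt;code&gt;5se&lt;/code&gt;, &lt;code&gt;80d&lt;/code&gt;, or &lt;code&gt;x1&lt;/code&gt; rather than only clean numeric suffixes like &lt;code&gt;24&lt;/code&gt;.&lt;/p&gt;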

&lt;h2&gt;
  
  
  How to choose between the two
&lt;/h2&gt;

&lt;p&gt;If your search problem is specifically "How do I make &lt;code&gt;xt850&lt;/code&gt; find &lt;code&gt;xt 850&lt;/code&gt;?", the practical rule is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;use &lt;code&gt;second_numeric&lt;/code&gt; when the second token is digits-only&lt;/li&gt;
&lt;li&gt;use &lt;code&gt;second_has_digit&lt;/code&gt; when the second token may be mixed, like &lt;code&gt;5se&lt;/code&gt;, &lt;code&gt;80d&lt;/code&gt;, or &lt;code&gt;x1&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
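
&lt;p&gt;As a rough sketch of that rule, the two tables below differ only in the &lt;code&gt;bigram_index&lt;/code&gt; mode. The table names and sample rows are illustrative, pairing &lt;code&gt;second_numeric&lt;/code&gt; with &lt;code&gt;bigram_delimiter='both'&lt;/code&gt; is an assumption, and the outcomes noted in the comments are what the mode descriptions imply rather than captured output, so verify them on your own build:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;/* Hypothetical demo tables: same data, different digit-aware mode */
CREATE TABLE choose_numeric_demo(title text)
  bigram_index='second_numeric'
  bigram_delimiter='both';

CREATE TABLE choose_has_digit_demo(title text)
  bigram_index='second_has_digit'
  bigram_delimiter='both';

INSERT INTO choose_numeric_demo VALUES
  (1,'galaxy 24 ultra'),
  (2,'iphone 5se case');

INSERT INTO choose_has_digit_demo VALUES
  (1,'galaxy 24 ultra'),
  (2,'iphone 5se case');

/* Digits-only second token: both modes are expected to store the pair */
SELECT id, title FROM choose_numeric_demo WHERE MATCH('galaxy24');
SELECT id, title FROM choose_has_digit_demo WHERE MATCH('galaxy24');

/* Mixed second token: only second_has_digit is expected to store the pair */
SELECT id, title FROM choose_numeric_demo WHERE MATCH('iphone5se');
SELECT id, title FROM choose_has_digit_demo WHERE MATCH('iphone5se');
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;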

&lt;p&gt;One practical note on other common text-processing settings: in the straightforward case, glued matching stays compatible with them. A document containing &lt;code&gt;xt 850&lt;/code&gt; is still found by the query &lt;code&gt;xt850&lt;/code&gt; with &lt;code&gt;morphology='stem_en'&lt;/code&gt; enabled and with a wordforms rule enabled.&lt;/p&gt;

&lt;p&gt;But that does not mean those settings rewrite the glued query for you. In tests, a document containing &lt;code&gt;iphones 5&lt;/code&gt; was found by the query &lt;code&gt;iphones5&lt;/code&gt;, but not by &lt;code&gt;iphone5&lt;/code&gt;, even with stemming or a wordforms rule mapping &lt;code&gt;iphones&lt;/code&gt; to &lt;code&gt;iphone&lt;/code&gt;. In short: basic &lt;code&gt;xt 850&lt;/code&gt; vs &lt;code&gt;xt850&lt;/code&gt; matching stays compatible with morphology and wordforms, but if you rely on them, test the exact query shape you care about.&lt;/p&gt;
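
&lt;p&gt;A minimal sketch for running that kind of check yourself might look like this. The table name and row are hypothetical, and the choice of &lt;code&gt;second_has_digit&lt;/code&gt; is an assumption; the comments repeat the test results reported above rather than new measurements:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;DROP TABLE IF EXISTS bi_morph_check_demo;

/* Hypothetical table: digit-aware bigrams plus English stemming */
CREATE TABLE bi_morph_check_demo(title text)
  bigram_index='second_has_digit'
  bigram_delimiter='both'
  morphology='stem_en';

INSERT INTO bi_morph_check_demo VALUES
  (1,'iphones 5 case');

/* Glued form of the stored tokens: reported above as matching */
SELECT id, title FROM bi_morph_check_demo WHERE MATCH('iphones5');

/* Glued form of the stemmed word: reported above as not matching */
SELECT id, title FROM bi_morph_check_demo WHERE MATCH('iphone5');
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;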

&lt;h2&gt;
  
  
  Final takeaway
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;xt850&lt;/code&gt; problem is not really about one product name. It is about a broader mismatch between how users type model names and how search engines tokenize them.&lt;/p&gt;

&lt;p&gt;Since version &lt;code&gt;23.0.0&lt;/code&gt;, Manticore gives you a built-in way to handle that mismatch with &lt;code&gt;bigram_delimiter&lt;/code&gt; plus the digit-aware &lt;code&gt;bigram_index&lt;/code&gt; modes, which is much cleaner than duplicating fields or inventing custom preprocessing pipelines.&lt;/p&gt;

&lt;p&gt;If your main problem is phrase-search performance rather than glued model-name matching, see &lt;a href="https://manticoresearch.com/blog/how-to-speed-up-phrase-search-with-bigram-index/" rel="noopener noreferrer"&gt;How to Speed Up Phrase Search with bigram_index&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>database</category>
      <category>nlp</category>
      <category>sql</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>How to Speed Up Phrase Search with bigram_index</title>
      <dc:creator>Sergey Nikolaev</dc:creator>
      <pubDate>Thu, 07 May 2026 08:50:15 +0000</pubDate>
      <link>https://dev.to/sanikolaev/how-to-speed-up-phrase-search-with-bigramindex-l4f</link>
      <guid>https://dev.to/sanikolaev/how-to-speed-up-phrase-search-with-bigramindex-l4f</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://manual.manticoresearch.com/Creating_a_table/NLP_and_tokenization/Low-level_tokenization#bigram_index" rel="noopener noreferrer"&gt;bigram_index&lt;/a&gt; can be used for several purposes. This article focuses specifically on phrase-search performance: on the 1M-document benchmark below, &lt;code&gt;bigram_index='all'&lt;/code&gt; improved QPS by about &lt;code&gt;2.9x&lt;/code&gt; and cut average phrase-query latency by about &lt;code&gt;3.2x&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;If your main problem is matching &lt;code&gt;xt850&lt;/code&gt; against &lt;code&gt;xt 850&lt;/code&gt; rather than speeding up phrase search, see &lt;a href="https://manticoresearch.com/blog/how-to-make-searches-like-xt850-match-xt-850/" rel="noopener noreferrer"&gt;How to Make xt850 Match xt 850&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Phrase search can be expensive. Even when a query is short, the engine still has to verify ordering and adjacency, and that work gets more noticeable when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the individual words are common&lt;/li&gt;
&lt;li&gt;the dataset is large&lt;/li&gt;
&lt;li&gt;phrase queries are frequent in your workload&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is exactly what &lt;a href="https://manual.manticoresearch.com/Creating_a_table/NLP_and_tokenization/Low-level_tokenization#bigram_index" rel="noopener noreferrer"&gt;bigram_index&lt;/a&gt; is for.&lt;/p&gt;

&lt;h2&gt;
  
  
  What bigram indexing actually does
&lt;/h2&gt;

&lt;p&gt;Normally, a phrase like &lt;code&gt;"noise cancelling headphones"&lt;/code&gt; is handled as separate tokens that also need to appear in the right order and next to each other. Bigram indexing lets Manticore pre-store adjacent token pairs such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;noise cancelling&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;cancelling headphones&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That gives the engine a faster way to narrow down candidate documents during phrase matching.&lt;/p&gt;

&lt;p&gt;This article focuses specifically on phrase acceleration.&lt;/p&gt;

&lt;h2&gt;
  
  
  Important caveat: bigrams work at tokenization level
&lt;/h2&gt;

&lt;p&gt;This is the part that is easy to miss when you only look at the happy-path speedup story.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;bigram_index&lt;/code&gt; works at the tokenization level only. It does not account for later transformations such as morphology, wordforms, or stopwords, and that can materially change phrase-matching expectations.&lt;/p&gt;

&lt;p&gt;The practical conclusion is simple: bigrams can be excellent for phrase speed, but if your index relies heavily on morphology, wordforms, or stopwords, test the actual phrase behavior you care about before rolling the setting out broadly.&lt;/p&gt;
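
&lt;p&gt;One way to run such a test is sketched below, assuming a hypothetical table that combines &lt;code&gt;bigram_index='all'&lt;/code&gt; with &lt;code&gt;morphology='stem_en'&lt;/code&gt;. No expected output is shown on purpose: the point is to compare the results with the same queries on a table created without &lt;code&gt;bigram_index&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;DROP TABLE IF EXISTS bi_morph_phrase_demo;

/* Hypothetical table: every adjacent pair indexed, English stemming enabled */
CREATE TABLE bi_morph_phrase_demo(title text)
  bigram_index='all'
  morphology='stem_en';

INSERT INTO bi_morph_phrase_demo VALUES
  (1,'noise cancelling headphones');

/* Phrase with the same word forms as the stored text */
SELECT id, title FROM bi_morph_phrase_demo WHERE MATCH('"noise cancelling"');

/* Phrase with a different word form: check whether it behaves as you expect */
SELECT id, title FROM bi_morph_phrase_demo WHERE MATCH('"noise cancelled"');
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;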

&lt;h2&gt;
  
  
  Mode 1: Default behavior
&lt;/h2&gt;

&lt;p&gt;This is the baseline. No explicit bigram indexing is enabled (the equivalent of &lt;code&gt;bigram_index='none'&lt;/code&gt;), so no bigram posting lists are stored.&lt;/p&gt;

&lt;p&gt;Use it when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;phrase search is rare&lt;/li&gt;
&lt;li&gt;documents are short&lt;/li&gt;
&lt;li&gt;you want the leanest indexing path&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Example
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;DROP&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="n"&gt;bi_none_demo&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;bi_none_demo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;bi_none_demo&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'wireless noise cancelling headphones'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'noise cancelling microphone'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'wireless gaming headset'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;bi_none_demo&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'"noise cancelling"'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The query returns rows 1 and 2, but Manticore has no precomputed bigram posting lists to help, so it has to resolve the phrase from the positions of the individual words.&lt;/p&gt;

&lt;h2&gt;
  
  
  Mode 2: &lt;code&gt;all&lt;/code&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="py"&gt;bigram_index&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;all&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the most aggressive phrase-acceleration mode. Every adjacent token pair gets indexed as a bigram.&lt;/p&gt;

&lt;p&gt;Use it when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;exact phrase search is a core feature&lt;/li&gt;
&lt;li&gt;phrase queries often include common words and produce many candidates&lt;/li&gt;
&lt;li&gt;you want the strongest phrase acceleration&lt;/li&gt;
&lt;li&gt;you do not want to tune a frequent-word list&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Example
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;DROP&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="n"&gt;bi_all_demo&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;bi_all_demo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;bigram_index&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'all'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;bi_all_demo&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'lord of the rings trilogy'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'house of the dragon season 2'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'made for iphone charger'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;bi_all_demo&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'"house of the dragon"'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;bi_all_demo&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'"made for iphone"'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The point here is not that the matches differ, but that the indexing strategy does: &lt;code&gt;all&lt;/code&gt; stores every adjacent pair, so phrase queries have the maximum amount of bigram help available at search time.&lt;/p&gt;

&lt;p&gt;Choose &lt;code&gt;all&lt;/code&gt; when phrase search is expensive because many documents match the individual words, which forces Manticore to do more positional verification to confirm the exact phrase. &lt;code&gt;all&lt;/code&gt; helps by narrowing candidates earlier.&lt;/p&gt;

&lt;h2&gt;
  
  
  Mode 3: &lt;code&gt;first_freq&lt;/code&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="py"&gt;bigram_index&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;first_freq&lt;/span&gt;
&lt;span class="py"&gt;bigram_freq_words&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;for, of, the, with&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This mode stores a pair only when the first token is in your frequent-word list.&lt;/p&gt;

&lt;p&gt;Use it when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;phrase search matters&lt;/li&gt;
&lt;li&gt;you want a lighter alternative to &lt;code&gt;all&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;many phrases in your data contain words that are genuinely frequent in your own corpus&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With the list above:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;for iphone&lt;/code&gt; is eligible&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;of the&lt;/code&gt; is eligible&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;the dragon&lt;/code&gt; is eligible&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;made for&lt;/code&gt; is not eligible&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;lord of&lt;/code&gt; is not eligible&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For production use, do not pick &lt;code&gt;bigram_freq_words&lt;/code&gt; from memory. Derive it from your own data. A practical way is to dump dictionary stats with &lt;a href="https://manual.manticoresearch.com/Miscellaneous_tools#indextool" rel="noopener noreferrer"&gt;indextool&lt;/a&gt; using &lt;code&gt;--dumpdict ... --stats&lt;/code&gt;, review the most frequent tokens, and then build a small &lt;code&gt;bigram_freq_words&lt;/code&gt; list from those results.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;DROP&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="n"&gt;bi_first_freq_demo&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;bi_first_freq_demo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;bigram_index&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'first_freq'&lt;/span&gt;
  &lt;span class="n"&gt;bigram_freq_words&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'for,of,the,with'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;bi_first_freq_demo&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'made for iphone charger'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'lord of the rings trilogy'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'house of the dragon season 2'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;bi_first_freq_demo&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'"made for iphone"'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;bi_first_freq_demo&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'"lord of the"'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The queries still return the expected rows. What changes is which pairs get indexed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;"made for iphone"&lt;/code&gt; benefits from &lt;code&gt;for iphone&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;"lord of the"&lt;/code&gt; benefits from &lt;code&gt;of the&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This makes &lt;code&gt;first_freq&lt;/code&gt; a lighter alternative to &lt;code&gt;all&lt;/code&gt; when many useful phrases involve common bridge words.&lt;/p&gt;

&lt;h2&gt;
  
  
  Mode 4: &lt;code&gt;both_freq&lt;/code&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="py"&gt;bigram_index&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;both_freq&lt;/span&gt;
&lt;span class="py"&gt;bigram_freq_words&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;for, of, the, with&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the narrowest frequency-based mode. A pair is stored only when both tokens are in the frequent-word list.&lt;/p&gt;

&lt;p&gt;Use it when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;you want the most conservative bigram footprint&lt;/li&gt;
&lt;li&gt;you mainly care about pairs built from words that are highly frequent in your corpus&lt;/li&gt;
&lt;li&gt;you are tuning a large corpus and do not want to index every adjacent pair&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With the same list:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;of the&lt;/code&gt; is eligible&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;for iphone&lt;/code&gt; is not eligible&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;the dragon&lt;/code&gt; is not eligible&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Example
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;DROP&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="n"&gt;bi_both_freq_demo&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;bi_both_freq_demo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;bigram_index&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'both_freq'&lt;/span&gt;
  &lt;span class="n"&gt;bigram_freq_words&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'for,of,the,with'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;bi_both_freq_demo&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'lord of the rings trilogy'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'house of the dragon season 2'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'made for iphone charger'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;bi_both_freq_demo&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'"lord of the"'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;bi_both_freq_demo&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'"made for iphone"'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The queries still return the expected rows. What differs is which pairs are stored:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;"lord of the"&lt;/code&gt; includes &lt;code&gt;of the&lt;/code&gt;, which &lt;code&gt;both_freq&lt;/code&gt; is willing to store&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;"made for iphone"&lt;/code&gt; includes &lt;code&gt;for iphone&lt;/code&gt;, which &lt;code&gt;first_freq&lt;/code&gt; would cover but &lt;code&gt;both_freq&lt;/code&gt; would not&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Which performance mode should you choose?
&lt;/h2&gt;

&lt;p&gt;The benchmark in this article shows that &lt;code&gt;all&lt;/code&gt; can deliver a strong speedup, but it is still just one benchmark on one workload.&lt;/p&gt;

&lt;p&gt;Manticore's own documentation says that for most use cases, &lt;code&gt;both_freq&lt;/code&gt; is the best mode. That is a sensible default because it aims for a more balanced trade-off between phrase acceleration and indexing cost.&lt;/p&gt;

&lt;p&gt;Use the modes like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;choose &lt;code&gt;both_freq&lt;/code&gt; as the default starting point for general phrase-search workloads&lt;/li&gt;
&lt;li&gt;choose &lt;code&gt;all&lt;/code&gt; when phrase search is especially important and you want the strongest acceleration, accepting higher indexing cost&lt;/li&gt;
&lt;li&gt;choose &lt;code&gt;first_freq&lt;/code&gt; when many useful phrases in your data involve common bridge words and you want something broader than &lt;code&gt;both_freq&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;keep the default (&lt;code&gt;none&lt;/code&gt;, no bigram indexing) when phrase acceleration is not important&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Benchmark: does bigram indexing really speed up phrase search?
&lt;/h2&gt;

&lt;p&gt;Yes. In a simple local benchmark, the difference was easy to measure.&lt;/p&gt;

&lt;p&gt;I used &lt;code&gt;manticore-load&lt;/code&gt; to build two 1M-document tables against the same Manticore instance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;one with no explicit &lt;code&gt;bigram_index&lt;/code&gt; setting&lt;/li&gt;
&lt;li&gt;one with &lt;code&gt;bigram_index='all'&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The documents were random 60-80 word texts, and the benchmark repeatedly ran random 2-word phrase queries.&lt;/p&gt;

&lt;p&gt;For clarity, both indexing and search were run with &lt;code&gt;--threads=1&lt;/code&gt;. Multi-threaded throughput would of course be higher, but single-thread runs make it easier to see what the feature changes on one CPU core.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;bench_bigram_&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'"&amp;lt;text/2/2&amp;gt;"'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Benchmark setup
&lt;/h3&gt;

&lt;p&gt;Data load without bigrams:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;manticore-load &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--drop&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--wait&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--threads&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--batch-size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1000 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--total&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1000000 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--init&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"CREATE TABLE bench_bigram_none_rand(title text)"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--load&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"INSERT INTO bench_bigram_none_rand(id,title) VALUES(&amp;lt;increment&amp;gt;,'&amp;lt;text/60/80&amp;gt;')"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Data load with all bigrams:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;manticore-load &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--drop&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--wait&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--threads&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--batch-size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1000 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--total&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1000000 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--init&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"CREATE TABLE bench_bigram_all_rand(title text) bigram_index='all'"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--load&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"INSERT INTO bench_bigram_all_rand(id,title) VALUES(&amp;lt;increment&amp;gt;,'&amp;lt;text/60/80&amp;gt;')"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Search benchmark without bigrams:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;manticore-load &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--threads&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--total&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;5000 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--load&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"SELECT COUNT(*) FROM bench_bigram_none_rand WHERE MATCH('&lt;/span&gt;&lt;span class="se"&gt;\\\"&lt;/span&gt;&lt;span class="s2"&gt;&amp;lt;text/2/2&amp;gt;&lt;/span&gt;&lt;span class="se"&gt;\\\"&lt;/span&gt;&lt;span class="s2"&gt;')"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Search benchmark with all bigrams:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;manticore-load &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--threads&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--total&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;5000 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--load&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"SELECT COUNT(*) FROM bench_bigram_all_rand WHERE MATCH('&lt;/span&gt;&lt;span class="se"&gt;\\\"&lt;/span&gt;&lt;span class="s2"&gt;&amp;lt;text/2/2&amp;gt;&lt;/span&gt;&lt;span class="se"&gt;\\\"&lt;/span&gt;&lt;span class="s2"&gt;')"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  What I observed
&lt;/h3&gt;

&lt;p&gt;On this local run:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Table&lt;/th&gt;
&lt;th&gt;QPS&lt;/th&gt;
&lt;th&gt;Avg latency&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;bench_bigram_none_rand&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;755&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;1.3 ms&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;bench_bigram_all_rand&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;2175&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;0.4 ms&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That is roughly &lt;code&gt;2.9x&lt;/code&gt; higher QPS and, going by the rounded latencies in the table, about &lt;code&gt;3.2x&lt;/code&gt; lower average latency on the same 1M-document workload.&lt;/p&gt;

&lt;p&gt;Indexing was slower with &lt;code&gt;bigram_index='all'&lt;/code&gt;, which is expected:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;without bigrams: about &lt;code&gt;45k docs/sec&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;with &lt;code&gt;all&lt;/code&gt;: about &lt;code&gt;17k docs/sec&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That trade-off is exactly why multiple modes exist.&lt;/p&gt;
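
&lt;p&gt;The ratios can be rechecked from the reported figures. The sketch below is plain arithmetic in Python on the numbers from this one run. The last two lines rest on an assumption that is not part of the benchmark output: with a single client thread issuing one query at a time, average latency should be close to 1000 / QPS milliseconds.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Figures from the single local run reported above.
qps_none, qps_all = 755, 2175
latency_none_ms, latency_all_ms = 1.3, 0.4
docs_per_sec_none, docs_per_sec_all = 45_000, 17_000  # approximate

print(f"QPS speedup: {qps_all / qps_none:.2f}x")
print(f"Latency ratio from the table values: {latency_none_ms / latency_all_ms:.2f}x")
print(f"Indexing slowdown: {docs_per_sec_none / docs_per_sec_all:.2f}x")

# Assumption: one thread, one query at a time, so latency is about 1000 / QPS ms.
print(f"Implied latency without bigrams: {1000 / qps_none:.2f} ms")
print(f"Implied latency with all: {1000 / qps_all:.2f} ms")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Under that assumption the implied latencies come out near 1.32 ms and 0.46 ms, which suggests the one-decimal latency column is coarse and that the QPS ratio is the steadier of the two numbers. Indexing, by the same arithmetic, was about 2.6x slower.&lt;/p&gt;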

&lt;h2&gt;
  
  
  Final takeaway
&lt;/h2&gt;

&lt;p&gt;If your main problem is phrase-search performance, treat &lt;code&gt;bigram_index&lt;/code&gt; first and foremost as an acceleration feature.&lt;/p&gt;

&lt;p&gt;For most real workloads, start with &lt;code&gt;both_freq&lt;/code&gt; and measure. Move to &lt;code&gt;all&lt;/code&gt; if you need a stronger effect and can afford the extra indexing cost. Consider &lt;code&gt;first_freq&lt;/code&gt; when your phrase workload is heavily shaped by common bridge words.&lt;/p&gt;

</description>
      <category>database</category>
      <category>nlp</category>
      <category>performance</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Build a Searchable Catalog with Filters, Facets, and Semantic Search</title>
      <dc:creator>Sergey Nikolaev</dc:creator>
      <pubDate>Wed, 06 May 2026 06:17:07 +0000</pubDate>
      <link>https://dev.to/sanikolaev/build-a-searchable-catalog-with-filters-facets-and-semantic-search-24bm</link>
      <guid>https://dev.to/sanikolaev/build-a-searchable-catalog-with-filters-facets-and-semantic-search-24bm</guid>
      <description>&lt;p&gt;A search box is easy. A searchable catalog that keeps being useful after the first query is the harder part.&lt;/p&gt;

&lt;p&gt;That is the problem this demo takes on. It uses a small board-game catalog, but the shape of the problem is familiar: users type something half-remembered, misspell it, narrow by constraints, keep browsing, open a result, then want "more like this" without starting over. If your product has that flow, most of the work is not the UI polish. It is getting the search behavior right without turning the stack into a science project.&lt;/p&gt;

&lt;p&gt;In this article, we build a searchable catalog with autocomplete, typo tolerance, filters, facets, deep pagination, semantic search, and similar-item recommendations.&lt;/p&gt;

&lt;p&gt;You can try the hosted version first:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://catalog.manticoresearch.com" rel="noopener noreferrer"&gt;https://catalog.manticoresearch.com&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffn97e2bt8zs5ckipsuqm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffn97e2bt8zs5ckipsuqm.png" alt=" " width="800" height="434"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The app itself is implemented in PHP, but that is not really the story here. The interesting part is how little ceremony you need to get from a basic query box to something that already feels like a working catalog: search, filters, facets, and similar-item discovery all show up quickly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Run it locally
&lt;/h2&gt;

&lt;p&gt;To run the same demo locally, you only need PHP 8.1+, Composer, and Docker (or any other way to run Manticore).&lt;/p&gt;

&lt;p&gt;In this setup, Manticore is the search engine behind the catalog: it handles indexing, filtering, faceting, and semantic retrieval. The repo already includes a Docker setup for it, so the quickest way to get the demo running is to clone the repo and start Manticore from the project root:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/manticoresoftware/php-catalog-demo
&lt;span class="nb"&gt;cd &lt;/span&gt;php-catalog-demo
docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;docker compose ps&lt;/code&gt; should show the container as running.&lt;/p&gt;

&lt;p&gt;Inside the cloned repo, create the app environment file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cp &lt;/span&gt;app/.env.example app/.env
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For a local run, the important part is just how the app reaches Manticore:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;MANTICORE_HOST=127.0.0.1
MANTICORE_PORT=9308
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Install dependencies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;app
composer &lt;span class="nb"&gt;install&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The demo reads those settings and creates a Manticore client:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Manticoresearch\Client&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nv"&gt;$settings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;require&lt;/span&gt; &lt;span class="nv"&gt;$root&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="s1"&gt;'/config/settings.php'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nv"&gt;$client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="s1"&gt;'host'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$settings&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'manticore'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;'host'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="s1"&gt;'port'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$settings&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'manticore'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;'port'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="s1"&gt;'transport'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Http'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then load the demo dataset:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;php bin/bootstrap-demo.php
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That command recreates the demo table and imports the starter catalog, so you begin from a known state instead of debugging old data.&lt;/p&gt;

&lt;p&gt;Start the app:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;php &lt;span class="nt"&gt;-S&lt;/span&gt; localhost:8081 &lt;span class="nt"&gt;-t&lt;/span&gt; public
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open &lt;code&gt;http://localhost:8081/&lt;/code&gt; and you have a working catalog to search.&lt;/p&gt;

&lt;p&gt;Not glamorous. Still worth it. A lot of search demos lose people before the first query because setup sprawls. This one does not need much.&lt;/p&gt;

&lt;h2&gt;
  
  
  What makes the app feel usable
&lt;/h2&gt;

&lt;p&gt;The part I care about most is not that the demo returns results. Plenty of demos do that. It is that the search flow holds together as users get more specific.&lt;/p&gt;

&lt;h3&gt;
  
  
  Start with autocomplete
&lt;/h3&gt;

&lt;p&gt;People usually begin with fragments. Sometimes they remember the exact game title. Often they do not.&lt;/p&gt;

&lt;p&gt;So the first layer is autocomplete:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="s1"&gt;'body'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s1"&gt;'query'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$term&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'table'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;tableName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'options'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'limit'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$limit&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'force_bigrams'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="nv"&gt;$suggestions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;autocomplete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$payload&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;force_bigrams&lt;/code&gt; makes the typo-tolerant matcher work with two-character n-grams for words of any length. That tightens matching for short or slightly wrong input, which is exactly where autocomplete can otherwise get mushy.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fla04xej74ti60i55veya.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fla04xej74ti60i55veya.gif" alt=" " width="600" height="407"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is a small feature, but it changes the feel of the app immediately. Users stop guessing what your catalog calls things.&lt;/p&gt;

&lt;h3&gt;
  
  
  Make the first results page forgiving
&lt;/h3&gt;

&lt;p&gt;Once the query is submitted, the first page needs to be useful even when the spelling is off by a bit.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$search&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;setTable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;tableName&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$limit&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$query&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$search&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$query&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$fuzzy&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$search&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;option&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'fuzzy'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;option&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'force_bigrams'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$search&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'*'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Fuzzy mode is doing plain practical work here: recovering close matches when users do not type the title exactly right.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4tuup98snh18vufjy3tf.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4tuup98snh18vufjy3tf.gif" alt=" " width="720" height="488"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you want the lower-level details, see &lt;a href="https://manual.manticoresearch.com/Searching/Spell_correction#Fuzzy-Search" rel="noopener noreferrer"&gt;Spell correction and fuzzy search&lt;/a&gt;.&lt;/p&gt;
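
&lt;p&gt;"Off by a bit" is commonly formalized as edit distance: the number of single-character insertions, deletions, or substitutions needed to turn one string into another. The Python sketch below is the textbook Levenshtein algorithm, shown only to illustrate the idea. It does not describe how Manticore's fuzzy mode is implemented, and the titles are sample values, not the demo dataset.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep when equal
            ))
        previous = current
    return previous[-1]


# Sample titles (hypothetical) and a misspelled query.
titles = ["catan", "carcassonne", "azul"]
typed = "carcasone"
print(min(titles, key=lambda title: edit_distance(typed, title)))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here the misspelled input is two edits away from &lt;code&gt;carcassonne&lt;/code&gt; and much further from the other titles, so the closest title is the one the user most likely meant.&lt;/p&gt;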

&lt;h3&gt;
  
  
  Let users narrow without rewriting
&lt;/h3&gt;

&lt;p&gt;This is where many search interfaces get annoying. The query is close enough, but the result set is still too broad, so now the user has to reformulate it from scratch.&lt;/p&gt;

&lt;p&gt;Better to let them narrow in place.&lt;/p&gt;

&lt;p&gt;Range filters handle constraints like price, player count, play time, and release year. Facets expose the shape of the current result set so users can click into categories or tags instead of thinking up a more precise sentence.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$attributeFilters&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="s1"&gt;'price_min'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$priceMin&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;'price_max'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$priceMax&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;'play_time_min'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$playTimeMin&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;'play_time_max'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$playTimeMax&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;'player_count_min'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$playerCountMin&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;'player_count_max'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$playerCountMax&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;'release_year_min'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$yearMin&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;'release_year_max'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$yearMax&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$categoryIds&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$search&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'category_id'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'in'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$categoryIds&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$tagIds&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$search&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'tag_id'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'in'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$tagIds&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;applyNumericFilters&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$search&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$attributeFilters&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nv"&gt;$search&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;facet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'category_id'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;facet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'tag_id'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3w1n3fpnpafz5vrso6mr.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3w1n3fpnpafz5vrso6mr.gif" alt=" " width="720" height="488"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That combination matters more than it may look on paper. In practice, this is where the catalog starts feeling easy to use: a broad query can shrink fast once you click into a category or tag, without losing the original search intent.&lt;/p&gt;
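
&lt;p&gt;Conceptually, a facet is a count of attribute values over the current result set, and clicking one adds a filter while the query stays as it was. The Python sketch below models that on an in-memory list. The rows and field names are hypothetical, and in the demo the counting happens inside Manticore, not in application code.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from collections import Counter

# Hypothetical result rows for a broad query; field names are illustrative.
results = [
    {"title": "Game A", "category": "strategy", "tags": ["co-op", "cards"]},
    {"title": "Game B", "category": "strategy", "tags": ["cards"]},
    {"title": "Game C", "category": "party", "tags": ["co-op"]},
]


def facet_counts(rows):
    """Count how many rows fall under each category and each tag."""
    categories = Counter(row["category"] for row in rows)
    tags = Counter(tag for row in rows for tag in row["tags"])
    return categories, tags


def narrow(rows, category):
    """Keep only rows in the clicked category; the query itself is untouched."""
    return [row for row in rows if row["category"] == category]


categories, tags = facet_counts(results)
print(dict(categories))  # {'strategy': 2, 'party': 1}
strategy_only = narrow(results, "strategy")
print(dict(facet_counts(strategy_only)[1]))  # {'co-op': 1, 'cards': 2}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;After narrowing, the tag counts are recomputed for the smaller set, which is what lets the sidebar keep describing what is actually left to browse.&lt;/p&gt;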

&lt;h3&gt;
  
  
  Keep deep pagination stable
&lt;/h3&gt;

&lt;p&gt;If people browse further, offset pagination starts showing its age. Data changes between requests, offsets get larger, and eventually "show more" becomes less trustworthy than it should be.&lt;/p&gt;

&lt;p&gt;This demo uses scroll tokens instead:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Page 1 starts a fresh scroll session; next pages continue with returned token.&lt;/span&gt;
&lt;span class="nv"&gt;$effectiveScrollToken&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$page&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="nv"&gt;$scrollToken&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nv"&gt;$search&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;option&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'scroll'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$effectiveScrollToken&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nv"&gt;$resultSet&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ResultSet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'body'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$body&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="nv"&gt;$nextScroll&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$resultSet&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getScroll&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nv"&gt;$hasMore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$nextScroll&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;$nextScroll&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That gives the app a much better foundation for deep pagination: each request continues from a returned token rather than recomputing larger and larger offsets.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7xmopdkplhlf2f6nc9gk.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7xmopdkplhlf2f6nc9gk.gif" alt=" " width="800" height="527"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Operationally, this is one of those choices users never notice when it works and definitely notice when it does not. More on the mechanism here: &lt;a href="https://manticoresearch.com/blog/pagination/#scroll-based-pagination" rel="noopener noreferrer"&gt;Scroll-Based Pagination&lt;/a&gt;.&lt;/p&gt;
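
&lt;p&gt;The general idea behind token-based paging can be sketched in a few lines of Python. This is a simplified model, not Manticore's scroll implementation: the token format, the &lt;code&gt;fetch_page&lt;/code&gt; helper, and the sort by a unique integer id are all assumptions made for the example.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import base64
import bisect
import json


def fetch_page(rows, limit, token=None):
    """Return (page, next_token) from rows sorted by a unique integer id."""
    ids = [row["id"] for row in rows]
    start = 0
    if token is not None:
        last_id = json.loads(base64.urlsafe_b64decode(token))["last_id"]
        # Resume right after the last id already seen, wherever it now sits.
        start = bisect.bisect_right(ids, last_id)
    page = rows[start:start + limit]
    if not page:
        return page, None
    state = json.dumps({"last_id": page[-1]["id"]}).encode()
    return page, base64.urlsafe_b64encode(state).decode()


rows = [{"id": i} for i in range(1, 8)]
page1, token = fetch_page(rows, 3)
rows.insert(0, {"id": 0})  # a new row appears between requests
page2, token = fetch_page(rows, 3, token)
print([r["id"] for r in page1], [r["id"] for r in page2])  # [1, 2, 3] [4, 5, 6]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With offset paging, the inserted row would have shifted everything by one and the second page would have repeated id 3. Because the token records where the previous page ended, the second page picks up at id 4 regardless.&lt;/p&gt;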

&lt;h3&gt;
  
  
  Add semantic retrieval where keywords fail
&lt;/h3&gt;

&lt;p&gt;Keyword search gets you far. It does not solve everything.&lt;/p&gt;

&lt;p&gt;Sometimes users describe something in roughly the right language, but not in the same words your catalog uses. That is where hybrid search earns its keep.&lt;/p&gt;

&lt;h4&gt;
  
  
  Use hybrid search on the results page
&lt;/h4&gt;

&lt;p&gt;In this demo, one request includes both a lexical &lt;code&gt;query&lt;/code&gt; block and a semantic &lt;code&gt;knn&lt;/code&gt; block, then combines them with reciprocal rank fusion via &lt;code&gt;options.fusion_method = rrf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="s1"&gt;'query'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'bool'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'must'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s1"&gt;'query_string'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'query'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$query&lt;/span&gt;&lt;span class="p"&gt;]]]]],&lt;/span&gt;
    &lt;span class="s1"&gt;'knn'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s1"&gt;'field'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'description_vector'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'query'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="s1"&gt;'options'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'fusion_method'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'rrf'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="s1"&gt;'limit'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$limit&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The vector field uses auto-embeddings, so the app does not have to generate query vectors on its own:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="s1"&gt;'description_vector'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="s1"&gt;'type'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'float_vector'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;'options'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s1"&gt;'MODEL_NAME'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'sentence-transformers/all-MiniLM-L6-v2'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'FROM'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'description'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;],&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Because &lt;code&gt;description_vector&lt;/code&gt; is defined with a model and a source field, the &lt;code&gt;knn&lt;/code&gt; block only has to name that field (&lt;code&gt;'field' =&amp;gt; 'description_vector'&lt;/code&gt;) and pass the raw query text; Manticore embeds the text automatically for KNN search.&lt;/p&gt;

&lt;p&gt;That keeps the application logic simpler than many teams expect when they first hear "semantic search." It also lets the results page stay in one flow instead of bolting a separate semantic experience onto the side.&lt;/p&gt;
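
&lt;p&gt;For intuition about what reciprocal rank fusion does with the lexical and semantic result lists, the following Python sketch implements the textbook formula. It is an illustration only, not Manticore's internal code: the function name, the sample rankings, and the constant &lt;code&gt;k = 60&lt;/code&gt; (a commonly used default) are assumptions.&lt;/p&gt;

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked ID lists into one list, best first.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k = 60 is a commonly used default, assumed here.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, highest first; break ties by ID for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


lexical = ["portal", "half-life", "doom"]      # hypothetical keyword ranking
semantic = ["half-life", "celeste", "portal"]  # hypothetical KNN ranking
fused = reciprocal_rank_fusion([lexical, semantic])
```

&lt;p&gt;Here &lt;code&gt;half-life&lt;/code&gt; ends up first because it ranks near the top of both lists, while items found by only one retriever still make the fused list further down.&lt;/p&gt;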

&lt;h4&gt;
  
  
  Use similar-item discovery on the detail page
&lt;/h4&gt;

&lt;p&gt;The same vector field does a different job on the item page: "show me similar games" without forcing the user to invent another query. This part uses KNN directly against the current item.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$search&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nv"&gt;$search&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;setTable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;tableName&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'description_vector'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$source&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getId&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;SIMILAR_KNN_LIMIT&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;notFilter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'id'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'in'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$source&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getId&lt;/span&gt;&lt;span class="p"&gt;()])&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;SIMILAR_RESULT_LIMIT&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nv"&gt;$resultSet&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$search&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nv"&gt;$hits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;formatResultSet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$resultSet&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="s1"&gt;'hits'&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;array_slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$hits&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;SIMILAR_RESULT_LIMIT&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1t1nlbnhbgd3f8es907n.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1t1nlbnhbgd3f8es907n.gif" alt=" " width="720" height="488"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That is where search stops being a utility and starts helping discovery. On a real detail page, this is the part that makes it easy to keep exploring instead of bouncing back to the search box.&lt;/p&gt;

&lt;p&gt;For reference: &lt;a href="https://manual.manticoresearch.com/Searching/KNN#Auto-Embeddings-(Recommended)" rel="noopener noreferrer"&gt;Auto Embeddings&lt;/a&gt; and &lt;a href="https://manual.manticoresearch.com/Searching/KNN#KNN-vector-search" rel="noopener noreferrer"&gt;KNN Search&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Keeping writes and search results in sync
&lt;/h2&gt;

&lt;p&gt;A demo app is easy to trust when the data never changes. Real apps do not get that luxury.&lt;/p&gt;

&lt;p&gt;Here, the table stays in sync through the same application flow users and admins already touch: bootstrap for a clean baseline, batched imports from the admin UI, and update/delete actions for individual items.&lt;/p&gt;

&lt;p&gt;Prepared imports use the client's batch write methods:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$table&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;table&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;indexConfig&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'name'&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$appendAsNewIds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;addDocuments&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$batch&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;replaceDocuments&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$batch&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For individual item changes, the app uses the table API directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$id&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;replaceDocument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$document&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;addDocument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$document&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;deleteDocument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F826omlznhffvf8qofq45.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F826omlznhffvf8qofq45.gif" alt=" " width="720" height="488"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And if you want to reset the experiment, the admin UI can drop imported records and return to the baseline dataset:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$baseMaxId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;resolveBaseMaxId&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;deleteDocuments&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="s1"&gt;'range'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s1"&gt;'id'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'gt'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$baseMaxId&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No extra background machinery in the demo, no detached sync story to explain away. Just writes going where they need to go.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters
&lt;/h2&gt;

&lt;p&gt;What this demo shows is not just that Manticore can return results. It shows you can assemble a searchable catalog that feels complete: users can start loosely, narrow quickly with filters and facets, recover from imperfect queries, open an item, and keep discovering from there without the whole stack getting complicated.&lt;/p&gt;

&lt;p&gt;That is already enough to make search feel like part of the product, not a bolt-on feature.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>database</category>
      <category>showdev</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Why monitoring your search engine matters: Manticore ➡ Prometheus ➡ Grafana</title>
      <dc:creator>Sergey Nikolaev</dc:creator>
      <pubDate>Thu, 09 Apr 2026 04:13:56 +0000</pubDate>
      <link>https://dev.to/sanikolaev/why-monitoring-your-search-engine-matters-manticore-prometheus-grafana-51g3</link>
      <guid>https://dev.to/sanikolaev/why-monitoring-your-search-engine-matters-manticore-prometheus-grafana-51g3</guid>
      <description>&lt;p&gt;One of our users reached out recently with a familiar problem: search had suddenly become noticeably slower, even though nothing looked obviously broken.&lt;/p&gt;

&lt;p&gt;The service was up, no errors in the logs, CPU usage looked normal — yet users were starting to complain that results felt sluggish.&lt;/p&gt;

&lt;p&gt;This is how search problems usually show up in production. Not with a dramatic outage, but as a slow, creeping degradation. A little more traffic here, some extra indexing there, and before you know it, performance has slipped.&lt;/p&gt;

&lt;p&gt;By the time users notice, the real issue has often been building for hours. Without good visibility you’re left guessing: Is the system overloaded? Is one table eating up resources? Or is something else quietly going wrong?&lt;/p&gt;

&lt;p&gt;That’s why monitoring matters. It turns the vague “search feels slow” complaint into something you can actually diagnose and fix.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introducing the Manticore Grafana dashboard
&lt;/h2&gt;

&lt;p&gt;This is exactly what our new Manticore Grafana dashboard is built for.&lt;/p&gt;

&lt;p&gt;Instead of raw metrics, it gives you a clean, practical view of what really matters when running search in production. At a glance you can see:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Is the node healthy?&lt;/li&gt;
&lt;li&gt;How heavy is the current load?&lt;/li&gt;
&lt;li&gt;Are queries slowing down?&lt;/li&gt;
&lt;li&gt;Which tables are using the most resources?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It’s designed to help you move quickly from a user symptom to the actual root cause.&lt;/p&gt;

&lt;h2&gt;
  
  
  How the stack works
&lt;/h2&gt;

&lt;p&gt;The setup is straightforward: &lt;strong&gt;Manticore → Prometheus → Grafana&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Manticore exposes rich internal metrics, Prometheus collects and stores them as time-series data, and Grafana visualizes everything with our pre-built dashboard — including &lt;strong&gt;21 production-ready alerts&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;You can launch the entire stack with a single Docker command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;MANTICORE_TARGETS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;localhost:9308 &lt;span class="nt"&gt;-p&lt;/span&gt; 3000:3000 manticoresearch/dashboard
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;(Just change the &lt;code&gt;MANTICORE_TARGETS&lt;/code&gt; environment variable if your Manticore instance is running somewhere else.)&lt;/p&gt;

&lt;p&gt;If you prefer to set things up manually, grab these files:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://raw.githubusercontent.com/manticoresoftware/grafana-dashboard/main/grafana/dashboards/manticore-dashboard.json" rel="noopener noreferrer"&gt;Dashboard JSON&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://raw.githubusercontent.com/manticoresoftware/grafana-dashboard/main/prometheus/rules/manticore-alerts.yml" rel="noopener noreferrer"&gt;Alert rules&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Example &lt;a href="https://raw.githubusercontent.com/manticoresoftware/grafana-dashboard/main/prometheus/prometheus.yml" rel="noopener noreferrer"&gt;Prometheus config&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Minimal Prometheus scrape config:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;scrape_configs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;job_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;manticore"&lt;/span&gt;
    &lt;span class="na"&gt;static_configs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;targets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;localhost:9308"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Exploring the dashboard
&lt;/h2&gt;

&lt;p&gt;The dashboard is laid out so you can follow a natural troubleshooting flow.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Health summary (start here)
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmamj2w49gnkrc2xm6d5q.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmamj2w49gnkrc2xm6d5q.jpeg" alt=" " width="800" height="163"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Open the dashboard and look at the top row first. It gives you an instant picture of the node’s overall health.&lt;/p&gt;

&lt;p&gt;Key panels to watch:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Health / Up&lt;/strong&gt; — Is Prometheus even able to scrape metrics?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Health / Crash indicator&lt;/strong&gt; — Any recent crashes?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Workers Utilization %&lt;/strong&gt; + &lt;strong&gt;Load / Queue pressure&lt;/strong&gt; — These two together are gold. High utilization plus rising queue pressure is one of the clearest early signs the node is approaching saturation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;strong&gt;System Score&lt;/strong&gt; panel also gives you a quick overall health rating at a glance.&lt;/p&gt;
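
&lt;p&gt;The pairing of high worker utilization with rising queue pressure can be expressed as a simple check. The Python sketch below is a hypothetical illustration, not part of the dashboard or its alert rules; the function names and the 90% threshold are assumptions to tune for your own deployment.&lt;/p&gt;

```python
from typing import Sequence


def is_rising(samples: Sequence[float]) -> bool:
    """True when no sample drops below the previous one and the last exceeds the first."""
    pairs = zip(samples, samples[1:])
    return all(b >= a for a, b in pairs) and samples[-1] > samples[0]


def approaching_saturation(
    utilization_pct: Sequence[float],
    queue_length: Sequence[float],
    busy_threshold: float = 90.0,  # hypothetical cutoff, tune per deployment
) -> bool:
    """Flag the combination of consistently busy workers and a growing queue."""
    workers_busy = min(utilization_pct) >= busy_threshold
    return workers_busy and is_rising(queue_length)
```

&lt;p&gt;For example, &lt;code&gt;approaching_saturation([92, 95, 97], [3, 5, 9])&lt;/code&gt; returns &lt;code&gt;True&lt;/code&gt;, while the same utilization with a flat queue of &lt;code&gt;[4, 4, 4]&lt;/code&gt; returns &lt;code&gt;False&lt;/code&gt;.&lt;/p&gt;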

&lt;h3&gt;
  
  
  2. Query load and latency
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F01t1b1j8aknf8fqbfjy7.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F01t1b1j8aknf8fqbfjy7.jpeg" alt=" " width="800" height="376"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuwxzmfukmebspz6bwxjz.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuwxzmfukmebspz6bwxjz.jpeg" alt=" " width="800" height="233"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, check what kind of workload the system is handling.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;QPS Total&lt;/strong&gt; shows overall traffic levels.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Search Latency (p95/p99)&lt;/strong&gt; is one of the most important panels — averages can hide problems, but percentiles show what your users are really experiencing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Slowest Thread&lt;/strong&gt; helps spot expensive or stuck queries.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Work Queue Length&lt;/strong&gt; and &lt;strong&gt;Worker Saturation&lt;/strong&gt; together tell you whether the node is keeping up or starting to fall behind.&lt;/li&gt;
&lt;/ul&gt;
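
&lt;p&gt;A small worked example shows why averages can mislead. The Python sketch below uses made-up latencies and a nearest-rank percentile helper; the sample values and the &lt;code&gt;percentile&lt;/code&gt; function are illustrative assumptions, not how the dashboard computes its panels.&lt;/p&gt;

```python
import math
from typing import List


def percentile(samples: List[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds: mostly fast, with a slow tail.
latencies_ms = [10.0] * 93 + [300.0] * 5 + [2000.0] * 2

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p95_ms = percentile(latencies_ms, 95)
p99_ms = percentile(latencies_ms, 99)
```

&lt;p&gt;With these samples the mean is about 64 ms and the median is 10 ms, yet p95 is 300 ms and p99 is 2000 ms, so the slowest requests are far worse than the average suggests.&lt;/p&gt;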

&lt;h3&gt;
  
  
  3. Memory and resources
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fikr650w81upzodhm2kll.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fikr650w81upzodhm2kll.jpeg" alt=" " width="800" height="377"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This section is one of the most useful because memory pressure is a very common (and often hidden) cause of slowdowns in search engines. Instead of showing one vague number, the dashboard breaks it down so you can see exactly where the growth is happening.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Searchd RSS&lt;/strong&gt; and &lt;strong&gt;Buddy RSS&lt;/strong&gt; show the &lt;em&gt;total resident memory&lt;/em&gt; — how much physical RAM the main search daemon (searchd) and the Buddy helper process are actually using right now.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;Anon RSS&lt;/strong&gt; panels go one level deeper. “Anonymous” memory is the private, dynamic RAM allocated by Manticore itself: heap, query caches, loaded data structures, temporary buffers, and anything else not backed by a file on disk. File-backed pages can simply be dropped by the OS and re-read from disk later, whereas anonymous pages can only be evicted by swapping, so anon memory is what usually puts real pressure on your system.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Why show both RSS &lt;em&gt;and&lt;/em&gt; Anon RSS? Total RSS gives you the big picture, but Anon RSS tells you the story behind it. If total RSS is climbing but Anon RSS is stable, the growth might be harmless (e.g. more cached files). If Anon RSS is also rising fast, that’s usually a sign that Manticore’s own data structures or query activity are consuming more and more memory — exactly the kind of thing that leads to slower queries or even swapping.&lt;/p&gt;
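
&lt;p&gt;On Linux, the same split is visible for any process in &lt;code&gt;/proc/PID/status&lt;/code&gt;, where &lt;code&gt;VmRSS&lt;/code&gt; is reported alongside &lt;code&gt;RssAnon&lt;/code&gt;, &lt;code&gt;RssFile&lt;/code&gt;, and &lt;code&gt;RssShmem&lt;/code&gt;. The Python sketch below parses that format from a sample string; the numbers are made up, the helper names are hypothetical, and this is not necessarily how the dashboard collects its metrics.&lt;/p&gt;

```python
from typing import Dict

# Illustrative excerpt in the format of Linux /proc/PID/status (values are made up).
SAMPLE_STATUS = """\
Name:\tsearchd
VmRSS:\t  2048000 kB
RssAnon:\t  1536000 kB
RssFile:\t   500000 kB
RssShmem:\t    12000 kB
"""


def parse_rss_kb(status_text: str) -> Dict[str, int]:
    """Extract VmRSS and its Rss* components (in kB) from /proc/PID/status text."""
    wanted = {"VmRSS", "RssAnon", "RssFile", "RssShmem"}
    result = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in wanted:
            result[key] = int(value.split()[0])
    return result


def anon_share(rss: Dict[str, int]) -> float:
    """Fraction of resident memory that is anonymous (not file-backed)."""
    return rss["RssAnon"] / rss["VmRSS"]


rss = parse_rss_kb(SAMPLE_STATUS)
```

&lt;p&gt;In the sample, &lt;code&gt;anon_share(rss)&lt;/code&gt; is 0.75, meaning three quarters of resident memory is anonymous. To inspect a live process, pass the contents of its &lt;code&gt;/proc/PID/status&lt;/code&gt; file instead of the sample string.&lt;/p&gt;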

&lt;p&gt;At the bottom you’ll also see several quick counters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Resources / FDs (searchd)&lt;/strong&gt; — current number of open file descriptors used by the search daemon. Manticore opens a lot of files for indexes (especially large real-time tables with many disk chunks). If this number gets too high you can hit the OS limit and start seeing “Too many open files” errors. You can raise the soft limit with the &lt;code&gt;max_open_files&lt;/code&gt; setting (see the &lt;a href="https://manual.manticoresearch.com/Server_settings/Searchd#max_open_files" rel="noopener noreferrer"&gt;Manticore docs on server settings&lt;/a&gt;).&lt;/li&gt;
&lt;li&gt;Active workers, table counts, and non-served tables — all quick signals that something might need attention.&lt;/li&gt;
&lt;/ul&gt;
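
&lt;p&gt;To judge how close a process is to the descriptor limit, compare its open descriptor count with the soft &lt;code&gt;RLIMIT_NOFILE&lt;/code&gt; value. The Python sketch below is a Linux-only illustration using the standard &lt;code&gt;os&lt;/code&gt; and &lt;code&gt;resource&lt;/code&gt; modules; the function names are hypothetical, and it inspects the Python process itself rather than &lt;code&gt;searchd&lt;/code&gt;.&lt;/p&gt;

```python
import os
import resource


def fd_usage(open_fds: int, soft_limit: int) -> float:
    """Return open descriptors as a fraction of the soft limit."""
    return open_fds / soft_limit


def current_process_fds() -> int:
    """Count this process's open descriptors via /proc (Linux only)."""
    return len(os.listdir("/proc/self/fd"))


def soft_limit() -> int:
    """Soft limit on open files for this process (RLIMIT_NOFILE)."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]
```

&lt;p&gt;For another process such as &lt;code&gt;searchd&lt;/code&gt;, the same count can be taken from &lt;code&gt;/proc/PID/fd&lt;/code&gt;, given sufficient permissions. A ratio from &lt;code&gt;fd_usage&lt;/code&gt; that keeps creeping toward 1.0 is the signal to raise the limit before “Too many open files” errors appear.&lt;/p&gt;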

&lt;h3&gt;
  
  
  4. Table-level insights
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3wvh6hf6or0d5s3ikjlj.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3wvh6hf6or0d5s3ikjlj.jpeg" alt=" " width="800" height="465"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now zoom in on the data itself.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Document counts per table&lt;/li&gt;
&lt;li&gt;Top 10 tables by RAM and disk usage&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tables / Health&lt;/strong&gt; panel — this one is particularly valuable because it combines docs, RAM, disk, and state flags (locked/optimizing) in a single view.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  5. Cluster state and history
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwfar2cbhgdxep5qsv8w8.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwfar2cbhgdxep5qsv8w8.jpeg" alt=" " width="800" height="91"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyyv61ev2ltjw4dkr49fj.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyyv61ev2ltjw4dkr49fj.jpeg" alt=" " width="800" height="523"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For distributed setups you get node status and sync state. The history section is excellent for answering the most important question during any incident: &lt;em&gt;what changed right before things slowed down?&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Remember the user who reached out because search had suddenly become noticeably slower?&lt;/p&gt;

&lt;p&gt;Once he enabled this dashboard, the problem became obvious almost immediately: workers were getting busier, queues were growing, and memory pressure was building — all before any obvious errors or crashes appeared. With clear visibility into what was actually happening inside the engine, he quickly pinpointed the root cause, made the right adjustments, and got performance back to the fast, reliable level his users expected.&lt;/p&gt;

&lt;p&gt;The real value of monitoring isn’t just seeing pretty graphs. It’s catching those creeping issues early — before they cost you money or customers.&lt;/p&gt;

&lt;p&gt;This dashboard removes that blind spot. It gives you the visibility you need to keep your search fast and reliable.&lt;/p&gt;

</description>
      <category>database</category>
      <category>devops</category>
      <category>monitoring</category>
      <category>performance</category>
    </item>
    <item>
      <title>Monitor Manticore Search in Grafana with One Command</title>
      <dc:creator>Sergey Nikolaev</dc:creator>
      <pubDate>Wed, 08 Apr 2026 02:27:24 +0000</pubDate>
      <link>https://dev.to/sanikolaev/monitor-manticore-search-in-grafana-with-one-command-d04</link>
      <guid>https://dev.to/sanikolaev/monitor-manticore-search-in-grafana-with-one-command-d04</guid>
      <description>&lt;p&gt;The most annoying kind of incident is when the database doesn’t go down completely but just gets slower.&lt;/p&gt;

&lt;p&gt;Users start noticing it right away. Complaints come in. Everything is technically still running, but clearly something is off.&lt;/p&gt;

&lt;p&gt;And that is usually the hardest part: not noticing the problem, but figuring out what is actually happening.&lt;/p&gt;

&lt;h2&gt;
  
  
  When everything looks fine, but search is still slow
&lt;/h2&gt;

&lt;p&gt;Let’s take a pretty normal scenario.&lt;/p&gt;

&lt;p&gt;Search starts slowing down. It is not crashing. It is not returning obvious errors. The service is up. From the outside, nothing looks broken in a dramatic way.&lt;/p&gt;

&lt;p&gt;But users can feel it.&lt;/p&gt;

&lt;p&gt;So you open your monitoring:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CPU looks fine.&lt;/li&gt;
&lt;li&gt;Average latency does not look too bad.&lt;/li&gt;
&lt;li&gt;No obvious alerts.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At first glance, nothing really explains the slowdown.&lt;/p&gt;

&lt;p&gt;So you keep digging...&lt;/p&gt;

&lt;p&gt;You check the queue. Nothing jumps out immediately.&lt;br&gt;
You look at worker usage. They are busy, but not in a way that tells you much on its own.&lt;br&gt;
You check the logs. Still nothing obvious.&lt;/p&gt;

&lt;p&gt;And after a while you get to that frustrating point where you realize you have already checked the usual things, and you still do not know where the problem is.&lt;/p&gt;

&lt;p&gt;Each metric, by itself, looks more or less okay. But together, the system is clearly degrading.&lt;/p&gt;

&lt;p&gt;So now you are no longer following a clear line of investigation. You are just checking everything you can think of and hoping the pattern shows up.&lt;/p&gt;

&lt;p&gt;Meanwhile, time is passing.&lt;/p&gt;
&lt;h2&gt;
  
  
  What was actually going on
&lt;/h2&gt;

&lt;p&gt;A couple of hours later, the picture finally starts to make sense.&lt;/p&gt;

&lt;p&gt;It turns out:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the request queue has been slowly growing;&lt;/li&gt;
&lt;li&gt;workers have been sitting near 100% utilization;&lt;/li&gt;
&lt;li&gt;one heavy query keeps blocking execution from time to time;&lt;/li&gt;
&lt;li&gt;p99 latency is much worse than the average suggests;&lt;/li&gt;
&lt;li&gt;and one of the nodes restarted recently.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So the signals were there all along.&lt;/p&gt;

&lt;p&gt;The problem was that they were scattered across different places, and it took too long to connect them into one clear story.&lt;/p&gt;
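
&lt;p&gt;The gap between average and p99 latency is easy to reproduce. The following is a minimal sketch with invented numbers: the sample values and the helper names &lt;code&gt;latencies&lt;/code&gt;, &lt;code&gt;mean_ms&lt;/code&gt;, and &lt;code&gt;p99_ms&lt;/code&gt; are assumptions for illustration only and are not taken from any real system or from the dashboard.&lt;/p&gt;

```shell
# Hypothetical latency samples in milliseconds: 97 fast requests and 3 slow ones.
latencies() {
  i=0
  while [ "$i" -lt 97 ]; do
    echo 20
    i=$((i + 1))
  done
  echo 900
  echo 950
  echo 1000
}

# Arithmetic mean of the values on stdin, to one decimal place.
mean_ms() {
  awk '{ sum += $1 } END { printf "%.1f\n", sum / NR }'
}

# Nearest-rank p99: the value at position ceil(0.99 * N) after sorting.
p99_ms() {
  sort -n | awk '{ v[NR] = $1 } END { print v[int((NR * 99 + 99) / 100)] }'
}

echo "mean: $(latencies | mean_ms) ms"   # mean: 47.9 ms
echo "p99:  $(latencies | p99_ms) ms"    # p99:  950 ms
```

&lt;p&gt;With these made-up samples the average stays under 50 ms while the slowest requests take close to a second, which is why a healthy-looking average says little about what the unluckiest users experience.&lt;/p&gt;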
&lt;h2&gt;
  
  
  The solution: see the whole picture right away
&lt;/h2&gt;

&lt;p&gt;Instead of spending hours piecing all of that together by hand, it is much better to have one place where the important signals are already visible.&lt;/p&gt;

&lt;p&gt;That is why we put together a ready-to-use dashboard for Manticore Search that starts with a single Docker command. It comes with Grafana, Prometheus, a preconfigured data source, and built-in alerts.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-p&lt;/span&gt; 3000:3000 manticoresearch/dashboard
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Environment variables
&lt;/h3&gt;

&lt;p&gt;The container supports two &lt;a href="https://github.com/manticoresoftware/grafana-dashboard?tab=readme-ov-file#environment-variables" rel="noopener noreferrer"&gt;environment variables&lt;/a&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;MANTICORE_TARGETS&lt;/code&gt; - comma-separated list of Manticore Search instances (default: &lt;code&gt;localhost:9308&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;GF_AUTH_ENABLED&lt;/code&gt; - set to &lt;code&gt;true&lt;/code&gt; to enable Grafana login (by default, anonymous admin access is enabled)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-p&lt;/span&gt; 3000:3000 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;MANTICORE_TARGETS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-host:9308 &lt;span class="se"&gt;\&lt;/span&gt;
  manticoresearch/dashboard
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you monitor multiple nodes, pass them as a comma-separated list:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-p&lt;/span&gt; 3000:3000 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;MANTICORE_TARGETS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;node1:9308,node2:9308,node3:9308 &lt;span class="se"&gt;\&lt;/span&gt;
  manticoresearch/dashboard
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
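
&lt;p&gt;Both variables can be combined in one command. The sketch below only prints the command instead of running it, so it can be reviewed first. The helper names &lt;code&gt;join_targets&lt;/code&gt; and &lt;code&gt;dashboard_cmd&lt;/code&gt; and the &lt;code&gt;node1&lt;/code&gt;..&lt;code&gt;node3&lt;/code&gt; host names are hypothetical; the port &lt;code&gt;9308&lt;/code&gt; and the two variables come from the list above.&lt;/p&gt;

```shell
# Hypothetical helper: join host names into a MANTICORE_TARGETS value,
# appending the 9308 port used in the examples above.
join_targets() {
  out=""
  for host in "$@"; do
    out="${out:+$out,}$host:9308"
  done
  printf '%s\n' "$out"
}

# Print (do not run) a docker command that also enables the Grafana login.
dashboard_cmd() {
  printf 'docker run -p 3000:3000 -e MANTICORE_TARGETS=%s -e GF_AUTH_ENABLED=true manticoresearch/dashboard\n' "$(join_targets "$@")"
}

dashboard_cmd node1 node2 node3
```

&lt;p&gt;Printing the command first is a small safety net when the node list is generated from an inventory file or another script.&lt;/p&gt;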



&lt;h3&gt;
  
  
  If Manticore is running on a remote server
&lt;/h3&gt;

&lt;p&gt;By default, the dashboard expects Manticore at &lt;code&gt;localhost:9308&lt;/code&gt;. If your instance is running on a remote machine, the simplest option is SSH port forwarding:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ssh &lt;span class="nt"&gt;-L&lt;/span&gt; 9308:localhost:9308 user@your-server
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After that, local connections to &lt;code&gt;localhost:9308&lt;/code&gt; will be forwarded to the remote server, so the dashboard can connect without additional changes.&lt;/p&gt;

&lt;p&gt;A minute later, you have a usable overview of your system.&lt;/p&gt;

&lt;p&gt;Not just a pile of graphs, but a dashboard that helps you quickly answer the questions you actually care about when something feels wrong.&lt;/p&gt;

&lt;p&gt;You can see queue growth, worker saturation, latency, process state, and query behavior in one place, instead of bouncing between tools and trying to stitch the story together in your head.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the dashboard shows
&lt;/h2&gt;

&lt;p&gt;The value here is not that there are a lot of panels. The value is that the panels answer the right questions quickly.&lt;/p&gt;

&lt;p&gt;The first place to look is the overall system view:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feh7qbvu0wv7o7hd6oo0d.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feh7qbvu0wv7o7hd6oo0d.jpeg" alt=" " width="800" height="163"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This gives you the basic picture right away:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;is the service up;&lt;/li&gt;
&lt;li&gt;has it restarted recently;&lt;/li&gt;
&lt;li&gt;is there queue pressure;&lt;/li&gt;
&lt;li&gt;are workers already under load.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If this row looks healthy, the issue is probably narrow and local. If it does not, you know right away that the system is under real pressure.&lt;/p&gt;

&lt;p&gt;Then you move to load and query behavior:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8j7ttiqfehxif501w84g.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8j7ttiqfehxif501w84g.jpeg" alt=" " width="800" height="376"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is where you can quickly see:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;whether work is starting to pile up;&lt;/li&gt;
&lt;li&gt;whether workers are saturated;&lt;/li&gt;
&lt;li&gt;whether latency is getting worse, especially p95 and p99;&lt;/li&gt;
&lt;li&gt;whether one slow thread is causing a disproportionate amount of trouble.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And if you need more context, you can drill down into the rest of the dashboard:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;cluster state:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo3x0i86suqqt44ktchci.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo3x0i86suqqt44ktchci.jpeg" alt=" " width="800" height="91"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;tables and data:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz5c5m9fp73wj14obwdab.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz5c5m9fp73wj14obwdab.jpeg" alt=" " width="800" height="465"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At that point, you are no longer looking at disconnected metrics. You are looking at the system as a whole.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters
&lt;/h2&gt;

&lt;p&gt;A situation that used to take a couple of hours just to understand can now usually be narrowed down in a few minutes.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You can see that the queue is growing.&lt;/li&gt;
&lt;li&gt;You can see that workers are pinned.&lt;/li&gt;
&lt;li&gt;You can see that p99 is climbing.&lt;/li&gt;
&lt;li&gt;You can see that one node restarted.&lt;/li&gt;
&lt;li&gt;You can see that one query is probably doing most of the damage.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That does not mean the dashboard magically fixes the issue for you.&lt;/p&gt;

&lt;p&gt;What it does do is remove the slowest part of the whole process: figuring out where to look.&lt;/p&gt;

&lt;p&gt;And in practice, that is often the difference between spending two hours trying to understand the incident and spending five minutes getting to the real problem.&lt;/p&gt;

</description>
      <category>database</category>
      <category>monitoring</category>
      <category>performance</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Parallel chunk merging in Manticore Search</title>
      <dc:creator>Sergey Nikolaev</dc:creator>
      <pubDate>Tue, 07 Apr 2026 03:52:38 +0000</pubDate>
      <link>https://dev.to/sanikolaev/parallel-chunk-merging-in-manticore-search-47h2</link>
      <guid>https://dev.to/sanikolaev/parallel-chunk-merging-in-manticore-search-47h2</guid>
      <description>&lt;p&gt;Starting from &lt;strong&gt;Manticore Search 24.4.0&lt;/strong&gt;, RT table compaction has a more capable execution model. Instead of merging chunk pairs one-by-one in a serial flow, optimization now supports two important improvements:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;disk chunk merges can run in parallel&lt;/li&gt;
&lt;li&gt;each merge job can merge more than two chunks at once&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Two settings control this behavior:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://manual.manticoresearch.com/Server_settings/Searchd#parallel_chunk_merges" rel="noopener noreferrer"&gt;parallel_chunk_merges&lt;/a&gt;: how many RT disk chunk merge jobs may run at the same time&lt;/li&gt;
&lt;li&gt;&lt;a href="https://manual.manticoresearch.com/Server_settings/Searchd#merge_chunks_per_job" rel="noopener noreferrer"&gt;merge_chunks_per_job&lt;/a&gt;: how many RT disk chunks a single job can merge in one pass&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The compaction docs were also updated to describe optimization as an &lt;strong&gt;N-way merge&lt;/strong&gt; handled by a &lt;strong&gt;background worker pool&lt;/strong&gt; rather than a single serial merge thread.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters
&lt;/h2&gt;

&lt;p&gt;For RT workloads, the interesting number is often not just how fast you can insert documents, but how long it takes until compaction catches up and the table returns to its target chunk count.&lt;/p&gt;

&lt;p&gt;That is especially noticeable when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;you ingest data at a sustained rate&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;optimize_cutoff&lt;/code&gt; is low enough that merges kick in early&lt;/li&gt;
&lt;li&gt;you wait for compaction to finish before considering the load fully complete&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This matters most in two common cases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;you are doing an initial bulk upload into a real-time table and want the table not just searchable, but already compacted to its steady state before putting more pressure on it&lt;/li&gt;
&lt;li&gt;you regularly ingest large batches and want each batch to finish cleanly before the next one arrives&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The table is searchable before compaction finishes, but "fully searchable" and "fully optimized" are not the same thing. A higher chunk count can still matter if you care about keeping the table close to its target shape, limiting background merge work before the next ingest wave, or reducing the window where storage is busy with post-load compaction.&lt;/p&gt;

&lt;p&gt;To show the difference, we loaded &lt;strong&gt;10 million documents&lt;/strong&gt; into an RT table. Each document contains:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;id bigint&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;name text&lt;/code&gt; with generated text between 10 and 100 words&lt;/li&gt;
&lt;li&gt;&lt;code&gt;type int&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The table was created with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;bigint&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;optimize_cutoff&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'16'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So the target was to compact the table back down to roughly 16 disk chunks.&lt;/p&gt;

&lt;p&gt;For the benchmark we used &lt;a href="https://manticoresearch.com/blog/manticore-load/" rel="noopener noreferrer"&gt;manticore-load&lt;/a&gt;, our load generation and benchmarking tool. It is useful for reproducing scenarios like this, stress-testing ingestion, and comparing configuration changes without building custom scripts every time.&lt;/p&gt;

&lt;p&gt;The data was loaded with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;manticore-load &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--cache-gen-workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;5 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--drop&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--batch-size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1000 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--threads&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;5 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--total&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;10000000 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--init&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"CREATE TABLE test(id bigint, name text, type int) optimize_cutoff='16'"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--load&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"INSERT INTO test(id,name,type) VALUES(&amp;lt;increment&amp;gt;,'&amp;lt;text/10/100&amp;gt;',&amp;lt;int/1/100&amp;gt;)"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--wait&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Before: one merge job, two chunks at a time
&lt;/h2&gt;

&lt;p&gt;With the old behavior forced explicitly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mysql &lt;span class="nt"&gt;-P9306&lt;/span&gt; &lt;span class="nt"&gt;-h0&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="s2"&gt;"set global parallel_chunk_merges=1; set global merge_chunks_per_job=2"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;the run looked like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;merging started at &lt;strong&gt;14 seconds&lt;/strong&gt;, when about &lt;strong&gt;1.8M&lt;/strong&gt; documents had been inserted&lt;/li&gt;
&lt;li&gt;all &lt;strong&gt;10M&lt;/strong&gt; documents were loaded after &lt;strong&gt;1 minute 18 seconds&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;at that point the data was already fully searchable&lt;/li&gt;
&lt;li&gt;compaction kept running in the background until &lt;strong&gt;3 minutes 23 seconds&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

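&lt;p&gt;At &lt;code&gt;01:18&lt;/code&gt;, the table still had more than 50 chunks. Near the end of loading, the status looked like this (the chunk count is the eighth column, &lt;code&gt;53&lt;/code&gt; at the end of loading and &lt;code&gt;17&lt;/code&gt; at the end of the run):&lt;br&gt;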
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;17:14:50  01:17     98%         133      128.4K   21%     5          53        1         4.22GB      9.9M
17:14:51  01:18     100%        131      310.9K   15%     1          53        1         4.27GB      10.0M
...
17:16:55  03:22     100%        0        49.4K    4%      1          17        1         4.27GB      10.0M
...
Total time:       03:23
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the classic pattern of a healthy ingest pipeline followed by a long merge tail.&lt;/p&gt;

&lt;h2&gt;
  
  
  After: parallel merges plus larger merge jobs
&lt;/h2&gt;

&lt;p&gt;With the new settings:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mysql &lt;span class="nt"&gt;-P9306&lt;/span&gt; &lt;span class="nt"&gt;-h0&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="s2"&gt;"set global parallel_chunk_merges=3; set global merge_chunks_per_job=5"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;the same workload finished much faster:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;merging again started at about &lt;strong&gt;14 seconds&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;all &lt;strong&gt;10M&lt;/strong&gt; documents were again loaded after about &lt;strong&gt;1 minute 18 seconds&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;full compaction finished after only &lt;strong&gt;1 minute 31 seconds&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The end of the run looked like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;17:19:22  01:17     99%         127      127.9K   28%     6          26        1         4.22GB      9.9M
17:19:23  01:18     100%        132      1883.8K  17%     1          23        1         4.25GB      10.0M
...
17:19:36  01:31     100%        0        110.2K   3%      1          17        1         4.25GB      10.0M
...
Total time:       01:31
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What changed in practice
&lt;/h2&gt;

&lt;p&gt;The ingest phase itself stayed roughly the same:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;old settings: &lt;strong&gt;1:18&lt;/strong&gt; to load all data&lt;/li&gt;
&lt;li&gt;new settings: &lt;strong&gt;1:18&lt;/strong&gt; to load all data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The big gain came from post-ingest compaction:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;old settings: about &lt;strong&gt;2:05&lt;/strong&gt; of additional merge time after loading finished&lt;/li&gt;
&lt;li&gt;new settings: about &lt;strong&gt;0:13&lt;/strong&gt; of additional merge time after loading finished&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is roughly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;55% lower total time&lt;/strong&gt; overall, from &lt;strong&gt;3:23&lt;/strong&gt; down to &lt;strong&gt;1:31&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;about &lt;strong&gt;90% less merge tail&lt;/strong&gt; after the last document was inserted&lt;/li&gt;
&lt;/ul&gt;
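
&lt;p&gt;These percentages follow directly from the timings reported above. As a quick check, the arithmetic can be reproduced in a few lines of shell; the helper names &lt;code&gt;to_seconds&lt;/code&gt; and &lt;code&gt;reduction_pct&lt;/code&gt; are made up for this sketch.&lt;/p&gt;

```shell
# Convert an m:ss timing from the runs above into seconds.
to_seconds() {
  echo "$1" | awk -F: '{ print $1 * 60 + $2 }'
}

# Percentage reduction from an old value to a new one, to one decimal place.
reduction_pct() {
  awk -v before="$1" -v after="$2" 'BEGIN { printf "%.1f\n", (before - after) * 100 / before }'
}

old_total=$(to_seconds 3:23)   # 203 seconds
new_total=$(to_seconds 1:31)   # 91 seconds
load=$(to_seconds 1:18)        # 78 seconds, identical in both runs

# Total time: prints 55.2
reduction_pct "$old_total" "$new_total"

# Merge tail after the last insert (125 s versus 13 s): prints 89.6
reduction_pct $((old_total - load)) $((new_total - load))
```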

&lt;p&gt;Chunk pressure during ingest was much lower too. Near the end of loading:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;old settings: &lt;strong&gt;53 chunks&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;new settings: &lt;strong&gt;23 chunks&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So the improvement is not just that compaction finishes sooner. It also keeps the chunk count under control much more aggressively while data is still being inserted.&lt;/p&gt;

&lt;h2&gt;
  
  
  What about the new defaults?
&lt;/h2&gt;

&lt;p&gt;On this server, with the new default settings and no explicit tuning at all, the same workload finished in:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Total time:       01:57
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That already cuts the old &lt;code&gt;03:23&lt;/code&gt; result by about 42%, while still leaving room for additional tuning with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;parallel_chunk_merges&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;merge_chunks_per_job&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In other words, the new defaults already improve the out-of-the-box experience, and systems with enough I/O headroom can push compaction even further by increasing both settings carefully.&lt;/p&gt;

&lt;h2&gt;
  
  
  Broader benchmark results: row-wise and columnar storage
&lt;/h2&gt;

&lt;p&gt;The 10M-document example above shows the mechanics clearly, but the larger picture is even more interesting. In a wider test matrix we measured the combined &lt;strong&gt;load + optimize&lt;/strong&gt; time for both row-wise and columnar storage across multiple values of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;parallel_chunk_merges&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;merge_chunks_per_job&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The headline result is that, in some cases, tuning these settings can reduce total load + optimize time by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;up to &lt;strong&gt;4.6x&lt;/strong&gt; for &lt;strong&gt;row-wise&lt;/strong&gt; storage&lt;/li&gt;
&lt;li&gt;up to &lt;strong&gt;6.8x&lt;/strong&gt; for &lt;strong&gt;columnar&lt;/strong&gt; storage&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here is the best-vs-worst picture from that test set:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Storage&lt;/th&gt;
&lt;th&gt;Best settings&lt;/th&gt;
&lt;th&gt;Best time&lt;/th&gt;
&lt;th&gt;Slowest settings&lt;/th&gt;
&lt;th&gt;Slowest time&lt;/th&gt;
&lt;th&gt;Improvement&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Row-wise&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;parallel_chunk_merges=4&lt;/code&gt;, &lt;code&gt;merge_chunks_per_job=5&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;14:35&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;parallel_chunk_merges=1&lt;/code&gt;, &lt;code&gt;merge_chunks_per_job=2&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;67:15&lt;/td&gt;
&lt;td&gt;4.61x&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Columnar&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;parallel_chunk_merges=4&lt;/code&gt;, &lt;code&gt;merge_chunks_per_job=5&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;15:10&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;parallel_chunk_merges=1&lt;/code&gt;, &lt;code&gt;merge_chunks_per_job=2&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;99:14&lt;/td&gt;
&lt;td&gt;6.80x&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;There is also a useful tuning pattern in the full results:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the best runs for both storage modes clustered around &lt;code&gt;parallel_chunk_merges=4..5&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;the best runs also clustered around &lt;code&gt;merge_chunks_per_job=4..5&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;the slowest results were consistently at &lt;code&gt;parallel_chunk_merges=1&lt;/code&gt; with &lt;code&gt;merge_chunks_per_job=2&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In other words, the old serial two-chunk pattern is not just a little slower. On large workloads it can become dramatically slower, especially with columnar storage.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to think about the two settings
&lt;/h2&gt;

&lt;p&gt;The new docs describe two separate levers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;parallel_chunk_merges&lt;/code&gt; increases how many merge jobs can run at once&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;merge_chunks_per_job&lt;/code&gt; increases how many chunks each job can consume&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Lower &lt;code&gt;merge_chunks_per_job&lt;/code&gt; values make it easier to schedule more jobs in parallel because each job consumes fewer chunks from the available pool. If a table has many chunks waiting to be compacted, smaller jobs leave more independent chunks available for other workers, so the scheduler can keep several merges active at once. Higher values reduce the total number of merge steps, but each job becomes heavier and grabs a larger portion of the available chunks, which can leave less room for concurrent jobs.&lt;/p&gt;

&lt;p&gt;The right balance depends on your storage and workload, but the benchmark above shows that combining both approaches can dramatically reduce the time spent waiting for RT chunk compaction to finish.&lt;/p&gt;
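
&lt;p&gt;The trade-off can be illustrated with a deliberately simplified model. This is an assumption made for illustration, not a description of how Manticore actually schedules merges: suppose every job needs &lt;code&gt;merge_chunks_per_job&lt;/code&gt; distinct chunks, so the number of jobs that can be formed is the number of waiting chunks divided by that value, capped by &lt;code&gt;parallel_chunk_merges&lt;/code&gt;. The function name &lt;code&gt;concurrent_jobs&lt;/code&gt; and the chunk counts are hypothetical.&lt;/p&gt;

```shell
# Simplified model (an assumption, not Manticore's real scheduler):
# concurrent jobs = min(parallel_chunk_merges, waiting_chunks / merge_chunks_per_job)
concurrent_jobs() {
  waiting=$1
  parallel=$2
  per_job=$3
  possible=$((waiting / per_job))
  if [ "$possible" -lt "$parallel" ]; then
    echo "$possible"
  else
    echo "$parallel"
  fi
}

concurrent_jobs 12 4 2   # 4: plenty of chunks, capped by parallel_chunk_merges
concurrent_jobs 12 4 5   # 2: larger jobs leave room for only two merges
```

&lt;p&gt;In this toy model, raising &lt;code&gt;merge_chunks_per_job&lt;/code&gt; only reduces concurrency once the backlog of chunks is small relative to the job size, which matches the intuition that larger jobs pay off most when many chunks are waiting.&lt;/p&gt;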

&lt;h2&gt;
  
  
  Takeaway
&lt;/h2&gt;

&lt;p&gt;If your RT workloads spend too long waiting for chunk compaction after bulk inserts, the new parallel merge model changes that equation significantly.&lt;/p&gt;

&lt;p&gt;On this 10M-document test with &lt;code&gt;optimize_cutoff=16&lt;/code&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Mode&lt;/th&gt;
&lt;th&gt;Searchable at&lt;/th&gt;
&lt;th&gt;Fully optimized at&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Old settings: &lt;code&gt;parallel_chunk_merges=1&lt;/code&gt;, &lt;code&gt;merge_chunks_per_job=2&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;1:18&lt;/td&gt;
&lt;td&gt;3:23&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;New defaults&lt;/td&gt;
&lt;td&gt;1:18&lt;/td&gt;
&lt;td&gt;1:57&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tuned new settings: &lt;code&gt;parallel_chunk_merges=3&lt;/code&gt;, &lt;code&gt;merge_chunks_per_job=5&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;1:18&lt;/td&gt;
&lt;td&gt;1:31&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;ul&gt;
&lt;li&gt;the time until all data became searchable stayed the same&lt;/li&gt;
&lt;li&gt;the time until chunk compaction completed dropped from &lt;strong&gt;3:23&lt;/strong&gt; to &lt;strong&gt;1:31&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;even the new defaults reduced the total time to &lt;strong&gt;1:57&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is exactly the kind of improvement that matters for operational RT indexing. The data is searchable as soon as it is loaded, and that point stayed about the same in all three runs. The difference is what happens afterwards: how long the server keeps compacting chunks in the background before the table returns to its target shape. If your workflow depends on the table being compact again before the next heavy ingest, before a maintenance window closes, or before you hand the system over to a search workload that should run with fewer chunks and less background merge pressure, the improvement is substantial.&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>database</category>
      <category>news</category>
      <category>performance</category>
    </item>
    <item>
      <title>S3 Streamable Backup: Direct-to-Cloud Backups for Manticore Search</title>
      <dc:creator>Sergey Nikolaev</dc:creator>
      <pubDate>Mon, 06 Apr 2026 08:24:45 +0000</pubDate>
      <link>https://dev.to/sanikolaev/s3-streamable-backup-direct-to-cloud-backups-for-manticore-search-15jl</link>
      <guid>https://dev.to/sanikolaev/s3-streamable-backup-direct-to-cloud-backups-for-manticore-search-15jl</guid>
      <description>&lt;p&gt;Since we introduced the &lt;a href="https://manticoresearch.com/blog/new-backup-and-recovery-approaches/" rel="noopener noreferrer"&gt;backup tool&lt;/a&gt; in Manticore Search 6, backing up your data has become significantly easier. But we kept hearing the same question: &lt;em&gt;"What about cloud storage?"&lt;/em&gt; Today, we're excited to announce that &lt;strong&gt;manticore-backup&lt;/strong&gt; now supports &lt;strong&gt;S3-compatible storage&lt;/strong&gt; with streaming uploads — no intermediate files, no local disk space headaches, just direct-to-cloud backups.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem with Traditional Backups
&lt;/h2&gt;

&lt;p&gt;When you're running Manticore Search in production, your datasets can grow quickly. Backing up to local storage has its limitations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Disk space constraints&lt;/strong&gt;: You need free space equal to your backup size on the same machine&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Manual transfer steps&lt;/strong&gt;: Backup locally, then upload to cloud storage&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Time overhead&lt;/strong&gt;: The copy-then-upload dance doubles your backup window&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Complexity&lt;/strong&gt;: Scripting reliable uploads with resume capability, encryption, and error handling&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Streamable S3 Backup: How It Works
&lt;/h2&gt;

&lt;p&gt;The new S3 storage support streams your backup data &lt;strong&gt;directly&lt;/strong&gt; to S3-compatible storage. Here's what happens under the hood:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;No intermediate files&lt;/strong&gt;: Data streams from Manticore straight to S3&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automatic multipart uploads&lt;/strong&gt;: Large files are automatically chunked and uploaded in parallel&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Built-in encryption&lt;/strong&gt;: SSE-S3 encryption is enabled by default for AWS S3 (configurable for other providers)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compression support&lt;/strong&gt;: Optional zstd compression reduces transfer time and storage costs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Manifest-based restore&lt;/strong&gt;: No &lt;code&gt;s3:ListBucket&lt;/code&gt; permission required for restores&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Supported Storage Providers
&lt;/h3&gt;

&lt;p&gt;We've tested with &lt;strong&gt;AWS S3&lt;/strong&gt;, &lt;strong&gt;MinIO&lt;/strong&gt;, and &lt;strong&gt;Cloudflare R2&lt;/strong&gt;. Because the implementation uses the standard AWS SDK for PHP, any other storage that speaks the S3 API should work as well.&lt;/p&gt;

&lt;h2&gt;
  
  
  Usage
&lt;/h2&gt;

&lt;p&gt;Using S3 backup is as simple as changing your destination path:&lt;/p&gt;

&lt;h3&gt;
  
  
  CLI
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Set your credentials&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your_access_key
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your_secret_key
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AWS_REGION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;us-east-1

&lt;span class="c"&gt;# Backup to S3&lt;/span&gt;
manticore-backup &lt;span class="nt"&gt;--config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/etc/manticore/manticore.conf &lt;span class="nt"&gt;--backup-dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;s3://my-bucket/manticore-backups

&lt;span class="c"&gt;# With custom endpoint (MinIO, Wasabi, etc.)&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AWS_ENDPOINT_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://minio.example.com
manticore-backup &lt;span class="nt"&gt;--config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/etc/manticore/manticore.conf &lt;span class="nt"&gt;--backup-dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;s3://my-bucket/backups
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Environment Variables
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Variable&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;AWS_ACCESS_KEY_ID&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Your S3 access key&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;AWS_SECRET_ACCESS_KEY&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Your S3 secret key&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;AWS_REGION&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;S3 region (e.g., &lt;code&gt;us-east-1&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;AWS_ENDPOINT_URL&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Custom endpoint for S3-compatible storage&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;AWS_S3_ENCRYPTION&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Set to &lt;code&gt;0&lt;/code&gt; to disable SSE-S3 encryption (for MinIO/custom endpoints)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
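
&lt;p&gt;As a minimal sketch of how these variables combine for an S3-compatible endpoint, the snippet below sets all five and assembles the backup command. The credentials, endpoint host, and bucket are the same placeholders used above, and &lt;code&gt;DRY_RUN&lt;/code&gt; is a hypothetical helper variable of this sketch, not a &lt;code&gt;manticore-backup&lt;/code&gt; option:&lt;/p&gt;

```shell
# Placeholder values only; substitute your own credentials and endpoint.
export AWS_ACCESS_KEY_ID=your_access_key
export AWS_SECRET_ACCESS_KEY=your_secret_key
export AWS_REGION=us-east-1
export AWS_ENDPOINT_URL=https://minio.example.com
export AWS_S3_ENCRYPTION=0   # disable SSE-S3 for MinIO/custom endpoints

config="/etc/manticore/manticore.conf"
backup_dir="s3://my-bucket/backups"

# DRY_RUN is a helper of this sketch: "1" (the default) only prints the command.
if [ "${DRY_RUN:-1}" = "1" ]; then
  echo "manticore-backup --config=$config --backup-dir=$backup_dir"
else
  manticore-backup --config="$config" --backup-dir="$backup_dir"
fi
```

&lt;p&gt;Printing the command first is a cheap way to confirm the destination and environment before anything is uploaded; run it again with &lt;code&gt;DRY_RUN=0&lt;/code&gt; to perform the backup.&lt;/p&gt;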

&lt;h2&gt;
  
  
  Performance Considerations
&lt;/h2&gt;

&lt;p&gt;S3 streaming backup performance depends primarily on your network bandwidth and the S3 provider's upload speeds. Unlike local disk backups where you're limited by disk I/O, S3 backups are network-bound. The key advantage is eliminating the "write locally, then upload" overhead — data streams directly from Manticore to S3 without touching the local filesystem.&lt;/p&gt;

&lt;p&gt;For optimal performance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ensure adequate upload bandwidth to your S3 endpoint&lt;/li&gt;
&lt;li&gt;Consider using compression (&lt;code&gt;--compress&lt;/code&gt;) to reduce data transfer&lt;/li&gt;
&lt;li&gt;Multipart uploads are automatic for files over 5MB, improving reliability for large datasets&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Restore from S3
&lt;/h2&gt;

&lt;p&gt;Restoring works seamlessly too. The tool downloads files to a temporary directory first, then performs the restore:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# List available backups&lt;/span&gt;
manticore-backup &lt;span class="nt"&gt;--backup-dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;s3://my-bucket/manticore-backups &lt;span class="nt"&gt;--list&lt;/span&gt;

&lt;span class="c"&gt;# Restore a specific backup&lt;/span&gt;
manticore-backup &lt;span class="nt"&gt;--config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/etc/manticore/manticore.conf &lt;span class="nt"&gt;--backup-dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;s3://my-bucket/manticore-backups &lt;span class="nt"&gt;--restore&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;backup-20250115120000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Required S3 Permissions
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;For backup:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;s3:PutObject&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;s3:PutObjectAcl&lt;/code&gt; (if using ACLs)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;For listing backups:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;s3:ListBucket&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;For restore:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;s3:GetObject&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; While listing backups requires &lt;code&gt;s3:ListBucket&lt;/code&gt;, restoring a specific backup does not. If you know the backup folder name (e.g., &lt;code&gt;backup-20250115120000&lt;/code&gt;), you can restore directly using &lt;code&gt;--restore&lt;/code&gt; with just &lt;code&gt;s3:GetObject&lt;/code&gt; permission. The manifest file tracks all backup contents, so no directory listing is needed.&lt;/p&gt;
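
&lt;p&gt;On AWS, one way to grant these permissions is an IAM policy. The sketch below writes such a policy to a file. The bucket name and prefix come from the examples in this article, the output path is arbitrary, and S3-compatible providers may use a different policy mechanism. &lt;code&gt;s3:PutObjectAcl&lt;/code&gt; is left out on the assumption that ACLs are not in use:&lt;/p&gt;

```shell
# Assumed output location for this sketch.
policy_file="${TMPDIR:-/tmp}/manticore-backup-policy.json"

# Object-level actions apply to keys under the prefix;
# s3:ListBucket applies to the bucket itself.
printf '%s\n' \
  '{' \
  '  "Version": "2012-10-17",' \
  '  "Statement": [' \
  '    {' \
  '      "Effect": "Allow",' \
  '      "Action": ["s3:PutObject", "s3:GetObject"],' \
  '      "Resource": "arn:aws:s3:::my-bucket/manticore-backups/*"' \
  '    },' \
  '    {' \
  '      "Effect": "Allow",' \
  '      "Action": "s3:ListBucket",' \
  '      "Resource": "arn:aws:s3:::my-bucket"' \
  '    }' \
  '  ]' \
  '}' > "$policy_file"

echo "policy written to $policy_file"
```

&lt;p&gt;A restore-only role could drop &lt;code&gt;s3:PutObject&lt;/code&gt; and the &lt;code&gt;s3:ListBucket&lt;/code&gt; statement, in line with the note above.&lt;/p&gt;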

&lt;h2&gt;
  
  
  Use Cases
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Cloud-Native Deployments
&lt;/h3&gt;

&lt;p&gt;Running Manticore in Kubernetes or Docker? S3 backup fits naturally into cloud-native workflows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Kubernetes CronJob example&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;batch/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;CronJob&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;manticore-backup&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;schedule&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;0&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;2&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*"&lt;/span&gt;
  &lt;span class="na"&gt;jobTemplate&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;backup&lt;/span&gt;
            &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;manticoresearch/manticore:latest&lt;/span&gt;
            &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;manticore-backup&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;--config=/etc/manticore/manticore.conf&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;--backup-dir=s3://my-backup-bucket/manticore&lt;/span&gt;
            &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;
              &lt;span class="na"&gt;valueFrom&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                &lt;span class="na"&gt;secretKeyRef&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;s3-credentials&lt;/span&gt;
                  &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;access-key&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;
              &lt;span class="na"&gt;valueFrom&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                &lt;span class="na"&gt;secretKeyRef&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;s3-credentials&lt;/span&gt;
                  &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;secret-key&lt;/span&gt;
          &lt;span class="na"&gt;restartPolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;OnFailure&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Disaster Recovery
&lt;/h3&gt;

&lt;p&gt;Store backups in a different region or even a different cloud provider:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Primary backup to local S3-compatible storage&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AWS_ENDPOINT_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://minio.internal.company.com
manticore-backup &lt;span class="nt"&gt;--backup-dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;s3://backups-primary/manticore

&lt;span class="c"&gt;# Secondary backup to AWS S3 for DR&lt;/span&gt;
&lt;span class="nb"&gt;unset &lt;/span&gt;AWS_ENDPOINT_URL
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AWS_REGION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;eu-west-1
manticore-backup &lt;span class="nt"&gt;--backup-dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;s3://company-dr-backups/manticore
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Reducing Local Storage Requirements
&lt;/h3&gt;

&lt;p&gt;For large datasets, local backup storage can be expensive. With S3 streaming:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No need to provision large backup volumes&lt;/li&gt;
&lt;li&gt;Pay only for the S3 storage you use&lt;/li&gt;
&lt;li&gt;Lifecycle policies can automatically move old backups to cheaper storage classes&lt;/li&gt;
&lt;/ul&gt;
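
&lt;p&gt;To illustrate the last point: on AWS, a lifecycle rule can be described in a small JSON document. The rule name, prefix, 30-day threshold, and &lt;code&gt;GLACIER&lt;/code&gt; storage class below are assumptions chosen for this sketch, and applying the rule relies on the AWS CLI rather than on &lt;code&gt;manticore-backup&lt;/code&gt;:&lt;/p&gt;

```shell
# Assumed output location for this sketch.
lifecycle_file="${TMPDIR:-/tmp}/manticore-backup-lifecycle.json"

# Move objects under the backup prefix to a cheaper class after 30 days.
printf '%s\n' \
  '{' \
  '  "Rules": [' \
  '    {' \
  '      "ID": "archive-old-manticore-backups",' \
  '      "Filter": { "Prefix": "manticore-backups/" },' \
  '      "Status": "Enabled",' \
  '      "Transitions": [ { "Days": 30, "StorageClass": "GLACIER" } ]' \
  '    }' \
  '  ]' \
  '}' > "$lifecycle_file"

# To apply it with the AWS CLI (not executed in this sketch):
# aws s3api put-bucket-lifecycle-configuration \
#   --bucket my-bucket \
#   --lifecycle-configuration "file://$lifecycle_file"
```

&lt;p&gt;Other S3-compatible providers may not support lifecycle transitions or may name storage classes differently, so check the provider's documentation before relying on a rule like this.&lt;/p&gt;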

&lt;h2&gt;
  
  
  Technical Details
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Streaming Architecture
&lt;/h3&gt;

&lt;p&gt;The S3 storage implementation uses a streaming approach:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;File-by-file streaming&lt;/strong&gt;: Each table file is read and uploaded as a stream&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automatic multipart&lt;/strong&gt;: Files over 5MB automatically use multipart upload for reliability&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compression on-the-fly&lt;/strong&gt;: If enabled, zstd compression happens during the stream&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Checksum verification&lt;/strong&gt;: Each file is checksummed to ensure integrity&lt;/li&gt;
&lt;/ol&gt;
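
&lt;p&gt;For a rough sense of what multipart upload means for a large table file, the arithmetic below counts the parts needed. It assumes a fixed 5 MiB part size (the S3 minimum for every part except the last) and a hypothetical 12 GiB file; the part size &lt;code&gt;manticore-backup&lt;/code&gt; actually uses is not stated here and may differ:&lt;/p&gt;

```shell
# Assumption: 5 MiB parts.
part_size=$((5 * 1024 * 1024))
# Hypothetical table file of 12 GiB.
file_size=$((12 * 1024 * 1024 * 1024))

if [ "$file_size" -gt "$part_size" ]; then
  # Ceiling division: number of parts needed to cover the whole file.
  parts=$(( (file_size + part_size - 1) / part_size ))
else
  parts=1   # small files fit in a single request
fi
echo "parts=$parts"

# S3 caps a single multipart upload at 10,000 parts.
if [ "$parts" -gt 10000 ]; then
  echo "part size too small for this file"
fi
```

&lt;p&gt;Under these assumptions the file splits into 2458 parts, and a failed part can be retried on its own instead of restarting the whole upload.&lt;/p&gt;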

&lt;h3&gt;
  
  
  Storage Interface
&lt;/h3&gt;

&lt;p&gt;The S3 support is built on a new &lt;code&gt;StorageInterface&lt;/code&gt; that abstracts storage operations. This means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Local filesystem and S3 share the same code path&lt;/li&gt;
&lt;li&gt;Future storage backends (GCS, Azure Blob) can be added easily&lt;/li&gt;
&lt;li&gt;Consistent behavior regardless of storage type&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Migration from Local Backups
&lt;/h2&gt;

&lt;p&gt;Already using local backups? Migration is straightforward:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Set up your S3 credentials&lt;/li&gt;
&lt;li&gt;Change &lt;code&gt;--backup-dir&lt;/code&gt; from &lt;code&gt;/local/path&lt;/code&gt; to &lt;code&gt;s3://bucket/path&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;That's it! All other commands and options work unchanged&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Your existing local backups remain accessible, and you can gradually transition to S3 or maintain both for redundancy.&lt;/p&gt;
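
&lt;p&gt;In command form, the change might look like the sketch below. The local path is a hypothetical example, and &lt;code&gt;backup_cmd&lt;/code&gt; is a helper defined only for this illustration; it prints the commands instead of running them:&lt;/p&gt;

```shell
# Print the backup command for a given destination (sketch helper, not part of the tool).
backup_cmd() {
  echo "manticore-backup --config=/etc/manticore/manticore.conf --backup-dir=$1"
}

backup_cmd /var/backups/manticore            # before: hypothetical local path
backup_cmd s3://my-bucket/manticore-backups  # after: only the destination changes
```

&lt;p&gt;Running both commands on a schedule is one way to keep a local copy and an S3 copy side by side during the transition.&lt;/p&gt;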

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;S3 streamable backup brings Manticore Search backup capabilities to the cloud era. Whether you're running in a cloud-native environment, need cross-region disaster recovery, or simply want to reduce local storage overhead, direct-to-S3 streaming makes backups simpler and more efficient.&lt;/p&gt;

&lt;p&gt;The feature is available now in manticore-backup. Check out the &lt;a href="https://manual.manticoresearch.com/Securing_and_compacting_a_table/Backup_and_restore#S3-storage-support" rel="noopener noreferrer"&gt;documentation&lt;/a&gt; for more details, and let us know what you think!&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Ready to try it?&lt;/strong&gt; &lt;a href="https://manticoresearch.com/install/" rel="noopener noreferrer"&gt;Install Manticore Search&lt;/a&gt; and start backing up to S3 today. Questions or feedback? Join us on &lt;a href="https://slack.manticoresearch.com/" rel="noopener noreferrer"&gt;Slack&lt;/a&gt; or &lt;a href="https://github.com/manticoresoftware/manticoresearch-backup" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>cloud</category>
      <category>database</category>
      <category>devops</category>
      <category>tooling</category>
    </item>
    <item>
      <title>Prepared statements in Manticore Search</title>
      <dc:creator>Sergey Nikolaev</dc:creator>
      <pubDate>Fri, 03 Apr 2026 04:38:10 +0000</pubDate>
      <link>https://dev.to/sanikolaev/prepared-statements-in-manticore-search-2n4e</link>
      <guid>https://dev.to/sanikolaev/prepared-statements-in-manticore-search-2n4e</guid>
      <description>&lt;p&gt;Imagine you're building a powerful search application. Users type in keywords, and your backend needs to query the Manticore Search database to find matching results. A common (and tempting!) approach is to embed user input directly into your SQL queries. For example, you might filter by a numeric field such as a category or record ID. If the user passes a normal value like &lt;code&gt;5&lt;/code&gt;, the query is &lt;code&gt;SELECT * FROM products WHERE id=5&lt;/code&gt;. But what if they pass &lt;code&gt;1 OR 1=1&lt;/code&gt;? The query becomes &lt;code&gt;SELECT * FROM products WHERE id=1 OR 1=1&lt;/code&gt; — the condition is always true, so the query returns every row instead of one. This is SQL injection.&lt;/p&gt;

&lt;p&gt;Fortunately, there's a safer and more efficient way: &lt;strong&gt;prepared statements&lt;/strong&gt;. Essentially, prepared statements separate your SQL code from the data you pass in. Instead of building the entire query string each time, you define the query structure once with placeholders and then supply the search terms separately. You can learn more about the concept on &lt;a href="https://en.wikipedia.org/wiki/Prepared_statement" rel="noopener noreferrer"&gt;Wikipedia&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Manticore Search supports prepared statements over the standard MySQL protocol, giving you a powerful tool for building secure search applications. By using prepared statements, you'll not only dramatically reduce the risk of SQL injection, but you'll also improve the readability of your code.&lt;/p&gt;

&lt;p&gt;Prepared statements aren't just a feature; they're sometimes a requirement. For example, the Rust &lt;code&gt;sqlx&lt;/code&gt; library works with the MySQL endpoint solely using prepared statements. Also, some OLE DB connectors that enable MS SQL to work with a MySQL server use prepared statements internally.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Use Prepared Statements?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Security First (SQL Injection)&lt;/strong&gt;: SQL injection is a web security vulnerability that allows attackers to interfere with the queries an application makes to its database. It happens when user input is improperly incorporated into a SQL query, allowing malicious code to be executed. For example, consider a query built by concatenating a user-supplied value directly into the SQL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Vulnerable code example (DO NOT USE!)&lt;/span&gt;
&lt;span class="nv"&gt;$productId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$_GET&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'search'&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="nv"&gt;$query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"SELECT * FROM products WHERE id= "&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;$productId&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If &lt;code&gt;$productId&lt;/code&gt; contained something like &lt;code&gt;0 OR 1=1&lt;/code&gt;, the query would become &lt;code&gt;SELECT * FROM products WHERE id= 0 OR 1=1&lt;/code&gt;, effectively bypassing the WHERE clause and returning all products.&lt;/p&gt;

&lt;p&gt;Prepared statements prevent this by treating user input strictly as &lt;em&gt;data&lt;/em&gt;, not as part of the SQL command itself. The database driver handles the escaping and quoting, ensuring that any potentially harmful characters are neutralized. Here's the same query using a prepared statement:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Secure code example using a prepared statement&lt;/span&gt;
&lt;span class="nv"&gt;$productId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$_GET&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'search'&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="nv"&gt;$stmt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$mysqli&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;prepare&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"SELECT * FROM products WHERE id= ?"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nv"&gt;$stmt&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;bind_param&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"i"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$productId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nv"&gt;$stmt&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this case, even if &lt;code&gt;$productId&lt;/code&gt; contains malicious code, it will be treated as a literal value, not executable SQL.&lt;/p&gt;

&lt;h2&gt;
  
  
  How They Work
&lt;/h2&gt;

&lt;p&gt;Prepared statements operate using a simple three-step process:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Prepare:&lt;/strong&gt; First, you send the SQL statement with placeholders (like &lt;code&gt;?&lt;/code&gt; or &lt;code&gt;?VEC?&lt;/code&gt;) to Manticore Search. Manticore parses this statement and creates a query plan. It then returns a unique identifier for this prepared statement.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bind:&lt;/strong&gt; Next, you send the actual data – the values for the placeholders – to Manticore &lt;em&gt;separately&lt;/em&gt;. This is where the security comes in; the data is treated purely as data, not as SQL code.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Execute:&lt;/strong&gt; Finally, you instruct Manticore to execute the prepared statement using the stored query plan and the bound parameters.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Think of it like creating a template. You build the structure once, then fill in the blanks with different information each time you need to use it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Parameter Placeholders: &lt;code&gt;?&lt;/code&gt; &amp;amp; &lt;code&gt;?VEC?&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;Manticore Search uses specific placeholders to identify parameters within your prepared statements:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;?&lt;/code&gt; represents a single parameter – this could be an integer, a floating-point number, or a string. When using this placeholder, Manticore automatically handles escaping and quoting for string values, protecting against SQL injection and ensuring proper data formatting.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;?VEC?&lt;/code&gt; is designed for lists of numeric values. It expects a string containing numbers separated by commas and optional spaces (e.g., &lt;code&gt;1, 2.3, 4, 1e-10, INF&lt;/code&gt;). Crucially, &lt;em&gt;no escaping or quoting is applied&lt;/em&gt; to the values within &lt;code&gt;?VEC?&lt;/code&gt;. Valid input consists solely of numbers, commas, and spaces; any other characters will likely result in an error. This makes it perfect for directly inserting numeric vectors into your data - both float vectors and integer MVAs (multi-value attributes).&lt;/li&gt;
&lt;/ul&gt;
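
&lt;p&gt;Because &lt;code&gt;?VEC?&lt;/code&gt; input is not escaped, it can be worth checking the string on the client before binding it. The function below is a conservative sketch of such a check, written as a shell helper for illustration: &lt;code&gt;is_vec_payload&lt;/code&gt; and its pattern are assumptions of this example, not Manticore's own parser, which may accept or reject slightly different input:&lt;/p&gt;

```shell
# Return 0 if the argument looks like a ?VEC? payload:
# numbers (optionally signed, decimal, or with an exponent) or INF,
# separated by commas and optional spaces.
is_vec_payload() {
  nl='
'
  # Reject multi-line input outright; grep would otherwise test each line separately.
  case "$1" in
    *"$nl"*) return 1 ;;
  esac
  num='[+-]?(([0-9]+(\.[0-9]*)?|\.[0-9]+)([eE][+-]?[0-9]+)?|INF)'
  printf '%s\n' "$1" | grep -Eq "^ *${num}( *, *${num})* *\$"
}

for candidate in "1, 2.3, 4, 1e-10, INF" "1); DROP TABLE items"; do
  if is_vec_payload "$candidate"; then
    echo "ok:       $candidate"
  else
    echo "rejected: $candidate"
  fi
done
```

&lt;p&gt;A check like this complements the server-side validation described above and gives the application a clearer error before the statement is executed.&lt;/p&gt;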

&lt;h2&gt;
  
  
  Example: prepared statements in PHP
&lt;/h2&gt;

&lt;p&gt;Let's see how prepared statements work in practice using PHP. We'll demonstrate both a simple insert with string values and a more complex insert involving a floating-point vector using the &lt;code&gt;?VEC?&lt;/code&gt; placeholder.&lt;/p&gt;

&lt;p&gt;First, a basic insertion:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;?php&lt;/span&gt;
&lt;span class="c1"&gt;// Assuming you have a valid MySQLi connection established ($mysqli)&lt;/span&gt;

&lt;span class="nv"&gt;$stmt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$mysqli&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;prepare&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"INSERT INTO products (name, description) VALUES (?, ?)"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nv"&gt;$productName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Awesome Widget"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nv"&gt;$productDescription&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"A truly amazing widget for all your needs."&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nv"&gt;$stmt&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;bind_param&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"ss"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$productName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$productDescription&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// "ss" indicates two strings&lt;/span&gt;
&lt;span class="nv"&gt;$stmt&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="k"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Product added successfully!"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="cp"&gt;?&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This code prepares the &lt;code&gt;INSERT&lt;/code&gt; statement, binds the string values for the product name and description, and then executes the query. The resulting SQL executed by Manticore would be:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Awesome Widget'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'A truly amazing widget for all your needs.'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, let's tackle an example using a float vector. &lt;strong&gt;What is &lt;code&gt;?VEC?&lt;/code&gt;?&lt;/strong&gt; It is a placeholder (only used in prepared statements) for a &lt;em&gt;vector&lt;/em&gt; — a list of numbers, e.g. for embeddings or similar data. In Manticore SQL, a vector literal is always written with parentheses: &lt;code&gt;(0.1, 0.2, 0.3)&lt;/code&gt;. So when you use a prepared statement and have a vector parameter, you write those parentheses in the SQL string and use &lt;code&gt;?VEC?&lt;/code&gt; where the numbers go. You bind only the comma-separated numbers (e.g. &lt;code&gt;"0.1,0.2,0.3"&lt;/code&gt;); you do not bind the &lt;code&gt;(&lt;/code&gt; and &lt;code&gt;)&lt;/code&gt; — they stay in the query. Without prepared statements you would build the full literal &lt;code&gt;(0.1, 0.2, 0.3)&lt;/code&gt; yourself in the query string.&lt;/p&gt;

&lt;p&gt;In PHP &lt;code&gt;mysqli&lt;/code&gt;, the usual way to bind &lt;code&gt;?VEC?&lt;/code&gt; values is as strings, so &lt;code&gt;iss&lt;/code&gt; is the normal choice in this example. If you want to stream a larger vector payload, you can also bind the parameter as &lt;code&gt;b&lt;/code&gt; and send the contents with &lt;code&gt;send_long_data()&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;?php&lt;/span&gt;
&lt;span class="c1"&gt;// Assuming you have a valid MySQLi connection established ($mysqli)&lt;/span&gt;

&lt;span class="nv"&gt;$stmt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$mysqli&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;prepare&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"INSERT INTO items (item_id, coords, features) VALUES (?, (?VEC?),(?VEC?))"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nv"&gt;$itemId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;123&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nv"&gt;$coordVector&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"20.245,54.354,30.000"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// that is vector of floats&lt;/span&gt;
&lt;span class="nv"&gt;$featureSet&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"1,4,20,456,112,3"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// that is set of integer values (MVA)&lt;/span&gt;
&lt;span class="nv"&gt;$stmt&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;bind_param&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"iss"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$itemId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$coordVector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$featureSet&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// "i" for integer (itemId), "s" for string&lt;/span&gt;
&lt;span class="nv"&gt;$stmt&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="k"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Item with feature vector added successfully!"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nv"&gt;$itemId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;124&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nv"&gt;$coordVector&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"18.500,42.000,31.125"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Another float vector&lt;/span&gt;
&lt;span class="nv"&gt;$featureSet&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"0,6,34,665,22,3445,221,564,2232,5644,43"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Example with more feature values&lt;/span&gt;

&lt;span class="c1"&gt;// For larger payloads you can bind the second ?VEC? as a blob and stream it.&lt;/span&gt;
&lt;span class="nv"&gt;$featurePlaceholder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;""&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nv"&gt;$stmt&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;bind_param&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"isb"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$itemId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$coordVector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$featurePlaceholder&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// "b" is for blob data&lt;/span&gt;
&lt;span class="c1"&gt;// bind_param() must be called before send_long_data().&lt;/span&gt;
&lt;span class="nv"&gt;$stmt&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;send_long_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$featureSet&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// zero-based index: 2 means the third bound parameter&lt;/span&gt;
&lt;span class="nv"&gt;$stmt&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="k"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Item with feature vector added successfully!"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="cp"&gt;?&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice that the parentheses are &lt;em&gt;part of the SQL string&lt;/em&gt; in the &lt;code&gt;prepare()&lt;/code&gt; call. We only bind the &lt;em&gt;values&lt;/em&gt; within the parentheses using the &lt;code&gt;?VEC?&lt;/code&gt; placeholder. The resulting SQL executed by Manticore will be:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;items&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;coords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;123&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;245&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;54&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;354&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;456&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;112&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;items&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;coords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;124&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;18&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;31&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;125&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;34&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;665&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;22&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;3445&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;221&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;564&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2232&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;5644&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;43&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Using &lt;code&gt;?VEC?&lt;/code&gt; in a prepared statement gives you the same benefits as the &lt;code&gt;?&lt;/code&gt; placeholder: the vector values are sent as data, not as part of the SQL text, so they cannot be interpreted as SQL and cannot cause injection. You also avoid building or escaping the vector literal by hand in your application. Manticore receives the bound numbers and formats the vector itself, which keeps the query safe and the data consistent.&lt;/p&gt;

&lt;h2&gt;
  
  
  Important Considerations &amp;amp; Limitations
&lt;/h2&gt;

&lt;p&gt;While powerful, Manticore's prepared statements have a few limitations to keep in mind.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multi-Queries:&lt;/strong&gt; Only a single SQL statement is allowed per prepared statement. Attempts to use multi-queries (e.g., &lt;code&gt;SELECT ...; SHOW META&lt;/code&gt;) will fail. If you need to execute multiple statements, prepare a separate statement for each one and execute them sequentially within the same session.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Numeric Types:&lt;/strong&gt; Some database drivers (like &lt;code&gt;mysql2&lt;/code&gt; for Node.js) might send numeric parameters as &lt;code&gt;DOUBLE&lt;/code&gt; by default. This can lead to unexpected results if you rely on strict integer behavior (such as rejecting negative IDs). In such cases, consider sending integers as strings or using driver-specific integer types (e.g., &lt;code&gt;BigInt&lt;/code&gt;) to ensure correct data handling.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rust &lt;code&gt;sqlx&lt;/code&gt; Users:&lt;/strong&gt; If you're using the &lt;code&gt;sqlx&lt;/code&gt; crate in Rust, be aware that when reading result set rows, you &lt;strong&gt;must&lt;/strong&gt; use column &lt;em&gt;indices&lt;/em&gt; rather than column names. While column names are present in the result set, &lt;code&gt;sqlx&lt;/code&gt; doesn't utilize them for mapping. For example, use &lt;code&gt;row.try_get(0)?&lt;/code&gt; instead of &lt;code&gt;row.try_get("id")?&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Prepared statements combine security, readability, and potential performance gains when working with Manticore Search. By separating your SQL logic from your data, you greatly reduce the risk of SQL injection, make your code easier to maintain, and may speed up query execution. We strongly encourage you to adopt prepared statements in your Manticore Search applications.&lt;/p&gt;

&lt;p&gt;For more in-depth information, be sure to consult these resources:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Manticore Search Documentation on Prepared Statements: &lt;a href="https://manual.manticoresearch.com/Connecting_to_the_server/MySQL_protocol#Prepared-statements" rel="noopener noreferrer"&gt;https://manual.manticoresearch.com/Connecting_to_the_server/MySQL_protocol#Prepared-statements&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Wikipedia - Prepared Statements: &lt;a href="https://en.wikipedia.org/wiki/Prepared_statement" rel="noopener noreferrer"&gt;https://en.wikipedia.org/wiki/Prepared_statement&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This guide provides a solid foundation for using prepared statements effectively in your Manticore Search projects, leading to more secure, efficient, and maintainable applications.&lt;/p&gt;

</description>
      <category>backend</category>
      <category>database</category>
      <category>security</category>
      <category>sql</category>
    </item>
    <item>
      <title>KNN prefiltering in Manticore Search</title>
      <dc:creator>Sergey Nikolaev</dc:creator>
      <pubDate>Thu, 02 Apr 2026 05:50:56 +0000</pubDate>
      <link>https://dev.to/sanikolaev/knn-prefiltering-in-manticore-search-c2f</link>
      <guid>https://dev.to/sanikolaev/knn-prefiltering-in-manticore-search-c2f</guid>
      <description>&lt;p&gt;Vector search rarely happens in isolation. You almost always have filters — a price range, a category, a date window, a geographic boundary. The question is: when do those filters get applied?&lt;/p&gt;

&lt;p&gt;The answer makes a surprising difference in result quality.&lt;/p&gt;

&lt;p&gt;KNN prefiltering is available in Manticore Search starting from version &lt;code&gt;19.0.1&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem with postfiltering
&lt;/h2&gt;

&lt;p&gt;Consider a product catalog with 10 million items. A user asks for the 10 nearest neighbors to a query vector, restricted to &lt;code&gt;category = 'electronics'&lt;/code&gt;. With postfiltering, the KNN search runs first over the entire dataset, then the filter is applied to the results. If electronics make up 5% of the catalog, the graph explores nodes that are mostly irrelevant. Worse, many of the k nearest neighbors may not be electronics at all, so the final result set can be much smaller than requested. Ask for 10 results, get 2.&lt;/p&gt;

&lt;p&gt;This is the fundamental limitation of postfiltering: the HNSW graph doesn't know about your filters. It finds the closest vectors overall, not the closest vectors that match your criteria. The more selective the filter, the worse the problem gets.&lt;/p&gt;
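
&lt;p&gt;The shortfall can be reproduced with a minimal sketch. It assumes a synthetic catalog of 2,000 random 4-dimensional vectors, 5% of them tagged as electronics, and it uses an exact scan as a stand-in for HNSW so that only the order of filtering differs. The helper names (&lt;code&gt;top_k&lt;/code&gt;, &lt;code&gt;postfilter&lt;/code&gt;, &lt;code&gt;prefilter&lt;/code&gt;) are illustrative and are not part of Manticore.&lt;/p&gt;

```python
import random


def top_k(query, items, k):
    """Return the k items closest to query by squared Euclidean distance."""
    def dist(item):
        return sum((a - b) ** 2 for a, b in zip(query, item["vec"]))
    return sorted(items, key=dist)[:k]


def postfilter(query, items, k, keep):
    """Find the k nearest items overall, then drop those failing the filter."""
    return [it for it in top_k(query, items, k) if keep(it)]


def prefilter(query, items, k, keep):
    """Consider only filter-matching items while searching."""
    return top_k(query, [it for it in items if keep(it)], k)


# Synthetic catalog (an assumption for illustration): 5% electronics.
rng = random.Random(42)
items = [
    {
        "id": i,
        "vec": [rng.random() for _ in range(4)],
        "category": "electronics" if i % 20 == 0 else "other",
    }
    for i in range(2000)
]
query = [0.12, 0.45, 0.78, 0.33]


def is_electronics(item):
    return item["category"] == "electronics"


post = postfilter(query, items, 10, is_electronics)
pre = prefilter(query, items, 10, is_electronics)
print("postfilter results:", len(post))
print("prefilter results:", len(pre))
```

&lt;p&gt;The prefiltered search always returns 10 items here because 100 electronics exist. The postfiltered search returns only as many of the global top 10 as happen to be electronics, which at 5% selectivity is often zero or one.&lt;/p&gt;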

&lt;h2&gt;
  
  
  What prefiltering does differently
&lt;/h2&gt;

&lt;p&gt;Prefiltering passes the filter into the HNSW graph traversal itself. As the algorithm explores candidate nodes, each one is checked against the filter before being added to the result heap. Only matching documents contribute to the final k results. This means you reliably get the k results you asked for, assuming k matching documents exist in the dataset.&lt;/p&gt;

&lt;p&gt;In Manticore Search, prefiltering is enabled by default when your query combines KNN search with attribute filters. No special syntax is needed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;knn_dist&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;45&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;78&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;33&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'electronics'&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both &lt;code&gt;category = 'electronics'&lt;/code&gt; and &lt;code&gt;price &amp;lt; 500&lt;/code&gt; are evaluated during HNSW traversal, not after. The equivalent JSON query:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;POST&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/search&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"table"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"products"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"knn"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"field"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"embedding"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.45&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.78&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.33&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"bool"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"must"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"equals"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"electronics"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"range"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"price"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"lt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"limit"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Naive prefiltering and where it falls short
&lt;/h2&gt;

&lt;p&gt;The obvious first approach is straightforward: traverse the HNSW graph normally, compute distances for every neighbor, but only add filter-matching nodes to the result heap. Filtered-out nodes still participate in navigation — if a non-matching node has a competitive distance, it enters the candidate queue and its neighbors get explored. The filter only gates what goes into the results.&lt;/p&gt;

&lt;p&gt;This actually works reasonably well. The graph stays connected because filtered-out nodes are still traversed. But it has a performance problem that gets worse as the filter becomes more selective: every unvisited neighbor gets a distance computation regardless of whether it passes the filter. Distance computation is the most expensive operation in the search. With a filter matching 5% of documents, roughly 95% of those computations score nodes that can never appear in the results. They help with navigation, but the algorithm pays full cost for them and most of the work yields no results.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Manticore solves it: ACORN-1
&lt;/h2&gt;

&lt;p&gt;Manticore uses an ACORN-1-based algorithm (from the &lt;a href="https://arxiv.org/abs/2403.04871" rel="noopener noreferrer"&gt;ACORN paper&lt;/a&gt;, SIGMOD 2024) that improves on naive prefiltering in two ways:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No distance computation for filtered-out nodes.&lt;/strong&gt; When visiting a node's neighbors, ACORN-1 checks the filter first and only computes distance for nodes that pass. Filtered-out neighbors are never scored. When 95% of nodes fail the filter, this saves roughly 95% of the distance work compared to the naive approach.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Adaptive expansion through filtered-out nodes.&lt;/strong&gt; When a neighbor fails the filter, the algorithm looks through that node's own neighbors to find filter-passing nodes further away. If those neighbors also fail the filter and not enough matching candidates have been found yet, it keeps going — 3 hops, 4 hops, as far as needed. The more selective the filter, the more aggressively the algorithm expands. This targeted walk through non-matching neighborhoods reaches matching candidates without scoring the non-matching ones along the way.&lt;/p&gt;

&lt;p&gt;Think of it as searching for Italian restaurants in a city. The naive approach checks the menu at every restaurant and only keeps the Italian ones. ACORN-1 glances at the sign first — "French, skip; Thai, skip" — without going inside. And when it sees a stretch of non-Italian restaurants, it walks past them, peeking around each corner until it finds an Italian place on the other side.&lt;/p&gt;
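
&lt;p&gt;The difference in distance work can be sketched on a toy graph. This is a simplified illustration, not Manticore's implementation: it assumes a single-layer neighbor graph built by exact search, a filter that passes 5% of nodes, and a fixed one-extra-hop expansion through failing neighbors instead of the adaptive expansion described above. The function &lt;code&gt;search&lt;/code&gt; and its &lt;code&gt;filter_first&lt;/code&gt; flag are hypothetical names.&lt;/p&gt;

```python
import heapq
import random


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search(graph, vecs, passes, query, entry, k, ef, filter_first):
    """Best-first search on one layer; returns (result ids, distance count)."""
    computed = 0  # number of distance computations performed

    def dist(node):
        nonlocal computed
        computed += 1
        return sq_dist(query, vecs[node])

    visited = {entry}
    d_entry = dist(entry)
    candidates = [(d_entry, entry)]  # min-heap of nodes to expand
    # Max-heap (negated distances) holding filter-matching results only.
    results = [(-d_entry, entry)] if passes(entry) else []

    while candidates:
        d, node = heapq.heappop(candidates)
        # Stop once the best candidate is worse than the worst kept result.
        if len(results) >= ef and d > -results[0][0]:
            break
        if filter_first:
            # Check the filter before scoring; look one hop past failures.
            to_score = []
            for nb in graph[node]:
                if passes(nb):
                    to_score.append(nb)
                else:
                    to_score.extend(m for m in graph[nb] if passes(m))
        else:
            # Naive: score every neighbor, matching or not.
            to_score = list(graph[node])
        for nb in to_score:
            if nb in visited:
                continue
            visited.add(nb)
            d_nb = dist(nb)
            heapq.heappush(candidates, (d_nb, nb))
            if passes(nb):
                heapq.heappush(results, (-d_nb, nb))
                if len(results) > ef:
                    heapq.heappop(results)
    top = sorted((-neg, nb) for neg, nb in results)[:k]
    return [nb for _, nb in top], computed


# Toy data (assumed): 300 random vectors, each linked to its 8 nearest.
rng = random.Random(7)
n = 300
vecs = [[rng.random() for _ in range(4)] for _ in range(n)]
graph = {i: set() for i in range(n)}
for i in range(n):
    others = (j for j in range(n) if j != i)
    for j in sorted(others, key=lambda j: sq_dist(vecs[i], vecs[j]))[:8]:
        graph[i].add(j)
        graph[j].add(i)

matching = {i for i in range(n) if i % 20 == 0}  # 5% pass the filter


def passes(node):
    return node in matching


query = [0.12, 0.45, 0.78, 0.33]
naive_ids, naive_cost = search(graph, vecs, passes, query, 0, 5, 20, False)
acorn_ids, acorn_cost = search(graph, vecs, passes, query, 0, 5, 20, True)
print("naive distance computations:", naive_cost)
print("filter-first distance computations:", acorn_cost)
```

&lt;p&gt;In the filter-first mode only matching nodes (plus the entry point) are ever scored, so its distance count is bounded by the number of matching nodes, while the naive mode scores every node it reaches.&lt;/p&gt;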

&lt;p&gt;Manticore activates ACORN-1 when fewer than 60% of total documents pass the filter. Above that threshold, naive prefiltering works well enough on its own.&lt;/p&gt;

&lt;h2&gt;
  
  
  Automatic brute-force fallback
&lt;/h2&gt;

&lt;p&gt;Prefiltering works well across a wide range of filter selectivities, but there's an extreme case: what if only 50 documents out of 10 million match the filter? Traversing the HNSW graph — even with ACORN-1 — visits far more nodes than just scanning those 50 documents directly.&lt;/p&gt;

&lt;p&gt;Manticore detects this automatically. When prefiltering is enabled, the query planner estimates the cost of HNSW traversal versus a brute-force distance scan over the filtered subset. It uses histogram-based selectivity estimates to predict how many documents pass the filter, then compares that against the expected number of nodes HNSW would visit. If brute-force is cheaper, Manticore skips HNSW entirely and scans the filtered documents directly.&lt;/p&gt;

&lt;p&gt;This means you don't need to think about edge cases. Prefiltering adapts: naive filtered HNSW when most documents pass the filter, ACORN-1 for moderate selectivity, brute-force for extreme selectivity, and standard HNSW when no filter is present.&lt;/p&gt;
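
&lt;p&gt;The decision logic can be pictured roughly as follows. This is a hypothetical sketch, not Manticore's planner: the function name, its inputs, and the direct comparison of the two estimates are assumptions made for illustration. Only the 60% ACORN-1 threshold comes from the text above.&lt;/p&gt;

```python
def choose_strategy(total_docs, est_matching, est_hnsw_visits,
                    acorn_threshold=0.6):
    """Pick a filtered-KNN strategy from estimated filter selectivity.

    est_matching    -- estimated documents passing the filter
    est_hnsw_visits -- estimated nodes an HNSW traversal would visit
                       (a hypothetical input; the real cost model is
                       internal to the query planner)
    """
    # Scanning the filtered subset is cheaper than walking the graph.
    if est_hnsw_visits >= est_matching:
        return "brute_force"
    # Selective filter: skip distance work for filtered-out nodes.
    if acorn_threshold > est_matching / total_docs:
        return "acorn_1"
    # Most documents match: naive filtered HNSW is good enough.
    return "filtered_hnsw"


print(choose_strategy(10_000_000, 50, 5_000))
print(choose_strategy(10_000_000, 500_000, 5_000))
print(choose_strategy(10_000_000, 9_500_000, 5_000))
```

&lt;p&gt;With these made-up estimates the three calls select brute-force, ACORN-1, and naive filtered HNSW respectively.&lt;/p&gt;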

&lt;h2&gt;
  
  
  When to use postfiltering instead
&lt;/h2&gt;

&lt;p&gt;Prefiltering isn't always the best choice. There are cases where postfiltering is preferable:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;When you want the closest vectors regardless of filters.&lt;/strong&gt; Postfiltering gives you the k nearest neighbors from the full dataset, then removes non-matching ones. If your application tolerates getting fewer than k results and you care most about vector distance quality, postfiltering is simpler and more predictable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;When the filter matches most documents.&lt;/strong&gt; If 95% of documents pass the filter, prefiltering adds overhead for almost no benefit — nearly every candidate matches anyway.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;When you're debugging or benchmarking.&lt;/strong&gt; Postfiltering gives you a clean baseline: pure HNSW results with a filter on top. This makes it easier to isolate whether a quality issue comes from the graph or the filter.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To explicitly request postfiltering in SQL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;knn_dist&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;45&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;78&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;33&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;prefilter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'electronics'&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In JSON, set &lt;code&gt;"prefilter": false&lt;/code&gt; inside the &lt;code&gt;knn&lt;/code&gt; object:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;POST&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/search&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"table"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"products"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"knn"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"field"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"embedding"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.45&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.78&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.33&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"prefilter"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"equals"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"electronics"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"limit"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Forcing brute-force
&lt;/h2&gt;

&lt;p&gt;If you know your dataset is small enough or your filters selective enough that a linear scan is the right strategy, you can force brute-force mode directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;knn_dist&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;45&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;78&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;33&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;fullscan&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'electronics'&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This skips HNSW entirely and computes exact distances over all documents that pass the filter. It guarantees perfect recall at the cost of linear-time scanning.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Prefiltering is the default in Manticore and the right choice for most filtered KNN queries. It reliably returns k results when at least k matching documents exist. Manticore automatically picks the strategy based on how selective the filter is: standard filtered HNSW when most documents match, ACORN-1 when fewer than 60% pass (saving distance computations on filtered-out nodes), and brute-force when the filtered subset is small enough to scan directly. The query planner estimates filter selectivity per query and per segment, so there's nothing to tune.&lt;/p&gt;

&lt;p&gt;Use postfiltering (&lt;code&gt;prefilter=0&lt;/code&gt; in SQL, &lt;code&gt;"prefilter": false&lt;/code&gt; in JSON) when you want the globally closest vectors and can tolerate getting fewer than k results. Use brute-force (&lt;code&gt;fullscan=1&lt;/code&gt; in SQL, &lt;code&gt;"fullscan": true&lt;/code&gt; in JSON) when you know a linear scan is the right strategy for your data.&lt;/p&gt;

</description>
      <category>algorithms</category>
      <category>database</category>
      <category>machinelearning</category>
      <category>performance</category>
    </item>
    <item>
      <title>Hybrid search in Manticore Search</title>
      <dc:creator>Sergey Nikolaev</dc:creator>
      <pubDate>Wed, 01 Apr 2026 10:46:41 +0000</pubDate>
      <link>https://dev.to/sanikolaev/hybrid-search-in-manticore-search-5ake</link>
      <guid>https://dev.to/sanikolaev/hybrid-search-in-manticore-search-5ake</guid>
      <description>&lt;p&gt;Search is rarely a one-size-fits-all problem. A user typing "cheap running shoes" wants exact keyword matches, but a user asking "comfortable footwear for jogging" is expressing the same intent in different words. Traditional full-text search handles the first case well. Vector search handles the second. Hybrid search combines both in a single query so you don't have to choose.&lt;/p&gt;

&lt;p&gt;In modern search systems, this is often described as combining &lt;strong&gt;lexical (sparse) retrieval&lt;/strong&gt; with &lt;strong&gt;semantic (dense) retrieval&lt;/strong&gt;. Different terms, same idea: exact matching plus meaning.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is hybrid search?
&lt;/h2&gt;

&lt;p&gt;Hybrid search runs a full-text (BM25) search and a vector (KNN) search side by side, then merges the two result lists into one. Documents that score well on either signal (or both) rise to the top.&lt;/p&gt;

&lt;p&gt;Full-text search is great at exact keywords, rare terms, and identifiers. Vector search understands meaning — that "automobile" and "car" are the same concept — because their embeddings are nearby in vector space.&lt;/p&gt;

&lt;p&gt;Each method has blind spots:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Full-text struggles with synonyms and natural language&lt;/li&gt;
&lt;li&gt;Vector search struggles with exact tokens like SKUs, error codes, and IDs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Hybrid search covers both.&lt;/p&gt;

&lt;h2&gt;
  
  
  How hybrid search fits into modern search pipelines
&lt;/h2&gt;

&lt;p&gt;Hybrid search is the &lt;strong&gt;retrieval stage&lt;/strong&gt; — the part that finds relevant candidates from your dataset.&lt;/p&gt;

&lt;p&gt;Instead of relying on a single method, hybrid search combines keyword matching and semantic similarity to produce a stronger result set from the start.&lt;/p&gt;

&lt;p&gt;In practice, this means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Better recall for natural language queries&lt;/li&gt;
&lt;li&gt;Precise matching for identifiers like SKUs or error codes&lt;/li&gt;
&lt;li&gt;More relevant results without needing complex query logic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The goal is simple: return the best possible candidates in a single pass, using both signals together.&lt;/p&gt;

&lt;h2&gt;
  
  
  When should you use it?
&lt;/h2&gt;

&lt;p&gt;Hybrid search is a good fit when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your queries mix intent and specifics. A search like &lt;code&gt;python error 403 forbidden&lt;/code&gt; benefits from keyword precision on the error code and semantic understanding of the problem description.&lt;/li&gt;
&lt;li&gt;You're building a RAG pipeline. Retrieval-Augmented Generation needs the most relevant chunks fed to the LLM. Hybrid retrieval often finds more relevant documents than either method alone.&lt;/li&gt;
&lt;li&gt;Your catalog has structured and unstructured data. E-commerce products have precise names and model numbers (keyword territory) but also descriptions where meaning matters more than exact wording.&lt;/li&gt;
&lt;li&gt;You can't predict how users will search. Some will paste exact phrases, others will describe what they're looking for in natural language. Hybrid search handles both gracefully.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How it works
&lt;/h2&gt;

&lt;p&gt;Manticore uses Reciprocal Rank Fusion (RRF) to merge results. The idea is simple: instead of trying to compare raw BM25 scores with KNN distances (which are on completely different scales), RRF looks at rank positions. A document that's ranked #1 in the text results and #3 in the KNN results gets a higher combined score than a document that only appears in one list.&lt;/p&gt;

&lt;p&gt;Here's a quick example. Suppose a text search and a KNN search each return their own top 3:&lt;/p&gt;

&lt;p&gt;Text search results:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Rank&lt;/th&gt;
&lt;th&gt;Document&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Doc A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Doc B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Doc C&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;KNN search results:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Rank&lt;/th&gt;
&lt;th&gt;Document&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Doc C&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Doc A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Doc D&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;RRF gives each document a contribution of &lt;code&gt;1 / (rank_constant + rank)&lt;/code&gt; from every list it appears in, then sums those contributions. With the default &lt;code&gt;rank_constant=60&lt;/code&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Document&lt;/th&gt;
&lt;th&gt;Text contribution&lt;/th&gt;
&lt;th&gt;KNN contribution&lt;/th&gt;
&lt;th&gt;RRF score&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Doc A&lt;/td&gt;
&lt;td&gt;1/(60+1) = 0.0164&lt;/td&gt;
&lt;td&gt;1/(60+2) = 0.0161&lt;/td&gt;
&lt;td&gt;0.0325&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Doc C&lt;/td&gt;
&lt;td&gt;1/(60+3) = 0.0159&lt;/td&gt;
&lt;td&gt;1/(60+1) = 0.0164&lt;/td&gt;
&lt;td&gt;0.0323&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Doc B&lt;/td&gt;
&lt;td&gt;1/(60+2) = 0.0161&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;0.0161&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Doc D&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;1/(60+3) = 0.0159&lt;/td&gt;
&lt;td&gt;0.0159&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Doc A ranks highest because it appears near the top in both lists. Doc C is close behind for the same reason. Doc B and Doc D each appear in only one list, so they score lower.&lt;/p&gt;
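
&lt;p&gt;The table can be reproduced in a few lines. This is a minimal sketch of the RRF formula itself, with a hypothetical &lt;code&gt;rrf&lt;/code&gt; helper; it does not show how Manticore implements fusion internally.&lt;/p&gt;

```python
def rrf(ranked_lists, rank_constant=60):
    """Fuse ranked lists: each list adds 1 / (rank_constant + rank)."""
    scores = {}
    for docs in ranked_lists:
        for rank, doc in enumerate(docs, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (rank_constant + rank)
    # Highest combined score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


text_hits = ["Doc A", "Doc B", "Doc C"]
knn_hits = ["Doc C", "Doc A", "Doc D"]

for doc, score in rrf([text_hits, knn_hits]):
    print(f"{doc}: {score:.4f}")
# Doc A: 0.0325
# Doc C: 0.0323
# Doc B: 0.0161
# Doc D: 0.0159
```

&lt;p&gt;A document missing from one list simply receives no contribution from it, which is why Doc B and Doc D end up with roughly half the score of Doc A and Doc C.&lt;/p&gt;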

&lt;h3&gt;
  
  
  Why RRF?
&lt;/h3&gt;

&lt;p&gt;There are two common ways to combine results:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Rank-based fusion (RRF)&lt;/strong&gt; — simple, robust, no need to normalize scores&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Score-based fusion&lt;/strong&gt; — normalize scores first, then combine&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Manticore uses RRF because it works well out of the box and avoids score calibration problems.&lt;/p&gt;

&lt;p&gt;Under the hood, a hybrid query is split into independent sub-queries — one for full-text, one (or more) for KNN — that run in parallel. Once all sub-queries finish, RRF fuses their ranked result lists into a single output.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why not just use one or the other?
&lt;/h2&gt;

&lt;p&gt;Consider a support knowledge base with articles for different error codes — connection failures, authentication problems, sync issues. A user sees error E-5020 on screen and reports: &lt;code&gt;"I can't connect to the server."&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Vector search understands the symptom but not the error code. A KNN search for "can not connect to the server" returns:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Title&lt;/th&gt;
&lt;th&gt;KNN distance&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Error E-5030: DNS Resolution Failed&lt;/td&gt;
&lt;td&gt;0.572&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Error E-2091: App Loading Timeout&lt;/td&gt;
&lt;td&gt;0.583&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Error E-5020: SSL Certificate Mismatch&lt;/td&gt;
&lt;td&gt;0.605&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Error E-5010: Service Unavailable&lt;/td&gt;
&lt;td&gt;0.622&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Error E-4001: Login Failed&lt;/td&gt;
&lt;td&gt;0.665&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The correct article (E-5020) is buried at #3. KNN ranks DNS and timeout errors higher because their descriptions are semantically closer to "can't connect." The actual problem — an SSL certificate mismatch — uses completely different vocabulary, so it scores lower.&lt;/p&gt;

&lt;p&gt;You might think: just add the error code to the KNN query. But "E-5020" and "E-5010" are arbitrary identifiers with no semantic meaning, so embeddings treat them as nearly identical tokens. A KNN search for "E-5020 can not connect to the server" does move E-5020 to #1, but only because the added text shifts the semantic context; the error code itself carries no weight.&lt;/p&gt;

&lt;p&gt;Hybrid search solves this by sending each signal where it works best — the error code to full-text, the symptom to KNN:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hybrid_score&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;support_articles&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'can not connect to the server'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'E-5020'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;
&lt;span class="k"&gt;OPTION&lt;/span&gt; &lt;span class="n"&gt;fusion_method&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'rrf'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Title&lt;/th&gt;
&lt;th&gt;Hybrid score&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Error E-5020: SSL Certificate Mismatch&lt;/td&gt;
&lt;td&gt;0.032&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Error E-5030: DNS Resolution Failed&lt;/td&gt;
&lt;td&gt;0.016&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Error E-2091: App Loading Timeout&lt;/td&gt;
&lt;td&gt;0.016&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Error E-5010: Service Unavailable&lt;/td&gt;
&lt;td&gt;0.016&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Error E-4001: Login Failed&lt;/td&gt;
&lt;td&gt;0.015&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;E-5020 jumps from #3 to #1 with roughly twice the score of every other result, because it is the only article that scores in both the full-text list and the KNN list. Full-text search treats "E-5020" as an exact string: "E-5010" is not a near match, it is simply a different term. KNN ensures related connection errors still appear below for context.&lt;/p&gt;

&lt;p&gt;This is the core value of hybrid search:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Identifiers → full-text&lt;/li&gt;
&lt;li&gt;Meaning → vector search&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each method covers the other's blind spot.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting started
&lt;/h2&gt;

&lt;p&gt;The simplest way to run a hybrid search is with &lt;code&gt;hybrid_match()&lt;/code&gt;. If your table has auto-embeddings configured, one line does everything — text search, embedding generation, KNN search, and RRF fusion:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hybrid_score&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;hybrid_match&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'running shoes'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The JSON equivalent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;POST&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/search&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"table"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"products"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hybrid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"running shoes"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With that single call, Manticore:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;generates embeddings&lt;/li&gt;
&lt;li&gt;runs both searches in parallel&lt;/li&gt;
&lt;li&gt;fuses results&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Full control: explicit MATCH + KNN
&lt;/h3&gt;

&lt;p&gt;When you need to supply your own vectors or tune individual sub-queries, use the explicit form with &lt;code&gt;MATCH()&lt;/code&gt; and &lt;code&gt;KNN()&lt;/code&gt; in the WHERE clause:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hybrid_score&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;match&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'running shoes'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;45&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;78&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...))&lt;/span&gt;
&lt;span class="k"&gt;OPTION&lt;/span&gt; &lt;span class="n"&gt;fusion_method&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'rrf'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The JSON equivalent:&lt;/p&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;POST&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/search&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"table"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"products"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"knn"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"field"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"embedding"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"query_vector"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.45&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.78&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"match"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"running shoes"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"options"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"fusion_method"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"rrf"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each result includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;hybrid_score()&lt;/code&gt; — fused score (used for default sorting)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;weight()&lt;/code&gt; — BM25 score&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;knn_dist()&lt;/code&gt; — vector distance&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Attribute filters (&lt;code&gt;AND category = 'footwear'&lt;/code&gt;) apply to both sub-queries.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tuning
&lt;/h2&gt;

&lt;p&gt;Three options let you adjust fusion behavior:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;rank_constant&lt;/code&gt; — controls how much top positions dominate the fused score. Lower values (e.g. 10) make rank #1 count significantly more than rank #5. Higher values flatten the curve. See &lt;a href="https://manual.manticoresearch.com/Searching/Options#rank_constant" rel="noopener noreferrer"&gt;rank_constant&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;fusion_weights&lt;/code&gt; — lets you give different importance to each sub-query. If text relevance matters more than vector similarity, weight it higher. See &lt;a href="https://manual.manticoresearch.com/Searching/Options#fusion_weights" rel="noopener noreferrer"&gt;fusion_weights&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;window_size&lt;/code&gt; — how many results each sub-query retrieves before fusion. By default, Manticore computes this automatically from your KNN parameters and query LIMIT. See &lt;a href="https://manual.manticoresearch.com/Searching/Options#window_size" rel="noopener noreferrer"&gt;window_size&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
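
&lt;p&gt;To get a feel for how &lt;code&gt;rank_constant&lt;/code&gt; changes the balance between top and lower positions, the sketch below compares the contribution of rank 1 with that of rank 5 for two values. It is plain Python arithmetic on the formula used in this article and does not call Manticore; the helper name &lt;code&gt;rrf_contribution&lt;/code&gt; is invented for illustration.&lt;/p&gt;

```python
def rrf_contribution(rank, rank_constant=60):
    """Contribution of one ranked list to a document's fused score."""
    return 1.0 / (rank_constant + rank)


# How many times more rank 1 is worth than rank 5, per rank_constant value.
ratios = {
    k: rrf_contribution(1, k) / rrf_contribution(5, k)
    for k in (10, 60)
}

for k, ratio in ratios.items():
    print(f"rank_constant={k}: rank 1 is worth {ratio:.2f}x rank 5")
```

&lt;p&gt;With &lt;code&gt;rank_constant=10&lt;/code&gt; the ratio is about 1.36, while with the default of 60 it drops to about 1.07, so lower values reward top positions more strongly.&lt;/p&gt;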

&lt;h2&gt;
  
  
  Multi-vector fusion
&lt;/h2&gt;

&lt;p&gt;Hybrid search isn't limited to one text search plus one KNN search. You can fuse multiple vector searches together, which is useful when your data has several distinct semantic dimensions. For example, an e-commerce product has a title and a photo. A user searching for "minimalist white sneakers" cares about both: the title should match the style, and the product image should look like what they have in mind. By encoding the title and the image into separate vector spaces, you can search both at once and let RRF surface products that match across all three signals (keywords, text meaning, and visual similarity):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hybrid_score&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;match&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'minimalist white sneakers'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title_vec&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;45&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;title_sim&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image_vec&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;88&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;21&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;image_sim&lt;/span&gt;
&lt;span class="k"&gt;OPTION&lt;/span&gt; &lt;span class="n"&gt;fusion_method&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'rrf'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;fusion_weights&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title_sim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;image_sim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All sub-queries run in parallel and are fused together via RRF.&lt;/p&gt;
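
&lt;p&gt;One way to picture weighted fusion over three ranked lists is the Python sketch below. It rests on an assumption that this article does not confirm: that each weight simply scales its list's &lt;code&gt;1 / (rank_constant + rank)&lt;/code&gt; contribution. Manticore's actual handling of &lt;code&gt;fusion_weights&lt;/code&gt; may differ, so check the manual. The function name &lt;code&gt;weighted_rrf&lt;/code&gt; and the product IDs are hypothetical.&lt;/p&gt;

```python
from collections import defaultdict


def weighted_rrf(ranked_lists, weights, rank_constant=60):
    """Weighted RRF: each list adds weight / (rank_constant + rank).

    Assumption: weights scale each list's contribution linearly.
    """
    scores = defaultdict(float)
    for name, ranking in ranked_lists.items():
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weights[name] / (rank_constant + rank)
    # Return document IDs ordered by fused score, highest first.
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical product IDs, best match first in each list.
ranked_lists = {
    "text": [101, 102, 103],
    "title_sim": [102, 101, 104],
    "image_sim": [104, 102, 105],
}
weights = {"text": 0.5, "title_sim": 0.3, "image_sim": 0.2}

fused_order = weighted_rrf(ranked_lists, weights)
print(fused_order)
```

&lt;p&gt;In this toy data, product 102 comes out first because it is the only one present in all three lists, even though it tops only one of them.&lt;/p&gt;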

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Hybrid search is not about replacing full-text or vector search — it’s about using both where they work best.&lt;/p&gt;

&lt;p&gt;Keyword search gives you precision for exact terms and identifiers. Vector search gives you flexibility for natural language and meaning. On their own, each has gaps. Together, they produce consistently better results across a wide range of queries.&lt;/p&gt;

&lt;p&gt;With hybrid search in Manticore, you don’t need to choose between the two or build complex query logic to handle different cases. You can run both signals in parallel and get a single, unified result set.&lt;/p&gt;

&lt;p&gt;If your search needs to handle both exact matches and intent — which most real-world applications do — hybrid search is a straightforward way to improve relevance without adding complexity.&lt;/p&gt;

</description>
      <category>algorithms</category>
      <category>database</category>
      <category>machinelearning</category>
      <category>nlp</category>
    </item>
  </channel>
</rss>
