<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Sergey Nikolaev</title>
    <description>The latest articles on DEV Community by Sergey Nikolaev (@sanikolaev).</description>
    <link>https://dev.to/sanikolaev</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F363352%2F6f7a2da7-fa00-47f5-aaca-a007b1d43350.jpeg</url>
      <title>DEV Community: Sergey Nikolaev</title>
      <link>https://dev.to/sanikolaev</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sanikolaev"/>
    <language>en</language>
    <item>
      <title>Prepared statements in Manticore Search</title>
      <dc:creator>Sergey Nikolaev</dc:creator>
      <pubDate>Fri, 03 Apr 2026 04:38:10 +0000</pubDate>
      <link>https://dev.to/sanikolaev/prepared-statements-in-manticore-search-2n4e</link>
      <guid>https://dev.to/sanikolaev/prepared-statements-in-manticore-search-2n4e</guid>
      <description>&lt;p&gt;Imagine you're building a powerful search application. Users type in keywords, and your backend needs to query the Manticore Search database to find matching results. A common (and tempting!) approach is to embed user input directly into your SQL queries. For example, you might filter by a numeric field such as a category or record ID. If the user passes a normal value like &lt;code&gt;5&lt;/code&gt;, the query is &lt;code&gt;SELECT * FROM products WHERE id=5&lt;/code&gt;. But what if they pass &lt;code&gt;1 OR 1=1&lt;/code&gt;? The query becomes &lt;code&gt;SELECT * FROM products WHERE id=1 OR 1=1&lt;/code&gt; — the condition is always true, so the query returns every row instead of one. This is SQL injection.&lt;/p&gt;

&lt;p&gt;Fortunately, there's a safer and more efficient way: &lt;strong&gt;prepared statements&lt;/strong&gt;. Essentially, prepared statements separate your SQL code from the data you pass in. Instead of building the entire query string each time, you define the query structure once with placeholders and then supply the search terms separately. You can learn more about the concept on &lt;a href="https://en.wikipedia.org/wiki/Prepared_statement" rel="noopener noreferrer"&gt;Wikipedia&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Manticore Search supports prepared statements over the standard MySQL protocol, giving you a powerful tool for building secure search applications. By using prepared statements, you'll not only dramatically reduce the risk of SQL injection, but you'll also improve the readability of your code.&lt;/p&gt;

&lt;p&gt;Prepared statements aren't just a feature; they're sometimes a requirement. For example, the Rust &lt;code&gt;sqlx&lt;/code&gt; library talks to a MySQL endpoint exclusively through prepared statements. Some OLE DB connectors that let MS SQL work with a MySQL server also use prepared statements internally.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Use Prepared Statements?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Security First (SQL Injection)&lt;/strong&gt;: SQL injection is a web security vulnerability that allows attackers to interfere with the queries an application makes to its database. It happens when user input is improperly incorporated into a SQL query, allowing malicious code to be executed. For example, consider a simple lookup query built by concatenating user input directly into the SQL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Vulnerable code example (DO NOT USE!)&lt;/span&gt;
&lt;span class="nv"&gt;$productId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$_GET&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'search'&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="nv"&gt;$query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"SELECT * FROM products WHERE id= "&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;$productId&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If &lt;code&gt;$productId&lt;/code&gt; contained something like &lt;code&gt;0 OR 1=1&lt;/code&gt;, the query would become &lt;code&gt;SELECT * FROM products WHERE id= 0 OR 1=1&lt;/code&gt;, effectively bypassing the WHERE clause and returning all products.&lt;/p&gt;

&lt;p&gt;Prepared statements prevent this by treating user input strictly as &lt;em&gt;data&lt;/em&gt;, not as part of the SQL command itself. The values are sent separately from the SQL text, and escaping and quoting are handled for you instead of by hand-built string concatenation, so any potentially harmful characters are neutralized. Here's the same query using a prepared statement:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Secure code example using a prepared statement&lt;/span&gt;
&lt;span class="nv"&gt;$productId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$_GET&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'search'&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="nv"&gt;$stmt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$mysqli&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;prepare&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"SELECT * FROM products WHERE id= ?"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nv"&gt;$stmt&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;bind_param&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"i"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$productId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nv"&gt;$stmt&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this case, even if &lt;code&gt;$productId&lt;/code&gt; contains malicious input such as &lt;code&gt;0 OR 1=1&lt;/code&gt;, it is bound as a single integer value and never parsed as SQL.&lt;/p&gt;

&lt;h2&gt;
  
  
  How They Work
&lt;/h2&gt;

&lt;p&gt;Prepared statements operate using a simple three-step process:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Prepare:&lt;/strong&gt; First, you send the SQL statement with placeholders (like &lt;code&gt;?&lt;/code&gt; or &lt;code&gt;?VEC?&lt;/code&gt;) to Manticore Search. Manticore parses this statement and creates a query plan. It then returns a unique identifier for this prepared statement.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bind:&lt;/strong&gt; Next, you send the actual data – the values for the placeholders – to Manticore &lt;em&gt;separately&lt;/em&gt;. This is where the security comes in; the data is treated purely as data, not as SQL code.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Execute:&lt;/strong&gt; Finally, you instruct Manticore to execute the prepared statement using the stored query plan and the bound parameters.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Think of it like creating a template. You build the structure once, then fill in the blanks with different information each time you need to use it.&lt;/p&gt;
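
&lt;p&gt;The template idea can be sketched in PHP as follows. This is a minimal illustration, not a definitive implementation: it assumes an open &lt;code&gt;$mysqli&lt;/code&gt; connection to Manticore's MySQL port, a hypothetical &lt;code&gt;products&lt;/code&gt; table with &lt;code&gt;name&lt;/code&gt; and &lt;code&gt;description&lt;/code&gt; fields, and made-up sample rows. It also assumes the server lets you execute the same prepared statement more than once, as the MySQL protocol normally allows.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&amp;lt;?php
// Assumption: $mysqli is an open mysqli connection to Manticore's MySQL port.

// Prepare: the statement text with placeholders is sent once.
$stmt = $mysqli-&amp;gt;prepare("INSERT INTO products (name, description) VALUES (?, ?)");

// Bind: mysqli binds variables by reference, so this is done once as well.
$stmt-&amp;gt;bind_param("ss", $name, $description);

// Made-up sample rows.
$rows = [
    ["Widget", "A small widget."],
    ["Gadget", "A handy gadget."],
];

foreach ($rows as $row) {
    $name = $row[0];
    $description = $row[1];
    // Execute: each call sends the current values of the bound variables.
    $stmt-&amp;gt;execute();
}

$stmt-&amp;gt;close();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Only the bound values change between iterations; the SQL text is never rebuilt.&lt;/p&gt;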

&lt;h2&gt;
  
  
  Parameter Placeholders: &lt;code&gt;?&lt;/code&gt; &amp;amp; &lt;code&gt;?VEC?&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;Manticore Search uses specific placeholders to identify parameters within your prepared statements:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;?&lt;/code&gt; represents a single parameter – this could be an integer, a floating-point number, or a string. When using this placeholder, Manticore automatically handles escaping and quoting for string values, protecting against SQL injection and ensuring proper data formatting.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;?VEC?&lt;/code&gt; is designed for lists of numeric values. It expects a string containing numbers separated by commas and optional spaces (e.g., &lt;code&gt;1, 2.3, 4, 1e-10, INF&lt;/code&gt;). Crucially, &lt;em&gt;no escaping or quoting is applied&lt;/em&gt; to the values within &lt;code&gt;?VEC?&lt;/code&gt;. Valid input consists solely of numeric literals (including forms such as &lt;code&gt;1e-10&lt;/code&gt; and &lt;code&gt;INF&lt;/code&gt; from the example above), commas, and spaces; anything else will likely result in an error. This makes it well suited to inserting numeric vectors into your data, both float vectors and integer MVAs (multi-value attributes).&lt;/li&gt;
&lt;/ul&gt;
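
&lt;p&gt;Because &lt;code&gt;?VEC?&lt;/code&gt; values are not escaped, it can help to build the bound string from typed numbers rather than from raw input. The helper below is a hypothetical sketch (the name &lt;code&gt;vecParam&lt;/code&gt; is not part of Manticore or mysqli): it accepts a PHP array, rejects anything that is not an integer or a float, and joins the elements with commas.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&amp;lt;?php
// Hypothetical helper, not part of Manticore or mysqli.
function vecParam(array $values): string
{
    foreach ($values as $value) {
        if (!(is_int($value) || is_float($value))) {
            throw new InvalidArgumentException("Vector elements must be numbers");
        }
    }
    // implode() converts each number to its string form and joins with commas.
    return implode(",", $values);
}

$featureSet = vecParam([1, 4, 20, 456, 112, 3]);
echo $featureSet, "\n"; // prints 1,4,20,456,112,3

try {
    // A string element is refused before it can reach the query.
    vecParam([1, "2); DROP TABLE products"]);
} catch (InvalidArgumentException $e) {
    echo "Rejected: ", $e-&amp;gt;getMessage(), "\n";
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The returned string can then be passed to &lt;code&gt;bind_param()&lt;/code&gt; with the &lt;code&gt;s&lt;/code&gt; type.&lt;/p&gt;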

&lt;h2&gt;
  
  
  Example: prepared statements in PHP
&lt;/h2&gt;

&lt;p&gt;Let's see how prepared statements work in practice using PHP. We'll demonstrate both a simple insert with string values and a more complex insert involving a floating-point vector using the &lt;code&gt;?VEC?&lt;/code&gt; placeholder.&lt;/p&gt;

&lt;p&gt;First, a basic insertion:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;?php&lt;/span&gt;
&lt;span class="c1"&gt;// Assuming you have a valid MySQLi connection established ($mysqli)&lt;/span&gt;

&lt;span class="nv"&gt;$stmt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$mysqli&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;prepare&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"INSERT INTO products (name, description) VALUES (?, ?)"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nv"&gt;$productName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Awesome Widget"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nv"&gt;$productDescription&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"A truly amazing widget for all your needs."&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nv"&gt;$stmt&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;bind_param&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"ss"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$productName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$productDescription&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// "ss" indicates two strings&lt;/span&gt;
&lt;span class="nv"&gt;$stmt&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="k"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Product added successfully!"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="cp"&gt;?&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This code prepares the &lt;code&gt;INSERT&lt;/code&gt; statement, binds the string values for the product name and description, and then executes the query. The resulting SQL executed by Manticore would be:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Awesome Widget'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'A truly amazing widget for all your needs.'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, let's tackle an example using a float vector. &lt;strong&gt;What is &lt;code&gt;?VEC?&lt;/code&gt;?&lt;/strong&gt; It is a placeholder (only used in prepared statements) for a &lt;em&gt;vector&lt;/em&gt; — a list of numbers, e.g. for embeddings or similar data. In Manticore SQL, a vector literal is always written with parentheses: &lt;code&gt;(0.1, 0.2, 0.3)&lt;/code&gt;. So when you use a prepared statement and have a vector parameter, you write those parentheses in the SQL string and use &lt;code&gt;?VEC?&lt;/code&gt; where the numbers go. You bind only the comma-separated numbers (e.g. &lt;code&gt;"0.1,0.2,0.3"&lt;/code&gt;); you do not bind the &lt;code&gt;(&lt;/code&gt; and &lt;code&gt;)&lt;/code&gt; — they stay in the query. Without prepared statements you would build the full literal &lt;code&gt;(0.1, 0.2, 0.3)&lt;/code&gt; yourself in the query string.&lt;/p&gt;

&lt;p&gt;In PHP &lt;code&gt;mysqli&lt;/code&gt;, the usual way to bind &lt;code&gt;?VEC?&lt;/code&gt; values is as strings, so &lt;code&gt;iss&lt;/code&gt; is the normal choice in this example. If you want to stream a larger vector payload, you can also bind the parameter as &lt;code&gt;b&lt;/code&gt; and send the contents with &lt;code&gt;send_long_data()&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;?php&lt;/span&gt;
&lt;span class="c1"&gt;// Assuming you have a valid MySQLi connection established ($mysqli)&lt;/span&gt;

&lt;span class="nv"&gt;$stmt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$mysqli&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;prepare&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"INSERT INTO items (item_id, coords, features) VALUES (?, (?VEC?),(?VEC?))"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nv"&gt;$itemId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;123&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nv"&gt;$coordVector&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"20.245,54.354,30.000"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// a vector of floats&lt;/span&gt;
&lt;span class="nv"&gt;$featureSet&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"1,4,20,456,112,3"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// a set of integer values (MVA)&lt;/span&gt;
&lt;span class="nv"&gt;$stmt&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;bind_param&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"iss"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$itemId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$coordVector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$featureSet&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// "i" for integer (itemId), "s" for string&lt;/span&gt;
&lt;span class="nv"&gt;$stmt&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="k"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Item with feature vector added successfully!"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nv"&gt;$itemId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;124&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nv"&gt;$coordVector&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"18.500,42.000,31.125"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Another float vector&lt;/span&gt;
&lt;span class="nv"&gt;$featureSet&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"0,6,34,665,22,3445,221,564,2232,5644,43"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Example with more feature values&lt;/span&gt;

&lt;span class="c1"&gt;// For larger payloads you can bind the second ?VEC? as a blob and stream it.&lt;/span&gt;
&lt;span class="nv"&gt;$featurePlaceholder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;""&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nv"&gt;$stmt&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;bind_param&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"isb"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$itemId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$coordVector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$featurePlaceholder&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// "b" is for blob data&lt;/span&gt;
&lt;span class="c1"&gt;// bind_param() must be called before send_long_data().&lt;/span&gt;
&lt;span class="nv"&gt;$stmt&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;send_long_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$featureSet&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// zero-based index: 2 means the third bound parameter&lt;/span&gt;
&lt;span class="nv"&gt;$stmt&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="k"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Item with feature vector added successfully!"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="cp"&gt;?&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice that the parentheses are &lt;em&gt;part of the SQL string&lt;/em&gt; in the &lt;code&gt;prepare()&lt;/code&gt; call. We only bind the &lt;em&gt;values&lt;/em&gt; within the parentheses using the &lt;code&gt;?VEC?&lt;/code&gt; placeholder. The resulting SQL executed by Manticore will be:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;items&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;coords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;123&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;245&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;54&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;354&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;456&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;112&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;items&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;coords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;124&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;18&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;31&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;125&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;34&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;665&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;22&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;3445&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;221&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;564&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2232&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;5644&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;43&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Using &lt;code&gt;?VEC?&lt;/code&gt; in a prepared statement gives you the same benefits as with the &lt;code&gt;?&lt;/code&gt; placeholder: the vector values are sent as data, not as part of the SQL text, so they cannot be interpreted as SQL and cannot cause injection. You also avoid having to manually build or escape the vector literal in your application — Manticore receives the bound numbers and formats the vector correctly, which keeps the query safe and the data consistent.&lt;/p&gt;

&lt;h2&gt;
  
  
  Important Considerations &amp;amp; Limitations
&lt;/h2&gt;

&lt;p&gt;While powerful, Manticore's prepared statements have a few limitations to keep in mind.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multi-Queries:&lt;/strong&gt; Only a single SQL statement is allowed per prepared statement. Attempts to use multi-queries (e.g., &lt;code&gt;SELECT ...; SHOW META&lt;/code&gt;) will fail. If you need to execute multiple statements, prepare a separate statement for each one and execute them sequentially within the same session.&lt;/p&gt;
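
&lt;p&gt;As a hedged sketch of that pattern, the snippet below prepares the &lt;code&gt;SELECT&lt;/code&gt; and &lt;code&gt;SHOW META&lt;/code&gt; separately on one connection. It assumes an open &lt;code&gt;$mysqli&lt;/code&gt; connection, the &lt;code&gt;products&lt;/code&gt; table used earlier, and a PHP build that provides &lt;code&gt;get_result()&lt;/code&gt; (which requires the mysqlnd driver).&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&amp;lt;?php
// Assumption: $mysqli is an open mysqli connection to Manticore's MySQL port.
$productId = 5;

// First statement: the query itself.
$select = $mysqli-&amp;gt;prepare("SELECT * FROM products WHERE id = ?");
$select-&amp;gt;bind_param("i", $productId);
$select-&amp;gt;execute();
$rows = $select-&amp;gt;get_result()-&amp;gt;fetch_all(MYSQLI_ASSOC);
$select-&amp;gt;close();

// Second statement: prepared and executed on its own, in the same session.
$meta = $mysqli-&amp;gt;prepare("SHOW META");
$meta-&amp;gt;execute();
$metaRows = $meta-&amp;gt;get_result()-&amp;gt;fetch_all(MYSQLI_ASSOC);
$meta-&amp;gt;close();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;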

&lt;p&gt;&lt;strong&gt;Numeric Types:&lt;/strong&gt; Some database drivers (like &lt;code&gt;mysql2&lt;/code&gt; for Node.js) might send numeric parameters as &lt;code&gt;DOUBLE&lt;/code&gt; by default. This can lead to unexpected results if you rely on strict integer handling (such as rejecting negative IDs). In such cases, consider sending integers as strings or using driver-specific integer types (e.g., &lt;code&gt;BigInt&lt;/code&gt;) to ensure correct data handling.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rust &lt;code&gt;sqlx&lt;/code&gt; Users:&lt;/strong&gt; If you're using the &lt;code&gt;sqlx&lt;/code&gt; crate in Rust, be aware that when reading result set rows, you &lt;strong&gt;must&lt;/strong&gt; use column &lt;em&gt;indices&lt;/em&gt; rather than column names. While column names are present in the result set, &lt;code&gt;sqlx&lt;/code&gt; doesn't utilize them for mapping. For example, use &lt;code&gt;row.try_get(0)?&lt;/code&gt; instead of &lt;code&gt;row.try_get("id")?&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Prepared statements combine security and readability with potential performance gains when working with Manticore Search. By separating your SQL logic from your data, you dramatically reduce the risk of SQL injection attacks, keep your code easier to maintain, and potentially speed up query execution. We strongly encourage you to adopt prepared statements in your Manticore Search applications.&lt;/p&gt;

&lt;p&gt;For more in-depth information, be sure to consult these resources:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Manticore Search Documentation on Prepared Statements: &lt;a href="https://manual.manticoresearch.com/Connecting_to_the_server/MySQL_protocol#Prepared-statements" rel="noopener noreferrer"&gt;https://manual.manticoresearch.com/Connecting_to_the_server/MySQL_protocol#Prepared-statements&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Wikipedia - Prepared Statements: &lt;a href="https://en.wikipedia.org/wiki/Prepared_statement" rel="noopener noreferrer"&gt;https://en.wikipedia.org/wiki/Prepared_statement&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This guide provides a solid foundation for using prepared statements effectively in your Manticore Search projects, leading to more secure, efficient, and maintainable applications.&lt;/p&gt;

</description>
      <category>backend</category>
      <category>database</category>
      <category>security</category>
      <category>sql</category>
    </item>
    <item>
      <title>KNN prefiltering in Manticore Search</title>
      <dc:creator>Sergey Nikolaev</dc:creator>
      <pubDate>Thu, 02 Apr 2026 05:50:56 +0000</pubDate>
      <link>https://dev.to/sanikolaev/knn-prefiltering-in-manticore-search-c2f</link>
      <guid>https://dev.to/sanikolaev/knn-prefiltering-in-manticore-search-c2f</guid>
      <description>&lt;p&gt;Vector search rarely happens in isolation. You almost always have filters — a price range, a category, a date window, a geographic boundary. The question is: when do those filters get applied?&lt;/p&gt;

&lt;p&gt;The answer makes a surprising difference in result quality.&lt;/p&gt;

&lt;p&gt;KNN prefiltering is available in Manticore Search starting from version &lt;code&gt;19.0.1&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem with postfiltering
&lt;/h2&gt;

&lt;p&gt;Consider a product catalog with 10 million items. A user asks for the 10 nearest neighbors to a query vector, restricted to &lt;code&gt;category = 'electronics'&lt;/code&gt;. With postfiltering, the KNN search runs first over the entire dataset, then the filter is applied to the results. If electronics make up 5% of the catalog, the graph explores nodes that are mostly irrelevant. Worse, many of the k nearest neighbors may not be electronics at all, so the final result set can be much smaller than requested. Ask for 10 results, get 2.&lt;/p&gt;

&lt;p&gt;This is the fundamental limitation of postfiltering: the HNSW graph doesn't know about your filters. It finds the closest vectors overall, not the closest vectors that match your criteria. The more selective the filter, the worse the problem gets.&lt;/p&gt;

&lt;h2&gt;
  
  
  What prefiltering does differently
&lt;/h2&gt;

&lt;p&gt;Prefiltering passes the filter into the HNSW graph traversal itself. As the algorithm explores candidate nodes, each one is checked against the filter before being added to the result heap. Only matching documents contribute to the final k results. This means you reliably get the k results you asked for, assuming k matching documents exist in the dataset.&lt;/p&gt;
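
&lt;p&gt;The difference can be illustrated with a toy PHP script. It is only a sketch under loud assumptions: the data is made up, the vectors are one-dimensional, and an exhaustive sort stands in for the HNSW graph, so it shows which documents each strategy can return, not how Manticore implements the search. It needs PHP 7.4 or later for arrow functions.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&amp;lt;?php
// Made-up data: 100 documents with 1-D "vectors"; every 20th one is electronics (5%).
$docs = [];
for ($i = 1; $i &amp;lt;= 100; $i++) {
    $docs[] = [
        'id' =&amp;gt; $i,
        'vec' =&amp;gt; (float) $i,
        'category' =&amp;gt; $i % 20 === 0 ? 'electronics' : 'other',
    ];
}

$query = 50.0;
$k = 3;
$isMatch = fn(array $doc): bool =&amp;gt; $doc['category'] === 'electronics';

// Exhaustive stand-in for the KNN search: order all documents by distance.
usort($docs, fn(array $a, array $b): int =&amp;gt; abs($a['vec'] - $query) &amp;lt;=&amp;gt; abs($b['vec'] - $query));

// Postfiltering: take the k nearest overall, then apply the filter.
$post = array_filter(array_slice($docs, 0, $k), $isMatch);

// Prefiltering: only matching documents may enter the result list.
$pre = array_slice(array_values(array_filter($docs, $isMatch)), 0, $k);

echo "postfiltering: ", count($post), " of ", $k, "\n"; // prints "postfiltering: 0 of 3"
echo "prefiltering: ", count($pre), " of ", $k, "\n";   // prints "prefiltering: 3 of 3"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In this toy data the three nearest documents overall are all outside the category, so postfiltering returns nothing, while the filtered search still finds three matches.&lt;/p&gt;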

&lt;p&gt;In Manticore Search, prefiltering is enabled by default when your query combines KNN search with attribute filters. No special syntax is needed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;knn_dist&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;45&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;78&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;33&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'electronics'&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both &lt;code&gt;category = 'electronics'&lt;/code&gt; and &lt;code&gt;price &amp;lt; 500&lt;/code&gt; are evaluated during HNSW traversal, not after. The equivalent JSON query:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;POST&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/search&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"table"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"products"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"knn"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"field"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"embedding"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.45&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.78&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.33&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"bool"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"must"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"equals"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"electronics"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"range"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"price"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"lt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"limit"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Naive prefiltering and where it falls short
&lt;/h2&gt;

&lt;p&gt;The obvious first approach is straightforward: traverse the HNSW graph normally, compute distances for every neighbor, but only add filter-matching nodes to the result heap. Filtered-out nodes still participate in navigation — if a non-matching node has a competitive distance, it enters the candidate queue and its neighbors get explored. The filter only gates what goes into the results.&lt;/p&gt;

&lt;p&gt;This actually works reasonably well. The graph stays connected because filtered-out nodes are still traversed. But it has a performance problem that gets worse as the filter becomes more selective: every unvisited neighbor gets a distance computation regardless of whether it passes the filter. Distance computation is the most expensive operation in the search. With a filter matching 5% of documents, roughly 95% of those computations go to nodes that can never enter the result set; they only help steer the traversal. The algorithm pays full cost for navigation but gets results from only a small fraction of the work.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Manticore solves it: ACORN-1
&lt;/h2&gt;

&lt;p&gt;Manticore uses an ACORN-1-based algorithm (from the &lt;a href="https://arxiv.org/abs/2403.04871" rel="noopener noreferrer"&gt;ACORN paper&lt;/a&gt;, SIGMOD 2024) that improves on naive prefiltering in two ways:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No distance computation for filtered-out nodes.&lt;/strong&gt; When visiting a node's neighbors, ACORN-1 checks the filter first and only computes distance for nodes that pass. Filtered-out neighbors are never scored. When 95% of nodes fail the filter, this saves roughly 95% of the distance work compared to the naive approach.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Adaptive expansion through filtered-out nodes.&lt;/strong&gt; When a neighbor fails the filter, the algorithm looks through that node's own neighbors to find filter-passing nodes further away. If those neighbors also fail the filter and not enough matching candidates have been found yet, it keeps going — 3 hops, 4 hops, as far as needed. The more selective the filter, the more aggressively the algorithm expands. This targeted walk through non-matching neighborhoods reaches matching candidates without scoring the non-matching ones along the way.&lt;/p&gt;

&lt;p&gt;Think of it as searching for Italian restaurants in a city. The naive approach checks the menu at every restaurant and only keeps the Italian ones. ACORN-1 glances at the sign first — "French, skip; Thai, skip" — without going inside. And when it sees a stretch of non-Italian restaurants, it walks past them, peeking around each corner until it finds an Italian place on the other side.&lt;/p&gt;

&lt;p&gt;Manticore activates ACORN-1 when fewer than 60% of total documents pass the filter. Above that threshold, naive prefiltering works well enough on its own.&lt;/p&gt;
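
&lt;p&gt;The two ideas above can be sketched in a few lines of Python. This is a simplified, single-layer illustration under stated assumptions, not Manticore's implementation: the adjacency dict &lt;code&gt;graph&lt;/code&gt;, the &lt;code&gt;passes&lt;/code&gt; set of filter-matching ids, and the &lt;code&gt;want&lt;/code&gt; and &lt;code&gt;max_hops&lt;/code&gt; parameters are hypothetical names chosen for the example. It compares one neighbor-expansion step of the naive approach with an ACORN-1-style step and counts distance computations.&lt;/p&gt;

```python
import math
from collections import deque

def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def naive_step(graph, vectors, passes, query, node):
    """Naive prefiltering: score every neighbor, keep only the matching ones."""
    scored = {n: l2(vectors[n], query) for n in graph[node]}
    results = {n: d for n, d in scored.items() if n in passes}
    return results, len(scored)  # (matches, distance computations)

def acorn_step(graph, vectors, passes, query, node, want=2, max_hops=4):
    """ACORN-1-style step: test the filter first, score only matching nodes,
    and walk through non-matching neighbors until `want` matches are found."""
    seen = {node}
    frontier = deque((n, 1) for n in graph[node])
    results = {}
    while frontier:
        n, hops = frontier.popleft()
        if n in seen:
            continue
        seen.add(n)
        if n in passes:
            results[n] = l2(vectors[n], query)  # only matching nodes are scored
        elif want > len(results) and max_hops > hops:
            # Filtered-out node: never scored, but its neighbors are inspected.
            frontier.extend((m, hops + 1) for m in graph[n])
    return results, len(results)  # (matches, distance computations)

# Toy graph: node 0 is the current node; only nodes 1, 4 and 7 pass the filter.
graph = {0: [1, 2, 3], 1: [0], 2: [0, 4, 5], 3: [0, 6],
         4: [2], 5: [2, 7], 6: [3, 7], 7: [5, 6]}
vectors = {n: [float(n), 0.0] for n in graph}
passes = {1, 4, 7}
query = [0.0, 0.0]

naive_hits, naive_cost = naive_step(graph, vectors, passes, query, 0)
acorn_hits, acorn_cost = acorn_step(graph, vectors, passes, query, 0)
print(sorted(naive_hits), naive_cost)  # [1] 3
print(sorted(acorn_hits), acorn_cost)  # [1, 4] 2
```

&lt;p&gt;In this toy graph the naive step spends three distance computations to find one match, while the ACORN-1-style step spends two and finds two, because it reaches node 4 through the filtered-out node 2 without scoring it.&lt;/p&gt;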

&lt;h2&gt;
  
  
  Automatic brute-force fallback
&lt;/h2&gt;

&lt;p&gt;Prefiltering works well across a wide range of filter selectivities, but there's an extreme case: what if only 50 documents out of 10 million match the filter? Traversing the HNSW graph — even with ACORN-1 — visits far more nodes than just scanning those 50 documents directly.&lt;/p&gt;

&lt;p&gt;Manticore detects this automatically. When prefiltering is enabled, the query planner estimates the cost of HNSW traversal versus a brute-force distance scan over the filtered subset. It uses histogram-based selectivity estimates to predict how many documents pass the filter, then compares that against the expected number of nodes HNSW would visit. If brute-force is cheaper, Manticore skips HNSW entirely and scans the filtered documents directly.&lt;/p&gt;
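
&lt;p&gt;A minimal sketch of that decision, assuming an equi-width histogram and a cost model that simply counts distance computations. The function names, the bucket layout, and the &lt;code&gt;expected_hnsw_visits&lt;/code&gt; input are hypothetical; the article does not describe how Manticore derives the HNSW estimate, so it is passed in as a plain number here.&lt;/p&gt;

```python
def estimate_matching_docs(buckets, lo, hi):
    """Estimate how many documents have a value in the half-open range [lo, hi).

    `buckets` is a list of (bucket_lo, bucket_hi, count) tuples; values are
    assumed to be spread uniformly inside each bucket.
    """
    total = 0.0
    for b_lo, b_hi, count in buckets:
        overlap = max(0.0, min(hi, b_hi) - max(lo, b_lo))
        total += count * overlap / (b_hi - b_lo)
    return total

def choose_strategy(matching_docs, expected_hnsw_visits):
    """Pick brute-force when scanning the filtered subset needs fewer
    distance computations than the expected HNSW traversal."""
    return "brute_force" if expected_hnsw_visits > matching_docs else "hnsw"

# Hypothetical price histogram for a 10,000-document table.
price_histogram = [(0, 100, 4000), (100, 200, 3000),
                   (200, 300, 2000), (300, 400, 1000)]

selective = estimate_matching_docs(price_histogram, 350, 400)
broad = estimate_matching_docs(price_histogram, 0, 400)
print(selective, choose_strategy(selective, 2000))  # 500.0 brute_force
print(broad, choose_strategy(broad, 2000))          # 10000.0 hnsw
```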

&lt;p&gt;This means you rarely need to handle these edge cases yourself. Prefiltering adapts: naive filtered traversal when most documents match, ACORN-1 for moderate selectivity, brute-force for extreme selectivity, and standard HNSW when no filter is present.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to use postfiltering instead
&lt;/h2&gt;

&lt;p&gt;Prefiltering isn't always the best choice. There are cases where postfiltering is preferable:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;When you want the closest vectors regardless of filters.&lt;/strong&gt; Postfiltering gives you the k nearest neighbors from the full dataset, then removes non-matching ones. If your application tolerates getting fewer than k results and you care most about vector distance quality, postfiltering is simpler and more predictable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;When the filter matches most documents.&lt;/strong&gt; If 95% of documents pass the filter, prefiltering adds overhead for almost no benefit — nearly every candidate matches anyway.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;When you're debugging or benchmarking.&lt;/strong&gt; Postfiltering gives you a clean baseline: pure HNSW results with a filter on top. This makes it easier to isolate whether a quality issue comes from the graph or the filter.&lt;/li&gt;
&lt;/ul&gt;
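
&lt;p&gt;The difference in result counts is easy to see in a small sketch. It uses an exact nearest-neighbor scan as a stand-in for HNSW, and the document layout and function names are hypothetical, so it illustrates only the ordering of the two steps, not Manticore's internals.&lt;/p&gt;

```python
import heapq
import math

def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def postfilter_knn(docs, query, k, predicate):
    """Take the k nearest documents from the full set, then drop non-matching ones."""
    nearest = heapq.nsmallest(k, docs, key=lambda d: l2(d["vec"], query))
    return [d["id"] for d in nearest if predicate(d)]

def prefilter_knn(docs, query, k, predicate):
    """Restrict to matching documents first, then take the k nearest of those."""
    matching = [d for d in docs if predicate(d)]
    nearest = heapq.nsmallest(k, matching, key=lambda d: l2(d["vec"], query))
    return [d["id"] for d in nearest]

docs = [
    {"id": 1, "vec": [0.1], "category": "toys"},
    {"id": 2, "vec": [0.2], "category": "electronics"},
    {"id": 3, "vec": [0.3], "category": "toys"},
    {"id": 4, "vec": [0.9], "category": "electronics"},
    {"id": 5, "vec": [1.5], "category": "electronics"},
]

def is_electronics(doc):
    return doc["category"] == "electronics"

post = postfilter_knn(docs, [0.0], 3, is_electronics)
pre = prefilter_knn(docs, [0.0], 3, is_electronics)
print(post)  # [2]
print(pre)   # [2, 4, 5]
```

&lt;p&gt;Postfiltering returns a single result because two of the three globally nearest documents fail the filter; prefiltering returns three, at the price of including more distant vectors.&lt;/p&gt;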

&lt;p&gt;To explicitly request postfiltering in SQL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;knn_dist&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;45&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;78&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;33&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;prefilter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'electronics'&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In JSON, set &lt;code&gt;"prefilter": false&lt;/code&gt; inside the &lt;code&gt;knn&lt;/code&gt; object:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;POST&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/search&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"table"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"products"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"knn"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"field"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"embedding"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.45&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.78&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.33&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"prefilter"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"equals"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"electronics"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"limit"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Forcing brute-force
&lt;/h2&gt;

&lt;p&gt;If you know your dataset is small enough or your filters selective enough that a linear scan is the right strategy, you can force brute-force mode directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;knn_dist&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;45&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;78&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;33&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;fullscan&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'electronics'&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This skips HNSW entirely and computes exact distances over all documents that pass the filter. It guarantees perfect recall at the cost of linear-time scanning.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Prefiltering is the default in Manticore and the right choice for most filtered KNN queries. It guarantees you get k results (if they exist). Manticore automatically picks the best strategy based on how selective the filter is: standard filtered HNSW when most documents match, ACORN-1 when fewer than 60% pass (saving distance computations on filtered-out nodes), and brute-force when the filtered subset is small enough to scan directly. The query planner estimates filter selectivity per-query, per-segment, so there's nothing to tune.&lt;/p&gt;

&lt;p&gt;Use postfiltering (&lt;code&gt;prefilter=0&lt;/code&gt; in SQL, &lt;code&gt;"prefilter": false&lt;/code&gt; in JSON) when you want the globally closest vectors and can tolerate getting fewer than k results. Use brute-force (&lt;code&gt;fullscan=1&lt;/code&gt; in SQL, &lt;code&gt;"fullscan": true&lt;/code&gt; in JSON) when you know a linear scan is the right strategy for your data.&lt;/p&gt;

</description>
      <category>algorithms</category>
      <category>database</category>
      <category>machinelearning</category>
      <category>performance</category>
    </item>
    <item>
      <title>Hybrid search in Manticore Search</title>
      <dc:creator>Sergey Nikolaev</dc:creator>
      <pubDate>Wed, 01 Apr 2026 10:46:41 +0000</pubDate>
      <link>https://dev.to/sanikolaev/hybrid-search-in-manticore-search-5ake</link>
      <guid>https://dev.to/sanikolaev/hybrid-search-in-manticore-search-5ake</guid>
      <description>&lt;p&gt;Search is rarely a one-size-fits-all problem. A user typing "cheap running shoes" wants exact keyword matches, but a user asking "comfortable footwear for jogging" is expressing the same intent in different words. Traditional full-text search handles the first case well. Vector search handles the second. Hybrid search combines both in a single query so you don't have to choose.&lt;/p&gt;

&lt;p&gt;In modern search systems, this is often described as combining &lt;strong&gt;lexical (sparse) retrieval&lt;/strong&gt; with &lt;strong&gt;semantic (dense) retrieval&lt;/strong&gt;. Different terms, same idea: exact matching plus meaning.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is hybrid search?
&lt;/h2&gt;

&lt;p&gt;Hybrid search runs a full-text (BM25) search and a vector (KNN) search side by side, then merges the two result lists into one. Documents that score well on either signal (or both) rise to the top.&lt;/p&gt;

&lt;p&gt;Full-text search is great at exact keywords, rare terms, and identifiers. Vector search understands meaning — that "automobile" and "car" are the same concept — because their embeddings are nearby in vector space.&lt;/p&gt;

&lt;p&gt;Each method has blind spots:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Full-text struggles with synonyms and natural language&lt;/li&gt;
&lt;li&gt;Vector search struggles with exact tokens like SKUs, error codes, and IDs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Hybrid search covers both.&lt;/p&gt;

&lt;h2&gt;
  
  
  How hybrid search fits into modern search pipelines
&lt;/h2&gt;

&lt;p&gt;Hybrid search is the &lt;strong&gt;retrieval stage&lt;/strong&gt; — the part that finds relevant candidates from your dataset.&lt;/p&gt;

&lt;p&gt;Instead of relying on a single method, hybrid search combines keyword matching and semantic similarity to produce a stronger result set from the start.&lt;/p&gt;

&lt;p&gt;In practice, this means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Better recall for natural language queries&lt;/li&gt;
&lt;li&gt;Precise matching for identifiers like SKUs or error codes&lt;/li&gt;
&lt;li&gt;More relevant results without needing complex query logic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The goal is simple: return the best possible candidates in a single pass, using both signals together.&lt;/p&gt;

&lt;h2&gt;
  
  
  When should you use it?
&lt;/h2&gt;

&lt;p&gt;Hybrid search is a good fit when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your queries mix intent and specifics. A search like &lt;code&gt;python error 403 forbidden&lt;/code&gt; benefits from keyword precision on the error code and semantic understanding of the problem description.&lt;/li&gt;
&lt;li&gt;You're building a RAG pipeline. Retrieval-Augmented Generation needs the most relevant chunks fed to the LLM. Hybrid retrieval often finds more relevant documents than either method alone.&lt;/li&gt;
&lt;li&gt;Your catalog has structured and unstructured data. E-commerce products have precise names and model numbers (keyword territory) but also descriptions where meaning matters more than exact wording.&lt;/li&gt;
&lt;li&gt;You can't predict how users will search. Some will paste exact phrases, others will describe what they're looking for in natural language. Hybrid search handles both gracefully.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How it works
&lt;/h2&gt;

&lt;p&gt;Manticore uses Reciprocal Rank Fusion (RRF) to merge results. The idea is simple: instead of trying to compare raw BM25 scores with KNN distances (which are on completely different scales), RRF looks at rank positions. A document that's ranked #1 in the text results and #3 in the KNN results gets a higher combined score than a document that only appears in one list.&lt;/p&gt;

&lt;p&gt;Here's a quick example. Suppose a text search and a KNN search each return their own top 3:&lt;/p&gt;

&lt;p&gt;Text search results:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Rank&lt;/th&gt;
&lt;th&gt;Document&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Doc A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Doc B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Doc C&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;KNN search results:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Rank&lt;/th&gt;
&lt;th&gt;Document&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Doc C&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Doc A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Doc D&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;RRF gives a document a contribution of &lt;code&gt;1 / (rank_constant + rank)&lt;/code&gt; from each list it appears in, then sums those contributions. With the default &lt;code&gt;rank_constant=60&lt;/code&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Document&lt;/th&gt;
&lt;th&gt;Text contribution&lt;/th&gt;
&lt;th&gt;KNN contribution&lt;/th&gt;
&lt;th&gt;RRF score&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Doc A&lt;/td&gt;
&lt;td&gt;1/(60+1) = 0.0164&lt;/td&gt;
&lt;td&gt;1/(60+2) = 0.0161&lt;/td&gt;
&lt;td&gt;0.0325&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Doc C&lt;/td&gt;
&lt;td&gt;1/(60+3) = 0.0159&lt;/td&gt;
&lt;td&gt;1/(60+1) = 0.0164&lt;/td&gt;
&lt;td&gt;0.0323&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Doc B&lt;/td&gt;
&lt;td&gt;1/(60+2) = 0.0161&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;0.0161&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Doc D&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;1/(60+3) = 0.0159&lt;/td&gt;
&lt;td&gt;0.0159&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Doc A ranks highest because it appears near the top in both lists. Doc C is close behind for the same reason. Doc B and Doc D each appear in only one list, so they score lower.&lt;/p&gt;
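
&lt;p&gt;The table above can be reproduced with a short Python sketch. The function name &lt;code&gt;rrf_fuse&lt;/code&gt; is made up for this example and only mirrors the arithmetic shown; it is not how Manticore exposes fusion.&lt;/p&gt;

```python
from collections import defaultdict

def rrf_fuse(ranked_lists, rank_constant=60):
    """Sum 1 / (rank_constant + rank) over every list a document appears in."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += 1.0 / (rank_constant + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

text_results = ["Doc A", "Doc B", "Doc C"]
knn_results = ["Doc C", "Doc A", "Doc D"]

fused = rrf_fuse([text_results, knn_results])
for doc, score in fused:
    print(f"{doc}: {score:.4f}")
# Doc A: 0.0325
# Doc C: 0.0323
# Doc B: 0.0161
# Doc D: 0.0159
```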

&lt;h3&gt;
  
  
  Why RRF?
&lt;/h3&gt;

&lt;p&gt;There are two common ways to combine results:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Rank-based fusion (RRF)&lt;/strong&gt; — simple, robust, no need to normalize scores&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Score-based fusion&lt;/strong&gt; — normalize scores first, then combine&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Manticore uses RRF because it works well out of the box and avoids score calibration problems.&lt;/p&gt;

&lt;p&gt;Under the hood, a hybrid query is split into independent sub-queries — one for full-text, one (or more) for KNN — that run in parallel. Once all sub-queries finish, RRF fuses their ranked result lists into a single output.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why not just use one or the other?
&lt;/h2&gt;

&lt;p&gt;Consider a support knowledge base with articles for different error codes — connection failures, authentication problems, sync issues. A user sees error E-5020 on screen and reports: &lt;code&gt;"I can't connect to the server."&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Vector search understands the symptom but not the error code. A KNN search for "can not connect to the server" returns:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Title&lt;/th&gt;
&lt;th&gt;KNN distance&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Error E-5030: DNS Resolution Failed&lt;/td&gt;
&lt;td&gt;0.572&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Error E-2091: App Loading Timeout&lt;/td&gt;
&lt;td&gt;0.583&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Error E-5020: SSL Certificate Mismatch&lt;/td&gt;
&lt;td&gt;0.605&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Error E-5010: Service Unavailable&lt;/td&gt;
&lt;td&gt;0.622&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Error E-4001: Login Failed&lt;/td&gt;
&lt;td&gt;0.665&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The correct article (E-5020) is buried at #3. KNN ranks DNS and timeout errors higher because their descriptions are semantically closer to "can't connect." The actual problem — an SSL certificate mismatch — uses completely different vocabulary, so it scores lower.&lt;/p&gt;

&lt;p&gt;You might think: just add the error code to the KNN query. But "E-5020" and "E-5010" are arbitrary identifiers with no semantic meaning — embeddings treat them as nearly identical tokens. KNN for "E-5020 can not connect to the server" does move E-5020 to #1, but only because the added text shifts the semantic context — the error code itself carries no weight.&lt;/p&gt;

&lt;p&gt;Hybrid search solves this by sending each signal where it works best — the error code to full-text, the symptom to KNN:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hybrid_score&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;support_articles&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'can not connect to the server'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'E-5020'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;
&lt;span class="k"&gt;OPTION&lt;/span&gt; &lt;span class="n"&gt;fusion_method&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'rrf'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Title&lt;/th&gt;
&lt;th&gt;Hybrid score&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Error E-5020: SSL Certificate Mismatch&lt;/td&gt;
&lt;td&gt;0.032&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Error E-5030: DNS Resolution Failed&lt;/td&gt;
&lt;td&gt;0.016&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Error E-2091: App Loading Timeout&lt;/td&gt;
&lt;td&gt;0.016&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Error E-5010: Service Unavailable&lt;/td&gt;
&lt;td&gt;0.016&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Error E-4001: Login Failed&lt;/td&gt;
&lt;td&gt;0.015&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;E-5020 jumps from #3 to #1 with roughly twice the score of every other result. Full-text treats "E-5020" as an exact token: "E-5010" is not a near match, it is simply a different string. KNN ensures related connection errors still appear below for context.&lt;/p&gt;

&lt;p&gt;This is the core value of hybrid search:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Identifiers → full-text&lt;/li&gt;
&lt;li&gt;Meaning → vector search&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each method covers the other's blind spot.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting started
&lt;/h2&gt;

&lt;p&gt;The simplest way to run a hybrid search is with &lt;code&gt;hybrid_match()&lt;/code&gt;. If your table has auto-embeddings configured, one line does everything — text search, embedding generation, KNN search, and RRF fusion:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hybrid_score&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;hybrid_match&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'running shoes'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The JSON equivalent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;POST&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/search&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"table"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"products"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hybrid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"running shoes"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Manticore:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;generates embeddings&lt;/li&gt;
&lt;li&gt;runs both searches in parallel&lt;/li&gt;
&lt;li&gt;fuses results&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Full control: explicit MATCH + KNN
&lt;/h3&gt;

&lt;p&gt;When you need to supply your own vectors or tune individual sub-queries, use the explicit form with &lt;code&gt;MATCH()&lt;/code&gt; and &lt;code&gt;KNN()&lt;/code&gt; in the WHERE clause:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hybrid_score&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;match&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'running shoes'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;45&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;78&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...))&lt;/span&gt;
&lt;span class="k"&gt;OPTION&lt;/span&gt; &lt;span class="n"&gt;fusion_method&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'rrf'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The JSON equivalent:&lt;br&gt;
&lt;/p&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;POST&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/search&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"table"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"products"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"knn"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"field"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"embedding"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"query_vector"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.45&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.78&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"match"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"running shoes"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"options"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"fusion_method"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"rrf"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each result includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;hybrid_score()&lt;/code&gt; — fused score (used for default sorting)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;weight()&lt;/code&gt; — BM25 score&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;knn_dist()&lt;/code&gt; — vector distance&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Attribute filters (&lt;code&gt;AND category = 'footwear'&lt;/code&gt;) apply to both sub-queries.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tuning
&lt;/h2&gt;

&lt;p&gt;Three options let you adjust fusion behavior:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;rank_constant&lt;/code&gt; — controls how much top positions dominate the fused score. Lower values (e.g. 10) make rank #1 count significantly more than rank #5. Higher values flatten the curve. See &lt;a href="https://manual.manticoresearch.com/Searching/Options#rank_constant" rel="noopener noreferrer"&gt;rank_constant&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;fusion_weights&lt;/code&gt; — lets you give different importance to each sub-query. If text relevance matters more than vector similarity, weight it higher. See &lt;a href="https://manual.manticoresearch.com/Searching/Options#fusion_weights" rel="noopener noreferrer"&gt;fusion_weights&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;window_size&lt;/code&gt; — how many results each sub-query retrieves before fusion. By default, Manticore computes this automatically from your KNN parameters and query LIMIT. See &lt;a href="https://manual.manticoresearch.com/Searching/Options#window_size" rel="noopener noreferrer"&gt;window_size&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
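
&lt;p&gt;To make the effect of &lt;code&gt;rank_constant&lt;/code&gt; concrete, here is a minimal Python sketch of the textbook Reciprocal Rank Fusion formula, in which each sub-query contributes &lt;code&gt;weight / (rank_constant + rank)&lt;/code&gt; for a document. It assumes Manticore's scoring follows this standard form; the function names are illustrative and not part of any Manticore API.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def rrf_contribution(rank, rank_constant, weight=1.0):
    """Score that one sub-query adds for a document at a 1-based rank."""
    return weight / (rank_constant + rank)


def top_rank_advantage(rank_constant):
    """How many times more rank 1 contributes than rank 5."""
    return rrf_contribution(1, rank_constant) / rrf_contribution(5, rank_constant)


# Compare a low, a medium, and a very high constant.
for k in (10, 60, 1000):
    print(k, round(top_rank_advantage(k), 3))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Under this formula, rank #1 contributes about 1.36 times as much as rank #5 when the constant is 10, but only about 1.07 times as much at 60, which is what "flattening the curve" means in practice.&lt;/p&gt;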

&lt;h2&gt;
  
  
  Multi-vector fusion
&lt;/h2&gt;

&lt;p&gt;Hybrid search isn't limited to one text search plus one KNN search. You can fuse multiple vector searches together, which is useful when your data has several distinct semantic dimensions. For example, an e-commerce product has a title and a photo. A user searching for "minimalist white sneakers" cares about both: the title should match the style, and the product image should look like what they have in mind. By encoding the title and the image into separate vector spaces, you can search both at once and let RRF surface products that match across all three signals (keywords, text meaning, and visual similarity):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hybrid_score&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;match&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'running shoes'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title_vec&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;45&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;title_sim&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image_vec&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;88&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;21&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;image_sim&lt;/span&gt;
&lt;span class="k"&gt;OPTION&lt;/span&gt; &lt;span class="n"&gt;fusion_method&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'rrf'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;fusion_weights&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title_sim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;image_sim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All sub-queries run in parallel and are fused together via RRF.&lt;/p&gt;
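
&lt;p&gt;As a rough illustration of what the fusion step does with three ranked lists, the sketch below applies weighted RRF in plain Python using the weights from the query above. The document IDs and per-list rankings are made up, &lt;code&gt;rrf_fuse&lt;/code&gt; is a hypothetical helper, and the constant of 60 is an assumption taken from common RRF usage rather than a confirmed Manticore default.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def rrf_fuse(ranked_lists, weights, rank_constant=60):
    """Fuse ranked ID lists with weighted Reciprocal Rank Fusion."""
    scores = {}
    for name, doc_ids in ranked_lists.items():
        for rank, doc_id in enumerate(doc_ids, start=1):
            gain = weights[name] / (rank_constant + rank)
            scores[doc_id] = scores.get(doc_id, 0.0) + gain
    # Highest fused score first; ties broken by document ID.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical rankings returned by each sub-query (best match first).
ranked_lists = {
    "text": [101, 102, 103],
    "title_sim": [102, 104, 101],
    "image_sim": [104, 102, 105],
}
weights = {"text": 0.5, "title_sim": 0.3, "image_sim": 0.2}

for doc_id, score in rrf_fuse(ranked_lists, weights):
    print(doc_id, round(score, 5))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In this toy data, document 102 ends up first because it appears in all three lists, even though it is only second in the keyword ranking.&lt;/p&gt;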

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Hybrid search is not about replacing full-text or vector search — it’s about using both where they work best.&lt;/p&gt;

&lt;p&gt;Keyword search gives you precision for exact terms and identifiers. Vector search gives you flexibility for natural language and meaning. On their own, each has gaps. Together, they produce consistently better results across a wide range of queries.&lt;/p&gt;

&lt;p&gt;With hybrid search in Manticore, you don’t need to choose between the two or build complex query logic to handle different cases. You can run both signals in parallel and get a single, unified result set.&lt;/p&gt;

&lt;p&gt;If your search needs to handle both exact matches and intent — which most real-world applications do — hybrid search is a straightforward way to improve relevance without adding complexity.&lt;/p&gt;

</description>
      <category>algorithms</category>
      <category>database</category>
      <category>machinelearning</category>
      <category>nlp</category>
    </item>
    <item>
      <title>Manticore Search 25.0.0</title>
      <dc:creator>Sergey Nikolaev</dc:creator>
      <pubDate>Tue, 31 Mar 2026 10:54:20 +0000</pubDate>
      <link>https://dev.to/sanikolaev/manticore-search-2500-36op</link>
      <guid>https://dev.to/sanikolaev/manticore-search-2500-36op</guid>
      <description>&lt;p&gt;&lt;a href="https://manticoresearch.com/install/" rel="noopener noreferrer"&gt;Manticore Search 25.0.0&lt;/a&gt; has been released. This version brings a simpler packaging model together with major improvements in hybrid search, vector filtering, backups, RT table maintenance, and application integration.&lt;/p&gt;




&lt;h2&gt;
  
  
  Upgrade Notes
&lt;/h2&gt;

&lt;p&gt;Please review these before upgrading:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;MCL 13.0.0 is required&lt;/strong&gt;. Manticore Search 25.0.0 updates the daemon/MCL interface and adds &lt;a href="https://manual.manticoresearch.com/Searching/KNN#Creating-a-table-with-auto-embeddings" rel="noopener noreferrer"&gt;API_URL&lt;/a&gt; and &lt;a href="https://manual.manticoresearch.com/Searching/KNN#Creating-a-table-with-auto-embeddings" rel="noopener noreferrer"&gt;API_TIMEOUT&lt;/a&gt; for auto-embedding models. If you manage MCL separately, upgrade the daemon and MCL together. (&lt;a href="https://github.com/manticoresoftware/columnar/pull/123" rel="noopener noreferrer"&gt;PR #123&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Replication clusters require coordinated upgrades&lt;/strong&gt;. Mixed-version clusters are not compatible with the replication changes in 24.0.0. Upgrade clustered nodes together. (&lt;a href="https://github.com/manticoresoftware/manticoresearch/issues/4343" rel="noopener noreferrer"&gt;Issue #4343&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Newer bigram tokenization options affect downgrade paths&lt;/strong&gt;. If you rebuild indexes with the bigram tokenization changes introduced in 23.0.0, those rewritten indexes are not compatible with older Manticore versions. (&lt;a href="https://github.com/manticoresoftware/manticoresearch/issues/4364" rel="noopener noreferrer"&gt;Issue #4364&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Filtered KNN results may change&lt;/strong&gt;. Since KNN prefiltering was introduced in 19.0.0, filtered vector queries can now prioritize nearest neighbors that satisfy the filter during search, rather than filtering only after candidate selection. (&lt;a href="https://github.com/manticoresoftware/manticoresearch/issues/4103" rel="noopener noreferrer"&gt;Issue #4103&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Packaging Simplified
&lt;/h2&gt;

&lt;p&gt;Starting with 25.0.0, &lt;code&gt;manticore&lt;/code&gt; is the bundle package for deb and rpm. It includes the daemon, tools, converter, development headers, ICU data, bundled dependency packages, and built-in language packs for German, English, and Russian, along with Jieba support.&lt;/p&gt;

&lt;p&gt;In most cases, upgrading is now simpler: install &lt;code&gt;manticore&lt;/code&gt; and let the bundle pull in the components you need. If older split packages conflict with the new layout, remove them first with &lt;code&gt;apt remove 'manticore*'&lt;/code&gt; or &lt;code&gt;yum remove 'manticore*'&lt;/code&gt; and then install &lt;code&gt;manticore&lt;/code&gt;. Your existing data remains intact. On &lt;code&gt;yum&lt;/code&gt;-based systems, the package manager may replace the config file, but it automatically keeps a backup of the previous one.&lt;/p&gt;

&lt;p&gt;This is an important operational change: it reduces packaging friction and makes installation simpler and more predictable.&lt;/p&gt;




&lt;h2&gt;
  
  
  Highlights
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Hybrid search is now a first-class option
&lt;/h3&gt;

&lt;p&gt;Manticore now supports &lt;a href="https://manticoresearch.com/blog/hybrid-search/" rel="noopener noreferrer"&gt;hybrid search&lt;/a&gt;, allowing you to combine full-text and vector retrieval in a single query. This makes it much easier to build retrieval pipelines that balance lexical precision with semantic recall.&lt;/p&gt;

&lt;p&gt;You can use hybrid search via both SQL and JSON interfaces. In SQL, you can combine &lt;code&gt;MATCH()&lt;/code&gt; with one or more &lt;code&gt;KNN()&lt;/code&gt; subqueries. For teams building modern search experiences, this is one of the biggest additions in the release line.&lt;/p&gt;

&lt;h3&gt;
  
  
  Better vector search with KNN prefiltering
&lt;/h3&gt;

&lt;p&gt;With &lt;a href="https://manticoresearch.com/blog/knn-prefiltering/" rel="noopener noreferrer"&gt;KNN prefiltering&lt;/a&gt;, attribute filters can be applied during vector search instead of only after candidate selection. That matters when you need "the nearest neighbors among documents that also match my filter", not just "the nearest neighbors overall, filtered afterward".&lt;/p&gt;

&lt;p&gt;This improves both relevance and predictability for filtered vector search workloads such as category-constrained product search, tenant-aware search, and permission-filtered semantic retrieval.&lt;/p&gt;
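
&lt;p&gt;The difference between the two behaviors is easy to see on a toy dataset. The Python sketch below uses brute-force distance over made-up two-dimensional vectors; it only illustrates the semantics of filtering before versus after candidate selection and says nothing about how Manticore's vector index implements prefiltering internally. All names and values are hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import math

# Hypothetical catalog rows: (id, category, embedding).
DOCS = [
    (1, "shoes", (0.9, 0.1)),
    (2, "bags", (0.1, 0.1)),
    (3, "bags", (0.2, 0.0)),
    (4, "shoes", (0.8, 0.3)),
    (5, "bags", (0.0, 0.2)),
]


def nearest(docs, query, k):
    """Brute-force k nearest neighbors by Euclidean distance."""
    return sorted(docs, key=lambda doc: math.dist(doc[2], query))[:k]


def postfilter(docs, query, k, category):
    """Pick the k nearest overall, then drop rows that fail the filter."""
    return [doc[0] for doc in nearest(docs, query, k) if doc[1] == category]


def prefilter(docs, query, k, category):
    """Keep only rows that pass the filter, then pick the k nearest."""
    matching = [doc for doc in docs if doc[1] == category]
    return [doc[0] for doc in nearest(matching, query, k)]


query = (0.0, 0.0)
print(postfilter(DOCS, query, 2, "shoes"))  # nearest two are both bags
print(prefilter(DOCS, query, 2, "shoes"))   # nearest two among shoes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With filtering applied afterward, the two nearest rows are both bags, so the "shoes" query comes back empty. With filtering applied first, the same query returns the two closest shoes.&lt;/p&gt;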

&lt;h3&gt;
  
  
  Faster RT maintenance with parallel chunk merging
&lt;/h3&gt;

&lt;p&gt;Manticore RT tables now handle heavy maintenance much better thanks to N-way merges and parallel &lt;code&gt;OPTIMIZE&lt;/code&gt; jobs. We covered the details in &lt;a href="https://manticoresearch.com/blog/parallel-chunk-merging/" rel="noopener noreferrer"&gt;Parallel chunk merging&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The result is simpler to explain than the implementation: when a table accumulates many disk chunks, cleanup and compaction take less time, so RT tables perform better under sustained write load.&lt;/p&gt;
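
&lt;p&gt;Conceptually, an N-way merge combines many sorted inputs in a single pass instead of merging them two at a time. The Python sketch below shows the idea on lists of document IDs and skips IDs that were marked as deleted. It is a simplified illustration under those assumptions, not a description of Manticore's actual chunk format or merge code.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import heapq


def merge_chunks(chunks, killed=frozenset()):
    """Merge any number of sorted ID lists in one pass, skipping killed IDs."""
    return [doc_id for doc_id in heapq.merge(*chunks) if doc_id not in killed]


# Three hypothetical disk chunks, each already sorted by document ID.
chunks = [[1, 4, 9], [2, 3, 10], [5, 6, 7, 8]]
print(merge_chunks(chunks, killed={3, 7}))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;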

&lt;h3&gt;
  
  
  Easier application integration with prepared statements
&lt;/h3&gt;

&lt;p&gt;Manticore now supports MySQL-compatible prepared statements, which we covered in &lt;a href="https://manticoresearch.com/blog/prepared-statements/" rel="noopener noreferrer"&gt;Prepared statements in Manticore Search&lt;/a&gt;. This improves compatibility with MySQL clients, connection pools, ORMs, and frameworks that expect binary protocol prepare/execute behavior.&lt;/p&gt;

&lt;p&gt;For application developers, this removes one more integration edge case and makes Manticore easier to adopt in existing stacks.&lt;/p&gt;

&lt;h3&gt;
  
  
  S3-compatible backup and restore
&lt;/h3&gt;

&lt;p&gt;Backup operations are more flexible now thanks to &lt;a href="https://manticoresearch.com/blog/s3-streamable-backup/" rel="noopener noreferrer"&gt;S3-compatible backup and restore&lt;/a&gt;. Manticore Backup supports AWS S3, MinIO, Wasabi, and Cloudflare R2, making it easier to ship backups to object storage and build cleaner disaster-recovery workflows.&lt;/p&gt;

&lt;p&gt;This is especially useful for containerized and cloud-native deployments where local disk is temporary but object storage is the durable layer.&lt;/p&gt;

&lt;h3&gt;
  
  
  Auto-embeddings keep improving
&lt;/h3&gt;

&lt;p&gt;25.0.0 also extends Manticore's recent auto-embeddings work. The new MCL version adds &lt;a href="https://manual.manticoresearch.com/Searching/KNN#Creating-a-table-with-auto-embeddings" rel="noopener noreferrer"&gt;API_URL&lt;/a&gt; and &lt;a href="https://manual.manticoresearch.com/Searching/KNN#Creating-a-table-with-auto-embeddings" rel="noopener noreferrer"&gt;API_TIMEOUT&lt;/a&gt; controls for auto-embedding models. Recent development also added support for GGUF quantized local embedding models, T5 encoders, gated Hugging Face downloads, and replication-safe embedding handling for RT tables.&lt;/p&gt;

&lt;p&gt;Taken together, these changes make Manticore more practical both for local embedding pipelines and for deployments that rely on external model endpoints.&lt;/p&gt;




&lt;h2&gt;
  
  
  Other Notable Improvements
&lt;/h2&gt;

&lt;p&gt;This release also includes &lt;strong&gt;36 bug fixes&lt;/strong&gt; across query execution, replication, macOS packaging, auto-embeddings, RT tables, and SQL compatibility.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;False-positive full-text matches caused by &lt;code&gt;max_query_time&lt;/code&gt; interruptions in complex queries were fixed, so timed-out searches no longer return rows that do not actually satisfy the query. (&lt;a href="https://github.com/manticoresoftware/manticoresearch/issues/4375" rel="noopener noreferrer"&gt;Issue #4375&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Replication was fixed for transactions containing duplicate document IDs, so replicas no longer lose rows while the donor removes duplicates correctly. (&lt;a href="https://github.com/manticoresoftware/manticoresearch/issues/4388" rel="noopener noreferrer"&gt;Issue #4388&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Several auto-embedding stability issues were fixed, including crashes during embedding generation, invalid UTF-8 handling, and missing RT locks during validation. (&lt;a href="https://github.com/manticoresoftware/manticoresearch/pull/4349" rel="noopener noreferrer"&gt;PR #4349&lt;/a&gt;, &lt;a href="https://github.com/manticoresoftware/columnar/issues/125" rel="noopener noreferrer"&gt;PR #4370&lt;/a&gt;, &lt;a href="https://github.com/manticoresoftware/manticoresearch/pull/4371" rel="noopener noreferrer"&gt;PR #4371&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;LEFT JOIN&lt;/code&gt; now returns proper MySQL &lt;code&gt;NULL&lt;/code&gt; values instead of the string &lt;code&gt;NULL&lt;/code&gt;, improving compatibility with MySQL clients and drivers. (&lt;a href="https://github.com/manticoresoftware/manticoresearch/issues/4229" rel="noopener noreferrer"&gt;Issue #4229&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;A race during RT disk chunk save that could lose killed documents and produce duplicate rows after merges or saves was fixed. (&lt;a href="https://github.com/manticoresoftware/manticoresearch/issues/4207" rel="noopener noreferrer"&gt;Issue #4207&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Fuzzy search now works across queries involving multiple tables. (&lt;a href="https://github.com/manticoresoftware/manticoresearch-buddy/pull/648" rel="noopener noreferrer"&gt;PR #4372&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Why 25.0.0 Matters
&lt;/h2&gt;

&lt;p&gt;Manticore Search 25.0.0 combines the packaging changes with several important capabilities that are now available together:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;hybrid lexical + vector retrieval&lt;/li&gt;
&lt;li&gt;filtered vector search that behaves the way users expect&lt;/li&gt;
&lt;li&gt;simpler integration through prepared statements&lt;/li&gt;
&lt;li&gt;object-storage-friendly backup workflows&lt;/li&gt;
&lt;li&gt;faster RT table compaction and maintenance&lt;/li&gt;
&lt;li&gt;more flexible auto-embedding deployments&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For the complete technical details, see the &lt;a href="https://manual.manticoresearch.com/Changelog#Version-25.0.0" rel="noopener noreferrer"&gt;changelog&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Need help or want to connect?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Join our &lt;a href="https://slack.manticoresearch.com" rel="noopener noreferrer"&gt;Slack&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Visit the &lt;a href="https://forum.manticoresearch.com" rel="noopener noreferrer"&gt;Forum&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Report issues or suggest features on &lt;a href="https://github.com/manticoresoftware/manticoresearch/issues" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Email us at &lt;code&gt;contact@manticoresearch.com&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>database</category>
      <category>news</category>
      <category>opensource</category>
    </item>
    <item>
      <title>MCP-Manticore: Let Your AI Assistant Write Manticore Queries for You</title>
      <dc:creator>Sergey Nikolaev</dc:creator>
      <pubDate>Wed, 25 Mar 2026 10:23:25 +0000</pubDate>
      <link>https://dev.to/sanikolaev/mcp-manticore-let-your-ai-assistant-write-manticore-queries-for-you-33kp</link>
      <guid>https://dev.to/sanikolaev/mcp-manticore-let-your-ai-assistant-write-manticore-queries-for-you-33kp</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;You've heard Manticore Search is fast. You've heard it handles full-text, vector, and fuzzy search in one engine. But when you sit down to actually use it, you're staring at documentation, guessing at SQL syntax, and hoping your &lt;code&gt;CREATE TABLE&lt;/code&gt; doesn't throw an obscure error.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MCP-Manticore&lt;/strong&gt; changes the game. It's a Model Context Protocol (MCP) server that connects Cursor, Claude Code, Codex CLI, or any MCP-compatible AI assistant directly to your Manticore instance. The AI can read the docs, inspect your schema, and run test queries, all before it hands you a finished query.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://modelcontextprotocol.io" rel="noopener noreferrer"&gt;MCP&lt;/a&gt; (Model Context Protocol) is an open standard that lets AI assistants connect to external tools and data sources. Instead of the AI hallucinating Manticore syntax based on training data from who-knows-when, it gets real-time access to your database and the official documentation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Two Ways This Helps You
&lt;/h2&gt;

&lt;p&gt;Depending on what you're doing, MCP-Manticore provides value in two different ways:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Developer Assistance (Primary):&lt;/strong&gt; You're building an app that uses Manticore. The AI helps you create tables, write complex queries, and understand features — without you memorizing SQL syntax.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Data Exploration (Secondary):&lt;/strong&gt; You have data in Manticore and want to ask questions in plain English. "Show me the cheapest accessories" or "Find products similar to this one." For complex operations like vector search or fuzzy matching, the AI needs MCP to know the correct Manticore-specific syntax.&lt;/p&gt;

&lt;p&gt;Let's look at both.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem: AI Without Context
&lt;/h2&gt;

&lt;p&gt;When you ask an AI assistant to help with Manticore &lt;strong&gt;without&lt;/strong&gt; MCP integration, here's what happens:&lt;/p&gt;

&lt;h3&gt;
  
  
  Example: Creating a Table with Auto-Embeddings
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;You:&lt;/strong&gt; "How do I create a Manticore table with auto-embeddings for semantic search?"&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI (guessing):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description_vector&lt;/span&gt; &lt;span class="n"&gt;float_vector&lt;/span&gt; 
        &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'sentence-transformers/all-MiniLM-L6-v2'&lt;/span&gt; 
        &lt;span class="k"&gt;source&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'description'&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Looks reasonable, right? &lt;strong&gt;Wrong.&lt;/strong&gt; The actual syntax uses &lt;code&gt;MODEL_NAME&lt;/code&gt; and &lt;code&gt;FROM&lt;/code&gt;, not &lt;code&gt;embedding&lt;/code&gt; and &lt;code&gt;source&lt;/code&gt;. You'd discover this only after the query fails, sending you back to the documentation.&lt;/p&gt;

&lt;p&gt;This is the pattern:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;AI generates query based on outdated or guessed syntax&lt;/li&gt;
&lt;li&gt;You run it, get an error&lt;/li&gt;
&lt;li&gt;You manually check documentation&lt;/li&gt;
&lt;li&gt;You correct the AI (or fix it yourself)&lt;/li&gt;
&lt;li&gt;Repeat until it works&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For developers integrating Manticore into applications, this friction kills productivity. You're not exploring data — you're fighting syntax.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution: MCP-Manticore
&lt;/h2&gt;

&lt;p&gt;With MCP-Manticore enabled, the conversation changes:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You:&lt;/strong&gt; "How do I create a Manticore table with auto-embeddings for semantic search?"&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI:&lt;/strong&gt; "Let me check the documentation for the correct syntax..."&lt;/p&gt;

&lt;p&gt;&lt;em&gt;[AI calls &lt;code&gt;list_documentation&lt;/code&gt; via MCP, searches for "embedding|semantic|knn|vector"]&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI:&lt;/strong&gt; "Here's the correct syntax. Manticore uses &lt;code&gt;MODEL_NAME&lt;/code&gt; and &lt;code&gt;FROM&lt;/code&gt; for auto-embeddings:"&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;embedding_vector&lt;/span&gt; &lt;span class="n"&gt;FLOAT_VECTOR&lt;/span&gt; &lt;span class="n"&gt;KNN_TYPE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'hnsw'&lt;/span&gt; &lt;span class="n"&gt;HNSW_SIMILARITY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'l2'&lt;/span&gt;
    &lt;span class="n"&gt;MODEL_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'sentence-transformers/all-MiniLM-L6-v2'&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'title,description'&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The AI now has:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Real-time access&lt;/strong&gt; to Manticore documentation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Schema introspection&lt;/strong&gt; via &lt;code&gt;list_tables()&lt;/code&gt; and &lt;code&gt;describe_table()&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Query execution&lt;/strong&gt; to test and validate&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Safety controls&lt;/strong&gt; — read-only by default, write operations require opt-in&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Real Examples: With and Without MCP
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Example 1: Schema Creation
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Without MCP:&lt;/strong&gt;&lt;br&gt;
The AI guessed the syntax, using &lt;code&gt;embedding='...'&lt;/code&gt; and &lt;code&gt;source='...'&lt;/code&gt;, neither of which exists in Manticore. You'd hit an error and waste time debugging.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;With MCP:&lt;/strong&gt;&lt;br&gt;
The AI retrieved the official documentation first and provided the correct &lt;code&gt;MODEL_NAME&lt;/code&gt; and &lt;code&gt;FROM&lt;/code&gt; syntax. It also explained the supported models (local Hugging Face models, OpenAI, Voyage, Jina) and the &lt;code&gt;HNSW_SIMILARITY&lt;/code&gt; options (&lt;code&gt;L2&lt;/code&gt;, &lt;code&gt;IP&lt;/code&gt;, &lt;code&gt;COSINE&lt;/code&gt;).&lt;/p&gt;
&lt;h3&gt;
  
  
  Example 2: Semantic Search with Auto-Embeddings
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;You:&lt;/strong&gt; "Find products similar to 'noise-canceling headphones for travel'"&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Without MCP:&lt;/strong&gt;&lt;br&gt;
The AI completely loses track. Without access to documentation, it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tries to SELECT all data and aggregate internally without any filter&lt;/li&gt;
&lt;li&gt;Hallucinates embedding vectors with made-up syntax: &lt;code&gt;ANY_KNN(embedding, (-0.07089090,0.04201586,-0.03262700...))&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Attempts to write Python scripts to manually calculate similarity&lt;/li&gt;
&lt;li&gt;Eventually gives up and just does string matching on descriptions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; It finds "Wireless Headphones" only because the description literally contains "noise-canceling headphones" — pure luck, not semantic search.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;With MCP:&lt;/strong&gt;&lt;br&gt;
The AI checks documentation, discovers your table uses auto-embeddings, and learns that &lt;code&gt;knn()&lt;/code&gt; accepts &lt;strong&gt;text directly&lt;/strong&gt; when &lt;code&gt;MODEL_NAME&lt;/code&gt; is configured:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;knn_dist&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; 
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; 
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'noise-canceling headphones for travel'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; Returns Wireless Headphones as #1 (correct), but also surfaces semantically related items — actual vector similarity, not keyword matching.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example 3: Fuzzy Search (Typo Tolerance)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;You:&lt;/strong&gt; "Find products even if I misspell the name, like 'headphons' instead of 'headphones'"&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Without MCP:&lt;/strong&gt;&lt;br&gt;
The AI tries everything it was trained on, hoping something works:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;MATCH('headphons~1')&lt;/code&gt; and &lt;code&gt;MATCH('headphons~')&lt;/code&gt; — wrong operators&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CALL SUGGEST('headphons', 'products')&lt;/code&gt; — wrong approach for this use case&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;MATCH('FUZZY(headphons')&lt;/code&gt; — hallucinated syntax that doesn't exist&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ALTER TABLE products SET min_infix_len = 3&lt;/code&gt; — unnecessary and wrong&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;OPTION expand_keywords = 1&lt;/code&gt; — unrelated feature&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It even tries to optimize the table and run suggestions again. Complete chaos.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; No working query. Just a pile of failed attempts based on outdated or confused training data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;With MCP:&lt;/strong&gt;&lt;br&gt;
The AI checks the documentation and finds the correct syntax immediately:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'headphons'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;OPTION&lt;/span&gt; &lt;span class="n"&gt;fuzzy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; Returns "Wireless Headphones" despite the typo. The AI also explains that &lt;code&gt;fuzzy=1&lt;/code&gt; turns on typo-tolerant matching based on Levenshtein distance (the number of single-character edits between two words), and that you can adjust the tolerance with the &lt;code&gt;distance&lt;/code&gt; option, for example &lt;code&gt;OPTION fuzzy=1, distance=2&lt;/code&gt;.&lt;/p&gt;
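
&lt;p&gt;To make the Levenshtein distance mentioned here concrete, the following minimal Python sketch computes it for the typo in this example. It only illustrates the metric; it is not how Manticore implements fuzzy matching, and the function name &lt;code&gt;levenshtein&lt;/code&gt; is our own.&lt;/p&gt;

```python
def levenshtein(a, b):
    """Return the edit distance between a and b (insert, delete, substitute)."""
    # previous[j] is the distance between the first i-1 chars of a and the first j chars of b
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning the first i chars of a into "" takes i deletions
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


print(levenshtein("headphons", "headphones"))  # prints 1: one missing "e"
```

&lt;p&gt;A distance of 1 means a single inserted, deleted, or substituted character separates the query from the indexed word, which is why a small tolerance is enough for this typo.&lt;/p&gt;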

&lt;h2&gt;
  
  
  Key Features
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Intelligent Documentation Lookup
&lt;/h3&gt;

&lt;p&gt;MCP-Manticore includes a documentation fetcher that pulls directly from the &lt;a href="https://manual.manticoresearch.com" rel="noopener noreferrer"&gt;Manticore Search manual&lt;/a&gt; on GitHub. When you ask about features like KNN vector search, fuzzy matching, or full-text operators, the AI retrieves the official documentation before responding.&lt;/p&gt;

&lt;h3&gt;
  
  
  Schema-Aware Query Building
&lt;/h3&gt;

&lt;p&gt;The server provides tools that let the AI understand your data structure before writing queries:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;list_tables()&lt;/code&gt; — See what tables exist&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;describe_table()&lt;/code&gt; — Understand column names and types&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;execute_query()&lt;/code&gt; — Run queries and see results&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Safe Query Execution
&lt;/h3&gt;

&lt;p&gt;By default, MCP-Manticore runs in &lt;strong&gt;read-only mode&lt;/strong&gt;. Write operations (INSERT, UPDATE, DELETE, DROP) require explicit opt-in via environment variables:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;MANTICORE_ALLOW_WRITE_ACCESS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;  &lt;span class="c"&gt;# Enable INSERT/UPDATE/DELETE&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;MANTICORE_ALLOW_DROP&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;            &lt;span class="c"&gt;# Enable DROP/TRUNCATE&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
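
&lt;p&gt;As a rough illustration of this opt-in model, the sketch below gates a statement on its first keyword and the two environment variables. It is a hypothetical guard written for this article, not MCP-Manticore's actual implementation: the helper names &lt;code&gt;flag_enabled&lt;/code&gt; and &lt;code&gt;is_allowed&lt;/code&gt;, and the rule that only the literal value &lt;code&gt;true&lt;/code&gt; enables a flag, are assumptions.&lt;/p&gt;

```python
import os

# Statement groups taken from the article; the grouping logic itself is an assumption
WRITE_KEYWORDS = {"INSERT", "UPDATE", "DELETE"}
DROP_KEYWORDS = {"DROP", "TRUNCATE"}


def flag_enabled(env, name):
    """Treat a flag as on only when it is set to 'true' (assumed convention)."""
    return env.get(name, "false").strip().lower() == "true"


def is_allowed(query, env=None):
    """Decide whether a statement may run, based on its first keyword."""
    env = os.environ if env is None else env
    words = query.strip().split()
    keyword = words[0].upper() if words else ""
    if keyword in DROP_KEYWORDS:
        return flag_enabled(env, "MANTICORE_ALLOW_DROP")
    if keyword in WRITE_KEYWORDS:
        return flag_enabled(env, "MANTICORE_ALLOW_WRITE_ACCESS")
    return True  # anything else is treated as a read in this sketch


print(is_allowed("SELECT * FROM products", {}))              # True
print(is_allowed("INSERT INTO products VALUES (1)", {}))     # False
print(is_allowed("DROP TABLE products", {"MANTICORE_ALLOW_DROP": "true"}))  # True
```

&lt;p&gt;Treating every unrecognized statement as a read keeps the sketch short; a production guard would more likely use an explicit allow-list of read-only statements.&lt;/p&gt;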



&lt;h3&gt;
  
  
  Multiple Transport Options
&lt;/h3&gt;

&lt;p&gt;Connect via:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;stdio&lt;/strong&gt; (for CLI-based AI assistants like Claude Code)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HTTP&lt;/strong&gt; (for web-based integrations)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SSE&lt;/strong&gt; (Server-Sent Events for real-time updates)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The HTTP and SSE transports support optional JWT authentication for secure deployments.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tutorial: Setting Up MCP-Manticore
&lt;/h2&gt;

&lt;p&gt;MCP-Manticore works with any MCP-compatible AI assistant, including &lt;a href="https://cursor.sh" rel="noopener noreferrer"&gt;Cursor&lt;/a&gt;, &lt;a href="https://claude.ai/download" rel="noopener noreferrer"&gt;Claude Code&lt;/a&gt;, &lt;a href="https://github.com/openai/codex" rel="noopener noreferrer"&gt;Codex CLI&lt;/a&gt;, &lt;a href="https://codeium.com/windsurf" rel="noopener noreferrer"&gt;Windsurf&lt;/a&gt;, and any other tool that supports the Model Context Protocol.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Ensure UV is Installed
&lt;/h3&gt;

&lt;p&gt;MCP-Manticore runs best with &lt;a href="https://docs.astral.sh/uv/" rel="noopener noreferrer"&gt;uv&lt;/a&gt;, a fast Python package manager:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-LsSf&lt;/span&gt; https://astral.sh/uv/install.sh | sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With &lt;code&gt;uv&lt;/code&gt;, you don't need to manually install MCP-Manticore—&lt;code&gt;uvx&lt;/code&gt; downloads and runs it automatically.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Configure Environment Variables (Optional)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Manticore connection (defaults shown; set these only if yours differ)&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;MANTICORE_HOST&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;localhost
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;MANTICORE_PORT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;9308

&lt;span class="c"&gt;# Optional: Enable write access (default: read-only)&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;MANTICORE_ALLOW_WRITE_ACCESS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;

&lt;span class="c"&gt;# Optional: Allow destructive operations (DROP, TRUNCATE)&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;MANTICORE_ALLOW_DROP&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;false&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Add to Your MCP Client
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;General Configuration:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Command&lt;/strong&gt;: &lt;code&gt;uvx mcp-manticore&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Environment variables&lt;/strong&gt; (if needed): &lt;code&gt;MANTICORE_HOST&lt;/code&gt;, &lt;code&gt;MANTICORE_PORT&lt;/code&gt;, etc.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example configuration&lt;/strong&gt; (&lt;code&gt;mcp.json&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"manticore"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"uvx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"mcp-manticore"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"MANTICORE_HOST"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"localhost"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"MANTICORE_PORT"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"9308"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For client-specific setup instructions (Cursor, Claude Desktop, Windsurf, etc.), see the &lt;a href="https://github.com/manticoresoftware/mcp-manticore#client-configuration" rel="noopener noreferrer"&gt;MCP-Manticore README&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Verify Connection
&lt;/h3&gt;

&lt;p&gt;Test by asking your AI assistant:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Show me all tables in Manticore"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You should see the AI call the &lt;code&gt;list_tables()&lt;/code&gt; tool and display your tables.&lt;/p&gt;

&lt;h2&gt;
  
  
  Configuration Reference
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Environment Variable&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Default&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;MANTICORE_HOST&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Manticore server hostname&lt;/td&gt;
&lt;td&gt;&lt;code&gt;localhost&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;MANTICORE_PORT&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Manticore HTTP port&lt;/td&gt;
&lt;td&gt;&lt;code&gt;9308&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;MANTICORE_ALLOW_WRITE_ACCESS&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Enable INSERT/UPDATE/DELETE&lt;/td&gt;
&lt;td&gt;&lt;code&gt;false&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;MANTICORE_ALLOW_DROP&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Enable DROP/TRUNCATE&lt;/td&gt;
&lt;td&gt;&lt;code&gt;false&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;MANTICORE_MCP_TRANSPORT&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Transport type (stdio/http/sse)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;stdio&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;MANTICORE_MCP_AUTH_TOKEN&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;JWT token for HTTP/SSE&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
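
&lt;p&gt;The table above can be read as a small configuration loader. The Python sketch below applies the same variable names and defaults; the &lt;code&gt;ManticoreMcpConfig&lt;/code&gt; class, the &lt;code&gt;load_config&lt;/code&gt; function, and the boolean parsing rule are hypothetical and only illustrate how the defaults combine, not how MCP-Manticore reads them internally.&lt;/p&gt;

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class ManticoreMcpConfig:
    host: str
    port: int
    allow_write_access: bool
    allow_drop: bool
    transport: str
    auth_token: Optional[str]


def load_config(env=None):
    """Build a config from environment variables, falling back to the table's defaults."""
    env = os.environ if env is None else env

    def as_bool(name):
        # Assumed convention: only the string "true" (any case) enables a flag
        return env.get(name, "false").strip().lower() == "true"

    transport = env.get("MANTICORE_MCP_TRANSPORT", "stdio")
    if transport not in ("stdio", "http", "sse"):
        raise ValueError("unsupported transport: " + transport)

    return ManticoreMcpConfig(
        host=env.get("MANTICORE_HOST", "localhost"),
        port=int(env.get("MANTICORE_PORT", "9308")),
        allow_write_access=as_bool("MANTICORE_ALLOW_WRITE_ACCESS"),
        allow_drop=as_bool("MANTICORE_ALLOW_DROP"),
        transport=transport,
        auth_token=env.get("MANTICORE_MCP_AUTH_TOKEN"),  # None when unset
    )


print(load_config({}))  # every field at its documented default
```

&lt;p&gt;With an empty environment the result is a read-only stdio setup pointing at &lt;code&gt;localhost:9308&lt;/code&gt;, which matches the zero-configuration path in the tutorial.&lt;/p&gt;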

&lt;h2&gt;
  
  
  The Future: Agents That Install Themselves
&lt;/h2&gt;

&lt;p&gt;There's a third use case on the horizon: &lt;strong&gt;autonomous agents&lt;/strong&gt; that discover and install MCP servers themselves.&lt;/p&gt;

&lt;p&gt;Imagine an AI agent that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Finds your GitHub repo mentioning Manticore&lt;/li&gt;
&lt;li&gt;Searches for "Manticore MCP server"&lt;/li&gt;
&lt;li&gt;Finds MCP-Manticore, installs it automatically&lt;/li&gt;
&lt;li&gt;Starts querying your database to complete its task&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This isn't science fiction — OpenAI's Codex and similar agentic systems are moving in this direction. When that future arrives, having MCP-Manticore in the MCP registry means your AI tools will just work with Manticore, no manual setup required.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;MCP-Manticore transforms AI assistants from passive text generators into active, knowledgeable development partners. That helps whether you're:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Building with Manticore&lt;/strong&gt; — Let the AI handle syntax while you focus on your application&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Learning Manticore&lt;/strong&gt; — Ask questions in plain English, get accurate answers backed by docs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Exploring your data&lt;/strong&gt; — Query without memorizing SQL syntax or table schemas&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The old way: guess, error, debug, repeat.&lt;br&gt;&lt;br&gt;
The new way: ask, verify, execute, done.&lt;/p&gt;

&lt;p&gt;Ready to try it? With &lt;code&gt;uv&lt;/code&gt; installed, just add MCP-Manticore to your MCP client settings and start asking. Your future self — free from syntax rabbit holes — will thank you.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Resources:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/manticoresoftware/mcp-manticore" rel="noopener noreferrer"&gt;MCP-Manticore on GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://modelcontextprotocol.io" rel="noopener noreferrer"&gt;MCP Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://manual.manticoresearch.com" rel="noopener noreferrer"&gt;Manticore Search Manual&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cursor.com/context/model-context-protocol" rel="noopener noreferrer"&gt;Cursor MCP Setup Guide&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>database</category>
      <category>mcp</category>
      <category>tooling</category>
    </item>
    <item>
      <title>Manticore Search on Microsoft Azure: DX1's Story</title>
      <dc:creator>Sergey Nikolaev</dc:creator>
      <pubDate>Wed, 18 Feb 2026 04:02:42 +0000</pubDate>
      <link>https://dev.to/sanikolaev/manticore-search-on-microsoft-azure-dx1s-story-5335</link>
      <guid>https://dev.to/sanikolaev/manticore-search-on-microsoft-azure-dx1s-story-5335</guid>
      <description>&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;DX1 uses Manticore Search for customer and parts search with a fast typeahead UX&lt;/li&gt;
&lt;li&gt;Chosen for open-source licensing and speed&lt;/li&gt;
&lt;li&gt;Deployed on Azure VMs running Ubuntu, aligned with DX1’s existing Azure footprint&lt;/li&gt;
&lt;li&gt;Handles 20M+ parts; best typeahead performance requires indexes in memory&lt;/li&gt;
&lt;li&gt;Scales by upgrading VM memory or adding nodes to a Manticore cluster&lt;/li&gt;
&lt;li&gt;Day-to-day operations are low touch and low maintenance&lt;/li&gt;
&lt;/ul&gt;






&lt;h2&gt;
  
  
  Context
&lt;/h2&gt;

&lt;p&gt;This article is based on direct input from &lt;a href="https://theorg.com/org/dx1/org-chart/damir-tresnjo" rel="noopener noreferrer"&gt;Damir Tresnjo&lt;/a&gt; at &lt;a href="https://www.dx1app.com/" rel="noopener noreferrer"&gt;DX1&lt;/a&gt;. It describes how DX1 runs Manticore Search in production on Microsoft Azure today, focusing on why they chose Manticore, how they deploy it, and what they have learned about performance and scaling.&lt;/p&gt;




&lt;h2&gt;
  
  
  DX1 in One Paragraph
&lt;/h2&gt;

&lt;p&gt;DX1 uses Manticore Search as a fast, user-facing search layer for customer records and a parts catalog that has grown beyond 20 million records. The setup is intentionally simple: Manticore runs on Ubuntu-based Azure VMs alongside the rest of their Azure infrastructure, delivering responsive typeahead while staying “low touch” operationally. As their data and traffic grow, they scale in a straightforward way by upgrading VM sizes or adding more nodes.&lt;/p&gt;




&lt;h2&gt;
  
  
  Search That Customers Actually Enjoy Using
&lt;/h2&gt;

&lt;p&gt;DX1 uses Manticore Search to power search across customer and parts data. Typeahead is a core part of the experience, and according to Damir, it is one of the features their users appreciate most.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;“We use it for searching through customers and parts data, we have a type ahead functionality that our customers love.”&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is a practical, user-facing use case where milliseconds matter, and it has shaped both infrastructure and operational choices.&lt;/p&gt;

&lt;p&gt;If you're exploring autocomplete in Manticore, there are multiple ways to implement it depending on data and UX requirements. For a deeper dive, see our overview of fuzzy search and autocomplete: &lt;a href="https://manticoresearch.com/blog/new-fuzzy-search-and-autocomplete/" rel="noopener noreferrer"&gt;New fuzzy search and autocomplete&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why DX1 Chose Manticore Search
&lt;/h2&gt;

&lt;p&gt;The decision to use Manticore Search was straightforward: it is open source and fast.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;“Open source and very fast.”&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That combination made it a good fit for DX1’s search workload and cost expectations, while keeping the stack approachable for a lean team.&lt;/p&gt;




&lt;h2&gt;
  
  
  Deployment on Azure VMs
&lt;/h2&gt;

&lt;p&gt;DX1 runs all of its infrastructure on Azure, so deploying Manticore there was the natural choice. The team runs Manticore Search on Azure virtual machines using Ubuntu.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;“We run everything on Azure, so we deployed Manticore there as well.”&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;No expensive Azure-specific managed services were required; VMs provided the flexibility they needed while staying consistent with the rest of their environment.&lt;/p&gt;




&lt;h2&gt;
  
  
  Performance, Memory, and Scale
&lt;/h2&gt;

&lt;p&gt;Manticore has been fast and stable for DX1, even at large scale. Their production dataset includes over 20 million parts.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;“It performs very fast, we have over 20 million parts we search through.”&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;One practical consideration is memory. Typeahead performance benefits from indexes being in memory, which means VM memory may need to grow alongside the index.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;“It does need the database to be in memory for the type ahead performance. As soon as index outgrows available memory, we need to upgrade the VM memory.”&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This creates a clear scaling path: grow memory on existing VMs or add more nodes to a cluster.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;“We can scale each VM or we can add more VMs to a Manticore cluster.”&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Day-to-Day Operations
&lt;/h2&gt;

&lt;p&gt;Operationally, DX1 describes Manticore as low touch and low maintenance.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;“Low touch, low maintenance, most of the time it just runs.”&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;There are no special Azure features involved; the setup is deliberately simple, focused on VMs and predictable operations.&lt;/p&gt;




&lt;h2&gt;
  
  
  Recommendation
&lt;/h2&gt;

&lt;p&gt;DX1 would recommend Manticore Search to other teams looking for a fast and cost-effective search engine.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;“Yes, I would recommend Manticore to anyone looking for a fast, reliable and cost effective search engine.”&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For DX1, the combination of speed, open-source flexibility, and straightforward VM-based deployment on Azure has been a dependable foundation for search at scale.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;DX1’s approach suits teams that want a fast, reliable search engine without turning search infrastructure into a project of its own: run Manticore on straightforward Linux VMs, keep operations simple, and scale predictably. For low-latency typeahead in particular, plan for enough RAM headroom; scaling often starts with memory (scale up) and later expands to adding nodes (scale out) as data and traffic grow.&lt;/p&gt;




&lt;h2&gt;
  
  
  Talk to Us About Migrating to Manticore
&lt;/h2&gt;

&lt;p&gt;If you're considering a migration to Manticore Search and want a quick architecture review (for example, a VM-based setup on Azure), &lt;a href="https://manticoresearch.com/contact/" rel="noopener noreferrer"&gt;get in touch with us&lt;/a&gt;. Share a bit about your dataset size, query patterns, and latency targets, and we will help you validate an approach and plan the next steps.&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>azure</category>
      <category>opensource</category>
      <category>performance</category>
    </item>
    <item>
      <title>Azure AI Search vs Manticore Search</title>
      <dc:creator>Sergey Nikolaev</dc:creator>
      <pubDate>Mon, 16 Feb 2026 11:13:41 +0000</pubDate>
      <link>https://dev.to/sanikolaev/azure-ai-search-vs-manticore-search-15g4</link>
      <guid>https://dev.to/sanikolaev/azure-ai-search-vs-manticore-search-15g4</guid>
      <description>&lt;p&gt;Vector search is great for the “kinda similar” part of search. The annoying part is everything else: exact phrases, filters that &lt;em&gt;must&lt;/em&gt; be respected, typo tolerance, relevance you can explain to a PM, and results that don’t randomly flip because a model sneezed.&lt;/p&gt;

&lt;p&gt;So this isn’t a “vectors vs keywords” post. It’s about the boring, practical combo: &lt;strong&gt;vector + full-text&lt;/strong&gt;, in the same system, with predictable behavior.&lt;/p&gt;

&lt;p&gt;Azure AI Search and Manticore Search can both do hybrid search. But they feel very different to operate day-to-day.&lt;/p&gt;

&lt;p&gt;Azure optimizes for rapid delivery. Manticore optimizes for living with search over time. That difference shows up less in week one — and a lot more in month six.&lt;/p&gt;




&lt;h2&gt;
  
  
  Where “generic search” stops working
&lt;/h2&gt;

&lt;p&gt;There’s a whole class of search problems that look simple until you ship them: document search, knowledge bases, internal tools, anything with “small chunks” (paragraphs/sections) and lots of metadata.&lt;/p&gt;

&lt;p&gt;This is where managed abstractions start to leak, and where defaults stop being your friend.&lt;/p&gt;

&lt;p&gt;You end up caring about stuff like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;strict filters (workspace/project, region, owner/team, timestamps, doc type, visibility, version)
&lt;/li&gt;
&lt;li&gt;phrase/proximity (because wording matters)
&lt;/li&gt;
&lt;li&gt;stable ranking (so results don’t wander around week to week)
&lt;/li&gt;
&lt;li&gt;highlights/snippets that look like actual citations, not “AI vibes”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And yes, semantic/vector search helps — but it doesn’t replace full-text fundamentals or explainability.&lt;/p&gt;

&lt;p&gt;Once you’re in that world (filters, phrases, stable ranking, “why did this rank?”), search stops being a thing you “turn on” and becomes something you’ll tune and debug over time. That’s when the real choice shows up: accept a managed service’s abstraction layer, or run an engine where ranking and execution are more explicit.&lt;/p&gt;

&lt;p&gt;That split maps pretty closely to Azure AI Search vs Manticore.&lt;/p&gt;




&lt;h2&gt;
  
  
  Two ways to solve it
&lt;/h2&gt;

&lt;p&gt;The cleanest way to think about it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Azure AI Search&lt;/strong&gt; is a managed service you rent. You trade control for convenience (and you get Azure-shaped guardrails).
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Manticore Search&lt;/strong&gt; is a search engine you run. You get knobs and dials, plus responsibility for the box it runs on.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If all you need is “good enough search” plus easy integration, Azure is hard to beat. If you need to argue with relevance and win, Manticore is easier to live with.&lt;/p&gt;




&lt;h2&gt;
  
  
  Cost comparison: managed vs self-hosted
&lt;/h2&gt;

&lt;p&gt;This is one of the biggest practical differences between these two approaches, and it often only becomes obvious after a few months in production.&lt;/p&gt;

&lt;h3&gt;
  
  
  Azure AI Search costs
&lt;/h3&gt;

&lt;p&gt;Azure AI Search is billed as a &lt;strong&gt;managed cloud service&lt;/strong&gt;. You provision capacity (replicas and partitions), and you pay for that capacity &lt;strong&gt;per hour&lt;/strong&gt;, whether it’s fully used or not.&lt;/p&gt;

&lt;p&gt;That model has a few practical consequences:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cost scales with &lt;em&gt;provisioned&lt;/em&gt; capacity, not actual query volume.&lt;/li&gt;
&lt;li&gt;High availability and higher throughput multiply costs (replicas × partitions).&lt;/li&gt;
&lt;li&gt;Vector search increases memory pressure, which often pushes you into higher tiers sooner than expected.&lt;/li&gt;
&lt;li&gt;You can’t “scale to zero” — the service costs money as long as it exists.&lt;/li&gt;
&lt;/ul&gt;
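
&lt;p&gt;The “replicas × partitions” point is easier to see with numbers. The Python sketch below multiplies provisioned units by an hourly rate; the rate is a made-up placeholder, not a real Azure price, and the 730-hour month is a common billing approximation that we assume here.&lt;/p&gt;

```python
HOURS_PER_MONTH = 730  # assumed average month length used for estimates


def search_units(replicas, partitions):
    """Provisioned capacity grows as the product of replicas and partitions."""
    return replicas * partitions


def monthly_cost(replicas, partitions, hourly_rate_per_unit):
    """Estimated monthly bill: units x hourly rate x hours, regardless of query volume."""
    return search_units(replicas, partitions) * hourly_rate_per_unit * HOURS_PER_MONTH


HYPOTHETICAL_RATE = 0.10  # placeholder price per unit-hour, NOT an actual Azure rate

single = monthly_cost(1, 1, HYPOTHETICAL_RATE)
scaled = monthly_cost(3, 2, HYPOTHETICAL_RATE)
print(search_units(3, 2))        # 6 units for 3 replicas and 2 partitions
print(round(scaled / single))    # 6 times the single-unit bill
```

&lt;p&gt;Adding replicas for availability and partitions for index size compounds: going from 1×1 to 3×2 multiplies the bill by six, even if query volume stays flat.&lt;/p&gt;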

&lt;p&gt;In real-world setups, teams often start small and then gradually scale up as indexes grow, query volume increases, or latency requirements tighten. Over time, it’s common for Azure AI Search to land in the &lt;strong&gt;hundreds of dollars per month&lt;/strong&gt;, and for more demanding workloads, &lt;strong&gt;four figures per month&lt;/strong&gt; is not unusual.&lt;/p&gt;

&lt;p&gt;None of this is surprising — you’re paying for a fully managed service with SLAs, built-in redundancy, and tight Azure integration. But the important thing is that &lt;strong&gt;cost growth can feel indirect&lt;/strong&gt;: you don’t always see a clear, linear connection between “we changed X” and “the bill went up”.&lt;/p&gt;

&lt;h3&gt;
  
  
  Manticore Search costs
&lt;/h3&gt;

&lt;p&gt;Manticore Search itself is &lt;strong&gt;free and open source&lt;/strong&gt;. There is no licensing cost. What you pay for is infrastructure and operations.&lt;/p&gt;

&lt;p&gt;In practice, that usually means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;one or more VMs (or containers)&lt;/li&gt;
&lt;li&gt;storage&lt;/li&gt;
&lt;li&gt;monitoring and backups&lt;/li&gt;
&lt;li&gt;some ops time&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For many document and knowledge-base workloads, a single modest VM is enough. That often puts the monthly infrastructure cost in the &lt;strong&gt;tens of dollars&lt;/strong&gt;, not hundreds. Even with redundancy or horizontal scaling, costs tend to grow in a &lt;strong&gt;predictable, hardware-shaped way&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The key difference is visibility: if costs increase with Manticore, it’s usually because you explicitly added RAM, CPU, or machines. There’s no opaque service unit math in the middle.&lt;/p&gt;

&lt;h3&gt;
  
  
  The tipping point
&lt;/h3&gt;

&lt;p&gt;If your priority is minimal operational effort and deep Azure-native integration, Azure AI Search’s pricing can be a reasonable trade-off.&lt;/p&gt;

&lt;p&gt;If your priority is &lt;strong&gt;predictable long-term cost&lt;/strong&gt;, &lt;strong&gt;clear performance knobs&lt;/strong&gt;, and avoiding surprise bills as data grows, running Manticore yourself often ends up significantly cheaper — especially once vector search is in the mix.&lt;/p&gt;




&lt;h2&gt;
  
  
  Azure AI Search: works fast, gets fuzzy later
&lt;/h2&gt;

&lt;p&gt;Azure has a solid keyword engine: analyzers, stemming, phrases/proximity, synonym maps, scoring profiles, filters, and an optional semantic ranking layer. You can ship something quickly.&lt;/p&gt;

&lt;p&gt;Where it starts to sting is when you’re past the demo and now you’re maintaining it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tuning is mostly “adjust these weights” (scoring profiles), plus maybe semantic ranking.
&lt;/li&gt;
&lt;li&gt;The scoring is less transparent end-to-end than Manticore. You can tune with scoring profiles/analyzers and measure results, but you don’t get the same “read the query, understand the ranking” feeling.
&lt;/li&gt;
&lt;li&gt;Hybrid merging is managed; keyword and vector results are fused with Reciprocal Rank Fusion (RRF). It’s convenient. It’s also harder to inspect when the top 10 looks wrong.&lt;/li&gt;
&lt;/ul&gt;
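
&lt;p&gt;For readers who have not met Reciprocal Rank Fusion, here is a minimal Python sketch of the general technique: each document earns &lt;code&gt;1 / (k + rank)&lt;/code&gt; from every list it appears in, and the sums decide the final order. The constant &lt;code&gt;k=60&lt;/code&gt; is the value commonly quoted for RRF and is an assumption here; the sketch makes no claim about the exact parameters or implementation Azure uses.&lt;/p&gt;

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranked list.

    Each document scores 1 / (k + rank) per list it appears in (rank starts at 1).
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id so the output is deterministic
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_hits = ["a", "b", "c"]  # hypothetical full-text ranking
vector_hits = ["c", "a", "d"]   # hypothetical vector ranking
print(reciprocal_rank_fusion([keyword_hits, vector_hits]))  # ['a', 'c', 'b', 'd']
```

&lt;p&gt;Note that only rank positions enter the formula: the original keyword and vector scores are discarded, which is part of why a surprising top 10 can be hard to trace back to a cause.&lt;/p&gt;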

&lt;p&gt;In practice, this often turns into compensating logic in the application layer: boosting, filtering, or post-processing results because the search engine won’t quite do what you need. That logic is harder to test, harder to explain, and harder to remove later.&lt;/p&gt;

&lt;p&gt;If you’ve never had to explain “why is this #1?” to someone who’s mad, this is fine. If you have, you already know the pain.&lt;/p&gt;

&lt;p&gt;Also: the whole thing is defined in service terms — schema JSON, Azure APIs, Azure limits. The practical downside is vendor lock-in: once your indexing model, analyzers, scoring profiles, and query patterns are Azure-shaped, moving later is real work.&lt;/p&gt;




&lt;h2&gt;
  
  
  Manticore Search: explicit, and honestly… nicer to debug
&lt;/h2&gt;

&lt;p&gt;Manticore comes from the classic IR world and keeps full-text features very upfront: BM25-style scoring, field-level matching, phrases/proximity, filtering, and a SQL-ish query language. You can look at a query and tell what it will do. And more importantly: you can explain it to someone who doesn’t work on search.&lt;/p&gt;

&lt;h4&gt;
  
  
  Example: a “normal” full-text query
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;doc_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;WEIGHT&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'"data retention policy"~3'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Finance'&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;effective_date&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="s1"&gt;'2022-01-01'&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;WEIGHT&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



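&lt;p&gt;That &lt;code&gt;WEIGHT()&lt;/code&gt; bit is not magic: it is the relevance score the ranker computed for each match, and you can select it, sort by it, and inspect it like any other column. That visibility matters more than people think.&lt;/p&gt;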

&lt;h4&gt;
  
  
  Hybrid search without guesswork
&lt;/h4&gt;

&lt;p&gt;With Azure, hybrid is “run both, fuse with RRF”. With Manticore, hybrid is “do this, then this, then filter and apply secondary ordering”. It’s less elegant on a slide, but it’s very practical.&lt;/p&gt;

&lt;p&gt;Example (vector first, then text, then explicit sorting):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;doc_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;knn_dist&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;WEIGHT&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'how to rotate api keys safely'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'"key rotation" | "rotate keys"'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;knn_dist&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;ASC&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;WEIGHT&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can read this query out loud and it doesn’t sound like a prayer. That’s the point.&lt;/p&gt;

&lt;p&gt;One nuance: KNN results are primarily ordered by vector distance; additional &lt;code&gt;ORDER BY&lt;/code&gt; criteria refine within that KNN set (think: tie-breaks / secondary sorting), rather than “fully fusing” scores into a single blended rank.&lt;/p&gt;
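
&lt;p&gt;The tie-break behavior described here is plain multi-key sorting. The Python sketch below mimics &lt;code&gt;ORDER BY knn_dist() ASC, WEIGHT() DESC&lt;/code&gt; on a few made-up rows; the distances and weights are invented for illustration, and the sketch models only the ordering, not Manticore's query execution.&lt;/p&gt;

```python
# Hypothetical result rows: smaller knn_dist is closer, larger weight is a better text match
rows = [
    {"doc_id": 1, "knn_dist": 0.12, "weight": 1500},
    {"doc_id": 2, "knn_dist": 0.12, "weight": 2100},
    {"doc_id": 3, "knn_dist": 0.08, "weight": 900},
    {"doc_id": 4, "knn_dist": 0.30, "weight": 3000},
]


def order_hybrid(rows):
    """Sort by vector distance ascending, then by text weight descending."""
    return sorted(rows, key=lambda row: (row["knn_dist"], -row["weight"]))


print([row["doc_id"] for row in order_hybrid(rows)])  # [3, 2, 1, 4]
```

&lt;p&gt;Document 4 has the strongest text match but still ranks last because its vector distance is the largest; the weight only decides the order between documents 1 and 2, which share the same distance.&lt;/p&gt;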

&lt;p&gt;Also: when ranking changes, you can usually point to the exact part of the query that caused it. Regressions become boring to debug — which is exactly what you want.&lt;/p&gt;




&lt;h3&gt;
  
  
  Storage-first products don’t replace search (they just force a second system)
&lt;/h3&gt;

&lt;p&gt;This comes up a lot on Azure: you pick a document store (Cosmos DB, DocumentDB/Mongo API, etc.) and hope it’ll cover search too.&lt;/p&gt;

&lt;p&gt;It won’t, not for anything chunk-level or relevance-sensitive. You’ll quickly want:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;phrase/proximity&lt;/li&gt;
&lt;li&gt;relevance tuning beyond basic text matching&lt;/li&gt;
&lt;li&gt;better ranking control&lt;/li&gt;
&lt;li&gt;hybrid (vector + keyword) that you can reason about&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So you end up bolting on Azure AI Search anyway, and now you’re maintaining &lt;em&gt;two&lt;/em&gt; separate things: storage + search, plus an indexing pipeline in between. That can be totally fine. Just don’t pretend it’s one system.&lt;/p&gt;




&lt;h3&gt;
  
  
  Freshness and “did my update land?”
&lt;/h3&gt;

&lt;p&gt;In Azure, updates are API calls with their own semantics (and you need to be careful with partial updates). When content changes, vector fields need to be handled explicitly. It’s doable, it’s just… application-work.&lt;/p&gt;

&lt;p&gt;Manticore’s real-time tables behave more like a database: insert/update/delete and the full-text + vector indexes keep up together. If you’re building something like product search or docs search where things change all the time, this feels simpler.&lt;/p&gt;
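&lt;p&gt;A minimal sketch of what that looks like in practice. The &lt;code&gt;docs&lt;/code&gt; table, its columns, and the values are hypothetical, made up for illustration. The one thing worth noticing is that &lt;code&gt;UPDATE&lt;/code&gt; is for attributes, while changing full-text content means replacing the row:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Hypothetical table and values, for illustration only
CREATE TABLE docs(title text, body text, version int);

-- A new row is searchable without a separate indexing step
INSERT INTO docs(id, title, body, version)
VALUES (1, 'Key rotation', 'How to rotate API keys safely', 1);

-- Changing full-text content: replace the whole row
REPLACE INTO docs(id, title, body, version)
VALUES (1, 'Key rotation', 'Rotate API keys on a schedule', 2);

-- Attribute-only change: UPDATE is enough
UPDATE docs SET version = 3 WHERE id = 1;

-- Removal
DELETE FROM docs WHERE id = 1;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;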




&lt;h3&gt;
  
  
  “We want Azure, preferably managed” (fair)
&lt;/h3&gt;

&lt;p&gt;If your company’s default posture is “managed everything”, Azure AI Search fits that worldview. You plug it in, you accept the service model, you move on.&lt;/p&gt;

&lt;p&gt;If you want Manticore with minimal headaches on Azure, the honest pitch is: it’s not managed, but it can be &lt;em&gt;boring&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Azure AI Search works best when search is an infrastructure dependency. Manticore works best when search is a product surface.&lt;/p&gt;

&lt;p&gt;Typical setup that doesn’t turn into a science project:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;keep documents in whatever Azure storage you already trust (Blob, a DB, etc.)&lt;/li&gt;
&lt;li&gt;run Manticore on a single VM (or a small VMSS later) in a VNet&lt;/li&gt;
&lt;li&gt;keep chunking/segmentation + indexing as a simple worker pipeline (queue + worker, or whatever you already have)&lt;/li&gt;
&lt;li&gt;treat search as a stateless-ish service: snapshots/backups, metrics, and replacement, not “pet servers”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you’re in a compliance-heavy environment, this is also the boring win: private networking, predictable data flow, and no “please open the internet so the managed thing can talk to the other managed thing” dance.&lt;/p&gt;

&lt;p&gt;You still own it, but you’re not forced into a complex cluster if you don’t need one.&lt;/p&gt;




&lt;h3&gt;
  
  
  A practical note: why vector search changes the bill
&lt;/h3&gt;

&lt;p&gt;Hybrid search isn’t just “keyword + vector”. It’s also &lt;strong&gt;CPU + RAM&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Keyword-heavy workloads mostly burn CPU.&lt;/li&gt;
&lt;li&gt;Vector-heavy workloads mostly burn memory.&lt;/li&gt;
&lt;li&gt;Chunk-level indexing often multiplies both: more rows, more vectors, more metadata, more filters.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;On Azure AI Search, that usually shows up as “we need more capacity” (and the bill follows the provisioned units).&lt;br&gt;
On Manticore, it usually shows up as “we need more RAM/CPU” (and you choose the VM size).&lt;/p&gt;

&lt;p&gt;Same physics — different pricing model.&lt;/p&gt;


&lt;h3&gt;
  
  
  Quick note on “maybe Elastic then?”
&lt;/h3&gt;

&lt;p&gt;Elastic is capable, and on Azure it’s a familiar choice. The trade is usually operational: more moving pieces, more knobs, more “cluster care and feeding”.&lt;/p&gt;

&lt;p&gt;If you already run it well, cool. If you don’t, and all you want is chunk-level document search that behaves, it can feel like bringing a whole orchestra because you need a violin.&lt;/p&gt;


&lt;h3&gt;
  
  
  Developer experience (how it feels in the editor)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Area&lt;/th&gt;
&lt;th&gt;Azure AI Search&lt;/th&gt;
&lt;th&gt;Manticore Search&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Query style&lt;/td&gt;
&lt;td&gt;REST + JSON&lt;/td&gt;
&lt;td&gt;SQL + HTTP JSON&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Full-text logic&lt;/td&gt;
&lt;td&gt;Service-defined&lt;/td&gt;
&lt;td&gt;Explicit, query-level&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vector + text&lt;/td&gt;
&lt;td&gt;Managed fusion&lt;/td&gt;
&lt;td&gt;Explicit composition&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Debugging relevance&lt;/td&gt;
&lt;td&gt;Indirect&lt;/td&gt;
&lt;td&gt;Direct&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Portability&lt;/td&gt;
&lt;td&gt;Azure-only&lt;/td&gt;
&lt;td&gt;Any environment&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Transparency&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;


&lt;h3&gt;
  
  
  A concrete example: clause-level document search
&lt;/h3&gt;

&lt;p&gt;If you want a stress test that exposes search tradeoffs quickly, this is it: split documents into clauses/sections, index those chunks, then ask people to find &lt;em&gt;specific language&lt;/em&gt; under strict filters.&lt;/p&gt;

&lt;p&gt;Why it’s unforgiving:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Phrase/proximity really matters. A system that’s merely “similar” will surface lookalikes that waste time.&lt;/li&gt;
&lt;li&gt;Filters are not optional. Users will treat them as hard constraints, and they’ll notice when “almost matching” sneaks in.&lt;/li&gt;
&lt;li&gt;Trust is fragile. If people can’t tell why a result is #1, they stop trusting search and start working around it.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What it usually turns into (roughly):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;each clause becomes a row/document with &lt;code&gt;clause_id&lt;/code&gt;, &lt;code&gt;document_id&lt;/code&gt;, &lt;code&gt;clause_path&lt;/code&gt; (or whatever naming), and the clause text&lt;/li&gt;
&lt;li&gt;metadata fields become hard filters (workspace/matter, jurisdiction/region, dates, version, visibility, etc.)&lt;/li&gt;
&lt;li&gt;optional: an embedding per clause for the “find me similar language” part&lt;/li&gt;
&lt;li&gt;UI pulls the full document separately and shows the clause with a snippet/highlight that’s easy to verify&lt;/li&gt;
&lt;/ul&gt;
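&lt;p&gt;As a rough sketch, the table behind that list could look like the one below. Everything here is an assumption for illustration: the column names, the &lt;code&gt;content&lt;/code&gt; field, the similarity setting, and the embedding model are placeholders, and dates and other metadata are left out because their types depend on your data. The auto-embedding options (&lt;code&gt;MODEL_NAME&lt;/code&gt;, &lt;code&gt;FROM&lt;/code&gt;) are what let a query pass plain text to &lt;code&gt;knn()&lt;/code&gt;; check the manual for the exact syntax your version supports.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Hypothetical schema: names, model, and similarity are placeholders
CREATE TABLE clauses (
  clause_id bigint,
  document_id bigint,
  clause_path string,      -- e.g. a section path inside the document
  jurisdiction string,     -- hard filter
  content text,            -- the clause text, full-text indexed
  -- optional: one embedding per clause, generated from the clause text
  embedding FLOAT_VECTOR KNN_TYPE='hnsw' HNSW_SIMILARITY='cosine'
    MODEL_NAME='sentence-transformers/all-MiniLM-L6-v2' FROM='content'
);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;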

&lt;p&gt;Full-text baseline (predictable, easy to explain). Use this when people know the wording they’re hunting for:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;clause_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;document_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;WEIGHT&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;clauses&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'"limitation of liability"~3'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;jurisdiction&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'AU'&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;effective_date&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="s1"&gt;'2022-01-01'&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;WEIGHT&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
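&lt;p&gt;For the “snippet that’s easy to verify” part, one option is to ask for the highlighted text in the same query. A minimal sketch, assuming the clause text is a stored full-text field (the default for &lt;code&gt;text&lt;/code&gt; fields); the &lt;code&gt;snippet&lt;/code&gt; alias is just a name picked for this example:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Same baseline query, plus a highlighted snippet the UI can show as-is
SELECT clause_id, document_id, HIGHLIGHT() AS snippet, WEIGHT()
FROM clauses
WHERE MATCH('"limitation of liability"~3')
  AND jurisdiction = 'AU'
ORDER BY WEIGHT() DESC
LIMIT 10;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;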



&lt;p&gt;Where embeddings fit: they can be great for expanding recall (“find similar language”), but they’re not “thinking”. In practice, semantic search can land in an awkward middle ground: it looks smart at first, then frustrates people because it’s not reliably smart enough.&lt;/p&gt;

&lt;p&gt;So here’s the pattern that tends to behave:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If the user types exact-ish keywords → do pure full-text (BM25/&lt;code&gt;WEIGHT()&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;If the user types an idea (“cap on liability”, “excluded damages”) → use embeddings to pull candidates, then lock it down with full-text + filters.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Hybrid example (semantic candidates → strict text + metadata → distance-first ordering):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;clause_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;document_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;knn_dist&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;WEIGHT&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;clauses&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'cap on liability and excluded damages'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'"limitation of liability" | "consequential damages"'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;jurisdiction&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'AU'&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;knn_dist&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;ASC&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;WEIGHT&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What this actually does:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;knn(...)&lt;/code&gt; picks a candidate set by meaning.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;MATCH(...)&lt;/code&gt; + filters keep it verifiable (you can point at the words on the page).&lt;/li&gt;
&lt;li&gt;Results are primarily sorted by vector distance; &lt;code&gt;WEIGHT()&lt;/code&gt; refines within that KNN set.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you need real reasoning, do it after retrieval (e.g., run an LLM over the top N candidates).&lt;/p&gt;

&lt;p&gt;This is also where RAG-style workflows fit. Manticore’s RAG support is on the &lt;a href="https://roadmap.manticoresearch.com/" rel="noopener noreferrer"&gt;roadmap&lt;/a&gt; (see &lt;a href="https://github.com/manticoresoftware/manticoresearch/issues/2286" rel="noopener noreferrer"&gt;issue #2286&lt;/a&gt;) — and by the time you’re reading this, it might already be shipped.&lt;/p&gt;




&lt;h3&gt;
  
  
  So which one would I pick?
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;If you want “don’t make me run search infra” and you’re already deep in Azure, Azure AI Search is the obvious choice. You’ll move fast.&lt;/li&gt;
&lt;li&gt;If search relevance is a product feature (not a checkbox), and you expect to tune and debug it for months, you’ll probably prefer Manticore. You can be opinionated and precise without fighting managed abstractions.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One slightly unromantic rule: if your team can’t or won’t own search as a system, pick Azure. If your team &lt;em&gt;can&lt;/em&gt; own it, pick the option that lets you see what’s going on — which usually means Manticore is the calmer long-term choice.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>architecture</category>
      <category>azure</category>
      <category>database</category>
    </item>
    <item>
      <title>Inline Stopwords, Exceptions, and Wordforms</title>
      <dc:creator>Sergey Nikolaev</dc:creator>
      <pubDate>Thu, 12 Feb 2026 10:30:16 +0000</pubDate>
      <link>https://dev.to/sanikolaev/inline-stopwords-exceptions-and-wordforms-2fl</link>
      <guid>https://dev.to/sanikolaev/inline-stopwords-exceptions-and-wordforms-2fl</guid>
      <description>&lt;p&gt;Manticore Search &lt;a href="https://dev.to/blog/manticore-search-17-5-1/"&gt;now supports&lt;/a&gt; inline specification of tokenization dictionary settings directly in the &lt;code&gt;CREATE TABLE&lt;/code&gt; statement. This enhancement eliminates the need for external files when configuring stopwords, exceptions, wordforms, and hitless words, making table creation more streamlined and deployment-friendly.&lt;/p&gt;

&lt;h2&gt;
  
  
  New Features
&lt;/h2&gt;

&lt;p&gt;Four new configuration options are now available in &lt;a href="https://manual.manticoresearch.com/Read_this_first#Real-time-mode-vs-plain-mode" rel="noopener noreferrer"&gt;RT mode&lt;/a&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;stopwords_list&lt;/code&gt;&lt;/strong&gt; - Specify stop words directly in the table definition&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;exceptions_list&lt;/code&gt;&lt;/strong&gt; - Define tokenization exceptions inline&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;wordforms_list&lt;/code&gt;&lt;/strong&gt; - Configure word form mappings without external files&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;hitless_words_list&lt;/code&gt;&lt;/strong&gt; - Set hitless words as part of table creation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of these options use semicolon (&lt;code&gt;;&lt;/code&gt;) as a separator between entries, making them easy to use in SQL and HTTP JSON interfaces.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem They Solve
&lt;/h2&gt;

&lt;p&gt;Traditionally, configuring tokenization dictionaries required creating external files that Manticore would read during table creation. While this approach works well in many scenarios, it presents several challenges:&lt;/p&gt;

&lt;h3&gt;
  
  
  File Permission Issues
&lt;/h3&gt;

&lt;p&gt;Web applications running under restricted user accounts often struggle to create files in directories that are both:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Writable by the web server process&lt;/li&gt;
&lt;li&gt;Readable by the Manticore daemon process&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is particularly problematic in shared hosting environments (such as &lt;a href="https://www.virtualmin.com/" rel="noopener noreferrer"&gt;Virtualmin&lt;/a&gt; or similar control panel setups), where user home directories are typically readable only by their owner and system directories have restrictive permissions of their own.&lt;/p&gt;

&lt;h3&gt;
  
  
  Sticky Directory Problems
&lt;/h3&gt;

&lt;p&gt;Using system temporary directories (like &lt;code&gt;/tmp&lt;/code&gt;) introduces another issue: the sticky bit on these directories means a file can be removed only by its owner, the directory's owner, or root, so a process running as a different user cannot clean up stopword files it did not create. When indexes are frequently rebuilt, orphaned files can accumulate, consuming disk space and creating maintenance headaches.&lt;/p&gt;

&lt;h3&gt;
  
  
  File Lifecycle Management
&lt;/h3&gt;

&lt;p&gt;When tables are frequently created and destroyed, managing the associated tokenization dictionary files becomes cumbersome. Developers must:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create the file before table creation&lt;/li&gt;
&lt;li&gt;Ensure the file is readable by Manticore&lt;/li&gt;
&lt;li&gt;Remember to clean up the file when the table is dropped&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This manual process is error-prone and can lead to file system clutter.&lt;/p&gt;

&lt;h3&gt;
  
  
  The New Options
&lt;/h3&gt;

&lt;p&gt;The new &lt;code&gt;*_list&lt;/code&gt; options let you specify tokenization dictionary settings directly in the &lt;code&gt;CREATE TABLE&lt;/code&gt; statement. With external files, &lt;code&gt;SHOW CREATE TABLE&lt;/code&gt; shows file paths and you maintain the dictionary content in separate files. With the inline options, you never create or reference external paths: the dictionary content lives in the DDL (internally it still ends up as files in the table directory, same as with file paths), and &lt;code&gt;SHOW CREATE TABLE&lt;/code&gt; shows the full dictionary settings inline (e.g., &lt;code&gt;stopwords_list = 'a; the; an'&lt;/code&gt;). The table definition is therefore self-contained in one statement, which makes it easier to version control, to copy or share, and to move between environments.&lt;/p&gt;

&lt;h2&gt;
  
  
  Usage Examples
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Stopwords
&lt;/h3&gt;

&lt;p&gt;Instead of creating a stopwords file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Old way (requires external file)&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;span class="n"&gt;stopwords&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'/usr/local/manticore/data/stopwords.txt'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can now specify stopwords inline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- New way (no external file needed)&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;span class="n"&gt;stopwords_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'a; the; an; and; or; but'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Exceptions
&lt;/h3&gt;

&lt;p&gt;Exceptions (synonyms) can be defined inline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;span class="n"&gt;exceptions_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'AT&amp;amp;T =&amp;gt; ATT; MS Windows =&amp;gt; ms windows; C++ =&amp;gt; cplusplus'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Wordforms
&lt;/h3&gt;

&lt;p&gt;Word form mappings can be specified directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;span class="n"&gt;wordforms_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'walks &amp;gt; walk; walked &amp;gt; walk; walking &amp;gt; walk'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Hitless Words
&lt;/h3&gt;

&lt;p&gt;Hitless words can be configured inline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;span class="n"&gt;hitless_words_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'hello; world; test'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Combining Multiple Options
&lt;/h3&gt;

&lt;p&gt;You can combine all these options in a single &lt;code&gt;CREATE TABLE&lt;/code&gt; statement:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;span class="n"&gt;stopwords_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'a; the; an'&lt;/span&gt; 
&lt;span class="n"&gt;exceptions_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'AT&amp;amp;T =&amp;gt; ATT'&lt;/span&gt; 
&lt;span class="n"&gt;wordforms_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'walks &amp;gt; walk; walked &amp;gt; walk'&lt;/span&gt; 
&lt;span class="n"&gt;hitless_words_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'hello; world'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  When to Use Inline Configuration
&lt;/h2&gt;

&lt;p&gt;Inline configuration is ideal when:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Small to Medium Lists&lt;/strong&gt;: The lists are reasonably sized (typically under a few hundred entries). For very large dictionaries, external files may still be more practical.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dynamic Table Creation&lt;/strong&gt;: Your application programmatically creates and destroys tables, making file management cumbersome.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Restricted File System Access&lt;/strong&gt;: You're running in an environment with limited file system permissions (shared hosting, containers, etc.).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simplified Deployment&lt;/strong&gt;: You want to avoid managing additional files as part of your deployment process.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Frequent Index Rebuilding&lt;/strong&gt;: Tables are frequently recreated, making file cleanup a maintenance burden.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  When External Files Are Better
&lt;/h2&gt;

&lt;p&gt;While inline configuration is convenient, external files remain the better choice in these scenarios:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Large Dictionaries&lt;/strong&gt;: When you have thousands of entries, external files are more manageable and don't bloat your &lt;code&gt;CREATE TABLE&lt;/code&gt; statements.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Shared Dictionaries&lt;/strong&gt;: If the same dictionary is used across multiple tables, an external file allows you to define it once and reference it from multiple tables, reducing duplication.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Version Control&lt;/strong&gt;: External files can be easily tracked in version control systems, making it easier to review changes and maintain history.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dynamic Updates&lt;/strong&gt;: If you need to update dictionaries without recreating tables, you can modify the external files and then run &lt;code&gt;ALTER TABLE &amp;lt;table_name&amp;gt; RECONFIGURE&lt;/code&gt; to apply the changes. For RT tables, this makes the new tokenization settings take effect for new documents (existing documents remain unchanged). For plain tables, rotation is required to pick up changes from modified dictionary files.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Complex Formatting&lt;/strong&gt;: Very complex wordform or exception rules may be easier to edit in a dedicated file with proper formatting and comments.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Legacy Systems&lt;/strong&gt;: If you already have well-maintained external dictionary files, there's no need to migrate unless you're facing the specific problems that inline configuration solves.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Format Details
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Separator
&lt;/h3&gt;

&lt;p&gt;All &lt;code&gt;*_list&lt;/code&gt; options use semicolons (&lt;code&gt;;&lt;/code&gt;) to separate entries. Spaces around semicolons are normalized, so &lt;code&gt;'word1; word2'&lt;/code&gt; and &lt;code&gt;'word1 ; word2'&lt;/code&gt; are equivalent.&lt;/p&gt;

&lt;h3&gt;
  
  
  Escaping
&lt;/h3&gt;

&lt;p&gt;If you need to use a semicolon as part of the value itself (not as a separator), escape it with a backslash: &lt;code&gt;\;&lt;/code&gt;. For example, if you want to map a source form that contains a semicolon:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;exceptions_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'test&lt;/span&gt;&lt;span class="se"&gt;\;&lt;/span&gt;&lt;span class="s1"&gt;value =&amp;gt; testvalue; another =&amp;gt; mapping'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates two mappings:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;test;value&lt;/code&gt; (with a semicolon) → &lt;code&gt;testvalue&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;another&lt;/code&gt; → &lt;code&gt;mapping&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The escaped semicolon (&lt;code&gt;\;&lt;/code&gt;) is treated as a literal semicolon character, not as a separator between entries.&lt;/p&gt;

&lt;h3&gt;
  
  
  Wordforms Format
&lt;/h3&gt;

&lt;p&gt;Wordforms support both &lt;code&gt;&amp;gt;&lt;/code&gt; and &lt;code&gt;=&amp;gt;&lt;/code&gt; as separators:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;wordforms_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'word1 &amp;gt; form1; word2 =&amp;gt; form2'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Exceptions Format
&lt;/h3&gt;

&lt;p&gt;Exceptions use &lt;code&gt;=&amp;gt;&lt;/code&gt; as the separator between source and destination forms:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;exceptions_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'source form =&amp;gt; destination form'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: When using &lt;code&gt;exceptions_list&lt;/code&gt;, you may see warnings in the searchd log about &lt;code&gt;mapping token (=&amp;gt;) not found&lt;/code&gt; in temporary exception files. These warnings are harmless and can be safely ignored—the exceptions function correctly despite these messages. The warnings occur during internal file processing and don't affect the actual exception mapping behavior.&lt;/p&gt;

&lt;h2&gt;
  
  
  Example: Stopwords, Wordforms, and Exceptions Together
&lt;/h2&gt;

&lt;p&gt;Here's a practical example using inline stopwords, wordforms, and exceptions on a single table. Wordforms normalize variants to a single form (e.g. "learning" → "learn"); exceptions map shorthand to a normalized form (e.g. "JS" → "javascript") so that both "JS" and "JavaScript" match the same documents. Use lowercase in the exception destination so it matches the token form produced by &lt;code&gt;charset_table&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Create a table with inline stopwords, wordforms, and exceptions&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;bigint&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;stopwords_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'a; the; an; and; or; but; in; on; at; to; for; of; with'&lt;/span&gt;
&lt;span class="n"&gt;wordforms_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'learning &amp;gt; learn; programming &amp;gt; program; reference &amp;gt; refer; introduction &amp;gt; intro; complete &amp;gt; complet; basics &amp;gt; basic'&lt;/span&gt;
&lt;span class="n"&gt;exceptions_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'JS =&amp;gt; javascript; ML =&amp;gt; machine learning'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Insert test data&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'The Quick Guide to Python Programming'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'A Complete Reference for JavaScript'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'An Introduction to Machine Learning'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Python Programming Basics'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Getting Started with JS'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
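&lt;p&gt;Before running searches, it can help to look at what the tokenizer does with these settings. One way is &lt;code&gt;CALL KEYWORDS&lt;/code&gt;, which reports how a piece of text is tokenized and normalized for a given table. This is only a sketch: the input string is made up, and the output is not shown because it depends on your version and settings.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Inspect tokenization for the articles table (made-up input text)
-- Compare the tokenized and normalized columns in the result
CALL KEYWORDS('The Introduction to Learning JS', 'articles');
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;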



&lt;p&gt;&lt;strong&gt;Stopwords:&lt;/strong&gt; queries with or without stopwords match the same documents.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'python'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;id&lt;/th&gt;
&lt;th&gt;title&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;The Quick Guide to Python Programming&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Python Programming Basics&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'the python'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;id&lt;/th&gt;
&lt;th&gt;title&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;The Quick Guide to Python Programming&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Python Programming Basics&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Phrase search:&lt;/strong&gt; stopwords are skipped for matching but still affect positions (tunable with &lt;a href="https://manual.manticoresearch.com/Creating_a_table/NLP_and_tokenization/Ignoring_stop-words#stopword_step" rel="noopener noreferrer"&gt;stopword_step&lt;/a&gt;).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'"the quick"'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;id&lt;/th&gt;
&lt;th&gt;title&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;The Quick Guide to Python Programming&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Wordforms:&lt;/strong&gt; "learn" matches "Learning" via the wordform.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'learn'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;id&lt;/th&gt;
&lt;th&gt;title&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;An Introduction to Machine Learning&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Exceptions:&lt;/strong&gt; the mapping &lt;code&gt;JS =&amp;gt; javascript&lt;/code&gt; normalizes "JS" to "javascript" when it appears in text or in the query. Because the destination is lowercase, it matches the token form that charset_table produces for "JavaScript", so both &lt;code&gt;MATCH('JavaScript')&lt;/code&gt; and &lt;code&gt;MATCH('JS')&lt;/code&gt; return the same rows.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'JavaScript'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;id&lt;/th&gt;
&lt;th&gt;title&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;A Complete Reference for JavaScript&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Getting Started with JS&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'JS'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;id&lt;/th&gt;
&lt;th&gt;title&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;A Complete Reference for JavaScript&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Getting Started with JS&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Benefits Summary
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;No File Management&lt;/strong&gt;: Eliminates the need to create, manage, and clean up external files&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simplified Deployment&lt;/strong&gt;: Configuration is part of the table definition, making deployments more straightforward&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Permission Independence&lt;/strong&gt;: No file system permission issues between web server and Manticore processes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Better for Automation&lt;/strong&gt;: Easier to script and automate table creation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Self-Contained and Self-Documenting&lt;/strong&gt;: Table configuration is complete in the &lt;code&gt;CREATE TABLE&lt;/code&gt; statement, and &lt;code&gt;SHOW CREATE TABLE&lt;/code&gt; shows the full dictionary content inline, so definitions are easy to share and version control without managing separate dictionary files&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Migration Path
&lt;/h2&gt;

&lt;p&gt;If you're currently using external files, you can easily migrate to inline configuration:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Read your existing file content&lt;/li&gt;
&lt;li&gt;Join the entries into a single string, using semicolons instead of line breaks as separators&lt;/li&gt;
&lt;li&gt;Replace the file path with the &lt;code&gt;*_list&lt;/code&gt; option in your &lt;code&gt;CREATE TABLE&lt;/code&gt; statement&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For example, if you have a &lt;code&gt;stopwords.txt&lt;/code&gt; file containing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;a
the
an
and
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can convert it to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;stopwords_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'a; the; an; and'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
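
&lt;p&gt;As a minimal sketch of step 3, the converted list can sit directly among the table options. The table name &lt;code&gt;articles_inline&lt;/code&gt; and its single &lt;code&gt;title&lt;/code&gt; field are assumptions for illustration; substitute your own schema.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Hypothetical table; only the stopwords_list value comes from the example above
CREATE TABLE articles_inline (title TEXT)
  stopwords_list = 'a; the; an; and';

-- Confirm that the list is stored inline with the table definition
SHOW CREATE TABLE articles_inline;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;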



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The new inline tokenization dictionary configuration options (&lt;code&gt;stopwords_list&lt;/code&gt;, &lt;code&gt;exceptions_list&lt;/code&gt;, &lt;code&gt;wordforms_list&lt;/code&gt;, and &lt;code&gt;hitless_words_list&lt;/code&gt;) provide a cleaner, more maintainable way to configure tokenization settings. They're particularly valuable in environments where file management is challenging or when you want to simplify your deployment process and keep table definitions self-contained. While external files remain supported for large dictionaries, inline configuration offers a convenient alternative for most use cases.&lt;/p&gt;

</description>
      <category>backend</category>
      <category>data</category>
      <category>database</category>
      <category>sql</category>
    </item>
    <item>
      <title>Manticore Search 17.5.1</title>
      <dc:creator>Sergey Nikolaev</dc:creator>
      <pubDate>Tue, 10 Feb 2026 06:03:44 +0000</pubDate>
      <link>https://dev.to/sanikolaev/manticore-search-1751-cbj</link>
      <guid>https://dev.to/sanikolaev/manticore-search-1751-cbj</guid>
      <description>&lt;p&gt;&lt;a href="https://manticoresearch.com/install-17.5.1/" rel="noopener noreferrer"&gt;Manticore Search 17.5.1&lt;/a&gt; has been released. This maintenance release includes bug fixes, minor improvements, and updated recommended library versions.&lt;/p&gt;




&lt;h2&gt;
  
  
  Breaking Changes
&lt;/h2&gt;

&lt;p&gt;Please review these if you are upgrading from older versions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MCL 10.0.0: Added support for &lt;code&gt;DROP CACHE&lt;/code&gt;. This updates the interface between the daemon and MCL. Older Manticore Search versions don't support the newer MCL. (&lt;a href="https://github.com/manticoresoftware/manticoresearch/issues/4120" rel="noopener noreferrer"&gt;Issue #4120&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Percolate query JSON responses now return hit &lt;code&gt;_id&lt;/code&gt; and &lt;code&gt;_score&lt;/code&gt; as numbers instead of strings, so they now match regular search; this is a breaking change for clients that relied on string type for these fields. (&lt;a href="https://github.com/manticoresoftware/manticoresearch/issues/4019" rel="noopener noreferrer"&gt;Issue #4019&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Recommended Versions
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;MCL (Manticore Columnar Library)&lt;/strong&gt;: 10.2.0&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Manticore Buddy&lt;/strong&gt;: 3.41.0&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you follow the &lt;a href="https://manticoresearch.com/install-17.5.1/" rel="noopener noreferrer"&gt;official installation guide&lt;/a&gt;, you don't need to worry about this as the correct versions will be installed automatically.&lt;/p&gt;




&lt;h2&gt;
  
  
  New Features and Improvements
&lt;/h2&gt;

&lt;p&gt;Highlights in this release:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The updated &lt;a href="https://github.com/manticoresoftware/columnar" rel="noopener noreferrer"&gt;MCL&lt;/a&gt; adds support for Llama, Qwen, Mistral, Gemma, and &lt;a href="https://manual.manticoresearch.com/Searching/KNN#Creating-a-table-with-auto-embeddings" rel="noopener noreferrer"&gt;other models&lt;/a&gt; for auto-embeddings.&lt;/li&gt;
&lt;li&gt;Jieba morphology instances are now shared across tables with the same configuration, greatly reducing memory use when many tables use Jieba.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://manual.manticoresearch.com/Creating_a_table/NLP_and_tokenization/Ignoring_stop-words#stopwords_list" rel="noopener noreferrer"&gt;stopwords&lt;/a&gt;, &lt;a href="https://manual.manticoresearch.com/Creating_a_table/NLP_and_tokenization/Wordforms#wordforms_list" rel="noopener noreferrer"&gt;wordforms&lt;/a&gt;, &lt;a href="https://manual.manticoresearch.com/Creating_a_table/NLP_and_tokenization/Exceptions#exceptions_list" rel="noopener noreferrer"&gt;exceptions&lt;/a&gt;, and &lt;a href="https://manual.manticoresearch.com/Creating_a_table/NLP_and_tokenization/Low-level_tokenization#hitless_words_list" rel="noopener noreferrer"&gt;hitless_words&lt;/a&gt; can now be set inline in &lt;code&gt;CREATE TABLE&lt;/code&gt;, so tables can be created without external files.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Bug Fixes
&lt;/h2&gt;

&lt;p&gt;Notable fixes in this release:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fixed JOIN results returning empty or duplicated values when a column was both a string attribute and a stored field; the attribute value is now returned correctly. (&lt;a href="https://github.com/manticoresoftware/manticoresearch/issues/3498" rel="noopener noreferrer"&gt;Issue #3498&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Fixed joins on JSON string attributes (e.g. &lt;code&gt;j.s&lt;/code&gt;) returning no matches; they now work like joins on plain string attributes. (&lt;a href="https://github.com/manticoresoftware/manticoresearch/issues/2559" rel="noopener noreferrer"&gt;Issue #2559&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Fixed &lt;code&gt;highlight()&lt;/code&gt; with &lt;code&gt;html_strip_mode=strip&lt;/code&gt; corrupting content by decoding entities and altering tags; original entity form is now preserved. (&lt;a href="https://github.com/manticoresoftware/manticoresearch/issues/1737" rel="noopener noreferrer"&gt;Issue #1737&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Fixed &lt;code&gt;ALTER TABLE REBUILD SECONDARY&lt;/code&gt; failing with &lt;code&gt;failed to rename ... .tmp.spjidx&lt;/code&gt; when the table had multiple disk chunks. (&lt;a href="https://github.com/manticoresoftware/manticoresearch/issues/3203" rel="noopener noreferrer"&gt;Issue #3203&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Fixed distributed queries returning stored fields from the wrong local index when agent tables contain duplicate document IDs. (&lt;a href="https://github.com/manticoresoftware/manticoresearch/issues/4148" rel="noopener noreferrer"&gt;Issue #4148&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Fixed table rename breaking tables that use external stopwords, wordforms, or exceptions: &lt;code&gt;ATTACH TABLE&lt;/code&gt; now migrates these files properly. (&lt;a href="https://github.com/manticoresoftware/manticoresearch/issues/4176" rel="noopener noreferrer"&gt;Issue #4176&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Fixed MATCH with OR over the same phrase in different fields returning matches from other fields. (&lt;a href="https://github.com/manticoresoftware/manticoresearch/issues/4128" rel="noopener noreferrer"&gt;Issue #4128&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Fixed &lt;code&gt;ALTER TABLE&lt;/code&gt; with table-level settings failing on tables with auto-embeddings; serialization now omits &lt;code&gt;knn_dims&lt;/code&gt; when &lt;code&gt;model_name&lt;/code&gt; is set. (&lt;a href="https://github.com/manticoresoftware/manticoresearch/issues/4131" rel="noopener noreferrer"&gt;Issue #4131&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;...and many more (47 bug fixes in total). For the complete list, see the &lt;a href="https://manual.manticoresearch.com/Changelog#Version-17.5.1" rel="noopener noreferrer"&gt;Changelog&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Compatibility
&lt;/h2&gt;

&lt;p&gt;Manticore Search 17.5.1 remains &lt;strong&gt;backward compatible&lt;/strong&gt; with existing data and queries, apart from the two breaking changes listed above.&lt;br&gt;
To upgrade, follow the &lt;a href="https://manticoresearch.com/install-17.5.1/" rel="noopener noreferrer"&gt;installation guide&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Need help or want to connect?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Join our &lt;a href="https://slack.manticoresearch.com" rel="noopener noreferrer"&gt;Slack&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Visit the &lt;a href="https://forum.manticoresearch.com" rel="noopener noreferrer"&gt;Forum&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Report issues or suggest features on &lt;a href="https://github.com/manticoresoftware/manticoresearch/issues" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Email us at &lt;code&gt;contact@manticoresearch.com&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For full details, see the &lt;a href="https://manual.manticoresearch.com/Changelog#Version-17.5.1" rel="noopener noreferrer"&gt;Changelog&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>backend</category>
      <category>database</category>
      <category>news</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Manticore Search 14.1.0: Force Bigrams and Bug Fixes</title>
      <dc:creator>Sergey Nikolaev</dc:creator>
      <pubDate>Wed, 12 Nov 2025 09:50:35 +0000</pubDate>
      <link>https://dev.to/sanikolaev/manticore-search-1410-force-bigrams-and-bug-fixes-k3m</link>
      <guid>https://dev.to/sanikolaev/manticore-search-1410-force-bigrams-and-bug-fixes-k3m</guid>
      <description>&lt;p&gt;We're pleased to announce &lt;a href="https://manticoresearch.com/install/" rel="noopener noreferrer"&gt;Manticore Search 14.1.0&lt;/a&gt;, a release that includes our work for October 2025. This update adds the &lt;code&gt;force_bigrams&lt;/code&gt; option for spell correction, replication progress tracking, and various bug fixes.&lt;/p&gt;

&lt;p&gt;❤️ &lt;strong&gt;Special thanks&lt;/strong&gt; to &lt;a href="https://github.com/ricardopintottrdata" rel="noopener noreferrer"&gt;@ricardopintottrdata&lt;/a&gt; for their contributions on HAVING total counts and filter error fixes, and to &lt;a href="https://github.com/jdelStrother" rel="noopener noreferrer"&gt;@jdelStrother&lt;/a&gt; for improving CJK segmentation handling when Jieba support isn't available.&lt;/p&gt;




&lt;h2&gt;
  
  
  ⚠️ Important Replication Update
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Version 14.0.0&lt;/strong&gt; updated the replication protocol. If you're running a replication cluster, you need to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Cleanly stop all your nodes&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Start the node that was stopped last&lt;/strong&gt; with &lt;code&gt;--new-cluster&lt;/code&gt; (on Linux, use the &lt;code&gt;manticore_new_cluster&lt;/code&gt; tool)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Read about &lt;a href="https://manual.manticoresearch.com/Creating_a_cluster/Setting_up_replication/Restarting_a_cluster#Restarting-a-cluster" rel="noopener noreferrer"&gt;restarting a cluster&lt;/a&gt;&lt;/strong&gt; for more details&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  New Features and Improvements
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Force Bigrams Option
&lt;/h3&gt;

&lt;p&gt;Added a &lt;code&gt;force_bigrams&lt;/code&gt; option for &lt;a href="https://manual.manticoresearch.com/Searching/Spell_correction#Using-force_bigrams-for-better-transposition-handling" rel="noopener noreferrer"&gt;fuzzy search&lt;/a&gt; and &lt;a href="https://manual.manticoresearch.com/Searching/Autocomplete#Using-force_bigrams-for-better-transposition-handling" rel="noopener noreferrer"&gt;autocomplete&lt;/a&gt;. It improves spell correction for short words and transposed letters, where trigram matching works less well. For example, when correcting "Geroge" to "George", bigrams can match better than trigrams.&lt;/p&gt;
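
&lt;p&gt;A hedged sketch of how the option might be passed. The table name &lt;code&gt;people&lt;/code&gt; is hypothetical, the table is assumed to be set up for fuzzy search as the manual describes, and the exact option syntax is an assumption to verify against the linked manual sections.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Fuzzy search with bigrams forced (hypothetical table name)
SELECT * FROM people WHERE MATCH('Geroge') OPTION fuzzy=1, force_bigrams=1;

-- Autocomplete with the same option (assumed to follow the usual "value AS option" form)
CALL AUTOCOMPLETE('Geroge', 'people', 1 AS force_bigrams);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;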

&lt;h3&gt;
  
  
  Replication Progress Tracking
&lt;/h3&gt;

&lt;p&gt;Added a &lt;a href="https://manual.manticoresearch.com/Creating_a_cluster/Setting_up_replication/Replication_cluster_status#SST-Progress-Metrics" rel="noopener noreferrer"&gt;progress meter&lt;/a&gt; for donor and joiner nodes in replication SST, visible in &lt;code&gt;SHOW STATUS&lt;/code&gt;. This provides visibility into replication state synchronization progress.&lt;/p&gt;
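
&lt;p&gt;One way to look at these values is to filter the status output. This is a sketch only: the &lt;code&gt;cluster%&lt;/code&gt; pattern is an assumption based on cluster status variables sharing that prefix, and the exact progress counter names are listed in the linked manual section rather than here.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Show only cluster-related status entries on a donor or joiner node
SHOW STATUS LIKE 'cluster%';
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;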

&lt;h3&gt;
  
  
  Additional Improvements
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;LOCK TABLES support&lt;/strong&gt;: Added for mysqldump compatibility&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Buddy updated to 3.37.0&lt;/strong&gt;: Various improvements and stability fixes&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Bug Fixes
&lt;/h2&gt;

&lt;p&gt;This release includes numerous bug fixes across multiple versions leading up to 14.1.0:&lt;/p&gt;

&lt;h3&gt;
  
  
  Critical Fixes
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fixed crash with &lt;code&gt;max(ft field)&lt;/code&gt;&lt;/strong&gt; - Resolved a critical crash when using max functions on full-text fields&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fixed empty filter name error&lt;/strong&gt; - Resolved error when using filters with empty names&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fixed full-text query crashes&lt;/strong&gt; - Addressed crashes caused by specific full-text query patterns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fixed &lt;code&gt;"(abc|def)"&lt;/code&gt; query handling&lt;/strong&gt; - Full-text queries with this pattern now work as expected&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Query and Search Improvements
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fixed HAVING total counts&lt;/strong&gt; - Added the ability to get the total number of results for queries using HAVING&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enhanced CALL SUGGEST&lt;/strong&gt; - SUGGEST can now use bigrams instead of trigrams when needed, improving spell correction for shorter words&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fixed CJK segmentation&lt;/strong&gt; - Improved &lt;code&gt;ParseCJKSegmentation&lt;/code&gt; when Jieba support isn't available&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Added expansion phrase warning&lt;/strong&gt; - New &lt;code&gt;searchd.expansion_phrase_warning&lt;/code&gt; option for better query debugging&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Replication and Clustering
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fixed replication transaction handling&lt;/strong&gt; - Improved key generation and conflict resolution&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  System and Component Updates
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Improved FreeBSD compilation&lt;/strong&gt; - Fixed native FreeBSD build issues&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enhanced Filebeat compatibility&lt;/strong&gt; - Added testing for Filebeat version 9.2&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Better error handling&lt;/strong&gt; - Improved error handling for right-joined JSON queries&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;KNN parameter validation&lt;/strong&gt; - Added proper validation for KNN parameters&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Compatibility
&lt;/h2&gt;

&lt;p&gt;Manticore Search 14.1.0 maintains &lt;strong&gt;strong backward compatibility&lt;/strong&gt;, with one consideration for replication clusters:&lt;/p&gt;

&lt;h3&gt;
  
  
  General Compatibility
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fully compatible&lt;/strong&gt; with existing data and queries&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Replication Cluster Considerations
&lt;/h3&gt;

&lt;p&gt;⚠️ &lt;strong&gt;Important&lt;/strong&gt;: Version 14.0.0 introduced replication protocol changes. If upgrading from pre-14.0.0 versions with replication clusters:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Plan downtime&lt;/strong&gt; for proper cluster restart procedure&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Follow the cluster restart guide&lt;/strong&gt; carefully&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test the upgrade&lt;/strong&gt; in a staging environment first&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;To upgrade, follow the &lt;a href="https://manticoresearch.com/install/" rel="noopener noreferrer"&gt;installation guide&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Need help or want to connect?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Join our &lt;a href="https://slack.manticoresearch.com" rel="noopener noreferrer"&gt;Slack&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Visit the &lt;a href="https://forum.manticoresearch.com" rel="noopener noreferrer"&gt;Forum&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Report issues or suggest features on &lt;a href="https://github.com/manticoresoftware/manticoresearch/issues" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Email us at &lt;code&gt;contact@manticoresearch.com&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For full details, see the &lt;a href="https://manual.manticoresearch.com/Changelog#Version-14.1.0" rel="noopener noreferrer"&gt;Changelog&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>database</category>
      <category>opensource</category>
      <category>tooling</category>
    </item>
    <item>
      <title>Auto Embeddings in Manticore Search: AI-Powered Search Made Simple</title>
      <dc:creator>Sergey Nikolaev</dc:creator>
      <pubDate>Tue, 16 Sep 2025 06:54:10 +0000</pubDate>
      <link>https://dev.to/sanikolaev/auto-embeddings-in-manticore-search-ai-powered-search-made-simple-op2</link>
      <guid>https://dev.to/sanikolaev/auto-embeddings-in-manticore-search-ai-powered-search-made-simple-op2</guid>
      <description>&lt;p&gt;We're excited to share a new feature that makes building semantic search apps as simple as writing SQL: &lt;strong&gt;Auto Embeddings&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
With this addition, Manticore Search takes care of embedding generation for you: no extra pipelines, no external services, no hassle.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Challenge Before
&lt;/h2&gt;

&lt;p&gt;Until now, semantic search often meant wrestling with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Setting up separate ML pipelines for embedding generation
&lt;/li&gt;
&lt;li&gt;Managing models and their dependencies
&lt;/li&gt;
&lt;li&gt;Syncing your app, embedding service, and search engine
&lt;/li&gt;
&lt;li&gt;Handling vector dimension mismatches and preprocessing
&lt;/li&gt;
&lt;li&gt;Making sure embeddings are always generated the same way
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That overhead is now gone.&lt;/p&gt;
&lt;h2&gt;
  
  
  What Are Auto Embeddings?
&lt;/h2&gt;

&lt;p&gt;With Auto Embeddings, you just insert text. Manticore automatically:&lt;/p&gt;

&lt;p&gt;✨ &lt;strong&gt;Generates embeddings&lt;/strong&gt; with state-of-the-art models&lt;br&gt;&lt;br&gt;
✨ &lt;strong&gt;Stores them efficiently&lt;/strong&gt; in vector indexes&lt;br&gt;&lt;br&gt;
✨ &lt;strong&gt;Lets you query in natural language&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
✨ &lt;strong&gt;Hides the complexity&lt;/strong&gt; so you can focus on features, not infrastructure  &lt;/p&gt;
&lt;h2&gt;
  
  
  How It Works
&lt;/h2&gt;

&lt;p&gt;Build a semantic search app in &lt;strong&gt;3 steps&lt;/strong&gt;:&lt;/p&gt;
&lt;h3&gt;
  
  
  1. Create a Table (SQL Example)
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;category&lt;/span&gt; &lt;span class="n"&gt;STRING&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;vector&lt;/span&gt; &lt;span class="n"&gt;FLOAT_VECTOR&lt;/span&gt; &lt;span class="n"&gt;KNN_TYPE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'hnsw'&lt;/span&gt; &lt;span class="n"&gt;HNSW_SIMILARITY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'l2'&lt;/span&gt;
        &lt;span class="n"&gt;MODEL_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'sentence-transformers/all-MiniLM-L6-v2'&lt;/span&gt;
        &lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'title,description'&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


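&lt;p&gt;The &lt;code&gt;vector&lt;/code&gt; column definition is all the configuration needed: Manticore generates embeddings from &lt;code&gt;title&lt;/code&gt; and &lt;code&gt;description&lt;/code&gt; using the model named in &lt;code&gt;MODEL_NAME&lt;/code&gt;.&lt;/p&gt;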
&lt;h3&gt;
  
  
  2. Insert Data (SQL Example)
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'green hiking backpack'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Lightweight backpack suitable for hiking trails'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'outdoors'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5999&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'laptop sleeve'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Slim padded case for 15-inch laptops'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'electronics'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1999&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'travel daypack'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Compact daypack perfect for light travel or hiking'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'luggage'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3999&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'black laptop backpack'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Spacious backpack with padded laptop compartment'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'electronics'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6900&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'mountain hiking bag'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Durable trail-ready backpack for mountain hikes'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'outdoors'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8950&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'everyday backpack'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Versatile backpack for work, gym and school'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'general'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4900&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'trail running shoes'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Lightweight shoes with great grip for trails'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'footwear'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7500&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'camping gear set'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Complete set for weekend camping adventures'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'outdoors'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;12000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'outdoor laptop pack'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Trail-optimized backpack with laptop sleeve'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'outdoors'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7800&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'compact hiking backpack'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Light and foldable backpack for trail hikes'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'outdoors'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4200&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;11&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'portable solar charger'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Foldable solar panel charger for phones and USB devices'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'electronics'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3400&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'reusable water bottle'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Insulated stainless steel bottle keeps drinks cold or hot'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'lifestyle'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2500&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;13&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'noise-cancelling headphones'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Over-ear headphones with noise cancellation'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'electronics'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;13900&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;14&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'organic trail mix'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Healthy mix of nuts and dried fruit, ideal for hikes'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'food'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;899&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'wireless mouse'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Compact wireless mouse for laptops and desktops'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'electronics'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1599&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'office chair'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Ergonomic office chair with lumbar support and mesh back'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'furniture'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;27900&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;17&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'notebook and pen set'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Elegant A5 notebook with smooth-writing pen'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'stationery'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1200&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;18&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'children&lt;/span&gt;&lt;span class="se"&gt;\'&lt;/span&gt;&lt;span class="s1"&gt;s adventure book'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Illustrated storybook about outdoor exploration'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'books'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1299&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;19&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'mini drone'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Lightweight drone with HD camera and remote control'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'gadgets'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4599&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'wooden puzzle box'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Challenging mechanical puzzle made of natural wood'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'toys'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1899&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;This diverse dataset spans outdoors, electronics, furniture, books, toys, and more. Notice: &lt;strong&gt;no vectors needed&lt;/strong&gt;. All embeddings are generated automatically from the text.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Note: Prices are in cents (e.g., 5999 = $59.99).&lt;/em&gt;&lt;/p&gt;
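&lt;p&gt;As a small illustration of the cents convention, the Python sketch below turns a stored integer price into a display string. The helper name &lt;code&gt;format_price&lt;/code&gt; is hypothetical and not part of Manticore; it only shows the arithmetic.&lt;/p&gt;

```python
def format_price(cents):
    """Format an integer price in cents as a dollar string."""
    # divmod keeps everything in integers, so no float rounding is involved.
    dollars, remainder = divmod(cents, 100)
    return f"${dollars}.{remainder:02d}"


print(format_price(5999))   # $59.99
print(format_price(12000))  # $120.00
```

&lt;p&gt;Storing prices as integers keeps range filters exact; formatting happens only when the value is shown to a user.&lt;/p&gt;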
&lt;h3&gt;
  
  
  3. Search with Natural Language (SQL Example)
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;knn_dist&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; 
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'lightweight laptop backpack for trail hiking'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Results:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;+------+-------------------------+--------------------------------------------------+-------+------------+
| id   | title                   | description                                      | price | knn_dist() |
+------+-------------------------+--------------------------------------------------+-------+------------+
|    9 | outdoor laptop pack     | Trail-optimized backpack with laptop sleeve      |  7800 | 0.35392243 |
|    1 | green hiking backpack   | Lightweight backpack suitable for hiking trails  |  5999 | 0.53113687 |
|    5 | mountain hiking bag     | Durable trail-ready backpack for mountain hikes  |  8950 | 0.62034285 |
|    4 | black laptop backpack   | Spacious backpack with padded laptop compartment |  6900 | 0.65785009 |
|   10 | compact hiking backpack | Light and foldable backpack for trail hikes      |  4200 | 0.68591022 |
+------+-------------------------+--------------------------------------------------+-------+------------+
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The query "lightweight laptop backpack for trail hiking" returned the most relevant item first: the "outdoor laptop pack", which combines laptop and trail features and has the lowest &lt;code&gt;knn_dist()&lt;/code&gt; (about 0.354). Hiking backpacks and a laptop backpack follow at larger distances.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pick the Right Model
&lt;/h2&gt;

&lt;p&gt;You can choose different models depending on your needs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🏠 &lt;strong&gt;Local (Hugging Face models)&lt;/strong&gt; — no API keys, unlimited use
&lt;/li&gt;
&lt;li&gt;🌐 &lt;strong&gt;OpenAI models&lt;/strong&gt; — best-in-class semantic quality
&lt;/li&gt;
&lt;li&gt;🚀 &lt;strong&gt;Voyage &amp;amp; Jina models&lt;/strong&gt; — domain- and language-optimized
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Hybrid Search &amp;amp; Filtering (SQL Example)
&lt;/h2&gt;

&lt;p&gt;Combine semantic, keyword, and structured filters in one query:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;highlight&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'lightweight laptop backpack for trail hiking'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'outdoors'&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'"lightweight laptop backpack for trail hiking"/0.5'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Results:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;+------+-------+-----------------------------------------------------------------------------------------------+
| id   | price | highlight()                                                                                   |
+------+-------+-----------------------------------------------------------------------------------------------+
|    9 |  7800 | outdoor &amp;lt;b&amp;gt;laptop&amp;lt;/b&amp;gt; pack | &amp;lt;b&amp;gt;Trail&amp;lt;/b&amp;gt;-optimized &amp;lt;b&amp;gt;backpack&amp;lt;/b&amp;gt; with &amp;lt;b&amp;gt;laptop&amp;lt;/b&amp;gt; sleeve |
|    1 |  5999 | green &amp;lt;b&amp;gt;hiking backpack&amp;lt;/b&amp;gt; | &amp;lt;b&amp;gt;Lightweight backpack&amp;lt;/b&amp;gt; suitable &amp;lt;b&amp;gt;for hiking&amp;lt;/b&amp;gt; trails  |
|    5 |  8950 | mountain &amp;lt;b&amp;gt;hiking&amp;lt;/b&amp;gt; bag | Durable &amp;lt;b&amp;gt;trail&amp;lt;/b&amp;gt;-ready &amp;lt;b&amp;gt;backpack for&amp;lt;/b&amp;gt; mountain hikes    |
|   10 |  4200 | compact &amp;lt;b&amp;gt;hiking backpack&amp;lt;/b&amp;gt; | Light and foldable &amp;lt;b&amp;gt;backpack for trail&amp;lt;/b&amp;gt; hikes           |
+------+-------+-----------------------------------------------------------------------------------------------+
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Note: highlight() returns markup (e.g., &lt;code&gt;&amp;lt;b&amp;gt;...&amp;lt;/b&amp;gt;&lt;/code&gt;).&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This query filters by category (&lt;code&gt;outdoors&lt;/code&gt;), ensures semantic relevance through embeddings, requires at least half of the query words to match as keywords (the &lt;code&gt;/0.5&lt;/code&gt; quorum operator in &lt;code&gt;MATCH&lt;/code&gt;), and highlights the matching terms, all in one query.&lt;/p&gt;
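&lt;p&gt;To make the &lt;code&gt;/0.5&lt;/code&gt; quorum in the &lt;code&gt;MATCH&lt;/code&gt; clause more concrete, here is a simplified Python sketch of the idea. It is not Manticore's implementation: the tokenizer, the rounding of the threshold, and the helper names &lt;code&gt;words&lt;/code&gt; and &lt;code&gt;passes_quorum&lt;/code&gt; are assumptions, and no stemming is applied. Each sample document is the title and description of a row from the dataset joined into one string.&lt;/p&gt;

```python
import math
import re


def words(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def passes_quorum(query, document, fraction):
    """Return True when enough distinct query words occur in the document.

    Assumption: the threshold is fraction * number of query words, rounded up.
    """
    query_words = set(words(query))
    doc_words = set(words(document))
    needed = math.ceil(fraction * len(query_words))
    matched = len(query_words.intersection(doc_words))
    return matched >= needed


query = "lightweight laptop backpack for trail hiking"
# Title and description of rows 9 and 8, concatenated for the sketch.
doc_9 = "outdoor laptop pack Trail-optimized backpack with laptop sleeve"
doc_8 = "camping gear set Complete set for weekend camping adventures"

print(passes_quorum(query, doc_9, 0.5))  # True: laptop, backpack, trail
print(passes_quorum(query, doc_8, 0.5))  # False: only "for" matches
```

&lt;p&gt;Under these assumptions the query has six words, so a document needs three of them. That is consistent with row 8 ("camping gear set") being absent from the results even though it is in the &lt;code&gt;outdoors&lt;/code&gt; category.&lt;/p&gt;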

&lt;h2&gt;
  
  
  Complete HTTP/JSON API Support
&lt;/h2&gt;

&lt;p&gt;Auto Embeddings work seamlessly with Manticore's HTTP/JSON API, providing the same functionality as SQL but through REST endpoints.&lt;/p&gt;

&lt;h3&gt;
  
  
  Inserting Data via JSON (HTTP/JSON API Example)
&lt;/h3&gt;

&lt;p&gt;Use the &lt;code&gt;/insert&lt;/code&gt; endpoint; embeddings are generated automatically:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="s2"&gt;"http://localhost:9308/insert"&lt;/span&gt; &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "table": "products", 
    "id": 21, 
    "doc": {
      "title": "wireless headphones", 
      "description": "Bluetooth headphones with noise cancellation", 
      "category": "electronics", 
      "price": 15900
    }
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Response:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"table"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"products"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;21&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"created"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"result"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"created"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;201&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Bulk Inserts with Auto Embeddings (HTTP/JSON API Example)
&lt;/h3&gt;

&lt;p&gt;Insert multiple documents efficiently using &lt;code&gt;/bulk&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="s2"&gt;"http://localhost:9308/bulk"&lt;/span&gt; &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/x-ndjson"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--data-raw&lt;/span&gt; &lt;span class="s1"&gt;$'{"insert": {"table": "products", "id": 22, "doc": {"title": "gaming laptop", "description": "High-performance laptop for gaming and work", "category": "electronics", "price": 159900}}}
{"insert": {"table": "products", "id": 23, "doc": {"title": "smartphone", "description": "Latest flagship smartphone with 5G", "category": "electronics", "price": 89900}}}
{"insert": {"table": "products", "id": 24, "doc": {"title": "tablet computer", "description": "Lightweight tablet for work and entertainment", "category": "electronics", "price": 49900}}}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Response:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"items"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"bulk"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"table"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"products"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"created"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"deleted"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"updated"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"result"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"created"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;201&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"current_line"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"skipped_lines"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"errors"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The bulk operation successfully inserted 3 documents with auto-generated embeddings.&lt;/p&gt;
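&lt;p&gt;When the documents come from application code rather than a hand-written curl command, the NDJSON body can be generated. The Python sketch below builds the same kind of payload as the &lt;code&gt;/bulk&lt;/code&gt; request above, one JSON object per line. The function name &lt;code&gt;build_bulk_body&lt;/code&gt; is hypothetical, and sending the body to the server is left out.&lt;/p&gt;

```python
import json


def build_bulk_body(table, docs):
    """Serialize (id, doc) pairs as NDJSON: one insert action per line."""
    lines = []
    for doc_id, doc in docs:
        action = {"insert": {"table": table, "id": doc_id, "doc": doc}}
        # json.dumps never emits raw newlines, so each action stays on one line.
        lines.append(json.dumps(action))
    return "\n".join(lines)


docs = [
    (22, {"title": "gaming laptop", "category": "electronics", "price": 159900}),
    (23, {"title": "smartphone", "category": "electronics", "price": 89900}),
    (24, {"title": "tablet computer", "category": "electronics", "price": 49900}),
]

body = build_bulk_body("products", docs)
print(body)
```

&lt;p&gt;The resulting string would be posted with the &lt;code&gt;Content-Type: application/x-ndjson&lt;/code&gt; header, as in the curl example. The descriptions are omitted here only to keep the sketch short.&lt;/p&gt;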

&lt;h3&gt;
  
  
  Semantic Search via JSON (HTTP/JSON API Example)
&lt;/h3&gt;

&lt;p&gt;Search with natural language queries using &lt;code&gt;/search&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="s2"&gt;"http://localhost:9308/search"&lt;/span&gt; &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "table": "products",
    "_source": ["title"],
    "size": 5,
    "knn": {
      "field": "vector",
      "query": "outdoor hiking adventure",
      "k": 3
    }
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Response:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"took"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"timed_out"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hits"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"total"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"total_relation"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"eq"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"hits"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;18&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_score"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_knn_dist"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.75467718&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"children's adventure book"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_score"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_knn_dist"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.83226496&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"green hiking backpack"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_score"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_knn_dist"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.89348459&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mountain hiking bag"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_score"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_knn_dist"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.92611158&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"compact hiking backpack"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_score"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_knn_dist"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.98721427&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"travel daypack"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The query "outdoor hiking adventure" returned "children's adventure book" as the closest match (distance about 0.755), followed by hiking-related backpacks. This shows how semantic search can find conceptually related items beyond literal keyword matches.&lt;/p&gt;
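&lt;p&gt;The same request can be issued from code. The Python sketch below uses only the standard library: it builds the request body, defines a sender for a server assumed to run at &lt;code&gt;localhost:9308&lt;/code&gt;, and extracts titles and distances from a response shaped like the one above. The function names are hypothetical, and &lt;code&gt;send_search&lt;/code&gt; is defined but not called because it needs a running Manticore instance.&lt;/p&gt;

```python
import json
import urllib.request


def build_knn_search(table, field, query, k, size, source):
    """Build a /search request body for a text-based KNN query."""
    return {
        "table": table,
        "_source": source,
        "size": size,
        "knn": {"field": field, "query": query, "k": k},
    }


def send_search(body, url="http://localhost:9308/search"):
    """POST the body as JSON; requires a running server (not called here)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def titles_by_distance(response):
    """Return (title, distance) pairs from a response, closest first."""
    hits = response["hits"]["hits"]
    ordered = sorted(hits, key=lambda hit: hit["_knn_dist"])
    return [(hit["_source"]["title"], hit["_knn_dist"]) for hit in ordered]


body = build_knn_search("products", "vector", "outdoor hiking adventure", 3, 5, ["title"])

# Two hits copied from the response shown above, in reverse order on purpose.
sample = {
    "hits": {
        "hits": [
            {"_id": 1, "_knn_dist": 0.83226496, "_source": {"title": "green hiking backpack"}},
            {"_id": 18, "_knn_dist": 0.75467718, "_source": {"title": "children's adventure book"}},
        ]
    }
}
print(titles_by_distance(sample))
```

&lt;p&gt;Sorting by &lt;code&gt;_knn_dist&lt;/code&gt; in ascending order puts the closest match first, which matches the order of the hits in the response above.&lt;/p&gt;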

&lt;h3&gt;
  
  
  Filtering and Hybrid Search via JSON (HTTP/JSON API Example)
&lt;/h3&gt;

&lt;p&gt;Combine semantic search with traditional filters:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="s2"&gt;"http://localhost:9308/search"&lt;/span&gt; &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "table": "products",
    "_source": ["title", "price"],
    "size": 5,
    "knn": {
      "field": "vector", 
      "query": "technology electronic device",
      "k": 5,
      "filter": {
        "range": {"price": {"gte": 15000}}
      }
    }
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Response:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"took"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"timed_out"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hits"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"total"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"total_relation"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"eq"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"hits"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_score"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_knn_dist"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.31113040&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tablet computer"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"price"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;49900&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;23&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_score"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_knn_dist"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.56920886&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"smartphone"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"price"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;89900&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;22&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_score"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_knn_dist"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.59042466&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"gaming laptop"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"price"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;159900&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_score"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_knn_dist"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.84979212&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"office chair"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"price"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;27900&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;21&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_score"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_knn_dist"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.88567829&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"wireless headphones"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"price"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;15900&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The search for "technology electronic device" with the price filter (&lt;code&gt;"gte": 15000&lt;/code&gt;, i.e. ≥$150) ranked electronics at the top and excluded lower-priced products such as hiking backpacks and smaller electronics. The filter restricts price, not category, which is why "office chair" still appears in fourth place. "Tablet computer" ranks highest because it has the smallest distance to the query (1.311).&lt;/p&gt;
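
&lt;p&gt;If you call the endpoint from application code, you will usually want to flatten the response into rows. Here is a minimal Python sketch that does this for a trimmed copy of the response above. The &lt;code&gt;summarize_hits&lt;/code&gt; helper is hypothetical, and dividing prices by 100 assumes they are stored in cents, which is what the ≥$150 reading of &lt;code&gt;15000&lt;/code&gt; suggests.&lt;/p&gt;

```python
import json

# Trimmed copy of the response above (first two hits only).
response_text = """
{
  "took": 10,
  "timed_out": false,
  "hits": {
    "total": 5,
    "hits": [
      {"_id": 24, "_score": 1, "_knn_dist": 1.31113040,
       "_source": {"title": "tablet computer", "price": 49900}},
      {"_id": 23, "_score": 1, "_knn_dist": 1.56920886,
       "_source": {"title": "smartphone", "price": 89900}}
    ]
  }
}
"""


def summarize_hits(response: dict) -> list:
    """Return (title, price in dollars, distance) for every hit."""
    rows = []
    for hit in response["hits"]["hits"]:
        source = hit["_source"]
        # Assumption: prices are integer cents, so 49900 means 499.00.
        rows.append((source["title"], source["price"] / 100, hit["_knn_dist"]))
    return rows


rows = summarize_hits(json.loads(response_text))
for title, dollars, dist in rows:
    print(f"{title}: ${dollars:.2f} (distance {dist:.3f})")
```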

&lt;h3&gt;
  
  
  Direct Vector vs Auto-Embedded Text Queries
&lt;/h3&gt;

&lt;p&gt;The HTTP/JSON API supports both:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Auto-embedded text queries&lt;/strong&gt;: &lt;code&gt;"query": "outdoor hiking adventure"&lt;/code&gt; (Manticore generates the embedding from the text)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Direct vector queries&lt;/strong&gt;: &lt;code&gt;"query": [0.1, 0.2, 0.3, ...]&lt;/code&gt; (pre-computed vector)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This flexibility allows you to mix auto-generated embeddings with custom vectors in the same application.&lt;/p&gt;
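
&lt;p&gt;As a rough illustration, the Python sketch below builds the request body for either form. &lt;code&gt;build_knn_search&lt;/code&gt; is a hypothetical helper rather than part of a Manticore client, the body layout simply mirrors the curl example above, and the three-number vector is a placeholder: a real one needs as many dimensions as the &lt;code&gt;vector&lt;/code&gt; field.&lt;/p&gt;

```python
import json
from typing import Optional, Sequence, Union


# Hypothetical helper, not part of any Manticore client library.
def build_knn_search(table: str, field: str,
                     query: Union[str, Sequence[float]],
                     k: int,
                     knn_filter: Optional[dict] = None) -> dict:
    """Build a /search request body for a text or a vector KNN query."""
    if isinstance(query, str):
        # Text query: the server embeds it with the table's model.
        knn_query = query
    else:
        # Pre-computed vector: send the numbers as a JSON array.
        knn_query = [float(x) for x in query]
    knn = {"field": field, "query": knn_query, "k": k}
    if knn_filter is not None:
        knn["filter"] = knn_filter
    return {"table": table, "knn": knn}


text_body = build_knn_search("products", "vector", "outdoor hiking adventure", 5)
# Placeholder values only; not a real embedding.
vector_body = build_knn_search("products", "vector", [0.1, 0.2, 0.3], 5)

print(json.dumps(text_body))
print(json.dumps(vector_body))
```

&lt;p&gt;Either body can then be sent as the &lt;code&gt;-d&lt;/code&gt; payload of the curl command shown earlier.&lt;/p&gt;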

&lt;h2&gt;
  
  
  OpenAI Integration (OpenAI API Example)
&lt;/h2&gt;

&lt;p&gt;For even better semantic understanding, you can use OpenAI's embedding models:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Create table with OpenAI embeddings&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;products_openai&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;category&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;vector&lt;/span&gt; &lt;span class="n"&gt;FLOAT_VECTOR&lt;/span&gt; &lt;span class="n"&gt;KNN_TYPE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'hnsw'&lt;/span&gt; &lt;span class="n"&gt;HNSW_SIMILARITY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'l2'&lt;/span&gt;
    &lt;span class="n"&gt;MODEL_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'openai/text-embedding-ada-002'&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'title, description'&lt;/span&gt;
    &lt;span class="n"&gt;API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'your-openai-api-key'&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Insert data (embeddings generated via OpenAI API)&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;products_openai&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'smartphone device'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'latest mobile technology with advanced features'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'electronics'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;79900&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'laptop computer'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'portable workstation for developers and professionals'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'electronics'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;129900&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Search with natural language&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;knn_dist&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products_openai&lt;/span&gt; 
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'mobile phone technology'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Results:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;+---------------------+-------------------+-------------------------------------------------------+------------+
| id                  | title             | description                                           | knn_dist() |
+---------------------+-------------------+-------------------------------------------------------+------------+
| 2309215617435041807 | smartphone device | latest mobile technology with advanced features       | 0.20333229 |
| 2309215617435041808 | laptop computer   | portable workstation for developers and professionals | 0.40197325 |
+---------------------+-------------------+-------------------------------------------------------+------------+
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;OpenAI's models excel at understanding nuanced relationships: "mobile phone technology" placed the smartphone (distance 0.203) well ahead of the laptop (distance 0.402).&lt;/p&gt;

&lt;h2&gt;
  
  
  Built for Production
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;⚡ &lt;strong&gt;Fast&lt;/strong&gt;: HNSW indexing, optional quantization, optimized storage
&lt;/li&gt;
&lt;li&gt;🛡️ &lt;strong&gt;Reliable&lt;/strong&gt;: multiple model providers, empty-vector handling
&lt;/li&gt;
&lt;li&gt;🔧 &lt;strong&gt;Flexible&lt;/strong&gt;: embed from any field(s) you choose
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Use Cases
&lt;/h2&gt;

&lt;p&gt;Auto Embeddings make it easy to build:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🛍️ &lt;strong&gt;E-commerce search&lt;/strong&gt;: "waterproof hiking boots" → finds relevant products
&lt;/li&gt;
&lt;li&gt;📚 &lt;strong&gt;Document discovery&lt;/strong&gt;: "contracts about data privacy" → surfaces legal docs
&lt;/li&gt;
&lt;li&gt;🎵 &lt;strong&gt;Content recommendations&lt;/strong&gt;: "upbeat music for workouts" → matches by vibe
&lt;/li&gt;
&lt;li&gt;🏠 &lt;strong&gt;Real estate search&lt;/strong&gt;: "cozy apartments near parks" → finds lifestyle-fit homes&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  More Real-World Examples
&lt;/h2&gt;

&lt;p&gt;Let's see Auto Embeddings in action with different search scenarios:&lt;/p&gt;

&lt;h3&gt;
  
  
  Finding Work &amp;amp; Productivity Items
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;knn_dist&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; 
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'work productivity office'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Results:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;+------+----------------------+----------------------------------------------------------+-------+------------+
| id   | title                | description                                              | price | knn_dist() |
+------+----------------------+----------------------------------------------------------+-------+------------+
|   24 | tablet computer      | Lightweight tablet for work and entertainment            | 49900 |   1.306459 |
|   16 | office chair         | Ergonomic office chair with lumbar support and mesh back | 27900 | 1.44871426 |
|   17 | notebook and pen set | Elegant A5 notebook with smooth-writing pen              |  1200 | 1.48466742 |
+------+----------------------+----------------------------------------------------------+-------+------------+
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The search understood "work productivity office" and returned office furniture, stationery, and work-appropriate gear.&lt;/p&gt;

&lt;h3&gt;
  
  
  Smart Category Filtering
&lt;/h3&gt;

&lt;p&gt;Sometimes semantic search is &lt;em&gt;too&lt;/em&gt; broad. Let's search for "usb charger for outdoor camping":&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;knn_dist&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; 
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'usb charger for outdoor camping'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Top results include many items:&lt;/strong&gt; solar charger (0.888), outdoor packs (1.139), hiking gear (1.213), etc.&lt;/p&gt;

&lt;p&gt;But when we add category filtering:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;highlight&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; 
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'usb charger for outdoor camping'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'electronics'&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'"usb charger for outdoor camping"/0.5'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Precise result:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;+------+-------------------------------------------------------------------------------------------------------+
| id   | highlight()                                                                                           |
+------+-------------------------------------------------------------------------------------------------------+
|   11 | portable solar &amp;lt;b&amp;gt;charger&amp;lt;/b&amp;gt; | Foldable solar panel &amp;lt;b&amp;gt;charger for&amp;lt;/b&amp;gt; phones and &amp;lt;b&amp;gt;USB&amp;lt;/b&amp;gt; devices |
+------+-------------------------------------------------------------------------------------------------------+
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Note: highlight() returns markup (e.g., &lt;code&gt;&amp;lt;b&amp;gt;...&amp;lt;/b&amp;gt;&lt;/code&gt;). Bold in the table is for readability.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The combination of semantic understanding + category filtering + keyword matching gave us exactly what we wanted!&lt;/p&gt;
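
&lt;p&gt;The &lt;code&gt;/0.5&lt;/code&gt; suffix in the &lt;code&gt;MATCH()&lt;/code&gt; clause is a quorum threshold: only part of the quoted words has to occur in a document, here at least half. The Python sketch below illustrates that idea on the matched row. It is a simplification with a hypothetical &lt;code&gt;quorum_match&lt;/code&gt; function and a naive tokenizer; Manticore's own tokenization, stopword handling, and rounding may differ.&lt;/p&gt;

```python
import re


def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def quorum_match(query: str, document: str, fraction: float) -> bool:
    """True when at least `fraction` of the distinct query words occur in the document."""
    query_words = tokenize(query)
    if not query_words:
        return False
    matched = query_words.intersection(tokenize(document))
    return len(matched) / len(query_words) >= fraction


# Title and description of the matched row from the result above.
row = "portable solar charger Foldable solar panel charger for phones and USB devices"
query = "usb charger for outdoor camping"

print(quorum_match(query, row, 0.5))
print(quorum_match(query, row, 0.8))
```

&lt;p&gt;Three of the five query words (charger, for, usb) occur in the row, so it clears the 0.5 threshold but would not clear 0.8.&lt;/p&gt;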

&lt;h3&gt;
  
  
  Finding Fun &amp;amp; Creative Items
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;knn_dist&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; 
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'fun creative play toys'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Results:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;+------+---------------------------+----------------------------------------------------+-------+------------+
| id   | title                     | description                                        | price | knn_dist() |
+------+---------------------------+----------------------------------------------------+-------+------------+
|    8 | camping gear set          | Complete set for weekend camping adventures        | 12000 | 1.30462146 |
|   20 | wooden puzzle box         | Challenging mechanical puzzle made of natural wood |  1899 |   1.305056 |
|   18 | children's adventure book | Illustrated storybook about outdoor exploration    |  1299 | 1.47192979 |
+------+---------------------------+----------------------------------------------------+-------+------------+
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Auto Embeddings understood the concept of "fun creative play" and found camping gear, a puzzle box, and a children's book, all items that relate to creativity and play!&lt;/p&gt;

&lt;h2&gt;
  
  
  Behind the Scenes
&lt;/h2&gt;

&lt;p&gt;Auto Embeddings rely on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Sentence Transformers&lt;/strong&gt; for semantic understanding
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HNSW&lt;/strong&gt; for fast similarity search
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Smart caching&lt;/strong&gt; for efficient inference
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-provider APIs&lt;/strong&gt; for flexibility
&lt;/li&gt;
&lt;/ul&gt;
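
&lt;p&gt;To make the HNSW point concrete: an HNSW index approximates an exhaustive nearest-neighbor scan without comparing the query against every stored vector. The Python sketch below shows that exhaustive baseline on toy data. The vectors and function names are made up for illustration, and plain Euclidean distance is used, which may not be the exact value &lt;code&gt;knn_dist()&lt;/code&gt; reports.&lt;/p&gt;

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_brute_force(query, vectors, k):
    """Return the k (doc_id, distance) pairs closest to the query."""
    scored = [(doc_id, l2_distance(query, vec)) for doc_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Toy 3-dimensional vectors; real embeddings have hundreds of dimensions or more.
toy_vectors = {
    1: [0.9, 0.1, 0.0],
    2: [0.1, 0.9, 0.0],
    3: [0.8, 0.2, 0.1],
}

nearest = knn_brute_force([1.0, 0.0, 0.0], toy_vectors, 2)
print(nearest)
```

&lt;p&gt;A smaller distance means a closer match, which is why the result tables in this article are ordered by ascending &lt;code&gt;knn_dist()&lt;/code&gt;.&lt;/p&gt;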

&lt;h2&gt;
  
  
  Try It Today
&lt;/h2&gt;

&lt;p&gt;As you've seen from our examples, Auto Embeddings deliver powerful semantic search capabilities with minimal setup. Whether you're building:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;E-commerce platforms&lt;/strong&gt; with natural language product search&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Content management systems&lt;/strong&gt; with intelligent document discovery
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Recommendation engines&lt;/strong&gt; that understand user intent&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Knowledge bases&lt;/strong&gt; with semantic question answering&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Auto Embeddings remove the hardest part — managing embeddings — so you can focus on building great features that users love.&lt;/p&gt;

&lt;p&gt;🚀 &lt;strong&gt;Ready to transform your search experience?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://manticoresearch.com/install/" rel="noopener noreferrer"&gt;Download Manticore Search&lt;/a&gt; and start building with Auto Embeddings today.&lt;br&gt;&lt;br&gt;
📚 Check out the &lt;a href="https://manual.manticoresearch.com/Searching/KNN" rel="noopener noreferrer"&gt;KNN search documentation&lt;/a&gt; for detailed guides.&lt;br&gt;&lt;br&gt;
💬 Join our &lt;a href="https://slack.manticoresearch.com/" rel="noopener noreferrer"&gt;Slack community&lt;/a&gt; to share your success stories.  &lt;/p&gt;




&lt;p&gt;&lt;em&gt;Questions or feedback? Join our &lt;a href="https://forum.manticoresearch.com/" rel="noopener noreferrer"&gt;community forum&lt;/a&gt; or follow us on &lt;a href="https://twitter.com/manticoresearch" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>database</category>
      <category>nlp</category>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Manticore Search 13.11.0: Introducing Auto Embeddings and Enhanced AI Search</title>
      <dc:creator>Sergey Nikolaev</dc:creator>
      <pubDate>Mon, 15 Sep 2025 14:36:44 +0000</pubDate>
      <link>https://dev.to/sanikolaev/manticore-search-13110-introducing-auto-embeddings-and-enhanced-ai-search-272p</link>
      <guid>https://dev.to/sanikolaev/manticore-search-13110-introducing-auto-embeddings-and-enhanced-ai-search-272p</guid>
      <description>&lt;p&gt;We're excited to announce the &lt;strong&gt;August 2025 release&lt;/strong&gt; of &lt;a href="https://manticoresearch.com/install/" rel="noopener noreferrer"&gt;Manticore Search 13.11.0&lt;/a&gt;, a major update featuring &lt;strong&gt;Auto Embeddings&lt;/strong&gt;, our new way of making AI-powered semantic search simple and efficient. This version also includes bug fixes and several other improvements.&lt;/p&gt;




&lt;h2&gt;
  
  
  🚀 Auto Embeddings: AI Search Made Easy
&lt;/h2&gt;

&lt;p&gt;The highlight of Manticore Search 13.11.0 is &lt;strong&gt;Auto Embeddings&lt;/strong&gt; — a game-changing feature that makes semantic search as easy as SQL. No need for external services or complex pipelines: just insert text and search with natural language.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Auto Embeddings Offer:
&lt;/h3&gt;

&lt;p&gt;✨ &lt;strong&gt;Automatic embedding generation&lt;/strong&gt; from your text&lt;br&gt;&lt;br&gt;
✨ &lt;strong&gt;Natural language queries&lt;/strong&gt; that understand meaning, not just keywords&lt;br&gt;&lt;br&gt;
✨ &lt;strong&gt;Support for multiple models&lt;/strong&gt; (OpenAI, Hugging Face, Voyage, Jina)&lt;br&gt;&lt;br&gt;
✨ &lt;strong&gt;Smooth integration&lt;/strong&gt; with SQL and JSON APIs&lt;/p&gt;

&lt;h3&gt;
  
  
  Quick Example
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Create table with auto-embeddings&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;vector&lt;/span&gt; &lt;span class="n"&gt;FLOAT_VECTOR&lt;/span&gt; &lt;span class="n"&gt;KNN_TYPE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'hnsw'&lt;/span&gt; &lt;span class="n"&gt;HNSW_SIMILARITY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'l2'&lt;/span&gt;
        &lt;span class="n"&gt;MODEL_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'sentence-transformers/all-MiniLM-L6-v2'&lt;/span&gt;
        &lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'title,description'&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Insert data (embeddings generated automatically)&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'wireless headphones'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Bluetooth headphones with noise cancellation'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'hiking backpack'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Lightweight backpack for outdoor adventures'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Search with natural language&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; 
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'portable audio device for music'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Results:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;+------+---------------------+
| id   | title               |
+------+---------------------+
|    1 | wireless headphones |
...
+------+---------------------+
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, semantic search matched the query "portable audio device for music" to the product "wireless headphones", even though the two share no keywords.&lt;/p&gt;

&lt;h3&gt;
  
  
  Learn More
&lt;/h3&gt;

&lt;p&gt;Want a full deep dive? Check out our dedicated article: &lt;strong&gt;&lt;a href="https://manticoresearch.com/blog/auto-embeddings/" rel="noopener noreferrer"&gt;Introducing Auto Embeddings: AI-Powered Search Made Simple&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Other Improvements
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Configuration
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Boolean Simplify Support&lt;/strong&gt;: Added &lt;code&gt;boolean_simplify&lt;/code&gt; option for faster query processing
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;System Optimization&lt;/strong&gt;: The sysctl configuration now automatically increases &lt;code&gt;vm.max_map_count&lt;/code&gt; to support large datasets
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Package Management&lt;/strong&gt;: RPM packages no longer own the &lt;code&gt;/run&lt;/code&gt; directory, which improves compatibility
&lt;/li&gt;
&lt;/ul&gt;
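
&lt;p&gt;As a rough illustration of the first item, the sketch below assumes &lt;code&gt;boolean_simplify&lt;/code&gt; is passed per query through the SQL &lt;code&gt;OPTION&lt;/code&gt; clause. It reuses the &lt;code&gt;products&lt;/code&gt; table from the Auto Embeddings example, and the search expression is made up for demonstration. Check the Changelog for exactly where the option is accepted in this release.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Hypothetical full-text query (assumption: not taken from the release notes).
-- boolean_simplify=1 asks Manticore to simplify the boolean expression
-- before executing it.
SELECT id, title
FROM products
WHERE MATCH('(headphones | backpack) -wired')
OPTION boolean_simplify=1;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Any speedup depends on the shape of the query, so simple expressions like this one may show little difference.&lt;/p&gt;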




&lt;h2&gt;
  
  
  Bug Fixes
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Fixed scroll option with large 64-bit IDs
&lt;/li&gt;
&lt;li&gt;Fixed KNN crashes when using filter trees
&lt;/li&gt;
&lt;li&gt;Fixed &lt;code&gt;/sql&lt;/code&gt; endpoint behavior (removed unsupported &lt;code&gt;SHOW VERSION&lt;/code&gt;)
&lt;/li&gt;
&lt;li&gt;Fixed duplicate ID handling in columnar mode
&lt;/li&gt;
&lt;li&gt;Fixed crashes with joined queries using multiple facets
&lt;/li&gt;
&lt;li&gt;Fixed delete/update commits in bulk transactions
&lt;/li&gt;
&lt;li&gt;Fixed crashes when joining on non-columnar string attributes
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  System &amp;amp; Integration
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Updated Windows installer script
&lt;/li&gt;
&lt;li&gt;Fixed local time zone detection on Linux
&lt;/li&gt;
&lt;li&gt;Improved JDBC+MySQL driver compatibility with &lt;code&gt;transaction_read_only&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Enhanced error reporting across components
&lt;/li&gt;
&lt;li&gt;Improved master-agent communication for embeddings
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Compatibility
&lt;/h2&gt;

&lt;p&gt;Manticore Search 13.11.0 is &lt;strong&gt;fully backward compatible&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No breaking changes in standard use cases
&lt;/li&gt;
&lt;li&gt;Smooth upgrades from 13.x versions
&lt;/li&gt;
&lt;li&gt;Auto Embeddings work alongside current search features
&lt;/li&gt;
&lt;li&gt;APIs are extended, not replaced
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Everything is designed to work seamlessly with your existing data and queries.&lt;/p&gt;




&lt;h2&gt;
  
  
  Upgrade
&lt;/h2&gt;

&lt;p&gt;To upgrade, follow the &lt;a href="https://manticoresearch.com/install/" rel="noopener noreferrer"&gt;installation guide&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🚀 Ready to try Auto Embeddings?&lt;/strong&gt; Start with the &lt;strong&gt;&lt;a href="https://manual.manticoresearch.com/Searching/KNN#Auto-Embeddings-(Recommended)" rel="noopener noreferrer"&gt;documentation&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Need help or want to connect?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Join our &lt;a href="https://slack.manticoresearch.com" rel="noopener noreferrer"&gt;Slack&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Visit the &lt;a href="https://forum.manticoresearch.com" rel="noopener noreferrer"&gt;Forum&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Report issues or suggest features on &lt;a href="https://github.com/manticoresoftware/manticoresearch/issues" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Email us at &lt;code&gt;contact@manticoresearch.com&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For full details, see the &lt;a href="https://manual.manticoresearch.com/Changelog#Version-13.11.0" rel="noopener noreferrer"&gt;Changelog&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>database</category>
      <category>machinelearning</category>
      <category>news</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
