<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Dhananjay Haridas</title>
    <description>The latest articles on DEV Community by Dhananjay Haridas (@dhananjay_haridas_dc11460).</description>
    <link>https://dev.to/dhananjay_haridas_dc11460</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3397109%2Fa945ad5b-5487-48b8-986c-b21650ad8daf.jpg</url>
      <title>DEV Community: Dhananjay Haridas</title>
      <link>https://dev.to/dhananjay_haridas_dc11460</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/dhananjay_haridas_dc11460"/>
    <language>en</language>
    <item>
      <title>Fuzzy Text Search in PostgreSQL: How Trigrams Make Computers "Almost Right"</title>
      <dc:creator>Dhananjay Haridas</dc:creator>
      <pubDate>Wed, 03 Jun 2026 08:28:48 +0000</pubDate>
      <link>https://dev.to/dhananjay_haridas_dc11460/fuzzy-text-search-in-postgresql-how-trigrams-make-computers-almost-right-f60</link>
      <guid>https://dev.to/dhananjay_haridas_dc11460/fuzzy-text-search-in-postgresql-how-trigrams-make-computers-almost-right-f60</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;A deep dive into one of the most clever tricks in database engineering - explained so simply, your grandmother could follow along.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The Problem: Computers Are Too Exact
&lt;/h2&gt;

&lt;p&gt;Imagine you're searching for a restaurant called &lt;strong&gt;"Maharaja Palace"&lt;/strong&gt; on an app. You type &lt;em&gt;"Maharja Palce"&lt;/em&gt; - two small typos. A traditional search engine looks at your input and says:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"No results found."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Zero. Nothing. Because the text doesn't &lt;strong&gt;exactly&lt;/strong&gt; match.&lt;/p&gt;

&lt;p&gt;This is the fundamental problem with how computers search for text. They compare strings character by character, and if even one letter is out of place, the match fails. Humans, on the other hand, are remarkably good at understanding that "Maharja" probably means "Maharaja". We do it instinctively, without even thinking.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fuzzy search&lt;/strong&gt; is the art of teaching computers to think a little more like humans - to find things that are &lt;em&gt;close enough&lt;/em&gt;, not just &lt;em&gt;exactly right&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;And in PostgreSQL, the most powerful tool for this is called &lt;strong&gt;Trigram Search&lt;/strong&gt;, powered by an extension called &lt;code&gt;pg_trgm&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Part 1: What Is a Trigram? (The Building Block of Everything)
&lt;/h2&gt;

&lt;p&gt;Before we dive into how PostgreSQL searches, we need to understand one simple concept: the &lt;strong&gt;trigram&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A &lt;strong&gt;trigram&lt;/strong&gt; is just a group of &lt;strong&gt;3 consecutive characters&lt;/strong&gt; from a word.&lt;/p&gt;

&lt;p&gt;That's it. Three letters in a row.&lt;/p&gt;

&lt;p&gt;Let's break the word &lt;strong&gt;"cat"&lt;/strong&gt; into trigrams:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Word: "cat"
PostgreSQL pads the word with spaces for edge handling: "  cat  "

Trigrams:
  "  c"
  " ca"
  "cat"
  "at "
  "t  "
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;💡 PostgreSQL secretly adds two spaces before and one space after every word. This is to give the beginning and end of words their own identity, so "cat" and "concatenate" don't blur together.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Let's try a longer word: &lt;strong&gt;"elephant"&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Word: "  elephant "

Trigrams:
  "  e"
  " el"
  "ele"
  "lep"
  "eph"
  "pha"
  "han"
  "ant"
  "nt "
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every word gets chopped into these little 3-character windows, sliding one character at a time across the word from left to right.&lt;/p&gt;

&lt;p&gt;Think of it like a &lt;strong&gt;magnifying glass&lt;/strong&gt; that's 3 characters wide, slowly sliding across a word and taking a snapshot at every position.&lt;/p&gt;




&lt;h2&gt;
  
  
  Part 2: Why Trigrams? Why Not 2 or 4 Letters?
&lt;/h2&gt;

&lt;p&gt;This is a fair question. Why exactly three?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Bigrams (2 letters):&lt;/strong&gt; Too short. The bigram "ca" appears in "cat", "car", "can", "castle", "scar", "occasion"... it's so common it doesn't carry much meaning. Matching would produce too many false positives.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;4-grams or longer:&lt;/strong&gt; Too specific. A 4-gram like "elep" only really appears in "elephant" and close relatives. If you make a typo in that 4-gram, you lose the match entirely. Too brittle.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Trigrams (3 letters):&lt;/strong&gt; The sweet spot. Common enough to appear across related words, specific enough to carry meaning. A single typo only destroys a few trigrams, leaving most of the others intact to still find a match.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It's the Goldilocks zone of string matching.&lt;/p&gt;




&lt;h2&gt;
  
  
  Part 3: The Magic - How Similarity Is Calculated
&lt;/h2&gt;

&lt;p&gt;Here's where the real cleverness lives.&lt;/p&gt;

&lt;p&gt;Once every word is broken into trigrams, PostgreSQL measures &lt;strong&gt;how similar two words are&lt;/strong&gt; by counting how many trigrams they &lt;em&gt;share&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;The formula is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Similarity = (Number of shared trigrams) / (Total unique trigrams across both words)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives a number between &lt;strong&gt;0&lt;/strong&gt; (completely different) and &lt;strong&gt;1&lt;/strong&gt; (identical).&lt;/p&gt;

&lt;h3&gt;
  
  
  Let's work through a real example
&lt;/h3&gt;

&lt;p&gt;Say we're comparing &lt;strong&gt;"night"&lt;/strong&gt; and &lt;strong&gt;"knight"&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Generate trigrams for "night"&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"  n", " ni", "nig", "igh", "ght", "ht "
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;→ 6 trigrams&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2: Generate trigrams for "knight"&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"  k", " kn", "kni", "nig", "igh", "ght", "ht "
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;→ 7 trigrams&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3: Find the shared trigrams&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Shared: "nig", "igh", "ght", "ht "
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;→ 4 shared trigrams&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 4: Calculate similarity&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Total unique trigrams = 6 + 7 - 4 = 9   (subtract shared ones to avoid counting them twice)
Shared = 4

Similarity = 4 / 9 = 0.44
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So "night" and "knight" are about &lt;strong&gt;44% similar&lt;/strong&gt; according to trigrams. That seems low - they sound the same! But trigrams are measuring &lt;em&gt;character patterns&lt;/em&gt;, not pronunciation. The extra "kn" at the start of "knight" introduces several new unique trigrams that "night" doesn't have.&lt;/p&gt;

&lt;h3&gt;
  
  
  Now let's try a typo: "niight" vs "night"
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Trigrams for "niight":&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"  n", " ni", "nii", "iig", "igh", "ght", "ht "
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Trigrams for "night":&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"  n", " ni", "nig", "igh", "ght", "ht "
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Shared:&lt;/strong&gt; &lt;code&gt;"  n"&lt;/code&gt;, &lt;code&gt;" ni"&lt;/code&gt;, &lt;code&gt;"igh"&lt;/code&gt;, &lt;code&gt;"ght"&lt;/code&gt;, &lt;code&gt;"ht "&lt;/code&gt; → 5 shared&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Total unique:&lt;/strong&gt; 7 + 6 - 5 = 8&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Similarity = 5/8 = 0.625&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Despite the typo ("niight"), the similarity is still &lt;strong&gt;62.5%&lt;/strong&gt; - high enough that a search would still find the match. This is the magic of trigrams.&lt;/p&gt;




&lt;h2&gt;
  
  
  Part 4: The Threshold - Deciding What "Close Enough" Means
&lt;/h2&gt;

&lt;p&gt;Now that we can calculate similarity as a number, we need to decide: &lt;strong&gt;how similar is similar enough?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;PostgreSQL uses a default threshold of &lt;strong&gt;0.3&lt;/strong&gt; (30%). If two strings share at least 30% of their trigrams, they're considered a match.&lt;/p&gt;

&lt;p&gt;You can think of it like a teacher grading papers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Score ≥ 0.3 → Pass (it's a match)&lt;/li&gt;
&lt;li&gt;Score &amp;lt; 0.3 → Fail (not a match)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can adjust this threshold depending on how strict or lenient you want your search to be:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Make it stricter (only very close matches)&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;pg_trgm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;similarity_threshold&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Make it more lenient (catch more distant matches, but more noise)&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;pg_trgm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;similarity_threshold&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Practical guidance:&lt;/strong&gt;&lt;br&gt;
| Threshold | Best For |&lt;br&gt;
|-----------|----------|&lt;br&gt;
| 0.1 – 0.2 | Very forgiving search, names from other languages |&lt;br&gt;
| 0.3 | General purpose (default, works well most of the time) |&lt;br&gt;
| 0.5 | Names, product codes where near-exact match is needed |&lt;br&gt;
| 0.7+ | Almost exact matching only |&lt;/p&gt;


&lt;h2&gt;
  
  
  Part 5: Under the Hood - How PostgreSQL Actually Executes the Search
&lt;/h2&gt;

&lt;p&gt;Understanding the math is one thing. But how does PostgreSQL actually &lt;em&gt;run&lt;/em&gt; this efficiently when you have millions of rows? Recalculating trigrams for every row on every query would be painfully slow.&lt;/p&gt;

&lt;p&gt;The answer is: &lt;strong&gt;the GIN Index&lt;/strong&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  What is a GIN Index?
&lt;/h3&gt;

&lt;p&gt;GIN stands for &lt;strong&gt;Generalized Inverted Index&lt;/strong&gt;. The name sounds intimidating, but the concept is simple.&lt;/p&gt;

&lt;p&gt;Think of it like the &lt;strong&gt;index at the back of a book&lt;/strong&gt;. Instead of reading every page to find where "photosynthesis" is mentioned, you flip to the back, find "photosynthesis" alphabetically, and it tells you exactly which pages contain it.&lt;/p&gt;

&lt;p&gt;A GIN index does the same thing for trigrams:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GIN Index Structure (simplified):

" ca" → [row 5, row 12, row 89, row 203]
" co" → [row 3, row 7, row 44]
"cat" → [row 5, row 89]
"cof" → [row 3]
"ofe" → [row 3]
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For every possible trigram, the index stores a list of which rows contain that trigram. When you search for "coffee", PostgreSQL:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Breaks "coffee" into its trigrams: &lt;code&gt;" co"&lt;/code&gt;, &lt;code&gt;"cof"&lt;/code&gt;, &lt;code&gt;"off"&lt;/code&gt;, &lt;code&gt;"ffe"&lt;/code&gt;, &lt;code&gt;"fee"&lt;/code&gt;, &lt;code&gt;"ee "&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Looks up each trigram in the GIN index&lt;/li&gt;
&lt;li&gt;Finds rows that contain &lt;em&gt;several&lt;/em&gt; of these trigrams (intersection)&lt;/li&gt;
&lt;li&gt;Calculates the exact similarity score only for those candidate rows&lt;/li&gt;
&lt;li&gt;Returns results above the threshold&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Instead of scanning millions of rows, it scans a short list of candidates. This is what makes trigram search &lt;strong&gt;fast&lt;/strong&gt; - often hundreds of times faster than a brute-force scan.&lt;/p&gt;

&lt;h3&gt;
  
  
  Visualizing the Query Flow
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;Your&lt;/span&gt; &lt;span class="n"&gt;Query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;"SELECT * WHERE name % 'cofee'"&lt;/span&gt;
                    &lt;span class="err"&gt;│&lt;/span&gt;
                    &lt;span class="err"&gt;▼&lt;/span&gt;
         &lt;span class="n"&gt;Break&lt;/span&gt; &lt;span class="s1"&gt;'cofee'&lt;/span&gt; &lt;span class="k"&gt;into&lt;/span&gt; &lt;span class="n"&gt;trigrams&lt;/span&gt;
         &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;" co"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"cof"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"ofe"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"fee"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"ee "&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                    &lt;span class="err"&gt;│&lt;/span&gt;
                    &lt;span class="err"&gt;▼&lt;/span&gt;
         &lt;span class="n"&gt;Look&lt;/span&gt; &lt;span class="n"&gt;up&lt;/span&gt; &lt;span class="k"&gt;each&lt;/span&gt; &lt;span class="n"&gt;trigram&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;GIN&lt;/span&gt; &lt;span class="k"&gt;index&lt;/span&gt;
         &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="n"&gt;Candidate&lt;/span&gt; &lt;span class="k"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;44&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;89&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;102&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                    &lt;span class="err"&gt;│&lt;/span&gt;
                    &lt;span class="err"&gt;▼&lt;/span&gt;
         &lt;span class="n"&gt;Calculate&lt;/span&gt; &lt;span class="n"&gt;similarity&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;each&lt;/span&gt; &lt;span class="n"&gt;candidate&lt;/span&gt;
         &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="k"&gt;Row&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;71&lt;/span&gt; &lt;span class="err"&gt;✓&lt;/span&gt;  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;above&lt;/span&gt; &lt;span class="n"&gt;threshold&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
         &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="k"&gt;Row&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;18&lt;/span&gt; &lt;span class="err"&gt;✗&lt;/span&gt;
         &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="k"&gt;Row&lt;/span&gt; &lt;span class="mi"&gt;44&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;65&lt;/span&gt; &lt;span class="err"&gt;✓&lt;/span&gt;
         &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="k"&gt;Row&lt;/span&gt; &lt;span class="mi"&gt;89&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;22&lt;/span&gt; &lt;span class="err"&gt;✗&lt;/span&gt;
         &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="k"&gt;Row&lt;/span&gt; &lt;span class="mi"&gt;102&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;80&lt;/span&gt; &lt;span class="err"&gt;✓&lt;/span&gt;
                    &lt;span class="err"&gt;│&lt;/span&gt;
                    &lt;span class="err"&gt;▼&lt;/span&gt;
         &lt;span class="k"&gt;Return&lt;/span&gt; &lt;span class="k"&gt;rows&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;44&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;102&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;ordered&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without the GIN index, step 2 would require checking every row. With it, only a handful of candidates get the expensive similarity calculation.&lt;/p&gt;




&lt;h2&gt;
  
  
  Part 6: Setting It Up in PostgreSQL
&lt;/h2&gt;

&lt;p&gt;Enough theory - here's how you actually use it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Enable the Extension
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;EXTENSION&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="n"&gt;pg_trgm&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is a one-time setup per database. The extension comes bundled with PostgreSQL - no installation needed.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Create Your Table and Index
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;SERIAL&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Create a GIN index on the name column&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_products_name_trgm&lt;/span&gt;
    &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;
    &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;GIN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="n"&gt;gin_trgm_ops&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;gin_trgm_ops&lt;/code&gt; part tells PostgreSQL: &lt;em&gt;"use trigram logic for this index"&lt;/em&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Query Using Similarity
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- The % operator: returns rows above the similarity threshold&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="s1"&gt;'cofee maker'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- The similarity() function: returns the score, lets you sort by closeness&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;similarity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'cofee maker'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="s1"&gt;'cofee maker'&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Sample output:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csvs"&gt;&lt;code&gt;&lt;span class="k"&gt;name&lt;/span&gt;               &lt;span class="err"&gt;|&lt;/span&gt; &lt;span class="k"&gt;score&lt;/span&gt;
&lt;span class="err"&gt;-------------------+-------&lt;/span&gt;
&lt;span class="k"&gt;Coffee&lt;/span&gt; &lt;span class="k"&gt;Maker&lt;/span&gt; &lt;span class="k"&gt;Pro&lt;/span&gt;   &lt;span class="err"&gt;|&lt;/span&gt; &lt;span class="mf"&gt;0.82&lt;/span&gt;
&lt;span class="k"&gt;Coffee&lt;/span&gt; &lt;span class="k"&gt;Maker&lt;/span&gt; &lt;span class="k"&gt;Mini&lt;/span&gt;  &lt;span class="err"&gt;|&lt;/span&gt; &lt;span class="mf"&gt;0.78&lt;/span&gt;
&lt;span class="k"&gt;Coffee&lt;/span&gt; &lt;span class="k"&gt;Mug&lt;/span&gt; &lt;span class="k"&gt;Set&lt;/span&gt;     &lt;span class="err"&gt;|&lt;/span&gt; &lt;span class="mf"&gt;0.31&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4: Tune the Threshold (Optional)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- See the current threshold&lt;/span&gt;
&lt;span class="k"&gt;SHOW&lt;/span&gt; &lt;span class="n"&gt;pg_trgm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;similarity_threshold&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Change it for this session&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;pg_trgm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;similarity_threshold&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Part 7: Real-World Use Cases
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Search Autocomplete / "Did you mean?"
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- User typed "postgress" - find the closest match&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;similarity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'postgress'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;technologies&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="s1"&gt;'postgress'&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Deduplication - Finding Near-Duplicate Records
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Find customer records that might be the same person&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;similarity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;similarity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Product Search with Typo Tolerance
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Search even with misspellings&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;similarity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;relevance&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
   &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;relevance&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Combining with Full-Text Search for Best Results
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Use full-text search for speed + trigrams for typo tolerance&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ts_rank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tsv&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;text_rank&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;similarity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'databse'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;fuzzy_rank&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
     &lt;span class="n"&gt;plainto_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'database'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;tsv&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;
   &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="s1"&gt;'databse'&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ts_rank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tsv&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;similarity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'databse'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Part 8: The &lt;code&gt;word_similarity&lt;/code&gt; and &lt;code&gt;strict_word_similarity&lt;/code&gt; Functions
&lt;/h2&gt;

&lt;p&gt;PostgreSQL's &lt;code&gt;pg_trgm&lt;/code&gt; also provides two variants of the similarity function that handle &lt;strong&gt;multi-word strings&lt;/strong&gt; differently.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Function&lt;/th&gt;
&lt;th&gt;What It Does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;similarity(a, b)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Compares the full strings as a whole&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;word_similarity(a, b)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Checks if &lt;code&gt;a&lt;/code&gt; is similar to &lt;em&gt;any word&lt;/em&gt; in &lt;code&gt;b&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;strict_word_similarity(a, b)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Like &lt;code&gt;word_similarity&lt;/code&gt; but requires whole-word boundaries&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;similarity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'word'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'two words'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;-- Returns: 0.18 (compares full strings)&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;word_similarity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'word'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'two words'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;-- Returns: 0.8 (finds that 'word' closely matches 'words')&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is invaluable when users search for a single product word inside a longer product name.&lt;/p&gt;




&lt;h2&gt;
  
  
  Part 9: Performance Tips
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Always Create the GIN Index Before Querying Large Tables
&lt;/h3&gt;

&lt;p&gt;Without the index, PostgreSQL does a full sequential scan - checking every row. On a table with 10 million rows, this can take minutes. With the index, the same query can complete in milliseconds.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Check if your query is using the index&lt;/span&gt;
&lt;span class="k"&gt;EXPLAIN&lt;/span&gt; &lt;span class="k"&gt;ANALYZE&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="s1"&gt;'laptop'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Look for &lt;code&gt;Index Scan using idx_products_name_trgm&lt;/code&gt; in the output. If you see &lt;code&gt;Seq Scan&lt;/code&gt; instead, your index isn't being used - check that it was created correctly.&lt;/p&gt;

&lt;h3&gt;
  
  
  Index Maintenance
&lt;/h3&gt;

&lt;p&gt;GIN indexes can become bloated over time with many inserts and updates. Periodically run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;VACUUM&lt;/span&gt; &lt;span class="k"&gt;ANALYZE&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Consider a &lt;code&gt;GIST&lt;/code&gt; Index for Certain Workloads
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Alternative index type&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_products_name_gist&lt;/span&gt;
    &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;
    &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;GIST&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="n"&gt;gist_trgm_ops&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Index Type&lt;/th&gt;
&lt;th&gt;Build Speed&lt;/th&gt;
&lt;th&gt;Query Speed&lt;/th&gt;
&lt;th&gt;Update Speed&lt;/th&gt;
&lt;th&gt;Choose When&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;GIN&lt;/td&gt;
&lt;td&gt;Slower&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Faster&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Slower&lt;/td&gt;
&lt;td&gt;Read-heavy workloads&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GiST&lt;/td&gt;
&lt;td&gt;Faster&lt;/td&gt;
&lt;td&gt;Slower&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Faster&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Write-heavy workloads&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Part 10: Limitations - When Trigrams Don't Work Well
&lt;/h2&gt;

&lt;p&gt;Trigrams are powerful but not magic. They struggle in a few situations:&lt;/p&gt;

&lt;h3&gt;
  
  
  Short Strings (Less Than 3 Characters)
&lt;/h3&gt;

&lt;p&gt;A 1 or 2 character string can't form a meaningful set of trigrams. Searching for "AI" or "DB" will produce poor results.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Workaround: Fall back to ILIKE for very short queries&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;tags&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;CASE&lt;/span&gt;
    &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="k"&gt;length&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'AI'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="k"&gt;ILIKE&lt;/span&gt; &lt;span class="s1"&gt;'%AI%'&lt;/span&gt;
    &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="s1"&gt;'AI'&lt;/span&gt;
&lt;span class="k"&gt;END&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Very Different Languages
&lt;/h3&gt;

&lt;p&gt;Trigrams work on character patterns. For languages with different scripts (Arabic, Chinese, Thai), the trigram approach still works but may need different tokenization strategies.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phonetic Similarity
&lt;/h3&gt;

&lt;p&gt;Trigrams match &lt;em&gt;visual&lt;/em&gt; similarity, not &lt;em&gt;sound&lt;/em&gt; similarity. "Night" and "knight" sound identical but have low trigram similarity because they look different. For phonetic matching, consider the &lt;code&gt;fuzzystrmatch&lt;/code&gt; extension with &lt;code&gt;soundex&lt;/code&gt; or &lt;code&gt;metaphone&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;EXTENSION&lt;/span&gt; &lt;span class="n"&gt;fuzzystrmatch&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="k"&gt;names&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;metaphone&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;metaphone&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'knight'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;-- This would match "night" and "knight"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Summary: The Trigram Story in One Paragraph
&lt;/h2&gt;

&lt;p&gt;Every word in your database gets sliced into overlapping 3-character chunks called trigrams. These trigrams are stored in a GIN index - like a giant lookup table mapping each trigram to the rows containing it. When you search for a word, PostgreSQL slices &lt;em&gt;that&lt;/em&gt; word into trigrams too, quickly finds candidate rows via the index, then calculates a similarity score (0 to 1) based on how many trigrams the two words share. If that score passes a threshold (default 0.3), it's considered a match. The result: fast, typo-tolerant, human-friendly search - baked right into your database, no external search engine required.&lt;/p&gt;




&lt;h2&gt;
  
  
  Quick Reference Cheat Sheet
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Setup&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;EXTENSION&lt;/span&gt; &lt;span class="n"&gt;pg_trgm&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;table&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;GIN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;column&lt;/span&gt; &lt;span class="n"&gt;gin_trgm_ops&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Basic search&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="k"&gt;table&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;column&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="s1"&gt;'search term'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- With score&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;similarity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;column&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'search term'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="k"&gt;table&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;column&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="s1"&gt;'search term'&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Adjust threshold&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;pg_trgm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;similarity_threshold&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Word-level similarity&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;word_similarity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'search'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'long string with search inside'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Check index usage&lt;/span&gt;
&lt;span class="k"&gt;EXPLAIN&lt;/span&gt; &lt;span class="k"&gt;ANALYZE&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="k"&gt;table&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;column&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="s1"&gt;'term'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;p&gt;&lt;em&gt;Fuzzy search is one of those features that, once you add it to your application, users can't imagine living without. That tolerant, forgiving search experience - the one that doesn't punish you for a stray keystroke - is powered by these elegant little 3-character windows sliding across your data, quietly doing the math.&lt;/em&gt;&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
