<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Sheikh Abdullah</title>
    <description>The latest articles on DEV Community by Sheikh Abdullah (@sheikh566).</description>
    <link>https://dev.to/sheikh566</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F961257%2Fe7b0b9cc-f8ad-430a-9c9e-ac9a03056be5.png</url>
      <title>DEV Community: Sheikh Abdullah</title>
      <link>https://dev.to/sheikh566</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sheikh566"/>
    <language>en</language>
    <item>
      <title>Counting Views Is Harder Than It Looks</title>
      <dc:creator>Sheikh Abdullah</dc:creator>
      <pubDate>Sat, 30 May 2026 12:37:27 +0000</pubDate>
      <link>https://dev.to/sheikh566/counting-views-is-harder-than-it-looks-1k0a</link>
      <guid>https://dev.to/sheikh566/counting-views-is-harder-than-it-looks-1k0a</guid>
      <description>&lt;p&gt;A while back I picked up what looked like a tiny ticket: show how many times a post has been viewed. Easy, right? Add a column, increment it when someone opens the post, return the number. I figured I'd be done before lunch.&lt;/p&gt;

&lt;p&gt;I was wrong, and the reason I was wrong is what this post is about.&lt;/p&gt;

&lt;p&gt;The moment you start asking real questions, the simple counter falls apart. Should two visits from the same person count as one view or two? What happens when a thousand people open the same post at the same second? How do you show the count on every page load without hammering the database into the ground? And if the product team comes back next month asking for "trending posts" or "recommended for you," does your design have any chance of supporting that, or do you start from scratch?&lt;/p&gt;

&lt;p&gt;I spent a good amount of time researching this before I wrote the real implementation, and I went through four or five different designs that all sounded great until I poked at them. So this is the writeup I wish I had found when I started: the approaches I tried, what each one is actually good at, and where each one quietly breaks.&lt;/p&gt;




&lt;h3&gt;
  
  
  The Problem
&lt;/h3&gt;

&lt;p&gt;Before writing any code, I sat down and tried to pin down what "views" actually meant for us, because the answer changes everything about the design. Here is what I landed on.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Total views or unique views?&lt;/strong&gt; A total view counter just goes up every time the post is opened. Unique views means I have to know whether this specific person has already seen the post. Those are two very different problems. One is basically a number, the other means I have to remember who saw what.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How accurate does it need to be, and can I afford to lose a few?&lt;/strong&gt; For a "1.2M views" label nobody cares if the real number is off by a few hundred, but if views feed billing or a leaderboard, "close enough" stops being good enough. Related question: if the system crashes, can I drop a handful of views that hadn't been saved yet? For a view counter the honest answer is usually yes, and admitting that opens up some much faster designs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It's write-heavy but shown on every read.&lt;/strong&gt; This is the part that makes views tricky. Every open is a write, and the count is displayed on the page, so it's part of basically every read too. A design that's fine for a few views a minute can fall over at a few thousand a second, so I needed a rough idea of peak traffic up front. The trap is letting a cheap-write design quietly turn into an expensive-read design, or the other way around.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Will this data be used for anything else later?&lt;/strong&gt; Product teams love to ask for more once the basic feature ships: trending posts, recommendations, analytics. Some designs throw away all the detail and keep only a number, which is great until someone wants that detail back. I wanted to at least know which door I was closing.&lt;/p&gt;

&lt;p&gt;Once I had these written down, comparing the approaches stopped being about which one was "best" in the abstract and became about which one fit these answers. That framing is what the rest of this post is built around.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Journey: Approaches I Explored
&lt;/h2&gt;

&lt;p&gt;I didn't arrive at the final design in one go. I worked through a handful of approaches, and each one taught me something that pushed me toward the next. Here's the order I actually explored them in, roughly from "what I tried first because it was obvious" to "what I ended up shipping."&lt;/p&gt;

&lt;h3&gt;
  
  
  Counter-Based
&lt;/h3&gt;

&lt;h4&gt;
  
  
  A. Full Counter (the naive start)
&lt;/h4&gt;

&lt;p&gt;This is the version I had in my head when I picked up the ticket. Add an integer column to the posts table, call it &lt;code&gt;views&lt;/code&gt;, and run something like &lt;code&gt;UPDATE posts SET views = views + 1&lt;/code&gt; every time someone opens the post. To show the count, you just read the column. It's about as simple as a feature gets, and it costs you essentially nothing in storage: one number per post, forever.&lt;/p&gt;

&lt;p&gt;For low traffic, this is genuinely fine. I want to be clear about that, because there's a temptation to dismiss the simple thing, and plenty of real products run on exactly this. The problem isn't that it's wrong, it's that it stops being fine pretty quickly once the post gets popular.&lt;/p&gt;

&lt;p&gt;The first crack is concurrency. Imagine the increment as "read the current value, add one, write it back." Now picture two requests hitting the same post at the same instant. Both read 100, both compute 101, both write 101. Two views happened, but the counter only moved by one. That's the classic lost-update race, and it's sneaky because it only shows up under load, which is exactly when you care most. You can dodge it by leaning on the database's atomic increment instead of reading and writing in your own code, and that helps a lot. But even with that fixed, you're still doing a database write on every single view, and you still have no idea whether two visits came from the same person. No unique views, no detail, just a number going up. That last part is what eventually pushed me to keep looking.&lt;/p&gt;

&lt;p&gt;The whole storage model is a single column:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;posts
+-----------+------------------+
| id (PK)   | bigint           |
| title     | text             |
| views     | bigint  &amp;lt;-- this |
+-----------+------------------+
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And here's the lost-update race that bites you under load, two requests interleaving on that one column:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl83pehtdnxbxqh09obkd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl83pehtdnxbxqh09obkd.png" alt="full counter" width="664" height="512"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  B. Probabilistic Counter
&lt;/h4&gt;

&lt;p&gt;This one felt like a clever trick the first time I read about it. Instead of recording every view, you only record one with some probability &lt;code&gt;p&lt;/code&gt;, say one in ten. Then when you read the counter back, you multiply by &lt;code&gt;1/p&lt;/code&gt; to estimate the real total. Record roughly a tenth of the views, multiply the stored number by ten, and on average you land close to the truth.&lt;/p&gt;

&lt;p&gt;The win is obvious: you've cut your writes by ninety percent. For a post getting hammered, that's a huge relief on the write path. The catch is right there in the name. It's approximate. Any single post's count can be off, and the smaller the real number is, the worse the relative error looks, which is awkward because brand new posts are exactly where people scrutinize the count.&lt;/p&gt;

&lt;p&gt;That got me thinking about when approximate is actually acceptable, and the answer is "more often than you'd guess." Nobody reads "1.2M views" on a video and assumes it's accurate to the last digit. At that scale the label is a vibe, not a measurement, and shaving the write cost is well worth a bit of fuzz. But this still gave me zero unique-view information, and our requirements left the door open to needing real counts for smaller posts. So it went in the "good idea, wrong fit" pile.&lt;/p&gt;

&lt;p&gt;The storage is the same single column as before. The only change is in the application logic around the write and the read:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;P&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;  &lt;span class="c1"&gt;# only persist 1 in 10 views
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;on_view&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;post_id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;P&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;UPDATE posts SET views = views + 1 WHERE id = %s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;post_id&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_views&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;post_id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;stored&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query_one&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT views FROM posts WHERE id = %s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;post_id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stored&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;P&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# scale back up to estimate the real total
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  C. Sharded Counter
&lt;/h4&gt;

&lt;p&gt;The sharded counter keeps the accuracy of the full counter but spreads out the contention. Instead of one row per post, you keep N rows, the shards. When a view comes in, you pick a shard at random and increment that one. To get the total, you sum across all the shards for that post.&lt;/p&gt;

&lt;p&gt;The point of all this is to stop every request from fighting over the same single row. If a thousand views land at once and you have twenty shards, those writes scatter across twenty places instead of piling onto one, so they stop stepping on each other. And unlike the probabilistic approach, you don't give up any accuracy: every view is still counted exactly, just split across rows. Spreading write contention across shards like this is a well worn pattern, and it does what it says.&lt;/p&gt;

&lt;p&gt;Where it got less appealing for me was the reading side and, again, the uniqueness question. Every read now has to sum N rows instead of grabbing one value, so you've traded a write headache for a slightly heavier read, and you have to pick N up front. More to the point, sharding a counter makes a fast counter, not a smart one. It still only knows totals. I needed something that could tell me who viewed what, and no amount of sharding gets you there.&lt;/p&gt;

&lt;p&gt;With N shards per post, writes scatter across them and the read fans out and sums them back up:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7v484fsiiqabrw4ufbsj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7v484fsiiqabrw4ufbsj.png" alt="sharded counter" width="800" height="269"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  D. Redis Counter + Periodic DB Flush
&lt;/h4&gt;

&lt;p&gt;By this point I'd accepted that hitting the main database on every view was the real enemy, so the next idea was to stop doing that. Keep the counter in Redis, increment it there with a simple &lt;code&gt;INCR post:{id}:views&lt;/code&gt;, and let a background job flush the accumulated counts into the database every so often. Reads can serve from Redis too, which is fast.&lt;/p&gt;

&lt;p&gt;This is genuinely quick. In-memory increments are cheap, and the flush collapses thousands of individual view events into a handful of database writes per interval instead of one write per view. As a write-back cache, it's a big step up: you write to the fast layer constantly and to the durable layer occasionally.&lt;/p&gt;

&lt;p&gt;The cost is durability and moving parts. If Redis goes down between flushes and you haven't configured persistence or routed the increments through a durable queue, the views since the last flush are just gone. For a view counter that's often an acceptable trade, losing a few counts on a crash usually isn't the end of the world, and I'd already decided we could tolerate that. But you also now own Redis, the flush job, and all the failure cases in between. It's a great tool and I kept it in mind, yet on its own it still only counts. I was starting to notice a theme: every counter variant solved the write problem and dodged the unique-views problem. So I went looking at the other end of the spectrum.&lt;/p&gt;

&lt;p&gt;Every view hits the fast in-memory layer, and only the periodic job touches the database:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4rnrfn155z2ewsxkttwa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4rnrfn155z2ewsxkttwa.png" alt="redis counter plus periodic DB flush" width="605" height="399"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Table-Based
&lt;/h3&gt;

&lt;p&gt;The opposite of "store a number" is "store everything." Here you create a &lt;code&gt;post_view&lt;/code&gt; table and write a row for every view: &lt;code&gt;post_id&lt;/code&gt;, &lt;code&gt;user_id&lt;/code&gt;, and a &lt;code&gt;viewed_at&lt;/code&gt; timestamp. Want the total? Count the rows for that post. Want unique views? Count the distinct users. Want to know what someone read last Tuesday? It's all sitting right there.&lt;/p&gt;

&lt;p&gt;This is the first approach that actually answered the questions the counters couldn't. You get true unique views basically for free, and you get the raw material for everything the product team might ask for later: trending posts, recommendations, "people who read this also read that," analytics. Instead of throwing away detail, you're keeping all of it.&lt;/p&gt;

&lt;p&gt;The trouble is that "all of it" is a lot, and it never stops growing. Every view is an insert, so the write pressure is still there. Worse, the table grows forever, and the cost of reading grows right along with it. This is the part that bit me conceptually: counting rows is cheap when there are a thousand of them and painful when there are fifty million, and the view count is shown on every page load. So your read cost creeps up in direct proportion to your success. The more views a post gets, the slower it becomes to display how many views it got. You can soften that with indexes and cached aggregates, but at that point you're already reaching for a hybrid, which is exactly where I ended up.&lt;/p&gt;

&lt;p&gt;One row per view, with the detail you were missing before:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;post_view
+------------+-----------+---------------------+
| post_id    | user_id   | viewed_at           |
+------------+-----------+---------------------+
| 42         | 1001      | 2026-05-30 09:14:02 |
| 42         | 1001      | 2026-05-30 11:40:51 |
| 42         | 2087      | 2026-05-30 12:03:19 |
+------------+-----------+---------------------+
INDEX (post_id)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The reads are expressive, which is the whole appeal, but the second one gets slower as the table grows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- total views&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;post_view&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;post_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;post_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- unique views (the thing counters can't do)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;post_view&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;post_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;post_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Hybrid (where I landed)
&lt;/h3&gt;

&lt;p&gt;The hybrid is really just an honest admission that the counters and the table were each half right. So keep both. You have a &lt;code&gt;post_view&lt;/code&gt; table for the detail and a counter for the bulk. New views go into the table as rows, the way the table-based approach does. Then, on a schedule, a job sweeps up the old rows, deletes them, and folds their count into the counter. When you read the total, you return the counter plus a count of the recent rows still sitting in the table.&lt;/p&gt;

&lt;p&gt;What this buys you is a table that stays bounded instead of growing without end. You still get unique views for the recent window, because that detail is right there in the un-archived rows, and you get a correct running total because the old views were summed into the counter before they were deleted. The mental model that made it click for me was hot versus cold data. Recent views are hot: people query them, you want the detail, uniqueness matters. Old views are cold: nobody's asking which specific user viewed a post eight months ago, so you compress them down to a single number and move on.&lt;/p&gt;

&lt;p&gt;It isn't free. Inserts are still heavy, since every view is a row until the archive job comes along, so the write path needs the same care the table-based approach did. And you are deliberately giving something up: once old rows are archived into the counter, the granular history is gone, so you can't go back and recompute unique views for last year. For us that was an easy trade, because we cared about recent unique views and a correct lifetime total, not about forensic detail on ancient posts. That balance, bounded size, recent detail, accurate total, is what made it the one I actually shipped, and it's the design the rest of this post builds on.&lt;/p&gt;

&lt;p&gt;Two stores work together: a bounded &lt;code&gt;recent_views&lt;/code&gt; table (hot) and an archived counter on &lt;code&gt;posts&lt;/code&gt; (cold). The archive job drains old rows into the counter, and the read combines both sides:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqht443kbzl3dzj1dn456.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqht443kbzl3dzj1dn456.png" alt="hybrid approach" width="800" height="149"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- read: cold total plus hot detail&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;archived_views&lt;/span&gt;
       &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;recent_views&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;post_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_views&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;posts&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;post_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Comparison Table
&lt;/h2&gt;

&lt;p&gt;Before I get to what I shipped, here's everything side by side. This is the table I wish someone had handed me on day one.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Unique views&lt;/th&gt;
&lt;th&gt;Accuracy&lt;/th&gt;
&lt;th&gt;Write cost&lt;/th&gt;
&lt;th&gt;Read cost&lt;/th&gt;
&lt;th&gt;Space&lt;/th&gt;
&lt;th&gt;Durability&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Full counter&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;High*&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;O(1)&lt;/td&gt;
&lt;td&gt;O(1)&lt;/td&gt;
&lt;td&gt;DB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Probabilistic&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;Approx&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;O(1)&lt;/td&gt;
&lt;td&gt;O(1)&lt;/td&gt;
&lt;td&gt;DB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sharded&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;O(N shards)&lt;/td&gt;
&lt;td&gt;O(N)&lt;/td&gt;
&lt;td&gt;DB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Redis + flush&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Very low&lt;/td&gt;
&lt;td&gt;O(1)&lt;/td&gt;
&lt;td&gt;O(1)&lt;/td&gt;
&lt;td&gt;Risk&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Table&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;Exact&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Expensive&lt;/td&gt;
&lt;td&gt;O(views)&lt;/td&gt;
&lt;td&gt;DB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hybrid&lt;/td&gt;
&lt;td&gt;✅ (recent)&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;High insert&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;Bounded&lt;/td&gt;
&lt;td&gt;DB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;*High accuracy only once race conditions are handled.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  What I Chose &amp;amp; Why
&lt;/h2&gt;

&lt;p&gt;I went with the hybrid: a recent-views table plus a counter on the post, with a scheduled job that folds old rows into the counter and deletes them. The deciding factor was that we needed two different things at once. We had to show a total view count on every read, and we also had to know how many &lt;em&gt;unique&lt;/em&gt; people viewed a post recently, because that fed into the trending logic. A plain counter can't tell you anything about uniqueness, and a pure per-view table that keeps every row forever would have grown without bound. The hybrid let me keep the per-row detail where it actually mattered (the recent window) and collapse everything older into a single number.&lt;/p&gt;

&lt;p&gt;The other thing pushing me here was the read pattern. Views are write-heavy, but the count gets shown on basically every page load, so reads dominate in volume. With the hybrid, the read stays cheap: it's the counter plus a count of the small recent table, instead of counting over the full history. I should be upfront that we're an early-stage product without a lot of traffic yet, so none of this was about surviving a massive spike today. I picked the hybrid because it was the design I wouldn't have to rip out and rewrite once the table actually started to grow.&lt;/p&gt;

&lt;p&gt;What I gave up was granular history for old views. Once a row gets archived, I no longer know &lt;em&gt;who&lt;/em&gt; viewed a post six months ago or exactly &lt;em&gt;when&lt;/em&gt;, only that the total went up by some amount. We decided that was fine. Nobody asked for "unique viewers over all time," and if that requirement ever shows up, it's a separate analytics problem better solved by shipping raw events somewhere else, not by keeping every row in the hot path forever.&lt;/p&gt;

&lt;p&gt;If our requirements had been different, I'd have picked something simpler. If all anyone cared about was a rough total view count, and losing a handful of counts on a crash was acceptable, I'd have reached for a Redis counter with a periodic flush to the database. It's less code, it's faster, and it sidesteps the archive job entirely. We didn't do that because we genuinely needed the unique-view signal, and "rough" wasn't good enough for the feature that consumed it. Different requirements, different answer.&lt;/p&gt;

&lt;p&gt;I also briefly considered a probabilistic counter and a sharded counter. Both are good tools, but they solve problems we didn't have. Probabilistic counting trades accuracy for memory, and we weren't memory constrained. Sharded counters fight write contention on a single hot row, which only shows up at concurrency levels we are nowhere near. At our scale the added complexity simply wasn't worth it.&lt;/p&gt;




&lt;h2&gt;
  
  
  Implementation Highlights
&lt;/h2&gt;

&lt;p&gt;I already covered the read formula and the archive job, so I won't repeat them here. The piece worth a closer look is how the unique-view check works, including how we handle viewers who aren't logged in.&lt;/p&gt;

&lt;h3&gt;
  
  
  The unique-view check
&lt;/h3&gt;

&lt;p&gt;To avoid counting the same person twice in the recent window, I lean on a unique constraint instead of checking in application code. The table has a unique constraint on &lt;code&gt;(post_id, user_id)&lt;/code&gt;, and I insert with an "ignore if it already exists" clause.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;recent_views&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;post_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;viewed_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(:&lt;/span&gt;&lt;span class="n"&gt;post_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;CONFLICT&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;post_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;DO&lt;/span&gt; &lt;span class="k"&gt;NOTHING&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the row already exists, nothing happens and the unique count stays correct.&lt;/p&gt;

&lt;p&gt;Logged-out visitors are the interesting case, since they don't have a &lt;code&gt;user_id&lt;/code&gt; yet. We track them by &lt;code&gt;device_id&lt;/code&gt; instead, so an anonymous view still counts as a unique view for that device. When the person eventually logs in, we bind that &lt;code&gt;device_id&lt;/code&gt; to their account and rewrite the existing view records from &lt;code&gt;device_id&lt;/code&gt; to &lt;code&gt;user_id&lt;/code&gt;. That way a visitor who browsed before signing up doesn't get counted twice, once as a device and again as a user, and their pre-login history folds cleanly into their account.&lt;/p&gt;

&lt;p&gt;One thing to keep in mind: the archive job needs to run inside a single transaction. Count, increment the counter, and delete the old rows must all commit together, or a crash mid-job will double count or lose count.&lt;/p&gt;




&lt;h2&gt;
  
  
  Lessons Learned
&lt;/h2&gt;

&lt;p&gt;If you take nothing else from this writeup, take these. They are the things I would tell myself before picking up that "tiny" ticket again.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The smallest-sounding features can hide the hardest tradeoffs. "Show a view count" sounds like an afternoon of work, but the moment you ask what a view even is and how many you expect per second, you are designing a system, not adding a counter.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Decide on your accuracy and uniqueness requirements before you write a line of code. Total versus unique, and "exact" versus "close enough," change the data model. They are not details you can bolt on later.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In a views system the write path is where the cost lives, so design around it first. Reads you can cache or precompute, but every single page open is a write, and that firehose is what kills the naive approaches.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;There is no single best design here, only the one that fits your requirements. Every approach on my list is the right answer for somebody. The hybrid won for me because of the tradeoffs I cared about, not because it is objectively superior.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;So that is the whole trip: from &lt;code&gt;UPDATE views = views + 1&lt;/code&gt; that I thought would ship before lunch, through probabilistic counters, sharded counters, a Redis layer with periodic flushing, and a full per-view table, and finally to a hybrid that keeps recent views in a table while archiving the old ones into a running counter. Each stop taught me something, and almost none of them were wrong. They were just answers to questions I had not asked yet. I did not land on the hybrid because it is the smartest option on paper, but because it matched what we actually needed: recent unique views, a table that stays a reasonable size, and room for whatever the product team dreams up next.&lt;/p&gt;

</description>
      <category>backend</category>
      <category>database</category>
      <category>architecture</category>
      <category>systemdesign</category>
    </item>
    <item>
      <title>Beyond LIKE: Mastering PostgreSQL's Built-in Full-Text Search</title>
      <dc:creator>Sheikh Abdullah</dc:creator>
      <pubDate>Sat, 17 May 2025 17:56:57 +0000</pubDate>
      <link>https://dev.to/sheikh566/beyond-like-mastering-postgresqls-built-in-full-text-search-4h35</link>
      <guid>https://dev.to/sheikh566/beyond-like-mastering-postgresqls-built-in-full-text-search-4h35</guid>
      <description>&lt;p&gt;A few months ago, my manager assigned me to a project that required improving its search functionality. The team initially considered Elasticsearch as the primary solution, though they were open to exploring alternatives. I researched the topic and discovered PostgreSQL's full-text search. Since our data was already stored in PostgreSQL, I compared the two tools. The application I worked on served private users and had strict privacy regulations, with complex conditions determining who could view or edit specific data within the company. &lt;/p&gt;

&lt;p&gt;These intricate queries proved challenging for Elasticsearch to handle effectively, while PostgreSQL's full-text search, being built on top of SQL syntax, required no extra work to handle this. It also integrates seamlessly with our existing database, requires no additional infrastructure, and has no licensing costs, unlike Elasticsearch. Additionally, PostgreSQL efficiently manages relational data, while Elasticsearch can be less flexible outside its ecosystem and demands more system resources.&lt;/p&gt;

&lt;p&gt;But what the heck is full-text search? It’s a way to dig through text data fast, finding matches based on words or phrases, not just exact strings. &lt;br&gt;
Think of it like googling something, but for your own database. &lt;br&gt;
Instead of saying "find me this exact sentence," you can ask "show me anything with ‘book’ and ‘read’ in it, "and it'll pull up stuff like "I read a book" or "books for reading." It ignores stop words like "the", "and", "I", etc, to pull more relevant results. In postgres, it uses tricks like turning text into searchable chunks and matching them up, so you’re not stuck with slow, basic searches that miss the point. It’s all about making search feel natural, not robotic.&lt;/p&gt;

&lt;p&gt;PostgreSQL’s full-text search is powered by two core concepts: tsvector and tsquery. A &lt;code&gt;tsvector&lt;/code&gt; is a data type that stores pre-processed, searchable data. Think of it as a transformed, indexed version of your text. When you run a search, PostgreSQL compares your search terms with these indexed values rather than the raw text. This allows for faster and more efficient lookups. A &lt;code&gt;tsquery&lt;/code&gt;, on the other hand, is essentially a search query. It represents the words and phrases you're looking for, possibly enriched with operators to define how terms should be combined or modified. Examples include using &lt;code&gt;&amp;amp;&lt;/code&gt; for logical AND, &lt;code&gt;|&lt;/code&gt; for OR, and &lt;code&gt;!&lt;/code&gt; for NOT, giving you the power to craft complex search conditions.&lt;br&gt;
Here’s a quick example to illustrate how these work together in practice: &lt;br&gt;
Say you have a table &lt;code&gt;documents&lt;/code&gt; with a column named &lt;code&gt;content&lt;/code&gt;, and you want to know which documents contain both "book" and "read" in the content.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt; 
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'book &amp;amp; read'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Okay, so there is a lot of stuff going on after the &lt;code&gt;WHERE&lt;/code&gt; keyword. Let's break it down:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;to_tsvector('english', content)&lt;/code&gt;: This function converts the &lt;code&gt;content&lt;/code&gt; column into a tsvector. The &lt;code&gt;english&lt;/code&gt; argument specifies the language of the text, which is important for language-specific stemming and stopword lists.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;to_tsquery('english', 'book &amp;amp; read')&lt;/code&gt;: This function converts the search query into a tsquery. Again, the &lt;code&gt;english&lt;/code&gt; argument specifies the language of the text.&lt;/li&gt;
&lt;li&gt;Finally, the &lt;code&gt;@@&lt;/code&gt; operator checks if the tsvector matches the tsquery and return TRUE or FALSE.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But what if we want to search for documents containing either "book" or "read" in the content? Then we can use the &lt;code&gt;|&lt;/code&gt; operator like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt; 
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'book | read'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you want to search for documents containing "book" and "read" but not "fantasy", you can use the &lt;code&gt;!&lt;/code&gt; operator like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'book &amp;amp; read &amp;amp; !fantasy'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can craft more complex queries by grouping terms with parentheses. For example, to find documents which must contain "book" and either "read" or "fantasy", you can use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'book &amp;amp; (read | fantasy)'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;While all these operators are great for building "Advanced Search" features (like Gmail or Wikipedia) in your application, but most of the time, users just want a simple search.&lt;br&gt;
For that, you can use the &lt;code&gt;plainto_tsquery&lt;/code&gt; function. It's similar to &lt;code&gt;to_tsquery&lt;/code&gt; but doesn't interpret special characters as operators.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt; 
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;plainto_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'book read'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In &lt;code&gt;plainto_tsquery&lt;/code&gt;, each space is treated as an &lt;code&gt;&amp;amp;&lt;/code&gt; operator. So, the query above will return documents containing both "book" and "read".&lt;/p&gt;

&lt;h2&gt;
  
  
  Searching Across Multiple Columns
&lt;/h2&gt;

&lt;p&gt;Often, you'll want to search across multiple columns in a table. For example, you might want to search both the &lt;code&gt;title&lt;/code&gt; and &lt;code&gt;content&lt;/code&gt; of documents. PostgreSQL makes this easy through vector concatenation.&lt;/p&gt;

&lt;p&gt;When searching across multiple columns, you can concatenate tsvectors using the &lt;code&gt;||&lt;/code&gt; operator. This combines the lexemes from different columns into a single tsvector that can be searched with a tsquery.&lt;/p&gt;

&lt;p&gt;Here's how you might search across both the &lt;code&gt;title&lt;/code&gt; and &lt;code&gt;content&lt;/code&gt; columns of a &lt;code&gt;documents&lt;/code&gt; table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt; 
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
  &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'test'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this example, &lt;code&gt;test&lt;/code&gt; is the phrase you're searching for. The &lt;code&gt;||&lt;/code&gt; operator concatenates the &lt;code&gt;title&lt;/code&gt; and &lt;code&gt;content&lt;/code&gt; columns into a single tsvector, which is then searched using the &lt;code&gt;@@&lt;/code&gt; operator.&lt;br&gt;
You might be thinking, why not just do the string concatenation in the query like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="err"&gt;❌&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt; 
&lt;span class="err"&gt;❌&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="s1"&gt;' '&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="err"&gt;❌&lt;/span&gt;   &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'test'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This approach is not recommended because the parser treats the title and content as one continuous string, which can lead to unexpected results. &lt;br&gt;
Moreover, column-level weighting (which we will discuss later) couldn't be applied.&lt;/p&gt;
&lt;h2&gt;
  
  
  Speeding Up The Searches
&lt;/h2&gt;

&lt;p&gt;While these examples work, calling to_tsvector() on every query is inefficient. &lt;br&gt;
Each time you run the query, PostgreSQL has to process and transform the entire text content into a tsvector. &lt;br&gt;
For large tables or frequent searches, this can significantly impact performance.&lt;br&gt;
A better approach is to store the tsvector in a separate column and keep it updated &lt;br&gt;
using database triggers or your backend code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="k"&gt;ADD&lt;/span&gt; &lt;span class="k"&gt;COLUMN&lt;/span&gt; &lt;span class="nv"&gt;"vector"&lt;/span&gt; &lt;span class="n"&gt;tsvector&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt; 
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="nv"&gt;"vector"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To further speed up the searches, you can create a GIN index on the &lt;code&gt;vector&lt;/code&gt; column.&lt;br&gt;
GIN index is a specialized index type that efficiently handles queries on composite data types, such as arrays, jsonb and tsvector.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_vector&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;GIN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you want to search on columns spread across multiple tables, you can a materialized view to store the concatenated tsvector.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;MATERIALIZED&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;documents_search&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; 
    &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;authors&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;vector&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;authors&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;author_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;authors&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can also create a GIN index on the &lt;code&gt;vector&lt;/code&gt; column of the materialized view.&lt;/p&gt;

&lt;h2&gt;
  
  
  Setting Weights
&lt;/h2&gt;

&lt;p&gt;PostgreSQL's Full-Text Search allows you to assign different weights to different parts of your document,&lt;br&gt;
influencing how search results are ranked. This is achieved using the &lt;code&gt;setweight&lt;/code&gt; function.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;setweight&lt;/code&gt; function takes two arguments: a &lt;code&gt;tsvector&lt;/code&gt; and a weight label ('A', 'B', 'C', or 'D'). &lt;br&gt;
'A' is the highest weight, and 'D' is the lowest. By default, lexemes are assigned weight 'D'.&lt;/p&gt;

&lt;p&gt;Here's how you can use &lt;code&gt;setweight&lt;/code&gt; to prioritize matches in the &lt;code&gt;title&lt;/code&gt; over matches in the &lt;code&gt;content&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="nv"&gt;"vector"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;setweight&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;'A'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;setweight&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;'B'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Setting weights is useless if you don't need to rank the results. There are two ranking functions in PostgreSQL:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;ts_rank&lt;/code&gt;: Basic ranking that considers the number of matching lexemes and their weights&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ts_rank_cd&lt;/code&gt;: Also considers the distance between matching lexemes (coverage density)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here's how to use them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;vector&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'test'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;ts_rank_cd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'test'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This query will return documents that match the query "test" and rank them by how closely they match the query.&lt;/p&gt;

&lt;p&gt;Earlier we talked about why you shouldn't concatenate columns (string concatenation) before creating the tsvector. Because if you do that, &lt;br&gt;
you won't be able to set weights to individual columns.&lt;/p&gt;
&lt;h2&gt;
  
  
  Highlighting Matches
&lt;/h2&gt;

&lt;p&gt;PostgreSQL provides the &lt;code&gt;ts_headline&lt;/code&gt; function to highlight matching terms in the search results. &lt;br&gt;
This function takes the original text and a tsquery as input, and returns the text with matching terms highlighted using &lt;code&gt;&amp;lt;b&amp;gt;&amp;lt;/b&amp;gt;&lt;/code&gt; tags.&lt;/p&gt;

&lt;p&gt;Here's the basic syntax:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;cte&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'test'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;ts_headline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;query&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;title_match&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;ts_headline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;query&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;content_match&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cte&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;vector&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The result will look like this:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;title_match&lt;/th&gt;
&lt;th&gt;content_match&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;A &amp;lt;b&amp;gt;test&amp;lt;/b&amp;gt; document&lt;/td&gt;
&lt;td&gt;This is a &amp;lt;b&amp;gt;test&amp;lt;/b&amp;gt; document that contains the word &amp;lt;b&amp;gt;test&amp;lt;/b&amp;gt; multiple times to &amp;lt;b&amp;gt;test&amp;lt;/b&amp;gt; the highlighting feature&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;A title&lt;/td&gt;
&lt;td&gt;The second document also contains a &amp;lt;b&amp;gt;test&amp;lt;/b&amp;gt; word&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If you want replace the default &lt;code&gt;&amp;lt;b&amp;gt;&amp;lt;/b&amp;gt;&lt;/code&gt; tags with something else, you can define them in the fourth argument of the &lt;code&gt;ts_headline&lt;/code&gt; function.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;ts_headline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s1"&gt;'StartSel={, StopSel=}'&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here we replaced &lt;code&gt;&amp;lt;b&amp;gt;&lt;/code&gt; with &lt;code&gt;{&lt;/code&gt; and &lt;code&gt;&amp;lt;/b&amp;gt;&lt;/code&gt; with &lt;code&gt;}&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;PostgreSQL's full-text search feature is simple, flexible and powerful. It can fulfill most of your search needs.&lt;br&gt;
Though every tool has limitations, PostgreSQL's FTS is no exception. There are scenarios where it might not be the optimal choice, and dedicated search engines like Elasticsearch, OpenSearch, or Solr could be more suitable:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Extremely Large Datasets and High Query Throughput:&lt;/strong&gt; For applications dealing with terabytes or petabytes of data, or requiring tens of thousands of queries per second, distributed search engines are designed for horizontal scalability and can offer better performance. PostgreSQL FTS operates within the context of the database server, which might become a bottleneck at such scales for search-heavy workloads.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Complex Relevancy Tuning and Machine Learning:&lt;/strong&gt; If your application demands highly sophisticated ranking algorithms, personalized search results, "learning to rank" (LTR) models, or advanced features like "more like this" that go beyond PostgreSQL's &lt;code&gt;ts_rank&lt;/code&gt; and &lt;code&gt;ts_rank_cd&lt;/code&gt; capabilities, dedicated search platforms provide more extensive tools and flexibility.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Requirement for a Decoupled Search Service:&lt;/strong&gt; Architecturally, you might prefer to decouple your search infrastructure from your primary database. This allows for independent scaling, resource allocation, different maintenance windows, and potentially different technology choices for the search component. Using a dedicated search service facilitates this separation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In these situations, the additional complexity and operational overhead of managing a separate search system can be justified by the advanced capabilities and scalability they provide. However, for a vast majority of applications, PostgreSQL's built-in FTS offers an excellent balance of power, simplicity, and integration directly within your database.&lt;/p&gt;

&lt;h2&gt;
  
  
  Demo Application
&lt;/h2&gt;

&lt;p&gt;Wikipedia also uses a full-text search engine to power its Advanced Search feature.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F38hbz65f066pppaftrxl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F38hbz65f066pppaftrxl.png" alt="Wikipedia's Advanced Search feature" width="800" height="753"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I created a simpler version to showcase PostgreSQL's full-text search capabilities in a real-life application. Check out the deployed demo: &lt;a href="https://tg1r1rgrpd.execute-api.ap-south-1.amazonaws.com/dev/search" rel="noopener noreferrer"&gt;Wikipedia Advanced Search&lt;/a&gt;. You can view the source code on GitHub: &lt;a href="https://github.com/Sheikh566/wikipedia-advanced-search" rel="noopener noreferrer"&gt;Sheikh566/wikipedia-advanced-search&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>backend</category>
      <category>elasticsearch</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
