I want to share my experience of using the native "trigram indexes" of Postgres to achieve a robust "typeahead/autocomplete/fuzzy search" functionality. I am using Algolia as an example, as I have noticed that a lot of people are using it precisely for that. Yet, I believe that you could do perfectly fine by using only Postgres in many situations.
What are trigram indexes? They are a special type of index in Postgres that is available by default; however, they have to be explicitly enabled. Here it is an extract from the official docs:
"The pg_trgm module provides functions and operators for determining the similarity of alphanumeric text based on trigram matching, as well as index operator classes that support fast searching for similar strings.", "A trigram is a group of three consecutive characters taken from a string."
For example, the set of trigrams in the string "cat" is " c", " ca", "cat", and "at ". The set of trigrams in the string "foo|bar" is " f", " fo", "foo", "oo ", " b", " ba", "bar", and "ar ".
Now let's follow what's necessary to build our autocomplete engine.
1) Enable the Postgres extension
CREATE EXTENSION IF NOT EXISTS pg_trgm
2) Index the columns you are going to query
Once you have enabled pg_trgm
you have to index varchar
and text
columns to make use of it. e.g. CREATE INDEX index_fuzzy_searches_on_name_trgm ON fuzzy_searches USING gin(name gin_trgm_ops)
. Notice USING gin(name gin_trgm_ops)
Now we have very fast LIKE
queries with leading and trailing wild cards (%
).
3) Prepare the query
The next step is splitting the user query by spaces and adding a separate WHERE clause per name. For example, if a user searches for "sales cloud". We'd end up with a query like SELECT * FROM mytable WHERE name LIKE '%sales%' AND name LIKE '%cloud%'
A simplified implementation in Ruby on Rails could look like (in this case ServiceSearch
would be MATERIALIZED VIEW that I will explain down the post):
relation = ServiceSearch.popular_first.limit(limit).includes(:service)
query.split("\s").each do |word|
relation = relation.where("content LIKE ?", "%#{word}%")
end
relation.map(&:service)
4) Search for different types of records simultaneously
One of the use cases of using a dedicated full-text search engine is querying all your records simultaneously. That could be achieved through a MATERIALIZED VIEW. Moreover, you can easily give priority to some records. For example, showing Products before Categories. I am managing my views by the awesome scenic gem, but you don't need to.
Here it is the body of an exemplary materialized view:
SELECT
id AS searchable_id,
'Service' AS searchable_type,
1 AS priority,
post_services_count AS weight,
LOWER(name) AS name
FROM services
WHERE state IN ('approved', 'closed')
UNION
SELECT
categories.id AS searchable_id,
'Category' AS searchable_type,
2 AS priority,
COUNT(*) AS weight,
LOWER(name) AS name
FROM categories
JOIN categorizations cats ON cats.category_id = categories.id
WHERE state = 'approved'
GROUP BY categories.id, categories.name
By having a view like that one, we can query multiple records with something as simple as:
SELECT * FROM fuzzy_searches WHERE name LIKE '%test%' ORDER BY priority, weight DESC
Real-world examples
One of the top competitors of SaaSHub - alternativeto.net is using Algolia for their search autocomplete. I'd suggest testing their search and SaaSHub's. You won't find many differences in the UX or speed.
In the end, if you need a simple autocomplete functionality for your project, and you are using Postgres, my suggestion is to consider its trigram indexes + materialized views. Reaching out to external services may not be necessary and could be an overkill. However, I'm sure that there will be many cases in which it will make sense to pay for a professional SaaS product.
p.s. you can find more Algolia alternatives on SaaSHub π.
Top comments (2)
Usually such ngram-search approach has 3 limitations:
I'm glad you wrote this up, Stan.
We discussed Algolia as a search option for a RoR app a while ago, but I couldn't see any benefits over PG search, either using raw queries (I'm a fan of MiniSQL for that) of the pg_search gem.