DEV Community

Cover image for Search optimized database (full-text search) for system design interview
Daniel Lee
Daniel Lee

Posted on • Edited on

1

Search optimized database (full-text search) for system design interview

Traditional databases run a table scan to find a search term in the database. This is slow and efficient if a table stores a large dataset (1000+ rows). To improve, "search optimized database" can be used.

It uses indexing, tokenization and stemming to make search queries fast and efficient (by building "inverted indexes"):

  • Tokenization is a process of reducing words to their root form. For example, "running" and "runs" can be reduced to "run".
  • Stemming is a process of breaking a piece of task into individual words. It helps mapping words to documents containing those words in the inverted indexes.

Something to note is the underlying data structure of search mechanisms of search optimized database - Inverted Indexes.

It is a data structure that maps words to the documents that contain them. For example:

{
  "word1": [doc1,doc2,doc3],
  "word2": [doc2,doc3,doc6]
}
Enter fullscreen mode Exit fullscreen mode

Most search optimized database also support "Fuzzy Search" out of box as a configuration. Fuzzy Search works by leveraging "edit distance calculation" technique, which measures how many letters to be changed/added/removed to transform one word into another. Thus, results with minor misspellings or discrepancy relative to the search term can be returned efficiently in case of human errors.

One of the popular search optimized database is "ElasticSearch".


Reference

https://www.hellointerview.com/learn/system-design/in-a-hurry/key-technologies

Hostinger image

Get n8n VPS hosting 3x cheaper than a cloud solution

Get fast, easy, secure n8n VPS hosting from $4.99/mo at Hostinger. Automate any workflow using a pre-installed n8n application and no-code customization.

Start now

Top comments (0)

Billboard image

Try REST API Generation for MS SQL Server.

DreamFactory generates live REST APIs from database schemas with standardized endpoints for tables, views, and procedures in OpenAPI format. We support on-prem deployment with firewall security and include RBAC for secure, granular security controls.

See more!

👋 Kindness is contagious

Please leave a ❤️ or a friendly comment on this post if you found it helpful!

Okay