DEV Community

shantanu mahakale

Quick Recap: Elasticsearch & Lucene

Elasticsearch is a distributed search and analytics engine, built on top of Apache Lucene. It is commonly used for full-text search, log analytics, metrics, and real-time data exploration.

Lucene is the core indexing and search library behind Elasticsearch. In short, Elasticsearch is Lucene made easy to use and scalable.


What is Lucene?

Lucene is a low-level search/indexing library written in Java.

It provides:

  • Full-text search
  • Tokenization & analyzers
  • Inverted index mechanism
  • Scoring & relevance ranking

👉 Lucene is NOT a server or a database; it's just a library.


What is Elasticsearch?

Elasticsearch is an open-source distributed search engine built on top of Lucene with:

  • REST API-based querying
  • Distributed storage (shards & replicas)
  • Full-text search + filtering + aggregations
  • Scalable architecture for big data
Client → Elasticsearch → Lucene Index → Results

Key Concepts in Elasticsearch

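| Concept | Meaning |
|---|---|
| Index | Logical collection of documents (loosely, like a database or table) |
| Document | JSON object (like a row) |
| Field | Key-value pair inside a document |
| Shard | Partition of an index |
| Replica | Copy of a shard (for fault tolerance) |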
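
To make the idea of a shard as a partition more concrete, here is a minimal Python sketch of how a document might be assigned to a primary shard by hashing its routing key (by default the document ID) and taking the result modulo the number of primary shards. The function names `pick_shard` and `place_documents` are hypothetical, and `zlib.crc32` is only a stand-in: Elasticsearch uses its own hash function and a somewhat more involved formula.

```python
import zlib

def pick_shard(routing_key: str, num_primary_shards: int) -> int:
    """Map a routing key (by default the document ID) to a primary shard number."""
    # Assumption: crc32 stands in for the real hash function used by Elasticsearch.
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards

def place_documents(doc_ids, num_primary_shards):
    """Group document IDs by the shard they would land on."""
    shards = {n: [] for n in range(num_primary_shards)}
    for doc_id in doc_ids:
        shards[pick_shard(doc_id, num_primary_shards)].append(doc_id)
    return shards

# Hypothetical document IDs, spread over 3 primary shards.
layout = place_documents([f"user-{i}" for i in range(10)], num_primary_shards=3)
for shard, ids in layout.items():
    print(f"shard {shard}: {ids}")
```

Because the shard is derived from the key alone, the same document always maps to the same shard. This is also why the number of primary shards cannot simply be changed after an index is created: the modulo result would change for existing documents.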

Lucene: Inverted Index (Core Idea)

Lucene splits text into tokens and builds a reverse lookup table that maps each token to the documents containing it:

Text (doc1): "Java is great. Java streams are powerful."
Inverted Index:
"java" → [doc1]   (listed once, with term frequency 2)
"is" → [doc1]
"streams" → [doc1]

👉 Looking up "java" in the index is much faster than scanning the full text of every document.
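
The inverted index idea can be sketched in a few lines of Python. This is a toy illustration, not how Lucene stores its index on disk: `tokenize` is a simplified stand-in for a real analyzer, and `doc2` is a made-up second document added so that posting lists differ between terms.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (a simplified analyzer)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_inverted_index(docs):
    """Map each token to {doc_id: term_frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token][doc_id] = index[token].get(doc_id, 0) + 1
    return index

docs = {
    "doc1": "Java is great. Java streams are powerful.",
    "doc2": "Python is great for scripting.",  # hypothetical second document
}
index = build_inverted_index(docs)

print(sorted(index["java"]))   # ['doc1']
print(index["java"]["doc1"])   # 2
print(sorted(index["great"]))  # ['doc1', 'doc2']
```

Each document appears at most once per term; repeated occurrences only raise the term frequency, which later feeds into relevance scoring.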


Elasticsearch Search Flow

  1. Text → Analyzer → Tokens
  2. Tokens are stored in the inverted index (Lucene)
  3. The query text is analyzed into tokens
  4. Matching documents are found and scored
  5. Results are sorted by score and returned
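
The five steps above can be sketched as a tiny in-memory search index. Everything here is illustrative: `TinySearchIndex` and `analyze` are hypothetical names, the sample documents are made up, and the score is a plain sum of term frequencies, whereas Elasticsearch scores with BM25 by default.

```python
import re
from collections import Counter, defaultdict

def analyze(text):
    """Simplified analyzer: lowercase, then split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

class TinySearchIndex:
    def __init__(self):
        # token -> Counter({doc_id: term frequency})
        self.postings = defaultdict(Counter)

    def add(self, doc_id, text):
        # Steps 1-2: analyze the text and store its tokens in the inverted index.
        for token in analyze(text):
            self.postings[token][doc_id] += 1

    def search(self, query):
        scores = Counter()
        # Step 3: the query goes through the same analyzer.
        for token in analyze(query):
            # Step 4: every document containing the token gains score.
            for doc_id, tf in self.postings.get(token, {}).items():
                scores[doc_id] += tf
        # Step 5: highest score first; ties are broken by document ID.
        return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

idx = TinySearchIndex()
idx.add("1", "Java developer with Java streams experience")
idx.add("2", "Python developer")
idx.add("3", "Project manager")

print(idx.search("Java Developer"))  # [('1', 3), ('2', 1)]
```

Document 3 never appears in the result because it shares no token with the query, and the query matches regardless of case because both sides pass through the same analyzer.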

Example Query (Search API)

GET users/_search
{
  "query": {
    "match": { "name": "shantanu" }
  }
}
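
For readers calling the Search API from code rather than a console, the following sketch builds the same request with Python's standard library only. It assumes an unsecured local node at `http://localhost:9200` (real clusters usually require HTTPS and authentication), and `build_search_request` is a hypothetical helper. The request is built but not sent, so the snippet runs without a cluster.

```python
import json
import urllib.request

def build_search_request(base_url, index, field, text):
    """Build (but do not send) a _search request with a match query."""
    body = {"query": {"match": {field: text}}}
    return urllib.request.Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",  # _search accepts POST as well as GET with a body
    )

# Assumption: an unsecured development node on localhost.
request = build_search_request("http://localhost:9200", "users", "name", "shantanu")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To actually send it (requires a running cluster):
# with urllib.request.urlopen(request) as response:
#     hits = json.load(response)["hits"]["hits"]
```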

Full Text Search vs Exact Match

| Search Type | Example | When Used |
|---|---|---|
| match | "java developer" | Full-text search |
| term | "400" | Exact value match |
| wildcard | "dev*" | Pattern-based search |
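
The difference between the three query types mostly comes down to whether the query is analyzed. The sketch below mimics that behaviour on single values in plain Python. The helper names are hypothetical, `analyze` is a simplified analyzer, and `fnmatchcase` is only an approximation of wildcard matching (it also accepts `[...]` character classes, which the Elasticsearch wildcard query does not).

```python
import re
from fnmatch import fnmatchcase

def analyze(text):
    """Simplified analyzer: lowercase, then split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def match(field_value, query_text):
    """match: analyze both sides; any shared token is a hit (OR semantics by default)."""
    return bool(set(analyze(field_value)) & set(analyze(query_text)))

def term(stored_value, query_value):
    """term: compare the query with the stored value exactly, without analysis."""
    return stored_value == query_value

def wildcard(stored_value, pattern):
    """wildcard: * matches any run of characters, ? matches exactly one."""
    return fnmatchcase(stored_value, pattern)

print(match("Senior Java Developer", "java developer"))  # True
print(term("400", "400"))                                # True
print(term("Java Developer", "java developer"))          # False
print(wildcard("developer", "dev*"))                     # True
print(wildcard("backend", "dev*"))                       # False
```

This is why `term` is usually a better fit for IDs, status codes and other values stored verbatim, while `match` is the natural choice for free text.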

Aggregations (Analytics Support)

GET logs/_search
{
  "aggs": {
    "errors_by_status": {
      "terms": { "field": "status_code" }
    }
  }
}

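👉 A terms aggregation works like SQL GROUP BY with COUNT(*): it buckets documents by field value and counts the documents in each bucket.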
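
As a rough illustration of what a terms aggregation computes, here is a plain-Python equivalent over a handful of made-up log documents. `terms_aggregation` is a hypothetical helper; it mirrors the `key` / `doc_count` bucket shape and the default ordering (largest bucket first), but it runs on a single in-memory list rather than across shards.

```python
from collections import Counter

def terms_aggregation(docs, field, size=10):
    """Count documents per distinct field value, largest bucket first."""
    counts = Counter(doc[field] for doc in docs if field in doc)
    return [{"key": key, "doc_count": count} for key, count in counts.most_common(size)]

# Hypothetical log documents.
logs = [
    {"status_code": 200},
    {"status_code": 500},
    {"status_code": 200},
    {"status_code": 404},
    {"status_code": 200},
    {"status_code": 500},
]

for bucket in terms_aggregation(logs, "status_code"):
    print(bucket)
# {'key': 200, 'doc_count': 3}
# {'key': 500, 'doc_count': 2}
# {'key': 404, 'doc_count': 1}
```

The SQL counterpart would be along the lines of `SELECT status_code, COUNT(*) FROM logs GROUP BY status_code ORDER BY COUNT(*) DESC`.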


When to Use Elasticsearch?

✔ Full-text search

✔ Log analytics (ELK stack)

✔ Real-time dashboards

✔ Autocomplete & suggestions

✔ Distributed scalability

Examples: e-commerce search, Stack Overflow search, Kibana dashboards, log analysis.


When NOT to Use Elasticsearch?

❌ As a replacement for a relational DB

❌ For complex transactions

❌ For small datasets

❌ When ACID guarantees are critical

In these cases, use a relational database such as PostgreSQL or MySQL instead.


Summary Table

| Feature | Elasticsearch | Lucene |
|---|---|---|
| Type | Distributed search engine | Search library |
| Interface | REST API | Java library |
| Storage | JSON documents | Inverted index |
| Scale | Distributed (clusters) | Single machine |
| Use Case | Search & analytics | Core indexing logic |
