Elasticsearch is a distributed search and analytics engine, built on top of Apache Lucene. It is commonly used for full-text search, log analytics, metrics, and real-time data exploration.
Lucene is the core indexing + search library behind Elasticsearch: Elasticsearch is Lucene made easy + scalable.
What is Lucene?
Lucene is a low-level search/indexing library written in Java.
It provides:
- Full-text search
- Tokenization & analyzers
- Inverted index mechanism
- Scoring & relevance ranking
Note: Lucene is NOT a server or a database; it is just a library.
What is Elasticsearch?
Elasticsearch is an open-source distributed search engine built on top of Lucene with:
- REST API-based querying
- Distributed storage (shards & replicas)
- Full-text search + filtering + aggregations
- Scalable architecture for big data
Client → Elasticsearch → Lucene Index → Results
Key Concepts in Elasticsearch
| Concept | Meaning |
|---|---|
| Index | Collection of documents (roughly like a database table) |
| Document | JSON object (like a row) |
| Field | Key-value inside document |
| Shard | Partition of an index |
| Replica | Copy of shard (for fault tolerance) |
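To make the shard row more concrete, here is a minimal Python sketch of how a document ID could be mapped to a primary shard. It assumes the simplified routing rule `hash(routing value) % number of primary shards`; Elasticsearch uses its own Murmur3-based hash and a somewhat more involved formula, so `zlib.crc32` is only a stand-in, and `NUM_PRIMARY_SHARDS` and `pick_shard` are hypothetical names.

```python
import zlib

NUM_PRIMARY_SHARDS = 3  # hypothetical value; chosen when the index is created


def pick_shard(doc_id: str, num_shards: int = NUM_PRIMARY_SHARDS) -> int:
    """Map a document ID to a primary shard number.

    zlib.crc32 is a stand-in for Elasticsearch's real routing hash.
    """
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


# The same ID always lands on the same shard.
for doc_id in ["user-1", "user-2", "user-3", "user-4"]:
    print(doc_id, "-> shard", pick_shard(doc_id))
```

Because the mapping is deterministic, a lookup by ID can go straight to the shard that holds the document. Each replica is a copy of such a shard kept on another node.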
Lucene: Inverted Index (Core Idea)
Lucene converts text into tokens and builds a reverse lookup table:
Text: "Java is great. Java streams are powerful."
Inverted Index:
"java" → [doc1] (appears twice in doc1)
"is" → [doc1]
"streams" → [doc1]
Looking up "java" in the index is much faster than scanning the entire text of every document.
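The idea can be sketched in a few lines of Python. This is a toy model, not Lucene's actual data structure: the regex tokenizer stands in for a real analyzer, `doc2` is a hypothetical second document added to make the lookup more interesting, and the index stores a term frequency per document.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens (crude analyzer stand-in)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs: dict[str, str]) -> dict[str, dict[str, int]]:
    """Return a mapping of term -> {doc_id: term frequency}."""
    index: dict[str, dict[str, int]] = defaultdict(dict)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token][doc_id] = index[token].get(doc_id, 0) + 1
    return dict(index)


docs = {
    "doc1": "Java is great. Java streams are powerful.",
    "doc2": "Python is great too.",  # hypothetical second document
}
index = build_inverted_index(docs)
print(index["java"])   # {'doc1': 2}
print(index["great"])  # {'doc1': 1, 'doc2': 1}
```

A search for a term is then a single dictionary lookup, regardless of how long the original documents are.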
Elasticsearch Search Flow
- Text → Analyzer → Tokens
- Tokens stored in inverted index (Lucene)
- Query converted into tokens
- Relevant documents matched + scored
- Sorted + returned as results
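The steps above can be sketched end to end in Python. This is a simplified illustration under stated assumptions: the analyzer is a plain lowercase word splitter, the sample documents are hypothetical, and the score is just the sum of term frequencies, whereas Elasticsearch scores with BM25 by default.

```python
import re
from collections import Counter, defaultdict


def analyze(text: str) -> list[str]:
    """Turn text into lowercase tokens (used for both documents and queries)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def index_documents(docs: dict[str, str]) -> dict[str, Counter]:
    """Store term -> Counter({doc_id: term frequency})."""
    inverted: dict[str, Counter] = defaultdict(Counter)
    for doc_id, text in docs.items():
        for token in analyze(text):
            inverted[token][doc_id] += 1
    return inverted


def search(inverted: dict[str, Counter], query: str) -> list[tuple[str, int]]:
    """Analyze the query, match and score documents, then sort by score."""
    scores: Counter = Counter()
    for token in analyze(query):
        for doc_id, tf in inverted.get(token, {}).items():
            scores[doc_id] += tf  # toy score: sum of term frequencies
    # Highest score first; ties broken by document ID.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


docs = {  # hypothetical sample documents
    "doc1": "Java is great. Java streams are powerful.",
    "doc2": "Python streams data too.",
    "doc3": "Nothing relevant here.",
}
inverted = index_documents(docs)
print(search(inverted, "Java streams"))  # [('doc1', 3), ('doc2', 1)]
```

Note that the query goes through the same `analyze` function as the documents, which is why "Java" in the query finds "java" in the index.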
Example Query (Search API)
GET users/_search
{
  "query": {
    "match": { "name": "shantanu" }
  }
}
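The same search can be issued from code. The sketch below builds the HTTP request with Python's standard library only; it assumes a local, unsecured node at `http://localhost:9200` (an assumption, adjust to your cluster), and `build_search_request` is a hypothetical helper name. It uses POST, which the `_search` endpoint accepts as well as GET.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local node without authentication


def build_search_request(index: str, field: str, text: str) -> urllib.request.Request:
    """Build a _search request carrying a match query for one field."""
    body = {"query": {"match": {field: text}}}
    return urllib.request.Request(
        url=f"{ES_URL}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request("users", "name", "shantanu")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# Sending it requires a running cluster, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["hits"]["hits"])
```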
Full Text Search vs Exact Match
| Search Type | Example | When Used |
|---|---|---|
| match | "java developer" | Full-text search |
| term | "400" | Exact value match |
| wildcard | "dev*" | Pattern-based search |
Aggregations (Analytics Support)
GET logs/_search
{
  "aggs": {
    "errors_by_status": {
      "terms": { "field": "status_code" }
    }
  }
}
This works like SQL GROUP BY: documents are grouped into buckets by status_code and counted per bucket.
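To illustrate the GROUP BY analogy, here is a small Python sketch that computes the same kind of result in memory, roughly what `SELECT status_code, COUNT(*) FROM logs GROUP BY status_code` would return. The log documents are hypothetical, and the output only mimics the `key` / `doc_count` shape of a terms aggregation bucket, not the full response.

```python
from collections import Counter


def terms_aggregation(docs: list[dict], field: str) -> list[dict]:
    """Group documents by a field value and count each group."""
    counts = Counter(doc[field] for doc in docs if field in doc)
    # Buckets ordered by descending count, then by key for a stable result.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [{"key": key, "doc_count": count} for key, count in ordered]


logs = [  # hypothetical log documents
    {"status_code": 200},
    {"status_code": 500},
    {"status_code": 200},
    {"status_code": 404},
    {"status_code": 200},
    {"status_code": 500},
]
print(terms_aggregation(logs, "status_code"))
# [{'key': 200, 'doc_count': 3}, {'key': 500, 'doc_count': 2}, {'key': 404, 'doc_count': 1}]
```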
When to Use Elasticsearch?
- Full-text search
- Log analytics (ELK stack)
- Real-time dashboards
- Autocomplete & suggestions
- Distributed scalability
Examples: e-commerce search, Stack Overflow search, Kibana dashboards, log analysis.
When NOT to Use Elasticsearch?
- As a replacement for a relational DB
- For complex transactions
- For small datasets
- When ACID guarantees are critical
Use PostgreSQL/MySQL instead.
Summary Table
| Feature | Elasticsearch | Lucene |
|---|---|---|
| Type | Distributed search engine | Search library |
| Interface | REST API | Java library |
| Storage | JSON documents | Inverted index |
| Scale | Distributed (clusters) | Single machine |
| Use Case | Search & analytics | Core indexing logic |