Originally published on Medium. Rewritten for Dev.to with added formatting and structure.
Cover photo credit: Photo by Emile Perron on Unsplash
TL;DR
Elasticsearch performance issues often boil down to poor shard setup, missing index templates, and lack of retention policies.
This guide explains how shards, templates, and ILM work together — and gives best practices to fix slow queries, reduce costs, and ensure high availability.
⚠️ This guide explains how things work, not how to configure them.
See the Elasticsearch docs for configuration details.
Quick wins for better performance:
- Use 1 primary and 1 replica shard for small indices (≤ 8 GB).
- For bigger indices (> 30 GB), use multiple primary shards for better performance.
- Align shard count with node count, using the following formula:
  number of shards = number of nodes * n, where n = 1, 2, 3 …
  Example: 3 nodes = 3, 6, or 9 shards
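Purely as an illustration (this guide explains how things work rather than how you should configure them), here is roughly what those quick wins look like when creating indices. The cluster URL and index names are hypothetical:

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster; adjust to your environment

# Small index (≤ 8 GB): 1 primary shard + 1 replica
requests.put(f"{ES}/small-logs", json={
    "settings": {"number_of_shards": 1, "number_of_replicas": 1}
})

# Bigger index (> 30 GB) on a hypothetical 3-node cluster: 3 primary shards (nodes * n, with n = 1)
requests.put(f"{ES}/big-logs", json={
    "settings": {"number_of_shards": 3, "number_of_replicas": 1}
})
```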
Why Is My Elasticsearch So Slow?
Three main culprits:
- Poor shard configuration → No parallel processing
- Missing index templates → Inefficient data storage
- No lifecycle management → Bloated storage, slow queries
Let's take a deeper look.
🔍 Where Does Elasticsearch Store Our Data?
In Elasticsearch, an index is a logical namespace that holds a collection of documents (or data).
Think of it like this:
- Document = One piece of data (like a database row).
- Index = Collection of documents (like a database table).
- Shard = Actual storage unit (splits index for performance).
💡 Key insight: Documents aren't stored directly in indices - they're stored in shards!
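To make that concrete, here is a small sketch (assuming a local cluster at http://localhost:9200 and a hypothetical app-logs index): we index one document, then list the shards that actually hold the data.

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster

# A document (one "row") goes into an index...
requests.post(f"{ES}/app-logs/_doc", json={"level": "INFO", "msg": "user logged in"})

# ...but physically it lives inside one of the index's shards.
# Each line returned by _cat/shards is a shard: the real storage unit.
print(requests.get(f"{ES}/_cat/shards/app-logs?v").text)
```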
What Are Index Templates and Why Do We Need Them?
Unlike traditional relational databases, Elasticsearch indices don't require a predefined schema, which means they can store flexible data structures. But to query them efficiently, we still need organization.
That’s where index templates come into play.
Templates define:
✅ Field types (e.g., text, keyword, integer, date)
✅ Mappings (how fields should be indexed)
✅ Settings (shard allocation, refresh intervals, etc.)
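For example, a minimal index template might look like the sketch below. The template name, index pattern, and fields are all hypothetical; see the Elasticsearch docs for the full set of options.

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster

# Hypothetical template applied to any index whose name matches "app-logs-*"
requests.put(f"{ES}/_index_template/app-logs-template", json={
    "index_patterns": ["app-logs-*"],
    "template": {
        "settings": {"number_of_shards": 1, "number_of_replicas": 1},  # settings
        "mappings": {                                                  # mappings
            "properties": {
                "message":    {"type": "text"},     # field types
                "status":     {"type": "integer"},
                "@timestamp": {"type": "date"}
            }
        }
    }
})
```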
Basic Workflow: Templates → Indices → Shards → Docs
Each template can apply to multiple indices. Each index contains shards, and each shard stores documents.
High-level hierarchy — Relationship between template, shards & documents in Elasticsearch
Shards — The Backbone of Elasticsearch Performance
Shards allow Elasticsearch to:
- Process queries in parallel.
- Distribute data across nodes.
- Maintain redundancy using replica shards.
Types of Shards
- Primary: Holds actual data.
- Replica: Backup of a primary, used for failover.
📝 When an index is created, Elasticsearch automatically creates its shards behind the scenes. By default, the number of primary shards is 1 (controlled by index.number_of_shards), but it can be adjusted based on specific requirements.
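A quick, illustrative way to see that default (assuming a local cluster and a throwaway index name):

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster

# Create an index without any explicit settings...
requests.put(f"{ES}/demo-index")

# ...then look at what Elasticsearch chose for it.
index_settings = requests.get(f"{ES}/demo-index/_settings").json()
print(index_settings["demo-index"]["settings"]["index"]["number_of_shards"])    # "1" by default
print(index_settings["demo-index"]["settings"]["index"]["number_of_replicas"])  # "1" by default
```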
How Do Shards Enable Parallel Queries?
If we configure 2 primary shards for an index, Elasticsearch can split a query across both shards, as illustrated below.
How Elasticsearch distributes request among primary shards for parallel processing
This parallelism speeds up queries, especially for large datasets.
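If you want to see the fan-out for yourself, the _search_shards API reports which shard copies a search on an index would hit. A minimal sketch, assuming a hypothetical app-logs index with 2 primary shards:

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster

# Ask Elasticsearch which shard copies would serve a search on this index.
resp = requests.get(f"{ES}/app-logs/_search_shards").json()

# One group per shard; the query is executed against these groups in parallel.
for group in resp["shards"]:
    for copy in group:
        print(copy["shard"], "primary" if copy["primary"] else "replica", copy["node"])
```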
How Does Elasticsearch Prevent Data Loss When Nodes Fail?
To understand how Elasticsearch ensures high availability, it’s essential to grasp shard rebalancing — the process of evenly distributing shards across the cluster.
Shard Rebalancing = High Availability
Imagine an index configured with 2 primary shards and no replicas, all on a single node.
Elasticsearch node holding 2 primary shards. Each shard holds documents
As we scale our cluster by adding a new node, Elasticsearch automatically moves shards around for even distribution.
Shards Rebalancing in Elasticsearch
Now, let’s introduce replica shards by setting 1 replica for each primary shard. The new distribution looks like this:
- Each primary shard gets a replica.
- Replica shards are never placed on the same node as their corresponding primary shards.
Shards Rebalancing in Elasticsearch
For instance:
✅ Primary Shard 1 (on Node 1) → Replica Shard 1 (on Node 2)
✅ Primary Shard 2 (on Node 2) → Replica Shard 2 (on Node 1)
Rule of thumb — primary and replica shards NEVER live on the same node.
✏️ Unlike traditional relational databases, where all primary data resides on the same node, Elasticsearch distributes both primary and replica shards across different nodes to enhance resilience.
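You can verify this placement rule with the cat shards API. A sketch, again assuming a hypothetical app-logs index:

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster

# "prirep" shows p (primary) or r (replica); "node" shows where each copy lives.
print(requests.get(f"{ES}/_cat/shards/app-logs?v&h=index,shard,prirep,state,node").text)
# On a healthy multi-node cluster, each shard's p and r copies appear on different nodes.
```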
What Happens If a Node Fails?
Let's simplify our example and consider a cluster of 2 nodes with the following settings:
- 1 Primary Shard (storing documents).
- 1 Replica Shard.
2 Elasticsearch Nodes with 1 primary & 1 replica shard. Each shard stores documents
Now, imagine Node 1 goes down due to an unexpected failure. How does the cluster recover?
Elasticsearch Cluster — 1 Node goes down due to an unexpected failure
Elasticsearch automatically promotes the replica shard to a primary shard, ensuring continued data availability.
Elasticsearch promotes the replica shard to a primary shard, ensuring continued data availability
This seamless failover mechanism is what makes Elasticsearch highly fault-tolerant — allowing it to self-heal and maintain uptime even in the face of hardware failures.
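While this is happening, you can watch the cluster react through the health API. A minimal sketch, pointed at any surviving node:

```python
import requests

ES = "http://localhost:9200"  # assumed: any node of the cluster that is still up

health = requests.get(f"{ES}/_cluster/health").json()
print(health["status"])                 # often green -> yellow while replacement replicas are allocated
print(health["active_primary_shards"])  # primaries stay active thanks to replica promotion
print(health["unassigned_shards"])      # shard copies waiting for a node to host them
```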
How To Manage Old Data Without Slowing Everything Down?
When dealing with large amounts of time-series data, like logs and metrics, storing and managing it efficiently over time is key.
This is where Index Lifecycle Management (ILM) comes in — it automatically moves data through stages like hot, warm, and cold.
🧊 ILM works best with Hot-Warm-Cold architecture, where different nodes are optimized for recent data, archived data, and deletion.
| Phase | Data state | Use Case | Hardware |
|---|---|---|---|
| Hot | Active data | Recent logs / metrics, e.g. within the last 2 weeks | Fast SSDs |
| Warm | Older data | From 2 weeks to 1 month | Balanced storage |
| Cold | Archive | From 1 month to 6 months or more | Cheap HDDs |
| Delete | Expired | Removed after 6 months or more | N/A |
Retention Management with Index Lifecycle Management (ILM)
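As a rough sketch of how such a retention policy could be expressed, here is a hypothetical ILM policy that mirrors the table above. The phase ages are just the table's examples, not recommendations, and attaching the policy to indices (via templates and data tiers) is not shown:

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster

# Hypothetical policy: hot -> warm (after 14 days) -> cold (after 30 days) -> delete (after 180 days)
requests.put(f"{ES}/_ilm/policy/logs-retention", json={
    "policy": {
        "phases": {
            "hot":    {"actions": {"set_priority": {"priority": 100}}},
            "warm":   {"min_age": "14d",  "actions": {"set_priority": {"priority": 50}}},
            "cold":   {"min_age": "30d",  "actions": {"set_priority": {"priority": 0}}},
            "delete": {"min_age": "180d", "actions": {"delete": {}}}
        }
    }
})
```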
Why This Matters
Without templates or ILM:
- You get unstructured, bloated indices.
- Queries become slow.
- Storage becomes expensive.
- Cluster maintenance gets harder.
How To Configure Shards for Best Performance?
1. Match Shards to Nodes
Distribute shards evenly to prevent bottlenecks. Use the following formula:
number of shards = number of nodes * n
where n = 1,2,3 …
Example: 3 nodes = 3, 6, or 9 shards
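Spelled out as a trivial helper:

```python
def candidate_shard_counts(node_count: int, max_multiplier: int = 3) -> list[int]:
    """Shard counts that divide evenly across the cluster: nodes * n."""
    return [node_count * n for n in range(1, max_multiplier + 1)]

print(candidate_shard_counts(3))  # [3, 6, 9]
```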
2. Use a Single Replica Shard (Unless You Need More)
A single replica shard generally provides a good balance between redundancy and resource efficiency.
⚠️ If you’re considering multiple replicas, it’s best to explore Elasticsearch documentation or case studies to understand the trade-offs.
3. Size Your Shards Right
- Small indices (≤ 8 GB): Stick to one primary shard for efficiency.
- Larger indices (> 30 GB): Distribute data across multiple primary shards for better performance.
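To decide which bucket an index falls into, you can check its primary store size. A sketch, with app-logs again being hypothetical:

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster

# pri.store.size is the size of the primary data, the number you size shards against;
# store.size would also include replica copies.
print(requests.get(f"{ES}/_cat/indices/app-logs?v&h=index,pri,rep,pri.store.size").text)
```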
💡 Key Takeaways
- Shards = Speed: More shards enable parallel processing.
- Templates = Consistency: Define data structure upfront.
- Replicas = Reliability: Automatic failover during failures.
- ILM = Cost Savings: Move old data to cheaper storage.
- Balance = Performance: Distribute shards evenly across nodes.
Remember: Proper configuration today prevents performance headaches tomorrow!
Real-World Case Study
Need a hands-on walkthrough? Check out Improve Elasticsearch Write Throughput.
📘 I write about PostgreSQL, DevOps, backend engineering, and real-world performance tuning.
🔗 Find more of my work, connect on LinkedIn, or explore upcoming content: all-in-one