Originally published on Medium. Rewritten for Dev.to with added formatting and structure.
Cover photo credit: Photo by Emile Perron on Unsplash
TL;DR
Elasticsearch performance issues often boil down to poor shard setup, missing index templates, and lack of retention policies.
This guide explains how shards, templates, and ILM work together — and gives best practices to fix slow queries, reduce costs, and ensure high availability.
⚠️ This guide explains how things work, not how to configure them.
See the Elasticsearch docs for configuration details.
Quick wins for better performance:
- Use 1 primary and 1 replica shard for small indices (≤ 8 GB).
- For bigger indices (> 30 GB), use multiple primary shards for better performance.
- Align shard count with node count, using the following formula:
  number of shards = number of nodes * n, where n = 1, 2, 3 …
  Example: 3 nodes = 3, 6, or 9 shards
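Purely as an illustration (this guide explains how things work rather than how you should configure them), here is roughly what those quick wins look like when creating indices. The cluster URL and index names are hypothetical:

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster; adjust to your environment

# Small index (≤ 8 GB): 1 primary shard + 1 replica
requests.put(f"{ES}/small-logs", json={
    "settings": {"number_of_shards": 1, "number_of_replicas": 1}
})

# Bigger index (> 30 GB) on a hypothetical 3-node cluster: 3 primary shards (nodes * n, with n = 1)
requests.put(f"{ES}/big-logs", json={
    "settings": {"number_of_shards": 3, "number_of_replicas": 1}
})
```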
Why Is My Elasticsearch So Slow?
Three main culprits:
- Poor shard configuration → No parallel processing
- Missing index templates → Inefficient data storage
- No lifecycle management → Bloated storage, slow queries
Let's take a deeper look.
🔍 Where Does Elasticsearch Store Our Data?
In Elasticsearch, an index is a logical namespace that holds a collection of documents (or data).
Think of it like this:
- Document = One piece of data (like a database row).
- Index = Collection of documents (like a database table).
- Shard = Actual storage unit (splits index for performance).
💡 Key insight: Documents aren't stored directly in indices - they're stored in shards!
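To make that concrete, here is a small sketch (assuming a local cluster at http://localhost:9200 and a hypothetical app-logs index): we index one document, then list the shards that actually hold the data.

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster

# A document (one "row") goes into an index...
requests.post(f"{ES}/app-logs/_doc", json={"level": "INFO", "msg": "user logged in"})

# ...but physically it lives inside one of the index's shards.
# Each line returned by _cat/shards is a shard: the real storage unit.
print(requests.get(f"{ES}/_cat/shards/app-logs?v").text)
```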
What Are Index Templates and Why Do We Need Them?
Unlike traditional relational databases, Elasticsearch indices don't require a predefined schema, which means they can store flexible data structures. But to query them efficiently, we still need organization.
That’s where index templates come into play.
Templates define:
✅ Field types (e.g., text, keyword, integer, date)
✅ Mappings (how fields should be indexed)
✅ Settings (shard allocation, refresh intervals, etc.)
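For example, a minimal index template might look like the sketch below. The template name, index pattern, and fields are all hypothetical; see the Elasticsearch docs for the full set of options.

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster

# Hypothetical template applied to any index whose name matches "app-logs-*"
requests.put(f"{ES}/_index_template/app-logs-template", json={
    "index_patterns": ["app-logs-*"],
    "template": {
        "settings": {"number_of_shards": 1, "number_of_replicas": 1},  # settings
        "mappings": {                                                  # mappings
            "properties": {
                "message":    {"type": "text"},     # field types
                "status":     {"type": "integer"},
                "@timestamp": {"type": "date"}
            }
        }
    }
})
```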
Basic Workflow: Templates → Indices → Shards → Docs
Each template can apply to multiple indices. Each index contains shards, and each shard stores documents.
High-level hierarchy — Relationship between template, shards & documents in Elasticsearch
Shards — The Backbone of Elasticsearch Performance
Shards allow Elasticsearch to:
- Process queries in parallel.
- Distribute data across nodes.
- Maintain redundancy using replica shards.
Types of Shards
- Primary: Holds actual data.
- Replica: Backup of a primary, used for failover.
📝 When an index is created, Elasticsearch automatically creates its shards behind the scenes. By default, the number of primary shards is 1 (controlled by index.number_of_shards), but it can be adjusted based on specific requirements.
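A quick, illustrative way to see that default (assuming a local cluster and a throwaway index name):

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster

# Create an index without any explicit settings...
requests.put(f"{ES}/demo-index")

# ...then look at what Elasticsearch chose for it.
index_settings = requests.get(f"{ES}/demo-index/_settings").json()
print(index_settings["demo-index"]["settings"]["index"]["number_of_shards"])    # "1" by default
print(index_settings["demo-index"]["settings"]["index"]["number_of_replicas"])  # "1" by default
```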
How Do Shards Enable Parallel Queries?
If we configure 2 primary shards for an index, Elasticsearch can split a query across both shards, as illustrated below.
How Elasticsearch distributes request among primary shards for parallel processing
This parallelism speeds up queries, especially for large datasets.
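If you want to see the fan-out for yourself, the _search_shards API reports which shard copies a search on an index would hit. A minimal sketch, assuming a hypothetical app-logs index with 2 primary shards:

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster

# Ask Elasticsearch which shard copies would serve a search on this index.
resp = requests.get(f"{ES}/app-logs/_search_shards").json()

# One group per shard; the query is executed against these groups in parallel.
for group in resp["shards"]:
    for copy in group:
        print(copy["shard"], "primary" if copy["primary"] else "replica", copy["node"])
```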
How Does Elasticsearch Prevent Data Loss When Nodes Fail?
To understand how Elasticsearch ensures high availability, it’s essential to grasp shard rebalancing — the process of evenly distributing shards across the cluster.
Shard Rebalancing = High Availability
Imagine an index configured with 2 primary shards and no replicas, all on a single node.
Elasticsearch node holding 2 primary shards. Each shard holds documents
As we scale our cluster by adding a new node, Elasticsearch automatically moves shards around for even distribution.
Shards Rebalancing in Elasticsearch
Now, let’s introduce replica shards by setting 1 replica for each primary shard. The new distribution looks like this:
- Each primary shard gets a replica.
- Replica shards are never placed on the same node as their corresponding primary shards.
Shards Rebalancing in Elasticsearch
For instance:
✅ Primary Shard 1 (on Node 1) → Replica Shard 1 (on Node 2)
✅ Primary Shard 2 (on Node 2) → Replica Shard 2 (on Node 1)
Rule of thumb — primary and replica shards NEVER live on the same node.
✏️ Unlike traditional relational databases, where all primary data resides on the same node, Elasticsearch distributes both primary and replica shards across different nodes to enhance resilience.
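You can verify this placement rule with the cat shards API. A sketch, again assuming a hypothetical app-logs index:

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster

# "prirep" shows p (primary) or r (replica); "node" shows where each copy lives.
print(requests.get(f"{ES}/_cat/shards/app-logs?v&h=index,shard,prirep,state,node").text)
# On a healthy multi-node cluster, each shard's p and r copies appear on different nodes.
```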
What Happens If a Node Fails?
Let's simplify our example and consider a cluster of 2 nodes with the following settings:
- 1 Primary Shard (storing documents).
- 1 Replica Shard.
2 Elasticsearch Nodes with 1 primary & 1 replica shard. Each shard stores documents
Now, imagine Node 1 goes down due to an unexpected failure. How does the cluster recover?
Elasticsearch Cluster — 1 Node goes down due to an unexpected failure
Elasticsearch automatically promotes the replica shard to a primary shard, ensuring continued data availability.
Elasticsearch promotes the replica shard to a primary shard, ensuring continued data availability
This seamless failover mechanism is what makes Elasticsearch highly fault-tolerant — allowing it to self-heal and maintain uptime even in the face of hardware failures.
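While this is happening, you can watch the cluster react through the health API. A minimal sketch, pointed at any surviving node:

```python
import requests

ES = "http://localhost:9200"  # assumed: any node of the cluster that is still up

health = requests.get(f"{ES}/_cluster/health").json()
print(health["status"])                 # often green -> yellow while replacement replicas are allocated
print(health["active_primary_shards"])  # primaries stay active thanks to replica promotion
print(health["unassigned_shards"])      # shard copies waiting for a node to host them
```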
How To Manage Old Data Without Slowing Everything Down?
When dealing with large amounts of time-series data, like logs and metrics, storing and managing it efficiently over time is key.
This is where Index Lifecycle Management (ILM) comes in — it automatically moves data through stages like hot, warm, and cold.
🧊 ILM works best with Hot-Warm-Cold architecture, where different nodes are optimized for recent data, archived data, and deletion.
| Phase | Data state | Use Case | Hardware |
|---|---|---|---|
| Hot | Active data | Recent logs / metrics, e.g. within the last 2 weeks | Fast SSDs |
| Warm | Older data | From 2 weeks to 1 month | Balanced storage |
| Cold | Archive | From 1 month to 6 months or more | Cheap HDDs |
| Delete | Expired | Removed after 6 months or more | N/A |
Retention Management with Index Lifecycle Management (ILM)
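As a rough sketch of how such a retention policy could be expressed, here is a hypothetical ILM policy that mirrors the table above. The phase ages are just the table's examples, not recommendations, and attaching the policy to indices (via templates and data tiers) is not shown:

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster

# Hypothetical policy: hot -> warm (after 14 days) -> cold (after 30 days) -> delete (after 180 days)
requests.put(f"{ES}/_ilm/policy/logs-retention", json={
    "policy": {
        "phases": {
            "hot":    {"actions": {"set_priority": {"priority": 100}}},
            "warm":   {"min_age": "14d",  "actions": {"set_priority": {"priority": 50}}},
            "cold":   {"min_age": "30d",  "actions": {"set_priority": {"priority": 0}}},
            "delete": {"min_age": "180d", "actions": {"delete": {}}}
        }
    }
})
```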
Why This Matters
Without templates or ILM:
- You get unstructured, bloated indices.
- Queries become slow.
- Storage becomes expensive.
- Cluster maintenance gets harder.
How To Configure Shards for Best Performance?
1. Match Shards to Nodes
Distribute shards evenly to prevent bottlenecks. Use the following formula:
number of shards = number of nodes * n
where n = 1,2,3 …
Example: 3 nodes = 3, 6, or 9 shards
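Spelled out as a trivial helper:

```python
def candidate_shard_counts(node_count: int, max_multiplier: int = 3) -> list[int]:
    """Shard counts that divide evenly across the cluster: nodes * n."""
    return [node_count * n for n in range(1, max_multiplier + 1)]

print(candidate_shard_counts(3))  # [3, 6, 9]
```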
2. Use a Single Replica Shard (Unless You Need More)
A single replica shard generally provides a good balance between redundancy and resource efficiency.
⚠️ If you’re considering multiple replicas, it’s best to explore Elasticsearch documentation or case studies to understand the trade-offs.
3. Size Your Shards Right
- Small indices (≤ 8 GB): Stick to one primary shard for efficiency.
- Larger indices (> 30 GB): Distribute data across multiple primary shards for better performance.
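To decide which bucket an index falls into, you can check its primary store size. A sketch, with app-logs again being hypothetical:

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster

# pri.store.size is the size of the primary data, the number you size shards against;
# store.size would also include replica copies.
print(requests.get(f"{ES}/_cat/indices/app-logs?v&h=index,pri,rep,pri.store.size").text)
```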
💡 Key Takeaways
- Shards = Speed: More shards enable parallel processing.
- Templates = Consistency: Define data structure upfront.
- Replicas = Reliability: Automatic failover during failures.
- ILM = Cost Savings: Move old data to cheaper storage.
- Balance = Performance: Distribute shards evenly across nodes.
Remember: Proper configuration today prevents performance headaches tomorrow!
Real-World Case Study
Need a hands-on walkthrough? Check out Improve Elasticsearch Write Throughput.
📘 I write about PostgreSQL, DevOps, backend engineering, and real-world performance tuning.
🔗 Find more of my work, connect on LinkedIn, or explore upcoming content: all-in-one