Scalability and resilience

Elasticsearch Scalability and Resilience:
- Elasticsearch is distributed by design, which is what lets it stay available and scale as your needs grow.
- Add servers (nodes) to a cluster to increase capacity; Elasticsearch distributes data and queries across nodes.
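- Growing the cluster requires no application changes; Elasticsearch balances shards across a multi-node cluster automatically.
- Under the hood, an index is a logical grouping of one or more physical shards, and each shard is a self-contained index.
- Spreading an index's documents across shards, and the shards across nodes, provides redundancy against hardware failure and adds query capacity as nodes join the cluster.
- Two types of shards: primaries and replicas. Each document belongs to exactly one primary shard; a replica is a copy of a primary.
- Replicas protect against hardware failure and add capacity for read requests (searching or retrieving documents).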
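To make the primary/replica split concrete, here is a minimal Python sketch that builds the request body for creating an index with explicit shard counts. `number_of_shards` and `number_of_replicas` are standard index settings; the helper names and the example values are assumptions made for illustration, and nothing is sent to a cluster.

```python
import json


def index_settings(primaries: int, replicas: int) -> dict:
    """Build a create-index request body with explicit shard counts.

    `replicas` is the number of copies kept of EACH primary shard.
    """
    if primaries < 1:
        raise ValueError("an index needs at least one primary shard")
    if replicas < 0:
        raise ValueError("replica count cannot be negative")
    return {
        "settings": {
            "index": {
                "number_of_shards": primaries,
                "number_of_replicas": replicas,
            }
        }
    }


def total_shard_copies(primaries: int, replicas: int) -> int:
    """Total shards the cluster must place: every primary plus its replicas."""
    return primaries * (1 + replicas)


# Hypothetical example: 3 primaries, each with 1 replica.
body = index_settings(primaries=3, replicas=1)
print(json.dumps(body, indent=2))
print(total_shard_copies(3, 1))  # 6
```

A body like this would typically be sent with a create-index request (for example `PUT /my-index`, where `my-index` is a placeholder name).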
Shard Considerations:
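- The size and number of shards both affect performance: every shard adds maintenance overhead, while larger shards take longer to move when the cluster rebalances.
- Aim for an average shard size between a few GB and a few tens of GB.
- Avoid putting too many shards on one node; the number a node can hold is proportional to its available heap space.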
- Test with your own data to find optimal configurations.
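The sizing guidance above reduces to simple arithmetic, sketched below as a starting point for your own tests. The 30 GB target and the shards-per-GB-of-heap ratio are illustrative inputs chosen for this example, not recommendations from the text; both function names are hypothetical.

```python
import math


def primaries_needed(total_gb: float, target_shard_gb: float) -> int:
    """Smallest number of primary shards that keeps the average
    shard size at or below the chosen target."""
    if total_gb <= 0 or target_shard_gb <= 0:
        raise ValueError("sizes must be positive")
    return math.ceil(total_gb / target_shard_gb)


def shard_budget(heap_gb: float, shards_per_gb_heap: float) -> int:
    """Rough per-node shard ceiling, assuming it scales linearly with heap.

    The ratio is an input you must pick and validate yourself.
    """
    if heap_gb <= 0 or shards_per_gb_heap <= 0:
        raise ValueError("heap and ratio must be positive")
    return int(heap_gb * shards_per_gb_heap)


# Assumed example: 500 GB of index data, 30 GB target shard size.
print(primaries_needed(500, 30))   # 17
# Assumed example: 16 GB heap and an illustrative ratio of 20 shards per GB.
print(shard_budget(16, 20))        # 320
```

Treat the numbers as a first guess to benchmark against real data and queries, not as a final configuration.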
Cross-Cluster Replication (CCR):
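- CCR automatically synchronizes indices from a primary cluster to a secondary remote cluster.
- The secondary cluster can serve as a hot backup.
- Replication is active-passive: the index on the primary cluster is the active leader and handles all write requests, while the replicated indices on secondary clusters are read-only followers.
- Use CCR for disaster recovery and to serve read requests from a cluster geographically close to the users.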
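As a rough illustration of the leader/follower setup, the sketch below builds the request that creates a follower index through the CCR follow API (`PUT /<follower_index>/_ccr/follow`). The helper name, cluster alias, and index names are hypothetical, and the sketch assumes the remote cluster has already been registered on the secondary cluster under that alias. It only constructs the request and does not contact a cluster.

```python
from typing import Dict, Tuple


def ccr_follow_request(
    follower_index: str, remote_cluster: str, leader_index: str
) -> Tuple[str, str, Dict[str, str]]:
    """Return (method, path, body) for creating a read-only follower index.

    The request is meant to be sent to the SECONDARY cluster; writes keep
    going to the leader index on the primary cluster.
    """
    for name in (follower_index, remote_cluster, leader_index):
        if not name:
            raise ValueError("index and cluster names must be non-empty")
    body = {
        "remote_cluster": remote_cluster,  # alias of the primary cluster
        "leader_index": leader_index,      # index to replicate from
    }
    return "PUT", f"/{follower_index}/_ccr/follow", body


# Hypothetical names, for illustration only.
method, path, body = ccr_follow_request(
    follower_index="orders-copy",
    remote_cluster="primary-dc",
    leader_index="orders",
)
print(method, path, body)
```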
