dejanualex for AWS Community Builders

Seven rules for OpenSearch sizing

OpenSearch splits indices into shards. Each shard stores a subset of all documents in an index.

  1. Keep each shard between 10 and 50 GB: 10–30 GB for workloads that prioritize low latency (e.g. search workloads), and 30–50 GB for write-heavy workloads (e.g. logs workloads).

  2. Estimate the total size of the data you plan to store in the index, pick a shard size based on the rule above, and calculate the number of primary shards as ingested_data_size / shard_size, rounded up to a whole number.

  3. The number and size of shards you set for an index should correspond to the size of the index. By default, OpenSearch creates one primary and one replica shard, for a total of two shards per index.

  4. Shard count is secondary to shard size.

  5. Shard size affects both search latency and write performance. Too many small shards exhaust memory (JVM heap), while too few large shards prevent OpenSearch from distributing requests properly. Size the JVM heap from the available RAM: set Xms and Xmx to the same value, and to no more than 50% of your total memory.

  6. For fast indexing (ingestion), you need as many shards as possible; for fast searching, it is better to have as few shards as possible.

  7. Number of shards per index: have at least 1 shard per data node, and ideally make the index shard count a multiple of the data node count so that shards spread evenly across nodes.
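
The arithmetic in rules 1, 2, 5 and 7 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and using the midpoint of each range from rule 1 (20 GB for search, 40 GB for logs) as the target shard size is my choice for the example, not something the rules prescribe.

```python
import math

# Target shard sizes in GB, taken from the ranges in rule 1.
# Using the midpoint of each range is an assumption of this sketch.
TARGET_SHARD_GB = {"search": 20, "logs": 40}


def primary_shard_count(data_gb: float, workload: str, data_nodes: int) -> int:
    """Estimate primary shards from data size (rule 2), then round up
    to a multiple of the data node count (rule 7)."""
    if data_gb <= 0 or data_nodes <= 0:
        raise ValueError("data_gb and data_nodes must be positive")
    # Rule 2: ingested_data_size / shard_size, rounded up.
    by_size = math.ceil(data_gb / TARGET_SHARD_GB[workload])
    # Rule 7: the next multiple of the node count also guarantees
    # at least one shard per data node.
    return math.ceil(by_size / data_nodes) * data_nodes


def heap_gb(ram_gb: float) -> float:
    """Rule 5: value to use for both Xms and Xmx, capped at 50% of RAM."""
    return ram_gb / 2


print(primary_shard_count(600, "logs", 6))    # 600 GB of logs on 6 data nodes
print(primary_shard_count(100, "search", 3))  # 100 GB search index on 3 data nodes
print(heap_gb(64))                            # node with 64 GB of RAM
```

For small indices, rounding up to the node count can push shards below the 10 GB floor. Rule 4 applies in that case: shard size takes priority over shard count.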

🛠️ Last, here's a simple OpenSearch calculator for shard sizing.
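
Whatever shard count you settle on has to be set when the index is created, because `number_of_shards` is a static index setting. A minimal sketch of the request body for `PUT /<index-name>` follows. The helper name `index_settings` and the value 18 are illustrative assumptions; the replica default of 1 matches the OpenSearch default mentioned in rule 3.

```python
import json


def index_settings(primary_shards: int, replicas: int = 1) -> dict:
    """Build the body of an index-creation request (PUT /<index-name>)."""
    if primary_shards < 1 or replicas < 0:
        raise ValueError("need at least 1 primary shard and 0 or more replicas")
    return {
        "settings": {
            "index": {
                "number_of_shards": primary_shards,
                "number_of_replicas": replicas,
            }
        }
    }


# 18 is an illustrative shard count, not a recommendation.
body = index_settings(primary_shards=18)
print(json.dumps(body, indent=2))
```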
