DEV Community

João Victor

📊🔍 OpenSearch Dashboards: Optimizing Massive Data Queries (Big Data) with Asynchronous Search

Queries over logs, telemetry, or other large-scale datasets in OpenSearch can be slow and resource-heavy. This guide covers what a Tech Lead needs to know to run such queries through the Asynchronous Search API, called from the Dev Tools console in OpenSearch Dashboards.


🧩 Architecture and Concepts

OpenSearch consists of:

  • Cluster → group of nodes that store and process data.
  • Shards → distributed index fragments.
  • Dashboards → visualization interface and REST Dev Tools.
  • Plugins → extensions such as Security, Reports, and Asynchronous Search.

The Asynchronous Search plugin (endpoint _plugins/_asynchronous_search) runs queries in the background, so the client can track progress, retrieve results later, or cancel the search without holding a connection open.

Simplified flow:

[Dashboards / API]
   ↓
POST /_plugins/_asynchronous_search
   ↓
→ Cluster processes the query in the background
   ↓
← Returns ID
   ↓
GET /_plugins/_asynchronous_search/{id}
DELETE /_plugins/_asynchronous_search/{id}
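The flow above can be sketched in Python as a submit, poll, delete loop. This is a minimal sketch, not a reference client: the `send` callable is a hypothetical transport you would supply (for example a thin wrapper around your HTTP library), and the sketch assumes the response reports progress either through a `state` field equal to `RUNNING` or through an `is_running` flag.

```python
import time
from typing import Any, Callable, Dict, Optional

# Hypothetical transport: (method, path, body) -> parsed JSON response.
Send = Callable[[str, str, Optional[Dict[str, Any]]], Dict[str, Any]]

BASE = "/_plugins/_asynchronous_search"


def still_running(result: Dict[str, Any]) -> bool:
    # Assumption: progress is reported as state == "RUNNING" or as an
    # is_running flag; either one means "keep polling".
    return result.get("state") == "RUNNING" or bool(result.get("is_running"))


def run_async_search(send: Send, body: Dict[str, Any], params: str = "",
                     poll_interval: float = 1.0, max_polls: int = 60,
                     delete_when_done: bool = True) -> Dict[str, Any]:
    """Submit a search, poll by ID until it stops running, then clean up."""
    path = f"{BASE}?{params}" if params else BASE
    result = send("POST", path, body)
    search_id = result.get("id")
    if search_id is None:
        return result  # no ID to track, so nothing to poll or delete
    try:
        polls = 0
        while still_running(result) and polls < max_polls:
            time.sleep(poll_interval)
            result = send("GET", f"{BASE}/{search_id}", None)
            polls += 1
        return result
    finally:
        if delete_when_done:
            # DELETE cancels a running search or removes a stored result.
            send("DELETE", f"{BASE}/{search_id}", None)
```

Injecting `send` keeps the loop independent of any HTTP client and makes it easy to exercise with a fake transport. Set `delete_when_done=False` if the stored result should stay available for later retrieval.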

⚙️ Full Request Example

POST /_plugins/_asynchronous_search?index=logs*,events*&keep_on_completion=true&wait_for_completion_timeout=2s
{
  "size": 5000,
  "track_total_hits": false,
  "_source": ["timestamp", "source_ip", "event_type"],
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "@timestamp": { "gte": "now-30d" }
          }
        }
      ],
      "must": [
        {
          "query_string": {
            "query": "*malware*",
            "fields": ["message", "event_type"],
            "analyze_wildcard": true,
            "allow_leading_wildcard": true
          }
        }
      ]
    }
  }
}

✅ Expected Response (simplified)

{
  "id": "F3B7E5A6-24C3-11EF-AEA3-12AB34CD56EF",
  "state": "RUNNING",
  "response": {
    "took": 134,
    "timed_out": false,
    "hits": { "total": 2381, "hits": [] }
  }
}

🧠 Synchronous vs Asynchronous Search

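| Type         | Endpoint                            | Blocks Client | Ideal for                              |
|--------------|-------------------------------------|---------------|----------------------------------------|
| Synchronous  | POST /_search                       | ✅ Yes        | Fast queries (<5s)                     |
| Asynchronous | POST /_plugins/_asynchronous_search | ❌ No         | Big Data, reports, multi-index queries |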

An asynchronous search waits at most wait_for_completion_timeout. If the query is still running at that point, the call returns an ID and processing continues on the server side.


📈 Key Parameters

| Parameter                   | Description                      | Example                  | Notes                        |
|-----------------------------|----------------------------------|--------------------------|------------------------------|
| index                       | Target indices                   | logs*, events*           | Use wildcards cautiously     |
| keep_on_completion          | Keeps result after completion    | true                     | Required for later retrieval |
| wait_for_completion_timeout | Initial wait before returning ID | 2s                       | Keeps client responsive      |
| keep_alive                  | Result retention period          | 1d, 7d                   | Useful for scheduled reports |
| track_total_hits            | Counts all documents             | false                    | Disable for Big Data         |
| _source                     | Returned fields                  | ["timestamp", "message"] | Avoid wildcard *             |
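In the full request example, index, keep_on_completion, and wait_for_completion_timeout travel in the URL, while track_total_hits and _source belong to the JSON body. The sketch below shows one way to assemble both parts in Python. The helper name `build_submit_request` is hypothetical, and its defaults simply mirror the example values used in this article.

```python
import json
from typing import Any, Dict, List, Optional, Tuple
from urllib.parse import urlencode


def build_submit_request(indices: List[str], body: Dict[str, Any],
                         keep_on_completion: bool = True,
                         wait_for_completion_timeout: str = "2s",
                         keep_alive: Optional[str] = None) -> Tuple[str, str]:
    """Return (path_with_query_string, json_body) for a submit call."""
    params = {
        "index": ",".join(indices),
        "keep_on_completion": str(keep_on_completion).lower(),
        "wait_for_completion_timeout": wait_for_completion_timeout,
    }
    if keep_alive is not None:
        params["keep_alive"] = keep_alive
    # safe="*," leaves wildcard index patterns and commas unescaped.
    path = "/_plugins/_asynchronous_search?" + urlencode(params, safe="*,")
    return path, json.dumps(body)


path, payload = build_submit_request(
    ["logs*", "events*"],
    {"size": 5000, "track_total_hits": False,
     "_source": ["timestamp", "source_ip", "event_type"]},
    keep_alive="1d",
)
print(path)
```

Keeping the URL parameters in one dictionary makes it harder to put a body-level option such as track_total_hits into the query string by mistake.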

🔍 Query Structure

Optimized example:

{
  "query": {
    "bool": {
      "filter": [
        { "range": { "@timestamp": { "gte": "now-7d" } } }
      ],
      "must": [
        { "match": { "event_type": "unauthorized_access" } }
      ]
    }
  }
}

Expected Output:

{
  "hits": {
    "total": 542,
    "max_score": 1.0,
    "hits": [
      {
        "_index": "logs-2025.11",
        "_id": "Dfgh12345",
        "_source": {
          "timestamp": "2025-11-03T22:14:25Z",
          "event_type": "unauthorized_access",
          "source_ip": "192.168.3.24"
        }
      }
    ]
  }
}

⚡ Performance Tips

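  • Put exact-match and date-range conditions in filter: filter clauses skip scoring and can be cached.
  • Avoid leading wildcards (*malware* → slow), because they force a scan of the field's terms.
  • Keep size moderate (1k–5k) and page through larger result sets with scroll or search_after.
  • Increase refresh_interval on static indices that no longer receive writes.
  • Monitor heap usage with GET /_nodes/stats/jvm.
  • Set track_total_hits to false when an exact hit count is not needed.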
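The search_after tip can be sketched as a small pure function that derives the next page's request body from the last hit of the previous page. This is an illustrative sketch: it assumes every request carries an explicit sort (ideally ending in a unique tiebreaker field) and that each hit exposes its sort values under `sort`, as the search API does for sorted queries. The function name is hypothetical.

```python
from typing import Any, Dict, List, Optional


def next_page_body(base_body: Dict[str, Any],
                   last_page_hits: List[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Build the body for the next page, or return None when paging is done."""
    if "sort" not in base_body:
        raise ValueError("search_after requires an explicit sort")
    if not last_page_hits:
        return None  # an empty page means there is nothing left to fetch
    body = dict(base_body)  # shallow copy so the base body stays reusable
    # Resume right after the sort values of the last hit already seen.
    body["search_after"] = last_page_hits[-1]["sort"]
    return body


base = {
    "size": 1000,
    "track_total_hits": False,
    "sort": [{"@timestamp": "asc"}],
    "query": {"bool": {"filter": [{"range": {"@timestamp": {"gte": "now-7d"}}}]}},
}
```

A caller sends `base` for the first page, then keeps calling `next_page_body(base, hits)` with each page's hits until it returns None.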

🧮 Monitoring and Management

Check progress

GET /_plugins/_asynchronous_search/F3B7E5A6-24C3-11EF-AEA3-12AB34CD56EF

Expected Output:

{
  "id": "F3B7E5A6-24C3-11EF-AEA3-12AB34CD56EF",
  "state": "SUCCEEDED",
  "response": {
    "took": 3210,
    "hits": { "total": 2381, "hits": [...] }
  }
}

Global statistics

GET /_plugins/_asynchronous_search/stats

Expected Output (simplified; the actual response reports these counters per node):

{
  "total": {
    "submitted": 102,
    "completed": 95,
    "running": 7,
    "failed": 0
  }
}

Cancel or delete old searches

DELETE /_plugins/_asynchronous_search/F3B7E5A6-24C3-11EF-AEA3-12AB34CD56EF

🔒 Security and Access Control

  • Required permissions: cluster:admin/opendistro/asynchronous_search/* (the action names keep the opendistro prefix), plus search permission on the target indices.
  • Results stored in internal indices (.opendistro-asynchronous-search*)
  • Always enable TLS and authentication (BasicAuth, JWT, or SAML).
  • Use DLS/FLS to restrict sensitive data.
  • Perform periodic cleanup of old searches.
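The permission and DLS/FLS points above can be combined in a single Security plugin role. The sketch below builds a role body in the shape accepted by the Security plugin's roles API (cluster_permissions plus index_permissions with optional dls and fls). The helper name `build_role` is hypothetical, the cluster permission is passed in rather than hard-coded, and the allowed action pattern is an assumption to adapt to your own policy.

```python
import json
from typing import Any, Dict, List, Optional


def build_role(cluster_permissions: List[str], index_patterns: List[str],
               dls_query: Optional[Dict[str, Any]] = None,
               fls_fields: Optional[List[str]] = None) -> Dict[str, Any]:
    """Build a role body that allows searching the given index patterns."""
    index_permission: Dict[str, Any] = {
        "index_patterns": list(index_patterns),
        # Assumed action pattern: read-only search access.
        "allowed_actions": ["indices:data/read/search*"],
    }
    if dls_query is not None:
        # Document-level security takes the query as a JSON string.
        index_permission["dls"] = json.dumps(dls_query)
    if fls_fields:
        # Field-level security: only these fields are returned.
        index_permission["fls"] = list(fls_fields)
    return {
        "cluster_permissions": list(cluster_permissions),
        "index_permissions": [index_permission],
    }
```

For example, passing the asynchronous search cluster permission listed above, the pattern logs*, a DLS term query on event_type, and the FLS fields timestamp and event_type yields a role that can run asynchronous searches but only sees that slice of the data.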
