If you inherited an OpenSearch deployment and you're now being asked to run agents on it, Q1 2026 has been unusually good news. OpenSearch 3.5 (February) and 3.6 (April) aren't incremental search improvements — they're a clear declaration of intent.
"OpenSearch isn't trying to be a better Elasticsearch; it is focused on being the data layer on which AI applications are built."
That's from the article's author, an engineer who's been migrating teams from log analytics to semantic retrieval. It also captures the entire roadmap.
What actually changed
Better Binary Quantization (BBQ) landed in 3.6. Integrated from the Lucene project, BBQ compresses high-dimensional float vectors into compact binary representations — 32x memory reduction. On the Cohere-768-1M benchmark, BBQ recall at 100 results hits 0.63 vs. 0.30 for Faiss Binary Quantization. With oversampling and rescoring, it exceeds 0.95 on large production datasets. The project is working to make this the default.
Sparse vector search got production-scale tooling. The SEISMIC algorithm enables neural sparse approximate nearest-neighbor search without a full index scan. Most production AI search pipelines land on the hybrid pattern — dense semantic recall + sparse neural precision — and 3.6 is explicitly built around that.
Agent memory is now a platform concern, not a DIY problem. Before 3.5, multi-turn agent memory meant maintaining a session store outside OpenSearch and wiring context management yourself. 3.5 moved conversation memory into ML Commons with hook-based APIs. 3.6 went further: new semantic and hybrid search APIs let agents retrieve contextually relevant prior exchanges via vector similarity or keyword matching — not just the most recent turn.
Token usage tracking, finally. Every LLM call during agent execution is now instrumented — per-turn, per-model token counts, no configuration required. Supports Amazon Bedrock Converse, OpenAI v1, and Gemini v1beta. If you've been flying blind on what your agents cost to run, this is a free upgrade.
MCP support landed. The opensearch-agent-server in 3.6 adds multi-agent orchestration with Model Context Protocol integration. MCP has become the standard for how AI systems communicate with external tools and data sources. Its inclusion signals that OpenSearch wants to be a full participant in agentic tooling ecosystems, not just a backend that happens to store vectors.
Why it matters
OpenSearch is systematically absorbing problems that teams were solving outside the platform — agent memory, token cost observability, distributed tracing for multi-step agent execution (APM on OpenTelemetry is now built in via Dashboards). Each absorbed problem raises the switching cost and makes OpenSearch stickier in the AI application stack.
The MCP integration is the most strategic piece. It's not feature parity. It's connective tissue.
What to do
- Already running OpenSearch for logs or search? Upgrade to 3.6 and benchmark BBQ — the memory savings alone may justify the upgrade.
- Building agents and haven't picked a memory layer? Read the 3.5/3.6 ML Commons docs before you spin up a separate vector store.
- Running agents in production without cost visibility? Token tracking in 3.6 requires zero config. Just upgrade.
-
Using MCP in your agentic stack? The
opensearch-agent-serverintegration is worth evaluating for grounding agents in OpenSearch-held data.
Source: Inside OpenSearch's bid to become the default AI data layer — The New Stack
✏️ Drafted with KewBot (AI), edited and approved by Drew.
Top comments (0)