Vishal Gunjal

Building "Captain Cool": A Deterministic, Multi-Agent AI Orchestration Engine

Most Generative AI applications built at hackathons share the same fatal flaw: they are stateless, brittle chatbots wrapped in a nice UI. They hallucinate under pressure, fail ungracefully when external APIs drop, and offer zero visibility into execution costs.

For the Agentic Premier League 2026 hosted by GDG Cloud Pune, I wanted to build something different.

Instead of a generic prompt wrapper, I engineered Captain Cool: a production-grade, distributed multi-agent AI system that acts as a real-time IPL match strategist. We treated AI agents not as conversationalists, but as highly specialized microservices communicating via a deterministic state machine. Here is a deep dive into the architecture, the SRE-grade observability, and how we built a "Bloomberg Terminal for Cricket."

🔗: https://github.com/vishalgunjalSWE/captain-cool-agent

πŸ—οΈ The Multi-Agent Architecture (DAG)

To satisfy the requirement of complex, multi-turn reasoning, I decoupled the LLM into four distinct, highly constrained agents powered by Gemini 2.5 Flash via the @google/genai SDK.

Instead of one prompt trying to do everything, data flows sequentially through this Directed Acyclic Graph (DAG):

The Analyst (Data Ingestion & Tool Call): Parses the raw match state (e.g., target, overs left, batsmen). It executes a real-time Gemini Tool Call to the Open-Meteo API to fetch Pune's temperature and humidity, calculating a "Dew Factor."

The Strategist (Proposal): Ingests the Analyst's enriched payload and generates a tactical plan. Using strict system instructions and responseMimeType: "application/json", it is forced to output a structured JSON matrix containing the tactic, bowler, win probability, and counterfactual risk.

The Advocate (Automated Red Teaming): This agent is explicitly prompted never to agree. It intercepts the Strategist's JSON matrix, analyzes it for flaws (e.g., "Bowling an off-spinner to left-handers with heavy dew is an 80% loss probability"), and forces a revision.

The Commentator (Broadcast Output): Translates the final, stress-tested JSON output into engaging cricket jargon for the user.
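The hand-off between these four agents can be sketched as a sequential reducer over async transformers. This is a minimal illustration, not the actual implementation: agent bodies are stubs here (the real agents call Gemini 2.5 Flash via @google/genai), and the field names are invented for the example.

```typescript
// Minimal sketch of the DAG orchestration: each agent is an async
// transformer over a shared payload, run strictly in pipeline order.
// Field names and stub outputs are illustrative, not the real schema.
type MatchState = Record<string, any>;
type Agent = (state: MatchState) => Promise<MatchState>;

// Stubs standing in for the Gemini-backed agents.
const analyst: Agent = async (s) => ({ ...s, dewFactor: 0.7 });
const strategist: Agent = async (s) => ({ ...s, tactic: "spin squeeze" });
const advocate: Agent = async (s) => ({ ...s, revised: true });
const commentator: Agent = async (s) => ({ ...s, broadcast: "And the captain turns to spin!" });

async function runPipeline(initial: MatchState): Promise<MatchState> {
  let state = initial;
  for (const agent of [analyst, strategist, advocate, commentator]) {
    state = await agent(state); // each stage sees the enriched payload
  }
  return state;
}
```

Because each stage only reads the accumulated payload and returns a new one, the graph stays acyclic and the orchestration order is trivially deterministic and replayable.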

βš™οΈ SRE & Enterprise Differentiators

Building for production means building for failure and scale. Here is how we bulletproofed the engine:

1. Graceful Degradation (The Fallback Protocol)

Network requests fail. To ensure the orchestration loop survives, the Analyst’s weather API tool call is wrapped in an AbortController with a strict 3-second timeout. If the venue Wi-Fi drops or the API rate-limits, the system catches the error, logs a warning, and seamlessly injects mock historical pitch data to keep the pipeline moving.
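That protocol can be sketched like this, with the fetcher injected so the timeout and fallback paths are easy to exercise. The field names and fallback values below are illustrative, not the project's actual schema.

```typescript
// Hedged sketch of the fallback protocol. The real tool call targets
// Open-Meteo; here the fetcher is injectable for testability.
interface PitchConditions {
  tempC: number;
  humidity: number;
  source: string; // "live" or "mock-historical"
}

// Illustrative mock historical pitch data for Pune.
const FALLBACK: PitchConditions = { tempC: 24, humidity: 60, source: "mock-historical" };

async function fetchConditions(
  fetcher: (signal: AbortSignal) => Promise<PitchConditions>,
  timeoutMs = 3000, // strict 3-second budget
): Promise<PitchConditions> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetcher(controller.signal);
  } catch (err) {
    console.warn("Weather tool call failed, degrading gracefully:", err);
    return FALLBACK; // keep the orchestration loop moving
  } finally {
    clearTimeout(timer);
  }
}

// A real fetcher would pass the signal through, e.g.:
// (signal) => fetch("https://api.open-meteo.com/v1/forecast?...", { signal })
//   .then((r) => r.json())
```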

2. FinOps & ELK-Style Telemetry

To prove unit economics, I built a custom trackExecution() higher-order function that wraps every Gemini API call.

- It tracks performance.now() latency (in ms).
- It parses the usageMetadata object to extract prompt and completion tokens.
- It calculates the exact cost of the pipeline in real time.

All of this telemetry is saved to an audit_trace.json file and streamed to the frontend.
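A trackExecution-style wrapper can be sketched as follows. The pricing constants are placeholders (not Gemini's actual rates), and the usage shape is simplified from the SDK's usageMetadata object.

```typescript
// Sketch of a higher-order telemetry wrapper around each model call.
// Rates and field names are illustrative assumptions.
interface Usage {
  promptTokens: number;
  completionTokens: number;
}
interface Trace {
  label: string;
  latencyMs: number;
  usage: Usage;
  costUsd: number;
}

const COST_PER_1K_PROMPT = 0.0003;     // placeholder rate, USD per 1K tokens
const COST_PER_1K_COMPLETION = 0.0025; // placeholder rate, USD per 1K tokens

const auditTrace: Trace[] = []; // flushed to audit_trace.json in the real system

function trackExecution<T extends { usage: Usage }>(
  label: string,
  call: () => Promise<T>,
): () => Promise<T> {
  return async () => {
    const start = performance.now();
    const result = await call(); // the wrapped Gemini API call
    const latencyMs = performance.now() - start;
    const { promptTokens, completionTokens } = result.usage;
    const costUsd =
      (promptTokens / 1000) * COST_PER_1K_PROMPT +
      (completionTokens / 1000) * COST_PER_1K_COMPLETION;
    auditTrace.push({ label, latencyMs, usage: result.usage, costUsd });
    return result;
  };
}
```

Because the wrapper returns the original result untouched, it can be layered onto every agent call without changing the pipeline's data flow.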

3. Context Persistence (Memory Across Overs)

To prevent the system from acting like an amnesiac, it writes the final strategic decision to a localized match_state.json file. When the user simulates the next over, the Analyst agent reads this history, allowing the AI to naturally reference past mistakes or successes.
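A minimal sketch of that persistence layer, assuming the history is stored as a flat JSON array on disk (the file shape and field names are illustrative):

```typescript
import { existsSync, readFileSync, writeFileSync } from "node:fs";

// Illustrative shape of one persisted strategic decision.
interface Decision {
  over: number;
  tactic: string;
}

// Read the full decision history, or start fresh if no file exists yet.
function loadHistory(path: string): Decision[] {
  if (!existsSync(path)) return [];
  return JSON.parse(readFileSync(path, "utf8")) as Decision[];
}

// Append the final decision for an over so the Analyst can reload it
// when the next over is simulated.
function saveDecision(path: string, decision: Decision): void {
  const history = loadHistory(path);
  history.push(decision);
  writeFileSync(path, JSON.stringify(history, null, 2));
}
```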

💻 The Observability Console (Next.js + SSE)

To visualize this complex backend, a standard chat UI wouldn't suffice. I built an "Observability Dashboard" using Next.js (App Router), Tailwind CSS v3.4, and shadcn/ui.

To achieve a true "Bloomberg Terminal" aesthetic, the backend Node/Express server streams the agent telemetry to the Next.js frontend via Server-Sent Events (SSE).
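The SSE wire format itself is simple; here is a sketch of the framing helper, with the Express-specific wiring shown only in comments. The event name and payload shape are illustrative, not the project's actual protocol.

```typescript
// Server-Sent Events framing: an "event:" line, a "data:" line,
// and a blank-line terminator per message.
function sseFrame(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Inside an Express route handler, the stream would be wired up roughly as:
//   res.setHeader("Content-Type", "text/event-stream");
//   res.setHeader("Cache-Control", "no-cache");
//   res.write(sseFrame("agent_update", { agent: "strategist", status: "active" }));
```

On the Next.js side, a plain EventSource subscription is enough to drive the dashboard, since SSE is natively supported by browsers with no extra client library.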

As the simulation runs, the UI dynamically updates:

- An animated Framer Motion pipeline graph pulses as each agent activates.
- The "FinOps Ticker" updates the live compute cost fraction-by-fraction.
- The Counterfactual Matrix renders instantly into a clean, dark-mode data table.

🚀 Quick Start (One-Click Ignition)

I utilized concurrently in the root monorepo to ensure a flawless developer experience. You can spin up both the Express AI Engine and the Next.js Dashboard with a single command:

```bash
# Clone the repository
git clone https://github.com/vishalgunjalSWE/captain-cool-agent
cd captain-cool-stack

# Install dependencies
npm install

# Add your Gemini API key
echo "GEMINI_API_KEY=your_key_here" > captain-cool-agent/.env

# Ignite the stack
npm start
```
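For reference, the root-level wiring for concurrently might look like this. The script names and workspace folders below are assumptions for illustration, not taken from the repo.

```json
{
  "scripts": {
    "start": "concurrently -n engine,dashboard \"npm run dev --prefix captain-cool-agent\" \"npm run dev --prefix dashboard\""
  },
  "devDependencies": {
    "concurrently": "^9.0.0"
  }
}
```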

πŸ† Conclusion

The Agentic Premier League challenge proved that the future of AI engineering is Systems Engineering. Prompt engineering is just the syntax; architecture, observability, fault-tolerance, and deterministic orchestration are what make Agentic AI valuable in the real world.

Massive thanks to GDG Cloud Pune and Google Cloud for an incredible event!

Stack: Gemini 2.5 Flash, Node.js, Express, Next.js, shadcn/ui, Docker.
