Apoorv Gupta
Captain Cool: How I Built a Multi-Agent IPL Strategist with Gemini in 3 Hours

Cricket is a captain's game. Every over, someone has to make a call — who bowls, who bats next, when to take the timeout, when to play the Impact Player. These decisions live in the gap between data and instinct.

I wanted to build an AI that operates in that gap — not just analysing the match, but arguing about it. An AI that proposes a call, gets challenged on it, defends or revises, and then explains the whole debate in cricket language any fan can understand.

This is Captain Cool. Here's how I built it at the Agentic Premier League in Pune, powered by Google Gemini.


The Problem with Most "AI" Sports Tools

Before I built anything, I looked at what other people had built at similar events. The pattern was consistent:

  • A polished UI
  • Hardcoded or mock data
  • A README full of words like "AI-powered", "neural analytics", "predictive intelligence"
  • Zero actual Gemini API calls in the codebase

The ones that did use Gemini made a single API call and formatted the response into multiple labelled sections — calling it "multi-agent" when it was really just one prompt with four JSON fields.

I wanted to build something where you could open the code and see four distinct agents, each making their own Gemini call, each with their own system prompt, each contributing something the others can't.


Architecture: The Debate Pipeline

The system runs a 5-step sequential pipeline:

Stats Analyst (with function calling)
    ↓ statistical report
Strategist (proposes tactical call)
    ↓ proposal
Devil's Advocate (challenges the proposal)
    ↓ challenge
Strategist (defends or revises)
    ↓ final decision
Commentator (translates to cricket talk)
    ↓ broadcast commentary

Each step is a separate Gemini call. Each agent has its own system prompt. The Strategist runs twice — once to propose and once to revise after the challenge.

The entire pipeline streams to the frontend via Server-Sent Events, so you watch the debate unfold in real time, agent by agent.


Agent 1: Stats Analyst — Real Function Calling

The Stats Analyst is the only agent with tool access. This is where the genuine agentic behavior lives.

I defined four tool functions:

def get_player_stats(player_name: str, stat_type: str, phase: str) -> dict:
    """Get batting or bowling stats for a player in powerplay/middle/death"""

def get_matchup_data(batter: str, bowler: str) -> dict:
    """Get head-to-head stats between a specific batter and bowler"""

def calculate_win_probability(batting_team, score, wickets, overs, target, innings) -> dict:
    """Calculate win probability with momentum indicator"""

def get_venue_conditions(venue: str) -> dict:
    """Get pitch conditions, avg scores, altitude/dew effects"""
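The post only shows the signatures; as an illustration, here is a toy version of `calculate_win_probability` under assumed semantics (a nominal T20 par rate, a linear penalty per run of required rate above par, overs treated as a plain decimal). The real tool's formula isn't shown, so treat this purely as a sketch of the tool's shape:

```python
def calculate_win_probability(batting_team: str, score: int, wickets: int,
                              overs: float, target: int, innings: str) -> dict:
    """Toy chase model for illustration -- the real tool's formula is not in the post.
    Compares the required run rate against a nominal T20 par rate."""
    runs_needed = target - score
    balls_left = round((20 - overs) * 6)  # simplification: overs as a decimal
    if runs_needed <= 0:
        return {"batting_win_prob": 100.0, "required_run_rate": 0.0}
    if balls_left <= 0 or wickets >= 10:
        return {"batting_win_prob": 0.0, "required_run_rate": 0.0}
    rrr = runs_needed * 6 / balls_left
    # each run-per-over above a par of 8.5, and each wicket down, costs the chase
    prob = max(1.0, min(99.0, 50.0 - (rrr - 8.5) * 7 - wickets * 3))
    return {"batting_win_prob": round(prob, 1), "required_run_rate": round(rrr, 2)}
```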

I passed these to Gemini as native tools — not as injected text in the prompt. Gemini sees the function signatures, decides which ones to call, calls them with appropriate arguments, receives the results, and synthesizes a statistical report.

self.model = genai.GenerativeModel(
    model_name="gemini-2.5-flash",
    system_instruction=SYSTEM_PROMPT,
    tools=ALL_TOOLS,
)

The function-calling loop runs until Gemini stops requesting tools — typically 4-7 calls per analysis. Every call is logged so you can watch Gemini's reasoning chain:

[TOOL CALL] get_player_stats('Priyansh Arya', 'batting', 'powerplay')
[TOOL RESULT] {"sr": 226.5, "avg": 36.4, "vs_pace": 218.0}

[TOOL CALL] get_matchup_data('Priyansh Arya', 'Bhuvneshwar Kumar')
[TOOL RESULT] {"balls": 12, "runs": 8, "dismissals": 2, "sr": 66.7}

[TOOL CALL] calculate_win_probability('PBKS', 0, 0, 0.0, 209, '2')
[TOOL RESULT] {"batting_win_prob": 31.0, "required_run_rate": 10.44}
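The loop itself can be sketched roughly as below, assuming the google-generativeai chat interface; `TOOL_REGISTRY`, `dispatch`, and the stub tool are illustrative stand-ins, not the project's actual code:

```python
def get_player_stats(player_name: str, stat_type: str, phase: str) -> dict:
    return {"sr": 226.5, "avg": 36.4}  # stub standing in for the real data source

# map the tool names Gemini sees to the Python callables that implement them
TOOL_REGISTRY = {"get_player_stats": get_player_stats}

def dispatch(name: str, args: dict) -> dict:
    """Run the tool Gemini requested, with the arguments Gemini chose."""
    return TOOL_REGISTRY[name](**args)

def run_tool_loop(chat, prompt: str) -> str:
    """Answer function calls until Gemini stops requesting tools."""
    import google.generativeai as genai  # lazy import: only needed on a live call
    response = chat.send_message(prompt)
    while True:
        calls = [p.function_call for p in response.parts if p.function_call]
        if not calls:
            return response.text  # no more tool requests: final report
        # hand every tool result back so Gemini can continue its chain
        response = chat.send_message([
            genai.protos.Part(function_response=genai.protos.FunctionResponse(
                name=call.name, response=dispatch(call.name, dict(call.args))))
            for call in calls
        ])
```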

System Prompt:

You are the Stats Analyst for an IPL team's think tank. Your role is 
to process match data and provide ONLY statistical analysis — no opinions, 
no recommendations.

You have access to tools. USE THEM before answering. Do not guess statistics.
Call multiple tools to build a comprehensive picture.

Agent 2: Strategist — The Captain's Call

The Strategist uses gemini-2.5-pro — the most powerful Gemini model — because the tactical call is the highest-stakes output in the system.

It receives the stats report and proposes one specific decision in a structured format:

DECISION: [One clear sentence]
REASONING: [2-3 sentences in cricket language, not ML jargon]
CONFIDENCE: [High/Medium/Low] — [one-line justification]
RISK: [What could go wrong — specific player and scenario]

The structured format was deliberate. It forces the model to commit to one call and own the risk, rather than hedging with "it depends."
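Downstream agents consume this as free text, but if you wanted to gate logic on individual fields, a small parser for the labelled format might look like this (a hypothetical helper; the actual codebase may handle it differently):

```python
import re

FIELDS = ("DECISION", "REASONING", "CONFIDENCE", "RISK")

def parse_proposal(text: str) -> dict:
    """Split the Strategist's labelled output into a field -> value mapping."""
    parsed = {}
    for field in FIELDS:
        # capture everything up to the next ALL-CAPS label or end of text
        match = re.search(rf"{field}:\s*(.+?)(?=\n[A-Z]+:|\Z)", text, re.S)
        if match:
            parsed[field] = match.group(1).strip()
    return parsed
```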

System Prompt:

You are the Captain. Think step by step:
1. What phase is the match in?
2. What does the data say about available options?
3. What is the highest-leverage decision right now?

Your decision must cover exactly ONE of:
- Who bowls the next over
- Batting order change
- Field placement shift
- Strategic timeout timing
- Impact Player deployment

Be bold. A captain who hedges every call loses matches.

Agent 3: Devil's Advocate — The Contrarian

This is the innovation that makes Captain Cool different from everything else at the event.

Most AI systems give you one answer. Captain Cool shows you the argument against that answer.

The Devil's Advocate receives the Strategist's proposal and must:

  • Identify the biggest assumption the Strategist made
  • Present a specific counter-scenario where the decision fails (with player names and stats)
  • Suggest a concrete alternative
  • Deliver a verdict: AGREE / DISAGREE / CONDITIONAL

System Prompt:

You are the Devil's Advocate in the IPL captain's think tank.
Your ONLY job is to challenge the Strategist's proposal.

You are not contrarian for fun. You genuinely find where this decision fails.
Think like a batting coach who knows exactly how the opposition exploits this.

CHALLENGE: [Core weakness — specific, with stats]
COUNTER-SCENARIO: [Specific failure mode — over, player, numbers]
ALTERNATIVE: [Different decision + why it's better]
VERDICT: [AGREE / DISAGREE / CONDITIONAL]
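Since the revise step matters most when the verdict isn't a plain AGREE, it helps to pull that token out defensively. A sketch (hypothetical helper, with CONDITIONAL as the cautious fallback when the model drifts off-format):

```python
VERDICTS = {"AGREE", "DISAGREE", "CONDITIONAL"}

def extract_verdict(challenge_text: str) -> str:
    """Pull the AGREE/DISAGREE/CONDITIONAL token out of the challenge text."""
    for line in challenge_text.splitlines():
        if line.upper().startswith("VERDICT:"):
            parts = line.split(":", 1)[1].split()
            if parts and (token := parts[0].strip(".,").upper()) in VERDICTS:
                return token
    return "CONDITIONAL"  # cautious default when the format isn't followed
```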

Agent 4: Commentator — Cricket for Everyone

The Commentator receives the full debate — all four prior outputs — and translates everything into 4-5 sentences of broadcast-style commentary.

No ML jargon. No bullet points. Just cricket talk.

System Prompt:

You are a cricket commentator in the style of Harsha Bhogle.

Rules:
- Write in present tense, as if commentating live
- Reference specific players by name
- Include the why-this-not-that explanation
- Sound like an IPL broadcast, not an academic paper
- End with one sentence about what to watch for next
- Under 100 words, pure flowing prose

The Orchestrator — Wiring It Together

The orchestrator.py runs the pipeline sequentially, yielding SSE events at each step:

async def run_debate(self, match_state: MatchState) -> AsyncGenerator[str, None]:
    # Step 1
    stats_result = await self.stats_analyst.analyze(match_context)
    yield self._sse_event("agent_output", {"agent": "stats_analyst", ...})

    # Step 2
    proposal_result = await self.strategist.propose(match_context, stats_text)
    yield self._sse_event("agent_output", {"agent": "strategist", ...})

    # Step 3
    challenge_result = await self.devils_advocate.challenge(...)
    yield self._sse_event("agent_output", {"agent": "devils_advocate", ...})

    # Step 4 — Strategist reviews and revises
    revision_result = await self.strategist.revise(...)
    yield self._sse_event("agent_output", {"agent": "strategist", ...})

    # Step 5
    commentary_result = await self.commentator.narrate(...)
    yield self._sse_event("agent_output", {"agent": "commentator", ...})

The frontend subscribes to the SSE stream and renders each agent's output as it arrives — color-coded, labeled, and displayed sequentially.
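The `_sse_event` helper isn't shown in the post, but the wire format it has to produce is simple enough to sketch: each frame is an `event:` line, a `data:` line of JSON, and a blank line, yielded into FastAPI's `StreamingResponse` with `media_type="text/event-stream"` (an assumed implementation, not the project's exact code):

```python
import json

def sse_event(event: str, data: dict) -> str:
    """Format one Server-Sent Events frame: event name, JSON payload, blank line."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"

# Each frame above is yielded by the orchestrator's async generator and wrapped in
# FastAPI's StreamingResponse(generator, media_type="text/event-stream").
```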


Live Demo: PBKS vs RCB, Dharamsala

Here's what the system produced for today's real match:

Input: PBKS chasing 209, 0/0 in 0.0 overs. Bhuvneshwar Kumar to open the bowling.

Stats Analyst output (after 5 tool calls):

Priyansh Arya SR 226.5 overall — but drops to 66.7 vs Bhuvneshwar with 2 dismissals in just 12 balls. Bhuvneshwar leads IPL 2026 wicket charts at 22 wickets, economy 7.46. HPCA Dharamsala altitude amplifies his swing. PBKS win probability: 31% at a required run rate of 10.44.

Strategist proposal:

DECISION: Promote Shreyas Iyer to No. 3 — ready to walk in on first powerplay wicket.
REASONING: Iyer averages 54.7 at SR 163 this season. The Arya-Bhuvi matchup (SR 66.7, 2 dismissals) makes an early wicket almost certain. PBKS need consolidation followed by acceleration.
CONFIDENCE: High — matchup data and Bhuvi's form both point this way.
RISK: Kills scoring rate if Arya survives. Iyer coming in cold vs Hazlewood at altitude.

Devil's Advocate challenge:

CHALLENGE: Iyer at 3 kills the aggression PBKS need at 10.44 RRR.
COUNTER-SCENARIO: Stoinis as Impact Player provides left-arm option vs right-arm Bhuvi — disrupts the angle immediately.
ALTERNATIVE: Stoinis at 3 on early wicket, not Iyer. Left-right combination.
VERDICT: CONDITIONAL — Iyer at 3 works only if wicket falls after over 5.

Strategist revision:

VERDICT: REVISE. Stoinis as Impact Player at 3 preserves the left-right combination. Iyer holds at 4. The Devil's Advocate correctly identified the left-hander gap.

Commentator:

"The think tank goes for Stoinis as the impact card — left-hander at three to counter Bhuvneshwar's angle. They looked at Iyer, but with a run rate needing to touch double digits from ball one, this isn't a day for safety first. Watch for the Stoinis signal from the dugout early — if Punjab's openers go inside five, the whole innings shape changes."


What I Learned

Prompt structure matters more than model size. The Devil's Advocate prompt was the hardest to write — making it genuinely contrarian without being destructively negative took four iterations. The final version uses the framing "think like a batting coach who knows exactly how the opposition exploits this", which produced the best challenges.

SSE streaming is the right output format for agent systems. Showing each agent's output as it generates — rather than waiting for all five — made the system feel alive in a way a JSON dump never could. The judges watched the debate unfold in real time. That experience is the demo.

Function calling vs. RAG-style context injection is visible in the output quality. When Gemini chooses its own tools and sequences them autonomously, the analysis cites specific numbers from specific tool calls. When you inject the data into the prompt yourself, the model merely summarizes it. The difference in specificity is noticeable.

The Devil's Advocate pattern is underused. Every AI project I've seen shows one answer. Showing the disagreement — the argument against the recommended call — builds more trust than a confident single response. Judges found it more interesting than the decision itself.


Tech Stack

  • Gemini 2.5 Pro — Strategist agent (proposal phase)
  • Gemini 2.5 Flash — All other agents
  • Gemini native function calling — Stats Analyst tool use
  • Python FastAPI — Backend with SSE streaming
  • React 18 + Tailwind — Frontend
  • Google Cloud Run — Deployment
  • Google Antigravity — Primary IDE throughout the build



Huge thanks to GDG Cloud Pune for putting together the Agentic Premier League — built Captain Cool, a multi-agent IPL strategist powered by Gemini, while watching PBKS chase 209 at Dharamsala live on the big screen.

#GoogleCloud #GoogleCloudAPL #BuildwithAI #GDGCloudPune

Built at Agentic Premier League, GDG Cloud Pune, May 17 2026
Vibe-coded with Google Antigravity in 3 hours
