Sreenu Sasubilli


Building an IPL Cricket Stats Assistant with Algolia Agent Studio

Algolia MCP Server Challenge: Ultimate user Experience

IPL Cricket Stats Assistant — A Conversational AI Powered by Algolia

This is a submission for the Algolia Agent Studio Challenge:

**Consumer-Facing Conversational Experiences**


What I Built

I built an IPL Cricket Stats Assistant, a consumer-facing conversational AI that answers natural-language questions about IPL batting performance.

Users can ask questions like:

  • “Rohit Sharma highest score”
  • “Sharma highest score”
  • “Virat Kohli at Chinnaswamy Stadium”

The assistant returns grounded, factual answers sourced directly from structured IPL match data.

This assistant is designed for everyday cricket fans, not analysts.

It supports natural language questions using familiar terms, nicknames, and partial names, allowing users to explore IPL statistics conversationally without needing structured filters or technical knowledge.


Demo

Live Agent (Algolia Agent Studio):

  • The agent is published and testable directly inside Algolia Agent Studio.

Frontend Demo:

  • A lightweight React + InstantSearch demo was built locally to validate real-world usage.

Screenshots
Example queries demonstrating alias resolution, ambiguity handling, and deterministic retrieval

  • Alias handling
  • Nickname handling
  • Canonical name + venue filter
  • Ambiguity handling + Clarification follow-up
  • Season filter


How I Used Algolia Agent Studio

Algolia Agent Studio serves as the orchestration layer between:

  • A fast, structured Algolia Search index
  • A conversational LLM interface
  • Carefully designed agent instructions

Key design choices:

  • Every answer is retrieved using Algolia Search (no guessing).
  • Each record represents one batsman’s performance in one match, enabling deterministic responses.
  • Filters are applied whenever possible (batsman, season, venue, match_id).
  • The agent explicitly handles ambiguous queries (e.g., “Sharma”) by asking for clarification instead of assuming intent.
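
As a rough illustration of the last two choices, here is a minimal Python sketch of how a name could be resolved against aliases before any search runs, with a clarification step when more than one batsman matches. The alias table, the function names, and the venue value are hypothetical and not taken from the actual agent; the filter string follows Algolia's `attribute:"value"` facet filter syntax and assumes `batsman`, `season`, and `venue` are configured as facets with string values.

```python
from typing import Optional

# Hypothetical alias table (canonical name -> lowercase aliases).
# In the real index this information lives in each record's batsman_aliases field.
ALIASES = {
    "RG Sharma": ["rohit", "rohit sharma", "hitman", "sharma"],
    "I Sharma": ["ishant", "ishant sharma", "sharma"],
    "V Kohli": ["virat", "virat kohli", "kohli"],
}


def resolve_batsman(term: str) -> list[str]:
    """Return every canonical name that lists the term as an alias."""
    needle = term.strip().lower()
    return sorted(name for name, aliases in ALIASES.items() if needle in aliases)


def build_filters(batsman: str, season: Optional[str] = None,
                  venue: Optional[str] = None) -> str:
    """Join the known facets into one filter string (no quote escaping here)."""
    parts = [f'batsman:"{batsman}"']
    if season is not None:
        parts.append(f'season:"{season}"')
    if venue is not None:
        parts.append(f'venue:"{venue}"')
    return " AND ".join(parts)


def plan_query(term: str, season: Optional[str] = None,
               venue: Optional[str] = None) -> dict:
    """Decide whether to search, ask for clarification, or report no match."""
    candidates = resolve_batsman(term)
    if not candidates:
        return {"action": "not_found"}
    if len(candidates) > 1:
        # Ambiguous name: ask the user instead of assuming intent.
        return {"action": "clarify", "options": candidates}
    return {"action": "search",
            "filters": build_filters(candidates[0], season, venue)}


print(plan_query("Sharma"))
print(plan_query("Hitman", venue="M Chinnaswamy Stadium"))
```

In this sketch "Sharma" matches two batsmen, so the plan is a clarification question, while "Hitman" resolves to a single canonical name and produces a filtered search.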

The result is a conversational experience that feels natural, but behaves like a reliable data system.


Data Source & Modeling

The original data comes from the publicly available IPL Complete Dataset on Kaggle.

The raw dataset contains ball-by-ball delivery data (150K+ rows).

For this project, I transformed the data in a Google Colab notebook to make it agent-friendly.

Modeling decisions:

  • Aggregated ball-level data into one record per batsman per match
  • Precomputed runs, balls, fours, and sixes per match
  • Added a batsman_aliases field to support natural queries (e.g., “Rohit”, “Hitman” → “RG Sharma”)
  • Removed the need for cross-record arithmetic inside the agent

This reduced the dataset to ~9.5K clean, deterministic records, optimized for fast retrieval and conversational accuracy.
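
The aggregation step can be sketched in plain Python as follows. This is not the notebook's actual code: the column names (`batsman_runs`, `is_wide`) and the sample deliveries are made up for illustration, and the sketch assumes that wides do not count as balls faced and that a delivery scoring exactly 4 or 6 off the bat is a boundary.

```python
# Made-up ball-by-ball rows; the real Kaggle columns may be named differently.
deliveries = [
    {"match_id": 1, "batsman": "RG Sharma", "batsman_runs": 4, "is_wide": False},
    {"match_id": 1, "batsman": "RG Sharma", "batsman_runs": 6, "is_wide": False},
    {"match_id": 1, "batsman": "RG Sharma", "batsman_runs": 0, "is_wide": True},
    {"match_id": 1, "batsman": "RG Sharma", "batsman_runs": 1, "is_wide": False},
    {"match_id": 1, "batsman": "V Kohli", "batsman_runs": 2, "is_wide": False},
    {"match_id": 1, "batsman": "V Kohli", "batsman_runs": 4, "is_wide": False},
    {"match_id": 2, "batsman": "RG Sharma", "batsman_runs": 6, "is_wide": False},
    {"match_id": 2, "batsman": "RG Sharma", "batsman_runs": 6, "is_wide": False},
]


def aggregate(rows):
    """Collapse ball-level rows into one record per (match_id, batsman)."""
    records = {}
    for ball in rows:
        key = (ball["match_id"], ball["batsman"])
        rec = records.setdefault(key, {
            "match_id": ball["match_id"],
            "batsman": ball["batsman"],
            "runs": 0, "balls": 0, "fours": 0, "sixes": 0,
        })
        rec["runs"] += ball["batsman_runs"]
        if not ball["is_wide"]:
            rec["balls"] += 1  # assumption: wides are not balls faced
        if ball["batsman_runs"] == 4:
            rec["fours"] += 1
        elif ball["batsman_runs"] == 6:
            rec["sixes"] += 1
    return list(records.values())


for record in aggregate(deliveries):
    print(record)
```

Each resulting dictionary is shaped like one index record, so totals such as runs or sixes are already stored and the agent never has to add rows together at answer time.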

Why this mattered:

Modeling the data at the “one batsman, one match” level ensures the agent never invents statistics and can answer questions instantly using pure retrieval.
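
To make "pure retrieval" concrete, here is a small Python sketch of how a question like "Rohit Sharma highest score" reduces to filtering records and picking the one with the most runs. The records and their numbers are invented for illustration, the `season` field is assumed to be a string, and how the real index ranks or sorts results is not covered in this post.

```python
from typing import Optional

# Invented per-match records in the "one batsman, one match" shape.
records = [
    {"match_id": 101, "batsman": "RG Sharma", "season": "2015", "runs": 40},
    {"match_id": 102, "batsman": "RG Sharma", "season": "2016", "runs": 72},
    {"match_id": 103, "batsman": "V Kohli", "season": "2016", "runs": 55},
]


def highest_score(rows: list[dict], batsman: str,
                  season: Optional[str] = None) -> Optional[dict]:
    """Return the single stored record with the most runs, or None."""
    matching = [
        r for r in rows
        if r["batsman"] == batsman and (season is None or r["season"] == season)
    ]
    if not matching:
        return None
    # The answer is one existing record; nothing is summed or estimated.
    return max(matching, key=lambda r: r["runs"])


print(highest_score(records, "RG Sharma"))
print(highest_score(records, "RG Sharma", season="2015"))
```

Because the answer is always a record that already exists, the language model only has to phrase the result, not compute it.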

Optional: this Google Colab notebook shows how the raw IPL ball-by-ball data was aggregated into one record per batsman per match for Agent Studio ingestion:

https://colab.research.google.com/drive/1UXomb6vJfgX2aT8Patvb1HubTHk38eOG?usp=sharing


Why Fast Retrieval Matters

Cricket statistics are fact-heavy and precision-sensitive.

A single incorrect number breaks user trust.

Algolia’s fast, contextual retrieval ensures:

  • Sub-100ms responses, even with filters
  • Accurate grounding for every answer
  • Clean handling of ambiguity and partial queries
  • A conversational UX without sacrificing correctness

Instead of generating answers, the agent retrieves facts and explains them.


Final Thoughts

This project demonstrates how Agent Studio + well-modeled data can create conversational experiences that are:

  • Trustworthy
  • Fast
  • User-friendly
  • Production-ready

Rather than building “just a chatbot,” I focused on designing an agent that behaves like a reliable statistical assistant, grounded in real data and optimized for human queries.

Thanks for checking it out!
