Rajesh Adhikari

Posted on Oct 30

Talk to Your Data Like a Human: How I Built an AI Airline Analyst

#ai #mindsdb #python #hacktoberfest


In the world of air travel, every passenger has a story. Some rave about the legroom. Others complain about cold meals or slow Wi-Fi. But here's the thing: what if you could actually talk to those 20,000 stories and get answers that help you run a better airline?

That's exactly what AIRLYTICS does.

I'm thrilled to introduce AIRLYTICS, my project for MindsDB Hacktoberfest 2025. It's not just another analytics dashboard. It's a conversational intelligence platform that transforms messy, unstructured passenger reviews into strategic business insights — the kind that actually tell you what to fix and why it matters.

Think of it as having a genius data analyst who speaks plain English, never sleeps, and always knows exactly which 100 reviews out of 20,000 matter most for your question.

🔗 Explore the GitHub Repository | 🎥 Watch the Demo Video

The Problem: Drowning in Feedback, Starving for Insight

Airlines collect mountains of customer feedback. Reviews pile up across booking sites, social media, and post-flight surveys. But here's where it gets messy.

Traditional analytics tools are great at counting things. They'll tell you the average Wi-Fi rating is 2.8 out of 5. Cool. But they won't tell you that Wi-Fi below 3 stars drops overall satisfaction by 68%, or that upgrading routers matters more than upgrading the catering menu.

They can't understand questions like "Show me Business Class travelers who loved the crew but hated baggage handling." They definitely can't explain why high ratings don't always mean high loyalty, or what that disconnect means for your bottom line.

Most BI tools require you to speak their language — SQL queries, pre-defined metrics, rigid schemas. AIRLYTICS flips that around. You speak English. It handles the rest.

Meet AIRLYTICS: Your AI Airline Analyst

AIRLYTICS is what happens when you combine MindsDB's Knowledge Bases with a dual-agent architecture and sprinkle in some serious semantic search magic.

Here's what makes it different.

🧠 Two Modes, One Seamless Experience

Most analytics platforms force you to choose between simplicity and power. AIRLYTICS gives you both.

Semantic Search Mode is perfect when you're exploring. Type something like "excellent legroom and comfortable seats" or "lost baggage and delayed luggage" and boom — the system surfaces the most relevant reviews using vector embeddings. Not just keyword matching. Real semantic understanding. The kind that knows "mishandled bag" and "lost luggage" mean the same thing.

You get the top 50 matched reviews, plus comprehensive statistics: average ratings across every dimension, seat type distributions, traveler demographics, correlation matrices showing which metrics actually drive satisfaction. It's like having a full statistical report generated in seconds, tailored exactly to your query.
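To make the "correlation matrix" part concrete, here's a minimal sketch in plain Python. It is not the AIRLYTICS implementation: the field names (`wifi`, `food`, `overall`), the five sample rows, and the helpers `pearson` and `correlation_matrix` are all made up for illustration.

```python
from math import sqrt

# Hypothetical matched reviews; real rows would come from the search results.
reviews = [
    {"wifi": 1, "food": 3, "overall": 2},
    {"wifi": 2, "food": 4, "overall": 4},
    {"wifi": 3, "food": 2, "overall": 5},
    {"wifi": 4, "food": 5, "overall": 7},
    {"wifi": 5, "food": 1, "overall": 9},
]


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists.

    Assumes neither list is constant (otherwise the denominator is zero).
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(rows, fields):
    """Correlation for every pair of rating fields, rounded to 2 decimals."""
    columns = {f: [row[f] for row in rows] for f in fields}
    return {
        a: {b: round(pearson(columns[a], columns[b]), 2) for b in fields}
        for a in fields
    }


matrix = correlation_matrix(reviews, ["wifi", "food", "overall"])
print(matrix["wifi"]["overall"])  # 0.99 for this toy sample
```

In this toy sample Wi-Fi tracks the overall score almost perfectly while food does not, which is the kind of signal the heatmap is meant to surface.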

Advanced Analytics Mode is where things get interesting. This is for questions like "Among users complaining about check-in delays, show me seat type distribution for those who still rated ground service above 3."

That's not a simple search. That's a multi-layered analytical question. The AI agent interprets your intent, figures out you're asking for a conditional distribution analysis, rewrites your query for optimal semantic matching, executes the right statistical function, and returns targeted visualizations with contextual explanations.

All from one sentence. No SQL. No setup. Just ask.

The Secret Sauce: Five Intelligent Functions

Behind the scenes, AIRLYTICS uses five specialized analytical engines. When you ask a complex question, the agent automatically routes it to the right one:

General Percentage Distribution handles questions like "What percentage of passengers rated value for money above 4?" — fast, simple threshold checks across any numeric field.

Conditional Distribution Analysis shows you breakdowns like "For delayed passengers, what's the seat type distribution?" — perfect for understanding how issues vary across categories.

Category-to-Category Analysis compares two categorical fields, like "How many Economy passengers were Solo Leisure versus Business travelers?" — great for demographic pivots.

Rating-to-Rating Analysis finds overlaps between numeric conditions, like "Of passengers with low Wi-Fi scores, how many also rated overall experience poorly?" — the correlation detective.

Conditional Rating Analysis connects ratings to outcomes, like "Among high food raters, what percent still recommended the airline?" — revealing those crucial loyalty disconnects.

The beautiful part? You never have to know which function does what. Just ask your question naturally. The agent figures it out.
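For a feel of what two of these engines compute, here's a rough sketch of a percentage distribution and a conditional rating analysis over a list of review rows. The function names, parameters, and sample data are hypothetical, chosen for illustration rather than copied from the repo.

```python
import operator

# Map operator strings (as an agent might emit them) to comparison functions.
OPS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
}


def percentage(rows, field, op, threshold):
    """Share of rows (in %) where `field <op> threshold` holds."""
    if not rows:
        return 0.0
    hits = sum(1 for r in rows if OPS[op](r[field], threshold))
    return round(100 * hits / len(rows), 1)


def conditional_rating(rows, field, op, threshold, outcome_field, outcome_value):
    """Among rows passing the rating condition, share (in %) with the outcome."""
    subset = [r for r in rows if OPS[op](r[field], threshold)]
    if not subset:
        return 0.0
    hits = sum(1 for r in subset if r[outcome_field] == outcome_value)
    return round(100 * hits / len(subset), 1)


# Hypothetical sample rows.
sample = [
    {"food": 5, "value_for_money": 5, "recommended": True},
    {"food": 4, "value_for_money": 2, "recommended": False},
    {"food": 2, "value_for_money": 3, "recommended": False},
    {"food": 5, "value_for_money": 4, "recommended": True},
]

# "What percentage rated value for money above 4?"  -> 1 of 4 rows
share_high_value = percentage(sample, "value_for_money", ">", 4)

# "Among high food raters, what percent recommended?" -> 2 of 3 rows
loyal_food_fans = conditional_rating(sample, "food", ">=", 4, "recommended", True)

print(share_high_value, loyal_food_fans)  # 25.0 66.7
```

The other three functions follow the same shape: pick a subset by condition, then count or group what is left.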

Smart Sampling: Why Less is More

Here's something cool about AIRLYTICS that most analytics platforms get wrong.

It doesn't analyze all 20,000 reviews for every query. That would be slow, expensive, and frankly, unnecessary. Instead, it uses smart sampling.

Every query retrieves the top N semantically matched reviews — you can choose 10, 20, 50, 75, or 100 (MindsDB's current limit). All statistics, distributions, and correlations are computed from this focused subset.

Why does this matter? Because when you ask about Wi-Fi complaints, you don't need reviews about excellent meals. You need the 100 most relevant Wi-Fi-related reviews. That's your representative sample. That's where your signal lives.

It's faster and more focused, and every metric you see is actually about what you asked for, not diluted by thousands of unrelated reviews. One caveat: the numbers describe the top-N matches for your query, not all 20,000 reviews, so read them as statistics about that slice.
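The idea behind top-N retrieval can be sketched in a few lines. In AIRLYTICS the ranking is done by the MindsDB Knowledge Base, so treat this as a toy illustration only: the three-dimensional "embeddings" are hand-written stand-ins, and `cosine` and `top_n` are hypothetical helpers.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm


def top_n(query_vec, reviews, n):
    """Return the n reviews whose embeddings are most similar to the query."""
    ranked = sorted(
        reviews,
        key=lambda r: cosine(query_vec, r["embedding"]),
        reverse=True,
    )
    return ranked[:n]


# Toy data: real embeddings have hundreds of dimensions, not three.
reviews = [
    {"id": 1, "text": "Wi-Fi kept dropping", "embedding": [0.9, 0.1, 0.0]},
    {"id": 2, "text": "Great meal", "embedding": [0.0, 0.2, 0.9]},
    {"id": 3, "text": "Slow internet on board", "embedding": [0.8, 0.3, 0.1]},
    {"id": 4, "text": "Friendly crew", "embedding": [0.1, 0.9, 0.2]},
]

query = [1.0, 0.1, 0.0]  # stand-in for an embedded "Wi-Fi complaints" query
sample = top_n(query, reviews, 2)
print([r["id"] for r in sample])  # [1, 3]
```

Every statistic downstream is then computed on `sample` only, which is why the meal and crew reviews never touch a Wi-Fi question.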

InsightInterpreter: Your AI Strategist

Raw numbers are useful. Strategic recommendations are invaluable.

That's why AIRLYTICS includes InsightInterpreter — think of it as your in-house AI consultant who's allergic to corporate buzzwords and loves cutting through noise.

After any query, hit the "Get AI Insights" button. InsightInterpreter looks at your results — the distributions, correlations, top reviews, all of it — and tells you what actually matters.

Not "Wi-Fi ratings are low."

But "Wi-Fi below 3 stars drops overall satisfaction by 68%. Upgrade routers on long-haul routes — it impacts loyalty more than catering."

Not "Business travelers rate cleanliness at 6.2."

But "Business travelers forgive delays but hate dirty cabins. Prioritize cleaning staff over gate efficiency."

It spots contradictions (high ratings, low loyalty — what gives?), identifies root causes (it's not the food, it's the pricing), and suggests concrete next steps. All written in the voice of a seasoned analyst who knows that real insight is the difference between a refund and a repeat customer.

Rich Metadata Filtering: Slice and Dice Your Way

AIRLYTICS supports filtering across every structured dimension in your data.

Want to see only Emirates Business Class reviews? Done. Need verified Solo Leisure travelers on long-haul flights? Easy. Looking for passengers who rated Wi-Fi below 3 but overall experience above 7? No problem.

You've got 50+ airlines, multiple aircraft types, four seat classes, different traveler types, verification flags, recommendation status, and eight numeric rating fields to work with. Mix and match however you want.

Every filter applies to both semantic searches and analytical queries. The system handles it seamlessly in the background.
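Conceptually, a filter set is just a list of conditions that every returned review must satisfy. Here's a small sketch of that idea in Python. The `(field, operator, value)` tuple format, the `matches` helper, and the sample rows are assumptions for illustration, not the format AIRLYTICS or MindsDB actually uses.

```python
import operator

OPS = {
    "==": operator.eq,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def matches(review, filters):
    """True when the review satisfies every (field, op, value) condition."""
    return all(OPS[op](review[field], value) for field, op, value in filters)


# Hypothetical rows carrying both categorical and numeric metadata.
reviews = [
    {"airline": "Emirates", "seat_type": "Business Class", "wifi": 2, "overall": 8},
    {"airline": "Emirates", "seat_type": "Economy Class", "wifi": 4, "overall": 6},
    {"airline": "Qatar Airways", "seat_type": "Business Class", "wifi": 2, "overall": 9},
]

# "Emirates passengers who rated Wi-Fi below 3 but overall experience above 7"
filters = [("airline", "==", "Emirates"), ("wifi", "<", 3), ("overall", ">", 7)]
selected = [r for r in reviews if matches(r, filters)]
print(len(selected))  # 1
```

Because the conditions are plain data, the same list can be attached to a semantic search or to an analytical query without changing either one.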

Building AIRLYTICS: The MindsDB Magic

So how does all this actually work?

At its core, AIRLYTICS is powered by three MindsDB components working in perfect harmony.

Knowledge Bases: The Foundation

MindsDB Knowledge Bases are what make semantic search possible. Instead of storing just text, they store meaning — vector embeddings that capture the semantic essence of each review.

When you search for "bad food and poor service," the Knowledge Base doesn't just match those exact words. It finds reviews about "terrible meals and rude staff," "cold food and unhelpful crew," or "worst dining experience and slow service." It understands synonyms, context, and intent.

Plus, it respects structured metadata. You can search semantically and filter by airline, seat type, ratings — all in one query. That's the hybrid power MindsDB brings to the table.

The Analytics Agent: The Interpreter

This is where things get really clever.

The analytics_query_agent is trained to understand natural language questions and break them down into two parts:

Part 1: The semantic filter — what kind of reviews should we look at?

Part 2: The analytical question — what measurement or comparison do we want?

If only Part 1 exists, it's a straightforward semantic search. If both parts exist, the agent maps Part 2 to one of the five analytical functions, extracts the right parameters (which fields, what thresholds, which operators), and returns everything as structured JSON.

All you had to do was ask a question in plain English.
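Here's roughly what consuming that structured JSON could look like on the backend. This is a sketch under assumptions: the JSON keys (`semantic_query`, `function`, `params`), the function name `conditional_distribution`, and its parameters are hypothetical and may not match the real agent output.

```python
import json
import operator
from collections import Counter

OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


def conditional_distribution(rows, condition_field, op, threshold, group_by):
    """Count `group_by` categories among rows passing the numeric condition."""
    subset = [r for r in rows if OPS[op](r[condition_field], threshold)]
    return dict(Counter(r[group_by] for r in subset))


# Dispatch table: function name from the agent -> Python callable.
FUNCTIONS = {"conditional_distribution": conditional_distribution}


def run_plan(plan_json, rows):
    """Parse the agent's JSON and either return rows or run the analysis."""
    plan = json.loads(plan_json)
    name = plan.get("function")
    if name is None:  # only Part 1 present: plain semantic search
        return rows
    return FUNCTIONS[name](rows, **plan["params"])


# Hypothetical agent reply for the check-in delay question.
agent_reply = """
{
  "semantic_query": "check-in delays",
  "function": "conditional_distribution",
  "params": {
    "condition_field": "ground_service",
    "op": ">",
    "threshold": 3,
    "group_by": "seat_type"
  }
}
"""

# Stand-in for the reviews retrieved with the semantic query.
rows = [
    {"seat_type": "Economy Class", "ground_service": 4},
    {"seat_type": "Business Class", "ground_service": 5},
    {"seat_type": "Economy Class", "ground_service": 2},
    {"seat_type": "Economy Class", "ground_service": 5},
]

result = run_plan(agent_reply, rows)
print(result)  # {'Economy Class': 2, 'Business Class': 1}
```

Keeping the agent's job to "emit a plan" and the statistics in ordinary code means the numbers stay deterministic even when the language model's wording varies.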

The Insight Agent: The Strategist

The insight_interpreter_agent is your executive translator.

It takes the raw analytics output — the numbers, distributions, top reviews — and interprets them through the lens of airline operations strategy. It's trained to spot patterns, identify contradictions, connect dots between seemingly unrelated metrics, and recommend concrete actions.

It doesn't just describe what it sees. It explains what it means and what you should do about it.

Here's a tiny peek at its prompt philosophy:

You are InsightInterpreter — the sharp data analyst inside an airline's analytics division.
Your job: Cut through the noise and tell the manager something they didn't already know.
Focus on contradictions, unexpected drivers, and actionable next steps.

Rules:
1. No recaps. Don't restate the query.
2. No fluff. Skip "The data shows..."
3. Be concrete. Use actual numbers when they matter.
4. Stay tight. 2-3 paragraphs, max.
5. End with action. Always tell what should be done differently.

That's it. Sharp, focused, actionable. Every time.

Beyond Airlines: A Blueprint for Any Feedback Domain

Here's the thing — while AIRLYTICS is built for airline reviews, the architecture is a template.

The same approach works for:

🏨 Hotels — guest reviews, amenity feedback, service ratings

🍽️ Restaurants — delivery experiences, menu feedback, ambiance comments

🛒 E-commerce — product reviews, shopping experiences, return issues

🏥 Healthcare — patient satisfaction, appointment experiences, facility feedback

📞 Support — ticket descriptions, call transcripts, chat logs

Anywhere you have unstructured feedback with structured metadata, this pattern applies. Swap the schema, adapt the agent prompts to domain-specific language, and you're off to the races.

That's the power of MindsDB's Knowledge Bases. They're not industry-specific. They're insight-specific.

Real-World Impact: What This Means for Airlines

Let's get practical for a second.

Imagine you're an airline ops manager. You know baggage complaints are up, but you don't know if they're a dealbreaker or just noise. You type:

"Users who complained about baggage claim delays — what percentage of those who rated ground service above 4 rated overall experience below 5?"

In 3 seconds, you get:

42% of users with good ground service ratings still had poor overall experiences
A breakdown by seat type showing Business Class is disproportionately affected
A correlation heatmap revealing that baggage issues matter 3× more than food quality for Business travelers
An InsightInterpreter note: "Baggage delays hit Business travelers hardest. They pay premium prices for time — not food. Fast-track baggage handling for premium cabins immediately."

You just went from a hunch to a prioritized action plan in seconds.

That's what conversational analytics looks like. And that's what AIRLYTICS delivers.

The Tech Behind the Magic

For the curious: AIRLYTICS is built with React and TailwindCSS on the frontend, FastAPI on the backend, and MindsDB handling all the AI heavy lifting. Data flows from Google Sheets straight into MindsDB's Knowledge Base — zero ETL pipelines, zero data warehouses.

Docker Compose keeps MindsDB containerized for local development. OpenAI's embeddings power the semantic search. Recharts and Plotly handle visualizations. The whole stack is designed to be clean, modular, and production-ready.

Want to set it up yourself? The full installation guide, architecture diagrams, and SQL examples are all in the GitHub README. Everything you need to get it running locally or extend it to your own domain.

Hacktoberfest 2025: Built for Advanced Capabilities

AIRLYTICS was designed for MindsDB Hacktoberfest Track 2: Advanced Capabilities. That means it checks all the boxes:

✅ Knowledge Base integration with 20,000+ reviews

✅ Dual-agent architecture (analytics + insights)

✅ Metadata filtering across multiple dimensions

✅ Hybrid search capabilities (semantic + structured)

✅ Automated data freshness with MindsDB Jobs

✅ Zero-ETL architecture (Google Sheets → MindsDB direct)

But beyond the checklist, it's a complete RAG-to-BI pipeline. Not a toy demo. Not a proof of concept. A production-grade analytics engine that shows what's possible when you combine semantic understanding, statistical rigor, and AI interpretation.

Try It Yourself

Curious to see AIRLYTICS in action?

🔗 GitHub Repository: rajesh-adk-137/AIRLYTICS

🎥 Demo Video: Watch on YouTube

📖 Full Documentation: Available in the repo README

Clone it. Run it. Break it. Extend it to your own domain. The entire codebase is open source and ready to go.

The Future of Analytics is Conversational

Here's what I learned building AIRLYTICS:

The future of business intelligence isn't about building more dashboards. It's about building systems that understand questions, find answers, and explain what to do next. All in natural language. All in real time.

MindsDB's Knowledge Bases make that possible. Their agent architecture makes it scalable. And the zero-ETL approach makes it practical.

Whether you're analyzing airline reviews, hotel feedback, or customer support tickets, the pattern is the same: semantic understanding + statistical rigor + AI interpretation = actionable intelligence.

The age of asking your data questions and getting strategic answers has arrived.

And it's powered by MindsDB.

Acknowledgments

Huge thanks to the MindsDB team for creating such a powerful platform, OpenAI for the embedding models, and the entire open-source community for the tools that made this possible.

Built with ❤️ for MindsDB Hacktoberfest 2025.

AIRLYTICS: Transforming unstructured feedback into strategic intelligence, one query at a time.

Ready to unlock intelligence from your own feedback data? Check out the AIRLYTICS GitHub repo and start exploring what's possible with MindsDB Knowledge Bases.
