The "Trusted Analyst" Problem
We’ve all seen the flashy demos: "Upload a CSV, ask a question, get a chart." It looks like magic.
But in an enterprise environment, "magic" is dangerous. If a CEO asks, "What was our Q3 revenue?" and the AI hallucinates a number because it wrote bad SQL or misread a column header, that’s not a bug—it’s a liability.
For my Google AI Agents Capstone, I didn't want to build just another chatbot. I wanted to build a Cyber-Physical System for Data: an agent that is secure, deterministic, and self-correcting.
Meet ECIA (Enterprise CSV Intelligence Agent), a multi-agent system that separates "thinking" from "doing" to guarantee accurate insights.
The Architecture: A 6-Agent Assembly Line
Most beginners build a single "Super Agent" with 20 tools. This confuses the LLM. Instead, I used the Google Agent Development Kit (ADK) to build a sequential assembly line (see the wiring sketch after this list). Think of it like a real data team:
The Router (Manager): Understands if you want a chart, a summary, or a statistical test.
The Data Prep (Junior Analyst): Sanitizes inputs and fuzzy-matches column names (so "rev" maps to "Total_Revenue").
The Analyst (Statistician): Crunches the numbers using deterministic tools; no arbitrary Python execution allowed.
The Chart Specialist (Designer): Decides if this data needs a Bar Chart, a Funnel, or a Heatmap.
The Visualizer (Frontend Dev): Generates the actual Plotly code.
The Evaluator (QA Lead): A separate agent that critiques the output before showing it to the user. "Did we actually answer the question?"
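To make the assembly line concrete, here is a minimal wiring sketch. It assumes ADK's SequentialAgent/LlmAgent classes; the agent names and tool functions come from the pipeline diagram below, but the constructor arguments and model string are my shorthand, not the project's verbatim code:

# A minimal sketch of the assembly-line wiring, assuming ADK's
# SequentialAgent/LlmAgent API; constructor details are illustrative.
from google.adk.agents import LlmAgent, SequentialAgent

router = LlmAgent(
    name="QueryUnderstanding",
    model="gemini-2.0-flash",
    instruction="Classify intent and extract entities from the user query.",
    tools=[extract_entities_logic],  # deterministic tool, defined elsewhere
)
data_prep = LlmAgent(
    name="DataPrep",
    model="gemini-2.0-flash",
    instruction="Validate the schema and load the CSV into session state.",
    tools=[prepare_data_logic],
)
# ... Agents 3-6 (DataAnalysis, DataViz, ResponseBuilder, Evaluator) ...

ecia_pipeline = SequentialAgent(
    name="ECIA",
    sub_agents=[router, data_prep],  # plus the remaining four agents
)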
Visualizing the Flow
USER QUERY: "What are the top 5 products by revenue?"
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ AGENT 1: QueryUnderstanding (Router) │
│ ────────────────────────────────────────────────────────────────│
│ • Semantic intent classification │
│ • Entity extraction (columns, metrics, filters) │
│ • Query complexity assessment │
│ TOOL: extract_entities_logic() │
└──────────────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ AGENT 2: DataPrep (Data Loader) │
│ ────────────────────────────────────────────────────────────────│
│ • Schema validation against extracted entities │
│ • Fuzzy column name matching (handles typos) │
│ • Loads DataFrame into SessionContext │
│ TOOL: prepare_data_logic() │
└──────────────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ AGENT 3: DataAnalysis (Statistical Analyst) │
│ ────────────────────────────────────────────────────────────────│
│ • Pattern matching: ranking, aggregation, trends, correlation │
│ • Intelligent chart type selection │
│ • Semantic column mapping ("Region" → "Country") │
│ TOOL: perform_analysis_logic() │
└──────────────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ AGENT 4: DataViz (Visualization Architect) │
│ ────────────────────────────────────────────────────────────────│
│ • 12+ chart types (bar, line, scatter, heatmap, funnel, etc.) │
│ • Plotly figure generation with enterprise styling │
│ • Interactive element configuration │
│ TOOL: create_visualization_logic() │
└──────────────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ AGENT 5: ResponseBuilder (Executive Synthesizer) │
│ ────────────────────────────────────────────────────────────────│
│ • Compiles analysis summary in business language │
│ • Integrates visualization with narrative context │
│ • Suggests alternative visualization approaches │
│ TOOL: build_response_logic() │
└──────────────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ AGENT 6: Evaluator (Quality Assurance Judge) │
│ ────────────────────────────────────────────────────────────────│
│ • Reviews query-response alignment │
│ • Validates intent fulfillment │
│ • Generates QA score (1-10) with critique │
│ OUTPUT: QA_SCORE: X/10 | CRITIQUE: [Assessment] │
└──────────────────────────────────────────────────────────────────┘
│
▼
FINAL USER-FACING RESPONSE
+ Interactive Plotly Visualization
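The DataViz step (Agent 4) follows the same guardrail pattern in miniature: the model chooses a chart type, but the Plotly code itself is fixed. Here is a hedged sketch, assuming plotly.express and a simplified signature for create_visualization_logic:

# Simplified sketch of create_visualization_logic; the real tool supports
# 12+ chart types and enterprise styling. The signature is assumed.
import pandas as pd
import plotly.express as px

def create_visualization_logic(df: pd.DataFrame, chart_type: str, x: str, y: str):
    """Deterministic chart factory: the agent picks the type, not the code."""
    builders = {
        "bar": px.bar,
        "line": px.line,
        "scatter": px.scatter,
        "funnel": px.funnel,
        # ... heatmap and the other supported types ...
    }
    if chart_type not in builders:
        raise ValueError(f"Unsupported chart type: {chart_type}")
    return builders[chart_type](df, x=x, y=y, template="plotly_white")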
The "Secret Sauce": Deterministic Tools
The biggest risk in Agentic AI is arbitrary code execution. If you let an LLM write and run its own Python code, you open the door to security vulnerabilities and runaway execution loops.
ECIA takes a different approach. I built 6 custom Python tools that act as guardrails. The Agent can call these tools, but it cannot rewrite them.
Code Snippet: The Safe Tool Pattern
Here is how I implemented the "Analyst" logic using Google's ADK. Notice how the agent is restricted to specific analysis types:
# The agent can only "choose" the analysis, not invent the math
from typing import Optional

import pandas as pd

def statistical_analysis_tool(analysis_type: str, target_column: str, group_column: Optional[str] = None):
    """
    Performs deterministic statistical analysis.
    Secure: no eval() or exec() used.
    """
    df: pd.DataFrame = session.get_data()  # DataFrame loaded earlier by the DataPrep agent
    if analysis_type == "ranking":
        # Top 5 groups by the summed target metric
        return df.groupby(group_column)[target_column].sum().sort_values(ascending=False).head(5)
    elif analysis_type == "correlation":
        # Pearson correlation between two numeric columns
        return df[target_column].corr(df[group_column])
    # ... handle other deterministic types
    raise ValueError(f"Unsupported analysis_type: {analysis_type}")
This ensures that when a user asks for "Top 5 Products," the math is always correct, regardless of the LLM's "creativity."
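For example, that "Top 5 Products" request collapses to a single deterministic call (the column names here are illustrative; they depend on the uploaded CSV):

# Hypothetical invocation; the agent supplies the arguments, the tool does the math
top_products = statistical_analysis_tool(
    analysis_type="ranking",
    target_column="Total_Revenue",
    group_column="Product",
)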
Handling the "Messy Reality" of Data
Real-world data is never clean. Users make typos. Column headers are cryptic (Q3_24_Rev_Adj).
ECIA uses a Semantic Mapping Layer. If a user asks about "Sales," but the column is named gross_merch_vol, a standard regex fails. My DataPrep Agent uses the Gemini model's reasoning capabilities to map the intent to the schema before any code is run.
User: "Show me the trend of injuries."
Dataset Column:
n_woundedAgent Decision: "Map 'injuries' to
n_woundedwith 98% confidence."
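A minimal sketch of that mapping step, assuming the google-generativeai SDK (the actual agent runs inside ADK, and the prompt and response format here are illustrative):

# Hypothetical sketch: ask Gemini to map a user's term onto the schema.
# Assumes the google-generativeai package and a configured API key.
import json
import google.generativeai as genai

def map_term_to_column(user_term: str, columns: list) -> dict:
    model = genai.GenerativeModel("gemini-1.5-flash")
    prompt = (
        f"Map the user term '{user_term}' to exactly one of these columns: {columns}. "
        'Respond with JSON only: {"column": "<name>", "confidence": <0-100>}'
    )
    response = model.generate_content(
        prompt,
        generation_config={"response_mime_type": "application/json"},
    )
    return json.loads(response.text)

# map_term_to_column("injuries", ["n_wounded", "n_killed", "date"])
# -> {"column": "n_wounded", "confidence": 98}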
The "Evaluator" Loop: AI Checking AI
The final piece of the puzzle is the Evaluator Agent. It acts as a "critic." It reads the user's original prompt and the system's generated chart, then asks:
"Does this chart actually answer the specific question asked?"
If the answer is "No," it triggers a retry loop. This significantly reduces the "lazy agent" problem where models give a generic answer to a specific question.
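A hedged sketch of that loop, assuming the Evaluator emits the QA_SCORE: X/10 | CRITIQUE: ... string shown in the pipeline diagram (run_pipeline and evaluate are illustrative stand-ins for Agents 1-5 and Agent 6, and the threshold is my assumption):

# Hypothetical retry loop; run_pipeline (Agents 1-5) and evaluate (Agent 6)
# are illustrative names, not the project's exact functions.
import re

MAX_RETRIES = 2
PASS_THRESHOLD = 7  # assumed minimum acceptable QA score out of 10

def run_with_evaluation(query: str):
    response = None
    for _ in range(MAX_RETRIES + 1):
        response = run_pipeline(query)
        verdict = evaluate(query, response)  # "QA_SCORE: 8/10 | CRITIQUE: ..."
        match = re.search(r"QA_SCORE:\s*(\d+)", verdict)
        if match and int(match.group(1)) >= PASS_THRESHOLD:
            return response
        # Feed the critique back so the next attempt can correct course
        query = f"{query}\n\nPrevious attempt critique: {verdict}"
    return response  # best effort after exhausting retries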
Here's one of the queries from the demo:
[Demo screenshot]
Try It Yourself
This project was built for the Google AI Agents Intensive. It taught me that while LLMs are powerful, architecture is what makes them production-ready.
Check out the Code: GitHub Repository
See the Kaggle Notebook: Run the Agent Here
I'd love to hear your feedback—especially on how you handle security in your own agent pipelines!
