From One Agent to Many: Building a Multi-Agent Team for Snowflake Administration (Part 2)
In Part 1, we built an Admin Agent that answers questions about Snowflake usage, credits, and query history. It's incredibly useful for monitoring operational metrics.
But what if you want to go beyond monitoring and actually find waste and recommend optimizations?
In this post, I'll show you how to:
- Build a Cost Optimizer Agent specialized in finding inefficiencies
- Create an Orchestrator Agent that intelligently routes questions to the right specialist
- Make them work together as a team
By the end, you'll have a multi-agent system where you can ask:
- "How much could I save?" β Routes to Cost Optimizer
- "Show me last month's credits" β Routes to Admin Agent
- "Give me a complete analysis" β Routes to both agents
Let's build it! π
Why Add a Second Agent?
The Admin Agent from Part 1 is great at answering "what happened?" questions:
- "How many credits did we use?"
- "Show me query counts by warehouse"
- "What's our storage trend?"
But it doesn't analyze patterns or recommend actions. For that, we need a specialist focused on optimization.
Enter the Cost Optimizer Agent - designed to:
- Detect idle warehouses wasting credits
- Recommend auto-suspend settings
- Identify oversized warehouses
- Find queries that need optimization
Why separate agents instead of one big agent?
- **FoArchitecture: Three Agents Working Together
Here's what we're building:
User Question (natural language)
β
Orchestrator Agent
β β
Admin Agent Cost Optimizer Agent
β β
Combined Response
Admin Agent (from Part 1):
- Already built and working
- Handles operational metrics
- Answers "what happened?" questions
Cost Optimizer Agent (new):
- Analyzes patterns for waste
- Recommends optimizations
- Answers "what should I fix?" questions
Orchestrator Agent (new):
- Routes questions to appropriate specialist
- Can call multiple agents for comprehensive answers
- Users don't need to know which agent to ask
Example Questions:
- "Which warehouses have high idle time?"
- "Are any warehouses oversized for their workload?"
- "Show me queries with excessive spillage"
Technical Example: Analyzing Warehouse Idle Time
Let me show you how these agents work together to identify and quantify optimization opportunities.
The Pattern (Same as Part 1)
Step 1: Base View - Calculate idle time by comparing warehouse uptime vs. query execution
- Join
WAREHOUSE_METERING_HISTORY(credits consumed) withQUERY_ATTRIBUTION_HISTORY(actual work done) - The gap = idle credits
Step 2: Semantic View - Map the data to natural language
- FACTS:
idle_credits,idle_percentage,idle_cost_usd - DIMENSIONS:
warehouse_name,uptime_date - METRICS: sum of idle credits, average idle percentage
Step 3: Create Agent - Connect the semantic view to the Cortex agent
- Agent instructions: "Find waste and recommend fixes"
- Tool:
IdleWarehouseAnalystusing the semantic view
Step 4: Test It
SELECT SNOWFLAKE.CORTEX.DATA_AGENT_RUN(
'COST_OPTIMIZER_AGENT',
'{"messages":[{"role":"user","content":[{"type":"text","text":"Which warehouses have high idle time?"}]}]}'
);
π Full SQL Implementation: sql/06_create_cost_optimizer_views.sql | sql/07_create_cost_optimizer_semantic_views.sql | sql/10_create_cost_optimizer_agent.sql
Example Response:
BRONZE_LOADER_WH Analysis (Last 30 Days):
Idle Time: 68%
Idle Credits: 892 credits
Pattern: Warehouse runs continuously but data loads are batch-based
Recommendation: Set AUTO_SUSPEND = 300 seconds (5 minutes)
Reasoning: Bronze data loads complete, then warehouse sits idle
waiting for next batch
Estimated savings: ~600 credits/month
Great! Now we have two specialized agents. But there's a problem...
Part 2: Adding the Orchestrator
The Problem: Now users need to know which agent to ask:
- Questions about credits β Ask Admin Agent
- Questions about optimization β Ask Cost Optimizer Agent
That doesn't scale. What if someone asks "Why are my costs high?" - which agent should handle that?
The Solution: An Orchestrator Agent that routes questions automatically.
The Pattern
Step 1: Routing Function - Python UDF that lets one agent call another
- Takes agent FQN and question as parameters
- Calls
SNOWFLAKE.CORTEX.DATA_AGENT_RUNprogrammatically - Returns the agent's response
Step 2: Orchestrator Agent - Meta-agent that routes to specialists
- Instructions define routing logic (keywords: "idle"βCost, "credits"βAdmin)
- Tools reference the routing function
- Can call multiple agents for comprehensive analysis
Step 3: Grant Permissions - Allow orchestrator to invoke specialist agents
π Full Implementation: sql/12_create_orchestrator_agent.sql
Part 3: Seeing It All Work Together
Now let's test the complete system.
Example 1: Simple Routing
Question: "How many credits did BRONZE_LOADER_WH use last month?"
What Happens:
- Orchestrator receives question
- Identifies this as operational metrics (keywords: "credits", "use")
- Routes to Admin Agent
- Admin Agent queries
SV_WAREHOUSE_COST_ANALYSIS - Returns credit numbers
Response:
BRONZE_LOADER_WH Credit Usage (Last 30 Days):
Total Credits: 1,312
Average Daily: 43.7 credits
Active Days: 30 of 30 (runs daily for ETL loads)
Example 2: Optimization Routing
Question: "Is BRONZE_LOADER_WH wasting credits?"
What Happens:
- Orchestrator receives question
- Identifies this as optimization (keywords: "wasting")
- Routes to Cost Optimizer Agent
- Cost Optimizer queries
SV_IDLE_WAREHOUSE_ANALYSIS - Returns analysis with recommendations
Response:
BRONZE_LOADER_WH Waste Analysis:
Idle Percentage: 68%
Idle Credits: 892 credits
Pattern: Batch ETL jobs complete quickly, warehouse stays running
Recommendation:
ALTER WAREHOUSE BRONZE_LOADER_WH SET AUTO_SUSPEND = 300;
Estimated Monthly Savings: ~600 credits
Reasoning: Bronze loads are periodic - warehouse doesn't need to
stay running between batch jobs
Example 3: Multi-Agent Analysis
Question: "Give me a complete analysis of BRONZE_LOADER_WH"
What Happens:
- Orchestrator recognizes need for comprehensive analysis
- Routes question to BOTH agents:
- Admin Agent: Get usage metrics
- Cost Optimizer: Get efficiency analysis
- Aggregates both responses
Response:
BRONZE_LOADER_WH Complete Analysis:
π USAGE (from Admin Agent):
- Total Credits (30 days): 1,312
- Average Daily: 43.7 credits
- Primary Workload: Bronze layer ETL (raw data ingestion)
- Load Schedule: Hourly batches from source systems
- Active Days: 30 of 30
π° EFFICIENCY (from Cost Optimizer):
- Idle Time: 68%
- Wasted Credits: 892 credits
- Pattern: Loads complete in ~15 min, then idle for 45 min
- Current auto_suspend: Not set (or >30 min)
RECOMMENDATION:
ETL warehouse with batch workload pattern. Jobs complete quickly
but warehouse stays running between batches.
Action: SET AUTO_SUSPEND = 300 (5 minutes)
Impact: ~600 credits/month savings with no performance impact
Beautiful! The user just asked one question and got insights from two specialists.
Technical Architecture: How It Actually Works
Let me break down the three layers that make these agents work.
Layer 1: Base Views (Data Calculation)
These views do the actual computational work by querying ACCOUNT_USAGE. For example, idle warehouse detection compares:
-
WAREHOUSE_METERING_HISTORY(when warehouse was running) -
QUERY_ATTRIBUTION_HISTORY(when queries were executing) - The gap = wasted compute
Layer 2: Semantic Views (Natural Language Mapping)
These teach the agent what the data means:
- FACTS: Raw values (idle_credits, idle_percentage)
- DIMENSIONS: Grouping columns (warehouse_name, date)
- METRICS: Aggregations (SUM, AVG)
Layer 3: Agents (Natural Language Interface)
Each agent:
- Receives natural language question
- Determines which semantic view to query
- Cortex Analyst generates SQL automatically
- Executes and returns natural language answer
MCP Integration (Optional)
For easier access, you can build an MCP (Model Context Protocol) server to integrate with GitHub Copilot:
# Simplified MCP Server Pattern
def ask_agent(question: str, agent_fqn: str) -> dict:
payload = {"messages": [{"role": "user", "content": [{"type": "text", "text": question}]}]}
result = conn.cursor().execute(f"SELECT SNOWFLAKE.CORTEX.DATA_AGENT_RUN('{agent_fqn}', '{json.dumps(payload)}')")
return json.loads(result.fetchone()[0])
π Full MCP Server: mcp/server.py
How the Agents Work Together
Here's the beautiful part: the Admin Agent provides context, and the Cost Optimizer provides action.
Example: Analyzing Bronze ETL Warehouse
Question: "Should I optimize BRONZE_LOADER_WH?"
Admin Agent provides context:
BRONZE_LOADER_WH Usage:
- Credits (30 days): 1,312
- Workload: Bronze layer ETL
- Schedule: Hourly batch loads
Cost Optimizer provides analysis:
Efficiency Analysis:
- Idle time: 68%
- Pattern: 15-min loads, 45-min idle gaps
- Recommendation: SET AUTO_SUSPEND = 300
- Expected savings: ~600 credits/month
Together, they provide:
- Context - What the warehouse does (bronze ETL)
- Root cause - Batch pattern creates idle gaps
- Solution - Auto-suspend between batches
- Impact - Savings estimate without performance impact
This combination ensures you get both what's happening and what to do about it in one conversation.
Quick Start: Adding to Your Existing Agent
If you built the Admin Agent from Part 1, here's the deployment sequence:
# 1. Cost Optimizer Agent
sql/06_create_cost_optimizer_views.sql # Base views
sql/07_create_cost_optimizer_semantic_views.sql # Semantic layer
sql/10_create_cost_optimizer_agent.sql # Agent
# 2. Orchestrator Agent
sql/12_create_orchestrator_agent.sql # Routing agent + function
# 3. Test it
SELECT SNOWFLAKE.CORTEX.DATA_AGENT_RUN(
'ORCHESTRATOR_AGENT',
'{"messages":[{"role":"user","content":[{"type":"text","text":"Which warehouses are idle?"}]}]}'
);
π All SQL scripts: GitHub Repository
You now have a working multi-agent system!
Agent Execution Costs
Each agent query executes on a Snowflake warehouse and consumes credits:
Typical costs:
- Simple query (1 warehouse, 1 dimension): ~0.001-0.002 credits
- Complex query (joins, aggregations): ~0.005-0.01 credits
- Very complex analysis (multiple time ranges): ~0.01-0.05 credits
Example usage:
- 100 agent queries/day
- Average 0.005 credits per query
- Total: 0.5 credits/day = ~15 credits/month
- At $3.50/credit: ~$52/month
The agents themselves cost very little to run compared to the insights they provide.
Key Lessons Learned
1. Orchestration Makes It Seamless
Without the Orchestrator, users need to know:
- Which agent handles which questions
- How to call different agents
- When to use multiple agents
With the Orchestrator:
- Just ask naturally
- Routing happens automatically
- Multi-agent queries work transparently
2. Agent Communication Is Key
The routing function enables:
- One agent calling another
- Aggregating responses
- Building complex workflows
Pattern: Function β Agent calls other agents via that function
3. Start with Two, Scale to Many
This architecture scales easily:
- Each new agent is independent
- Orchestrator routing rules are additive
- No changes needed to existing agents
4. Test Routing Logic
Make sure to test:
- Edge cases (ambiguous questions)
- Multi-agent scenarios
- Direct vs. orchestrated access
- Response aggregation quality
What's Next?
In Part 3, I'll show you how to add a Security & Governance Agent that:
- Audits role assignments and privileges
- Detects failed login patterns (brute force attacks)
- Identifies inactive users with access
- Monitors for compliance violations
Then we'll have a complete three-agent team:
- π§ Admin - Operational metrics
- π° Cost Optimizer - Waste detection
- π Security - Compliance and access control
All coordinated by the Orchestrator.
Code Repository
All SQL scripts and complete implementation available at:
GitHub Repository
Includes:
- Base view SQL for idle detection
- Additional views for warehouse sizing, query optimization
- Semantic view definitions
- All three agent configurations
- MCP server code for Copilot integration
- Step-by-step deployment guide
- API endpoints - Integrate with existing dashboards
- Scheduled queries - Automated monitoring with alerts
- GitHub Copilot - Ask questions directly from your IDEgithub.com/LALITHASWAROOPK/agent_snowflake_admin
Includes:
- All SQL scripts (base views, semantic views, agents)
- MCP server for GitHub Copilot integration
- Deployment guide
- Example questions and expected outputs
Conclusion
In Part 1, we built one agent. In Part 2, we:
- β Added a specialized Cost Optimizer Agent
- β Created an Orchestrator for intelligent routing
- β Made them work together as a team
The multi-agent pattern is powerful because:
- Each agent specializes in one domain
- Orchestrator handles routing automatically
- Easy to add new agents without changing existing ones
- Users get comprehensive answers from one question
From the user's perspective:
Before: "Which agent should I ask about idle warehouses?"
After: "Which warehouses are idle?" (Orchestrator figures it out)
This is just the beginning. You can extend this architecture with:
- Security agents for compliance
- Performance agents for query optimization
- Data quality agents for monitoring freshness
- Custom agents for your specific needs
The pattern stays the same: Views β Semantic Views β Agent β Orchestrator
Questions or feedback? Drop a comment below!
Find this helpful? Give it a β€οΈ and stay tuned for Part 3: Security & Governance Agent
Want to discuss? Connect with me on LinkedIn or Twitter
Quick Reference: Agent Responsibilities
| Agent | Purpose | Example Questions |
|---|---|---|
| Admin | Operational metrics | "How many credits did BRONZE_LOADER_WH use?" |
| Cost Optimizer | Waste detection | "Is BRONZE_LOADER_WH running idle?" |
| Orchestrator | Intelligent routing | "Analyze BRONZE_LOADER_WH efficiency" |
Part 1: Build an AI Admin Agent
Part 2: Multi-Agent Architecture (this post)
Part 3: Security & Governance Agent (coming soon)
Top comments (0)