DEV Community

osman uygar kΓΆse
osman uygar kΓΆse

Posted on

Secure Database Access for AI Agents: Building an MCP Server with SQLatte

TL;DR: Learn how to give Claude and other AI agents controlled access to your databases through MCP (Model Context Protocol) with enterprise-grade security, audit logging, and cost optimization using SQLatte.


πŸ€” The Problem

You want AI agents like Claude to query your database directly, but:

  • ❌ Direct database access is a security nightmare
  • ❌ No audit trail means compliance issues
  • ❌ Uncontrolled LLM calls = surprise bills
  • ❌ SQL injection risks from AI-generated queries

The solution? MCP Server + SQLatte = Secure, auditable AI-to-database bridge


πŸ”Œ What is MCP?

MCP (Model Context Protocol) is Anthropic's open standard that lets AI models interact with external systems safely.

Without MCP:
User β†’ Claude β†’ "Here's SQL code" β†’ User copies β†’ Runs manually

With MCP:
User β†’ Claude β†’ [MCP Server] β†’ Database β†’ Results instantly ✨
Enter fullscreen mode Exit fullscreen mode

Key benefit: AI can query data while you maintain complete control.


πŸ—οΈ SQLatte Architecture

SQLatte is an open-source MCP server that provides a secure gateway between AI and databases:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Claude Desktop β”‚
β”‚   or API        β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         β”‚ MCP Protocol (SSE)
         β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   SQLatte MCP Server    β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”‚
β”‚  β”‚  Audit Logger    │◄──┼─ Every query logged
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”‚
β”‚  β”‚  Cost Optimizer  │◄──┼─ Smart LLM routing
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”‚
β”‚  β”‚ SQL Generator    │◄──┼─ NL β†’ SQL conversion
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”‚
β”‚  β”‚ Security Layer   │◄──┼─ Injection protection
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
           β”‚
           β–Ό
   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
   β”‚   Trino       β”‚
   β”‚   BigQuery    β”‚
   β”‚   PostgreSQL  β”‚
   β”‚   MySQL       β”‚
   β”‚   ClickHouse  β”‚
   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
Enter fullscreen mode Exit fullscreen mode

GitHub: osmanuygar/sqlatte


πŸ” Security: 4 Layers of Protection

1. SQL Injection Protection

# Multi-layer validation
class SecurityValidator:
    def validate_sql(self, sql: str) -> tuple[bool, str]:
        # Layer 1: Keyword blacklist
        dangerous_keywords = ["DROP", "TRUNCATE", "DELETE FROM"]

        # Layer 2: Syntax parsing
        parsed = sqlparse.parse(sql)

        # Layer 3: Risk scoring
        risk_score = self.calculate_risk(parsed)

        # Layer 4: Admin approval for high-risk
        if risk_score > 0.7:
            return False, "Requires admin approval"

        return True, "Safe"
Enter fullscreen mode Exit fullscreen mode

2. Rate Limiting

# config.yaml
security:
  rate_limiting:
    enabled: true
    requests_per_minute: 60
    requests_per_hour: 1000
    per_user: true

  data_limits:
    max_bytes_per_query: 10737418240  # 10 GB
    max_rows_returned: 100000
Enter fullscreen mode Exit fullscreen mode

3. User-Based Access Control

# Per-user restrictions
user_access:
  osmanuygar:
    allowed_databases: ["analytics", "staging"]
    allowed_schemas: ["public", "reports"]
    max_query_cost: 10.00  # USD
    daily_quota: 500
Enter fullscreen mode Exit fullscreen mode

4. Audit Logging

Every query is logged with full context:

{
  "timestamp": "2025-05-20T14:30:45Z",
  "user": "data_analyst",
  "session_id": "sess_abc123",
  "intent_type": "sql_generation",
  "input": "Show top 10 customers by revenue",
  "generated_sql": "SELECT customer, SUM(amount)...",
  "model_used": "claude-sonnet-4-20250514",
  "execution_time_ms": 1243,
  "rows_returned": 10,
  "data_scanned_bytes": 2147483648,
  "cost": 0.015,
  "success": true
}
Enter fullscreen mode Exit fullscreen mode

Export to CSV for compliance audits (SOC2, GDPR).


πŸ’° Cost Optimization: 70% Savings

Task-Based Model Routing

Not every query needs an expensive model. SQLatte routes smartly:

model_routing:
  enabled: true

  tasks:
    # Cheap & fast
    intent_detection:
      model: "claude-haiku-3-5-20241022"
      max_tokens: 500
      cost_per_call: ~$0.0003

    # Accurate & reliable
    sql_generation:
      model: "claude-sonnet-4-20250514"
      max_tokens: 4096
      cost_per_call: ~$0.015

    # Balanced
    insights:
      model: "claude-sonnet-4-20250514"
      max_tokens: 2000
      cost_per_call: ~$0.012
Enter fullscreen mode Exit fullscreen mode

Real Cost Comparison

Scenario Without Routing With Routing Savings
100K queries/month $3,600 $1,012 72%

How it works:

  1. Every query hits Haiku first (intent detection)
  2. Only 50% need Sonnet (complex SQL)
  3. Only 10% need insights (analysis)

Result: Massive cost reduction πŸ’°


πŸ› οΈ MCP Tools: 3 Simple APIs

SQLatte exposes 3 tools via sqlatte_mcp_server.py:

1. ask_database - Natural Language Query

@server.tool()
async def ask_database(question: str) -> str:
    """
    Execute natural language query against database.

    Example: "Show top 10 products by sales last month"
    Returns: Formatted table with results
    """
Enter fullscreen mode Exit fullscreen mode

2. list_tables - Discover Schema

@server.tool()
async def list_tables(schema: str = None) -> str:
    """
    List all tables in database.

    Returns: Table names with row counts
    """
Enter fullscreen mode Exit fullscreen mode

3. get_schema - Table Structure

@server.tool()
async def get_schema(table: str) -> str:
    """
    Get column details for accurate SQL generation.

    Returns: Column names, types, sample values
    """
Enter fullscreen mode Exit fullscreen mode

Why only 3 tools? Keep MCP simple. Advanced features (dashboards, scheduling) live in the web UI.


πŸš€ Setup in 5 Minutes

Step 1: Install

git clone https://github.com/osmanuygar/sqlatte.git
cd sqlatte
pip install -r requirements.txt
pip install mcp httpx
Enter fullscreen mode Exit fullscreen mode

Step 2: Configure

Edit config/config.yaml:

database:
  provider: "trino"  # or bigquery, postgres, mysql
  trino:
    host: "your-host.com"
    port: 443
    user: "your-username"
    password: "your-password"
    catalog: "hive"
    schema: "default"

llm:
  provider: "anthropic"
  anthropic:
    api_key: "sk-ant-your-key"
    model: "claude-sonnet-4-20250514"

model_routing:
  enabled: true
Enter fullscreen mode Exit fullscreen mode

Step 3: Run Server

python app.py
# Server at http://localhost:8000
Enter fullscreen mode Exit fullscreen mode

Step 4: Configure Claude Desktop

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "sqlatte": {
      "command": "python3",
      "args": ["/absolute/path/to/sqlatte/sqlatte_mcp_server.py"],
      "env": {
        "SQLATTE_URL": "http://localhost:8000",
        "TRINO_HOST": "your-host",
        "TRINO_PORT": "443",
        "TRINO_USER": "username",
        "TRINO_PASSWORD": "password",
        "TRINO_CATALOG": "hive",
        "TRINO_SCHEMA": "default",
        "TRINO_HTTP_SCHEME": "https"
      }
    }
  }
}
Enter fullscreen mode Exit fullscreen mode

Step 5: Test

Restart Claude Desktop, then ask:

"List all tables in the database"
Enter fullscreen mode Exit fullscreen mode

You should see SQLatte responding! πŸŽ‰


πŸ“Š Real Usage Example

Natural Language to SQL

User in Claude Desktop:

"Show me our top 10 customers by revenue in Q1 2025"
Enter fullscreen mode Exit fullscreen mode

What Happens Behind the Scenes:

  1. Intent Detection (Haiku - 50ms, $0.0003)
   Classification: QUERY
Enter fullscreen mode Exit fullscreen mode
  1. SQL Generation (Sonnet - 1.2s, $0.015)
   SELECT 
     customer_name,
     SUM(order_amount) as total_revenue
   FROM orders
   WHERE order_date >= '2025-01-01' 
     AND order_date < '2025-04-01'
   GROUP BY customer_name
   ORDER BY total_revenue DESC
   LIMIT 10
Enter fullscreen mode Exit fullscreen mode
  1. Execution (Trino - 250ms)
   10 rows returned, 2.3 GB scanned
Enter fullscreen mode Exit fullscreen mode
  1. Audit Log Created
   {
     "user": "data_analyst",
     "models_used": ["haiku", "sonnet"],
     "total_cost": "$0.0153",
     "execution_time_ms": 1500
   }
Enter fullscreen mode Exit fullscreen mode

Claude Returns:

Here are your top 10 customers by Q1 2025 revenue:

| Customer Name       | Total Revenue  |
|---------------------|----------------|
| ACME Corporation    | $1,234,567     |
| TechCorp Inc        | $987,654       |
| Global Solutions    | $765,432       |
...
Enter fullscreen mode Exit fullscreen mode

🎯 Key Features

βœ… Multi-Database Support

# Switch database with config
database:
  provider: "bigquery"  # or trino, postgres, mysql, clickhouse
Enter fullscreen mode Exit fullscreen mode

βœ… Semantic Layer

Map business terms to database schema:

# Prevent LLM hallucinations
semantic_layer:
  entities:
    - name: "Customer"
      table: "customer_master"
      columns:
        cust_id: "Customer ID"
        full_name: "Customer Name"
        ltv: "Lifetime Value"
Enter fullscreen mode Exit fullscreen mode

Now ask: "Show customers with high lifetime value"

SQLatte knows ltv column automatically!

βœ… Dashboard System

Save queries as reusable dashboards with auto-generated charts.

βœ… Query Scheduler

Schedule recurring reports with email delivery:

schedule:
  name: "Daily Sales Report"
  frequency: "0 9 * * *"  # Every day at 9 AM
  recipients: ["team@company.com"]
Enter fullscreen mode Exit fullscreen mode

πŸ” Admin Panel

Access at /admin for:

  • Audit Logs: Filter by user, date, intent type
  • Cost Tracking: Per-user LLM spend analytics
  • Semantic Layer: Visual entity/relationship builder
  • Query History: Review all SQL executions
  • Rate Limit Config: Adjust security settings

Export audit logs to CSV for compliance.


πŸ§ͺ Production Best Practices

1. Always Enable Audit Logging

analytics:
  enabled: true  # PostgreSQL for query history
Enter fullscreen mode Exit fullscreen mode

2. Start Conservative with Rate Limits

rate_limiting:
  requests_per_minute: 10  # Start low
  requests_per_hour: 100   # Increase based on usage
Enter fullscreen mode Exit fullscreen mode

3. Use Semantic Layer

Prevents LLM hallucinations by 70%+:

semantic_layer:
  enabled: true
  auto_discovery: true  # Scan DB for suggestions
Enter fullscreen mode Exit fullscreen mode

4. Monitor Costs

# Check audit logs daily
SELECT 
  user_id,
  DATE(timestamp) as date,
  SUM(total_tokens) as tokens,
  COUNT(*) as queries
FROM audit_logs
GROUP BY user_id, date
Enter fullscreen mode Exit fullscreen mode

5. Test with Read-Only User

-- Create read-only database user
CREATE USER sqlatte_ro WITH PASSWORD 'secure_password';
GRANT SELECT ON ALL TABLES IN SCHEMA public TO sqlatte_ro;
Enter fullscreen mode Exit fullscreen mode

🚧 Common Challenges & Solutions

Challenge 1: LLM Generates Wrong Column Names

Problem: AI hallucinates user_name when column is username

Solution: Always provide schema context + semantic layer

# get_schema tool returns exact column names
system_prompt = f"""
Available columns: {get_schema('users')}
Use ONLY these exact column names.
"""
Enter fullscreen mode Exit fullscreen mode

Challenge 2: Expensive Queries

Problem: User asks vague question, LLM scans entire table

Solution: Add data limits in config

security:
  max_bytes_per_query: 10737418240  # 10 GB limit
Enter fullscreen mode Exit fullscreen mode

Challenge 3: Rate Limit False Positives

Problem: Legitimate power users get blocked

Solution: Per-user quotas

users:
  power_analyst:
    requests_per_hour: 500  # Higher limit
  regular_user:
    requests_per_hour: 100  # Standard limit
Enter fullscreen mode Exit fullscreen mode

πŸ“ˆ What Makes SQLatte Production-Ready?

βœ… Security

  • SQL injection protection (multi-layer)
  • Rate limiting per user
  • Audit logging (100% coverage)
  • User-based access control

βœ… Cost Control

  • Task-based LLM routing (70% savings)
  • Per-user cost tracking
  • Query cost estimation
  • Configurable quotas

βœ… Reliability

  • Connection pooling
  • Async FastAPI architecture
  • Graceful error handling
  • Automatic retries

βœ… Observability

  • Detailed audit logs
  • Token usage tracking
  • Performance metrics
  • CSV export for analysis

🀝 Open Source & MIT Licensed

SQLatte is completely free and open source:

# Contribute
git clone https://github.com/osmanuygar/sqlatte.git
cd sqlatte
git checkout -b feature/your-feature
# Make changes, commit, push, PR!
Enter fullscreen mode Exit fullscreen mode

Welcome contributions:

  • πŸ› Bug fixes
  • ✨ New database providers
  • πŸ“– Documentation improvements
  • 🌍 Translations

πŸ’­ Why MCP + SQLatte?

Compared to Direct Database Access

  • βœ… Security: Multi-layer protection vs direct access
  • βœ… Audit: Full logging vs no trail
  • βœ… Cost: Optimized routing vs uncontrolled

Compared to Custom API

  • βœ… Standard: MCP protocol vs proprietary
  • βœ… Multi-AI: Works with Claude, ChatGPT, Gemini
  • βœ… Community: Shared tooling ecosystem

Compared to BI Tools

  • βœ… Conversational: Natural language vs click-heavy UI
  • βœ… Flexible: Ad-hoc queries vs rigid dashboards
  • βœ… Fast: Instant answers vs long setup

πŸš€ Get Started

# 1. Clone
git clone https://github.com/osmanuygar/sqlatte.git
cd sqlatte

# 2. Install
pip install -r requirements.txt
pip install mcp httpx

# 3. Configure
cp config/config.yaml.example config/config.yaml
# Edit with your database credentials

# 4. Run
python app.py

# 5. Configure Claude Desktop
# (See setup instructions above)

# 6. Ask Claude!
"List all tables"
Enter fullscreen mode Exit fullscreen mode

πŸ“š Resources


🎯 Key Takeaways

  1. MCP enables safe AI-to-database connections
  2. Audit logging is non-negotiable for production
  3. Task-based routing cuts LLM costs by 70%+
  4. Semantic layers prevent AI hallucinations
  5. SQLatte provides enterprise-ready components out of the box

Made with ❀️ and β˜•
Secure, auditable, cost-effective AI database access


Questions? Open an issue on GitHub

Found this helpful? ⭐️ Star the repo and share with your team!

Tags: #ai #mcp #python #database #security #anthropic #claude #sql #opensource #llm

Top comments (1)

Collapse
 
arcan_trkay_ddacd7452639 profile image
Arcan TΓΌrkay

You’ve done an incredible job. Congratulations on your vision and the product you’ve created.πŸ€œπŸ€›