TL;DR: We built SQLatte, an AI-powered natural language to SQL platform, and learned that 80% of the magic is in the prompts. Here's how we engineered LLM prompts to generate optimized, production-ready SQL queries with 94% accuracy and 10x faster query execution through intelligent partition filtering.
The Challenge
Imagine you're a system engineer at 2 AM, your monitoring system is screaming, and you need to query your Zabbix database to find which host is causing the issue. Problem? You don't know SQL. Even if you did, Zabbix's schema is complex with dozens of interconnected tables.
This is why we built SQLatte - a platform where you can ask "Which hosts had CPU over 90% in the last hour?" and get instant results without writing a single line of SQL.
The secret sauce? Carefully engineered prompts that turn casual questions into optimized database queries.
🏗️ Architecture Overview
```
User Question
      ↓
Intent Detection (SQL vs Chat)
      ↓
Schema Context Injection
      ↓
SQL Generation with Optimizations
      ↓
Query Execution
      ↓
Results + Explanation
```
Let's dive into each step and the prompt engineering behind it.
🧠 Part 1: Intent Detection - The Gatekeeper
The Problem
Not every question needs a SQL query. Users might say:
- "Hello!" (chat)
- "How do I use this?" (chat)
- "Show me top customers" (SQL)
- "What's my table schema?" (SQL)
We need to route correctly before attempting SQL generation.
The Solution: Structured Intent Detection Prompt
```python
def determine_intent(self, question: str, schema_info: str) -> Dict[str, Any]:
    """
    Determine if the question requires a SQL query or is general chat.
    """
    prompt = f"""Analyze this user question and determine if it requires a SQL query or is general conversation.

Available Tables Schema:
{schema_info if schema_info else "No tables selected"}

User Question: {question}

Rules:
1. If question asks about data, analytics, or queries related to the available tables → intent: "sql"
2. If question is greeting, general chat, help, or unrelated to tables → intent: "chat"
3. If tables are not selected but question seems like SQL query → intent: "chat" (explain they need to select tables)

Respond in this exact format:
INTENT: sql or chat
CONFIDENCE: 0.0 to 1.0
REASONING: brief explanation
"""
```
Why This Works
1. Structured Output: We enforce a specific format (INTENT: ... CONFIDENCE: ...) which makes parsing deterministic.
2. Context Injection: We pass the available schema so the LLM knows what data exists.
3. Clear Rules: Three explicit rules guide the LLM's decision-making.
4. Confidence Score: Allows us to handle edge cases (confidence < 0.6 → ask for clarification).
Real Results
```python
# Example 1: Clear SQL intent
question = "Show me users who signed up yesterday"
# → INTENT: sql, CONFIDENCE: 0.95

# Example 2: Chat intent
question = "Hello, how are you?"
# → INTENT: chat, CONFIDENCE: 0.98

# Example 3: Edge case
question = "What data do you have?"
# → INTENT: chat, CONFIDENCE: 0.85
# → Response: "Please select tables from the dropdown..."
```
Accuracy: 97% on our test dataset of 500 queries.
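Because the output format is fixed, the routing code stays tiny. Here's a minimal parsing sketch (a simplified illustration, not the exact SQLatte parser) that turns the model's response into a routing decision, including the confidence < 0.6 clarification fallback:

```python
import re
from typing import Any, Dict

def parse_intent_response(response: str, threshold: float = 0.6) -> Dict[str, Any]:
    """Parse the INTENT / CONFIDENCE block and decide how to route the question."""
    intent_match = re.search(r"INTENT:\s*(sql|chat)", response, re.IGNORECASE)
    conf_match = re.search(r"CONFIDENCE:\s*([01](?:\.\d+)?)", response)

    intent = intent_match.group(1).lower() if intent_match else "chat"
    confidence = float(conf_match.group(1)) if conf_match else 0.0

    # Low confidence → don't guess, ask the user to clarify instead
    if confidence < threshold:
        return {"intent": "clarify", "confidence": confidence}
    return {"intent": intent, "confidence": confidence}
```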
⚡ Part 2: The Performance Game-Changer - Partition Filtering
The Problem
Imagine querying a table with 10 billion rows of monitoring data:
```sql
-- ❌ DISASTER: Full table scan
SELECT * FROM metrics
WHERE cpu_usage > 90
LIMIT 100;

-- Execution time: 347 seconds ⏰
-- Rows scanned: 10,000,000,000
```
Our Trino/Hive tables are partitioned by date using a dt column (format: YYYYMMDD, e.g., '20251223'). But the LLM doesn't know this by default.
The Solution: Partition-Aware Prompt Engineering
Here's the critical part of our SQL generation prompt:
````python
def generate_sql(self, question: str, schema_info: str) -> Tuple[str, str]:
    prompt = f"""You are a SQL expert. Generate a SQL query based on the user's question.

Table Schema(s):
{schema_info}

User Question: {question}

Rules:
1. Generate ONLY valid SQL syntax
2. If multiple tables are provided, use appropriate JOINs
3. Infer JOIN conditions from table relationships (common column names)
4. Use table aliases for readability (e.g., orders o, customers c)
5. Include LIMIT clause for safety (default 100 rows)
6. For aggregations, use GROUP BY appropriately
7. Use explicit JOIN syntax (INNER JOIN, LEFT JOIN, etc.)

⚡ PERFORMANCE OPTIMIZATION (CRITICAL):
8. **PARTITION COLUMN**: If schema contains a 'dt' column (VARCHAR format YYYYMMDD, e.g., '20251218'), this is a PARTITION KEY
   - ALWAYS add WHERE clause with 'dt' filter when possible
   - Date filters MUST use dt column in format: dt = '20251218' or dt BETWEEN '20251201' AND '20251218'
   - For "recent", "latest", "today", "last" queries → use last 2 days: dt >= '20251222'
   - For "yesterday" → use dt = '20251222' (current date - 1)
   - For specific date range → convert to dt format (e.g., "last week" → dt >= '20251216')
   - ⚠️ NEVER query without dt filter unless explicitly asked for "all time" data
9. If 'datetime' column exists alongside 'dt', use 'dt' for filtering (faster) and 'datetime' for display
10. Example optimized query:
    SELECT * FROM orders WHERE dt >= '20251201' AND status = 'completed' LIMIT 100

Format your response as:
SQL:
```sql
[your SQL query here]
```

EXPLANATION:
[brief explanation including partition strategy]
"""
````
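On the consumer side, that response format is easy to split back into a query and an explanation. Something like this works (a simplified sketch with a hypothetical helper name, not the exact SQLatte code):

```python
from typing import Tuple

def split_sql_response(response: str) -> Tuple[str, str]:
    """Split an 'SQL: ... EXPLANATION: ...' response into (sql, explanation)."""
    sql_part, _, explanation = response.partition("EXPLANATION:")
    sql = sql_part.replace("SQL:", "", 1)
    # The model wraps the query in a markdown code fence; strip those markers
    for marker in ("```sql", "```"):
        sql = sql.replace(marker, "")
    return sql.strip(), explanation.strip()
```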
Real-World Impact
Before partition optimization:
```sql
-- User: "Show me yesterday's failed orders"
-- Generated SQL (naive):
SELECT * FROM orders
WHERE status = 'failed'
  AND datetime >= '2024-12-22 00:00:00'
LIMIT 100;

-- ❌ Performance:
-- Execution: 89 seconds
-- Rows scanned: 10B
-- Partitions scanned: 3,650 (all partitions!)
```
After prompt engineering:
```sql
-- Generated SQL (optimized):
SELECT * FROM orders
WHERE dt = '20241222'
  AND status = 'failed'
  AND datetime >= '2024-12-22 00:00:00'
LIMIT 100;

-- ✅ Performance:
-- Execution: 0.8 seconds (111x faster!)
-- Rows scanned: 2.7M
-- Partitions scanned: 1 (only yesterday's partition!)
```
The Magic: Query Classification
The LLM learns to classify queries and apply appropriate patterns:
| Query Type | Pattern | Partition Strategy |
|---|---|---|
| "latest N records" | `ORDER BY datetime DESC LIMIT N` | `dt >=` today - 2 days |
| "yesterday" | Date = yesterday | `dt =` yesterday |
| "last week" | Date range | `dt >=` today - 7 days |
| "specific date" | Date = X | `dt = 'YYYYMMDD'` |
| "all time" (explicit) | No date filter | No `dt` filter |
🎨 Part 3: Schema Context Injection
The Problem
LLMs need to understand your database structure to generate correct SQL. But how do you provide schema information effectively?
Our Approach: Rich Schema Description
```python
def get_table_schema(self, table_name: str) -> str:
    """Get table schema with enhanced information"""
    schema_info = f"""Table: {self.database}.{table_name}
{'=' * 60}

COLUMNS:
──────────────────────────────────────────────────────────
  • order_id    → BIGINT [NOT NULL] [PRIMARY KEY]
  • customer_id → BIGINT [NOT NULL] [INDEX]
  • dt          → VARCHAR(8) [PARTITION KEY] ⚡ Format: YYYYMMDD
  • datetime    → TIMESTAMP [NOT NULL]
  • status      → VARCHAR(50) [NOT NULL]
  • amount      → DECIMAL(10,2)
  • region      → VARCHAR(50)

PRIMARY KEY: order_id

FOREIGN KEYS:
──────────────────────────────────────────────────────────
  • customer_id → customers.customer_id

INDEXES:
──────────────────────────────────────────────────────────
  • idx_status (status)
  • idx_customer (customer_id)

⚡ OPTIMIZATION HINTS:
──────────────────────────────────────────────────────────
  • This table is PARTITIONED by 'dt' column
  • ALWAYS filter by 'dt' for fast queries
  • Current partitions: 20231201 to 20251223 (755 days)

ROWS: 10,234,567,890
"""
    return schema_info
```
Key Insights
1. Visual Hierarchy: Clear sections (COLUMNS, INDEXES, etc.) help the LLM parse structure.
2. Explicit Markers: [PARTITION KEY], [PRIMARY KEY] make important columns obvious.
3. Optimization Hints: We literally tell the LLM how to optimize queries.
4. Format Examples: Format: YYYYMMDD prevents the LLM from guessing date formats.
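When the user selects several tables, each per-table description gets combined into the single `schema_info` string that the earlier prompts consume. A minimal assembly sketch (hypothetical, assuming a `get_table_schema` callable like the one above):

```python
from typing import Callable, List

def build_schema_context(get_table_schema: Callable[[str], str], table_names: List[str]) -> str:
    """Concatenate rich per-table descriptions into one schema_info block for the prompt."""
    sections = [get_table_schema(name) for name in table_names]
    # A clear separator helps the LLM tell where one table's schema ends and the next begins
    return "\n\n".join(sections) if sections else "No tables selected"
```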
🔗 Part 4: Multi-Table JOINs
The Challenge
User asks: "Show me orders with customer names from last week"
This requires:
- Joining `orders` and `customers` tables
- Inferring the join condition (likely `customer_id`)
- Applying partition filter
- Proper aliasing
The Prompt Enhancement
```python
# In the SQL generation prompt:
"""
JOIN RULES (CRITICAL):
1. If multiple tables are provided, use appropriate JOINs
2. Infer JOIN conditions from:
   - Common column names (e.g., customer_id, user_id)
   - Foreign key relationships (see FOREIGN KEYS section in schema)
   - Column name patterns (table_name_id usually joins to table_name.id)
3. Always use explicit JOIN syntax:
   ✅ INNER JOIN customers c ON o.customer_id = c.customer_id
   ❌ FROM orders o, customers c WHERE o.customer_id = c.customer_id
4. Use meaningful table aliases:
   - orders → o
   - customers → c
   - products → p
5. Apply partition filter to ALL partitioned tables in JOIN

Example multi-table query:
SELECT o.order_id, c.customer_name, o.amount
FROM orders o
INNER JOIN customers c ON o.customer_id = c.customer_id
WHERE o.dt >= '20251216'  -- Last week
  AND o.status = 'completed'
LIMIT 100;
"""
```
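Rule 2 asks the LLM to infer join conditions from shared column names and `table_name_id` patterns. As a mental model, the same heuristic expressed in plain Python looks roughly like this; it's illustrative only, since in SQLatte this inference happens inside the prompt, not in application code:

```python
from typing import List, Optional

def guess_join_condition(
    left_table: str, left_cols: List[str],
    right_table: str, right_cols: List[str],
) -> Optional[str]:
    """Guess a join the way the prompt asks: shared *_id columns first, then <table>_id patterns."""
    # 1. A *_id column present on both sides (e.g., customer_id in orders and customers)
    shared = [c for c in left_cols if c in right_cols and c.endswith("_id")]
    if shared:
        return f"{left_table}.{shared[0]} = {right_table}.{shared[0]}"
    # 2. <right_table>_id on the left joining to the right table's id column
    fk = f"{right_table.rstrip('s')}_id"  # e.g., customers → customer_id
    if fk in left_cols and "id" in right_cols:
        return f"{left_table}.{fk} = {right_table}.id"
    return None

# Example:
# guess_join_condition("orders", ["order_id", "customer_id", "dt"],
#                      "customers", ["customer_id", "customer_name", "email"])
# → "orders.customer_id = customers.customer_id"
```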
Results
Input:
"Show me orders with customer names from yesterday"
Generated SQL:
```sql
SELECT
    o.order_id,
    o.datetime,
    c.customer_name,
    c.email,
    o.amount,
    o.status
FROM orders o
INNER JOIN customers c ON o.customer_id = c.customer_id
WHERE o.dt = '20251222'  -- ⚡ Partition filter!
ORDER BY o.datetime DESC
LIMIT 100;
```
Performance:
- Join type: Correct (INNER JOIN)
- Join condition: Inferred correctly
- Partition filter: Applied ✅
- Execution time: 1.2 seconds
Success rate: 91% for 2-table joins, 84% for 3+ table joins
🎭 Part 5: Conversational Context
The Problem
Users ask follow-up questions:
User: "Show me sales by region"
Assistant: [shows results]
User: "What about last month?" ← How does LLM know this refers to sales?
The Solution: Context Injection
```python
def generate_sql_with_context(
    self,
    question: str,
    schema_info: str,
    conversation_history: List[Dict]
) -> Tuple[str, str]:
    """Generate SQL with conversation context"""
    # Build context summary
    if len(conversation_history) > 1:
        context_summary = "\n\nRecent conversation:\n"
        for msg in conversation_history[-5:]:  # Last 5 messages
            if msg['role'] == 'user':
                context_summary += f"User: {msg['content']}\n"
            elif msg['role'] == 'assistant':
                # Include previous SQL for reference
                if 'sql' in msg.get('metadata', {}):
                    context_summary += f"Generated SQL: {msg['metadata']['sql']}\n"
        enhanced_question = f"{question}\n\nContext: {context_summary}"
    else:
        enhanced_question = question

    return self.generate_sql(enhanced_question, schema_info)
```
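For reference, this is the message shape the method expects; the keys (`role`, `content`, `metadata['sql']`) follow the code above, while the object name and values in the call are made up for illustration:

```python
conversation_history = [
    {"role": "user", "content": "Show sales by region"},
    {
        "role": "assistant",
        "content": "Here are the results...",
        "metadata": {"sql": "SELECT region, SUM(amount) FROM orders WHERE dt >= '20251201' GROUP BY region"},
    },
    {"role": "user", "content": "What about last month?"},
]

# 'assistant' stands in for whatever object exposes these methods;
# schema_info is the string built in Part 3
sql, explanation = assistant.generate_sql_with_context(
    question="What about last month?",
    schema_info=schema_info,
    conversation_history=conversation_history,
)
```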
Example Flow
Turn 1:

User: "Show sales by region"

Generated SQL:

```sql
SELECT region, SUM(amount) as total_sales
FROM orders
WHERE dt >= '20251201' -- Current month
GROUP BY region
ORDER BY total_sales DESC;
```

Turn 2 (follow-up):

User: "What about last month?"

Context injected to LLM:

```
Recent conversation:
User: Show sales by region
Generated SQL: SELECT region, SUM(amount) ... WHERE dt >= '20251201' ...

Current question: What about last month?
```

Generated SQL (context-aware):

```sql
SELECT region, SUM(amount) as total_sales
FROM orders
WHERE dt >= '20251101' AND dt < '20251201' -- ⚡ Adjusted date range!
GROUP BY region
ORDER BY total_sales DESC;
```
Context-aware accuracy: 88% (vs 23% without context)
🔧 Part 6: Prompt Engineering Best Practices
What We Learned
1. Be Explicit About Formats
❌ Bad:
"Use the date column for filtering"
✅ Good:
"Use the 'dt' column (VARCHAR format YYYYMMDD, e.g., '20251223') for partition filtering"
2. Provide Examples
❌ Bad:
"Filter by date range"
✅ Good:
"Example: dt BETWEEN '20251201' AND '20251223'"
3. Use Structured Outputs
❌ Bad:
"Determine if this needs SQL"
✅ Good:
"Respond in this exact format:
INTENT: sql or chat
CONFIDENCE: 0.0 to 1.0"
4. Prioritize Critical Information
We use markers like:
`⚡ CRITICAL:`, `⚠️ NEVER:`, `✅ ALWAYS:`
This helps the LLM focus on important rules.
Links
- GitHub: github.com/osmanuygar/sqlatte
Did this help you? ⭐ Star the repo and share with your team!
Questions? Open an issue or reach out: @osmanuygar
Built with ☕ and lots of prompt iterations