DEV Community

ABU SUFIAN


Building an AI-Powered Natural Language SQL Interface: An MVP Journey

Have you ever wished you could just ask your database a question in plain English and get an answer? That's exactly what we built—an experimental AI-powered chat interface that translates natural language into safe, read-only SQL queries.

In this post, I'll walk you through how we built it, the security challenges we faced, and the performance optimizations that keep it fast and cheap to run.

The Problem We're Solving

Most business users and even developers struggle with SQL. They know what data they want, but crafting the right query is time-consuming. Traditional BI tools require learning complex interfaces, and even SQL experts waste time on repetitive queries.

Our solution: A conversational interface where users ask questions like:

  • "Monthly revenue last 12 months"
  • "Top reps by revenue"
  • "Campaign ROI: budget vs revenue"
  • "Unpaid commissions per rep"

The system handles the SQL generation, execution, and visualization automatically.

What It's Capable Of

Core Features

  • Natural Language Processing: Accepts text or voice input
  • Intelligent SQL Generation: Uses LLM (Groq/Llama 3.1) to generate queries
  • Read-Only Safety: Only executes SELECT/WITH queries
  • Rich Visualizations: Tables, line charts, and bar charts
  • Conversational Context: Maintains conversation history
  • Clarification Flow: Asks follow-up questions when needed
  • Server-Side Execution: All database access is server-side (LLM never touches DB)
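Concretely, the request body and the two response shapes the service can return look roughly like this in TypeScript (field names follow the `chatService` code shown later in the post; the exact `clarify` payload is an assumption):

```typescript
// Sketch of the /api/chat contract implied by the service code.
// The "clarify" fields are assumptions; the "answer" fields mirror the post.
type ChatBody = {
  message: string;
  history: { role: "user" | "assistant"; content: string }[];
};

type ChatResponse =
  | { ok: true; type: "clarify"; question: string } // assumed shape
  | {
      ok: true;
      type: "answer";
      sql: string;
      answer: string;
      insights: string[];
      followups: string[];
      visualization: { type: "table" | "line" | "bar" };
      result: unknown;
    };

const example: ChatBody = {
  message: "Monthly revenue last 12 months",
  history: [],
};
```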

What It Doesn't Do (By Design)

  • No write operations (INSERT/UPDATE/DELETE)
  • No authentication or multi-tenant support
  • No dashboard builder or saved reports
  • No attempt to replace full BI tools

This is intentionally minimal—built for fast data exploration, not enterprise analytics.

Architecture Overview

┌─────────────┐
│   Frontend  │  Next.js + React + Recharts
│  (Chat UI)  │
└──────┬──────┘
       │ HTTP POST /api/chat
       ▼
┌─────────────────────────────────────┐
│         Next.js API Route           │
│    /api/chat (route.ts)             │
└──────┬──────────────────────────────┘
       │
       ▼
┌─────────────────────────────────────┐
│         Chat Service Layer          │
│  - Schema introspection (cached)    │
│  - Schema shortlisting              │
│  - Prompt building                  │
│  - LLM call                         │
│  - JSON parsing & validation        │
└──────┬──────────────────────────────┘
       │
       ├──► SQL Guard (safety validation)
       │
       └──► PostgreSQL (read-only)

Key Components

Frontend (src/components/Chat/):

  • ChatShell.tsx - Main chat interface
  • MessageBubble.tsx - Message rendering with SQL preview
  • ResultChart.tsx - Recharts-based visualizations
  • VoiceButton.tsx - Web Speech API integration

Backend (src/Server/):

  • chat/service.ts - Core orchestration logic
  • chat/prompts.ts - LLM prompt engineering
  • db/introspect.ts - Schema introspection
  • db/schemaCache.ts - TTL-based caching
  • safety/sqlGuard.ts - SQL validation (247 lines of security)
  • llm/groq.ts - Groq API client

Security: The MVP Guardrails

This was our biggest challenge. How do you safely execute LLM-generated SQL without a full SQL parser?

Our Security Model

We built a defense-in-depth approach with multiple layers:

1. Statement Validation

// Only single statements allowed
const parts = sql.split(";").map((s) => s.trim()).filter(Boolean);
if (parts.length !== 1) {
  throw new Error("Only single-statement SQL is allowed.");
}

2. Query Type Enforcement

// Only SELECT/WITH queries
const normalized = sql.trim().toLowerCase();
if (!normalized.startsWith("select") && !normalized.startsWith("with")) {
  throw new Error("Only SELECT/WITH queries are allowed.");
}

3. Keyword Blocking

We block dangerous keywords using word-boundary regex:

const blockedKeywords = [
  "insert", "update", "delete", "drop", "alter",
  "truncate", "create", "grant", "revoke",
  "copy", "call", "execute", "prepare",
  "begin", "commit", "rollback", "set",
  // ... and more
];

for (const kw of blockedKeywords) {
  const re = new RegExp(`\\b${escapeRegExp(kw)}\\b`, "i");
  if (re.test(sql)) {
    throw new Error("Write/admin operations are not allowed.");
  }
}

4. Dangerous Function Blocking

We block Postgres functions that could be used for exfiltration:

const blockedFunctions = [
  "pg_read_file",
  "pg_read_binary_file",
  "pg_ls_dir",
  "dblink",
  "postgres_fdw",
  // ...
];

5. SELECT * Prevention

// Blocks accidental data dumps
// Note: no trailing \b after the asterisk — "*" is not a word character,
// so /\*\b/ would never match "select * from ..."
if (/\bselect\s+\*/i.test(sql)) {
  throw new Error("SELECT * is not allowed. Please select specific columns.");
}

6. PII Column Blocking (Optional)

const piiTokens = ["email", "phone", "mobile", "ssn", "passport", "address"];

for (const tok of piiTokens) {
  const re = new RegExp(`\\b${escapeRegExp(tok)}\\b`, "i");
  if (re.test(sql)) {
    throw new Error(`Query references sensitive field '${tok}'. Not allowed.`);
  }
}

7. LIMIT Enforcement

// Adds default LIMIT if missing, clamps if too high
const limitMatch = sql.match(/\blimit\s+(\d+)\b/i);

if (!limitMatch) {
  sql = `${sql} LIMIT ${defaultLimit}`;
} else {
  const requested = parseInt(limitMatch[1], 10);
  if (requested > maxLimit) {
    sql = sql.replace(/\blimit\s+\d+\b/i, `LIMIT ${maxLimit}`);
  }
}

8. Comment Stripping

We remove SQL comments to prevent bypass attempts:

// Remove -- line comments
sql = sql.replace(/--.*$/gm, "");
// Remove /* block comments */
sql = sql.replace(/\/\*[\s\S]*?\*\//g, "");
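Putting the layers together: here is a compact, self-contained sketch of how these checks compose (modeled on the post's `sqlGuard.ts` but simplified — the real file is 247 lines and blocks many more keywords and functions):

```typescript
// Simplified composition of the guard layers above (not the full sqlGuard.ts).
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function assertSafeReadOnlySQL(input: string, defaultLimit = 100, maxLimit = 1000): string {
  // Strip comments first so they can't hide blocked keywords
  let sql = input
    .replace(/--.*$/gm, "")
    .replace(/\/\*[\s\S]*?\*\//g, "")
    .trim();

  // Single statement only
  const parts = sql.split(";").map((s) => s.trim()).filter(Boolean);
  if (parts.length !== 1) throw new Error("Only single-statement SQL is allowed.");
  sql = parts[0];

  // SELECT/WITH only
  const lower = sql.toLowerCase();
  if (!lower.startsWith("select") && !lower.startsWith("with")) {
    throw new Error("Only SELECT/WITH queries are allowed.");
  }

  // Block write/admin keywords (subset — the real list is longer)
  const blocked = ["insert", "update", "delete", "drop", "alter", "truncate", "create", "grant", "revoke"];
  for (const kw of blocked) {
    if (new RegExp(`\\b${escapeRegExp(kw)}\\b`, "i").test(sql)) {
      throw new Error("Write/admin operations are not allowed.");
    }
  }

  // Block SELECT *
  if (/\bselect\s+\*/i.test(sql)) {
    throw new Error("SELECT * is not allowed. Please select specific columns.");
  }

  // Enforce a LIMIT: add a default, clamp anything above the maximum
  const limitMatch = sql.match(/\blimit\s+(\d+)\b/i);
  if (!limitMatch) {
    sql = `${sql} LIMIT ${defaultLimit}`;
  } else if (parseInt(limitMatch[1], 10) > maxLimit) {
    sql = sql.replace(/\blimit\s+\d+\b/i, `LIMIT ${maxLimit}`);
  }
  return sql;
}
```

The ordering matters: comments are stripped before the keyword checks so keywords hidden behind comment syntax can't slip through, and the LIMIT rewrite runs last, on the already-validated statement.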

Security Limitations

⚠️ Important: This is MVP-level safety. For production systems, you should:

  • Use a proper SQL parser (like node-sql-parser or pg-query-parser)
  • Implement allow-listing of tables/columns
  • Add query timeout enforcement
  • Use read-only database users
  • Implement rate limiting
  • Add audit logging

Our approach is best-effort accident prevention, not a security guarantee.

Performance Optimizations

LLM calls are expensive (time and money). Here's how we optimized:

1. Schema Caching

Schema introspection runs once and caches for 10 minutes:

const CACHE = new Map<string, Cached<any>>();
const KEY = "schema"; // single-entry cache keyed by a constant

export async function getSchemaSummaryCached(ttlMs = 10 * 60 * 1000) {
  const hit = CACHE.get(KEY);
  if (hit && hit.expiresAt > Date.now()) return hit.value;

  const schema = await introspectSchema();
  CACHE.set(KEY, { value: schema, expiresAt: Date.now() + ttlMs });
  return schema;
}

2. Schema Shortlisting

Instead of sending the entire schema, we shortlist only relevant tables:

function shortlistSchema(schemaSummary: any, question: string, maxTables = 8) {
  const q = question.toLowerCase();
  const tokens = q.split(/[^a-z0-9_]+/).filter(w => w.length >= 3);

  // Score tables by relevance
  const scored = schemaSummary.tables.map((t: any) => {
    let score = 0;
    for (const tok of tokens) {
      if (t.name.toLowerCase().includes(tok)) score += 5;
      for (const col of t.columns) {
        if (col.name.toLowerCase().includes(tok)) score += 1;
      }
    }
    return { t, score };
  });

  // Return top N tables
  return scored
    .sort((a, b) => b.score - a.score)
    .slice(0, maxTables)
    .map(x => x.t);
}

3. Compact Schema Format

We convert schema to a compact text format instead of JSON:

function schemaToText(tables: any[]) {
  return tables.map((t: any) => {
    const cols = t.columns
      .slice(0, 50)
      .map((c: any) => `${c.name}:${c.type}`)
      .join(", ");
    return `${t.name}(${cols})`;
  }).join("\n");
}

Before (JSON): ~5000 tokens

After (text): ~500 tokens

Savings: 90% reduction
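To make the difference concrete, here is the compact format on a toy table (a self-contained restatement of `schemaToText` with explicit types):

```typescript
// Compact "name(col:type, ...)" schema format — one line per table.
type Column = { name: string; type: string };
type Table = { name: string; columns: Column[] };

function schemaToText(tables: Table[]): string {
  return tables
    .map((t) => `${t.name}(${t.columns.slice(0, 50).map((c) => `${c.name}:${c.type}`).join(", ")})`)
    .join("\n");
}

const text = schemaToText([
  { name: "orders", columns: [{ name: "id", type: "int" }, { name: "revenue", type: "numeric" }] },
]);
// → "orders(id:int, revenue:numeric)"
```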

4. Bounded History

Only the last 6 conversation turns are included in the prompt:

const historyForPrompt = (history || []).slice(-6);

5. Connection Pooling

We use PostgreSQL connection pooling with read-only transactions:

// Acquire a pooled connection and run the query in a read-only transaction
const pool = new Pool({ connectionString });

const client = await pool.connect();
try {
  await client.query("BEGIN TRANSACTION READ ONLY");
  const result = await client.query(sql);
  await client.query("COMMIT");
  return result;
} catch (err) {
  await client.query("ROLLBACK");
  throw err;
} finally {
  client.release();
}

Technology Stack

  • Frontend: Next.js 16 (App Router), React 19, TypeScript
  • Backend: Next.js API Routes, Node.js
  • Database: PostgreSQL (pg library)
  • LLM: Groq SDK (Llama 3.1 8B Instant)
  • Visualization: Recharts
  • Validation: Zod schemas

Code Example: The Full Flow

Here's how a request flows through the system:

// 1. API Route receives request
export async function POST(req: Request) {
  const body = BodySchema.parse(await req.json());
  const out = await chatService(body);
  return NextResponse.json(out);
}

// 2. Service orchestrates the flow
export async function chatService(body: ChatBody) {
  // Get cached schema
  const schemaSummary = await getSchemaSummaryCached();

  // Shortlist relevant tables
  const shortlisted = shortlistSchema(schemaSummary, body.message, 8);
  const schemaText = schemaToText(shortlisted);

  // Build prompts
  const sys = systemPrompt();
  const prompt = userPrompt({ 
    schemaText, 
    resolved: { time_range: "all_time" },
    history: (body.history ?? []).slice(-6),
    message: body.message 
  });

  // Call LLM
  let raw = await llmCall([
    { role: "system", content: sys },
    { role: "user", content: prompt }
  ]);

  // Parse and validate JSON
  let out = ModelOutSchema.parse(parseJsonRobust(raw));

  // If clarification needed, return early
  if (out.type === "clarify") {
    return { ok: true, ...out };
  }

  // Validate SQL safety
  const safeSql = assertSafeReadOnlySQL(out.sql);

  // Execute query
  const result = await runQuery(safeSql);

  // Return formatted response
  return {
    ok: true,
    type: "answer",
    sql: safeSql,
    answer: out.answer,
    insights: out.insights ?? [],
    followups: out.followups ?? [],
    visualization: out.visualization ?? { type: "table" },
    result,
  };
}

Lessons Learned

1. LLMs Are Not Deterministic

We had to build a "repair" mechanism for when the LLM returns invalid JSON:

try {
  out = ModelOutSchema.parse(parseJsonRobust(raw));
} catch {
  // Retry with repair prompt
  const repaired = await llmCall([
    { role: "system", content: sys },
    { role: "user", content: repairPrompt(raw) }
  ]);
  out = ModelOutSchema.parse(parseJsonRobust(repaired));
}
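`parseJsonRobust` itself isn't shown in the post; a minimal version might tolerate prose or code fences around the model's JSON by extracting the outermost object (a hypothetical sketch — the real implementation may differ):

```typescript
// Hypothetical sketch of parseJsonRobust: try a direct parse, then fall back
// to slicing out the outermost {...} span from the raw model output.
function parseJsonRobust(raw: string): unknown {
  try {
    return JSON.parse(raw);
  } catch {
    const start = raw.indexOf("{");
    const end = raw.lastIndexOf("}");
    if (start === -1 || end <= start) {
      throw new Error("No JSON object found in model output.");
    }
    return JSON.parse(raw.slice(start, end + 1));
  }
}
```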

2. Token Costs Add Up Fast

Our optimizations reduced token usage by ~90%, which translates to:

  • Faster responses (less data to process)
  • Lower costs (fewer tokens = lower API bills)
  • Better reliability (shorter prompts = more consistent outputs)

3. Security Requires Multiple Layers

A single validation layer isn't enough. We use:

  • Keyword blocking
  • Function blocking
  • PII detection
  • LIMIT enforcement
  • Comment stripping
  • Statement validation

4. Schema Shortlisting Is Critical

Sending the entire schema (100+ tables) results in:

  • Higher token costs
  • Slower responses
  • More hallucination (LLM gets confused)

Shortlisting to 8 relevant tables dramatically improves quality.

Limitations & Future Work

Current Limitations

  • MVP-level security (not production-grade)
  • No authentication
  • No query history/saved reports
  • Limited to PostgreSQL
  • No support for complex joins across many tables

Potential Improvements

  • [ ] SQL parser-based validation (proper AST parsing)
  • [ ] Allow-listing system (configurable table/column access)
  • [ ] Query timeout enforcement
  • [ ] Rate limiting per user
  • [ ] Audit logging
  • [ ] Multi-database support
  • [ ] Query explanation (why this SQL was generated)

Conclusion

Building an AI-powered SQL interface is challenging but rewarding. The key is balancing:

  • Safety: Multiple validation layers
  • Performance: Aggressive caching and token optimization
  • User Experience: Fast, conversational, helpful

This project demonstrates that with the right guardrails, LLMs can safely generate and execute SQL queries. It's not production-ready for sensitive data, but it's a solid foundation for internal tools and data exploration.

Want to try it? Check out the GitHub repository


What do you think? Have you built similar systems? What security measures did you implement? Let me know in the comments below!
