DEV Community

ABU SUFIAN


Building an AI-Powered Natural Language SQL Interface: An MVP Journey

Have you ever wished you could just ask your database a question in plain English and get an answer? That's exactly what we built—an experimental AI-powered chat interface that translates natural language into safe, read-only SQL queries.

In this post, I'll walk you through how we built it, the security challenges we faced, and the performance optimizations that keep it fast and cheap to run.

The Problem We're Solving

Most business users and even developers struggle with SQL. They know what data they want, but crafting the right query is time-consuming. Traditional BI tools require learning complex interfaces, and even SQL experts waste time on repetitive queries.

Our solution: A conversational interface where users ask questions like:

  • "Monthly revenue last 12 months"
  • "Top reps by revenue"
  • "Campaign ROI: budget vs revenue"
  • "Unpaid commissions per rep"

The system handles the SQL generation, execution, and visualization automatically.

What It's Capable Of

Core Features

  • Natural Language Processing: Accepts text or voice input
  • Intelligent SQL Generation: Uses LLM (Groq/Llama 3.1) to generate queries
  • Read-Only Safety: Only executes SELECT/WITH queries
  • Rich Visualizations: Tables, line charts, and bar charts
  • Conversational Context: Maintains conversation history
  • Clarification Flow: Asks follow-up questions when needed
  • Server-Side Execution: All database access is server-side (LLM never touches DB)
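Concretely, the request body and the two response shapes the service can return look roughly like this in TypeScript (field names follow the `chatService` code shown later in the post; the exact `clarify` payload is an assumption):

```typescript
// Sketch of the /api/chat contract implied by the service code.
// The "clarify" fields are assumptions; the "answer" fields mirror the post.
type ChatBody = {
  message: string;
  history: { role: "user" | "assistant"; content: string }[];
};

type ChatResponse =
  | { ok: true; type: "clarify"; question: string } // assumed shape
  | {
      ok: true;
      type: "answer";
      sql: string;
      answer: string;
      insights: string[];
      followups: string[];
      visualization: { type: "table" | "line" | "bar" };
      result: unknown;
    };

const example: ChatBody = {
  message: "Monthly revenue last 12 months",
  history: [],
};
```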

What It Doesn't Do (By Design)

  • No write operations (INSERT/UPDATE/DELETE)
  • No authentication or multi-tenant support
  • No dashboard builder or saved reports
  • No attempt to replace full BI tools

This is intentionally minimal—built for fast data exploration, not enterprise analytics.

Architecture Overview

┌─────────────┐
│   Frontend  │  Next.js + React + Recharts
│  (Chat UI)  │
└──────┬──────┘
       │ HTTP POST /api/chat
       ▼
┌─────────────────────────────────────┐
│         Next.js API Route           │
│    /api/chat (route.ts)             │
└──────┬──────────────────────────────┘
       │
       ▼
┌─────────────────────────────────────┐
│         Chat Service Layer          │
│  - Schema introspection (cached)    │
│  - Schema shortlisting              │
│  - Prompt building                  │
│  - LLM call                         │
│  - JSON parsing & validation        │
└──────┬──────────────────────────────┘
       │
       ├──► SQL Guard (safety validation)
       │
       └──► PostgreSQL (read-only)

Key Components

Frontend (src/components/Chat/):

  • ChatShell.tsx - Main chat interface
  • MessageBubble.tsx - Message rendering with SQL preview
  • ResultChart.tsx - Recharts-based visualizations
  • VoiceButton.tsx - Web Speech API integration

Backend (src/Server/):

  • chat/service.ts - Core orchestration logic
  • chat/prompts.ts - LLM prompt engineering
  • db/introspect.ts - Schema introspection
  • db/schemaCache.ts - TTL-based caching
  • safety/sqlGuard.ts - SQL validation (247 lines of security)
  • llm/groq.ts - Groq API client

Security: The MVP Guardrails

This was our biggest challenge. How do you safely execute LLM-generated SQL without a full SQL parser?

Our Security Model

We built a defense-in-depth approach with multiple layers:

1. Statement Validation

// Only single statements allowed
const parts = sql.split(";").map((s) => s.trim()).filter(Boolean);
if (parts.length !== 1) {
  throw new Error("Only single-statement SQL is allowed.");
}

2. Query Type Enforcement

// Only SELECT/WITH queries
const normalized = sql.trim().toLowerCase();
if (!normalized.startsWith("select") && !normalized.startsWith("with")) {
  throw new Error("Only SELECT/WITH queries are allowed.");
}

3. Keyword Blocking

We block dangerous keywords using word-boundary regex:

const blockedKeywords = [
  "insert", "update", "delete", "drop", "alter",
  "truncate", "create", "grant", "revoke",
  "copy", "call", "execute", "prepare",
  "begin", "commit", "rollback", "set",
  // ... and more
];

for (const kw of blockedKeywords) {
  const re = new RegExp(`\\b${escapeRegExp(kw)}\\b`, "i");
  if (re.test(sql)) {
    throw new Error("Write/admin operations are not allowed.");
  }
}

4. Dangerous Function Blocking

We block Postgres functions that could be used for exfiltration:

const blockedFunctions = [
  "pg_read_file",
  "pg_read_binary_file",
  "pg_ls_dir",
  "dblink",
  "postgres_fdw",
  // ...
];

5. SELECT * Prevention

// Blocks accidental data dumps
// Note: no trailing \b after the asterisk — "*" is not a word character,
// so /\*\b/ would never match "select * from ..."
if (/\bselect\s+\*/i.test(sql)) {
  throw new Error("SELECT * is not allowed. Please select specific columns.");
}

6. PII Column Blocking (Optional)

const piiTokens = ["email", "phone", "mobile", "ssn", "passport", "address"];

for (const tok of piiTokens) {
  const re = new RegExp(`\\b${escapeRegExp(tok)}\\b`, "i");
  if (re.test(sql)) {
    throw new Error(`Query references sensitive field '${tok}'. Not allowed.`);
  }
}

7. LIMIT Enforcement

// Adds default LIMIT if missing, clamps if too high
const limitMatch = sql.match(/\blimit\s+(\d+)\b/i);

if (!limitMatch) {
  sql = `${sql} LIMIT ${defaultLimit}`;
} else {
  const requested = parseInt(limitMatch[1], 10);
  if (requested > maxLimit) {
    sql = sql.replace(/\blimit\s+\d+\b/i, `LIMIT ${maxLimit}`);
  }
}

8. Comment Stripping

We remove SQL comments to prevent bypass attempts:

// Remove -- line comments
sql = sql.replace(/--.*$/gm, "");
// Remove /* block comments */
sql = sql.replace(/\/\*[\s\S]*?\*\//g, "");
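Putting the layers together: here is a compact, self-contained sketch of how these checks compose (modeled on the post's `sqlGuard.ts` but simplified — the real file is 247 lines and blocks many more keywords and functions):

```typescript
// Simplified composition of the guard layers above (not the full sqlGuard.ts).
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function assertSafeReadOnlySQL(input: string, defaultLimit = 100, maxLimit = 1000): string {
  // Strip comments first so they can't hide blocked keywords
  let sql = input
    .replace(/--.*$/gm, "")
    .replace(/\/\*[\s\S]*?\*\//g, "")
    .trim();

  // Single statement only
  const parts = sql.split(";").map((s) => s.trim()).filter(Boolean);
  if (parts.length !== 1) throw new Error("Only single-statement SQL is allowed.");
  sql = parts[0];

  // SELECT/WITH only
  const lower = sql.toLowerCase();
  if (!lower.startsWith("select") && !lower.startsWith("with")) {
    throw new Error("Only SELECT/WITH queries are allowed.");
  }

  // Block write/admin keywords (subset — the real list is longer)
  const blocked = ["insert", "update", "delete", "drop", "alter", "truncate", "create", "grant", "revoke"];
  for (const kw of blocked) {
    if (new RegExp(`\\b${escapeRegExp(kw)}\\b`, "i").test(sql)) {
      throw new Error("Write/admin operations are not allowed.");
    }
  }

  // Block SELECT *
  if (/\bselect\s+\*/i.test(sql)) {
    throw new Error("SELECT * is not allowed. Please select specific columns.");
  }

  // Enforce a LIMIT: add a default, clamp anything above the maximum
  const limitMatch = sql.match(/\blimit\s+(\d+)\b/i);
  if (!limitMatch) {
    sql = `${sql} LIMIT ${defaultLimit}`;
  } else if (parseInt(limitMatch[1], 10) > maxLimit) {
    sql = sql.replace(/\blimit\s+\d+\b/i, `LIMIT ${maxLimit}`);
  }
  return sql;
}
```

The ordering matters: comments are stripped before the keyword checks so keywords hidden behind comment syntax can't slip through, and the LIMIT rewrite runs last, on the already-validated statement.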

Security Limitations

⚠️ Important: This is MVP-level safety. For production systems, you should:

  • Use a proper SQL parser (like node-sql-parser or pg-query-parser)
  • Implement allow-listing of tables/columns
  • Add query timeout enforcement
  • Use read-only database users
  • Implement rate limiting
  • Add audit logging

Our approach is best-effort accident prevention, not a security guarantee.

Performance Optimizations

LLM calls are expensive (time and money). Here's how we optimized:

1. Schema Caching

Schema introspection runs once and caches for 10 minutes:

const CACHE = new Map<string, Cached<any>>();
const KEY = "schema"; // single-entry cache keyed by a constant

export async function getSchemaSummaryCached(ttlMs = 10 * 60 * 1000) {
  const hit = CACHE.get(KEY);
  if (hit && hit.expiresAt > Date.now()) return hit.value;

  const schema = await introspectSchema();
  CACHE.set(KEY, { value: schema, expiresAt: Date.now() + ttlMs });
  return schema;
}

2. Schema Shortlisting

Instead of sending the entire schema, we shortlist only relevant tables:

function shortlistSchema(schemaSummary: any, question: string, maxTables = 8) {
  const q = question.toLowerCase();
  const tokens = q.split(/[^a-z0-9_]+/).filter(w => w.length >= 3);

  // Score tables by relevance
  const scored = schemaSummary.tables.map((t: any) => {
    let score = 0;
    for (const tok of tokens) {
      if (t.name.toLowerCase().includes(tok)) score += 5;
      for (const col of t.columns) {
        if (col.name.toLowerCase().includes(tok)) score += 1;
      }
    }
    return { t, score };
  });

  // Return top N tables
  return scored
    .sort((a, b) => b.score - a.score)
    .slice(0, maxTables)
    .map(x => x.t);
}

3. Compact Schema Format

We convert schema to a compact text format instead of JSON:

function schemaToText(tables: any[]) {
  return tables.map((t: any) => {
    const cols = t.columns
      .slice(0, 50)
      .map((c: any) => `${c.name}:${c.type}`)
      .join(", ");
    return `${t.name}(${cols})`;
  }).join("\n");
}

Before (JSON): ~5000 tokens

After (text): ~500 tokens

Savings: 90% reduction
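To make the difference concrete, here is the compact format on a toy table (a self-contained restatement of `schemaToText` with explicit types):

```typescript
// Compact "name(col:type, ...)" schema format — one line per table.
type Column = { name: string; type: string };
type Table = { name: string; columns: Column[] };

function schemaToText(tables: Table[]): string {
  return tables
    .map((t) => `${t.name}(${t.columns.slice(0, 50).map((c) => `${c.name}:${c.type}`).join(", ")})`)
    .join("\n");
}

const text = schemaToText([
  { name: "orders", columns: [{ name: "id", type: "int" }, { name: "revenue", type: "numeric" }] },
]);
// → "orders(id:int, revenue:numeric)"
```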

4. Bounded History

Only the last 6 conversation turns are included in the prompt:

const historyForPrompt = (history || []).slice(-6);

5. Connection Pooling

We use PostgreSQL connection pooling with read-only transactions:

// Acquire a pooled connection and run the query in a read-only transaction
const pool = new Pool({ connectionString });

const client = await pool.connect();
try {
  await client.query("BEGIN TRANSACTION READ ONLY");
  const result = await client.query(sql);
  await client.query("COMMIT");
  return result;
} catch (err) {
  await client.query("ROLLBACK");
  throw err;
} finally {
  client.release();
}

Technology Stack

  • Frontend: Next.js 16 (App Router), React 19, TypeScript
  • Backend: Next.js API Routes, Node.js
  • Database: PostgreSQL (pg library)
  • LLM: Groq SDK (Llama 3.1 8B Instant)
  • Visualization: Recharts
  • Validation: Zod schemas

Code Example: The Full Flow

Here's how a request flows through the system:

// 1. API Route receives request
export async function POST(req: Request) {
  const body = BodySchema.parse(await req.json());
  const out = await chatService(body);
  return NextResponse.json(out);
}

// 2. Service orchestrates the flow
export async function chatService(body: ChatBody) {
  // Get cached schema
  const schemaSummary = await getSchemaSummaryCached();

  // Shortlist relevant tables
  const shortlisted = shortlistSchema(schemaSummary, body.message, 8);
  const schemaText = schemaToText(shortlisted);

  // Build prompts
  const sys = systemPrompt();
  const prompt = userPrompt({ 
    schemaText, 
    resolved: { time_range: "all_time" },
    history: (body.history ?? []).slice(-6),
    message: body.message 
  });

  // Call LLM
  let raw = await llmCall([
    { role: "system", content: sys },
    { role: "user", content: prompt }
  ]);

  // Parse and validate JSON
  let out = ModelOutSchema.parse(parseJsonRobust(raw));

  // If clarification needed, return early
  if (out.type === "clarify") {
    return { ok: true, ...out };
  }

  // Validate SQL safety
  const safeSql = assertSafeReadOnlySQL(out.sql);

  // Execute query
  const result = await runQuery(safeSql);

  // Return formatted response
  return {
    ok: true,
    type: "answer",
    sql: safeSql,
    answer: out.answer,
    insights: out.insights ?? [],
    followups: out.followups ?? [],
    visualization: out.visualization ?? { type: "table" },
    result,
  };
}

Lessons Learned

1. LLMs Are Not Deterministic

We had to build a "repair" mechanism for when the LLM returns invalid JSON:

try {
  out = ModelOutSchema.parse(parseJsonRobust(raw));
} catch {
  // Retry with repair prompt
  const repaired = await llmCall([
    { role: "system", content: sys },
    { role: "user", content: repairPrompt(raw) }
  ]);
  out = ModelOutSchema.parse(parseJsonRobust(repaired));
}
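`parseJsonRobust` itself isn't shown in the post; a minimal version might tolerate prose or code fences around the model's JSON by extracting the outermost object (a hypothetical sketch — the real implementation may differ):

```typescript
// Hypothetical sketch of parseJsonRobust: try a direct parse, then fall back
// to slicing out the outermost {...} span from the raw model output.
function parseJsonRobust(raw: string): unknown {
  try {
    return JSON.parse(raw);
  } catch {
    const start = raw.indexOf("{");
    const end = raw.lastIndexOf("}");
    if (start === -1 || end <= start) {
      throw new Error("No JSON object found in model output.");
    }
    return JSON.parse(raw.slice(start, end + 1));
  }
}
```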

2. Token Costs Add Up Fast

Our optimizations reduced token usage by ~90%, which translates to:

  • Faster responses (less data to process)
  • Lower costs (fewer tokens = lower API bills)
  • Better reliability (shorter prompts = more consistent outputs)

3. Security Requires Multiple Layers

A single validation layer isn't enough. We use:

  • Keyword blocking
  • Function blocking
  • PII detection
  • LIMIT enforcement
  • Comment stripping
  • Statement validation

4. Schema Shortlisting Is Critical

Sending the entire schema (100+ tables) results in:

  • Higher token costs
  • Slower responses
  • More hallucination (LLM gets confused)

Shortlisting to 8 relevant tables dramatically improves quality.

Limitations & Future Work

Current Limitations

  • MVP-level security (not production-grade)
  • No authentication
  • No query history/saved reports
  • Limited to PostgreSQL
  • No support for complex joins across many tables

Potential Improvements

  • [ ] SQL parser-based validation (proper AST parsing)
  • [ ] Allow-listing system (configurable table/column access)
  • [ ] Query timeout enforcement
  • [ ] Rate limiting per user
  • [ ] Audit logging
  • [ ] Multi-database support
  • [ ] Query explanation (why this SQL was generated)

Conclusion

Building an AI-powered SQL interface is challenging but rewarding. The key is balancing:

  • Safety: Multiple validation layers
  • Performance: Aggressive caching and token optimization
  • User Experience: Fast, conversational, helpful

This project demonstrates that with the right guardrails, LLMs can safely generate and execute SQL queries. It's not production-ready for sensitive data, but it's a solid foundation for internal tools and data exploration.

Want to try it? Check out the GitHub repository


What do you think? Have you built similar systems? What security measures did you implement? Let me know in the comments below!
