DEV Community

Cover image for SQL + AI: Real-World Database Solutions You Can Use Today
Andre Carbajal
Andre Carbajal

Posted on

SQL + AI: Real-World Database Solutions You Can Use Today

📦 All code examples in this article are available on GitHub:
github.com/andre-carbajal/sql-ai-database-solutions


Databases and AI used to live in separate worlds. Your PostgreSQL instance handled structured queries, and your ML pipeline ran somewhere else entirely. That separation is rapidly disappearing.

In 2025, your SQL database can store embeddings, answer natural language questions, power semantic search, and serve as the memory layer for autonomous AI agents — all without adding a separate vector database to your stack.

This article walks through four real-world patterns with working code:

  1. pgvector — storing and querying embeddings in PostgreSQL
  2. RAG pipelines — augmenting LLMs with your own database content
  3. NL2SQL / Text-to-SQL — letting users query your database in plain English
  4. AI Agents over SQL — autonomous agents that reason and act on your data

Prerequisites

pip install psycopg2-binary pgvector langchain langchain-openai openai sqlalchemy python-dotenv
Enter fullscreen mode Exit fullscreen mode
# Docker: run PostgreSQL with pgvector support
docker run --name pgvector-demo \
  -e POSTGRES_PASSWORD=secret \
  -p 5432:5432 \
  -d pgvector/pgvector:pg16
Enter fullscreen mode Exit fullscreen mode

Set your environment variables:

# .env
OPENAI_API_KEY=sk-...
DATABASE_URL=postgresql://postgres:secret@localhost:5432/aidb
Enter fullscreen mode Exit fullscreen mode

1. pgvector — Semantic Search Inside PostgreSQL

pgvector is a PostgreSQL extension that adds a native vector data type, allowing you to store high-dimensional embeddings and run similarity searches with standard SQL.

Why this matters

Instead of maintaining a separate vector database (Pinecone, Weaviate, Chroma), you keep your vectors beside your relational data. That means you can combine semantic search with SQL filters in a single query.

Setup

-- Enable the extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Create a products table with an embedding column
CREATE TABLE products (
    id         SERIAL PRIMARY KEY,
    name       TEXT NOT NULL,
    description TEXT,
    category   TEXT,
    price      DECIMAL(10, 2),
    embedding  VECTOR(1536)   -- OpenAI text-embedding-3-small dimensions
);

-- Create an HNSW index for fast approximate nearest-neighbor search
CREATE INDEX ON products
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
Enter fullscreen mode Exit fullscreen mode

Inserting embeddings with Python

import os
import psycopg2
import numpy as np
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
conn = psycopg2.connect(os.environ["DATABASE_URL"])
cur = conn.cursor()

def get_embedding(text: str) -> list[float]:
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return response.data[0].embedding

products = [
    ("Wireless Noise-Cancelling Headphones", "Premium audio with 30hr battery", "electronics", 299.99),
    ("Ergonomic Office Chair",              "Lumbar support, adjustable arms",  "furniture",   449.00),
    ("Python Programming Book",             "Comprehensive guide for beginners", "books",        39.99),
    ("Mechanical Keyboard",                 "Cherry MX switches, RGB backlit",   "electronics", 129.00),
    ("Standing Desk",                       "Electric height-adjustable desk",   "furniture",   599.00),
]

for name, description, category, price in products:
    embedding = get_embedding(f"{name}: {description}")
    cur.execute(
        """
        INSERT INTO products (name, description, category, price, embedding)
        VALUES (%s, %s, %s, %s, %s)
        """,
        (name, description, category, price, embedding)
    )

conn.commit()
print("Products inserted with embeddings.")
Enter fullscreen mode Exit fullscreen mode

Querying: hybrid semantic + structured SQL

def semantic_search(query: str, max_price: float = None, category: str = None, limit: int = 5):
    query_embedding = get_embedding(query)

    sql = """
        SELECT
            name,
            description,
            category,
            price,
            1 - (embedding <=> %s::vector) AS similarity
        FROM products
        WHERE 1=1
    """
    params = [query_embedding]

    if max_price:
        sql += " AND price <= %s"
        params.append(max_price)

    if category:
        sql += " AND category = %s"
        params.append(category)

    sql += " ORDER BY embedding <=> %s::vector LIMIT %s"
    params += [query_embedding, limit]

    cur.execute(sql, params)
    return cur.fetchall()

# "I want something for my home office under $500"
results = semantic_search(
    query="comfortable workspace setup for long work sessions",
    max_price=500.0
)

for name, description, category, price, similarity in results:
    print(f"[{similarity:.3f}] {name} (${price}) — {description}")
Enter fullscreen mode Exit fullscreen mode

Output:

[0.891] Ergonomic Office Chair ($449.00) — Lumbar support, adjustable arms
[0.832] Mechanical Keyboard ($129.00) — Cherry MX switches, RGB backlit
[0.801] Standing Desk ($599.00) — Electric height-adjustable desk
Enter fullscreen mode Exit fullscreen mode

The <=> operator computes cosine distance. pgvector also supports <-> (L2/Euclidean) and <#> (inner product).


2. RAG Pipeline — Your Database as an LLM Knowledge Base

Retrieval-Augmented Generation (RAG) lets you inject your own data into an LLM's context at query time. Instead of fine-tuning, you retrieve the most relevant chunks from your database and pass them as context.

The architecture

User Question
     │
     ▼
Embed the Question
     │
     ▼
Vector Search in PostgreSQL  ──▶  Top-K relevant chunks
     │
     ▼
Build LLM Prompt with context
     │
     ▼
LLM generates answer grounded in your data
Enter fullscreen mode Exit fullscreen mode

Full RAG pipeline with LangChain + pgvector

from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import PGVector
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains import RetrievalQA
from langchain_community.document_loaders import TextLoader

# 1. Load and chunk your documents
loader = TextLoader("company_docs.txt")
documents = loader.load()

splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=50
)
chunks = splitter.split_documents(documents)

# 2. Store chunks + embeddings in PostgreSQL
CONNECTION_STRING = os.environ["DATABASE_URL"]
COLLECTION_NAME = "company_knowledge_base"

vector_store = PGVector.from_documents(
    documents=chunks,
    embedding=OpenAIEmbeddings(model="text-embedding-3-small"),
    collection_name=COLLECTION_NAME,
    connection_string=CONNECTION_STRING,
)

# 3. Build a retrieval-augmented QA chain
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
retriever = vector_store.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 4}
)

qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True
)

# 4. Ask questions
result = qa_chain.invoke({"query": "What is our refund policy?"})
print(result["result"])
print("\nSources:")
for doc in result["source_documents"]:
    print(f"  - {doc.metadata.get('source', 'unknown')}")
Enter fullscreen mode Exit fullscreen mode

Adding metadata filters for multi-tenant RAG

# Store documents with tenant metadata
vector_store = PGVector.from_documents(
    documents=chunks,
    embedding=OpenAIEmbeddings(),
    collection_name="docs",
    connection_string=CONNECTION_STRING,
    # Each chunk carries metadata you can filter on
)

# Retrieve only documents for a specific tenant
retriever = vector_store.as_retriever(
    search_kwargs={
        "k": 5,
        "filter": {"tenant_id": "acme-corp"}
    }
)
Enter fullscreen mode Exit fullscreen mode

3. NL2SQL — Natural Language to SQL Queries

NL2SQL (also called Text-to-SQL) lets users query your database in plain English. An LLM reads your schema and translates a natural language question into a valid SQL query.

Naive approach: direct schema + prompt

from openai import OpenAI

client = OpenAI()

SCHEMA = """
Tables:
  orders(id, customer_id, product_id, quantity, total_price, created_at, status)
  customers(id, name, email, country, created_at)
  products(id, name, category, price, stock)
"""

def nl_to_sql(question: str) -> str:
    prompt = f"""You are a SQL expert. Given the following database schema, write a PostgreSQL query to answer the user's question.
Return ONLY the SQL query, no explanation.

Schema:
{SCHEMA}

Question: {question}

SQL:"""

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=0
    )
    return response.choices[0].message.content.strip()


# Real-world examples
questions = [
    "What are the top 5 countries by total revenue this year?",
    "Which products have never been ordered?",
    "Show me customers who placed more than 3 orders in the last 30 days",
]

for q in questions:
    print(f"Q: {q}")
    print(f"SQL: {nl_to_sql(q)}")
    print()
Enter fullscreen mode Exit fullscreen mode

Output:

-- "What are the top 5 countries by total revenue this year?"
SELECT c.country, SUM(o.total_price) AS total_revenue
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE EXTRACT(YEAR FROM o.created_at) = EXTRACT(YEAR FROM CURRENT_DATE)
GROUP BY c.country
ORDER BY total_revenue DESC
LIMIT 5;

-- "Which products have never been ordered?"
SELECT p.id, p.name, p.category
FROM products p
LEFT JOIN orders o ON p.id = o.product_id
WHERE o.id IS NULL;
Enter fullscreen mode Exit fullscreen mode

Production-grade NL2SQL with validation

Never run LLM-generated SQL without validation. Here's a safer pattern:

import sqlparse
import re

ALLOWED_STATEMENTS = {"SELECT"}

def is_safe_sql(sql: str) -> bool:
    """Reject anything that isn't a SELECT statement."""
    parsed = sqlparse.parse(sql)
    if not parsed:
        return False
    statement_type = parsed[0].get_type()
    return statement_type in ALLOWED_STATEMENTS

def execute_nl_query(question: str, conn) -> list[dict]:
    sql = nl_to_sql(question)

    # Strip markdown fences if LLM added them
    sql = re.sub(r"```

sql|

```", "", sql).strip()

    if not is_safe_sql(sql):
        raise ValueError(f"Unsafe SQL rejected: {sql}")

    cur = conn.cursor()
    cur.execute(sql)
    columns = [desc[0] for desc in cur.description]
    rows = cur.fetchall()
    return [dict(zip(columns, row)) for row in rows]
Enter fullscreen mode Exit fullscreen mode

Few-shot prompting for better accuracy

Adding examples dramatically improves query accuracy on complex schemas:

FEW_SHOT_EXAMPLES = """
Example 1:
Question: How many orders were placed last month?
SQL: SELECT COUNT(*) FROM orders WHERE created_at >= DATE_TRUNC('month', NOW() - INTERVAL '1 month') AND created_at < DATE_TRUNC('month', NOW());

Example 2:
Question: What is the average order value by product category?
SQL: SELECT p.category, AVG(o.total_price) AS avg_order_value FROM orders o JOIN products p ON o.product_id = p.id GROUP BY p.category ORDER BY avg_order_value DESC;
"""

def nl_to_sql_fewshot(question: str) -> str:
    prompt = f"""You are a SQL expert for a PostgreSQL e-commerce database.

Schema:
{SCHEMA}

{FEW_SHOT_EXAMPLES}

Question: {question}
SQL:"""
    # ... same API call as before
Enter fullscreen mode Exit fullscreen mode

4. AI Agents over SQL — Autonomous Database Reasoning

The most powerful pattern: an AI agent that can autonomously decide which queries to run, inspect the results, and iterate to answer complex multi-step questions.

Building a SQL agent with LangChain + custom tools

from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_openai import ChatOpenAI
from langchain.tools import tool
from langchain import hub
import psycopg2

conn = psycopg2.connect(os.environ["DATABASE_URL"])

@tool
def query_database(sql: str) -> str:
    """
    Execute a read-only SQL SELECT query against the e-commerce database.
    Use this to retrieve data needed to answer user questions.
    Never use INSERT, UPDATE, DELETE, or DROP.
    """
    if not is_safe_sql(sql):
        return "Error: Only SELECT queries are allowed."
    try:
        cur = conn.cursor()
        cur.execute(sql)
        columns = [desc[0] for desc in cur.description]
        rows = cur.fetchmany(50)  # cap results
        result = [dict(zip(columns, row)) for row in rows]
        return str(result)
    except Exception as e:
        return f"Query error: {e}"

@tool
def get_schema() -> str:
    """
    Returns the database schema — table names, columns, and types.
    Call this first to understand what data is available before writing queries.
    """
    cur = conn.cursor()
    cur.execute("""
        SELECT table_name, column_name, data_type
        FROM information_schema.columns
        WHERE table_schema = 'public'
        ORDER BY table_name, ordinal_position;
    """)
    rows = cur.fetchall()
    schema = {}
    for table, column, dtype in rows:
        schema.setdefault(table, []).append(f"{column} ({dtype})")
    return "\n".join(
        f"{table}: {', '.join(cols)}"
        for table, cols in schema.items()
    )

@tool
def semantic_product_search(query: str) -> str:
    """
    Search for products using semantic/natural language similarity.
    Use this when the user describes what they want rather than specifying exact product names.
    """
    results = semantic_search(query, limit=5)
    return str([{"name": r[0], "description": r[1], "price": r[3]} for r in results])

# Assemble the agent
llm = ChatOpenAI(model="gpt-4o", temperature=0)
tools = [get_schema, query_database, semantic_product_search]

prompt = hub.pull("hwchase17/openai-tools-agent")

agent = create_openai_tools_agent(llm, tools, prompt)
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
    max_iterations=10
)

# Complex multi-step question
response = agent_executor.invoke({
    "input": """
    I'm looking for ergonomic office products. 
    Find the top 3 most relevant products, then check if we have customers 
    from Germany who bought office furniture in the past 6 months. 
    Give me a summary of both findings.
    """
})

print(response["output"])
Enter fullscreen mode Exit fullscreen mode

The agent will autonomously:

  1. Call semantic_product_search("ergonomic office products")
  2. Call get_schema() to understand the tables
  3. Write and run a SQL query for German office furniture buyers
  4. Synthesize both results into a coherent answer

Putting It All Together: Architecture Overview

┌─────────────────────────────────────────────────────┐
│                   PostgreSQL Database               │
│                                                     │
│  ┌─────────────┐  ┌──────────────┐  ┌───────────┐  │
│  │  products   │  │  embeddings  │  │  orders   │  │
│  │  customers  │  │  (pgvector)  │  │  ...      │  │
│  └─────────────┘  └──────────────┘  └───────────┘  │
└───────────────────────┬─────────────────────────────┘
                        │
           ┌────────────┼────────────┐
           ▼            ▼            ▼
     ┌──────────┐ ┌──────────┐ ┌──────────┐
     │ Semantic │ │  NL2SQL  │ │  RAG     │
     │  Search  │ │  Agent   │ │ Pipeline │
     └──────────┘ └──────────┘ └──────────┘
           │            │            │
           └────────────┼────────────┘
                        ▼
                  ┌──────────┐
                  │  LLM     │
                  │ (GPT-4o) │
                  └──────────┘
                        │
                        ▼
                  User Response
Enter fullscreen mode Exit fullscreen mode

Performance Tips

Index strategy for pgvector:

-- HNSW: faster queries, slower inserts — best for production read-heavy workloads
CREATE INDEX ON products USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

-- IVFFlat: faster inserts, slightly less accurate — good for bulk loading
CREATE INDEX ON products USING ivfflat (embedding vector_l2_ops)
WITH (lists = 100);
-- Rule of thumb: lists ≈ sqrt(number_of_rows)
Enter fullscreen mode Exit fullscreen mode

Batching embeddings:

# Don't call the embeddings API one row at a time
texts = [f"{p['name']}: {p['description']}" for p in products]

response = client.embeddings.create(
    model="text-embedding-3-small",
    input=texts  # batch up to 2048 inputs at once
)
embeddings = [item.embedding for item in response.data]
Enter fullscreen mode Exit fullscreen mode

Caching repeated NL2SQL queries:

import hashlib
import json

query_cache: dict[str, str] = {}

def cached_nl_to_sql(question: str) -> str:
    key = hashlib.md5(question.lower().strip().encode()).hexdigest()
    if key not in query_cache:
        query_cache[key] = nl_to_sql(question)
    return query_cache[key]
Enter fullscreen mode Exit fullscreen mode

Security Checklist

Before shipping any of these patterns to production:

  • Allowlist only SELECT — reject DDL and DML from LLM-generated SQL
  • Use parameterized queries when executing with user-supplied values
  • Create a read-only DB user for your AI layer
  • Set query timeouts to prevent expensive accidental full-table scans
  • Cap result set sizes from agent tool calls
  • Log all LLM-generated queries for auditing
  • Never expose raw error messages — they leak schema information
-- Create a restricted role for your AI application
CREATE ROLE ai_readonly;
GRANT CONNECT ON DATABASE aidb TO ai_readonly;
GRANT USAGE ON SCHEMA public TO ai_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO ai_readonly;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO ai_readonly;
Enter fullscreen mode Exit fullscreen mode

Public Repository

All the code from this article — including Docker Compose setup, schema migrations, and ready-to-run examples — is in the companion repository:

🔗 github.com/yourusername/sql-ai-database-solutions

The repo structure:

sql-ai-database-solutions/
├── docker-compose.yml          # PostgreSQL + pgvector, ready to go
├── schema/
│   └── 001_init.sql            # Tables, pgvector extension, indexes
├── examples/
│   ├── 01_pgvector_search.py   # Semantic search
│   ├── 02_rag_pipeline.py      # RAG with LangChain
│   ├── 03_nl2sql.py            # Natural language to SQL
│   └── 04_sql_agent.py         # Autonomous SQL agent
├── .env.example
└── README.md
Enter fullscreen mode Exit fullscreen mode

What's Next?

The patterns here are just the start. Some directions worth exploring:

  • DB-GPT — an open-source, full-stack AI database framework
  • Vanna.AI — RAG-based NL2SQL with self-learning capabilities
  • pgai — Timescale's extension for running ML directly inside PostgreSQL
  • Multi-agent SQL workflows — agents that collaborate across multiple databases

The line between "database" and "AI system" is getting thinner every month. If you're already running PostgreSQL, you're closer to a production AI stack than you might think.


Have questions or want to share what you've built? Drop a comment below or open an issue on the repo!

Top comments (0)