DEV Community

Cover image for # SQL AI Database Solutions: Code and Real-World Examples

# SQL AI Database Solutions: Code and Real-World Examples

Public Example Repository: https://github.com/Abel-GG-777/sql-ai-database-solutions

Abstract

SQL AI Database Solutions combine database systems with artificial intelligence techniques that allow users to ask questions in natural language and receive SQL queries or query results. This article explains the concept of Text-to-SQL, why AI is useful for database querying, and how a small Python application can translate natural language into safe SQLite SELECT statements. The repository includes a working Streamlit demo, a realistic sample database, code examples, and tests. It also discusses real-world use cases, benefits, limitations, risks, and security considerations for production systems.

Introduction

Databases are essential in modern organizations because they store sales records, customer information, student data, inventory levels, support tickets, and many other operational facts. However, many users who need answers from a database do not know SQL. A sales manager may ask, "Which products generated the most revenue this month?" A student services worker may ask, "Which active students are enrolled in Computer Science?" These questions are easy for humans to understand, but they normally require technical SQL knowledge.

SQL AI Database Solutions address this gap by using artificial intelligence to help users communicate with databases in natural language. The goal is not to remove database professionals, but to make data access faster and more understandable while preserving security and correctness.

What Are SQL AI Database Solutions?

SQL AI Database Solutions are systems that use AI models, rules, or hybrid methods to help users interact with relational databases. In a typical solution, the user writes a natural language question, the system reads the database schema, an AI component generates an SQL query, and the database executes the query.

The solution may return the generated SQL, the query results, or both. In educational and professional environments, showing the generated SQL is useful because users can learn how natural language maps to database logic. It also supports transparency, review, and debugging.

What Is Text-to-SQL?

Text-to-SQL is the task of converting natural language into SQL. For example:

Natural language question:

Which products are low in stock?
Enter fullscreen mode Exit fullscreen mode

Generated SQL:

SELECT
    sku,
    name,
    category,
    stock_quantity,
    reorder_level
FROM products
WHERE stock_quantity <= reorder_level
ORDER BY stock_quantity ASC;
Enter fullscreen mode Exit fullscreen mode

Modern Text-to-SQL systems often use large language models trained or prompted to understand schemas, table relationships, and user intent. Some systems use general language models, while others use specialized models available through platforms such as Hugging Face.

Why AI Is Useful for Database Querying

AI is useful for database querying because it can reduce the technical barrier between users and data. A non-technical user can ask a question in normal language instead of memorizing SQL syntax. AI can also speed up the work of analysts by creating first drafts of queries, suggesting joins, and helping explore unfamiliar schemas.

AI also supports education. Students can compare their natural language questions with generated SQL and understand how tables, filters, grouping, and ordering work. This repository supports that learning goal by showing the generated query before displaying the result.

Architecture of the Demo Application

The demo application uses a simple and safe architecture:

  1. The user enters a natural language question in Streamlit.
  2. The application connects to a local SQLite database.
  3. The schema reader extracts table and column information.
  4. The query generator creates an SQL query.
  5. The validator allows only SELECT queries and blocks dangerous keywords.
  6. The query executor runs the query.
  7. Streamlit displays the generated SQL and the results.

The main files are:

  • app.py: Streamlit user interface.
  • src/database.py: SQLite connection and database creation.
  • src/schema_reader.py: Table and column extraction.
  • src/query_generator.py: Optional Hugging Face generation and rule-based fallback.
  • src/query_executor.py: SQL validation and execution.
  • data/seed_data.sql: Sample database schema and realistic data.

Code Examples

Database Connection

The database module opens a SQLite connection and configures rows so they can be converted into dictionaries.

import sqlite3
from pathlib import Path

DEFAULT_DB_PATH = Path("data/sample.db")

def get_connection(db_path=DEFAULT_DB_PATH):
    connection = sqlite3.connect(db_path)
    connection.row_factory = sqlite3.Row
    return connection
Enter fullscreen mode Exit fullscreen mode

SQLite is used because it is local, portable, and does not require exposing credentials.

Schema Extraction

The application reads the database schema so the query generator can understand available tables and columns.

def list_tables(connection):
    query = """
        SELECT name
        FROM sqlite_master
        WHERE type = 'table'
          AND name NOT LIKE 'sqlite_%'
        ORDER BY name;
    """
    return [row["name"] for row in connection.execute(query).fetchall()]
Enter fullscreen mode Exit fullscreen mode

The extracted schema can be formatted as prompt context:

products(product_id INTEGER, sku TEXT, name TEXT, category TEXT, unit_price REAL)
sales(sale_id INTEGER, customer_id INTEGER, product_id INTEGER, sale_date TEXT)
Enter fullscreen mode Exit fullscreen mode

Natural Language Prompt Input

In the Streamlit interface, users write a natural language question:

question = st.text_area(
    "Ask a question about the sample database",
    value="Which products are low in stock?"
)
Enter fullscreen mode Exit fullscreen mode

This makes the interface accessible to users who do not know SQL.

SQL Query Generation

The demo supports optional Hugging Face integration using environment variables:

if os.getenv("HF_API_TOKEN"):
    return _generate_with_hugging_face(question, schema_text)
return generate_rule_based_sql(question)
Enter fullscreen mode Exit fullscreen mode

If no token is configured, the app uses a rule-based fallback:

if "low" in normalized_question and "stock" in normalized_question:
    return """
    SELECT sku, name, category, stock_quantity, reorder_level
    FROM products
    WHERE stock_quantity <= reorder_level
    ORDER BY stock_quantity ASC
    """
Enter fullscreen mode Exit fullscreen mode

This keeps the classroom demo functional without paid services or external APIs.

SQL Validation

The demo limits generated SQL to SELECT statements only:

if first_token != "SELECT":
    raise SqlValidationError("Only SELECT queries are allowed.")
Enter fullscreen mode Exit fullscreen mode

It also blocks dangerous keywords:

BLOCKED_SQL_KEYWORDS = {
    "ALTER", "CREATE", "DELETE", "DROP", "INSERT",
    "REPLACE", "TRUNCATE", "UPDATE", "VACUUM"
}
Enter fullscreen mode Exit fullscreen mode

This is basic validation for a sample project. It is not enough for production environments.

SQL Query Execution

The executor validates the SQL before running it:

def execute_select_query(connection, sql):
    safe_sql = validate_sql(sql)
    cursor = connection.execute(safe_sql)
    return [dict(row) for row in cursor.fetchall()]
Enter fullscreen mode Exit fullscreen mode

Result Display

Streamlit displays the generated SQL and the returned rows:

st.code(safe_sql, language="sql")
st.dataframe(pd.DataFrame(rows), use_container_width=True)
Enter fullscreen mode Exit fullscreen mode

This design helps users see both the query and the data result.

Real-World Use Cases

Business Dashboard Queries

A manager can ask:

What are the top selling products by revenue?
Enter fullscreen mode Exit fullscreen mode

The system can generate a query that joins sales and products, groups by product, and orders by revenue. This supports quick dashboard exploration.

Student Records Search

A university office can ask:

Find active students in Computer Science.
Enter fullscreen mode Exit fullscreen mode

The system can filter the students table by program and enrollment status. This is useful for academic advising, reporting, and enrollment management.

Sales Analysis

A sales analyst can ask:

Show total sales by month.
Enter fullscreen mode Exit fullscreen mode

The SQL query can group records by month and calculate revenue. This helps identify trends over time.

Inventory Management

An operations employee can ask:

Which products are low in stock?
Enter fullscreen mode Exit fullscreen mode

The query can compare stock_quantity with reorder_level. This supports restocking decisions.

Customer Support Data Extraction

A support supervisor can ask:

List open high priority support tickets.
Enter fullscreen mode Exit fullscreen mode

The system can join tickets with customers and filter by status and priority. This helps teams prioritize urgent cases.

Advantages

SQL AI Database Solutions provide several advantages:

  • They make databases easier for non-technical users.
  • They reduce the time needed to write common queries.
  • They help analysts explore unfamiliar schemas.
  • They can support training and SQL education.
  • They improve transparency when generated SQL is shown to the user.
  • They can be integrated into dashboards, internal tools, and support systems.

Limitations and Risks

AI-generated SQL can be incorrect. It may choose the wrong table, misunderstand a business term, miss a filter, or create an expensive query. Natural language can also be ambiguous. For example, "best customers" could mean highest revenue, highest number of purchases, or highest satisfaction score.

Large language models can also hallucinate columns or tables that do not exist. Even when the SQL is valid, the result may not answer the user's actual question. For this reason, generated SQL should be reviewed, especially when used for financial, academic, legal, or operational decisions.

Security Considerations

Security is one of the most important topics in SQL AI systems. A production system should not simply generate SQL and run it with full database permissions. It should use a dedicated read-only database account, strict access control, audit logging, query limits, and strong validation.

This demo includes basic safety rules:

  • SQLite is used locally.
  • No database credentials are required.
  • Only SELECT statements are allowed.
  • Dangerous keywords such as DROP, DELETE, UPDATE, INSERT, ALTER, and TRUNCATE are blocked.
  • Hugging Face integration is optional and controlled by environment variables.

Production systems need stronger validation, permissions, logging, monitoring, rate limiting, and human review workflows. Sensitive data should be protected with masking, role-based access, and privacy rules.

Conclusion

SQL AI Database Solutions can make data access more natural, faster, and more inclusive. By combining natural language input, schema extraction, Text-to-SQL generation, validation, and result display, users can interact with relational databases without writing SQL manually.

This repository demonstrates the core idea using Python, Streamlit, and SQLite. It includes a local rule-based fallback so the project works without an API key, and it includes optional Hugging Face integration for experimentation with AI models. The project is intentionally simple, but it shows the foundation of real-world systems that help people talk to databases using natural language.

References

Top comments (0)