Natasha Joshi for Precogs AI

Posted on Apr 23

SQL Injection in Python: Example, Exploitation, Detection, and Prevention

#ai #computerscience #codesecurity #cybersecurity

TL;DR - SQL Injection in Python

Root cause: f-strings, .format(), %-formatting, and + concatenation embed user input directly into SQL — the database can't tell the difference between your query and an attacker's payload.
The fix: parameterized queries (? for sqlite3, %s for psycopg2) or an ORM. Values are bound after SQL is parsed — injection is structurally impossible, not just filtered.
Easy to miss: ORDER BY and column names can't be parameterized — use an explicit allowlist instead.
Detection: grep surfaces candidates fast, but SAST tools with taint tracking catch what grep misses — especially queries assembled across multiple lines or helper functions.
In production: least-privilege DB accounts limit blast radius; suppressed error messages deny attackers schema information.

Introduction

SQL injection (SQLi) has topped the OWASP Top 10 for over a decade and continues to be the root cause of high-profile data breaches. The attack is simple in concept: user-controlled input is concatenated directly into a SQL query, allowing an attacker to modify the query's logic rather than just supply data values.

SQL injection in Python is one of the most common and consequential vulnerabilities in backend applications today. Python's ecosystem flexibility — quick prototyping with raw database drivers like sqlite3 and psycopg2, multiple web frameworks, and a culture of readable, concise code — creates real opportunities for this bug to slip through code review. A developer writes a login endpoint in Flask using an f-string query. It works. Tests pass. The vulnerability ships to production.

This article covers the full lifecycle of SQL injection in Python: what it is, how attackers exploit it, how to detect it through code review and automated tooling, and — most importantly — how to fix it with concrete, copy-paste-ready examples.

What Is SQL Injection in Python?

SQL injection is a code injection technique where an attacker inserts or "injects" malicious SQL statements into an input field that gets incorporated into a database query. The root cause is treating user input as part of the query's syntax rather than as a literal data value to be passed to the database.

When a database receives a query, it parses the SQL text to build an abstract syntax tree before execution. If attacker-controlled input has already been concatenated into that text, the database cannot distinguish between the developer's intended query structure and the injected commands. The attacker becomes the query author.

🔴 CWE-89 — Improper Neutralization of Special Elements used in an SQL Command. SQL injection is classified under CWE-89 and is consistently ranked in the OWASP Top 10 under A03:2021 Injection. The CVSS base score for exploitable SQL injection in a web application typically ranges from 7.5 (High) to 9.8 (Critical), depending on the database's privilege level and the application's data sensitivity.

Types of SQL Injection

In-band SQLi — Results are returned directly in the HTTP response. The most common and easiest to detect. Includes error-based and UNION-based variants.
Blind SQLi (Boolean-based) — No data is returned directly, but the application behaves differently based on whether a condition is true or false. Slower to exploit but equally dangerous.
Blind SQLi (Time-based) — The attacker infers data by measuring response latency using functions like SLEEP() or pg_sleep().
Out-of-band SQLi — Data is exfiltrated via DNS or HTTP requests triggered by the database itself. Less common but used to bypass WAFs.

Python SQL Injection Example (Vulnerable Code)

The following is a realistic login endpoint written with Flask and Python's sqlite3 driver. It looks functional, handles both success and failure cases, and uses standard library components — but it contains a critical SQL injection vulnerability.

# app.py — Vulnerable login endpoint
import sqlite3
from flask import Flask, request, jsonify

app = Flask(__name__)

def get_db():
    return sqlite3.connect("users.db")

@app.route("/login", methods=["POST"])
def login():
    username = request.form["username"]
    password = request.form["password"]

    db = get_db()
    cursor = db.cursor()

    # ❌ VULNERABLE: user input concatenated directly into the query string
    query = f"SELECT * FROM users WHERE username = '{username}' AND password = '{password}'"
    cursor.execute(query)

    user = cursor.fetchone()

    if user:
        return jsonify({"status": "success", "user_id": user[0]})
    else:
        return jsonify({"status": "error", "message": "Invalid credentials"}), 401

The problem is on the highlighted line. The f-string embeds username and password directly into the SQL string. The database receives a fully-formed SQL statement where both values are part of the query text itself — not bound as parameters. Any SQL metacharacters the attacker includes (', --, ;, OR) are interpreted as SQL syntax.

The same vulnerability appears in codebases using %-formatting or +-concatenation:

# Other vulnerable patterns in Python
# Pattern 1: % string formatting
query = "SELECT * FROM users WHERE username = '%s'" % username
# Pattern 2: string concatenation
query = "SELECT * FROM orders WHERE user_id = " + user_id
# Pattern 3: .format() — equally vulnerable
query = "SELECT * FROM products WHERE id = {}".format(product_id)
# Pattern 4: dynamic ORDER BY — commonly missed
query = f"SELECT * FROM logs ORDER BY {sort_column} DESC"

⚠️ The ORDER BY pattern is frequently missed by code review. You cannot parameterize column names or SQL keywords — only values. Dynamic column names require explicit allowlisting, not parameterization.

This allows payloads like ' OR '1'='1 to turn the WHERE clause into a tautology — returning every row in the table. A payload like admin'-- eliminates the password check entirely by commenting out the rest of the query.

How Attackers Exploit SQL Injection

Given the vulnerable login endpoint above, here is what an attacker actually sends and what happens at the database level.

Authentication bypass — entering admin' -- as the username causes the database to ignore the password check entirely:

WHERE username = 'admin' -- (password check removed by comment operator)

Data exfiltration via UNION — if query results are reflected in the response, an attacker appends a UNION SELECT to read arbitrary tables:

' UNION SELECT username, password, NULL FROM users --

Schema discovery works the same way: ' UNION SELECT name, sql, NULL FROM sqlite_master -- returns table and column names, giving the attacker a full map of the database.

Common SQL injection payloads

Payload	Technique	Impact	Risk
' OR '1'='1	Boolean tautology	Returns all rows, bypasses WHERE filter	Critical
admin'--	Comment truncation	Authentication bypass for known username	Critical
' UNION SELECT null,null--	UNION-based	Column count detection, precursor to exfiltration	High
1; DROP TABLE users--	Stacked queries	Data destruction (driver-dependent)	Critical
1 AND SLEEP(5)--	Time-based blind	Confirms injection point, enables data extraction	High

How to Detect SQL Injection in Python

Manual code review

The first line of defense is knowing what patterns to look for during review. In Python, the signal is clear: a string passed to .execute() or similar methods that was assembled using f-strings, %-formatting, .format(), or +-concatenation.

A targeted grep can surface candidates quickly across a codebase:

# Find .execute() calls that use f-strings or % formatting
grep -rn "\.execute(f['\"]" ./app/
grep -rn "\.execute(.*%.*)" ./app/
grep -rn "\.execute(.*\.format(" ./app/

# Find raw query construction with concatenation
grep -rn "SELECT.*+\s*" ./app/ --include="*.py"
grep -rn "WHERE.*+\s*" ./app/ --include="*.py"

# Broader: any execute() call — review manually
grep -rn "cursor\.execute\|db\.execute\|conn\.execute" ./app/ --include="*.py"

Grep is useful for a quick scan, but it produces false positives for parameterized queries and misses cases where the query is assembled across multiple lines or helper functions. It is a starting point, not a definitive audit.

Static analysis tools (SAST)

For automated detection at scale, SAST tools analyze data flow: they trace where user-controlled values originate (HTTP request parameters, environment variables, file reads) and whether they reach a sink — a function that executes SQL — without passing through a sanitizer or parameterization step.

Common options include pattern-based scanners like Bandit and Semgrep, and semantic analyzers like CodeQL. Each varies in precision, false positive rate, and setup overhead — for a detailed comparison, see Best SAST Tools in 2026.

Flask SQL Injection Vulnerability

Unlike login examples, SQL injection frequently appears in search endpoints using LIKE clauses — which developers often assume are safe because they don't handle authentication. The pattern below is common in Flask filter and search endpoints.

Vulnerable Flask endpoint using request.args

# Flask search endpoint — vulnerable
from flask import Flask, request, jsonify
import sqlite3

app = Flask(__name__)

@app.route("/search")
def search():
    # request.args pulls directly from the URL query string
    term = request.args.get("q", "")

    conn = sqlite3.connect("products.db")
    cursor = conn.cursor()

    # ❌ term is user-controlled — never embed in SQL string
    query = f"SELECT * FROM products WHERE name LIKE '%{term}%'"
    cursor.execute(query)

    results = cursor.fetchall()
    return jsonify(results)

Attack request targeting this endpoint

GET /search?q=%' UNION SELECT username,password,NULL FROM users--
-- Results in: WHERE name LIKE '%' UNION SELECT username,password,NULL FROM users--%'
-- The entire users table is returned in place of product results.

Fixed Flask endpoint — parameterized query

# Flask search endpoint — secure
@app.route("/search")
def search():
    term = request.args.get("q", "")

    conn = sqlite3.connect("products.db")
    cursor = conn.cursor()

    # ✅ Wildcard characters added in Python, value passed as parameter
    cursor.execute(
        "SELECT * FROM products WHERE name LIKE ?",
        (f"%{term}%",)
    )

    results = cursor.fetchall()
    return jsonify(results)

💡 Note on LIKE patterns: The % wildcard characters must be added in Python before the value is passed as a parameter — not inside the SQL string itself. The database driver handles escaping the contents of term, including any % or _ characters the user might include.

How to Fix SQL Injection in Python

The correct fix for SQL injection is parameterized queries — also called prepared statements. Instead of embedding values into the query string, you pass them as a separate argument to the driver. The database engine then handles escaping internally, ensuring that values are always treated as data, never as SQL syntax.

Diagram: Parameterized Query vs Unsafe Query

❌ UNSAFE — String Concatenation

SQL String
"WHERE id = " + uid
+

User Input
1 OR 1=1

⚠ Injection Risk
WHERE id = 1 OR 1=1

✅ SAFE — Parameterized Query

SQL Statement
"WHERE id = ?"
,

Parameters
(uid,)

✅ Safe Execution
Value always treated as data

Fix 1: Parameterized queries with sqlite3 / psycopg2

# sqlite3 — use ? placeholders
query = "SELECT * FROM users WHERE username = ? AND password = ?"
cursor.execute(query, (username, password))

# psycopg2 / MySQL — use %s placeholders
query = "SELECT * FROM users WHERE username = %s AND password = %s"
cursor.execute(query, (username, password))

Values are passed as a separate tuple — never embedded in the SQL string. The database parses the SQL syntax tree first, then binds the values afterward, meaning user input never becomes part of the SQL structure. Injection is structurally impossible, not just filtered.

Fix 2: SQLAlchemy ORM (recommended for application code)

# Secure query with SQLAlchemy ORM
from sqlalchemy.orm import Session
from models import User

def get_user(db: Session, username: str, password: str):
    # ORM methods always parameterize internally — no raw strings
    return db.query(User).filter_by(username=username, password=password).first()

# If you need raw SQL with SQLAlchemy, use text() with bound params:
from sqlalchemy import text

def get_user_raw(db: Session, username: str):
    stmt = text("SELECT * FROM users WHERE username = :username")
    return db.execute(stmt, {"username": username}).fetchone()

Fix 3: Dynamic ORDER BY — allowlisting

Column names and SQL keywords cannot be parameterized. For dynamic ORDER BY or column selection, use an explicit allowlist:

# ❌ VULNERABLE — never do this
query = f"SELECT * FROM logs ORDER BY {request.args.get('sort')} DESC"

# ✅ SECURE — explicit allowlist of valid column names
ALLOWED_SORT_COLUMNS = {"created_at", "severity", "user_id"}

sort_param = request.args.get("sort", "created_at")
sort_col = sort_param if sort_param in ALLOWED_SORT_COLUMNS else "created_at"

query = text(f"SELECT * FROM logs ORDER BY {sort_col} DESC")
cursor.execute(query)

Quick reference — safe vs unsafe patterns

Pattern	Safe?	Reason
`f"SELECT ... {user_input}"`	❌	Input becomes part of SQL syntax
`"SELECT ..." % user_input`	❌	No separation between query and data
`"SELECT ...".format(user_input)`	❌	No separation between query and data
`cursor.execute(query, (value,))`	✅	Value bound after SQL is parsed
`db.query(Model).filter_by(...)`	✅	ORM parameterizes internally
`text("... :param")` + bound params	✅	Explicit binding in raw SQLAlchemy

Why Traditional SAST Tools Produce False Positives

Pattern-based SAST tools work by matching source code against a library of known-bad patterns — essentially grep with data flow awareness. This approach has a fundamental limitation: the tool flags any code that looks like a vulnerability, regardless of whether it is actually exploitable. For a broader breakdown of how SAST, DAST, and SCA differ in practice, see SAST vs DAST vs SCA: What's the Difference.

❌ False Positive Example

A SAST tool flags a cursor.execute() call that constructs a query with an f-string — but the only variable interpolated is an integer user_id that has already been validated and cast via int(). The query is not injectable, but the tool reports a critical finding regardless.

✅ What AI Analysis Does Instead

An AI-native analyzer traces the complete data flow: it determines that user_id originates from a JWT-validated session, passes through an int() cast, and reaches the query with no user-controlled string interpolation. No finding is raised.

False positives have a real cost. When a tool reports 200 findings per week and 60–80% are false positives, engineers start triaging mechanically rather than carefully. Real vulnerabilities get dismissed in the noise. The tool loses credibility, gets disabled in CI, and the security program breaks down.

The three most common sources of false positives for SQL injection specifically are: values that look like user input but originate from internal configuration; string formatting used in query construction where the interpolated value is a validated constant; and ORMs that generate parameterized queries internally but whose intermediate string representations trigger pattern matches.

Detecting Real SQL Injection Vulnerabilities with Precogs AI

Most SAST tools detect SQL injection by pattern-matching: they flag any .execute() call assembled with an f-string or + operator, regardless of whether the value is actually user-controlled. Precogs AI uses taint analysis instead — tracing data from HTTP sources through your call graph to SQL sinks, and raising a finding only when exploitability is confirmed. Findings surface directly inside PRs and CI pipelines, mapped to OWASP Top 10 and CWE Top 25, so security context arrives where developers are already working.

1.92×
More Effective
than SAST + LLM solutions at finding real vulnerabilities

63×
Fewer False Positives
compared to combined SAST + LLM in benchmark testing

30+
Languages & Frameworks
Python, Flask, Django, FastAPI and beyond

Figures based on the CASTLE benchmark evaluation.

What a finding looks like in practice

Precogs AI surfaces the exact taint path — from HTTP input source through to SQL execution sink — alongside a code diff showing the vulnerable line and the parameterized fix.

How Taint Analysis Works

🌐
HTTP Source

request.args
request.form
route params

🔄
Transformations

Type casts
Validation fns
ORM layers

🔍
Boundary Check

Parameterized?
User-controlled?
Reaches SQL sink?

✅
No Finding
Parameterized —
not injectable

🚨
Finding Raised
Full taint path
shown to developer

Capability	Pattern-based SAST	Precogs AI
Detection method	Regex / AST pattern matching	Full taint-path analysis
Tracks data through function calls	✗ No	✓ Yes
Understands type casts (int())	✗ No	✓ Yes
ORM-aware (SQLAlchemy, Django)	✗ Limited	✓ Yes
False positive rate	High (60–80% noise typical)	63× lower in benchmarks
Finding includes taint path	✗ No	✓ Yes — line-by-line
CI/CD integration	Varies	GitHub & GitLab via REST API

Best Practices for Preventing SQL Injection in Python

Parameterized queries fix the vulnerability at the code level. These practices build defense in depth at the architecture and process level.

Conclusion

SQL injection in Python is not a subtle or complex vulnerability — it has a clear cause and a definitive fix. Every instance traces back to the same root: user-controlled input treated as SQL syntax rather than bound as a parameter. Switching to parameterized queries with sqlite3, psycopg2, or SQLAlchemy's text() eliminates the vulnerability structurally, not through filtering or escaping.

Defense in depth — least-privilege database accounts, suppressed error messages in production, SAST in CI, and security unit tests — reduces the blast radius if something slips through. But none of those layers substitute for parameterized queries at every database call.

If you want to verify your Python codebase is clean, Precogs AI traces taint paths from HTTP inputs to SQL sinks and surfaces only confirmed, exploitable findings — so your team spends time fixing real vulnerabilities, not triaging noise.

<h2>Catch SQL Injection Before It Ships</h2>
<p>
  Precogs AI traces taint paths from HTTP inputs to SQL sinks across your entire Python codebase — raising findings only when <b>exploitability is confirmed, not on every pattern match.</b>
</p>

Scan your codebase free →

DEV Community