Stop writing SQL strings that look like a ransom note. Here's how to write queries that are readable, safe, and maintainable.
The Problem With "Good Enough" SQL Formatting
Most Python developers start here:
user_id = 5
query = "SELECT * FROM users WHERE id = " + str(user_id)
cursor.execute(query)
It works. Until it doesn't — and when it breaks, it breaks badly: SQL injection, cryptic errors from mismatched types, and queries that take 45 minutes to debug at 2am. Let's fix that, permanently.
1. Never Concatenate User Input — Use Parameterized Queries
This is rule #1 and it's non-negotiable.
❌ The Wrong Way (SQL Injection Waiting to Happen)
username = request.args.get("username") # Could be: ' OR '1'='1
query = f"SELECT * FROM users WHERE username = '{username}'"
cursor.execute(query)
If username is ' OR '1'='1, your entire users table just got exposed.
✅ The Right Way: Parameterized Queries
username = request.args.get("username")
# psycopg2 (PostgreSQL)
cursor.execute("SELECT * FROM users WHERE username = %s", (username,))
# sqlite3
cursor.execute("SELECT * FROM users WHERE username = ?", (username,))
# SQLAlchemy Core
from sqlalchemy import text
result = conn.execute(text("SELECT * FROM users WHERE username = :name"), {"name": username})
The database driver handles escaping. You never touch it. This pattern is immune to SQL injection by design.
Gotcha: Note the trailing comma in
(username,). Without it, Python treats the string as an iterable and passes each character as a separate parameter. This is one of the most common beginner bugs.
# 💥 Bug: passes ('a', 'l', 'i', 'c', 'e') instead of ('alice',)
cursor.execute("SELECT * FROM users WHERE username = %s", (username))
# ✅ Correct: single-element tuple
cursor.execute("SELECT * FROM users WHERE username = %s", (username,))
2. Multi-Line Queries: Triple Quotes + Consistent Indentation
For anything longer than one clause, use triple-quoted strings. They're readable, diffable, and don't require escape characters.
❌ String Concatenation Chaos
query = "SELECT u.id, u.name, o.total " + \
"FROM users u " + \
"JOIN orders o ON u.id = o.user_id " + \
"WHERE o.total > 100"
✅ Triple-Quoted SQL
query = """
SELECT
u.id,
u.name,
u.email,
o.total,
o.created_at
FROM users u
JOIN orders o
ON u.id = o.user_id
WHERE
o.total > 100
AND u.active = TRUE
ORDER BY o.created_at DESC
LIMIT 50
"""
cursor.execute(query)
Style conventions that matter:
- SQL keywords in
UPPERCASE - One clause per line (
SELECT,FROM,WHERE,ORDER BY) - Indent column lists by 4 spaces
- Put
ONconditions on their own line, indented underJOIN
3. Dynamic Queries: Building Them Safely
Real applications need dynamic filters. Here's how to do it without string concatenation.
Scenario: A search endpoint with optional filters
def get_orders(user_id: int, status: str = None, min_total: float = None):
base_query = """
SELECT
o.id,
o.status,
o.total,
o.created_at
FROM orders o
WHERE o.user_id = %s
"""
params = [user_id]
conditions = []
if status:
conditions.append("AND o.status = %s")
params.append(status)
if min_total is not None:
conditions.append("AND o.total >= %s")
params.append(min_total)
full_query = base_query + "\n ".join(conditions)
cursor.execute(full_query, params)
return cursor.fetchall()
This is the accumulator pattern: build a list of clauses and params in parallel, then join them. The number of %s placeholders always matches len(params).
Gotcha: Never dynamically interpolate column names or table names with parameters — the DB driver only parameterizes values, not identifiers. If you need a dynamic column name, whitelist it explicitly:
ALLOWED_SORT_COLUMNS = {"created_at", "total", "status"}
def get_orders_sorted(sort_by: str):
if sort_by not in ALLOWED_SORT_COLUMNS:
raise ValueError(f"Invalid sort column: {sort_by}")
# Now safe to interpolate — it came from our whitelist, not user input
query = f"""
SELECT id, status, total, created_at
FROM orders
ORDER BY {sort_by} DESC
"""
cursor.execute(query)
4. IN Clauses: The Tricky One
Parameterizing IN (...) trips up almost everyone the first time.
❌ This Won't Work
ids = [1, 2, 3, 4]
cursor.execute("SELECT * FROM users WHERE id IN (%s)", (ids,))
# Error: query expects 1 placeholder, got a list
✅ Generate Placeholders Dynamically
ids = [1, 2, 3, 4]
# sqlite3
placeholders = ", ".join("?" * len(ids))
query = f"SELECT * FROM users WHERE id IN ({placeholders})"
cursor.execute(query, ids)
# psycopg2 — uses %s for each
placeholders = ", ".join(["%s"] * len(ids))
query = f"SELECT * FROM users WHERE id IN ({placeholders})"
cursor.execute(query, ids)
✅ SQLAlchemy makes this trivial
from sqlalchemy import select, bindparam
stmt = select(users_table).where(users_table.c.id.in_([1, 2, 3, 4]))
result = conn.execute(stmt)
5. Use SQLAlchemy Core for Complex Queries
When your raw SQL starts looking like a maze, SQLAlchemy Core gives you composable, Pythonic query building without the full ORM overhead.
from sqlalchemy import create_engine, select, func, and_, desc
from sqlalchemy import Table, MetaData
engine = create_engine("postgresql://user:pass@localhost/mydb")
meta = MetaData()
meta.reflect(bind=engine)
orders = meta.tables["orders"]
users = meta.tables["users"]
stmt = (
select(
users.c.id,
users.c.name,
func.sum(orders.c.total).label("lifetime_value"),
func.count(orders.c.id).label("order_count"),
)
.join(orders, users.c.id == orders.c.user_id)
.where(
and_(
orders.c.status == "completed",
orders.c.created_at >= "2024-01-01",
)
)
.group_by(users.c.id, users.c.name)
.having(func.sum(orders.c.total) > 500)
.order_by(desc("lifetime_value"))
.limit(20)
)
with engine.connect() as conn:
results = conn.execute(stmt).fetchall()
This compiles to safe, parameterized SQL automatically. You can also print the generated SQL for debugging:
print(stmt.compile(dialect=engine.dialect, compile_kwargs={"literal_binds": True}))
6. Formatting for Logging and Debugging
When you need to log a query (for debugging, audit trails, etc.), format it cleanly — but never log with real user data embedded.
import logging
import re
logger = logging.getLogger(__name__)
def log_query(query: str, params: tuple):
# Normalize whitespace for clean log output
clean_query = re.sub(r'\s+', ' ', query.strip())
logger.debug("Executing SQL: %s | Params: %s", clean_query, params)
log_query("""
SELECT id, name
FROM users
WHERE active = %s
""", (True,))
# Output: Executing SQL: SELECT id, name FROM users WHERE active = %s | Params: (True,)
Gotcha: Don't use
cursor.mogrify()(psycopg2) to log queries with real values in production — it defeats the purpose of parameterization from an audit standpoint, and if logs are exfiltrated, you've exposed real data.
7. Query Constants: SQL as Module-Level Variables
For queries used across your codebase, define them as constants at the top of your module. This makes them findable, testable, and easy to update.
# queries.py
GET_USER_BY_ID = """
SELECT id, name, email, created_at
FROM users
WHERE id = %s
AND active = TRUE
"""
GET_USER_ORDERS = """
SELECT
o.id,
o.status,
o.total,
o.created_at
FROM orders o
WHERE o.user_id = %s
ORDER BY o.created_at DESC
"""
INSERT_ORDER = """
INSERT INTO orders (user_id, total, status, created_at)
VALUES (%s, %s, %s, NOW())
RETURNING id
"""
# Usage in your service layer
from queries import GET_USER_BY_ID, GET_USER_ORDERS
def get_user_with_orders(user_id: int):
cursor.execute(GET_USER_BY_ID, (user_id,))
user = cursor.fetchone()
cursor.execute(GET_USER_ORDERS, (user_id,))
orders = cursor.fetchall()
return user, orders
This makes SQL grep-able, version-controllable, and reviewable in PRs.
8. Gotchas Quick Reference
| Gotcha | What Goes Wrong | Fix |
|---|---|---|
(value) vs (value,)
|
String unpacked as characters | Always use trailing comma for single-element tuples |
| f-string with user input | SQL injection | Use parameterized queries |
Dynamic IN (...)
|
Wrong number of placeholders | Generate placeholders programmatically |
| Dynamic column names | Parameters don't work for identifiers | Whitelist allowed column names |
| Logging with real values | Data exposure in logs | Log query template + params separately |
None in params |
NULL comparison breaks | Use IS NULL in SQL, not = NULL
|
# None/NULL gotcha
user_role = None
# ❌ Won't match any rows — SQL NULL != NULL
cursor.execute("SELECT * FROM users WHERE role = %s", (user_role,))
# ✅ Use IS NULL when the value might be None
if user_role is None:
cursor.execute("SELECT * FROM users WHERE role IS NULL")
else:
cursor.execute("SELECT * FROM users WHERE role = %s", (user_role,))
Summary
| Situation | Recommended Approach |
|---|---|
| Simple parameterized query | cursor.execute(sql, (val,)) |
| Multi-line query | Triple-quoted string with clause-per-line formatting |
| Dynamic filters | Accumulator pattern (list of clauses + params) |
IN clause |
Generate placeholders: ", ".join(["%s"] * len(ids))
|
| Complex joins/aggregates | SQLAlchemy Core |
| Reusable queries | Module-level string constants |
| Dynamic column names | Whitelist validation + f-string |
The core rule is simple: SQL structure is yours to format; SQL values are the driver's job to escape. Internalize that boundary and most of these best practices follow naturally.
Found a gotcha I missed? Drop it in the comments — SQL has a way of surprising everyone eventually.
Top comments (0)