Part of my 100 Days of Code journey. Today we make the API actually usable at scale.
The Challenge
Add advanced query features to the Task Management API: filtering, sorting, pagination, search, and statistics.
The Problem: Yesterday's API works, but imagine having 1,000 tasks:
- Can't find specific tasks (no search)
- Loading all 1,000 at once (slow, wasteful)
- Can't sort by priority (no sorting)
- No insights into completion rates (no analytics)
The Solution: Implement production-grade query features with proper indexing.
Why Advanced Queries Matter
Real scenario: You have 5,000 tasks in your database.
Without these features:
# User clicks "View Tasks"
tasks = db.query(Task).all() # Loads ALL 5,000 tasks!
# Result:
# - 3 second page load
# - 5MB data transfer
# - Angry users
With these features:
# User searches "meeting", page 1
tasks = db.query(Task)\
    .filter(Task.title.ilike("%meeting%"))\
    .order_by(desc(Task.created_at))\
    .limit(20)\
    .all()
# Result:
# - 0.05 second response
# - 50KB data transfer
# - Happy users
This is the difference between a toy and a product.
What We're Building
Query Features:
- Filtering - Get only completed or pending tasks
- Sorting - Order by date, title, or status
- Pagination - Load 20 tasks at a time
- Search - Find tasks by keyword
- Combined - Use all features together
Statistics:
- Total task count
- Completion rate
- Tasks created today/this week
Performance:
- Database indexes for speed
- Query constraints for safety
Understanding Query Parameters
Query parameters are the ?key=value parts of URLs:
https://api.example.com/tasks?completed=false&sort_by=created_at&limit=20
Here `completed=false` filters, `sort_by=created_at` sorts, and `limit=20` paginates.
In FastAPI:
from fastapi import Query
from typing import Optional

@app.get("/tasks")
def get_tasks(
    completed: Optional[bool] = Query(None),
    sort_by: str = Query("created_at"),
    skip: int = Query(0, ge=0),
    limit: int = Query(20, ge=1, le=100)
):
    # FastAPI auto-extracts and validates!
    ...
What FastAPI does automatically:
- Extracts `completed` from the URL
- Converts the string "true" to the Python boolean `True`
- Validates `skip >= 0` and `limit` between 1 and 100
- Returns a 422 error if validation fails
Beautiful!
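To make the behavior concrete, here's a rough plain-Python sketch of the validation FastAPI performs on raw query-string values. The `validate_params` helper is mine, not FastAPI's — it just mimics the convert-then-constrain logic:

```python
from typing import Optional

def validate_params(completed: Optional[str], skip: str, limit: str):
    """Rough sketch of what FastAPI does with raw query-string values."""
    # Convert "true"/"false" strings to booleans; absent param stays None
    parsed_completed = None
    if completed is not None:
        if completed.lower() not in ("true", "false"):
            raise ValueError("422: completed must be a boolean")
        parsed_completed = completed.lower() == "true"

    # Enforce skip >= 0
    parsed_skip = int(skip)
    if parsed_skip < 0:
        raise ValueError("422: skip must be >= 0")

    # Enforce 1 <= limit <= 100
    parsed_limit = int(limit)
    if not 1 <= parsed_limit <= 100:
        raise ValueError("422: limit must be between 1 and 100")

    return parsed_completed, parsed_skip, parsed_limit

print(validate_params("false", "0", "20"))  # (False, 0, 20)
```

In the real API you never write this by hand — declaring `Query(20, ge=1, le=100)` gets you all of it for free.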
Step-by-Step Implementation
Step 1: Query Parameter Setup
from fastapi import Query
from typing import Optional

@router.get("/", response_model=List[schemas.TaskResponse])
def get_tasks(
    # Filtering
    completed: Optional[bool] = Query(None, description="Filter by status"),
    # Sorting
    sort_by: str = Query("created_at", description="Field to sort by"),
    order: str = Query("desc", description="asc or desc"),
    # Pagination
    skip: int = Query(0, ge=0, description="Records to skip"),
    limit: int = Query(20, ge=1, le=100, description="Max records"),
    # Search
    search: Optional[str] = Query(None, description="Search term"),
    db: Session = Depends(get_db),
    current_user: models.User = Depends(get_current_user)
):
    # Implementation below...
    ...
Query() parameters explained:
- `Query(None)` - optional, defaults to None
- `Query(0, ge=0)` - defaults to 0, must be >= 0
- `Query(20, ge=1, le=100)` - defaults to 20, must be between 1 and 100
Why constrain limit to 100?
Without constraints:
GET /tasks?limit=1000000 # User requests 1 million records!
# Server crashes, database overloaded
With constraints:
limit: int = Query(20, ge=1, le=100)
GET /tasks?limit=1000000 # FastAPI returns 422 error
# Server protected!
Step 2: Building the Query
The magic of query chaining:
# Start with base query
query = db.query(models.Task).filter(models.Task.owner_id == current_user.id)

# Add filter if provided
if completed is not None:
    query = query.filter(models.Task.completed == completed)

# Add search if provided
if search:
    search_term = f"%{search}%"
    query = query.filter(
        or_(
            models.Task.title.ilike(search_term),
            models.Task.description.ilike(search_term)
        )
    )

# Add sorting
valid_sort_fields = ["created_at", "updated_at", "title", "completed"]
if sort_by not in valid_sort_fields:
    sort_by = "created_at"

column = getattr(models.Task, sort_by)
if order == "desc":
    query = query.order_by(desc(column))
else:
    query = query.order_by(asc(column))

# Add pagination
query = query.offset(skip).limit(limit)

# Execute query (only NOW does it hit the database!)
return query.all()
Key insight: Query isn't executed until .all() is called!
This means:
query = db.query(Task) # No SQL yet
query = query.filter(...) # Still no SQL
query = query.order_by(...) # Still building
tasks = query.all() # NOW executes SQL!
Benefit: SQLAlchemy generates one optimized SQL query, not multiple.
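The lazy pattern is easy to see in miniature. Here's a toy builder in plain Python — `ToyQuery` is a made-up class for illustration, not SQLAlchemy — where chained calls only record work, and nothing runs until `.all()`:

```python
class ToyQuery:
    """Toy illustration of lazy query building: nothing runs until .all()."""
    def __init__(self, rows):
        self._rows = rows
        self._ops = []          # deferred operations, like SQLAlchemy's clause list

    def filter(self, predicate):
        self._ops.append(lambda rows: [r for r in rows if predicate(r)])
        return self             # enables chaining; still no work done

    def all(self):
        rows = self._rows
        for op in self._ops:    # only NOW do the recorded operations execute
            rows = op(rows)
        return rows

q = ToyQuery([1, 2, 3, 4, 5]).filter(lambda n: n % 2 == 0)  # nothing executed yet
print(q.all())  # [2, 4]
```

SQLAlchemy does the same thing, except at `.all()` it compiles the recorded clauses into one SQL statement instead of looping in Python.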
Step 3: Understanding Filtering
Basic filtering:
if completed is not None:
    query = query.filter(models.Task.completed == completed)
Why is not None?
Consider these requests:
GET /tasks # completed = None (don't filter)
GET /tasks?completed=true # completed = True (filter completed)
GET /tasks?completed=false # completed = False (filter pending)
Without is not None:
# Bad
if completed:
    query = query.filter(Task.completed == completed)
# GET /tasks?completed=false
# completed = False
# if False: -> doesn't execute!
# Filter not applied! Bug!
With is not None:
# Good
if completed is not None:
    query = query.filter(Task.completed == completed)
# GET /tasks?completed=false
# completed = False
# if False is not None: -> True
# Filter applied correctly!
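You can reproduce the bug without a database at all. This plain-Python sketch (the `filter_tasks_*` helpers and dict shape are mine, for illustration) filters a list of task dicts both ways:

```python
def filter_tasks_bad(tasks, completed):
    # Bug: `if completed:` is falsy when completed is False, so the filter is skipped
    if completed:
        return [t for t in tasks if t["completed"] == completed]
    return tasks

def filter_tasks_good(tasks, completed):
    # `is not None` distinguishes "no filter" (None) from "filter on False"
    if completed is not None:
        return [t for t in tasks if t["completed"] == completed]
    return tasks

tasks = [{"title": "a", "completed": True}, {"title": "b", "completed": False}]
print(filter_tasks_bad(tasks, False))   # returns BOTH tasks - filter silently skipped
print(filter_tasks_good(tasks, False))  # returns only the pending task
```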
Step 4: Implementing Search
Case-insensitive search:
if search:
    search_term = f"%{search}%"
    query = query.filter(
        or_(
            models.Task.title.ilike(search_term),
            models.Task.description.ilike(search_term)
        )
    )
Breaking it down:
ILIKE (case-insensitive LIKE):
Task.title.ilike("%meeting%")
Matches:
- "Team Meeting" ✅
- "meeting notes" ✅
- "MEETING" ✅
- "Status meeting tomorrow" ✅
Doesn't match:
- "Project discussion" ❌
The % wildcards:
- `meeting%` - starts with "meeting"
- `%meeting` - ends with "meeting"
- `%meeting%` - contains "meeting" anywhere
OR condition:
or_(
    Task.title.ilike("%meeting%"),
    Task.description.ilike("%meeting%")
)
SQL generated:
WHERE (title ILIKE '%meeting%' OR description ILIKE '%meeting%')
Searches both title AND description!
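Semantically, `ILIKE '%term%'` across two columns boils down to a case-insensitive substring test with OR. A plain-Python equivalent (the `search_tasks` helper and sample data are mine, for illustration):

```python
def search_tasks(tasks, term):
    """Plain-Python equivalent of: title ILIKE '%term%' OR description ILIKE '%term%'."""
    t = term.lower()
    return [
        task for task in tasks
        if t in task["title"].lower() or t in (task["description"] or "").lower()
    ]

tasks = [
    {"title": "Team Meeting", "description": None},
    {"title": "Groceries", "description": "buy snacks for the meeting"},
    {"title": "Project discussion", "description": "roadmap"},
]
print([t["title"] for t in search_tasks(tasks, "MEETING")])
# ['Team Meeting', 'Groceries']
```

The difference in production: the database does this matching server-side, so only matching rows cross the wire.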
Step 5: Dynamic Sorting
valid_sort_fields = ["created_at", "updated_at", "title", "completed"]
if sort_by not in valid_sort_fields:
    sort_by = "created_at"  # Fallback to safe default

column = getattr(models.Task, sort_by)
if order == "desc":
    query = query.order_by(desc(column))
else:
    query = query.order_by(asc(column))
Why validate sort fields?
Without validation (DANGEROUS):
# User sends malicious request
GET /tasks?sort_by=hashed_password
column = getattr(Task, "hashed_password") # Sort order leaks information about password hashes!
query.order_by(column)
With validation (SAFE):
valid_fields = ["created_at", "title", "completed"]
if sort_by not in valid_fields:
    sort_by = "created_at"  # Use safe default
# User sends malicious request
GET /tasks?sort_by=hashed_password
# sort_by not in valid_fields
# Defaults to "created_at"
# Attack prevented!
Using getattr dynamically:
# Static (hardcoded)
if sort_by == "title":
    query.order_by(Task.title)
elif sort_by == "created_at":
    query.order_by(Task.created_at)
# ... many elif statements

# Dynamic (elegant)
column = getattr(Task, sort_by)
query.order_by(column)
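The whitelist-plus-`getattr` pattern works anywhere, not just in SQLAlchemy. A self-contained sketch using a simple dataclass (the `Task` dataclass and `sort_tasks` helper here are simplified stand-ins, not the API's models):

```python
from dataclasses import dataclass

@dataclass
class Task:
    title: str
    created_at: int  # simplified: timestamps as ints

VALID_SORT_FIELDS = {"title", "created_at"}

def sort_tasks(tasks, sort_by, order="desc"):
    if sort_by not in VALID_SORT_FIELDS:  # whitelist, exactly like the endpoint
        sort_by = "created_at"
    return sorted(tasks, key=lambda t: getattr(t, sort_by),
                  reverse=(order == "desc"))

tasks = [Task("b", 1), Task("a", 2)]
print([t.title for t in sort_tasks(tasks, "title", "asc")])     # ['a', 'b']
print([t.title for t in sort_tasks(tasks, "hashed_password")])  # falls back to created_at desc
```

A hostile `sort_by` value never reaches `getattr` - it's replaced by the safe default first.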
Step 6: Pagination
query = query.offset(skip).limit(limit)
How pagination works:
# Page 1 (records 1-20)
skip=0, limit=20 # SELECT * FROM tasks LIMIT 20 OFFSET 0
# Page 2 (records 21-40)
skip=20, limit=20 # SELECT * FROM tasks LIMIT 20 OFFSET 20
# Page 3 (records 41-60)
skip=40, limit=20 # SELECT * FROM tasks LIMIT 20 OFFSET 40
Calculating skip:
page_number = 3
page_size = 20
skip = (page_number - 1) * page_size # (3-1) * 20 = 40
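Wrapped as a tiny helper (my naming, for illustration), the page-to-skip translation is:

```python
def page_to_skip(page_number: int, page_size: int = 20) -> int:
    """Translate a 1-based page number into the `skip` offset."""
    return (page_number - 1) * page_size

print(page_to_skip(1))  # 0
print(page_to_skip(3))  # 40
```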
Frontend implementation:
// React example
const [page, setPage] = useState(1);
const pageSize = 20;
fetch(`/tasks?skip=${(page - 1) * pageSize}&limit=${pageSize}`)
Step 7: Statistics Endpoint
@router.get("/stats", response_model=schemas.TaskStatsResponse)
def get_task_stats(
    db: Session = Depends(get_db),
    current_user: models.User = Depends(get_current_user)
):
    # Base query
    user_tasks = db.query(models.Task).filter(
        models.Task.owner_id == current_user.id
    )

    # Total count
    total = user_tasks.count()

    # Completed count
    completed = user_tasks.filter(models.Task.completed == True).count()

    # Pending count
    pending = total - completed

    # Completion rate (prevent division by zero!)
    completion_rate = (completed / total * 100) if total else 0

    # Tasks created today
    today_start = datetime.utcnow().replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    created_today = user_tasks.filter(
        models.Task.created_at >= today_start
    ).count()

    # Tasks created this week
    week_start = today_start - timedelta(days=today_start.weekday())
    created_this_week = user_tasks.filter(
        models.Task.created_at >= week_start
    ).count()

    return {
        "total": total,
        "completed": completed,
        "pending": pending,
        "completion_rate": round(completion_rate, 2),
        "created_today": created_today,
        "created_this_week": created_this_week
    }
Understanding the date calculations:
Today start (midnight):
today_start = datetime.utcnow().replace(hour=0, minute=0, second=0, microsecond=0)
# If now is: 2026-03-05 14:30:00
# today_start: 2026-03-05 00:00:00
Week start (Monday):
# weekday() returns 0=Monday, 6=Sunday
today_start.weekday() # If Tuesday: returns 1
week_start = today_start - timedelta(days=1)
# Tuesday - 1 day = Monday
Edge case handling:
# Bad (crashes on empty task list)
completion_rate = (completed / total * 100) # ZeroDivisionError!
# Good (handles zero total)
completion_rate = (completed / total * 100) if total else 0
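The two date calculations are easy to verify in isolation. Here I pin "now" to the example timestamp from above so the result is repeatable (2026-03-05 happens to be a Thursday, so `weekday()` returns 3):

```python
from datetime import datetime, timedelta

now = datetime(2026, 3, 5, 14, 30)  # fixed "now" so the output is repeatable

# Midnight of the current day
today_start = now.replace(hour=0, minute=0, second=0, microsecond=0)

# Monday of the current week: subtract weekday() days (0=Monday ... 6=Sunday)
week_start = today_start - timedelta(days=today_start.weekday())

print(today_start)  # 2026-03-05 00:00:00
print(week_start)   # 2026-03-02 00:00:00 (the Monday of that week)
```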
Step 8: Database Indexing
Update models.py:
class Task(Base):
    __tablename__ = "tasks"

    # Add index=True to frequently queried columns
    title = Column(String(100), nullable=False, index=True)
    completed = Column(Boolean, default=False, index=True)
    created_at = Column(DateTime, default=datetime.utcnow, index=True)
    owner_id = Column(UUID(as_uuid=True), ForeignKey("users.id"), index=True)

    # Composite indexes for common query patterns
    __table_args__ = (
        Index('idx_owner_completed', 'owner_id', 'completed'),
        Index('idx_owner_created', 'owner_id', 'created_at'),
    )
Why indexes matter:
Without index:
SELECT * FROM tasks WHERE completed = false;
-- Database scans every row (full table scan)
-- 1,000 rows: ~50ms
-- 100,000 rows: ~5,000ms (5 seconds!)
With index:
SELECT * FROM tasks WHERE completed = false;
-- Database uses the index (B-tree lookup)
-- 1,000 rows: ~5ms
-- 100,000 rows: ~50ms
~100x faster with indexes!
Composite indexes:
Index('idx_owner_completed', 'owner_id', 'completed')
Optimizes queries like:
WHERE owner_id = 'uuid' AND completed = false
Common in our API:
query.filter(Task.owner_id == current_user.id, Task.completed == False)
Index trade-offs:
Pros:
- ✅ Much faster queries
- ✅ Better user experience
Cons:
- ❌ Slightly slower writes (index must be updated)
- ❌ More disk space (~10-30% overhead)
Worth it for read-heavy applications!
Testing the Advanced Features
Test 1: Filtering
# Get pending tasks
curl -H "Authorization: Bearer TOKEN" \
"http://localhost:8000/tasks?completed=false"
# Get completed tasks
curl -H "Authorization: Bearer TOKEN" \
"http://localhost:8000/tasks?completed=true"
Test 2: Sorting
# Newest first
curl -H "Authorization: Bearer TOKEN" \
"http://localhost:8000/tasks?sort_by=created_at&order=desc"
# Alphabetical by title
curl -H "Authorization: Bearer TOKEN" \
"http://localhost:8000/tasks?sort_by=title&order=asc"
Test 3: Search
# Find tasks with "meeting"
curl -H "Authorization: Bearer TOKEN" \
"http://localhost:8000/tasks?search=meeting"
# Case doesn't matter!
curl -H "Authorization: Bearer TOKEN" \
"http://localhost:8000/tasks?search=MEETING" # Same results
Test 4: Pagination
# First page
curl -H "Authorization: Bearer TOKEN" \
"http://localhost:8000/tasks?skip=0&limit=20"
# Second page
curl -H "Authorization: Bearer TOKEN" \
"http://localhost:8000/tasks?skip=20&limit=20"
Test 5: Combined Query
# Pending tasks with "project", sorted by date, page 1
curl -H "Authorization: Bearer TOKEN" \
"http://localhost:8000/tasks?completed=false&search=project&sort_by=created_at&order=desc&limit=10"
Test 6: Statistics
curl -H "Authorization: Bearer TOKEN" \
"http://localhost:8000/tasks/stats"
Response:
{
"total": 25,
"completed": 10,
"pending": 15,
"completion_rate": 40.0,
"created_today": 3,
"created_this_week": 8
}
Performance Comparison
Before (Day 3):
# Get all tasks (no pagination)
SELECT * FROM tasks WHERE owner_id = 'uuid'
# Returns: 1,000 tasks
# Time: 500ms
# Data transfer: 500KB
After (Day 4):
# Get filtered, sorted, paginated tasks
SELECT * FROM tasks
WHERE owner_id = 'uuid' AND completed = false
ORDER BY created_at DESC
LIMIT 20 OFFSET 0
# Returns: 20 tasks
# Time: 50ms (with indexes)
# Data transfer: 10KB
Improvement: 10x faster, 50x less data!
Real-World Impact
Before advanced queries:
- User has 5,000 tasks
- Page loads all 5,000 (slow, crashes mobile apps)
- Can't find specific tasks
- No insights into productivity
After advanced queries:
- Loads 20 tasks at a time (fast on any device)
- Search finds tasks instantly
- Filter shows only what's needed
- Statistics provide insights
This transforms the API from proof-of-concept to product.
Common Mistakes I Avoided
Mistake 1: Not Checking completed is not None
# Bad - breaks for completed=false
if completed:
    query.filter(Task.completed == completed)

# Good - works for all cases
if completed is not None:
    query.filter(Task.completed == completed)
Mistake 2: No Sort Field Validation
# Bad - exposes any model column (data-disclosure risk)
column = getattr(Task, sort_by)  # User could send "hashed_password"

# Good - whitelist validation
valid_fields = ["created_at", "title"]
if sort_by not in valid_fields:
    sort_by = "created_at"
column = getattr(Task, sort_by)
Mistake 3: Unlimited Pagination
# Bad - user can request millions of records
limit: int = Query(20)
# Good - constrain to reasonable max
limit: int = Query(20, ge=1, le=100)
Mistake 4: Case-Sensitive Search
# Bad - won't match "Meeting" when searching "meeting"
Task.title.like(f"%{search}%")
# Good - case-insensitive
Task.title.ilike(f"%{search}%")
Mistake 5: Division by Zero
# Bad - crashes on empty task list
completion_rate = (completed / total * 100)
# Good - handle zero case
completion_rate = (completed / total * 100) if total else 0
What I Learned
Technical Skills
✅ Advanced SQLAlchemy query composition
✅ FastAPI Query parameter validation
✅ Database indexing strategies
✅ Pagination implementation
✅ Case-insensitive search (ILIKE)
✅ Aggregation queries (COUNT, date filters)
✅ Dynamic column access with getattr
Conceptual Understanding
✅ Query chaining and lazy evaluation
✅ Index trade-offs (speed vs space)
✅ Security via input validation
✅ Edge case handling (zero division)
✅ Performance optimization strategies
API Design Patterns
✅ Standard query parameter conventions
✅ Pagination with skip/limit
✅ Multi-field sorting
✅ Combined query parameters
✅ Statistics endpoints for analytics
The "Aha!" Moments
1. Queries aren't executed until .all()
Building queries is cheap. Executing is expensive. Chain operations freely!
2. Indexes are like book indexes
Without index: Read every page to find "PostgreSQL"
With index: Jump directly to pages with "PostgreSQL"
3. Validation prevents attacks
Whitelisting sort fields isn't just best practice - it's security!
4. Query parameters compose beautifully
?completed=false&search=meeting&sort_by=created_at
Each parameter is independent but they work together!
Tomorrow's Plans
Day 5 will add:
- Task categories/tags (many-to-many relationship)
- Tag-based filtering
- Bulk operations (delete multiple, update multiple)
- Export functionality (CSV download)
But today? Today I celebrate building a production-ready query system that handles scale! 🎉
Resources That Helped
Full Code
Complete code on GitHub:
100-days-of-python/days/004
Time Investment: 2.5 hours
Knowledge Gained: Production query optimization skills
Feeling: Like I built something that actually scales 📊
Day 4/100 complete! ✅
Tomorrow: Many-to-many relationships and bulk operations
Following my journey?
📝 Blog | 🐦 Twitter | 💼 LinkedIn
Tags: #100DaysOfCode #Python #FastAPI #PostgreSQL #SQLAlchemy #BackendDevelopment #DatabaseOptimization #APIDesign #PerformanceTuning #LearningInPublic