Optimizing Django Generic Relations: From N+1 to 2 Queries
Problem
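Accessing a generic relation inside a loop causes N+1 queries: each row lazily loads its content object, and each content object then lazily loads its owner.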
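from django.contrib.contenttypes.fields import GenericForeignKey
from django.contrib.contenttypes.models import ContentType
from django.db.models import CASCADE, CharField, ForeignKey, Model

class Enrollment(Model):
    user = ForeignKey(User, CASCADE)
    content_type = ForeignKey(ContentType, CASCADE)
    content_id = CharField(max_length=36)
    content = GenericForeignKey("content_type", "content_id")

enrollments = Enrollment.objects.filter(user=user)
for e in enrollments:
    print(e.content.title)  # Query for each!
    print(e.content.owner.name)  # Another query for each!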
10 enrollments = 21 queries (1 + 10 + 10)
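To make the arithmetic concrete, here is a minimal sketch that counts round trips with a stand-in object instead of a real database. `FakeDatabase` and `render_enrollments` are hypothetical names used only for illustration; nothing here touches Django.

```python
from dataclasses import dataclass


@dataclass
class FakeDatabase:
    """Stand-in for a database connection that only counts round trips."""
    queries: int = 0

    def fetch(self, what: str) -> str:
        self.queries += 1
        return what


def render_enrollments(db: FakeDatabase, enrollment_count: int) -> int:
    db.fetch("enrollments")                 # 1 query for the list itself
    for i in range(enrollment_count):
        db.fetch(f"content {i}")            # e.content: one lazy load per row
        db.fetch(f"owner of content {i}")   # e.content.owner: another per row
    return db.queries


print(render_enrollments(FakeDatabase(), 10))  # 21
```

The count grows linearly with the number of rows, which is why the loop gets slower as a user enrolls in more content.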
Common Solutions (still not optimal)
1. prefetch_related
Enrollment.objects.prefetch_related('content', 'content__owner')
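prefetch_related works with a generic FK, but it cannot join across content types, so it runs one query per type and then the owner lookups: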
- Query 1: enrollments
- Query 2: courses
- Query 3: media
- Query 4: course owners
- Query 5: media owners
= 5 queries
Better than N+1, but not optimal.
2. Manual bulk fetch
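# select_related("content_type") avoids a ContentType query per enrollment below
enrollments = list(Enrollment.objects.filter(user=user).select_related("content_type"))
course_ids = [e.content_id for e in enrollments if e.content_type.model == 'course']
media_ids = [e.content_id for e in enrollments if e.content_type.model == 'media']
courses = Course.objects.filter(id__in=course_ids).select_related('owner')
media = Media.objects.filter(id__in=media_ids).select_related('owner')
Still 3 queries: 1 for enrollments plus 1 per model. select_related folds each owner into its model's query, so the count still grows with every content type.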
Solution: Union + JSONObject
Key idea: Combine all content types into a single query.
Why Union?
Different models (Course, Media) can be fetched in one query if every branch returns the same columns in the same order.
Why JSONObject?
A union across models is built from .values() rows, so select_related() cannot attach deep objects to the result.
JSONObject lets us include related data (like owner) in the same query.
Without JSONObject:
Query 1: Union(courses, media)
Query 2: Get owners
With JSONObject:
Query 1: Union(courses + owners, media + owners)
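The idea can be tried outside Django. The sketch below uses the standard-library `sqlite3` module and hand-written SQL to show roughly what "union plus packed owner" means at the database level. The table layout, the sample rows, and the `fetch_contents` helper are all assumptions made for illustration, and it assumes the bundled SQLite includes the JSON functions (most builds do). It is not the SQL Django generates.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE owner  (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE course (id TEXT PRIMARY KEY, title TEXT, owner_id TEXT);
    CREATE TABLE media  (id TEXT PRIMARY KEY, title TEXT, owner_id TEXT);
    INSERT INTO owner  VALUES ('u1', 'Ada'), ('u2', 'Linus');
    INSERT INTO course VALUES ('c1', 'Intro to SQL', 'u1');
    INSERT INTO media  VALUES ('m1', 'Indexing talk', 'u2');
""")

# Both branches return the same three columns, so they can be combined.
# json_object() packs the joined owner row into a single column.
UNION_SQL = """
    SELECT c.id, c.title, json_object('id', o.id, 'name', o.name) AS owner_obj
    FROM course c JOIN owner o ON o.id = c.owner_id
    UNION ALL
    SELECT m.id, m.title, json_object('id', o.id, 'name', o.name) AS owner_obj
    FROM media m JOIN owner o ON o.id = m.owner_id
"""


def fetch_contents(connection):
    """Run the single union query and index the rows by content id."""
    rows = connection.execute(UNION_SQL).fetchall()
    return {
        row[0]: {"id": row[0], "title": row[1], "owner_obj": json.loads(row[2])}
        for row in rows
    }


contents = fetch_contents(conn)
print(contents["m1"]["owner_obj"]["name"])  # Linus
```

One statement returns both content types with their owners already attached, which is the shape the ORM version in the following steps aims for.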
Step 1: Group by content type
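from collections import defaultdict

# select_related("content_type") keeps this at one query; without it,
# e.content_type below would hit the database once per enrollment
enrollments = list(
    Enrollment.objects.filter(user=user).select_related("content_type")
)
content_ids = defaultdict(set)
for e in enrollments:
    key = (e.content_type.app_label, e.content_type.model)
    content_ids[key].add(e.content_id)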
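The grouping step is plain Python and can be checked in isolation. This is a small sketch with a hypothetical `EnrollmentRow` tuple standing in for the model instances; the app labels and ids are made up.

```python
from collections import defaultdict
from typing import NamedTuple


class EnrollmentRow(NamedTuple):
    """Hypothetical stand-in for the fields the grouping loop reads."""
    app_label: str
    model: str
    content_id: str


def group_ids_by_type(rows):
    grouped = defaultdict(set)
    for row in rows:
        grouped[(row.app_label, row.model)].add(row.content_id)
    return grouped


rows = [
    EnrollmentRow("courses", "course", "c1"),
    EnrollmentRow("library", "media", "m1"),
    EnrollmentRow("courses", "course", "c2"),
    EnrollmentRow("courses", "course", "c1"),  # duplicate id collapses in the set
]
grouped = group_ids_by_type(rows)
print(sorted(grouped[("courses", "course")]))  # ['c1', 'c2']
print(grouped.get(("exams", "exam")))          # None
```

Using `.get()` for the lookup matters with a `defaultdict`: indexing with `[]` would silently create an empty set for a content type that has no enrollments, while `.get()` returns `None` and lets the caller skip that model.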
Step 2: Build union query with JSONObject
JSONObject embeds related objects as JSON:
from django.db.models import F
from django.db.models.functions import JSONObject

union_qs = []
for Model in [Course, Media, Exam]:
    key = (Model._meta.app_label, Model._meta.model_name)
    ids = content_ids.get(key)
    if not ids:
        continue
    union_qs.append(
        Model.objects
        .filter(id__in=ids)
        .annotate(
            owner_obj=JSONObject(  # Pack owner fields into JSON
                id=F("owner__id"),
                name=F("owner__name"),
                email=F("owner__email"),
                avatar=F("owner__avatar")
            )
        )
        .values(
            "id",
            "title",
            "description",
            "thumbnail",
            "owner_obj"  # Include packed owner
        )
    )

# Single query for all content types; skipped when nothing matched
contents = {}
if union_qs:
    qs = union_qs[0].union(*union_qs[1:], all=True)
    # str() so the keys compare equal to the CharField content_id;
    # this assumes ids are unique across models (e.g. UUIDs)
    contents = {str(content["id"]): content for content in qs}
Union requires all queries to return the same fields.
JSONObject packs related data into a single field so union works.
Step 3: Attach to enrollments
Reconstruct objects from JSON:
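for e in enrollments:
    content_data = contents.get(e.content_id)
    if not content_data:
        continue
    # Copy first: pop() would otherwise strip owner_obj from the shared row
    data = dict(content_data)
    # Reconstruct owner from JSON
    owner = User(**data.pop("owner_obj"))
    # Reconstruct content
    Model = e.content_type.model_class()
    # Assign through the field so e.content is served from its cache;
    # recent Django versions do not read a _content_cache attribute
    e.content = Model(**data, owner=owner)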
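The reconstruction step can be sketched without Django by using dataclasses in place of models. `Owner`, `Course`, and `rebuild` are hypothetical names for illustration only. The point of the sketch is the copy before `pop()`: the row dictionary is shared, so popping from it directly would break a second lookup of the same content.

```python
from dataclasses import dataclass


@dataclass
class Owner:
    id: str
    name: str


@dataclass
class Course:
    id: str
    title: str
    owner: Owner


def rebuild(model_cls, row):
    """Turn one union row back into an object graph."""
    data = dict(row)  # copy so the cached row is not mutated by pop()
    owner = Owner(**data.pop("owner_obj"))
    return model_cls(**data, owner=owner)


row = {"id": "c1", "title": "Intro to SQL", "owner_obj": {"id": "u1", "name": "Ada"}}
course = rebuild(Course, row)
print(course.owner.name)   # Ada
print("owner_obj" in row)  # True
```

Because the original row is left intact, calling `rebuild` twice on the same row works, which covers the case where two enrollments point at the same content.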
Complete Example
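from collections import defaultdict

from django.db.models import F
from django.db.models.functions import JSONObject

@classmethod
async def get_enrolled(cls, *, user_id: str, page: int, size: int):
    # Query 1: Get one page of enrollments (page is assumed to be 1-based).
    # select_related avoids a ContentType query per row in the loop below.
    offset = (page - 1) * size
    page_qs = (
        cls.objects.filter(user_id=user_id)
        .select_related("content_type")
        .order_by("pk")[offset:offset + size]
    )
    enrollments = [e async for e in page_qs]
    if not enrollments:
        return []

    # Group by content type
    content_ids = defaultdict(set)
    for e in enrollments:
        key = (e.content_type.app_label, e.content_type.model)
        content_ids[key].add(e.content_id)

    # Query 2: Union fetch with deep objects
    union_qs = []
    for Model in ENROLLABLE_MODELS:
        key = (Model._meta.app_label, Model._meta.model_name)
        ids = content_ids.get(key)
        if not ids:
            continue
        union_qs.append(
            Model.objects
            .filter(id__in=ids)
            .annotate(
                owner_obj=JSONObject(
                    id=F("owner__id"),
                    name=F("owner__name"),
                    email=F("owner__email")
                )
            )
            .values("id", "title", "description", "owner_obj")
        )
    contents = {}
    if union_qs:
        qs = union_qs[0].union(*union_qs[1:], all=True)
        contents = {str(c["id"]): c async for c in qs}

    # Attach: fill the generic FK cache directly, so no synchronous
    # ContentType lookup can be triggered inside this coroutine
    content_field = cls._meta.get_field("content")
    for e in enrollments:
        content_data = contents.get(e.content_id)
        if content_data:
            data = dict(content_data)  # copy so pop() leaves the shared row intact
            owner = User(**data.pop("owner_obj"))
            Model = e.content_type.model_class()
            content_field.set_cached_value(e, Model(**data, owner=owner))
    return enrollments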
Result
10 enrollments (3 courses, 7 media, each with owner):
N+1: 21 queries
├─ Query 1: enrollments
├─ Query 2-11: each content
└─ Query 12-21: each owner
prefetch_related: 5 queries
├─ Query 1: enrollments
├─ Query 2: courses
├─ Query 3: media
├─ Query 4: course owners
└─ Query 5: media owners
Union + JSONObject: 2 queries
├─ Query 1: enrollments
└─ Query 2: union(courses+owners, media+owners)
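The three totals follow simple formulas, sketched below using the same counting as the breakdown above. `query_counts` is a hypothetical helper, and the prefetch figure assumes one content query and one owner query per content type.

```python
def query_counts(enrollments: int, content_types: int) -> dict:
    """Round trips per strategy, following the breakdown above."""
    return {
        "n_plus_one": 1 + 2 * enrollments,        # list + content + owner per row
        "prefetch_related": 1 + 2 * content_types,  # list + content + owners per type
        "union_json": 2,                           # list + one union query
    }


print(query_counts(10, 2))
# {'n_plus_one': 21, 'prefetch_related': 5, 'union_json': 2}
print(query_counts(10, 3)["prefetch_related"])  # 7
```

Adding a third content type raises the prefetch count to 7, while the union approach stays at 2.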
Key Techniques
Union for multiple models:
Course.objects.values(...).union(Media.objects.values(...))
Combines different models into one query. Requires same field structure.
JSONObject for deep relations:
.annotate(
    owner_obj=JSONObject(
        id=F("owner__id"),
        name=F("owner__name")
    )
)
Packs related object fields into JSON so union works.
Reconstruct from JSON:
owner = User(**content_data.pop("owner_obj"))
content = Model(**content_data, owner=owner)
Unpack JSON back into Django model instances.
When to Use
- Generic Relations with multiple models
- Need deep object data (related objects)
- High-traffic queries
- Enrollment, notification, activity feed patterns
Trade-offs
Pros:
- Minimal queries (2 instead of N+1)
- Works with deep objects
- Scales with content types
Cons:
- More complex code
- Manual object reconstruction
- Union requires same field structure
For simple cases, prefetch_related is fine.
For performance-critical Generic FK queries, this is worth it.