Why Your Image Upload Pipeline Should Check for Physically Impossible Lighting
If you're building user-generated content platforms, marketplace verification systems, or anything that ingests images from untrusted sources, you've probably noticed the synthetic media problem getting worse. Gen-AI tools have become good enough that casual users can't spot the fakes anymore.
But here's the thing: the physics still breaks. And if you know what to look for, you can build surprisingly effective validation layers using simple computer vision techniques.
The Shadow Problem: Your First Line of Defence
In any real photograph lit by a single source, every shadow is consistent with that one source's position. This is basic physics: light travels in straight lines. AI image generators compose scenes from learned patterns rather than simulated physics, so they often get this wrong.
Here's what you can check programmatically:
1. Shadow Direction Consistency
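Extract a shadow vector for each object in the scene: the line from the tip of its shadow back through the point that cast it. In a genuine photo lit by a single source, these lines should converge on one image point, the projection of the light source. With a distant source such as the sun, that point can lie far outside the frame, so the lines may look almost parallel.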
    # Pseudocode for shadow vector analysis
    # (segment_objects, calculate_shadow_angle, calculate_convergence and
    # THRESHOLD are placeholders, not functions from a real library)
    def analyse_shadow_consistency(image):
        objects = segment_objects(image)
        shadow_vectors = []
        for obj in objects:
            if obj.has_shadow():
                vector = calculate_shadow_angle(obj, obj.shadow)
                shadow_vectors.append(vector)

        # Check if vectors converge within tolerance
        convergence_score = calculate_convergence(shadow_vectors)
        return convergence_score > THRESHOLD
This won't catch everything, but it'll flag a surprising number of AI-generated images, especially from earlier-generation models or rushed prompts.
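As a minimal sketch of what the convergence step might look like: assume an earlier stage has already produced matched image points, one on each object and one at the tip of its shadow. The code fits the single point closest to all the lines through those pairs and reports how far the lines miss it. The name `light_convergence`, its inputs, and the example coordinates are hypothetical; only NumPy is used.

```python
import numpy as np

def light_convergence(object_pts, shadow_pts):
    """Least-squares meeting point of the lines shadow_pt -> object_pt.

    Returns (point, rms): the image point closest to all lines, and the
    RMS perpendicular distance (in pixels) from that point to the lines.
    """
    p = np.asarray(object_pts, dtype=float)
    s = np.asarray(shadow_pts, dtype=float)
    d = p - s
    d /= np.linalg.norm(d, axis=1, keepdims=True)  # unit direction per line

    # (I - d d^T) projects a vector onto the normal of each line
    proj = np.eye(2) - d[:, :, None] * d[:, None, :]
    A = proj.sum(axis=0)
    b = np.einsum("nij,nj->i", proj, p)
    point, *_ = np.linalg.lstsq(A, b, rcond=None)

    # Perpendicular offset from the fitted point to every line
    offsets = np.einsum("nij,nj->ni", proj, point - p)
    rms = float(np.sqrt((offsets ** 2).sum(axis=1).mean()))
    return point, rms

# Synthetic example: three shadows all cast by one light at (100, -200)
light = np.array([100.0, -200.0])
tops = np.array([[0.0, 0.0], [50.0, 10.0], [200.0, 30.0]])
tips = tops + 0.5 * (tops - light)  # each tip lies on the ray light -> top
point, rms = light_convergence(tops, tips)
print(point, rms)
```

A small residual means the shadows agree on one light position; a large one is a hint, not proof. Nearly parallel lines (a distant sun) make the fit ill-conditioned, so in that case comparing the shadow angles directly is likely the more stable check.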
2. Shadow Intensity Relative to Distance
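With any light source of finite size, a shadow's edge is sharpest where it meets the object casting it and grows softer and lighter with distance. AI generators often produce shadows with uniform intensity and edge sharpness throughout, or inconsistent softness between objects at similar distances.
You can measure this with edge detection and gradient analysis across shadow boundaries: sample intensity profiles across the shadow edge at several distances from the object and compare how wide the dark-to-light transition is in each.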
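A rough sketch of the gradient side of that measurement follows. It assumes you already have 1-D intensity profiles sampled across a shadow edge, running from dark to light (something like scikit-image's `profile_line` could produce them). The helper name `edge_width` and the synthetic profiles are illustrative only.

```python
import numpy as np

def edge_width(profile, lo=0.1, hi=0.9):
    """Width, in samples, of the 10%-90% rise of a dark-to-light edge profile."""
    x = np.asarray(profile, dtype=float)
    span = x.max() - x.min()
    if span == 0:
        raise ValueError("profile is flat; no edge to measure")
    norm = (x - x.min()) / span
    # First sample at which the normalised profile reaches each level
    i_lo = int(np.argmax(norm >= lo))
    i_hi = int(np.argmax(norm >= hi))
    return i_hi - i_lo

# Synthetic profiles: a hard edge near the object, a soft edge farther away
t = np.arange(100)
near = np.clip((t - 50) / 2.0, 0.0, 1.0)
far = np.clip((t - 50) / 20.0, 0.0, 1.0)
widths = [edge_width(p) for p in (near, far)]
print(widths)
```

Widths that stay flat or shrink as you move away from the object are worth a closer look. Real profiles are noisy, so smoothing each one before measuring is probably necessary in practice.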
Reflection Geometry: The Silent Tell
Reflections are even harder for generators to get right. Water reflections, glass surfaces, and metallic objects all follow strict geometric rules.
What to check:
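- On a horizontal surface such as still water, the reflection should be a top-to-bottom mirror image, with each reflected point directly below the point it mirrors
- The angle of reflection must equal the angle of incidence, so what shows up in a reflection depends on where the camera is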
- Environmental lighting in reflections should match the scene lighting
A simple geometric validation:
    def validate_reflection(surface_region, reflected_region):
        # Extract key points from both regions
        surface_features = extract_keypoints(surface_region)
        reflection_features = extract_keypoints(reflected_region)

        # Check if reflection obeys mirror symmetry
        symmetry_score = calculate_mirror_symmetry(
            surface_features,
            reflection_features
        )
        return symmetry_score > MIN_SYMMETRY_THRESHOLD
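One possible stand-in for `calculate_mirror_symmetry`, as a sketch under stated assumptions: the keypoints have already been matched into pairs, the reflecting surface is horizontal, the camera is roughly level, and image y increases downward. Under those assumptions each reflected point should sit in about the same column as its source and below it. The `tol_px` tolerance is a made-up default.

```python
import numpy as np

def calculate_mirror_symmetry(points, reflected, tol_px=5.0):
    """Fraction of matched pairs consistent with a horizontal mirror.

    points, reflected: (n, 2) arrays of (x, y) pixel coordinates, where
    reflected[i] is the candidate reflection of points[i].
    """
    pts = np.asarray(points, dtype=float)
    ref = np.asarray(reflected, dtype=float)
    if pts.shape != ref.shape or pts.ndim != 2 or pts.shape[1] != 2:
        raise ValueError("expected two (n, 2) arrays of matched points")
    if len(pts) == 0:
        return 0.0
    same_column = np.abs(pts[:, 0] - ref[:, 0]) <= tol_px  # vertically aligned
    below = ref[:, 1] > pts[:, 1]                          # reflection is lower
    return float(np.mean(same_column & below))

scene = [[10, 20], [50, 40], [90, 10]]
mirrored = [[10, 180], [51, 160], [90, 190]]
print(calculate_mirror_symmetry(scene, mirrored))
```

The score lands in the range 0 to 1, so it can be compared against a threshold the way `validate_reflection` does. A tilted camera or a rippled surface breaks the same-column assumption, so treat low scores as a reason to look closer rather than a verdict.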
Where This Actually Matters in Production
This matters most if you're building:
- Marketplace verification systems: preventing fake product photos that don't represent real inventory. One e-commerce platform I know ran into this when sellers started using AI to generate "lifestyle" product shots that looked professional but didn't match the actual items.
- Content moderation pipelines: flagging synthetic profile pictures in identity verification flows, or detecting manipulated images in insurance claims.
- Media asset management: automatically tagging AI-generated images in your DAM system so teams know what they're working with.
You don't need perfect detection. You need a confidence score that feeds into your review queue prioritisation.
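To make the prioritisation idea concrete, here is a small sketch of a review queue ordered by score, using only the standard library. It assumes each image carries an authenticity score between 0 and 1, where lower means more suspicious. `ReviewItem` and `ReviewQueue` are hypothetical names.

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class ReviewItem:
    authenticity: float                   # lower = more suspicious
    image_id: str = field(compare=False)  # not used for ordering

class ReviewQueue:
    """Hands reviewers the most suspicious image first."""

    def __init__(self):
        self._heap = []

    def push(self, image_id, authenticity):
        heapq.heappush(self._heap, ReviewItem(authenticity, image_id))

    def pop(self):
        return heapq.heappop(self._heap).image_id

    def __len__(self):
        return len(self._heap)

queue = ReviewQueue()
queue.push("img-a", 0.60)
queue.push("img-b", 0.20)
queue.push("img-c", 0.45)
print([queue.pop() for _ in range(len(queue))])
```

Because the heap pops the lowest score first, reviewers spend their time on the images most likely to be synthetic even when the queue is never fully drained.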
The Detection Tool Ecosystem
Before you roll your own, know what exists:
- Hive Moderation and Illuminarty offer API-based detection with probabilistic scoring
- Google's SynthID watermarks AI-generated content at generation time (only useful if the generator cooperates)
- Open-source detection models hosted on Hugging Face give you full control but require more infrastructure
These tools are useful as part of a layered approach, but they're not silver bullets. False positive rates are still high enough that you'll need human review for edge cases.
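A quick back-of-the-envelope calculation shows why human review stays necessary. The numbers below are purely illustrative, not measurements of any real detector. The function applies Bayes' rule to work out what share of flagged images are actually synthetic.

```python
def flag_precision(prevalence, true_positive_rate, false_positive_rate):
    """Share of flagged images that are truly synthetic."""
    flagged_fake = prevalence * true_positive_rate
    flagged_real = (1.0 - prevalence) * false_positive_rate
    total = flagged_fake + flagged_real
    if total == 0:
        raise ValueError("nothing would be flagged with these rates")
    return flagged_fake / total

# Hypothetical inputs: 2% of uploads are synthetic, the detector catches
# 90% of them, and it wrongly flags 5% of genuine photos.
print(round(flag_precision(0.02, 0.90, 0.05), 2))
```

With those made-up inputs, only about 27% of flagged images would actually be synthetic, because genuine uploads vastly outnumber fakes. That is the argument for routing flags to a review queue instead of rejecting automatically.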
Spotting AI-generated imagery has become a critical organisational competency, so it is worth taking a deeper look at the specific physical tells and how they manifest across different generators.
Building a Practical Validation Layer
Here's a sensible architecture:
- Fast rejection filters: basic checks for impossible lighting and shadows using CV libraries (OpenCV, scikit-image)
- API-based scoring: send suspicious images to a detection service
- Human review queue: images above a certain suspicion threshold get reviewed
- Feedback loop: feed confirmed cases back into your filters
    async def validate_uploaded_image(image_file):
        # As written, both scores are authenticity scores in [0, 1]: higher
        # means more likely real. If your detection API reports the probability
        # that an image is AI-generated, invert it first.
        # check_lighting_consistency and detection_service are placeholders.

        # Layer 1: Fast physics checks
        physics_score = await check_lighting_consistency(image_file)
        if physics_score < 0.3:  # Clearly fake
            return {"status": "rejected", "reason": "lighting_anomaly"}
        if physics_score > 0.8:  # Probably real
            return {"status": "approved", "confidence": physics_score}

        # Layer 2: API-based detection for ambiguous cases
        api_score = await detection_service.analyse(image_file)
        if api_score < 0.5:
            return {"status": "review_queue", "scores": {"physics": physics_score, "api": api_score}}
        return {"status": "approved", "confidence": min(physics_score, api_score)}
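The 0.3 and 0.8 cut-offs above are fixed numbers. One way the feedback-loop step could work is to re-derive them from reviewer-confirmed cases. The sketch below assumes you log each reviewed image's physics score alongside the reviewer's verdict. `recalibrate_thresholds` and `max_error` are hypothetical names, and the quantile rule is just one reasonable choice.

```python
import numpy as np

def recalibrate_thresholds(scores, is_real, max_error=0.01):
    """Pick (reject_below, approve_above) from reviewer-confirmed labels.

    reject_below: score under which about max_error of real images fall.
    approve_above: score over which about max_error of fake images fall.
    """
    scores = np.asarray(scores, dtype=float)
    is_real = np.asarray(is_real, dtype=bool)
    real, fake = scores[is_real], scores[~is_real]
    if real.size == 0 or fake.size == 0:
        raise ValueError("need confirmed examples of both classes")

    reject_below = float(np.quantile(real, max_error))
    approve_above = float(np.quantile(fake, 1.0 - max_error))

    # Well-separated classes make the cut-offs cross; collapse them to one
    # boundary so the ambiguous band never has negative width.
    if reject_below > approve_above:
        reject_below = approve_above = (reject_below + approve_above) / 2.0
    return reject_below, approve_above

# Synthetic review log: real images score high, fakes score low, some overlap
real_scores = np.linspace(0.6, 1.0, 101)
fake_scores = np.linspace(0.0, 0.7, 101)
scores = np.concatenate([real_scores, fake_scores])
labels = np.concatenate([np.ones(101, dtype=bool), np.zeros(101, dtype=bool)])
print(recalibrate_thresholds(scores, labels))
```

Anything between the two returned values would still go to the API layer or the review queue. The more the two classes overlap, the wider that ambiguous band becomes.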
The Bottom Line
You can't stop AI-generated images from being uploaded. But you can build systems that flag the physically impossible ones before they cause problems downstream. If you're working on platforms where image authenticity matters—or you're helping clients navigate this space through AI automation and software development—basic physics checks should be in your validation pipeline.
The generators will get better. But until they're simulating actual light physics rather than pattern-matching, the tells will remain.