Marc Newstead

Posted on May 18

Why Your Image Upload Pipeline Should Check for Physically Impossible Lighting

#ai #computervision #security #python

If you're building user-generated content platforms, marketplace verification systems, or anything that ingests images from untrusted sources, you've probably noticed the synthetic media problem getting worse. Gen-AI tools have become good enough that casual users can't spot the fakes anymore.

But here's the thing: the physics still breaks. And if you know what to look for, you can build surprisingly effective validation layers using simple computer vision techniques.

The Shadow Problem: Your First Line of Defence

In any real photograph, every shadow cast by the same light source obeys the same geometry, because light travels in straight lines. AI image generators compose scenes from learned patterns rather than simulated physics, so they often get this wrong.

Here's what you can check programmatically:

1. Shadow Direction Consistency

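Extract shadow vectors across different objects in the scene. In a genuine photo lit by a single source, the lines drawn from each shadow tip back through the point that casts it should all meet at one image point: the projected position of the light source, which may lie outside the frame.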

# Pseudocode for shadow vector analysis
def analyse_shadow_consistency(image):
    objects = segment_objects(image)
    shadow_vectors = []

    for obj in objects:
        if obj.has_shadow():
            vector = calculate_shadow_angle(obj, obj.shadow)
            shadow_vectors.append(vector)

    # Check if vectors converge within tolerance
    convergence_score = calculate_convergence(shadow_vectors)
    return convergence_score > THRESHOLD

This won't catch everything, but it'll flag a surprising number of AI-generated images, especially from earlier-generation models or rushed prompts.
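To make the convergence step less abstract, here is a minimal sketch in plain Python of one way calculate_convergence could be filled in. It assumes you already have, for each object, one point on the object and the matching point on its shadow (say, the top of a pole and the tip of its shadow); finding those pairs is the hard part and is not shown. The names estimate_light_point and mean_line_distance are my own, not from any library. Note that this returns a residual in pixels, so lower means more consistent, which is the opposite direction to convergence_score in the pseudocode.

```python
import math


def estimate_light_point(pairs):
    """Least-squares intersection of the lines through each
    (object_point, shadow_point) pair, in pixel coordinates.

    Returns (x, y), or None when the lines are close to parallel.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (ox, oy), (sx, sy) in pairs:
        dx, dy = ox - sx, oy - sy
        norm = math.hypot(dx, dy)
        if norm == 0:
            continue  # degenerate pair: no direction to work with
        dx, dy = dx / norm, dy / norm
        # I - d d^T keeps only the component perpendicular to the line
        m11, m12, m22 = 1 - dx * dx, -dx * dy, 1 - dy * dy
        a11 += m11
        a12 += m12
        a22 += m22
        b1 += m11 * sx + m12 * sy
        b2 += m12 * sx + m22 * sy
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-9:
        return None
    # Solve the 2x2 normal equations with Cramer's rule
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


def mean_line_distance(point, pairs):
    """Average perpendicular distance (pixels) from point to each line."""
    px, py = point
    total, count = 0.0, 0
    for (ox, oy), (sx, sy) in pairs:
        dx, dy = ox - sx, oy - sy
        norm = math.hypot(dx, dy)
        if norm == 0:
            continue
        # |cross(d, p - s)| / |d| is the point-to-line distance
        total += abs(dx * (py - sy) - dy * (px - sx)) / norm
        count += 1
    return total / count if count else 0.0
```

A None result is not a failure on its own: with a very distant source and little perspective, the lines can be nearly parallel, and in that case comparing shadow angles directly is the more sensible check. What counts as a large residual depends on image resolution, so treat any cut-off as something to tune on your own data.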

2. Shadow Intensity Relative to Distance

Real light sources have a finite size, so a shadow is sharpest and darkest where it meets the object casting it and grows softer and lighter farther away. AI generators often produce shadows with uniform intensity throughout, or inconsistent softness between objects at similar distances.

You can measure this with edge detection and gradient analysis across shadow boundaries.
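As a rough illustration of the gradient idea, the sketch below measures how many pixels a shadow edge takes to go from dark to lit. It assumes you have already sampled 1D brightness profiles perpendicular to the shadow boundary, each running from inside the shadow out into the light, ordered from nearest the object to farthest. The function names and the 10% to 90% convention are my choices for the example, not an established API.

```python
def transition_width(profile, lo=0.1, hi=0.9):
    """Samples needed for the profile to rise from lo to hi of its range.

    Assumes the profile runs from shadow (dark) into light (bright).
    """
    p_min, p_max = min(profile), max(profile)
    span = p_max - p_min
    if span == 0:
        return 0  # flat profile: no edge at all
    norm = [(v - p_min) / span for v in profile]
    first = next(i for i, v in enumerate(norm) if v >= lo)
    last = next(i for i, v in enumerate(norm) if v >= hi)
    return last - first


def softens_with_distance(profiles_near_to_far):
    """True if edge width never shrinks and ends wider than it starts."""
    if len(profiles_near_to_far) < 2:
        raise ValueError("need at least two profiles to compare")
    widths = [transition_width(p) for p in profiles_near_to_far]
    never_sharpens = all(a <= b for a, b in zip(widths, widths[1:]))
    return never_sharpens and widths[-1] > widths[0]
```

A False result is only a hint. Short shadows, low resolution, or heavy compression can all flatten the difference in a genuine photo, so this is better used as one input to a score than as a hard rule.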

Reflection Geometry: The Silent Tell

Reflections are even harder for generators to get right. Water reflections, glass surfaces, and metallic objects all follow strict geometric rules.

What to check:

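On a flat horizontal surface such as still water, each reflected point should sit directly below its source point, mirrored about the line where the object meets the surface
The angle of reflection must equal the angle of incidence, so what the reflection shows has to be consistent with the camera's viewpoint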
Environmental lighting in reflections should match the scene lighting

A simple geometric validation:

def validate_reflection(surface_region, reflected_region):
    # Extract key points from both regions
    surface_features = extract_keypoints(surface_region)
    reflection_features = extract_keypoints(reflected_region)

    # Check if reflection obeys mirror symmetry
    symmetry_score = calculate_mirror_symmetry(
        surface_features, 
        reflection_features
    )

    return symmetry_score > MIN_SYMMETRY_THRESHOLD
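The helpers in that snippet are placeholders. As one possible way to fill in the symmetry calculation, here is a small sketch that assumes a level camera, a flat horizontal reflector, and keypoints that have already been matched between object and reflection. Each entry carries the object point, its reflected point, and the image row where that object meets the surface; mirror_symmetry_score and the tol parameter are names I have made up for the example. Image coordinates are taken to have y increasing downward.

```python
def mirror_symmetry_score(triples, tol=3.0):
    """Fraction of matched points that obey mirror symmetry.

    triples: list of (object_pt, reflection_pt, contact_y), where the
    points are (x, y) pixel coordinates and contact_y is the row where
    the object touches the reflecting surface.
    """
    if not triples:
        return 0.0
    consistent = 0
    for (ox, oy), (rx, ry), contact_y in triples:
        # Reflection should share the same column as its source point
        column_error = abs(ox - rx)
        # Height above the surface should equal depth below it
        height_error = abs((contact_y - oy) - (ry - contact_y))
        if column_error <= tol and height_error <= tol:
            consistent += 1
    return consistent / len(triples)
```

The contact row is stored per object because, under perspective, objects at different distances meet the surface at different heights in the image. Ripples, a tilted camera, or lens distortion will all loosen the symmetry in real photos, so tol needs to be generous and tuned on known-genuine images.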

Where This Actually Matters in Production

If you're building:

Marketplace verification systems — Preventing fake product photos that don't represent real inventory. One e-commerce platform I know ran into this when sellers started using AI to generate "lifestyle" product shots that looked professional but didn't match the actual items.

Content moderation pipelines — Flagging synthetic profile pictures in identity verification flows, or detecting manipulated images in insurance claims.

Media asset management — Automatically tagging AI-generated images in your DAM system so teams know what they're working with.

You don't need perfect detection. You need a confidence score that feeds into your review queue prioritisation.
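Here is a minimal sketch of what that prioritisation could look like, using only the standard library. It assumes the confidence score means "likelihood the image is genuine", so the lowest scores are reviewed first. ReviewQueue and its method names are hypothetical, and a production system would more likely use a database or a message queue than an in-memory heap.

```python
import heapq
import itertools


class ReviewQueue:
    """In-memory queue that hands out the least trustworthy image first."""

    def __init__(self):
        self._heap = []
        # Counter breaks ties so equal scores come out in insertion order
        self._counter = itertools.count()

    def add(self, image_id, confidence):
        heapq.heappush(self._heap, (confidence, next(self._counter), image_id))

    def next_for_review(self):
        """Return (image_id, confidence), or None when the queue is empty."""
        if not self._heap:
            return None
        confidence, _, image_id = heapq.heappop(self._heap)
        return image_id, confidence
```

The point of ordering by score is that reviewers spend their limited time on the images the automated checks trust least, instead of working through uploads in arrival order.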

The Detection Tool Ecosystem

Before you roll your own, know what exists:

Hive Moderation and Illuminarty offer API-based detection with probabilistic scoring
Google's SynthID watermarks AI-generated content at generation time (only useful if the generator cooperates)
Open-source detection models hosted on Hugging Face give you full control but require more infrastructure

These tools are useful as part of a layered approach, but they're not silver bullets. False positive rates are still high enough that you'll need human review for edge cases.

Spotting AI-generated imagery has become a critical organisational competency, so it is worth taking a deeper look at the specific physical tells and how they manifest across different generators.

Building a Practical Validation Layer

Here's a sensible architecture:

Fast rejection filters — Basic checks for impossible lighting/shadows using CV libraries (OpenCV, scikit-image)
API-based scoring — Send suspicious images to a detection service
Human review queue — Images above a certain suspicion threshold get reviewed
Feedback loop — Feed confirmed cases back into your filters

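async def validate_uploaded_image(image_file):
    """Route an upload to rejected, approved, or review_queue.

    Both scores are treated as authenticity scores in [0, 1]: higher means
    more likely genuine. check_lighting_consistency and detection_service
    are placeholders for your own physics checks and detection API client.
    """
    # Layer 1: Fast physics checks
    physics_score = await check_lighting_consistency(image_file)

    if physics_score < 0.3:  # Strong lighting anomaly: treat as fake
        return {"status": "rejected", "reason": "lighting_anomaly"}

    if physics_score > 0.8:  # Probably real
        return {"status": "approved", "confidence": physics_score}

    # Layer 2: API-based detection for ambiguous cases
    api_score = await detection_service.analyse(image_file)

    if api_score < 0.5:  # Detector is unconvinced: send to human review
        return {"status": "review_queue", "scores": {"physics": physics_score, "api": api_score}}

    # Report the more pessimistic of the two scores
    return {"status": "approved", "confidence": min(physics_score, api_score)}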
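The feedback loop is the one step that function does not show. One simple way to close it, sketched here under my own assumptions, is to recompute the rejection cut-off from human review outcomes so that only a small, chosen fraction of confirmed-genuine images would ever be auto-rejected. The function name, the (score, is_genuine) record shape, and the default rate are all hypothetical.

```python
import math


def tune_reject_threshold(reviewed, max_false_reject_rate=0.01):
    """Pick a physics-score cut-off from human-reviewed uploads.

    reviewed: iterable of (physics_score, is_genuine) pairs.
    Images scoring strictly below the returned value would be rejected,
    and at most max_false_reject_rate of the genuine ones fall below it.
    """
    genuine = sorted(score for score, is_genuine in reviewed if is_genuine)
    if not genuine:
        return 0.0  # no evidence yet: reject nothing automatically
    allowed = math.floor(max_false_reject_rate * len(genuine))
    allowed = min(allowed, len(genuine) - 1)
    return genuine[allowed]
```

Run on a schedule against recent review decisions, this replaces a hard-coded value like 0.3 with one that tracks how your checks actually behave on your own uploads. It only bounds false rejections on the reviewed sample, so it is worth keeping the sample large and recent.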

The Bottom Line

You can't stop AI-generated images from being uploaded. But you can build systems that flag the physically impossible ones before they cause problems downstream. If you're working on platforms where image authenticity matters—or you're helping clients navigate this space through AI automation and software development—basic physics checks should be in your validation pipeline.

The generators will get better. But until they're simulating actual light physics rather than pattern-matching, the tells will remain.
