Your user sends a screenshot. Your agent responds as if nothing was attached. No error message. No retry. Just... blindness.
Welcome to #51881, where a MIME sniffing library quietly breaks image processing for an entire class of PNG files.
The Bug
When OpenClaw processes an image attachment, it runs fileTypeFromBuffer to detect the real MIME type. The idea is sound: don't trust the client-provided MIME type; verify it.
The problem? Certain PNG files — especially those exported from apps like WeChat — get classified as image/apng (Animated PNG). APNG is technically a superset of PNG, so the detection isn't wrong. But the Claude API only accepts four image MIME types: image/jpeg, image/png, image/gif, image/webp. APNG is not on the list. HTTP 400.
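To see why a sniffer honestly reports image/apng, it helps to know how the two formats differ on the wire: an APNG is a valid PNG whose acTL (animation control) chunk appears before the first IDAT chunk. Here is a minimal illustrative sketch of that chunk walk; `sniffPng` is a hypothetical function, not the actual file-type library's implementation.

```typescript
// Every PNG starts with this 8-byte signature.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Sketch: classify a buffer as PNG vs APNG by walking its chunks.
// An acTL chunk before the first IDAT marks the file as Animated PNG.
function sniffPng(buf: Buffer): "image/png" | "image/apng" | null {
  if (buf.length < 8 || !buf.subarray(0, 8).equals(PNG_SIGNATURE)) return null;
  let offset = 8;
  // Each chunk: 4-byte length, 4-byte type, <length> bytes of data, 4-byte CRC.
  while (offset + 8 <= buf.length) {
    const length = buf.readUInt32BE(offset);
    const type = buf.toString("ascii", offset + 4, offset + 8);
    if (type === "acTL") return "image/apng"; // animation control chunk found
    if (type === "IDAT" || type === "IEND") return "image/png"; // image data reached first
    offset += 12 + length; // skip to the next chunk
  }
  return "image/png";
}
```

So a WeChat export that carries an acTL chunk is, byte-for-byte, an animated PNG; the sniffer is reporting the truth, and the breakage happens one layer up.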
The Silent Part
The gateway always prefers the sniffed MIME over the provided one:
images.push({
  type: "image",
  data: b64,
  mimeType: sniffedMime ?? providedMime ?? mime, // sniffed always wins
});
The API rejects it. If a fallback is configured, every attempt fails with the same 400, because the broken payload is resent unchanged. The user sees nothing useful.
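This is why fallback can't rescue the request: a 400 is deterministic, so every retry resubmits the same invalid payload and gets the same answer. A retry policy worth having distinguishes transient faults from payload faults. A minimal sketch (the function and status classification here are illustrative, not OpenClaw's actual retry logic):

```typescript
// Sketch: only transient failures are worth retrying.
// A 4xx (other than 429) means the request itself is invalid;
// resending it, to the same model or a fallback, fails identically.
function isRetryable(status: number): boolean {
  if (status === 429) return true; // rate limited: backoff can help
  if (status >= 500) return true;  // server fault: a retry may succeed
  return false;                    // 400, 413, etc.: fix the payload instead
}
```

Failing fast on a non-retryable status, with the API's error body surfaced to the user, would have turned this silent failure into a five-minute diagnosis.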
The Double Block
Combined with #51869 (hardcoded input: ["text"] for custom providers), custom provider users face a perfect storm:
- #51869: Onboarding wizard marks your provider as text-only → vision disabled
- #51881: Even if you fix config, MIME sniffing reclassifies your PNGs → API rejects them
Two independent bugs. Same outcome: the agent can't see images. Both silent.
When Helpers Hurt
MIME sniffing exists for a good reason — catching wrong extensions. But the implementation assumes the sniffed type is always more correct. The fix is simple:
const effectiveSniffed = sniffedMime === "image/apng" ? "image/png" : sniffedMime;
Or better: only prefer the sniffed MIME when it's in the API's supported set.
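That stronger version might look like the sketch below: pick the sniffed type only when the downstream API can actually accept it, downgrade the known APNG case, and otherwise fall back. `effectiveMime` is a hypothetical helper name, not OpenClaw's actual code.

```typescript
// The four image MIME types the Claude API accepts.
const SUPPORTED = new Set(["image/jpeg", "image/png", "image/gif", "image/webp"]);

// Sketch: prefer the sniffed type only when the API supports it.
function effectiveMime(
  sniffed: string | undefined,
  provided: string | undefined,
  fallback: string
): string {
  if (sniffed && SUPPORTED.has(sniffed)) return sniffed;
  // The case from this bug: APNG is a PNG superset, so downgrade it.
  if (sniffed === "image/apng") return "image/png";
  if (provided && SUPPORTED.has(provided)) return provided;
  return fallback;
}
```

With this shape, the sniffer still catches lying extensions in the common case, but it can never push the payload outside the set the API will accept.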
Lessons for Agent Builders
- Defensive code can create new failure modes. Validate your validators.
- Know your downstream constraints. If the API accepts 4 MIME types, normalize to those 4.
- Silent correction is silent failure. "Log and proceed" with broken data is the worst of both worlds.
- Test the full chain. Each component works correctly in isolation. The bug lives in the gap.
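One concrete way to apply these lessons is a boundary assertion: right before the request leaves the gateway, check that the payload satisfies the downstream contract and fail loudly if it doesn't. A minimal sketch; `assertSendable` and the error message are illustrative, not existing OpenClaw code.

```typescript
// The image MIME types the downstream API accepts.
const API_IMAGE_MIMES = new Set(["image/jpeg", "image/png", "image/gif", "image/webp"]);

// Sketch of "validate your validators": refuse to send a payload the API
// is guaranteed to reject, instead of silently forwarding it.
function assertSendable(mimeType: string): void {
  if (!API_IMAGE_MIMES.has(mimeType)) {
    throw new Error(
      `Image block has unsupported MIME type "${mimeType}"; ` +
        `API accepts: ${[...API_IMAGE_MIMES].join(", ")}`
    );
  }
}
```

An assertion like this would have caught both #51881 and any future sniffer regression at the boundary, with an error message that names the actual problem.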
This is the fifth post about silent failures in AI agent systems. The pattern keeps repeating: correct parts that fail silently when assembled. The fix isn't better components; it's better observability at the boundaries.
Follow me on X (@realwulong) or the blog.