Kumar Kislay
Turns Out "Made with AI" Detection Is Just Reading a Text File

This post was originally published at https://forg.to/articles/turns-out-made-with-ai-detection-is-just-reading-a-text-file

Two hours of my life spent assuming there was a GPU cluster somewhere running an image classifier.
There wasn't.
It's reading metadata. That's it. And I shipped the same thing on forg.to in an afternoon with zero new dependencies and zero added cost.
Here's the actual story.


Why forg.to needed this
forg.to is where indie hackers and builders share their product journey. Real wins, real losses, real progress.
As AI image generation got good, posts started showing up with suspiciously polished mockups. Nobody was lying. But context was missing. Was this a real screenshot or a Firefly render?
I wanted a small badge on AI-generated images. Hover over it, see which tool made it. Transparent, automatic, zero friction for the person posting.
Two hard constraints going in: the site stays fast, and this costs nothing. Those two constraints ruled out almost every obvious solution.
So I went digging into how the badge actually works.


C2PA: the boring open standard doing all the work
There is an open standard called C2PA, backed by Adobe, Microsoft, Google, OpenAI, and others. The basic idea: when an AI tool generates an image, it embeds a cryptographically signed manifest into the file recording how the image was made, which tool made it, and when.
This manifest travels with the file. Download it, share it, post it anywhere. The manifest stays embedded.
The official way to read these manifests is a library from Adobe called c2pa-js. It uses WebAssembly to parse them in the browser. Capable, well-documented, and 1.5MB of WASM.
I almost used it. Then I asked: what does that WASM actually buy me?
Full cryptographic verification of the provenance chain. Genuinely useful for Adobe's enterprise customers and publishers worried about deepfakes.
For a community feed? I just need to know: was this image made by an AI tool? I don't need to verify the cryptographic signature. I need to read a text field.
That's when I looked at what C2PA actually writes to the file.


It's just XMP
JPEG and PNG files are not just pixel data. They carry metadata segments, chunks of text embedded alongside the image data. The most common format is XMP, a subset of XML written into an APP1 marker segment near the start of most image files.
XMP is how Lightroom stores your star ratings. How your phone records GPS coordinates. And how C2PA-compliant AI tools record their provenance.
Open a DALL-E 3 generated image in a hex editor. Scroll past the JPEG header. You'll find something like this:
```xml
<?xpacket begin="..." id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description xmlns:c2pa="http://c2pa.org/manifest">
      <c2pa:claim_generator>c2pa-rs/0.28.0 OpenAI/1.0</c2pa:claim_generator>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>
<?xpacket end="w"?>
```

That http://c2pa.org namespace is the fingerprint. If it's there, the image has Content Credentials.
Even AI tools that don't implement full C2PA often write standard XMP fields:
```xml
<xmp:CreatorTool>Adobe Firefly</xmp:CreatorTool>
<photoshop:Source>Adobe Generative Fill</photoshop:Source>
```

No WASM. No cryptography. No external library. Just ArrayBuffer, TextDecoder, and a regex.


The implementation
I created src/lib/utils/detectAiImage.ts. A standalone function that takes a File and returns whether it's AI-generated and which tool made it.
```typescript
export async function detectAiImage(
  file: File
): Promise<{ isAi: boolean; toolName?: string } | null> {
  try {
    const slice = file.slice(0, 65536); // Read first 64KB only
    const buffer = await slice.arrayBuffer();
    const raw = new TextDecoder("latin1").decode(buffer);

    const start = raw.indexOf("<?xpacket begin");
    if (start === -1) return null;
    const end = raw.indexOf("<?xpacket end", start);
    const xmp = end !== -1 ? raw.slice(start, end + 40) : raw.slice(start, start + 32768);

    if (xmp.includes("c2pa.org") || xmp.includes("contentauth")) {
      const generator =
        extractText(xmp, "claim_generator") ??
        extractText(xmp, "CreatorTool") ??
        extractText(xmp, "Software");
      const toolName = generator ? (matchAiTool(generator) ?? generator.slice(0, 50)) : undefined;
      return { isAi: true, toolName };
    }

    const candidates = [
      extractText(xmp, "CreatorTool"),
      extractText(xmp, "Software"),
      extractText(xmp, "Generator"),
      extractText(xmp, "Source"),
    ].filter(Boolean) as string[];

    for (const val of candidates) {
      const toolName = matchAiTool(val);
      if (toolName) return { isAi: true, toolName };
    }

    return null;
  } catch {
    return null;
  }
}
```

Three decisions worth explaining:
Why latin1 encoding? XMP is UTF-8 XML, but we're reading a binary JPEG that mixes binary pixel data with text metadata. In TextDecoder, the "latin1" label is actually an alias for windows-1252, but what matters is that it's a single-byte encoding: every byte decodes to exactly one character, so string indices line up with byte offsets and indexOf and slice work correctly across both sections. We only care about the ASCII characters in the XMP namespace URLs anyway, and ASCII decodes unchanged.
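As a standalone demonstration (not from the forg.to codebase), the single-byte property is easy to check directly:

```typescript
// Demonstration: a single-byte decoding keeps string indices equal to byte
// offsets, which is what makes indexOf/slice safe on mixed binary + text data.
const bytes = new Uint8Array([
  0xff, 0xd8,             // JPEG SOI marker
  0xff, 0xe1, 0x00, 0x10, // APP1 marker + segment length
  0x3c, 0x3f, 0x78, 0x70, // the ASCII text "<?xp"
]);
const text = new TextDecoder("latin1").decode(bytes);
console.log(text.length);          // 10 — exactly one character per byte
console.log(text.indexOf("<?xp")); // 6 — string index equals byte offset
```

A UTF-8 decode of the same buffer would replace invalid byte sequences and throw the offsets off.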
Why 64KB? XMP lives in an APP1 marker segment near the start of a JPEG, shortly after the SOI marker (when EXIF is present, it typically occupies its own APP1 segment first). In practice the XMP packet is virtually always within the first few kilobytes. 64KB is generous headroom for unusual files while reading about 1% of a typical 5MB image.
Why file.slice()? File.slice() creates a new Blob pointing to the same underlying buffer. It doesn't copy bytes. The arrayBuffer() call only reads those 64KB. For a 10MB image, we process 0.6% of it.
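The implementation above also calls an extractText helper that the post never shows. A plausible minimal version, written here as a sketch under my own assumptions rather than the actual forg.to code, handles both the element and attribute forms XMP uses:

```typescript
// Hypothetical sketch of the extractText helper: pulls the text content of
// the first XMP element (<ns:Tag>value</ns:Tag>) or attribute (ns:Tag="value")
// whose local name matches `tag`, regardless of namespace prefix.
function extractText(xmp: string, tag: string): string | null {
  // Element form: <ns:Tag ...>value</ns:Tag>
  const el = new RegExp(`<\\w+:${tag}[^>]*>([^<]*)</\\w+:${tag}>`, "i").exec(xmp);
  if (el?.[1]?.trim()) return el[1].trim();
  // Attribute form: ns:Tag="value"
  const attr = new RegExp(`\\w+:${tag}\\s*=\\s*"([^"]*)"`, "i").exec(xmp);
  return attr?.[1]?.trim() || null;
}
```

For example, `extractText('<xmp:CreatorTool>Adobe Firefly</xmp:CreatorTool>', "CreatorTool")` returns `"Adobe Firefly"`. A full XML parser would be more robust, but for scanning a 64KB prefix two regexes are proportionate.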
The matchAiTool function tests the extracted string against known patterns:
```typescript
const AI_TOOL_PATTERNS = [
  { pattern: /firefly/i, name: "Adobe Firefly" },
  { pattern: /dall[-\s]?e/i, name: "DALL-E" },
  { pattern: /midjourney/i, name: "Midjourney" },
  { pattern: /stable[\s-]?diffusion/i, name: "Stable Diffusion" },
  { pattern: /image\s*creator/i, name: "Microsoft Image Creator" },
  { pattern: /generative\s*fill/i, name: "Adobe Generative Fill" },
  { pattern: /canva/i, name: "Canva AI" },
  { pattern: /ideogram/i, name: "Ideogram" },
  { pattern: /flux/i, name: "Flux" },
];
```

If no pattern matches but the image has a C2PA namespace, we still show the raw claim_generator value truncated to 50 characters. That way we don't silently drop data from tools we haven't added yet.
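The matchAiTool function itself isn't shown in the post; given the patterns table above, a minimal version might look like this (a sketch, assumed, with the patterns array abbreviated):

```typescript
// Hypothetical sketch of matchAiTool: returns the friendly tool name for the
// first pattern that matches the extracted metadata string, or null.
const AI_TOOL_PATTERNS = [
  { pattern: /firefly/i, name: "Adobe Firefly" },
  { pattern: /dall[-\s]?e/i, name: "DALL-E" },
  // ...plus the remaining entries from the table above
];

function matchAiTool(value: string): string | null {
  for (const { pattern, name } of AI_TOOL_PATTERNS) {
    if (pattern.test(value)) return name;
  }
  return null;
}
```

So `matchAiTool("Adobe Firefly 2.0")` yields `"Adobe Firefly"`, while a generator string like `"c2pa-rs/0.28.0 OpenAI/1.0"` matches nothing, which is exactly when the truncated raw claim_generator fallback kicks in.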


The timing problem that would have broken everything
forg.to compresses images before uploading. Before any image hits Cloudinary, it goes through a canvas-based compressor that re-renders it at reduced quality.
When a JPEG goes through canvas.toDataURL(), it becomes a new JPEG. The canvas knows nothing about XMP. It writes pixels. All metadata is gone.
So detection has to happen before compression, on the original File object, the moment the user selects it.
This is where we hook in:
```typescript
let insertOffset = imageFiles.length;

for (const file of files.slice(0, 4 - imageFiles.length)) {
  if (!file.type.startsWith("image/")) continue;

  setImageFiles((prev) => [...prev, file]);
  setImagePreviews((prev) => [...prev, URL.createObjectURL(file)]);

  const idx = insertOffset++;
  setImageAiDetails((prev) => [...prev, null]);

  import("@/lib/utils/detectAiImage").then(({ detectAiImage }) =>
    detectAiImage(file).then((result) =>
      setImageAiDetails((prev) => {
        const next = [...prev];
        next[idx] = result;
        return next;
      })
    )
  );
}
```

The dynamic import() is intentional. The detection module loads lazily, only the first time a user opens the composer and picks an image. It adds zero bytes to the initial JS bundle. Subsequent detections use the cached module.
The insertOffset variable solves a subtle race condition: if a user selects three images at once, imageFiles.length reads the same stale value across all three loop iterations because React batches state updates. Tracking the offset manually gives each file the correct array index.
Detection runs async without blocking the UI. If a file has no XMP or the scan throws, null is stored and no badge appears. Silent failures are correct behavior for an optional feature.


Persisting it through the stack
Detection is useless if we throw the result away before saving the post.
The type:
```typescript
media?: {
  type: "image" | "video" | "gif";
  url: string;
  aiDetails?: { isAi: boolean; toolName?: string };
}[];
```

The Mongoose schema:
```typescript
media: {
  type: [{
    type: { type: String, enum: ["image", "video", "gif"] },
    url: String,
    aiDetails: {
      isAi: { type: Boolean },
      toolName: { type: String },
    },
  }],
  default: [],
},
```

No migration needed. MongoDB handles new optional fields on existing documents gracefully. Documents without aiDetails just return undefined, and the badge code already guards against that with optional chaining.
The crosspost service merges AI details into each media object at write time:
```typescript
if (mediaUrls && mediaUrls.length > 0) {
  updateData.media = mediaUrls.map((url, i) => ({
    type: 'image',
    url,
    ...(mediaAiDetails?.[i] ? { aiDetails: mediaAiDetails[i] } : {}),
  }));
}
```

If there are no AI details, the field is simply not written to the document. No aiDetails: null noise in the database.


What this actually costs
| Surface | Cost |
| --- | --- |
| Initial page load | 0ms, 0KB. The utility is never imported until the composer opens. |
| Feed rendering | ~0ms. A conditional div and a Tooltip, only rendered if aiDetails.isAi is true. |
| Image selection (first time) | 20-50ms. Module import plus reading 64KB from a local File object. |
| Image selection (after first) | 5-15ms. Module is cached, just the file read. |
| Database | ~0ms overhead. Two optional fields on an existing document, no new indexes. |
The 20–50ms on first image selection happens asynchronously: the module fetch and the file read never block rendering, so the user sees zero visual delay.


What it catches and what it doesn't
Detected: any image with a c2pa.org XMP namespace, DALL-E 3, Adobe Firefly, Adobe Generative Fill, Midjourney (when exported with metadata), Stable Diffusion (with metadata-preserving exporters), Microsoft Image Creator, Canva AI, Ideogram, Flux, and anything new via the raw claim_generator fallback.
Not detected: screenshots of AI images (a screenshot is a new file, no XMP from the original), images run through a compressor or re-saved before posting, AI images from tools that don't write C2PA or XMP metadata.
This is the same fundamental limitation any metadata-based approach has. The honest framing: this is a transparency signal for users who generated an image and posted the original file. It's not a detection system for bad actors. Anyone who wants to hide the origin can screenshot it. But users who just generated something with Firefly and posted it directly? They get the badge automatically, without doing anything.
That's the right tradeoff.


The final diff
6 files. ~150 lines. Zero new dependencies. Zero cost. Zero performance impact.
The engineering instinct when you see a polished feature on a big platform is to assume there's a serious ML pipeline behind it. Sometimes there is. Often, though, the interesting features are built on boring standards that have been sitting in files on your hard drive for years.
C2PA isn't new. XMP isn't new. JPEG metadata segments aren't new. What changed is that major AI labs started writing to them, because an open standard for provenance is genuinely in everyone's interest as AI-generated content becomes ubiquitous.
We got this feature for free by reading what those tools already wrote.
Next time you see something on a big platform and think "that must require serious infrastructure," open the spec. Sometimes it's just a text file inside a JPEG.
