EthanCole

Posted on Jun 18

Building an AI detector report around uncertainty

#webdev #ai #tooling

AI detection is a product category that is tempting to oversimplify.

A user gives you text. A model gives you a probability. The UI can easily turn that into a red label, a green label, and a false sense of certainty.

That is the product mistake I wanted to avoid while working on Detector de IA, a small Next.js detector workflow for pasted text and compatible documents. The implementation question was not just "how do I call a detector?" It was "how do I make the result useful without pretending it proves authorship?"

This article was drafted with AI assistance and manually reviewed against the current codebase and topic constraints.

The Core Shape Is A Report Pipeline

The useful abstraction is not a verdict API. It is a report pipeline:

Normalize the text input.
Preserve the source type.
Extract or read document text when needed.
Split the text into sentence objects.
Build text features that can explain reliability.
Run detection.
Align highlights back to sentences.
Return limitations with the report.

That shape matters because an AI detector result is only useful when the user can inspect how it was produced. A single score is easy to display, but it is not enough context for a careful decision.

The current report type keeps that context together: locale, source type, model, verdict, risk, reliability reasons, scores, summary, analysis bullets, sentence highlights, limitations, text features, source text, and sentences.
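As a rough illustration, a report type carrying those fields could be sketched as below. The field names, the union values, and the `missingContext` helper are hypothetical; the article lists the fields but not their exact types.

```typescript
// Hypothetical names and value sets: the article lists the report fields,
// not their exact shapes.
type SourceType = "paste" | "pdf" | "docx" | "txt" | "markdown";

interface ReportSentence {
  id: string;
  text: string;
  charCount: number;
}

interface SentenceHighlight {
  sentenceId: string;
  note: string;
}

interface DetectionReport {
  locale: string;
  sourceType: SourceType;
  model: string;
  verdict: "likely-human" | "mixed" | "likely-ai";
  risk: "low" | "medium" | "high";
  reliabilityReasons: string[];
  scores: Record<string, number>;
  summary: string;
  analysis: string[];
  highlights: SentenceHighlight[];
  limitations: string[];
  textFeatures: Record<string, number | string[]>;
  sourceText: string;
  sentences: ReportSentence[];
}

// Lists the context fields that are empty, so a UI could decline to show
// a bare score without its supporting context.
function missingContext(report: DetectionReport): string[] {
  const missing: string[] = [];
  if (report.limitations.length === 0) missing.push("limitations");
  if (report.reliabilityReasons.length === 0) missing.push("reliabilityReasons");
  if (report.sentences.length === 0) missing.push("sentences");
  return missing;
}
```

Keeping the score and its context in one object means the UI cannot render the number without also having the limitations in hand.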

Document Handling Changes The Trust Boundary

Pasted text is direct. Documents are not.

For PDF and DOCX input, the browser-side extraction flow reads the file before analysis. PDF extraction uses pdfjs-dist, while DOCX extraction uses the Mammoth browser package. The extracted text is normalized, and an empty extraction becomes an explicit "no readable text" error instead of a silent low-quality report.
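The normalize-then-reject step can be sketched independently of the extraction libraries. This is a minimal sketch: the function names, the result shape, and the exact whitespace rules are assumptions, and the pdfjs-dist and Mammoth calls that would produce `extracted` are left out on purpose.

```typescript
// Hypothetical result shape: either usable text or an explicit error code.
type ExtractionResult =
  | { ok: true; text: string }
  | { ok: false; error: "no-readable-text"; fileName: string };

// Assumed normalization rules; the real ones are not described in detail.
function normalizeText(raw: string): string {
  return raw
    .replace(/\r\n?/g, "\n") // unify line endings
    .replace(/[ \t]+/g, " ") // collapse runs of spaces and tabs
    .replace(/ ?\n ?/g, "\n") // drop spaces that touch a line break
    .replace(/\n{3,}/g, "\n\n") // keep at most one blank line
    .trim();
}

// Turns an empty extraction into an explicit error instead of passing
// an empty string on to detection.
function requireReadableText(extracted: string, fileName: string): ExtractionResult {
  const text = normalizeText(extracted);
  if (text.length === 0) {
    return { ok: false, error: "no-readable-text", fileName };
  }
  return { ok: true, text };
}
```

A scanned PDF with no text layer would typically take the error branch here, which is the case the explicit message exists for.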

TXT and Markdown are handled differently. The server-side upload path accepts direct text-like formats, rejects unsupported direct uploads, reads plain text, normalizes it, and returns a source type with the file name.
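One way to express the accept-or-reject decision for direct uploads is a small classifier keyed on the file extension. The function name, the accepted extensions, and the return shape are assumptions for illustration; a real handler may also inspect the MIME type.

```typescript
// Hypothetical source descriptor for directly readable formats.
interface DirectSource {
  sourceType: "txt" | "markdown";
  fileName: string;
}

// Returns a source descriptor for text-like files, or null when the
// format is not accepted as a direct upload.
function classifyDirectUpload(fileName: string): DirectSource | null {
  const ext = fileName.toLowerCase().split(".").pop() ?? "";
  if (ext === "txt") return { sourceType: "txt", fileName };
  if (ext === "md" || ext === "markdown") return { sourceType: "markdown", fileName };
  return null; // PDF, DOCX, and anything else go through a different path
}
```

For example, `classifyDirectUpload("notes.md")` yields a Markdown source, while `classifyDirectUpload("scan.pdf")` yields `null` so the caller can reject it.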

That split is worth making visible in the product. A detector is not judging "the PDF" as an object. It is judging the text that made it through extraction. If a scan, table, protected file, or odd layout produces incomplete text, the report should not encourage the user to over-trust the result.

Sentence Objects Are More Useful Than Raw Highlight Strings

One practical design detail is converting text into sentence objects before building the report.

Each sentence carries an ID, text, and character count. That gives later steps something stable to reference. If a detector returns highlighted text snippets, the application can align those snippets back to sentence IDs and show the user where the signal appeared.
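A minimal sketch of that idea follows. The punctuation-based splitter and the substring matching are deliberate simplifications, and the names (`splitSentences`, `alignHighlights`, the `s1`-style IDs) are hypothetical rather than taken from the codebase.

```typescript
interface SentenceObject {
  id: string;
  text: string;
  charCount: number;
}

// Naive splitter: a sentence ends at ., ! or ?. Abbreviations and decimals
// will be split incorrectly; a production splitter needs more care.
function splitSentences(text: string): SentenceObject[] {
  const parts = text.match(/[^.!?]+[.!?]*/g) ?? [];
  return parts
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .map((part, index) => ({ id: `s${index + 1}`, text: part, charCount: part.length }));
}

interface AlignedHighlight {
  snippet: string;
  sentenceIds: string[]; // empty when the snippet could not be aligned
}

// Maps detector snippets to sentence IDs by case-insensitive containment,
// in either direction so a snippet spanning several sentences still aligns.
function alignHighlights(sentences: SentenceObject[], snippets: string[]): AlignedHighlight[] {
  return snippets.map((snippet) => {
    const needle = snippet.trim().toLowerCase();
    const sentenceIds =
      needle.length === 0
        ? []
        : sentences
            .filter((sentence) => {
              const hay = sentence.text.toLowerCase();
              return hay.indexOf(needle) !== -1 || needle.indexOf(hay) !== -1;
            })
            .map((sentence) => sentence.id);
    return { snippet, sentenceIds };
  });
}
```

Snippets that match nothing keep an empty `sentenceIds` array, so the UI can show that a highlight could not be placed instead of silently dropping it.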

That is a better UX than only showing "this was 78% AI." It lets the reviewer ask concrete questions:

Which sentence triggered attention?
Does the wording change compared with nearby sentences?
Is the passage generic, unsupported, or just formal?
Does the source document extraction explain the odd wording?

Sentence-level highlighting also makes one limitation easier to state honestly: for document uploads, highlights are aligned against the extracted text shown in the report, not against the original file's layout.

Reliability Needs Features, Not Just A Probability

The implementation also builds a text feature summary. It includes character count, sentence count, paragraph count, average sentence length, sentence length variance, short and long sentence ratios, character variety, repeated segment ratio, punctuation variety, and generic-marker examples.

Those features are not a substitute for detection, but they help explain why evidence can be strong or weak.

For example, a short sample with only a few sentences has less internal rhythm to compare. A long document with more sentence variety gives the report more texture. Repeated segments and generic markers can support an analysis bullet, while low sentence count can weaken the reliability note.
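To make that concrete, here is a sketch that computes a subset of those features and derives reliability notes from them. The thresholds (`SHORT_SENTENCE_CHARS`, `MIN_SENTENCES`), the note wording, and the function names are assumptions, not values from the codebase.

```typescript
interface TextFeatureSummary {
  charCount: number;
  sentenceCount: number;
  paragraphCount: number;
  avgSentenceLength: number;
  sentenceLengthVariance: number;
  shortSentenceRatio: number;
}

const SHORT_SENTENCE_CHARS = 40; // assumed cutoff for a "short" sentence
const MIN_SENTENCES = 5; // assumed minimum for a stable rhythm comparison

function buildTextFeatures(text: string): TextFeatureSummary {
  const sentences = (text.match(/[^.!?]+[.!?]*/g) ?? [])
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  const paragraphs = text.split(/\n\s*\n/).filter((p) => p.trim().length > 0);
  const lengths = sentences.map((s) => s.length);
  const n = lengths.length;
  const avg = n === 0 ? 0 : lengths.reduce((sum, len) => sum + len, 0) / n;
  // Population variance of sentence length, in characters squared.
  const variance = n === 0 ? 0 : lengths.reduce((sum, len) => sum + (len - avg) ** 2, 0) / n;
  const shortCount = lengths.filter((len) => len < SHORT_SENTENCE_CHARS).length;
  return {
    charCount: text.length,
    sentenceCount: n,
    paragraphCount: paragraphs.length,
    avgSentenceLength: avg,
    sentenceLengthVariance: variance,
    shortSentenceRatio: n === 0 ? 0 : shortCount / n,
  };
}

// Converts features into plain-language reasons the evidence may be weak.
function reliabilityNotes(features: TextFeatureSummary): string[] {
  const notes: string[] = [];
  if (features.sentenceCount < MIN_SENTENCES) {
    notes.push("Few sentences: there is little internal rhythm to compare.");
  }
  if (features.sentenceCount > 1 && features.sentenceLengthVariance === 0) {
    notes.push("All sentences have the same length: variation signals are unavailable.");
  }
  return notes;
}
```

The features do not change the score; they only give the report something concrete to say about how much weight the score can bear.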

This is the kind of product detail that keeps an AI feature from sounding more confident than it should.

The Fallback Path Should Admit It Is A Fallback

The detector can use a primary detection service from the server runtime. If that service does not respond successfully, the report can fall back to local text-pattern signals.

The important part is not merely having a fallback. It is labeling the fallback in the summary and limitations. A backup estimate is useful for continuity, but it should not masquerade as the same signal as the primary detector response.
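A sketch of a labeled fallback might look like this. The `primary` and `localEstimate` callbacks stand in for the real detection service and the local text-pattern signals; their signatures, the result shape, and the message wording are all assumptions.

```typescript
type DetectionSource = "primary" | "fallback";

interface LabeledDetection {
  aiProbability: number;
  source: DetectionSource;
  summaryNote: string;
  limitations: string[];
}

// Tries the primary detector first; on failure, returns a local estimate
// that is explicitly marked as a fallback in the summary and limitations.
async function detectWithFallback(
  text: string,
  primary: (text: string) => Promise<number>,
  localEstimate: (text: string) => number,
): Promise<LabeledDetection> {
  try {
    const aiProbability = await primary(text);
    return {
      aiProbability,
      source: "primary",
      summaryNote: "Scored by the primary detection service.",
      limitations: [],
    };
  } catch {
    return {
      aiProbability: localEstimate(text),
      source: "fallback",
      summaryNote: "The primary detector was unavailable; this is a backup estimate.",
      limitations: [
        "This score comes from local text-pattern signals and is not comparable to a primary detector result.",
      ],
    };
  }
}
```

Because `source` travels with the number, any downstream component can tell the two kinds of score apart without parsing the summary text.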

That pattern applies to a lot of AI product work: degradation is fine; invisible degradation is not.

Constraints Belong In The Product Copy

The product's constraints are part of the system behavior, not legal copy to hide at the bottom.

For this workflow, the text should be between 300 and 100,000 characters. Compatible files should stay under 12 MB. Requests can be rate-limited. False positives and false negatives are expected limitations. The report should not be the only basis for academic, employment, legal, or disciplinary decisions.
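The numeric limits translate directly into an input check whose messages can double as product copy. In this sketch the constant and function names are hypothetical, the character range is treated as inclusive, and "12 MB" is read as 12 × 1024 × 1024 bytes; both readings are assumptions.

```typescript
const MIN_CHARS = 300;
const MAX_CHARS = 100000;
const MAX_FILE_BYTES = 12 * 1024 * 1024; // assumption: 12 MB read as 12 MiB

// Returns user-facing problems; an empty array means the input is in range.
// Note: text.length counts UTF-16 code units, not user-perceived characters.
function checkInputLimits(text: string, fileBytes?: number): string[] {
  const problems: string[] = [];
  if (text.length < MIN_CHARS) {
    problems.push(`Text is too short: at least ${MIN_CHARS} characters are needed.`);
  }
  if (text.length > MAX_CHARS) {
    problems.push(`Text is too long: the limit is ${MAX_CHARS} characters.`);
  }
  if (fileBytes !== undefined && fileBytes >= MAX_FILE_BYTES) {
    problems.push("File is too large: compatible files should stay under 12 MB.");
  }
  return problems;
}
```

Returning messages instead of a boolean keeps the constraint and its explanation in one place, so the UI copy cannot drift from the enforced limit.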

Those constraints make the product less dramatic, but more usable. They teach the user how to interpret the output.

What I Would Reuse In Other AI Tools

The same pattern is useful outside AI detection:

keep the raw score away from "proof" language;
preserve the source type and transformation path;
attach explanations to specific text spans or records;
return limitations as first-class report fields;
make fallback behavior visible;
design the UI around the user's next review step.

The goal is not to make the system look uncertain for its own sake. The goal is to make uncertainty operational. A reviewer should leave the report knowing what to inspect next.

Detector de IA is a small implementation of that posture for Spanish AI text and document review:

https://detector-de-ia.net/

Top comments (1)

Alex Shev • Jun 19

Building the output as a report instead of a verdict is the right product move.

AI detection is too often sold as certainty. A better report should show evidence, uncertainty, text segments, and limits of the method. The user needs help deciding what to review next, not a red/green label pretending to prove authorship.