The Hard Way
Before I shipped Aidetector, I spent two weeks doing AI detection manually.
I'm not joking. A client asked me to review a batch of blog posts for AI-generated content, and I had no reliable free tool. So I did what any developer does when they're stubborn and slightly overconfident — I started reading papers.
I pulled research on AI writing patterns. I opened a spreadsheet. I flagged things like:
- Sentence length variance (AI texts are suspiciously uniform)
- Overuse of hedging language ("it is important to note that...")
- Low lexical diversity in paragraph transitions
- Predictable semantic structure — topic sentence, three supporting points, wrap-up
I was manually scoring documents on a 12-point rubric. It took me about 20 minutes per article. For 40 articles. That's more than 13 hours of scoring.
That's when I thought: this should be a tool.
Why I Built It
Most free AI detectors at the time had at least one of these problems:
- Capped at 500 words (useless for long-form content)
- Locked behind a signup or an API key
- Built on a single heuristic, with no transparency about what they were actually checking
I wanted something that ran entirely in the browser, explained its reasoning, handled text from recent models like GPT-5 and Claude 3.7, and had no word limit. No backend. No user data. No nonsense.
The Tech Stack
The entire thing runs client-side:
- Vanilla JavaScript — no framework overhead, just fast DOM manipulation
- HTML/CSS — keeping it lightweight and accessible
- No external APIs — everything is computed locally in the browser
The detection logic runs 12 linguistic pattern checks derived from published NLP research. These include:
- Burstiness score (variance in sentence lengths)
- Perplexity approximation (word predictability heuristics)
- Hedging phrase frequency
- Passive voice ratio
- Transition word overuse
- Semantic flatness (paragraph topic variance)
- ... and six more
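To make two of these checks concrete, here is a minimal sketch of a burstiness score and a hedging phrase frequency. It is not the tool's actual code. The function names, the sentence splitter, the use of coefficient of variation, and the phrase list are all assumptions made for illustration.

```javascript
// Hypothetical helper: naive sentence splitter on ., !, and ?.
function splitSentences(text) {
  const parts = text.match(/[^.!?]+[.!?]*/g) || [];
  return parts.map((s) => s.trim()).filter((s) => s.length > 0);
}

// Burstiness, assumed here to be the coefficient of variation of sentence
// lengths measured in words. Values near 0 mean very uniform sentences.
function burstiness(text) {
  const lengths = splitSentences(text).map((s) => s.split(/\s+/).length);
  if (lengths.length < 2) return 0;
  const mean = lengths.reduce((a, b) => a + b, 0) / lengths.length;
  const variance =
    lengths.reduce((sum, n) => sum + (n - mean) ** 2, 0) / lengths.length;
  return Math.sqrt(variance) / mean;
}

// Illustrative phrase list only; a real list would be much longer.
const HEDGING_PHRASES = [
  "it is important to note that",
  "it is worth noting that",
  "generally speaking",
];

// Hedging phrase occurrences per 100 words.
function hedgingPer100Words(text) {
  const lower = text.toLowerCase();
  const wordCount = lower.split(/\s+/).filter(Boolean).length;
  if (wordCount === 0) return 0;
  let hits = 0;
  for (const phrase of HEDGING_PHRASES) {
    // Splitting on the phrase yields (occurrences + 1) pieces.
    hits += lower.split(phrase).length - 1;
  }
  return (hits / wordCount) * 100;
}
```

Both functions return raw measurements. Turning them into an "AI-like" score would still need thresholds calibrated against sample text, which this sketch leaves out.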
Each check returns a weighted score. The final result is a composite confidence percentage, broken down so the user can actually see why the tool flagged something.
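One way the weighted composite could be computed, sketched under assumptions: each check is assumed to return a score between 0 and 1 (higher meaning more AI-like) plus a weight, and the result shape (`confidence`, `breakdown`, `contribution`) is hypothetical rather than taken from the tool.

```javascript
// results: array of { name, score, weight }, with score assumed in [0, 1].
function compositeConfidence(results) {
  const totalWeight = results.reduce((sum, r) => sum + r.weight, 0);
  if (totalWeight === 0) return { confidence: 0, breakdown: [] };

  const breakdown = results.map((r) => ({
    name: r.name,
    score: r.score,
    // Percentage points this check adds to the final confidence.
    contribution: (r.score * r.weight * 100) / totalWeight,
  }));

  // The composite is a weighted average expressed as a percentage.
  const confidence = breakdown.reduce((sum, b) => sum + b.contribution, 0);
  return { confidence, breakdown };
}

// Example with made-up scores and weights:
const example = compositeConfidence([
  { name: "burstiness", score: 1, weight: 1 },
  { name: "hedging", score: 0, weight: 1 },
]);
console.log(example.confidence); // 50
```

Keeping the per-check contribution alongside the total is what makes the breakdown possible: the UI can list each check with the share of the percentage it is responsible for.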
The Technical Challenges
1. Approximating perplexity without an LLM
True perplexity requires a language model to score token probabilities. I don't have a backend, so I approximated it using a trigram frequency lookup built from a curated corpus. It's not perfect, but it's directionally accurate for the patterns I care about.
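A rough sketch of the trigram lookup idea follows. It is a predictability heuristic, not real perplexity: it only measures what fraction of a text's trigrams appear in a reference table. The toy corpus, the tokenizer, and all function names are assumptions for illustration.

```javascript
// Hypothetical tokenizer: lowercase words, apostrophes kept.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z']+/g) || [];
}

// Count every trigram in a reference corpus. A real table would be built
// offline from a much larger curated corpus and shipped as static data.
function buildTrigramCounts(corpus) {
  const counts = new Map();
  const tokens = tokenize(corpus);
  for (let i = 0; i + 2 < tokens.length; i++) {
    const key = tokens.slice(i, i + 3).join(" ");
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

// Fraction of the text's trigrams that exist in the table.
// Higher means more predictable wording relative to the corpus.
function predictability(text, counts) {
  const tokens = tokenize(text);
  const total = tokens.length - 2;
  if (total <= 0) return 0;
  let known = 0;
  for (let i = 0; i < total; i++) {
    if (counts.has(tokens.slice(i, i + 3).join(" "))) known++;
  }
  return known / total;
}

const table = buildTrigramCounts(
  "it is important to note that the results are clear"
);
console.log(predictability("it is important to note", table)); // 1
```

This version ignores how often each trigram occurs. Weighting by the stored counts would be a natural refinement, at the cost of a larger table to download.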
2. Avoiding false positives on technical writing
Technical documentation naturally has low sentence variance and formal structure — exactly what my detector was flagging as AI. I had to add a context-aware exemption layer that detects domain-specific vocabulary density and adjusts scoring accordingly.
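The exemption idea can be sketched as a discount applied to structural scores when domain vocabulary is dense. Everything here is assumed for illustration: the term list, the 0.1 threshold, the 50% maximum discount, and the function names are not from the tool.

```javascript
// Illustrative domain vocabulary; a real list would be far larger.
const DOMAIN_TERMS = new Set([
  "api", "endpoint", "function", "parameter", "config", "server",
]);

// Share of tokens that are domain-specific terms.
function domainDensity(text) {
  const tokens = text.toLowerCase().match(/[a-z]+/g) || [];
  if (tokens.length === 0) return 0;
  const hits = tokens.filter((t) => DOMAIN_TERMS.has(t)).length;
  return hits / tokens.length;
}

// Scale a check's score down once density passes the threshold.
// The discount grows linearly and is capped at maxDiscount.
function adjustForDomain(score, density, threshold = 0.1, maxDiscount = 0.5) {
  if (density <= threshold) return score;
  const excess = Math.min((density - threshold) / (1 - threshold), 1);
  return score * (1 - maxDiscount * excess);
}

const density = domainDensity("The API endpoint returns data");
console.log(density); // 0.4
```

Applying the discount only to structure-based checks (sentence variance, formal layout) rather than to every check keeps technical writing from getting a free pass on signals such as hedging phrases.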
3. Keeping up with new models
GPT-5 and Claude 3.7 write noticeably differently from earlier models. I had to collect new sample sets and re-weight several heuristics. This is an ongoing calibration problem: the patterns shift as models improve.
Lessons Learned
Doing it the hard way first was actually useful. Building a manual rubric before automating it forced me to understand the problem domain deeply. I wasn't just wiring up someone else's API — I actually knew what I was detecting and why.
Transparency builds trust. Showing users which patterns triggered and why has been the most-praised feature. People don't want a black box percentage. They want to understand the reasoning.
No-login tools get used. Friction kills adoption. Removing signup entirely meant people actually came back and shared it.
Browser-only is a genuine constraint, not just a gimmick. You have to think carefully about what's computationally feasible without a server. Some things I wanted to add (real perplexity scoring, model fine-tuning) are simply not possible client-side at scale.
Try It
If you're an educator reviewing student submissions, a content editor checking freelance work, or just curious how your own writing scores — give it a shot: aidetector.getinfotoyou.com
No word limits. No login. No API key. Paste your text and see what it finds.
I'm still actively improving the heuristics. If you find a false positive or a miss, I'd genuinely like to know.