
The AI Entrepreneur
How to Detect AI-Generated Text with a Free API

Everyone's arguing about whether AI detectors work. I built one as an API and here's what I learned about what actually triggers them.

With LLMs generating a huge chunk of web content, the ability to distinguish between a human's lived experience and a machine's statistical average has become a survival skill for developers, editors, and SEOs.

Whether you're building a content moderation tool, verifying freelance submissions, or just trying to keep your "human-only" community actually human, you need a reliable way to spot the bots. But here's the kicker: most "premium" detectors like GPTZero or Originality.ai are expensive black boxes. You send text, get a percentage, and have no idea why it was flagged.

Today, we're looking at a different approach—a math-based, transparent detection system accessible via a simple API.

Why AI Detection Matters in 2026

  1. Search Engine Trust: Google and Bing have moved beyond "is it AI?" to "is it helpful?". However, pure AI text often lacks the "Information Gain" that search engines now prioritize.
  2. Brand Integrity: If your blog starts sounding like a generic support manual, you lose your audience.
  3. Academic & Professional Integrity: In a world of "vibe coding" and automated reports, knowing the source of a document is about accountability.

The 9-Factor Scoring System: How the API Thinks

Most detectors try to use another AI to catch the first AI. It's "poison vs. poison." This API uses a different philosophy: Statistical Fingerprinting. It analyzes 9 specific linguistic signals that machines consistently fail to replicate.

  1. Sentence Length Uniformity: AI loves the middle ground. Humans write 5-word punches followed by 40-word explanations. AI stays in the 15-25 word "safe zone."
  2. AI Phrase Density: Tracks the "tell-tale" transitions like furthermore, it is worth noting, and in the realm of.
  3. Vocabulary Diversity: AI uses high-probability words. Humans use "jagged" vocabulary with unexpected synonyms.
  4. Starter Diversity: Machines often start 60% of sentences with "The," "It," or "This." Humans vary their sentence openings significantly more.
  5. Paragraph Uniformity: Analyzes if paragraphs are eerily similar in structure and length.
  6. Comma Patterns: AI tends to use commas with mathematical precision (e.g., always after introductory phrases), whereas human punctuation is more stylistic.
  7. Burstiness: This is the "rhythm" of the text. Human writing has high variance in consecutive sentence lengths (bursts of info). AI is smooth and monotone.
  8. Readability Grade Targeting: Most LLMs default to a grade 8-12 reading level. Human text is wildly inconsistent, ranging from grade 4 to 16+.
  9. Structure Repetition: Detects if the underlying "template" of sentences (Noun-Verb-Adjective) repeats too frequently.
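Several of these signals are cheap to approximate yourself. Here's a toy Python sketch of factors 1/7 (sentence-length variance), 2 (phrase density), and 4 (starter diversity) — the phrase list is just the examples mentioned in this post, not the API's full dictionary, and this is an illustration, not the API's actual implementation:

```python
import re
import statistics

# Illustrative subset only -- the real detector tracks far more phrases.
AI_PHRASES = ["furthermore", "it is worth noting", "in the realm of",
              "delve into", "tapestry", "landscape"]

def quick_fingerprint(text):
    """Toy approximation of three of the nine signals."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    # High variance in sentence lengths = "bursty" = more human-like.
    burstiness = statistics.pvariance(lengths) if len(lengths) > 1 else 0.0
    # Ratio of distinct first words to total sentences.
    starters = {s.split()[0].lower() for s in sentences}
    starter_diversity = len(starters) / len(sentences) if sentences else 0.0
    # Raw count of tell-tale AI transition phrases.
    lowered = text.lower()
    phrase_hits = sum(lowered.count(p) for p in AI_PHRASES)
    return {"burstiness": burstiness,
            "starter_diversity": starter_diversity,
            "phrase_hits": phrase_hits}
```

Even this crude version separates "Short. Then a long winding sentence." from the uniform 20-word rhythm an LLM defaults to.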

Calling the API: Code Examples

The endpoint we're using is hosted on Apify: https://george-the-developer--ai-text-humanizer-api.apify.actor/detect. It's a POST request that expects a JSON body with a text field.

JavaScript (Fetch API)

```javascript
const detectAI = async (content) => {
  const response = await fetch('https://george-the-developer--ai-text-humanizer-api.apify.actor/detect', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text: content })
  });

  // Surface HTTP failures instead of parsing an error page as JSON.
  if (!response.ok) {
    throw new Error(`Detection request failed: ${response.status}`);
  }

  const result = await response.json();

  console.log(`AI Score: ${result.ai_score}%`);
  console.log(`Verdict: ${result.verdict}`);
  console.log(`Confidence: ${result.confidence}`);
  return result;
};

const myText = "In the rapidly evolving landscape of AI, it is worth noting...";
detectAI(myText);
```

Python (Requests)

```python
import requests

def detect_ai_text(text):
    url = "https://george-the-developer--ai-text-humanizer-api.apify.actor/detect"
    payload = {"text": text}

    response = requests.post(url, json=payload, timeout=30)
    response.raise_for_status()  # fail loudly on HTTP errors, not with a KeyError below

    data = response.json()

    print(f"Verdict: {data['verdict']}")
    print(f"AI Probability: {data['ai_score']}%")
    print(f"Human Probability: {data['human_score']}%")

sample_text = "I went to the store and bought some milk. It was a sunny day, and I felt great."
detect_ai_text(sample_text)
```

Reading the Results: AI vs. Human

When you call the /detect endpoint, you don't just get a "Yes/No." You get a weighted analysis.

Sample AI Output (GPT-4o):

  • AI Score: 82%
  • Verdict: ai_generated
  • Details: High uniformity (22.5), low burstiness, and 4 flagged phrases (delve into, tapestry, landscape).

Sample Human Output:

  • AI Score: 14%
  • Verdict: human_written
  • Details: High burstiness (35.2 variance), varied starters, and 0 flagged AI phrases.

The beauty of this 9-factor system is that it's harder to "prompt engineer" your way around it. Even if you tell an AI to "be creative," its underlying statistical distribution of sentence lengths and comma placements usually gives it away.
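In practice, you'd branch on that weighted output rather than treat it as a binary. A minimal triage sketch — the 50% threshold and the route names here are illustrative choices, not the API's own cutoffs:

```python
def triage(result, threshold=50):
    """Route content based on the detector's verdict and score.

    `result` is the parsed JSON from /detect (verdict, ai_score).
    The threshold is an illustrative default, not an API constant.
    """
    if result["verdict"] == "ai_generated" and result["ai_score"] >= threshold:
        return "flag_for_review"
    if result["verdict"] == "human_written":
        return "auto_approve"
    # Ambiguous or low-confidence results go to a human.
    return "manual_check"
```

Using the two sample outputs above, the 82% GPT-4o text routes to `flag_for_review` and the 14% human text to `auto_approve`.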

The Antidote: The Humanizer Endpoint

If you're on the other side of the fence—perhaps you're a writer using AI to brainstorm but want to ensure your final draft doesn't trigger every detector on the planet—the same API offers a "humanize" endpoint.

While the /detect endpoint finds the patterns, the /humanize endpoint breaks them. It injects burstiness, varies the sentence structures, and swaps out the "robot vocabulary" for something more organic.

Try it out: If your text is getting flagged, hit the humanizer endpoint to see how a few surgical changes in structure can completely shift the statistical fingerprint of your writing.
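Calling /humanize from Python looks much like the /detect call. A hedged sketch: I'm assuming the request body uses the same `{"text": ...}` shape as /detect, and since the exact response fields aren't documented in this post, the function just returns the raw JSON. The `session` parameter is injectable purely for testing:

```python
HUMANIZE_URL = "https://george-the-developer--ai-text-humanizer-api.apify.actor/humanize"

def humanize(text, session=None):
    """POST flagged text to the /humanize endpoint.

    Assumes the same {"text": ...} request shape as /detect.
    Returns the raw response JSON rather than guessing field names.
    """
    if session is None:
        import requests  # deferred so a fake session can be injected in tests
        session = requests
    response = session.post(HUMANIZE_URL, json={"text": text})
    response.raise_for_status()
    return response.json()
```

Run your draft through `detect_ai_text` first, and only pay the round trip to `humanize` when the score crosses your threshold.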

Conclusion

AI content detection isn't about "banning" AI; it's about transparency. In a world of automated noise, the ability to verify the signal is priceless. By focusing on measurable math—burstiness, uniformity, and phrase density—we can build a more trustworthy web.
