DEV Community

Şahin Uygutalp

How I Detect AI-Generated Text Without Calling an LLM

Most AI detection tools make the same mistake: they use an LLM to detect an LLM.

That's expensive, slow, and ironic. You're spending money on the exact technology you're trying to filter out.

For PR-Sentry — a GitHub Action that protects open source maintainers from AI-generated PR spam — I needed something different. Detection had to be free, fast, and impossible to rate-limit. Here's how I built it.


The core insight: AI text has a statistical fingerprint

Human writing is messy. Sentence lengths vary. Word choice is idiosyncratic. Structure is inconsistent.

AI writing is suspiciously uniform. It favors certain words, certain patterns, certain rhythms. Not because it's programmed to — but because it learned from a corpus that rewards this style.

This uniformity is detectable without a model. You just need the right signals.


Signal 1: Buzzword density

AI models consistently overuse a specific vocabulary. Not randomly — these words appear because they score well in RLHF training. "Robust", "seamless", "leverage", "utilize", "comprehensive", "innovative", "streamline".

I built a weighted buzzword list and calculate density per 100 words. A human developer writing a PR description might use one of these. An AI will use four.

def buzzword_density(text):
    # Strip trailing punctuation so "robust," still counts as a hit
    words = [w.strip(".,;:!?") for w in text.lower().split()]
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in BUZZWORDS)
    return hits / len(words) * 100

Signal 2: Passive voice ratio

AI consistently overuses passive constructions. "The function is called", "the error is handled", "the issue was fixed". Human developers write more directly.

Passive voice detection is simple: a few regex patterns matching auxiliary verb + past participle constructions. Not perfect — but in combination with the other signals, it adds real weight.
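A minimal sketch of what this looks like — the pattern and function names here are my own illustration, not PR-Sentry's actual code, and the participle list is deliberately incomplete:

```python
import re

# Rough heuristic, not a parser: a form of "to be" followed by a word
# ending in -ed, or a common irregular past participle.
PASSIVE_RE = re.compile(
    r"\b(?:is|are|was|were|be|been|being)\s+"
    r"(?:\w+ed|done|made|given|taken|written|found|built|sent)\b",
    re.IGNORECASE,
)

def passive_voice_ratio(text):
    # Fraction of sentences containing at least one passive construction
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    passive = sum(1 for s in sentences if PASSIVE_RE.search(s))
    return passive / len(sentences)
```

It will miss passives with adverbs in between ("was quickly fixed") and misfire on adjectival participles — acceptable noise for a heuristic that only contributes one weighted signal.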


Signal 3: Sentence length uniformity

This is the subtlest signal and the most reliable. Human writing has high variance in sentence length. Some sentences are short. Others run longer because the thought requires it, building toward a point that the writer is trying to make clearly before moving on.

AI writing has low variance. Everything converges toward a medium length. Calculate the standard deviation of sentence lengths — if it's suspiciously low, something is off.
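One way to turn that into a 0-to-1 score (this is a sketch under my own naming, using the inverted coefficient of variation rather than raw standard deviation so the score is length-independent):

```python
import re
import statistics

def sentence_length_uniformity(text):
    # Sentence lengths in words; higher return value = more uniform = more suspicious
    lengths = [len(s.split()) for s in re.split(r"[.!?]+", text) if s.strip()]
    if len(lengths) < 2:
        return 0.0
    mean = statistics.mean(lengths)
    stdev = statistics.stdev(lengths)
    # Invert the coefficient of variation: zero variance → score of 1.0
    return max(0.0, 1.0 - stdev / mean)
```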


Signal 4: Repetition score

AI frequently restates the same point in different words. "This fixes the bug. The issue has been resolved. The problem no longer occurs." Count unique trigrams vs total trigrams. High repetition = low uniqueness = suspicious.
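The trigram count described above fits in a few lines — again a sketch with my own naming, where 0 means every trigram is unique and values near 1 mean heavy restatement:

```python
def repetition_score(text):
    words = text.lower().split()
    # Overlapping word trigrams across the whole text
    trigrams = [tuple(words[i:i + 3]) for i in range(len(words) - 2)]
    if not trigrams:
        return 0.0
    # 1 - (unique / total): higher = more repeated phrasing
    return 1.0 - len(set(trigrams)) / len(trigrams)
```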


Combining the signals

slop_score = (
    buzzword_density * 30 +
    passive_voice_ratio * 20 +
    sentence_length_uniformity * 20 +
    repetition_score * 30
)

is_slop = slop_score >= 60

The weights aren't arbitrary — they reflect how reliably each signal distinguishes AI from human writing in my test corpus. Buzzword density and repetition are the strongest predictors. Sentence uniformity catches cases where the other two don't fire.


Bonus: Shannon entropy for secret detection

This isn't about AI detection — but it's the same "statistics over syntax" philosophy applied to security.

Regex patterns for API keys have a fundamental problem: they only catch known formats. A new service launches, uses a different key format, and your patterns miss it.

Shannon entropy catches anything. Real secrets — API keys, tokens, passwords — are high-entropy strings. Human-readable text is not.

import math
from collections import Counter

def shannon_entropy(s):
    counts = Counter(s)
    length = len(s)
    return -sum(
        (c / length) * math.log2(c / length)
        for c in counts.values()
    )

# Flag anything above 4.5 bits per character
is_suspicious = shannon_entropy(token) > 4.5

AKIA4EXAMPLE123KEY → entropy ≈ 3.8 bits per character.

hello world → entropy ≈ 2.8 bits per character.

One caveat: an 18-character string can score at most log2(18) ≈ 4.17 bits per character, so a fixed 4.5 threshold only fires on longer tokens. In practice, you apply it to candidate strings above a minimum length, or scale the threshold with length.


Does it work?

In testing against a corpus of real PRs and AI-generated PRs, the slop detector catches roughly 70% of AI-generated descriptions with a false positive rate under 5%. It's not perfect — a careful human could write to fool it, and a well-prompted AI can avoid the obvious buzzwords.

But that's fine. The goal isn't perfection. The goal is to avoid calling Claude on every PR that starts with "This PR implements a robust solution to seamlessly address the issue."

The LLM runs on the hard cases. The heuristics handle the obvious ones for free.


The full implementation is in PR-Sentry

If you want to see the complete code — including the full buzzword list, the passive voice patterns, and how this integrates with the GitHub Actions workflow — it's all open source:

github.com/Ebuodinde/PR_SENTRY

What signals would you add? I'm especially curious whether anyone has tried perplexity scoring at inference time — the compute cost might be worth it for high-value repos.
