A few months ago, my friend sent me a LinkedIn post and asked if I thought it was written by ChatGPT. I had no idea. And that bothered me. I'm an engineer; I should be able to figure this out.
So I did what any engineer would do: I went down a rabbit hole and ended up building an entire detection system. This is how it went.
Why I Bothered
Look, I'm not on some crusade against AI-generated content. I use LLMs daily. But there are real situations where it matters: academic submissions, journalism, legal documents, job applications. People deserve to know what they're reading.
Every existing tool I tried was either behind a paywall, unreliable, or a black box. I wanted something open source that actually showed its reasoning. So I built AI Provenance Tracker.
You can try the live demo if you want to skip the technical stuff.
The Stack
I went with FastAPI for the backend and Next.js for the frontend. Nothing fancy. I wanted to get to the interesting part, which is the detection logic.
┌─────────────────────────────────────────────────┐
│             Web Interface (Next.js)             │
└─────────────────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────┐
│               REST API (FastAPI)                │
└─────────────────────────────────────────────────┘
                         │
            ┌────────────┴────────────┐
            ▼                         ▼
  ┌───────────────────┐     ┌───────────────────┐
  │   Text Detector   │     │  Image Detector   │
  │  - Perplexity     │     │  - FFT Analysis   │
  │  - Burstiness     │     │  - Artifacts      │
  │  - Vocabulary     │     │  - Metadata       │
  └───────────────────┘     └───────────────────┘
The detection side has two engines. One for text, one for images. Let me walk through both.
Text Detection: What Actually Works
I tried a bunch of approaches before landing on three signals that actually hold up.
Perplexity
This one's the most intuitive. Perplexity measures how "surprised" a language model would be by a piece of text. AI-generated text tends to score lower because it's literally optimised to produce probable, fluent output. My implementation uses a lightweight proxy: the Shannon entropy of the word-frequency distribution, rather than running a full language model.
import math
from collections import Counter

def calculate_perplexity(words: list[str]) -> float:
    """Frequency-based proxy for perplexity: 2 ** Shannon entropy."""
    word_counts = Counter(words)
    total_words = len(words)
    entropy = 0.0
    for count in word_counts.values():
        prob = count / total_words
        entropy -= prob * math.log2(prob)
    return 2 ** entropy
Humans are messy writers. We use weird words, go off on tangents, make unusual word choices. AI is smoother. Almost too smooth.
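As a quick sanity check of the entropy idea, here's a self-contained demo (the two sample sentences are invented for illustration): repetitive, recycled wording scores lower than varied wording.

```python
import math
from collections import Counter

def calculate_perplexity(words):
    counts = Counter(words)
    total = len(words)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return 2 ** entropy

# Recycled phrasing vs. varied word choice.
smooth = "it is important to note that it is important to plan".split()
varied = "honestly the deadline snuck up while we debated fonts".split()

print(calculate_perplexity(smooth) < calculate_perplexity(varied))  # True
```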
Burstiness
This was the surprising one. Burstiness measures how much sentence length varies in a piece of text. Turns out, AI writes like a metronome. Consistently medium-length sentences with similar complexity.
Humans don't do that. We write a short punchy sentence. Then we follow it with this long, meandering thought that goes on for a while because we're trying to explain something complicated and we don't stop to restructure it. Then short again.
import numpy as np

def calculate_burstiness(sentences: list[str]) -> float:
    """Coefficient of variation of sentence lengths (in words)."""
    lengths = [len(s.split()) for s in sentences]
    mean_length = np.mean(lengths)
    std_length = np.std(lengths)
    return std_length / mean_length if mean_length else 0.0
The coefficient of variation tells the whole story. AI text clusters around 0.2-0.3. Human text is all over the place, like 0.4, 0.5, sometimes higher.
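To make that concrete, here's a stdlib-only sketch of the coefficient of variation on two sentence-length profiles (the numbers are invented for illustration, not measured data):

```python
import statistics

def cv(lengths):
    # Coefficient of variation: population std dev over the mean.
    return statistics.pstdev(lengths) / statistics.mean(lengths)

ai_like = [12, 18, 14, 20, 11, 17]      # metronome: similar lengths
human_like = [5, 34, 8, 27, 4, 19]      # short, long, short...

print(round(cv(ai_like), 2), round(cv(human_like), 2))
```

The first profile lands in the AI-typical 0.2-0.3 band; the second lands well above it.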
Vocabulary Richness
The third signal is type-token ratio and n-gram repetition. AI has this habit of recycling phrases. "it's important to note that" three times in one article is a dead giveaway. Humans vary their transitions naturally without thinking about it.
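A minimal sketch of both checks (function names are mine, not necessarily the project's): type-token ratio plus a count of trigrams that appear more than once.

```python
from collections import Counter

def type_token_ratio(words):
    # Distinct words over total words: lower = less varied vocabulary.
    return len(set(words)) / len(words)

def repeated_ngrams(words, n=3):
    # Number of distinct n-grams that occur more than once.
    grams = Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    return sum(1 for c in grams.values() if c > 1)

text = ("it is important to note that results vary and "
        "it is important to note that context matters").split()
print(type_token_ratio(text), repeated_ngrams(text))
```

The recycled "it is important to note that" opener shows up immediately as repeated trigrams.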
Image Detection: The Frequency Domain Trick
This part was genuinely fun to build. AI-generated images leave fingerprints that are invisible to the naked eye but show up clearly in the frequency domain.
FFT Analysis
The Fast Fourier Transform converts an image from spatial to frequency representation. Real photographs have frequency distributions shaped by optics and sensor physics. Diffusion models like Stable Diffusion produce mathematically different patterns.
import numpy as np
from scipy import fft

def analyze_frequency_domain(img_array: np.ndarray) -> float:
    gray = np.mean(img_array, axis=2)    # collapse RGB to grayscale
    f_transform = fft.fft2(gray)
    f_shift = fft.fftshift(f_transform)  # move the DC component to the centre
    magnitude = np.abs(f_shift)
    # AI images have unusual high-frequency distributions. One simple summary
    # (an illustrative stand-in for the full check): the share of spectral
    # energy outside a low-frequency disc at the centre.
    h, w = magnitude.shape
    y, x = np.ogrid[:h, :w]
    centre = (y - h // 2) ** 2 + (x - w // 2) ** 2 <= (min(h, w) // 8) ** 2
    return magnitude[~centre].sum() / magnitude.sum()
I also check for artifact patterns (weird texture uniformity, edge inconsistencies around hair and fingers) and metadata forensics. Real photos have EXIF data from cameras. AI images almost never do.
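As one illustration of the metadata angle, here's a deliberately naive, stdlib-only sketch (my simplification; a real implementation would parse EXIF properly) that scans a JPEG's bytes for the APP1 Exif marker:

```python
def has_exif(jpeg_bytes: bytes) -> bool:
    # Naive check: an APP1 marker (0xFFE1) plus the "Exif" segment header.
    # A robust parser would walk the segment structure instead.
    return b"\xff\xe1" in jpeg_bytes and b"Exif\x00\x00" in jpeg_bytes

# Synthetic byte strings for illustration -- not real image files.
camera_like = b"\xff\xd8\xff\xe1\x00\x1cExif\x00\x00II*\x00"
generated_like = b"\xff\xd8\xff\xdb\x00C\x00"

print(has_exif(camera_like), has_exif(generated_like))  # True False
```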
Combining Everything
Here's the thing I learned the hard way: no single signal is reliable enough. Perplexity alone? A carefully edited AI text fools it. FFT alone? Heavily compressed JPEGs produce false positives.
The magic happens when you combine them with weighted averaging:
def make_prediction(perplexity_signal, burstiness_signal, vocab_signal, ml_score=None):
    # Each signal is normalised to [0, 1], where higher = more AI-like.
    signals = []
    weights = []
    if ml_score is not None:
        signals.append(ml_score)
        weights.append(0.40)
    signals.append(perplexity_signal)
    weights.append(0.25)
    signals.append(burstiness_signal)
    weights.append(0.20)
    signals.append(vocab_signal)
    weights.append(0.15)
    # Renormalise so confidence stays in [0, 1] when the ML score is absent.
    total_weight = sum(weights)
    confidence = sum(s * w for s, w in zip(signals, weights)) / total_weight
    return confidence > 0.5, confidence
I tuned the weights through experimentation. The ML model (when available) gets the highest weight because it captures patterns I can't articulate in code.
What I Learned
Four months in, here's what I'd tell someone starting a similar project:
Detection is probabilistic, not binary. I always show confidence scores and explain the reasoning. Saying "73% likely AI-generated" is honest. Saying "this is AI" is not.
Ensemble methods are worth the complexity. The jump from single-signal to multi-signal detection was dramatic. Same principle as spam filtering and fraud detection. One signal is easy to game, five signals together are much harder.
The arms race is real. People actively try to evade detection by adding random typos, varying sentence lengths, post-processing images. I've already had to update the detection logic three times.
Open source builds trust. When the detection methods are visible, people can understand why the system reached a conclusion. Black-box detection creates suspicion.
What's Next
I'm working on audio deepfake detection (voice cloning is getting scary good), a browser extension for real-time detection, and fine-tuning ML models on larger datasets. The roadmap is in the repo if you're curious.
Give It a Try
Live Demo: provenance-detect.vercel.app
GitHub: github.com/ogulcanaydogan/ai-provenance-tracker
API Docs: Backend API
Everything is MIT licensed. If you find bugs or have ideas, open an issue. I actually read them.
Find me on GitHub or LinkedIn if you want to chat about detection techniques or AI tooling.