Was this Python written by a human or an AI? 7 signs to spot LLM-generated code

The uncanny valley of code

You open a Python script. It runs clean, the indentation is perfect, the variables are named like they just walked out of a computer science textbook… but something feels off. Not wrong, just uncanny. Like it was written by someone who never had to meet a deadline or debug a 2 a.m. production error.

That’s because maybe it wasn’t written by a human at all.

Over the past couple of years, large language models like ChatGPT, Claude, and GitHub Copilot have gone from novelties to daily tools for developers. They autocomplete your functions, comment your code, and sometimes even explain your own spaghetti logic back to you in better words than you could.


But here’s the thing: LLMs write differently. They’re too helpful. Too clean. They don’t break rules the way real developers do. And that makes their code identifiable if you know what to look for.

Why does this matter?

  • Code reviews: You might be reviewing code that looks right but hides subtle hallucinations.
  • Hiring: Some devs submit AI-written code as part of interviews and it shows.
  • Debugging: LLM code often lacks real-world edge-case handling. You’ll see it when it breaks.

One night, I was reviewing a pull request that looked like it was written by an overcaffeinated Stack Overflow moderator. Turns out, the dev had just typed "How do I build a Python CLI for a CSV parser?" into ChatGPT and pasted the result. The thing worked… but not in production.

That got me thinking: could we spot LLM-generated code like we spot copied homework?

Turns out, yes.

Let’s go through the 7 dead giveaways that a Python script was born from an LLM prompt and not a human brain with two deadlines, a Slack ping, and a ramen lunch.

1. Every line has a purpose… maybe too much purpose

One of the first signs you’re looking at AI-written Python? It’s that every function is dressed up like it’s attending a tech conference.

Take this gem, for example:

def add_numbers(a: int, b: int) -> int:
    """
    Adds two numbers and returns the result.

    Parameters:
        a (int): The first number.
        b (int): The second number.

    Returns:
        int: The sum of a and b.
    """
    return a + b

Yes, it’s “correct.” But also… why? No human writes a docstring to explain that a + b = c unless they’re on their very best behavior (or on ChatGPT).

LLMs are trained on best practices, so they default to over-explaining. It’s like reviewing code written by an overly polite intern trying to get a full-time offer. Every helper function gets a header. Every class is annotated like it’s a PhD thesis.

This isn’t to say comments are bad. But developers, especially experienced ones, tend to write contextual comments: the kind that save future devs (or themselves) from pain, not the kind that just restate the obvious.

LLMs? They explain the obvious beautifully.
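
For contrast, here’s a hypothetical human version of the same helper, where the only comment explains a decision rather than the syntax:

def add_numbers(a, b):
    # callers sometimes pass Decimal here; don't coerce to int or we lose cents
    return a + b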

In short, if a codebase reads like a Python tutorial rather than a Python project, you’re probably looking at something that came out of an LLM.

2. That variable name sounds like it read the dictionary

Ever stumbled upon variables like total_user_input_character_count or is_feature_toggle_enabled_for_beta_users in a tiny script that prints “Hello, world”?

Congratulations. You’ve probably met an LLM.

Unlike human developers who’ll happily write x, tmp, or flag because we know what we mean (or will guess later), language models go full-on descriptive clarity mode. They use names that sound like they came from an API design committee meeting that went on too long.

Why? Because LLMs are trained to follow the golden rule of readability, and naming is one of the few things they consistently try to get “right.” Unfortunately, they often overshoot and name things like they’re writing a legal contract.

Let’s compare:

# Human-written
x = sum(values)

# LLM-written
total_sum_of_numeric_values_in_list = sum(numeric_values)

This makes the code more readable… until it doesn’t. After 3–4 of these verbose names, your code starts looking like an SEO-optimized blog post.

In real projects, naming is about balance: clear enough to make sense, short enough to not block your flow. LLMs tend to err on the side of clarity through verbosity, which ironically makes the code harder to maintain in long files and harder to scan quickly.

Real dev tip:

If you have to scroll horizontally to read a variable name… it’s probably not a human’s doing.

Up next: beautifully structured code that’s almost suspiciously clean.

3. Structure so perfect it hurts

We all appreciate clean code. But there’s clean… and then there’s sterilized.
LLM-generated Python often falls into the second category.

Every function has exactly one job.
Every block is neatly separated.
There are no dead imports, no trailing commas, no accidental print() left behind for debugging.

It’s like reading code written by a machine that’s afraid of being judged by your senior dev on GitHub.

A quick example:

def read_file(file_path: str) -> str:
    try:
        with open(file_path, 'r') as file:
            return file.read()
    except FileNotFoundError:
        print("File not found.")
        return ""

Perfect? Yes. But also generic, safe, and eerily emotionless.
LLMs love this pattern because it ticks all the right boxes: exception handling, proper context managers, correct typing, and even a fallback. It looks like it came straight out of a textbook because, well, it kinda did.

Now contrast that with how many human devs handle it:

with open(f, 'r') as f:
    stuff = f.read()

If it breaks, we’ll deal with it later. Or never.

LLMs overuse things like:

  • try/except even when not strictly needed
  • Redundant helper functions that add clarity but not much value
  • Typing annotations that are technically correct but don’t help with comprehension

This obsession with correctness isn’t bad on its own, but taken all together it starts to feel robotic. Like someone wrote it with zero knowledge of how real projects bend, break, and mutate over time.
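
Here’s a hypothetical example of that reflex: a one-line conversion wrapped in a helper, a try/except it arguably doesn’t need, and annotations that add ceremony more than meaning:

def safe_convert_string_to_integer(value: str) -> int:
    """Safely converts a string to an integer."""
    try:
        return int(value)
    except ValueError:
        return 0

A developer in a hurry would probably just call int(value) inline and let the ValueError blow up where it actually matters.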

Spot the sign:

If the code looks like it was written by a linter who dreams of getting promoted, it’s probably an LLM.

4. Lack of external context or file dependencies

Real-world Python code is rarely a self-contained masterpiece. It has baggage. It reads from config files, loads .env variables, interacts with messy APIs, writes logs to stderr, or breaks because someone renamed a JSON key in 2021.

LLM-generated code?
No context. No strings attached. Just perfectly self-isolated code that could run on a desert island.

Example time:

Ask an LLM to write a script that fetches user data. Here’s what you’ll often get:

import requests

def get_user_data(user_id):
    url = f"https://api.example.com/users/{user_id}"
    response = requests.get(url)
    return response.json()

It looks fine… until you realize:

  • There’s no auth token being passed
  • No config file or .env usage for the API key
  • No error handling for timeouts, bad status codes, or malformed JSON
  • No logging
  • No retries
  • No rate limiting
  • No CLI args or integration with other modules

In other words: no real-world messiness.

LLMs love complete, context-free examples. They don’t reach for global config management, feature flags, or dependency injection unless specifically told to. So if you’re reading a script that seems to operate in a vacuum? That’s your clue.

Human code smells like environment

We’ve all seen this:

import os
API_KEY = os.getenv("API_KEY")

And half the time it crashes because .env wasn’t set properly.
That’s real dev life.

If the code doesn’t interact with files, environments, or anything outside its little world, chances are it came from the mind of a model, not a developer balancing 3 microservices and a migraine.
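
For comparison, here’s a rough sketch of what that developer’s version of the earlier fetch function tends to grow into; the environment variable names, base URL, and timeout are placeholders, not a recommendation:

import os
import requests

API_BASE = os.getenv("USER_API_BASE", "https://api.example.com")
API_TOKEN = os.getenv("USER_API_TOKEN")  # blows up in staging when .env is missing, ask me how I know

def get_user_data(user_id):
    response = requests.get(
        f"{API_BASE}/users/{user_id}",
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        timeout=5,  # added after that one deploy that hung for 20 minutes
    )
    response.raise_for_status()  # let bad status codes fail loudly instead of returning junk
    return response.json()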

5. It solves toy problems, not real ones

If you’ve ever asked an LLM to “write a Python script to clean up CSV data,” you’ve probably gotten something like this:

import csv

def clean_csv(input_file, output_file):
    with open(input_file, 'r') as infile, open(output_file, 'w', newline='') as outfile:
        reader = csv.reader(infile)
        writer = csv.writer(outfile)

        for row in reader:
            cleaned = [item.strip() for item in row]
            writer.writerow(cleaned)

Nice. It works. It’s clean. But it’s also… kinda useless.

Where’s the:

  • Schema validation?
  • Logging for skipped rows?
  • Column renaming or formatting?
  • Handling of missing or malformed data?
  • Real input/output handling via CLI or a data pipeline?

LLMs excel at toy problems: small, self-contained puzzles with a single, obvious answer. They don’t naturally reach for broader architecture, integrations, or long-term maintainability unless you walk them there prompt by prompt.
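
To make that concrete, here’s a hedged sketch of the plumbing a production version of that CSV cleaner usually accumulates; the three-column constraint is invented purely for illustration:

import csv
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("clean_csv")

EXPECTED_COLUMNS = 3  # invented constraint, stands in for real schema validation

def clean_csv(input_file, output_file):
    skipped = 0
    with open(input_file, newline='') as infile, open(output_file, 'w', newline='') as outfile:
        reader = csv.reader(infile)
        writer = csv.writer(outfile)
        header = next(reader, None)
        if header is None:
            raise ValueError(f"{input_file} is empty")
        writer.writerow([col.strip().lower() for col in header])  # normalize column names
        for line_no, row in enumerate(reader, start=2):
            if len(row) != EXPECTED_COLUMNS:
                skipped += 1
                log.warning("Skipping malformed row %d: %r", line_no, row)
                continue
            writer.writerow([item.strip() for item in row])
    log.info("Done, skipped %d malformed rows", skipped)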

Here’s the kicker:

Real developer code is rarely just a neat function. It’s a messy web of:

  • Context
  • Constraints
  • Trade-offs
  • And edge cases

LLM code usually assumes the happy path: a perfectly formatted CSV, a stable API, a cooperative user. Basically, life as it exists in tutorials, not production.

Warning flag:

If a script looks like it came from a coding challenge site instead of a real dev repo, it’s probably LLM-born.

6. Perfectly balanced functions… as all things should be?

LLM-generated Python has a weird obsession with balance.

Functions are almost always:

  • 3 to 5 lines
  • Single responsibility
  • Cleanly named
  • Strictly typed

And you might think: “That’s great, isn’t it?”
Yes… until it’s not.

Here’s what that usually looks like:

def get_input() -> str:
    return input("Enter your name: ")

def format_name(name: str) -> str:
    return name.title()

def greet_user(name: str) -> None:
    print(f"Hello, {name}!")

Perfectly modular. Textbook SOLID principles. But in real life? A developer probably smashed this into one function, added a try/except, logged the result, and moved on with their day.
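
That smashed-together version might look something like this sketch; the try/except and the TODO are exactly the kind of shortcuts real repos collect:

import logging

def greet():
    try:
        name = input("Enter your name: ").title()
    except EOFError:
        name = "stranger"  # TODO: figure out why CI pipes empty stdin here
    logging.info("Greeting %s", name)
    print(f"Hello, {name}!")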

Real-world functions sometimes:

  • Mix I/O with business logic (yeah, we know… we’re sorry)
  • Return weird, inconsistent types
  • Have fast hacks or TODO comments buried in them
  • Grow over time like fungus in a dark repo

Why do LLMs avoid that?

Because they’ve been trained on curated datasets that often emphasize style over survival. They default to elegant form, not chaotic function.

And let’s be real, sometimes elegance gets in the way. When you’re fixing a production bug at 2AM, you’re not thinking about “single responsibility.” You’re thinking, “Why the hell is this not working?”

Dev instinct vs AI discipline:

If a codebase is filled with tiny, perfect LEGO bricks of functions, but no real mess, urgency, or scars, it’s probably AI code.

7. It looks like it read 10,000 Stack Overflow answers and merged them

You ever read a Python snippet and think,
“Wait… didn’t I see this exact thing in a Stack Overflow post from 2016?”

That’s because you probably did, and so did the LLM.

One of the dead giveaways of AI-generated code is its copy-paste Frankenstein effect:
It pulls patterns, syntax, and solutions from the entire open-source universe and stitches them together into one eerily polished solution.

Like this:

import re
import argparse

def parse_input():
    parser = argparse.ArgumentParser(description="Parse input with regex.")
    parser.add_argument("input", type=str, help="The input string")
    return parser.parse_args()

def match_pattern(input_string):
    pattern = r"^\w+@\w+\.\w+$"
    return re.match(pattern, input_string)

if __name__ == "__main__":
    args = parse_input()
    if match_pattern(args.input):
        print("Valid email")
    else:
        print("Invalid email")

Looks good, right? But look closer: this is a textbook “best practice” mashup:

  • Regex from a tutorial
  • CLI from the argparse docs
  • Main guard like in every beginner’s project
  • No real logging, no tests, no extensibility

This isn’t wrong; it’s just… hollow.

Why it happens:

LLMs don’t truly understand what they’re writing; they recognize patterns and stitch them together. So what you get is a soulless (albeit functional) blend of answers that sound right because they were correct somewhere on the internet.

But real devs? We reinvent wheels, use half-baked libraries, name things inconsistently, and write bugs with confidence. Our code lives. It bleeds. It changes shape when you push it to production.
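
By contrast, the human version of that email check in the wild often looks more like this hypothetical one-liner:

def looks_like_email(s):
    # good enough for the signup form; marketing will tell us if it isn't
    return "@" in s and "." in s.split("@")[-1]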

Spot the Frankenstein:

If the code feels like it came from 5 tutorials at once, with no single author bringing a voice, intuition, or questionable judgment, it’s probably an LLM patchwork.

Time to wrap it all up.

The human touch still matters

Let’s be clear: LLMs aren’t the enemy.
They’re fast, consistent, and honestly better at typing than half of us before coffee. But you can spot them, not because they’re bad, but because they’re too good in the wrong ways.

Here’s a quick recap of how to sniff out AI-written Python:

  1. Over-commented simplicity: it explains a + b like it’s solving world peace
  2. Overly descriptive variable names: loop_index instead of just i
  3. Structurally flawless code: like it was groomed by a linter and raised in a style guide
  4. Lacks external context: no configs, no logs, no dirty .env love
  5. Solves toy problems: no concern for prod constraints or real-world mess
  6. Functions too balanced: modular perfection, but missing real dev laziness
  7. Feels like Stack Overflow had a baby with an API doc: technically correct, spiritually empty

Does this mean AI-generated code is bad?
Not at all.

In fact, LLMs are a superpower when paired with a human brain that knows when to say:

“Cool, but let me clean this up so I can live with it in production.”

The goal isn’t to stop using LLMs; it’s to use them deliberately. Know their blind spots. Review their logic. Don’t copy-paste without a sanity check. And don’t assume that clean = correct.

The future of coding isn’t human vs AI. It’s human with AI.
And the best devs will be the ones who know which parts to trust and which parts to rewrite at 2AM, swearing the whole time.
