you’ve probably felt it. you’re reviewing a pull request, or maybe you’re inheriting a new module, and something about the code just feels… off. it runs fine, the logic is correct, but it has this sterile, textbook-perfect vibe that real developers rarely produce. it’s the uncanny valley of software—code that looks human, but isn’t.
after you've stared at thousands of lines from both humans and LLMs, certain patterns become glaringly obvious. here's a deep dive into the tells, written from the trenches.
1. formatting & layout patterns
what it looks like: imagine opening a file where every line is neatly wrapped at exactly 80 characters, every indentation is a perfect 4 spaces, and there’s a symmetrical, almost rhythmic, spacing around every operator and after every comma. braces are always on the same line (or always on the next), and files are organized with a rigid, cookie-cutter structure.
why it stands out: this is the output of a model trained on style guides, not a person typing under a deadline. humans are inconsistent. we might use 2 spaces in one file and 4 in another. we let lines run long if we’re in a hurry. we might put the opening brace on a new line for a function but keep it inline for a short if statement. this "over-consistent" formatting lacks the natural drift that comes from context switching, fatigue, or simply not caring about aesthetic perfection in a one-off script.
how humans usually differ: a human-written codebase often has slight formatting variations across files, especially if multiple people have touched it. you’ll see a mix of tab widths, occasional trailing whitespace, and inconsistent blank lines. it’s lived-in.
javascript
// AI-like: Perfectly symmetrical, regimented.
function calculateTotal(items) {
    let total = 0;
    for (let item of items) {
        total += item.price;
    }
    return total;
}
// Human-like: a bit looser, pragmatic.
function calcTotal(items) {
  let t = 0;
  for (const item of items) t += item.price;
  return t;
}
2. commenting behavior
what it looks like: every function, no matter how trivial, has a formal docstring block. line-by-line comments explain what the code is doing, not why. you’ll see novellas for a simple adder function: """Adds two numbers together and returns the result.""". comments feel like subtitles for someone reading the code aloud.
why it stands out: ai is trained to "document" by describing syntax, not intent. it lacks the human judgment to know when a comment is superfluous. perfect, uniform commenting across an entire codebase is a huge red flag. humans comment sparingly, usually to explain a non-obvious "why," a tricky workaround, or a // TODO: fix this hack.
how humans usually differ: human code is under-documented. we comment only where it hurts—where the logic is complex, the business rule is obscure, or we’re leaving a warning for our future selves. throwaway scripts and internal helpers often have zero comments.
python
# AI-like: Commenting the obvious.
# Create a list of numbers from 0 to 9
numbers = list(range(10))
# Loop through the list
for num in numbers:
    # Print the current number
    print(num)
# Human-like: commenting the "why."
import random

# Use a fixed seed for reproducible test failures
random.seed(42)

# TODO: This breaks on February 29th, needs proper date lib
def is_weekday(date_str):
    ...
3. naming conventions
what it looks like: variable and function names are often hyper-descriptive, like calculate_total_amount_from_items_list or final_processed_data_output. conversely, they can also swing to extremely generic names like temp, data, or value1 in the same file. the names are syntactically perfect but soulless—devoid of the slang, shorthand, or personal quirks you see in human code (like getStuff() or final_final_parser_v3).
why it stands out: LLMs stitch together naming patterns from their training data, which ranges from meticulously named open-source projects to hastily written snippets. without true intent, the names lack a consistent "voice." they’re either robotically precise or weirdly vague, and the style can shift dramatically within a single file.
how humans usually differ: humans develop a personal or team naming culture. we might be consistently terse (getX) or consistently descriptive, but we’re rarely both in the same scope. we also break our own rules when we’re tired, leading to foo, bar, or thingy in a quick prototype.
python
# AI-like: Inconsistent, overly verbose, or oddly generic.
def process_user_input_data(input_data_string):
    parsed_data = json.loads(input_data_string)
    result = perform_calculation(parsed_data)
    return result

def helper(a, b):  # Suddenly super generic.
    return a + b

# Human-like: more consistent tone with accepted shorthand
def parse_input(json_str):
    data = json.loads(json_str)
    return calc(data)

def add(a, b):
    return a + b
4. structural & architectural choices
what it looks like: a simple 20-line script is refactored into a class with three helper methods and an abstract base "just in case." there's an overuse of design patterns like singletons or factories for problems that don't need them. you'll see enumerate() where the index is never actually used, or unnecessary try...except blocks wrapping every operation.
why it stands out: ai models are trained on "best practice" examples, which often emphasize abstraction, reusability, and defensive patterns. they tend to over-apply these patterns, resulting in code that feels academic and over-engineered for the task at hand. it’s solving for textbook correctness, not for shipping.
how humans usually differ: senior developers know when to abstract and when to keep it simple. we often write straightforward, slightly duplicative code first and only refactor when duplication becomes a real problem. we avoid creating classes for things that are just functions.
javascript
// AI-like: Over-abstracted for a simple task.
class DataProcessor {
    constructor(data) {
        this.data = data;
    }
    validate() { /* ... */ }
    normalize() { /* ... */ }
    process() {
        this.validate();
        this.normalize();
        // ... actual processing
    }
}

// Human-like: a plain function that gets the job done.
function processData(data) {
    if (!data) return null;
    // quick validation and normalization inline
    const cleaned = data.map(item => ({ ...item, value: Number(item.value) }));
    // ... process
    return cleaned;
}
5. error handling & edge cases
what it looks like: every function has a blanket try...catch that logs a generic error message like "An error occurred." edge cases are handled obsessively, even for scripts that will run once in a controlled environment. you might see custom error classes for a simple CLI tool.
why it stands out: the model is trained on code that aims to be robust, so it defaults to a "bubble wrap" approach. it lacks the contextual understanding to know when defensive coding is essential (e.g., a payment service) and when it’s overkill (e.g., a one-time data migration script).
how humans usually differ: we’re pragmatic. in production code, we handle specific, likely errors. in scripts, we might let things crash and read the stack trace. we often comment // assuming the input is valid for now or // FIXME: add proper error handling as a placeholder.
python
# AI-like: Defensive to a fault.
import json

def read_config(path):
    try:
        with open(path, 'r') as f:
            data = json.load(f)
        return data
    except FileNotFoundError:
        print("File not found.")
        return None
    except json.JSONDecodeError:
        print("Invalid JSON.")
        return None
    except Exception as e:
        print(f"An error occurred: {e}")
        return None

# Human-like: often more direct, handling errors where it matters.
def read_config(path):
    with open(path) as f:
        return json.load(f)  # Let it crash if the file is missing or malformed.
6. language & library preferences
what it looks like: the code uses the safest, most conservative defaults. it avoids "hacky" but common solutions (like using setTimeout with 0 for deferring execution in JavaScript). imports are often bloated, bringing in every library that might be relevant, leading to unused import statements at the top of files.
why it stands out: ai models are optimized to generate correct, widely acceptable code. they shy away from idiomatic shortcuts that are technically impure but practically useful. the import bloat happens because the model is trying to ensure all possible dependencies are available, a "just-in-case" approach.
how humans usually differ: we use language-specific idioms and sometimes employ clever hacks (with a comment explaining why). we trim imports ruthlessly and only add a library after we’ve confirmed we need it.
python
# AI-like: Conservative and import-heavy.
import numpy as np
import pandas as pd
import re, json, os, sys, time, datetime  # Many unused.

def find_pattern(text):
    return re.search(r'\d+', text)

# Human-like: lean imports, idiomatic hack.
import re

def find_pattern(text):
    # re.search returns None if not found, which is falsy.
    return re.search(r'\d+', text) or 0
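and for comparison, the setTimeout-with-0 trick mentioned above is exactly the kind of technically-impure-but-useful idiom a human reaches for, usually with a note explaining it. a contrived sketch (renderTable and highlightNewRows are made-up placeholder functions):
javascript
// Human-like: defer with setTimeout(0) so the highlight runs after the table re-renders.
// hacky, but a full observer/queue setup is overkill for one screen.
renderTable(rows);
setTimeout(() => highlightNewRows(rows), 0);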
7. statistical & stylistic regularities
what it looks like: if you look across multiple files generated by the same ai, you’ll notice an eerie repetition in phrasing, structure, and even variable name choices. there’s a symmetry and polish that humans rarely maintain consistently. the code lacks "fatigue"—there are no rushed, sloppy sections that were written at 2 a.m.
why it stands out: an ai doesn’t get tired, bored, or impatient. its output is statistically regular, drawn from the most common patterns in its training data. humans, on the other hand, have good days and bad days, which shows in the code.
how humans usually differ: our code tells a story. the careful, well-commented function written at the start of a project contrasts with the messy, hardcoded function added during a late-night bug fix. this variation is a signature of human authorship.
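a contrived sketch of that regularity: two functions pulled from "different" generated files, same skeleton, same comment phrasing, same error message (both functions and the transform helper are invented for illustration):
javascript
// file_a.js
/**
 * Processes the user data and returns the result.
 */
function processUserData(userData) {
    // Validate the input data
    if (!userData) {
        throw new Error("Invalid input data");
    }
    // Return the processed result
    return transform(userData);
}

// file_b.js, generated separately
/**
 * Processes the order data and returns the result.
 */
function processOrderData(orderData) {
    // Validate the input data
    if (!orderData) {
        throw new Error("Invalid input data");
    }
    // Return the processed result
    return transform(orderData);
}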
8. comparison with real human code
the difference often boils down to intent. junior developers might write messy code, but it’s a mess with a personal fingerprint. senior developers write clean code, but it’s clean with purpose, not perfection.
why production code is messier: real-world code is shaped by conflicting requirements, tight deadlines, legacy constraints, and bug fixes. it accumulates scars (// HACK:), temporary workarounds, and layers of history. ai code, in contrast, often feels like a greenfield project that never had to endure the chaos of maintenance.
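a rough sketch of what those scars look like in practice (the business rule, ticket reference, and function names here are all invented):
javascript
// HACK: refunds imported from the legacy system are stored twice (see TICKET-1234).
// subtract the legacy copies until finance backfills the old records.
function getRefundTotal(refunds) {
    const total = refunds.reduce((sum, r) => sum + r.amount, 0);
    const legacyDupes = refunds
        .filter(r => r.source === 'legacy-import')
        .reduce((sum, r) => sum + r.amount, 0);
    return total - legacyDupes; // yes, this is gross. see the ticket.
}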
where ai code feels "too careful": it’s like a student who has memorized the textbook but hasn’t yet learned when to break the rules for practical gain. the code is correct, but it lacks the confident, sometimes ruthless, pragmatism of an experienced developer who knows what to ignore.
wrapping up
spotting ai-generated code isn’t about finding bugs—it’s about spotting the absence of human fingerprints. it’s the lack of inconsistency, the surplus of explanation, and the over-application of textbook patterns.
as ai tools become more integrated, this uncanny valley might shrink. but for now, a healthy dose of skepticism towards code that feels "too perfect" is probably a good thing. real software development is messy, human, and beautifully imperfect.
Any sufficiently advanced technology is indistinguishable from magic.
- Arthur C. Clarke
Extra: the "AI-like" snippets in this post were themselves generated by AI, using multiple models.