TL;DR
LLMs are probabilistic—they predict text; they don't parse logic. When navigating massive legacy codebases, "guessing" isn't enough. By using Abstract Syntax Trees (AST) and Cyclomatic Complexity, you can map out technical debt deterministically instead of relying on an AI's "vibe check."
The 2:00 AM Inheritance
Last week, I inherited a 50,000-line Python monolith.
You know the type: a Django app from 2017 with six different authentication schemes, import statements that circle back on themselves, and a utils.py that was somehow 3,000 lines long.
My first instinct? Ask an LLM to explain it.
The response was… fine. It gave me a plausible architectural overview and identified some patterns. But when I drilled into specific files, the LLM’s confidence didn’t match reality. It would describe a function’s behavior perfectly but miss that it had a cyclomatic complexity of 23 and was nested seven layers deep in exception handlers.
That’s when it clicked: LLMs are great at natural language, but code isn’t natural language. It’s a formal grammar with a deterministic structure. If you want to understand structure, you need structural tools.
The Bottleneck is Mapping, Not Reading
When we talk about "technical debt," we’re really talking about mental load. How long does it take a human to load this codebase into their brain?
Reading code top to bottom is like trying to understand a skyscraper by walking through it blindfolded. You'll eventually get there, but it's slow and error-prone. You need a mental model of:
- Which functions call which other functions?
- How do files relate to each other?
- Where does the complexity actually live?
An LLM can summarize what it sees, but it’s fundamentally probabilistic. It can't tell you that Function A has 18 execution paths while Function B has 3. That distinction matters when you’re deciding what to refactor first.
The Ground Truth: Abstract Syntax Trees (AST)
Every piece of code you write is parsed into an Abstract Syntax Tree (AST) by the interpreter before it ever runs. That tree is the "ground truth" of your program.
When you visualize an AST, you can literally see complexity:
- Deep branches: Heavily nested loops and conditionals.
- Wide branches: Functions with too many decision paths.
- Tangled roots: Circular dependencies.
Instead of asking an AI "is this code complex?", AST-based tools allow you to measure it.
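Python exposes this directly through the built-in ast module. Here is a minimal sketch: parse a small, deliberately nested function and dump its tree (the toy `check` function is just an illustration):

```python
import ast

# A deliberately nested toy function to parse.
source = """
def check(user):
    if user:
        if user.active:
            for role in user.roles:
                if role == "admin":
                    return True
    return False
"""

tree = ast.parse(source)
# ast.dump renders the full tree; the indent= argument needs Python 3.9+.
print(ast.dump(tree, indent=4))
```

Even on this tiny example, the printed tree makes the three levels of nesting impossible to miss.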
The Math of Bugs: Cyclomatic Complexity
One of the most useful signals you can extract from an AST is Cyclomatic Complexity. Developed by Thomas McCabe, it measures the number of independent paths through a function’s code.
The formal representation is:
M = E - N + 2P
(Where E = edges, N = nodes, and P = connected components.)
In simpler terms: start with a base of 1, then add 1 for every decision point (if, elif, for, while, each except clause, and each and/or in a boolean expression).
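That counting rule translates almost directly into an ast.NodeVisitor. This is a rough sketch, not a complete McCabe implementation (it ignores match statements, ternaries, and comprehension conditions, among other things):

```python
import ast

class ComplexityCounter(ast.NodeVisitor):
    """Approximate McCabe complexity: base 1 plus one per decision point."""

    def __init__(self):
        self.score = 1  # every function has at least one execution path

    def visit_If(self, node):
        self.score += 1  # covers both `if` and `elif` (an elif is a nested If)
        self.generic_visit(node)

    def visit_For(self, node):
        self.score += 1
        self.generic_visit(node)

    def visit_While(self, node):
        self.score += 1
        self.generic_visit(node)

    def visit_ExceptHandler(self, node):
        self.score += 1  # one per except clause
        self.generic_visit(node)

    def visit_BoolOp(self, node):
        # each extra operand in an `and`/`or` chain adds a path
        self.score += len(node.values) - 1
        self.generic_visit(node)

def cyclomatic_complexity(func_source: str) -> int:
    counter = ComplexityCounter()
    counter.visit(ast.parse(func_source))
    return counter.score

print(cyclomatic_complexity(
    "def f(x):\n"
    "    if x > 0 and x < 10:\n"
    "        return 'small'\n"
    "    return 'other'\n"
))  # -> 3: base 1, +1 for the `if`, +1 for the `and`
```

For production use, established tools like radon compute the same metric with far more edge cases handled; the point here is only that the score falls out of the tree deterministically.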
The Risk Scale
| Complexity | Risk Level | Meaning |
|---|---|---|
| 1–10 | Low | Easy to maintain. High testability. |
| 11–20 | Moderate | Needs monitoring; getting "wordy." |
| 21+ | High | Refactor candidate. Statistically high bug density. |
Research shows that functions with a complexity score over 10 are significantly more likely to contain bugs. Not because the code is "bad," but because humans struggle to track more than 7–10 execution paths in working memory.
The "Cyborg" Workflow: How to Cut the Mess
I’ve stopped approaching legacy code blindly. Here is my 3-step structural workflow using ast-visualizer.com:
1. The Satellite View (Dependency Graphs)
First, I visualize the project’s import structure as a network graph.
- The Goal: Identify bottleneck files—the "God Objects" everything depends on.
- Why it beats LLMs: An AI might say "This looks like a core module." A graph shows you that 94% of your codebase imports it. If you touch that file, you touch everything.
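A minimal version of this view can be built with the standard library alone. The sketch below (my own simplification: it matches imports by top-level name only and does not resolve relative imports or package paths) maps each file to what it imports, then counts fan-in to surface God Object candidates:

```python
import ast
from collections import defaultdict
from pathlib import Path

def build_import_graph(project_root: str) -> dict:
    """Map each .py file to the top-level module names it imports."""
    graph = defaultdict(set)
    for path in Path(project_root).rglob("*.py"):
        tree = ast.parse(path.read_text(encoding="utf-8"))
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                for alias in node.names:
                    graph[path.name].add(alias.name.split(".")[0])
            elif isinstance(node, ast.ImportFrom) and node.module:
                graph[path.name].add(node.module.split(".")[0])
    return dict(graph)

def fan_in(graph: dict) -> dict:
    """Count how many files import each module: high fan-in = bottleneck risk."""
    counts = defaultdict(int)
    for imports in graph.values():
        for module in imports:
            counts[module] += 1
    return dict(counts)
```

Sorting the fan-in counts in descending order gives you the "touch this and you touch everything" list before you've read a single line of code.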
2. The Terrain Map (Complexity Heatmaps)
I run a "Depth-over-Sequence" chart. Imagine a line graph where the X-axis is "lines of code" and the Y-axis is "nesting depth."
- Green: Depth 0–6 (Safe)
- Red: Depth 10+ (The "Danger Zone")

When I ran this on my 3,000-line helpers.py, 80% of it was red. I didn't need to read a single line to know exactly where the bugs were hiding.
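You can approximate the raw data behind such a chart with a short AST walk. This sketch records, for each line, how many block-opening statements enclose it; where to draw the green/red thresholds is your call:

```python
import ast

# Statements that open a new indented block, i.e. add a nesting level.
BLOCK_NODES = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef,
               ast.If, ast.For, ast.While, ast.With, ast.Try,
               ast.ExceptHandler)

def depth_by_line(source: str) -> dict:
    """Max nesting depth of the statements that start on each line."""
    depths = {}

    def walk(node, depth):
        if isinstance(node, ast.stmt):
            depths[node.lineno] = max(depths.get(node.lineno, 0), depth)
        # descend one level deeper only through block-opening statements
        child_depth = depth + 1 if isinstance(node, BLOCK_NODES) else depth
        for child in ast.iter_child_nodes(node):
            walk(child, child_depth)

    walk(ast.parse(source), 0)
    return depths

src = (
    "def f(x):\n"
    "    if x:\n"
    "        for i in x:\n"
    "            print(i)\n"
)
print(depth_by_line(src))  # {1: 0, 2: 1, 3: 2, 4: 3}
```

Plot line number on the X-axis and depth on the Y-axis and you have the Depth-over-Sequence chart: flat stretches are safe, sustained plateaus above your threshold are the danger zone.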
3. The Microscope (Individual Function AST)
For the gnarliest functions, I visualize the tree itself. I look for asymmetric branches. If one side of an if/else is 50 lines deep and the other is 2, I've found a hidden edge case that’s likely poorly understood and untested.
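A crude detector for these asymmetric branches can use statement count as a proxy for branch weight. In this sketch, both the helper names and the 10x threshold are arbitrary illustrations:

```python
import ast

def branch_size(stmts) -> int:
    """Number of statements in a branch body, including nested ones."""
    return sum(1 for s in stmts for n in ast.walk(s) if isinstance(n, ast.stmt))

def asymmetric_branches(source: str, ratio: int = 10) -> list:
    """Return (lineno, if_size, else_size) for every if/else pair where
    one branch is at least `ratio` times heavier than the other."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.If) and node.orelse:
            a, b = branch_size(node.body), branch_size(node.orelse)
            if max(a, b) >= ratio * max(min(a, b), 1):
                findings.append((node.lineno, a, b))
    return findings
```

A hypothetical hit like `(412, 40, 1)` reads as: at line 412, the if arm carries 40 statements while the else arm carries one. That lonely else is usually where the poorly understood edge case lives.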
Conclusion: Use the Right Tool for the Job
I’m not saying "don't use LLMs." I use them daily. But I’ve learned to use them for what they’re good at: Semantics.
- Use LLMs for: Generating boilerplate, explaining what a function intends to do, and suggesting refactors after you’ve found the target.
- Use AST Analysis for: Structure. Measuring complexity objectively and mapping the territory of a massive project.
If you're staring at a "helper" file that gives you anxiety, stop reading and start mapping.