DEV Community

John sellin

Why LLM debugging fails on fragmented repository context

After spending a lot of time debugging large repositories with ChatGPT/Claude, I kept noticing the same failure pattern:

The model was not necessarily "bad at coding"; it was operating on a broken map of the repository.

The typical workflow looks something like this:

  • search manually across many files
  • paste fragmented snippets
  • omit surrounding implementation paths
  • lose imports/dependencies/state flow
  • ask the model to reason across missing context

At that point the model starts interpolating architecture that may not actually exist.

A lot of “hallucinated code” is really just missing repository state.

That was the motivation behind building grab.

grab is a small terminal-native workflow tool for iterative repository context extraction.

The workflow becomes:

search → extract → accumulate → recurse

Instead of repeatedly starting over with disconnected snippets, repository context is accumulated incrementally across extraction passes.
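To make the accumulation idea concrete, here is a minimal Python sketch. It is not grab's actual implementation; `ContextBuffer` and its methods are hypothetical names used only for illustration. It shows one way to keep snippets from several extraction passes in a single ordered buffer without duplicating a range that was already extracted.

```python
from dataclasses import dataclass, field


@dataclass
class ContextBuffer:
    """Hypothetical accumulator: keeps every extracted snippet across passes."""

    # Maps (path, start, end) -> snippet text; dicts preserve insertion order.
    snippets: dict = field(default_factory=dict)

    def add(self, path: str, start: int, end: int, text: str) -> bool:
        """Store a snippet once; return False if that exact range is already held."""
        key = (path, start, end)
        if key in self.snippets:
            return False
        self.snippets[key] = text
        return True

    def render(self) -> str:
        """Join everything gathered so far into one pasteable block."""
        parts = []
        for (path, start, end), text in self.snippets.items():
            parts.append(f"# {path}:{start}-{end}\n{text}")
        return "\n\n".join(parts)


buffer = ContextBuffer()
buffer.add("auth.py", 10, 12, "def login(user):\n    check(user)\n    return True")
buffer.add("server.py", 40, 41, "def handle(req):\n    return login(req.user)")
# A repeated extraction of the same range does not duplicate context.
repeated = buffer.add("auth.py", 10, 12, "def login(user):\n    check(user)\n    return True")
print(buffer.render())
```

Each snippet carries its own `path:start-end` header, so the model can see where every piece of context came from.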

Example:

grab --tree
grab auth
grab --functions server.py
grab 500 635 auth.cs

Function indexing ended up being especially useful because it exposes exact extraction coordinates:

file:start-end

That allows the model to recursively request additional implementation context explicitly instead of guessing hidden code paths.
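As an illustration of what a function index with extraction coordinates might look like, here is a rough Python sketch built on the standard `ast` module. It only handles Python source and is an assumption about the general idea, not a description of how grab indexes files; `index_functions` and `parse_coordinate` are hypothetical helper names.

```python
import ast


def index_functions(source: str, filename: str) -> list[str]:
    """Return 'file:start-end name' entries for each function in the source."""
    tree = ast.parse(source)
    entries = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # lineno/end_lineno are 1-based and inclusive (end_lineno: Python 3.8+).
            entries.append((node.lineno, node.end_lineno, node.name))
    entries.sort()
    return [f"{filename}:{start}-{end} {name}" for start, end, name in entries]


def parse_coordinate(coord: str) -> tuple[str, int, int]:
    """Split 'file:start-end' back into path, start line, and end line."""
    path, _, span = coord.rpartition(":")
    start, _, end = span.partition("-")
    return path, int(start), int(end)


SAMPLE = '''def login(user):
    return check(user)


def check(user):
    if not user:
        return False
    return True
'''

for entry in index_functions(SAMPLE, "server.py"):
    print(entry)
# prints:
# server.py:1-2 login
# server.py:5-8 check
```

With an index like this in the prompt, the model can ask for `server.py:5-8` by coordinate rather than describing the code it thinks it needs.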

The accumulated context is continuously copied into the active clipboard/tmux buffer, which makes iterative debugging surprisingly fast in practice.

The tool is intentionally simple right now:

  • ripgrep
  • sed
  • terminal workflows
  • clipboard/tmux accumulation
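For readers who have not used sed for this, the line-range step amounts to printing lines N through M of a file, as `sed -n 'N,Mp'` does. The Python sketch below shows the same deterministic extraction; it is an illustrative equivalent rather than grab's code, and `extract_range` is a hypothetical name.

```python
from itertools import islice
from pathlib import Path
import tempfile


def extract_range(path: str, start: int, end: int) -> str:
    """Return lines start..end (1-based, inclusive), like `sed -n 'start,endp'`."""
    if start < 1 or end < start:
        raise ValueError(f"invalid range {start}-{end}")
    with open(path, encoding="utf-8", errors="replace") as handle:
        # islice stops reading at `end`, so large files are not loaded whole.
        return "".join(islice(handle, start - 1, end))


# Demonstrate on a throwaway file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("".join(f"line {n}\n" for n in range(1, 11)), encoding="utf-8")
    excerpt = extract_range(str(sample), 3, 5)

print(excerpt, end="")
```

The same coordinates always return the same lines, so an extraction can be repeated or audited later.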

But I think the underlying problem is larger than simple snippet retrieval.

Once repository context becomes fragmented, the model loses architectural continuity and starts constructing plausible-but-invented relationships between components.

I’ve started thinking about this more as:

  • repository-state acquisition
  • incremental context accumulation
  • deterministic extraction
  • semantic repository cognition

rather than “prompt engineering.”

There’s a short demo in the README showing the workflow in practice.

Repo:
https://github.com/johnsellin93/grab
