Praveen

Posted on May 23

Title: LineageLens: A "Git Blame" for AI-Generated Code

#devchallenge #githubchallenge #ai #opensource

This is a submission for the GitHub Finish-Up-A-Thon Challenge

What I Built

When an engineer uses an AI agent in the terminal to write or refactor code, git simply records the engineer as the author. The context behind that code (which prompt generated it, which model was used, and how many iterations it took) is lost the moment the terminal session ends.

I built LineageLens to fix this. It is an open-source, self-hosted proxy (running on port 8788) that intercepts AI dev tool traffic. It parses the native AI tool calls and logs the exact prompt, model, and applied edit to a local database, creating a searchable audit trail and dashboard for AI-generated code.
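To make the "searchable audit trail" idea concrete, here is a minimal sketch of what logging and looking up an edit could look like. It uses Python's built-in sqlite3 purely for illustration; the table name, columns, and helper functions (`ai_edits`, `record_edit`, `blame`) are hypothetical and are not taken from the LineageLens codebase.

```python
import sqlite3

# Hypothetical schema: one row per AI edit that the proxy observed.
SCHEMA = """
CREATE TABLE IF NOT EXISTS ai_edits (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    recorded_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    model TEXT NOT NULL,
    prompt TEXT NOT NULL,
    file_path TEXT NOT NULL,
    patch TEXT NOT NULL
)
"""


def open_log(path=":memory:"):
    """Open (or create) the local audit database."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_edit(conn, model, prompt, file_path, patch):
    """Store one AI-generated edit and return its row id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO ai_edits (model, prompt, file_path, patch) "
            "VALUES (?, ?, ?, ?)",
            (model, prompt, file_path, patch),
        )
    return cur.lastrowid


def blame(conn, file_path):
    """Return (model, prompt, patch) for every logged edit to a file, oldest first."""
    rows = conn.execute(
        "SELECT model, prompt, patch FROM ai_edits WHERE file_path = ? ORDER BY id",
        (file_path,),
    )
    return rows.fetchall()
```

Calling `blame(conn, "src/app.py")` would then answer the "which prompt and model produced this file's AI edits" question that plain git cannot.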

Demo

GitHub Repository: karnati-praveen/lineagelens
VS Code Extension: LineageLens Marketplace

The Comeback Story

The initial prototype of LineageLens was a massive headache. It relied heavily on brittle regex text-scraping to pull code blocks out of standard LLM markdown responses. It broke constantly, and the project stalled because it couldn't tell whether a developer had actually accepted the AI's suggestion or rejected it.

For this challenge, I completely ripped out the regex engine and started over. I built native protocol adapters that parse Anthropic’s tool_use blocks and OpenAI’s apply_patch DSL directly from the API streams. More importantly, I introduced a state machine that correlates an AI's proposed edit with the next turn's tool_result to definitively track whether the code was applied, rejected, or errored. That change transformed the project from a noisy text logger into a highly accurate governance tool.
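A rough sketch of that correlation step, assuming Anthropic-style content blocks (a `tool_use` block carries an `id`, and the following user turn's `tool_result` block points back at it via `tool_use_id` and may set `is_error`). The class names, the `edit_file` tool name used in examples, and especially the `REJECTION_MARKER` heuristic are assumptions for illustration, not the actual LineageLens implementation.

```python
from dataclasses import dataclass
from enum import Enum


class EditState(Enum):
    PROPOSED = "proposed"
    APPLIED = "applied"
    REJECTED = "rejected"
    ERRORED = "errored"


# Assumption: the client reports a user rejection as an error result whose text
# contains this phrase. The real marker depends on the tool being proxied.
REJECTION_MARKER = "user rejected"


@dataclass
class TrackedEdit:
    tool_use_id: str
    tool_name: str
    tool_input: dict
    state: EditState = EditState.PROPOSED


def _result_text(content):
    """tool_result content may be a plain string or a list of text blocks."""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        return " ".join(b.get("text", "") for b in content if isinstance(b, dict))
    return ""


class EditTracker:
    def __init__(self):
        self.edits = {}  # tool_use_id -> TrackedEdit

    def observe_assistant(self, content_blocks):
        """Register every tool_use block in an assistant turn as PROPOSED."""
        for block in content_blocks:
            if block.get("type") == "tool_use":
                self.edits[block["id"]] = TrackedEdit(
                    block["id"], block["name"], block.get("input", {})
                )

    def observe_user(self, content_blocks):
        """Resolve PROPOSED edits using the tool_result blocks of the next turn."""
        for block in content_blocks:
            if block.get("type") != "tool_result":
                continue
            edit = self.edits.get(block.get("tool_use_id"))
            if edit is None or edit.state is not EditState.PROPOSED:
                continue  # unknown id, or already resolved
            if not block.get("is_error"):
                edit.state = EditState.APPLIED
            elif REJECTION_MARKER in _result_text(block.get("content")).lower():
                edit.state = EditState.REJECTED
            else:
                edit.state = EditState.ERRORED
```

Keying on the tool call id rather than on the edit's text is what removes the guesswork the regex approach suffered from: a proposal stays PROPOSED until a result with the same id arrives.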

My Experience with GitHub Copilot

Rebuilding the core engine required handling complex, fragmented Server-Sent Events (SSE) and assembling streaming JSON payloads for the tool calls. GitHub Copilot was instrumental in accelerating this refactor. It helped quickly scaffold the FastAPI endpoints, write the tedious string-parsing logic for the proxy stream interception, and auto-complete the SQLAlchemy models needed for the new state-machine database architecture. It turned weeks of manual API debugging into just a few days of rapid implementation.
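For readers who have not dealt with streamed tool calls, the sketch below shows the general shape of the problem. It assumes Anthropic-style streaming events (`content_block_start`, `content_block_delta` carrying `input_json_delta` fragments, and `content_block_stop`); the function names are hypothetical, and this is an illustration of the technique rather than the parser LineageLens ships.

```python
import json


def iter_sse_data(lines):
    """Yield the JSON payload of each SSE event from an iterable of text lines."""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            if value.startswith(" "):
                value = value[1:]  # SSE strips one leading space after the colon
            data_parts.append(value)
        elif line == "" and data_parts:
            # A blank line terminates the event.
            yield json.loads("\n".join(data_parts))
            data_parts = []
    if data_parts:  # stream ended without a trailing blank line
        yield json.loads("\n".join(data_parts))


def assemble_tool_calls(events):
    """Rebuild complete tool calls from fragmented streaming events."""
    pending = {}  # content block index -> partially assembled call
    completed = []
    for event in events:
        kind = event.get("type")
        if kind == "content_block_start":
            block = event["content_block"]
            if block.get("type") == "tool_use":
                pending[event["index"]] = {
                    "id": block["id"], "name": block["name"], "chunks": []
                }
        elif kind == "content_block_delta":
            delta = event["delta"]
            if delta.get("type") == "input_json_delta" and event["index"] in pending:
                pending[event["index"]]["chunks"].append(delta["partial_json"])
        elif kind == "content_block_stop":
            call = pending.pop(event["index"], None)
            if call is not None:
                raw = "".join(call["chunks"])
                # The fragments are only valid JSON once they are concatenated.
                call_input = json.loads(raw) if raw else {}
                completed.append(
                    {"id": call["id"], "name": call["name"], "input": call_input}
                )
    return completed
```

The key design point is that the input JSON is parsed only when the block stops, because individual fragments can split a string or key mid-token.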

Top comments (4)

AudioProducer.ai • May 23

The state-machine framing - proposed / applied / rejected / errored - travels well outside dev tools. We hit the same audit-trail gap on the creative side at AudioProducer.ai: the rendered chapter is the artifact, but the structured edits underneath it (per-line speaker map, per-paragraph soundscape annotation, per-line emotion tag) all started as model proposals, and once the writer locks a chapter the prompt / model / ruleset that produced each line is gone the same way your terminal session erases the agent context. Holding the proposal-vs-applied state per line, rather than just storing the final assignment, is what lets you answer "why does this character suddenly sound different in chapter 7" three months later without re-running the whole pass.

The native-protocol-adapter move (parsing tool_use blocks and apply_patch DSL directly instead of regex over markdown) is the right level too - we kept hitting the same brittleness trying to scrape voice and emotion tags out of free-form model responses before moving to a schema-enforced structured-outputs contract. One thing I'd be curious whether your state machine captures: the "edited then applied" case, where the developer accepted the AI's edit but tweaked one identifier or one branch before committing it. On our side that's the most informative signal we have - the writer disagreed with the model just enough to override one parameter but kept the rest, which is the granularity at which the model is actually wrong in a learnable way.

Praveen • May 24

It is fascinating to see the exact same architectural problem play out in the creative space with AudioProducer.ai. You are spot on—tracking intent versus outcome is a universal AI governance challenge, not just a coding one.

Regarding the "edited then applied" case: you have identified the holy grail of fine-tuning data.

Because LineageLens currently operates purely at the network/proxy layer, our state machine logs the agent's successful apply_patch tool execution as "applied." If the developer manually tweaks the code in their editor after the agent applies it but before they type git commit, the proxy doesn't see those manual keystrokes.

To capture that precise "disagreement delta," the next architectural step is cross-referencing our proxy's logged patch against the final git commit tree. Pinpointing that exact granularity—where the human overrides the machine just slightly—is the ultimate goal for generating learnable signal.
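As a rough illustration of that cross-referencing step, the sketch below compares the file content the agent applied with the content that was finally committed and reports only the lines the human changed. It uses Python's standard difflib; the function name `disagreement_delta` is hypothetical, and how the two versions are obtained (for example, from the proxy log and from `git show HEAD:<path>`) is an assumption left outside the sketch.

```python
import difflib


def disagreement_delta(ai_applied, committed):
    """Return the line-level changes a human made on top of the AI's edit."""
    ai_lines = ai_applied.splitlines()
    human_lines = committed.splitlines()
    matcher = difflib.SequenceMatcher(a=ai_lines, b=human_lines, autojunk=False)
    overrides = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # 'replace', 'delete', or 'insert'
            overrides.append({
                "op": tag,
                "ai_lines": ai_lines[i1:i2],
                "human_lines": human_lines[j1:j2],
            })
    # similarity is 1.0 when the commit matches the AI's edit exactly
    return {"similarity": matcher.ratio(), "overrides": overrides}
```

A high similarity with a single small override is exactly the "edited then applied" case: the human kept most of the patch and disagreed on one line.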

Thanks for such a brilliant, cross-domain breakdown!

ender minyard • May 23 • Edited

How does this differ from the purpose and function of Git AI?

Also, what is the meaning of this line:

(Note: Don't forget to insert your GIF or screenshot of the LineageLens dashboard here using the image button in the editor!)

Praveen • May 24

Great question! The easiest way to think about it is that they do the exact opposite of each other.

Tools like Git AI (and most AI commit generators) look at a diff after the code is written and use AI to summarize it for human readers.

LineageLens doesn't use AI to summarize human code; it acts as an audit trail for what the AI wrote. Instead of looking at the final git diff, it sits at the network layer and intercepts the AI's tool calls before the commit happens. It tracks the exact prompt, the specific model (like Claude 3.5 Sonnet), and whether you applied or rejected the code, giving you a full provenance trail that standard git completely misses.