DEV Community

Vinayak G Hejib
Building an AI-Powered Git Commit & PR Assistant

Every engineer loves writing code (or so I'd like to believe 😉).

Not every engineer loves writing:

  • commit messages
  • PR descriptions
  • testing notes
  • release summaries
  • sprint updates

In fast-moving teams, these things usually become an afterthought.

And honestly, it shows.

We’ve all seen commits like:

```
fixed the bug
changes into MyView file
final-final-fix
ui updates on list view
```

Or some PR descriptions that simply say:

“Please review.”

The bigger the codebase gets, the worse this problem becomes.

In modular enterprise applications with multiple teams working in parallel, poor PR communication slows reviews, increases confusion, and creates release risks.

So I started building something for myself:

An AI-powered Git assistant that could understand code changes and automatically generate:

  • meaningful commit messages
  • structured PR summaries
  • testing notes
  • risk indicators
  • release notes

And I wanted it to work offline.


Why I Built It

Initially, this started as a tiny productivity experiment.

I simply wanted:

  • cleaner commit messages
  • less repetitive writing
  • faster PR creation

But after using it for a few weeks, I realized something interesting:

The real value wasn’t automation.

It was reducing cognitive overhead.

After spending hours solving architectural or UI problems, context-switching into documentation mode becomes mentally expensive.

The assistant helped bridge that gap.


High-Level Architecture

The workflow is actually pretty straightforward.

```
Git Diff
   ↓
File Analysis
   ↓
Context Extraction
   ↓
Prompt Generation
   ↓
Local LLM
   ↓
Commit + PR Output
```

The difficult part was making the outputs:

  • concise
  • trustworthy
  • reviewer-friendly
  • non-robotic

Extracting Git Changes

The first version simply passed the raw git diff directly into the model.

That worked terribly.

Large diffs:

  • exceeded context windows
  • produced noisy summaries
  • generated inaccurate PR descriptions

So I added preprocessing layers.

Example:

```python
import subprocess

# Staged changes only — the diff that is about to be committed
diff = subprocess.check_output(
    ["git", "diff", "--cached"],
    text=True
)

print(diff[:1000])
```

But raw diffs were too noisy.

So the assistant started:

  • grouping changes by module
  • ignoring formatting-only modifications
  • detecting API changes
  • identifying added vs removed logic
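
Grouping changes by module can be sketched by keying diff hunks on each file's top-level directory (a sketch; the actual heuristics in the assistant are more involved):

```python
import re
from collections import defaultdict

def group_diff_by_module(diff: str) -> dict:
    """Split a unified diff into per-module chunks, keyed by top-level directory."""
    modules = defaultdict(list)
    current = None
    for line in diff.splitlines():
        match = re.match(r"^diff --git a/(\S+) b/", line)
        if match:
            path = match.group(1)
            current = path.split("/")[0] if "/" in path else "(root)"
        if current is not None:
            modules[current].append(line)
    return {module: "\n".join(lines) for module, lines in modules.items()}
```

Each module's chunk can then be summarized (or skipped) independently instead of feeding one giant diff to the model.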

Filtering Noise

One surprisingly important improvement was removing low-signal changes.

For example:

```python
def should_ignore(line):
    ignored_patterns = [
        "import ",
        "swiftlint",
        "whitespace"
    ]

    return any(p in line for p in ignored_patterns)
```

This dramatically improved the quality of generated summaries.
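
Applied to a diff, the filter is a simple line-wise pass (a sketch; the predicate is passed in so the heuristics stay swappable):

```python
def filter_diff(diff: str, should_ignore) -> str:
    """Drop low-signal lines before the diff reaches the model."""
    kept = [line for line in diff.splitlines() if not should_ignore(line)]
    return "\n".join(kept)
```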


Prompt Engineering Was Harder Than Expected

One thing I underestimated was how sensitive outputs were to prompting.

A vague prompt generated vague PRs.

An overly detailed prompt generated essays nobody wanted to read.

Eventually I settled on prompts focused on:

  • behavioral changes
  • architectural impact
  • reviewer clarity
  • testing implications

Example:

```python
prompt = f"""
You are an experienced software engineer reviewing a pull request.

Generate:
1. concise commit message
2. PR summary
3. testing notes
4. possible risks

Ignore formatting-only changes.

Git Diff:
{processed_diff}
"""
```

The single line:

Ignore formatting-only changes

improved results massively.


Running AI Offline

This became the most interesting part of the project.

I specifically wanted:

  • local-first workflows
  • zero cloud dependency
  • privacy for enterprise repositories
  • low latency
  • no API costs

Sending proprietary diffs to external APIs was something I wanted to avoid entirely.

So I experimented with:

  • Ollama
  • llama.cpp
  • quantized local models
  • Apple Silicon optimizations

Calling the local model was surprisingly simple:

```python
import requests

response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "mistral",
        "prompt": prompt,
        "stream": False
    }
)

print(response.json()["response"])
```

For commit generation and PR summaries, smaller local models were often more than sufficient.


The Most Difficult Problems

The difficult part wasn’t generating text.

It was generating trustworthy text.

Hallucinated Features

Sometimes the model inferred functionality that didn't exist, especially when refactors looked similar to feature additions.

To reduce this:

  • prompts became shorter
  • context became more constrained
  • diffs were chunked intelligently

Huge Enterprise Diffs

Large modular repositories create massive PRs.

Passing entire diffs into the model quickly became inefficient.

So I added:

  • chunking
  • module prioritization
  • high-signal file detection

Example:

```python
MAX_CHUNK_SIZE = 4000

chunks = [
    diff[i:i + MAX_CHUNK_SIZE]
    for i in range(0, len(diff), MAX_CHUNK_SIZE)
]
```
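
Character-based slicing can cut a hunk in half, so a slightly smarter version (a sketch) breaks only at file boundaries:

```python
import re

def chunk_diff(diff: str, max_size: int = 4000) -> list:
    """Split a diff into chunks, breaking only at `diff --git` file boundaries."""
    files = re.split(r"(?=^diff --git )", diff, flags=re.M)
    chunks, current = [], ""
    for f in files:
        if current and len(current) + len(f) > max_size:
            chunks.append(current)
            current = ""
        current += f
    if current:
        chunks.append(current)
    # Note: a single file larger than max_size still yields an oversized
    # chunk, which needs a per-hunk fallback.
    return chunks
```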

Robotic Language

Early outputs sounded overly AI-generated.

Too many phrases like:

  • “enhanced functionality”
  • “optimized architecture”
  • “improved user experience”

Real engineers don’t write like that.

So prompts were tuned toward:

  • concise engineering tone
  • direct wording
  • reviewer readability

Features That Became Surprisingly Useful

The project slowly evolved beyond commit generation.

PR Risk Detection

The assistant flags:

  • shared module changes
  • navigation flow modifications
  • authentication updates
  • API contract changes
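
A first pass at this can be plain path matching (a sketch; the patterns below are illustrative, not the real rule set):

```python
RISK_PATTERNS = {
    "shared module": "Shared/",
    "navigation": "Navigation",
    "authentication": "Auth",
    "API contract": "API/",
}

def detect_risks(changed_files: list) -> list:
    """Flag risk areas based on which paths a PR touches."""
    return sorted({
        label
        for path in changed_files
        for label, pattern in RISK_PATTERNS.items()
        if pattern in path
    })
```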

Automatic Testing Notes

Example output:

```
Tested:
- Login flow
- Session recovery
- Token refresh handling
- Deep link navigation
```

This alone ended up saving a surprising amount of time.


Release Notes Generation

The assistant can summarize:

  • bug fixes
  • user-facing improvements
  • technical refactors

directly from merged commits.
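
One way to wire this up (a sketch; `commits_since` and the prompt wording are assumptions, not the exact implementation):

```python
import subprocess

def commits_since(tag: str) -> list:
    """Collect merged commit subjects since the last release tag."""
    out = subprocess.check_output(
        ["git", "log", f"{tag}..HEAD", "--merges", "--pretty=%s"],
        text=True,
    )
    return [line for line in out.splitlines() if line]

def release_notes_prompt(commits: list) -> str:
    """Build a prompt asking the model to bucket commits into sections."""
    joined = "\n".join(f"- {c}" for c in commits)
    return (
        "Summarize these merged commits as release notes with three sections: "
        "bug fixes, user-facing improvements, technical refactors.\n\n"
        f"Commits:\n{joined}"
    )
```

The resulting prompt goes to the same local model endpoint as commit generation.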


Example Output

Before

```
fixed auth issue
```

After

```
Refactor authentication recovery flow to support token refresh handling during session expiration
```

What I Learned

The biggest realization from this project was:

AI works best when augmenting engineering workflows, not replacing engineering decisions.

The assistant is not writing code for me.

It is removing repetitive cognitive work around:

  • communication
  • formatting
  • summarization
  • workflow overhead

And that turns out to be incredibly valuable.


Future Improvements

There’s still a lot I want to explore:

  • Xcode integration
  • Git hook automation
  • reviewer suggestions
  • Jira linking
  • architecture drift detection
  • PR quality scoring
  • Slack release summaries
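
For the Git hook idea, a `prepare-commit-msg` hook could run the assistant before the editor opens. A minimal sketch (the `generate_message` stand-in just names the first touched file; the real version would call the local model):

```python
import subprocess

def staged_diff() -> str:
    """The diff git is about to commit."""
    return subprocess.check_output(["git", "diff", "--cached"], text=True)

def generate_message(diff: str) -> str:
    """Stand-in for the model call: name the first touched file."""
    for line in diff.splitlines():
        if line.startswith("diff --git"):
            return f"chore: update {line.split(' b/')[-1]}"
    return "chore: update"

def prepare_commit_msg(msg_file: str) -> None:
    """Entry point: git invokes the hook with the commit-message file path."""
    diff = staged_diff()
    if diff.strip():
        with open(msg_file, "w") as f:
            f.write(generate_message(diff))
```

Saved as `.git/hooks/prepare-commit-msg` (executable, with a small wrapper passing the file path git provides), this pre-fills the message while still letting you edit it.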

I also want to experiment with embedding-based code understanding for better long-term context awareness.


Final Thoughts

Building AI tooling for toy projects is easy.

Building AI tooling that works reliably in large modular enterprise repositories is a completely different challenge.

But that’s also what makes it exciting.

What started as a small commit-message helper slowly evolved into a developer workflow copilot that now saves me time almost every day.

And honestly, this feels like just the beginning of what local AI can do for software engineering workflows.
