Every engineer loves writing code (well, most of us do 😉).
Not every engineer loves writing:
- commit messages
- PR descriptions
- testing notes
- release summaries
- sprint updates
In fast-moving teams, these things usually become an afterthought.
And honestly, it shows.
We’ve all seen commits like:
fixed the bug
changes into MyView file
final-final-fix
ui updates on list view
Or some PR descriptions that simply say:
“Please review.”
The bigger the codebase gets, the worse this problem becomes.
In modular enterprise applications with multiple teams working in parallel, poor PR communication slows reviews, increases confusion, and creates release risks.
So I started building something for myself:
An AI-powered Git assistant that could understand code changes and automatically generate:
- meaningful commit messages
- structured PR summaries
- testing notes
- risk indicators
- release notes
And I wanted it to work offline.
Why I Built It
This started as a tiny productivity experiment.
I simply wanted:
- cleaner commit messages
- less repetitive writing
- faster PR creation
But after using it for a few weeks, I realized something interesting:
The real value wasn’t automation.
It was reducing cognitive overhead.
After spending hours solving architectural or UI problems, context-switching into documentation mode becomes mentally expensive.
The assistant helped bridge that gap.
High-Level Architecture
The workflow is actually pretty straightforward.
Git Diff
↓
File Analysis
↓
Context Extraction
↓
Prompt Generation
↓
Local LLM
↓
Commit + PR Output
The difficult part was making the outputs:
- concise
- trustworthy
- reviewer-friendly
- non-robotic
Extracting Git Changes
The first version simply passed the raw git diff directly into the model.
That worked terribly.
Large diffs:
- exceeded context windows
- produced noisy summaries
- generated inaccurate PR descriptions
So I added preprocessing layers.
Example:
import subprocess

# Read only the staged changes
diff = subprocess.check_output(
    ["git", "diff", "--cached"],
    text=True
)
print(diff[:1000])
But raw diffs were too noisy.
So the assistant started (a rough sketch follows this list):
- grouping changes by module
- ignoring formatting-only modifications
- detecting API changes
- identifying added vs removed logic
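Here is a simplified illustration of the grouping step, assuming a repository layout where the top-level directory doubles as the module name. The helper below is a sketch, not the exact implementation:

import subprocess
from collections import defaultdict

def group_diff_by_module(diff_text):
    # Each file's changes begin with a "diff --git a/<path> b/<path>" header.
    # Assumes paths without spaces; top-level directory stands in for "module".
    groups = defaultdict(list)
    for file_diff in diff_text.split("diff --git ")[1:]:
        header = file_diff.splitlines()[0]   # e.g. "a/Sources/Auth/LoginView.swift b/Sources/Auth/LoginView.swift"
        path = header.split(" ")[0][2:]      # strip the leading "a/"
        module = path.split("/")[0]
        groups[module].append("diff --git " + file_diff)
    return groups

diff = subprocess.check_output(["git", "diff", "--cached"], text=True)
for module, file_diffs in group_diff_by_module(diff).items():
    print(f"{module}: {len(file_diffs)} changed file(s)")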
Filtering Noise
One surprisingly important improvement was removing low-signal changes.
For example:
def should_ignore(line):
    ignored_patterns = [
        "import ",
        "swiftlint",
        "whitespace"
    ]
    return any(p in line for p in ignored_patterns)
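Applied line by line to the diff, it looks roughly like this (a sketch; diff is the staged diff text from earlier):

high_signal_diff = "\n".join(
    line for line in diff.splitlines()
    if not should_ignore(line)
)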
This dramatically improved the quality of generated summaries.
Prompt Engineering Was Harder Than Expected
One thing I underestimated was how sensitive outputs were to prompting.
A vague prompt generated vague PRs.
An overly detailed prompt generated essays nobody wanted to read.
Eventually I settled on prompts focused on:
- behavioral changes
- architectural impact
- reviewer clarity
- testing implications
Example:
prompt = f"""
You are an experienced software engineer reviewing a pull request.
Generate:
1. concise commit message
2. PR summary
3. testing notes
4. possible risks
Ignore formatting-only changes.
Git Diff:
{processed_diff}
"""
The single line:
Ignore formatting-only changes
improved results massively.
Running AI Offline
This became the most interesting part of the project.
I specifically wanted:
- local-first workflows
- zero cloud dependency
- privacy for enterprise repositories
- low latency
- no API costs
Sending proprietary diffs to external APIs was something I wanted to avoid entirely.
So I experimented with:
- Ollama
- llama.cpp
- quantized local models
- Apple Silicon optimizations
Calling the local model was surprisingly simple:
import requests
response = requests.post(
"http://localhost:11434/api/generate",
json={
"model": "mistral",
"prompt": prompt,
"stream": False
}
)
print(response.json()["response"])
For commit generation and PR summaries, smaller local models were often more than sufficient.
The Most Difficult Problems
The difficult part wasn’t generating text.
It was generating trustworthy text.
Hallucinated Features
Sometimes the model inferred functionality that didn’t exist, especially when refactors looked similar to feature additions.
To reduce this (a small prompt tweak is sketched after this list):
- prompts became shorter
- context became more constrained
- diffs were chunked intelligently
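For the constrained-context part, one tweak that illustrates the idea (a hedged example, not the exact prompt I use) is spelling the restriction out explicitly:

# Appended to the prompt built earlier
prompt += """
Only describe behavior that is visible in the diff.
If the purpose of a change is unclear, say so instead of guessing.
"""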
Huge Enterprise Diffs
Large modular repositories create massive PRs.
Passing entire diffs into the model quickly became inefficient.
So I added:
- chunking
- module prioritization
- high-signal file detection
Example:
MAX_CHUNK_SIZE = 4000

chunks = [
    diff[i:i + MAX_CHUNK_SIZE]
    for i in range(0, len(diff), MAX_CHUNK_SIZE)
]
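Naive fixed-size chunking can split a file’s hunks in half, so a slightly smarter variant under the same size budget splits on file boundaries first. The helper below is an illustrative sketch:

def chunk_by_file(diff_text, max_size=4000):
    # Split on per-file "diff --git" headers, then pack whole files into chunks.
    file_diffs = ["diff --git " + d for d in diff_text.split("diff --git ")[1:]]
    chunks, current = [], ""
    for file_diff in file_diffs:
        if current and len(current) + len(file_diff) > max_size:
            chunks.append(current)
            current = ""
        # A single file larger than max_size still becomes its own oversized chunk;
        # that case would need hunk-level splitting.
        current += file_diff
    if current:
        chunks.append(current)
    return chunks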
Robotic Language
Early outputs sounded overly AI-generated.
Too many phrases like:
- “enhanced functionality”
- “optimized architecture”
- “improved user experience”
Real engineers don’t write like that.
So prompts were tuned toward:
- concise engineering tone
- direct wording
- reviewer readability
Features That Became Surprisingly Useful
The project slowly evolved beyond commit generation.
PR Risk Detection
The assistant flags (a simplified heuristic follows this list):
- shared module changes
- navigation flow modifications
- authentication updates
- API contract changes
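A stripped-down version of that heuristic, assuming path and symbol patterns specific to my codebase (the patterns below are illustrative):

import subprocess

RISK_PATTERNS = {
    "shared module": ["Shared/", "Common/"],
    "navigation": ["Coordinator", "Router"],
    "authentication": ["Auth", "Session", "Token"],
    "API contract": ["APIClient", "Endpoints"],
}

def detect_risks(changed_paths):
    risks = set()
    for path in changed_paths:
        for label, patterns in RISK_PATTERNS.items():
            if any(p in path for p in patterns):
                risks.add(label)
    return sorted(risks)

changed_paths = subprocess.check_output(
    ["git", "diff", "--cached", "--name-only"], text=True
).splitlines()
print(detect_risks(changed_paths))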
Automatic Testing Notes
Example output:
Tested:
- Login flow
- Session recovery
- Token refresh handling
- Deep link navigation
This alone ended up saving a surprising amount of time.
Release Notes Generation
The assistant can summarize:
- bug fixes
- user-facing improvements
- technical refactors
directly from merged commits, as sketched below.
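A minimal sketch of that step, assuming notes are generated for everything since the last release tag (the tag name and prompt wording here are illustrative); the resulting prompt goes through the same local model call shown earlier:

import subprocess

# Merged PR subjects since the last release tag (example tag name)
commits = subprocess.check_output(
    ["git", "log", "v1.4.0..HEAD", "--merges", "--pretty=%s"],
    text=True
)

release_prompt = f"""
Summarize these merged pull requests as release notes.
Group them into: bug fixes, user-facing improvements, technical refactors.

{commits}
"""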
Example Output
Before
fixed auth issue
After
Refactor authentication recovery flow to support token refresh handling during session expiration
What I Learned
The biggest realization from this project was:
AI works best when augmenting engineering workflows, not replacing engineering decisions.
The assistant is not writing code for me.
It is removing repetitive cognitive work around:
- communication
- formatting
- summarization
- workflow overhead
And that turns out to be incredibly valuable.
Future Improvements
There’s still a lot I want to explore:
- Xcode integration
- Git hook automation
- reviewer suggestions
- Jira linking
- architecture drift detection
- PR quality scoring
- Slack release summaries
I also want to experiment with embedding-based code understanding for better long-term context awareness.
Final Thoughts
Building AI tooling for toy projects is easy.
Building AI tooling that works reliably in large modular enterprise repositories is a completely different challenge.
But that’s also what makes it exciting.
What started as a small commit-message helper slowly evolved into a developer workflow copilot that now saves me time almost every day.
And honestly, this feels like just the beginning of what local AI can do for software engineering workflows.


