git-commit-agent repo at GitHub
We’ve all seen “AI that writes your git commit messages”. Most of them are wrappers around git diff + “hey model, make a message”.
This project is not that.
This project turns the model into the brain and gives it an actual toolbox — a real, callable git tool it can use to inspect, stage, and commit. The model doesn’t just guess what changed, it can ask the repo.
That’s the core idea I want to walk through in this article: what happens when you stop treating LLMs as autocomplete and start treating them as agents with tools.
And we’ll do it with a real codebase: the repo contains a TypeScript CLI called @blendsdk/git-commit-agent with:
- a LangChain-powered agent (`src/index.ts`)
- a single, well-typed master git tool (`src/tools/git-master.tool.ts`)
- prompt generators (`src/prompts/*`)
- config merging from CLI, env, and defaults (`src/config/*`)
- safety + error shaping for git (`src/utils/*`)
This is a good example of “AI agent + real tool” architecture, and it’s small enough to study.
1. What problem are we solving?
Normal “AI commit” tools have two big limitations:
- They don’t see what you see. They rely on you pasting diffs or on a single `git diff` call the tool made for you.
- They can’t act. Even if the model knows the right commit message, the integration layer has to run `git add`/`git commit` manually.
This project flips it.
The agent is allowed to run git.
Via one master tool. With a schema. With validation. With guard rails.
So the model can do this flow by itself:
- “What changed?” → call tool → `git status --porcelain`
- “Let me see more detail.” → call tool → `git diff --stat`
- “OK, I will stage modified files.” → call tool → `git add -u`
- “Now I will create a Conventional Commit message.” → call tool → `git commit -m "feat: …"`
That’s the whole point of tool-aware agents.
2. The high-level architecture
Let’s start with the big picture, based on the source:
┌───────────────────────────┐
│ CLI (yargs) │ ← src/index.ts + src/config/cli-parser.ts
└──────────────┬────────────┘
│
▼
┌───────────────────────────┐
│ Config merger │ ← CLI > ENV (.agent-config, .env) > defaults
│ src/config/config-merger│
└──────────────┬────────────┘
│
▼
┌───────────────────────────┐
│ Prompt generators │ ← src/prompts/system-prompt.ts
│ (system + task) │ src/prompts/git-prompt-generator.ts
└──────────────┬────────────┘
│
▼
┌───────────────────────────┐
│ LangChain Agent │ ← createAgent(...)
│ + OpenAI model │ ← @langchain/openai
│ + ONE git tool │ ← src/tools/git-master.tool.ts
└──────────────┬────────────┘
│
▼
┌───────────────────────────┐
│ Git repo (local) │ ← execa("git", [...])
└───────────────────────────┘
Key design choices the code makes:
- One tool, not many. Instead of exposing 10 tools (`gitStatus`, `gitCommit`, `gitDiff`, …) it exposes a single master tool that can run any git command, but only after validation.
- Prompt-driven workflow. The task prompt literally tells the agent how to work: first inspect, then decide, then commit.
- Configurable personality. CLI/env decide things like detail level, max subject length, whether to auto-stage, and whether pushes are allowed.
This is a pattern you can reuse for a dozen other dev tools.
3. Boot sequence (what actually happens)
The entrypoint is src/index.ts. Even though the file in the ZIP is partially elided, the structure is clear:
- Load env
// src/index.ts
await loadEnvironment();
This pulls from:
- a global `~/.agent-config` (nice touch for user-wide defaults)
- a local `.env` (project-specific)

See: `src/config/env-loader.ts`
- Parse CLI
const cliConfig = parseCliArguments(); // src/config/cli-parser.ts
- Merge config
const config = loadFinalConfig(cliConfig); // src/config/config-merger.ts
// priority: CLI > ENV > DEFAULTS (see src/config/prompt-config.ts)
- Detect git version
const gitVersion = await getGitVersion(); // src/utils/git-commands.ts
- Generate prompts
const systemPrompt = generateSystemPrompt(config, gitVersion);
const gitPrompt = generateGitPrompt(config);
- Create agent
const model = new ChatOpenAI({
model: process.env.OPENAI_MODEL?.toString() || "gpt-5-nano-2025-08-07",
apiKey: process.env.OPENAI_API_KEY || "<NEED API KEY>",
maxRetries: 3,
});
const agent = await createAgent({
model,
tools: [execute_git_command_tool],
systemPrompt,
});
- Run it
const streamResponse = await agent.invoke({
messages: [new HumanMessage(gitPrompt)],
});
So: CLI → config → prompts → agent → agent calls tool(s).
That’s the whole flow.
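The env-loading step deserves a closer look, because it sets up everything else. Here is a hedged, stdlib-only sketch of what a loader like `loadEnvironment` could do — the two file names come from the article, but the naive `KEY=VALUE` parsing and the first-writer-wins priority are my assumptions, not the repo's actual implementation (which may well use dotenv):

```typescript
// Hypothetical sketch of an env loader; src/config/env-loader.ts may differ.
import { readFileSync, existsSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

// Parse one `KEY=VALUE` line; returns null for comments and blank lines.
function parseEnvLine(line: string): [key: string, value: string] | null {
  const m = line.match(/^([A-Za-z_][A-Za-z0-9_]*)=(.*)$/);
  return m ? [m[1], m[2]] : null;
}

function applyEnvFile(path: string): void {
  if (!existsSync(path)) return;
  for (const line of readFileSync(path, "utf8").split("\n")) {
    const entry = parseEnvLine(line);
    // First writer wins: values already set (by the shell, or by an earlier
    // file) are kept, so the order of applyEnvFile calls encodes priority.
    if (entry && process.env[entry[0]] === undefined) {
      process.env[entry[0]] = entry[1];
    }
  }
}

async function loadEnvironment(): Promise<void> {
  applyEnvFile(join(process.cwd(), ".env"));      // project-specific
  applyEnvFile(join(homedir(), ".agent-config")); // user-wide defaults
}
```

The design point is the call order: the shell environment beats the local `.env`, which beats the global `~/.agent-config`, without any explicit merge logic.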
4. The core of it all: the git tool
This project could have exposed five tools. Instead it exposes one:
src/tools/git-master.tool.ts
Why is that smart?
- The model has one choice → less confusion.
- You can heavily validate the input → no `git reset --hard` by accident.
- You can do nice error shaping → so the model knows what went wrong.
Here’s a simplified version of what the code is doing (TypeScript):
// src/tools/git-master.tool.ts (simplified)
import { execa } from "execa";
import { tool } from "langchain";
import { z } from "zod";
import { validateGitRepo, validateCommandSyntax } from "../utils/git-commands.js";
import { GitError, type ToolResult } from "../utils/git-error.js";

export const execute_git_command_tool = tool(
  async (input): Promise<ToolResult> => {
    // 1. Must be in a git repo
    await validateGitRepo();

    // 2. Validate the command is allowed
    validateCommandSyntax(input.command, input.args, input.allowDangerous);

    try {
      const startedAt = Date.now();
      const { stdout, stderr, exitCode } = await execa("git", [
        input.command,
        ...(input.args ?? []),
      ]);
      return {
        success: exitCode === 0,
        command: `git ${input.command} ${(input.args ?? []).join(" ")}`,
        stdout,
        stderr,
        executionTime: Date.now() - startedAt,
      };
    } catch (error: any) {
      // Wrap in a domain-specific error shape
      throw new GitError(
        "GIT_COMMAND_FAILED",
        `Git command failed: ${error.message}`,
        {
          commandTried: input.command,
          suggestion: "Run `git status` and try again.",
        }
      );
    }
  },
  {
    name: "execute_git_command",
    description:
      "Run a git command in the current repository. Used for status, diff, add, commit. Will reject dangerous commands unless allowDangerous=true.",
    schema: z.object({
      command: z
        .string()
        .describe("The git subcommand, e.g. 'status', 'diff', 'add', 'commit'"),
      args: z
        .array(z.string())
        .optional()
        .describe("Arguments for the git subcommand."),
      allowDangerous: z
        .boolean()
        .optional()
        .describe("Set to true if you intentionally want to run a destructive command."),
      commitMessage: z
        .string()
        .optional()
        .describe(
          "For 'git commit': a full conventional commit message, multi-line supported."
        ),
    }),
  }
);
A couple of nice things here:
- Zod gives the model a schema → the model can fill the right JSON.
- execa is a good choice for CLI tooling (captures stdout/stderr easily).
- There are safety checks in `src/utils/git-commands.ts` to ensure we’re in a git repo and that the requested command is allowed.
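To make that last point concrete, here is a hedged sketch of what a validator in the spirit of `validateCommandSyntax` could look like. The allowlist and dangerous-list contents are my assumptions for illustration — the repo's actual rules may differ:

```typescript
// Hypothetical validator; the real src/utils/git-commands.ts may differ.
const SAFE = new Set(["status", "diff", "log", "show", "branch", "add", "commit"]);
const DANGEROUS = new Set(["reset", "clean", "checkout", "restore", "push", "rebase"]);

function validateCommandSyntax(
  command: string,
  args: string[] = [],
  allowDangerous = false
): void {
  if (SAFE.has(command)) return;
  if (DANGEROUS.has(command)) {
    if (allowDangerous) return;
    // The error message is shaped so the model can read why it was refused
    // and retry with the right flag (or a safer command).
    throw new Error(
      `Refusing 'git ${[command, ...args].join(" ")}': set allowDangerous=true if this is intentional.`
    );
  }
  throw new Error(`Unsupported git subcommand: '${command}'.`);
}
```

Note the three-way split: safe commands pass silently, dangerous ones need an explicit opt-in, and anything unrecognized is rejected outright.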
This is what I mean by “a world is opening up”: once the model can run this, it can do anything git can do — and you can give it more tools the same way.
5. The prompts: teaching the model to work like a developer
The repo has two relevant prompt files:
- `src/prompts/system-prompt.ts` → who you are + what’s allowed
- `src/prompts/git-prompt-generator.ts` → what to do right now
The system prompt is the permanent instruction. It tells the model:
- you are a git repository management assistant
- you have one master tool
- you must not push (unless config says so)
- you should log your steps if verbose
- you should produce conventional commits
Because the config is passed in, the prompt can adapt:
- if `config.push === true` → allow push
- if `config.verbose === true` → ask for more diagnostic steps
- if a git version was detected → mention it
That part in code looks a bit like this:
// src/prompts/system-prompt.ts (conceptual)
export function generateSystemPrompt(config: PromptConfig, gitVersion: string) {
return `
You are an AI assistant specialized in git repository management via ONE master tool.
Git version: ${gitVersion}
${config.push ? "You MAY push to remote when instructed." : "Do NOT push to remote."}
When you need information, CALL the tool.
When you need to stage files, CALL the tool.
When you need to commit, CALL the tool with the full commit message.
Always produce Conventional Commits.
Max subject length: ${config.subjectMaxLength}
Detail level: ${config.detailLevel}
`;
}
Then the task prompt (generateGitPrompt) is more like a script:
- check status
- get diffs (staged + unstaged)
- decide what to stage (depending on `autoStage`)
- generate commit message
- commit
This is a simple but powerful pattern: system = persona, task = workflow.
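A task-prompt generator in that spirit can be sketched in a few lines. This is a conceptual stand-in, not the repo's actual `generateGitPrompt` — the config field names and wording are my assumptions:

```typescript
// Hypothetical sketch of a task-prompt generator; the real
// src/prompts/git-prompt-generator.ts may word this differently.
interface TaskConfig {
  autoStage: "none" | "modified" | "all";
  detailLevel: string;
}

function generateGitPrompt(config: TaskConfig): string {
  const staging =
    config.autoStage === "none"
      ? "Do not stage anything; only commit what is already staged."
      : `Stage ${config.autoStage} files before committing.`;
  // The prompt is literally a numbered workflow the agent walks through.
  return [
    "1. Run `git status --porcelain` to see what changed.",
    "2. Run `git diff --stat` for staged and unstaged detail.",
    `3. ${staging}`,
    `4. Write a Conventional Commit message (detail level: ${config.detailLevel}).`,
    "5. Commit, then summarize what you did.",
  ].join("\n");
}
```

The payoff: changing the workflow is a prompt edit, not a code change.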
6. Configuration: CLI > ENV > Defaults
The repo shows a mature approach to config.
- `src/config/prompt-config.ts` defines the shape (`PromptConfig`)
- `src/config/config-merger.ts` defines the priority (CLI beats env, env beats default)
- `src/config/env-loader.ts` handles where we load from (`~/.agent-config`, local `.env`)
That lets you do this:
# 1. global, for all projects
echo "OPENAI_API_KEY=sk-..." >> ~/.agent-config
# 2. local, for this repo
echo "COMMIT_TYPE=feat" >> .env
# 3. per run
git-commit-agent --detail-level detailed --subject-max-length 72
Since the CLI uses yargs (src/config/cli-parser.ts), developers will feel at home.
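The merge itself is a small amount of code once you lean on spread order. A minimal sketch, assuming a trimmed-down `PromptConfig` — note the signature here differs from the repo's `loadFinalConfig` (which reads env internally rather than taking it as a parameter):

```typescript
// Hypothetical sketch of CLI > ENV > DEFAULTS merging; field names are
// illustrative, not the repo's actual PromptConfig.
interface PromptConfig {
  detailLevel: string;
  subjectMaxLength: number;
  push: boolean;
}

const DEFAULTS: PromptConfig = {
  detailLevel: "normal",
  subjectMaxLength: 50,
  push: false,
};

// Drop undefined entries so a missing CLI flag can't clobber an env value.
function defined<T extends object>(obj: Partial<T>): Partial<T> {
  return Object.fromEntries(
    Object.entries(obj).filter(([, v]) => v !== undefined)
  ) as Partial<T>;
}

function loadFinalConfig(
  cli: Partial<PromptConfig>,
  env: Partial<PromptConfig> = {}
): PromptConfig {
  // Later spreads win: CLI beats ENV, ENV beats DEFAULTS.
  return { ...DEFAULTS, ...defined(env), ...defined(cli) };
}
```

The `defined` helper is the subtle part: without it, an unset CLI option (`undefined`) would silently overwrite a value the user set in `.env`.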
7. Putting it all together (end-to-end run)
Runtime looks like this:
- User types: `git-commit-agent --detail-level detailed --auto-stage modified`
- Agent prompt says: “first get status”

  execute_git_command({
    command: "status",
    args: ["--porcelain"]
  })

- Agent sees there are modified files → agent calls:

  execute_git_command({
    command: "add",
    args: ["-u"]
  })

- Agent synthesizes commit:

  execute_git_command({
    command: "commit",
    args: ["-m", "feat: improve AI commit agent"],
    commitMessage: `feat: improve AI commit agent

  - add verbose mode
  - include git version in prompt
  - improve error reporting
  `
  })
- CLI prints the agent’s final message.
All of this is driven by the model, but bounded by the tool’s schema.
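Part of what "bounded" means in practice is that generated output can be checked before it is committed. As a hedged illustration (this helper is mine, not the repo's), here is how a Conventional Commit subject could be validated against a config limit like `subjectMaxLength`:

```typescript
// Hypothetical subject-line checker; illustrative only, not from the repo.
// Covers the standard Conventional Commit types plus optional scope and "!".
const CONVENTIONAL_RE =
  /^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)(\([^)]+\))?!?: .+$/;

function checkSubject(subject: string, maxLength = 50): string[] {
  const problems: string[] = [];
  if (!CONVENTIONAL_RE.test(subject)) {
    problems.push("subject must follow Conventional Commits, e.g. 'feat: add X'");
  }
  if (subject.length > maxLength) {
    problems.push(`subject is ${subject.length} chars, max is ${maxLength}`);
  }
  return problems;
}
```

A check like this could sit inside the tool's commit path, feeding the problem list back to the model as a shaped error so it can rewrite the message and retry.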
8. What this demonstrates (and why it matters)
This repo is “just” a git commit helper, but it demonstrates three ideas that are the basis of serious AI tooling:
- Agents need real tools. If a model can’t act, you’re back to copy/paste.
- Tools need strong contracts. Zod, narrow descriptions, and safety checks give the model confidence and you control.
- Prompts should describe workflows. Not just “write a commit message”, but “inspect → decide → act → explain”.
Once you get this pattern, you can swap the tool:
- git → docker
- git → kubectl
- git → Jira/GitHub API
- git → filesystem refactor
- git → codegen
The model stays the brain.
The tool changes the superpower.
That’s the “world that’s opening up”.
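Swapping the tool really is that mechanical. A sketch of the idea (the spec shape and names here are hypothetical, not from the repo): the binary, the allowlist, and the description change, while the agent wiring stays identical.

```typescript
// Hypothetical: the master-tool pattern parameterized over the binary.
interface MasterToolSpec {
  name: string;
  binary: string;
  safeSubcommands: Set<string>;
}

function makeMasterToolSpec(binary: string, safe: string[]): MasterToolSpec {
  return {
    // Same naming convention as execute_git_command, generalized.
    name: `execute_${binary}_command`,
    binary,
    safeSubcommands: new Set(safe),
  };
}

const gitTool = makeMasterToolSpec("git", ["status", "diff", "add", "commit"]);
const dockerTool = makeMasterToolSpec("docker", ["ps", "images", "logs", "build"]);
```

From a spec like this you would generate the Zod schema, the validator, and the description string — one factory, many superpowers.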
9. Extra: positioning vs “AI commit” plugins
A question your dev.to readers will definitely have:
“How is this different from Copilot’s commit message thing / random CLI that just sends `git diff` to GPT?”
You can add this comparison table:
| Aspect | Typical “AI commit” | This project |
|---|---|---|
| Who decides what to run | Shell script / wrapper | Agent decides via tool |
| Visibility into repo | Usually just a diff | Any git command the tool allows |
| Ability to act | Usually none (user runs `git commit`) | Agent can call `git add`/`git commit` |
| Safety | Depends on script | Validated tool + schema |
| Extensibility | Hard | Add more tools |
That’s a strong selling point.
10. Extra: a “Try it yourself” section
You can close the article with a hands-on block:
# 1. set your key
export OPENAI_API_KEY=sk-...
# 2. run in a git repo
npx tsx src/index.ts --detail-level detailed --auto-stage modified
# 3. inspect what it staged
git status
# 4. re-run with verbose
npx tsx src/index.ts --verbose
If you want to impress people, show them that you can run it in dry-run mode (the entrypoint in your ZIP hints at non-destructive runs) and that all it does is plan the commands.
11. Extra: future directions 🧭
Closing with future work always makes a dev.to post feel “alive”:
- Multi-step planning: let the agent decide whether a commit is even necessary.
- Commit style adapters: company-specific rules → pass as config.
- PR description generator: once it can run git, it can also call GitHub.
- Telemetry: log which git commands were chosen → improve prompts.
- Non-git tools: `npm test`, `npm run lint`, `pnpm run coverage` as callable tools.
12. Takeaways for seasoned developers
- This codebase is opinionated: one tool, CLI-first, Conventional Commits, LangChain v1.
- The boundaries are clean:
  - prompts in `src/prompts/`
  - tools in `src/tools/`
  - config in `src/config/`
  - validation in `src/utils/`
- It’s production-ish: global config loading (`~/.agent-config`), env-first, verbose mode, dry-run option in the entrypoint, custom error class (`GitError`).
- It’s reproducible: you can drop in another tool and the architecture doesn’t fall apart.
13. Appendix: runnable snippet (minimal CLI-ish entry)
#!/usr/bin/env node
import { ChatOpenAI } from "@langchain/openai";
import { createAgent, HumanMessage } from "langchain";
import { loadEnvironment } from "./src/config/env-loader.js";
import { parseCliArguments } from "./src/config/cli-parser.js";
import { loadFinalConfig } from "./src/config/config-merger.js";
import { generateSystemPrompt, generateGitPrompt } from "./src/prompts/index.js";
import { execute_git_command_tool } from "./src/tools/git-master.tool.js";
import { getGitVersion } from "./src/utils/git-commands.js";

async function main() {
  await loadEnvironment();

  const cli = parseCliArguments();
  const config = loadFinalConfig(cli);
  const gitVersion = await getGitVersion().catch(() => "unknown");

  const systemPrompt = generateSystemPrompt(config, gitVersion);
  const taskPrompt = generateGitPrompt(config);

  const model = new ChatOpenAI({
    model: process.env.OPENAI_MODEL || "gpt-5-nano-2025-08-07",
    apiKey: process.env.OPENAI_API_KEY,
  });

  const agent = await createAgent({
    model,
    tools: [execute_git_command_tool],
    systemPrompt,
  });

  const res = await agent.invoke({
    messages: [new HumanMessage(taskPrompt)],
  });

  console.log(res.messages.at(-1)?.content ?? "No response.");
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
14. AI Agent Workflow Example For the Future
This shows a simple, tool-aware TypeScript-style agent that can execute a business task by selecting and running the right tools.
Code
type ToolName = "getCustomer" | "getInvoices" | "draftEmail";

interface ToolContext {
  customerId?: string;
  intent: string;
}

async function getCustomer(ctx: ToolContext) {
  // pretend this is a DB/API call
  return {
    id: ctx.customerId ?? "CUST-1001",
    name: "Avery Dennison",
    segment: "enterprise"
  };
}

async function getInvoices(ctx: ToolContext) {
  // pretend this is accounting data
  return [
    { invoice: "INV-001", amount: 1200, due: "2025-10-01", status: "overdue" },
    { invoice: "INV-002", amount: 800, due: "2025-11-01", status: "open" }
  ];
}

async function draftEmail(ctx: ToolContext & { customer?: any; invoices?: any[] }) {
  const overdue = (ctx.invoices ?? []).filter(i => i.status === "overdue");
  const lines = overdue.map(i => `- ${i.invoice} (€${i.amount}) due ${i.due}`).join("\n");
  return `
Subject: Overdue invoice(s)

Hi ${ctx.customer?.name ?? "there"},

We noticed the following invoice(s) are still open:
${lines || "- none -"}

Could you let us know the status?

Thanks,
Billing Bot
`.trim();
}

const toolMap: Record<ToolName, (ctx: any) => Promise<any>> = {
  getCustomer,
  getInvoices,
  draftEmail,
};

async function runAgent(userGoal: string, customerId?: string) {
  // 1) infer steps from the goal (normally the LLM would do this)
  const plan: ToolName[] = ["getCustomer", "getInvoices", "draftEmail"];
  const ctx: any = { intent: userGoal, customerId };

  for (const step of plan) {
    const fn = toolMap[step];
    const result = await fn(ctx);
    if (step === "getCustomer") ctx.customer = result;
    if (step === "getInvoices") ctx.invoices = result;
    if (step === "draftEmail") ctx.email = result;
  }

  return ctx.email;
}

// demo
runAgent("send a polite reminder about overdue invoices", "CUST-1001")
  .then(msg => console.log(msg))
  .catch(console.error);
How it Works
- User gives a business goal (e.g. “send a polite reminder about overdue invoices”).
- Agent creates a mini-plan (get customer → get invoices → draft email).
- Agent calls your data/systems (CRM, finance, DMS — here mocked).
- Agent returns usable output (an email ready to send).
Final Thought
The real win is not that the AI can write text — it's that you can wrap any repeatable workflow in your company (onboarding, quoting, PGS9 permit checks, DEX/rewards reporting, invoice chasing, deployment notes, even Git ops) into small, tool-aware agents like this. Once your data and operations are callable, people can talk to the business instead of clicking through 5 different internal tools. That’s the moment AI stops being a demo and starts being an orchestrator of work — getting you faster cash collection, cleaner compliance, and feature delivery that doesn't require more headcount.

