Georgii Motrenko

Posted on Mar 25 • Originally published at Medium

How I Replaced the Finance Department with a Chain of AI Agents

#ai #automation #fintech #productivity

How I Replaced the Finance Department with a Chain of AI Agents

I have a simple rule: if a task repeats more than twice — it's time to delegate. I used to delegate to people. Now — to AI agents that work with databases, spreadsheets, and task trackers directly.

Over three months, I built a system of 17 AI skills that took over all financial reporting for a B2B platform with 200+ partners. Without a single line of traditional code.

Context: A B2B Platform and Financial Routine

I work as a PO at a B2B platform at the intersection of insurance and finance. We aggregate financial products for partners: banks, lending institutions, marketplaces. Each partner wants to see their own funnel — applications, conversions, disbursements, commissions.

With 200+ active partners, reporting becomes a conveyor belt:

Daily reports — every morning a partner expects yesterday's numbers
Monthly reconciliations — hundreds of rows of deals broken down by vendors, with different revenue-split percentages
Ad-hoc requests — "show me the conversion for this product over the last two weeks"
Legal documents — generating standard contracts for new partners

2–3 hours a day, two people. A textbook case for delegation — but there was nobody to delegate to. Then a tool appeared that can work with SQL, APIs, and spreadsheets on its own.

Not One Agent — a Chain of Agents

An important distinction: this isn't "one smart bot that does everything." It's a system of specialized agents, each solving a specific task. In Claude Code terminology, each agent is a skill (slash-command) with clearly defined logic.

The agents are connected through shared infrastructure:

MCP server for the database (MS SQL, read-only) — the agent builds SQL queries and retrieves data
Google Sheets API — creating spreadsheets, writing data, formatting, formulas
Google Drive — file storage, uploading vendor exports
Task tracker — syncing with team tasks
GitLab API — analyzing development team productivity

One agent generates financial reports. Another runs monthly reconciliations. A third collects dev team statistics. A fourth transcribes meeting recordings and produces structured reports. They don't know about each other, but they operate within a single ecosystem.

How the Most Frequently Used Skill Works: The Financial Report

A manager enters a single command — the partner name and a date range. 40 seconds later, a complete report appears in Google Sheets. Here's what happens during those 40 seconds:

1. Counterparty Resolution
The agent maintains a mapping of human-readable aliases to internal identifiers. No need to remember UUIDs — a short name is enough.

2. Database Query
A SQL query is generated that builds the funnel for a financial product: applications → transitions → disbursements → commissions. The query accounts for the correct deal statuses — that's a story in itself, because the database documentation and the actual enum values don't always match.

3. Aggregation
Data is grouped by day, conversion rates between stages are calculated, commission totals are summed. Different partners have different revenue-sharing terms — the agent handles that.

4. Writing and Formatting
A spreadsheet is created or updated: headers, data, summary rows, number formats, column widths. Formatting is applied strictly after data is written — if you do both simultaneously, the API sometimes applies styles to empty cells (the data hasn't arrived yet).

5. Self-Verification
The agent re-reads what it wrote and compares it against the source data. This isn't paranoia — Google Sheets API actually drops rows during batch writes. Without verification, errors are inevitable.

Monthly Reconciliations: When One Agent Isn't Enough

The daily report is a relatively straightforward task. Monthly reconciliation is a different league entirely.

A single partner's traffic might flow through 3–4 vendors. Each has its own export format, its own rules for matching deals, its own commission rates. You need to:

Download vendor files from cloud storage
Match each deal against data in our database
Calculate commissions with different splits for different partners
Generate a summary sheet with formulas
All in one document, with separate tabs for each counterparty

This used to take a full workday. Now — a command with two parameters and 3–5 minutes of waiting.

What Else the System Covers

Over three months, the agents expanded well beyond reporting:

Skill	Task	Time
Financial Report	Funnel for a single partner	40 sec
Daily Digest	Report across all active partners	2–3 min
Monthly Reconciliation	Vendor reconciliation with formulas	3–5 min
Contract Generator	Standard contract for a new partner	1–2 min
Team Tracker	Product team management	interactive
GitLab Analytics	Developer activity comparison across 17 repos	2–3 min
Meeting Report	Meeting transcription (600+ segments) → structured report	3–5 min
Recruiting	Candidate search by parameters	interactive

17 skills, 9 MCP servers (SQL, Google Sheets, Drive, Calendar, Gmail, GitLab, Tracker, Telegram, GitHub). Not a single line of Python, Go, or TypeScript. All logic lives in markdown files with instructions for the agent.

Why Claude Code, Not LangChain

I tried several approaches:

ChatGPT + Zapier — too primitive, no direct database access
LangChain/LangGraph — requires a developer, long iteration cycles
Custom API + GPT-4 — works, but every change = rebuild and redeploy

Claude Code won because of one thing: a 30-second feedback loop. Write an instruction → test → fix → it works. No deployment, no CI/CD, no Docker. MCP servers give the agent direct access to infrastructure, and skills are just text files containing business logic.

Architecture:

Claude Code — runtime, the brain
MCP servers — bridge to external systems (database, Google, GitLab, Tracker)
Skills — markdown files with instructions (business logic)
Google Sheets/Drive — presentation layer for users

Pitfalls Worth Knowing About

Google Sheets API is not atomic. Write 100 rows — you might find only 97 in the spreadsheet. The rule: always re-read and verify after every write. Without this, bugs will appear regularly and silently.

Database documentation lies. The MCP server for SQL returns a glossary with table descriptions. But actual enum values may not match the documentation. We lost an entire day before we started joining reference tables directly instead of trusting the glossary.

Formatting is a separate step. If you write data and apply styles simultaneously, the API may format empty cells (the data hasn't landed yet). Data first → verification → then visuals.

Context is not infinite. When processing large volumes (200+ rows), the agent can lose its initial context. The solution: the agent keeps a log of every step and periodically re-reads it. This sounds odd, but it works — essentially, the agent creates its own "external memory."

Merge commits inflate statistics. When analyzing productivity via the GitLab API, it turned out that both feature commits and merge commits are counted separately. And auto-generated code (ORM migrations) can bloat statistics by 20,000+ lines. Without filtering, the numbers are meaningless.

People's aliases don't match across systems. The same developer can appear under different names in GitLab, the task tracker, and corporate email. The agent maintains an alias mapping — without it, reports for a team of 11 simply won't come together.

The Numbers

Metric	Before	After
Daily reports	2–3 hours/day	5–10 minutes
Monthly reconciliation	8 hours	5 minutes
Contract generation	2–3 hours	2 minutes
Errors in reports	3–5/month	0 (verification)
People on financial routine	2	0 (PO handles it)

~60 person-hours saved per month. Cost of Claude Code — $200/month on the Max plan.

Three Things I Learned

AI agents are a manager's tool, not a developer's. A PO who understands the process builds working automation faster than a team going through the classic cycle of specs → sprint → review → deploy. Because there's no translation from "business language" to "code" and back.

MCP servers are what turn a chatbot into an operator. Without access to databases and APIs, an agent can only generate text. With MCP, it becomes a full participant in the workflow — operating the same systems a human employee would.

Verification matters more than generation. 80% of the effort went not into prompts, but into the verification system. Every step is checked: did the data get written? Do the formulas add up? Is the formatting correct? An agent that doesn't check its own work is an agent that will eventually fail silently.

If you're a manager spending hours on data routine — start with one skill. Automate the most painful report. Once you see a three-hour task completed in a minute — the other 16 skills will follow naturally.

About the author: Georgii Motrenko — PO / Head of Product Growth in a B2B InsurTech/FinTech platform. Building chains of AI agents to automate financial and product operations.

Follow me:

LinkedIn
Telegram — Geo in IT channel
Threads

DEV Community

How I Replaced the Finance Department with a Chain of AI Agents

How I Replaced the Finance Department with a Chain of AI Agents

Context: A B2B Platform and Financial Routine

Not One Agent — a Chain of Agents

How the Most Frequently Used Skill Works: The Financial Report

Monthly Reconciliations: When One Agent Isn't Enough

What Else the System Covers

Why Claude Code, Not LangChain

Pitfalls Worth Knowing About

The Numbers

Three Things I Learned

Top comments (0)