Jeff Vincent

What I've Actually Learned Using Coding Agents to Build Software

Most of the writing about coding agents right now falls into two buckets: breathless excitement or dismissive skepticism. Neither is particularly useful if you're trying to figure out how to actually build things with these tools.

I've been using GitHub Copilot (backed by Claude) as my primary development partner for a while now, building a project with real scope — a Kubernetes operator, a CLI, an embedded web dashboard, CI pipelines, the works. Not a demo. Not a weekend project. Software I ship and maintain.

What follows isn't advice about prompting. It's what I've come to believe about the relationship between your codebase, your infrastructure, and how much leverage you actually get from an agent.

Pattern Inertia Is Real, and It Cuts Both Ways

Coding agents are pattern-completion machines. They look at your codebase and write more of what's already there. This sounds obvious, but the second-order effects are worth thinking about carefully.

Here's a concrete example. My project has two interfaces — a CLI and a web dashboard. The CLI got built first. When it came time to bring the dashboard up to feature parity, I pointed the agent at the work and let it go. What it produced was functional and correct — and completely duplicated. ~800 lines of business logic, reimplemented in a second location, because that's what the existing code looked like. Two implementations of secret management, two implementations of tunnel lifecycle, two implementations of kubectl wrappers.

The agent wasn't wrong. It was doing exactly what the patterns suggested. The problem was that nobody had named the abstraction. In a team of humans, someone eventually says "hey, we should pull this out" in a code review. The agent will never feel the mess accumulating. That's your job.

What's interesting is what happened when I did name it. I said, roughly, "there's a lot of duplicated code between the CLI and the dashboard — look at this closely and refactor." The agent analyzed both codebases, identified the shared logic, designed a core/ package with clean interfaces, migrated every call site, fixed the tests, and verified the build. Hundreds of lines removed, zero regressions. But it would never have initiated that on its own.
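The shape of that refactor is easy to sketch. The names below are illustrative, not Kindling's actual API: the point is that both front ends depend on one shared definition instead of each carrying its own copy.

```go
package main

import "fmt"

// SecretStore is the single, named abstraction that both the CLI and
// the dashboard consume. (In the real project this would wrap the
// Kubernetes client; here it's a hypothetical stand-in.)
type SecretStore interface {
	Get(name string) (string, error)
}

// memStore is a toy implementation for illustration.
type memStore map[string]string

func (m memStore) Get(name string) (string, error) {
	v, ok := m[name]
	if !ok {
		return "", fmt.Errorf("secret %q not found", name)
	}
	return v, nil
}

// Both front ends are now thin adapters over the same code path,
// rather than two parallel implementations of the business logic.
func cliShowSecret(s SecretStore, name string) (string, error) {
	return s.Get(name)
}

func dashboardSecretJSON(s SecretStore, name string) (string, error) {
	v, err := s.Get(name)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf(`{"name":%q,"value":%q}`, name, v), nil
}

func main() {
	store := memStore{"db-password": "hunter2"}
	out, _ := cliShowSecret(store, "db-password")
	fmt.Println(out)
	j, _ := dashboardSecretJSON(store, "db-password")
	fmt.Println(j)
}
```

Once an abstraction like this exists and is named, pattern inertia starts working for you: new call sites the agent writes will reach for the shared package, because that is now what the codebase looks like.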

The lesson I take from this is that your early architectural decisions carry more weight than they used to. In agent-assisted development, a pattern isn't just a pattern — it's a template the agent will replicate indefinitely. Get the abstractions right early, and you're not writing one good module. You're establishing a convention the agent follows for the rest of the project. Get them wrong, and you're compounding technical debt at the speed of generation.

It Will Write Like You (If You Let It)

This one genuinely surprised me.

Every developer has idioms. The way you handle errors. The way you name things. The abstractions you reach for instinctively. I expected to spend a lot of time nudging the agent away from generic patterns and toward my preferences.

What actually happened, once the codebase had enough of my own code in it, is that the agent internalized my style. Not "clean Go" in the abstract. My Go. My error handling conventions, my naming patterns, my preferred ways of structuring packages. At a certain point, code the agent wrote was indistinguishable from code I wrote — not because it was copying, but because it had enough signal to generalize from.

This has a practical implication that I think people undervalue: the time you spend early on writing clean, consistent, opinionated code isn't just good hygiene. It's training data. The more coherent your existing codebase is, the less correction you do later. Your style becomes the agent's style, and the collaboration gets smoother the longer you work together.

Your Pipeline Is the Real Feedback Loop

Here's maybe the most important structural insight I've arrived at: the agent needs guardrails it can run against, and your CI/CD pipeline is where those guardrails live.

An agent can produce a lot of plausible-looking code quickly. "Plausible-looking" and "actually works" are different things. Without a tight feedback loop, you accumulate silent drift — code that reads fine, passes a casual glance, and breaks somewhere downstream.

In practice, what I've found works is treating the build-test-lint cycle as the environment that keeps the agent honest. When the agent refactored my codebase — touching 19 files across two packages — the Go compiler immediately caught every signature mismatch, every missing import, every call site that didn't match the new interface. The agent fixed them iteratively, ran the tests, confirmed zero regressions, and moved on. I didn't review individual lines during that process. I didn't need to. The pipeline told both of us whether the code was correct.

This is also what makes delegation possible. If you trust your pipeline to catch problems, you can hand the agent a substantial task — "write an e2e test suite that exercises every feature group against a real Kind cluster" — and let it work. Without that trust, you're reading every line of output, which isn't meaningfully faster than writing it yourself.

The tighter and faster the feedback, the less drift you get. A build that runs in seconds is worth more than a code review that happens in hours.
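That executable contract can be as small as a table of cases the code must satisfy. A toy sketch, with an invented function and invented expectations, of the pass/fail signal that makes delegation safe:

```go
package main

import (
	"fmt"
	"strings"
)

// slugify stands in for any small unit the agent might produce; the
// function and its cases are hypothetical, not from Kindling.
func slugify(s string) string {
	s = strings.ToLower(strings.TrimSpace(s))
	return strings.ReplaceAll(s, " ", "-")
}

func main() {
	// Each row is an executable piece of the contract. The agent can run
	// this loop itself, see exactly which expectation failed, and iterate
	// until the table passes. No line-by-line human review required.
	cases := []struct{ in, want string }{
		{"Hello World", "hello-world"},
		{"  Kindling CI  ", "kindling-ci"},
	}
	for _, c := range cases {
		if got := slugify(c.in); got != c.want {
			fmt.Printf("FAIL: slugify(%q) = %q, want %q\n", c.in, got, c.want)
			return
		}
	}
	fmt.Println("PASS")
}
```

In practice the same role is played by the full build-test-lint cycle; what matters is that the signal is mechanical, so the agent can run against it unsupervised.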

All of this is how and why I built Kindling.

The feedback loop I just described only works if it's fast. Cloud CI is slow. You push a commit, wait in a queue behind everyone else's jobs, watch a runner spin up from scratch, and by the time the feedback lands you've already context-switched into something else. That's not a feedback loop. That's a waiting room.

Kindling flips the model. It's a Kubernetes operator that routes your GitHub Actions CI jobs back to your own laptop, running in a local Kind cluster. When you push code, the job comes to you. It builds your container image via Kaniko, right there on your machine, and deploys a full ephemeral staging environment — Deployment, Service, Ingress, all auto-provisioned dependencies wired up and ready — to localhost. In seconds. On compute you already own.

The patterns I've been describing aren't abstract. They showed up here. The CLI and dashboard diverging into ~800 lines of duplication? That happened in Kindling — two interfaces, one codebase, and an agent that faithfully reproduced the mess until I explicitly told it otherwise. The types that caught mistakes early? Go's type system running in a tight local build loop, surfacing errors before they had a chance to become integration bugs. The idiosyncratic style the agent picked up? A Go codebase opinionated enough about its abstractions that Copilot eventually just wrote them the same way.

Kindling also ships a kindling generate command that uses an LLM to scan your repo and produce a GitHub Actions workflow tailored to what's actually in your code — detected services, dependencies, ports, credentials, the works. It's a coding agent operating on your infrastructure configuration. And it works well precisely because the environment it's generating for is well-defined and testable. The feedback loop is tight. The builds are fast. The agent stays honest.

Strong Types Are Cheap Insurance

This is a narrower point, but it matters more than I expected: invest in strong typing early, even when it feels like overhead.

When the agent makes a mistake in a dynamically typed language, that mistake might not surface until runtime, or integration, or production. In a statically typed language, the compiler catches it in seconds and tells the agent exactly what went wrong and where. A compile failure is free. An integration bug discovered three days later costs a week.

The same logic applies to schemas, API contracts, CRD definitions — anything that gives the agent a formal specification to check its work against. Without that, the agent is improvising. Improvisation at speed is how you get subtle inconsistencies that are painful to debug later.

I think of the type system as just another part of the feedback loop. Every checkpoint that catches drift early makes the whole system more reliable.

What This Actually Changes About Your Job

The shift I've experienced isn't "I write code faster now." It's that my job has changed shape.

I spend less time writing individual functions and more time thinking about architecture — where the boundaries should be, what patterns I want replicated, what the feedback loops look like. The agent handles volume. I handle direction. When I see something wrong, I name it at the level of design intent — "this is duplicated," "this should be a shared package," "write an e2e that covers every feature group" — and the agent figures out the implementation.

This isn't a small change. It means the skills that matter shift toward systems thinking, toward knowing when an abstraction is missing, toward building infrastructure that validates correctness automatically. The developers who get the most out of these tools won't be the ones who learn to prompt better. They'll be the ones who build environments where the agent can do good work — tight feedback loops, clean patterns, strong types, fast builds.

That's less exciting than "AI writes all my code." But it's closer to the truth, and honestly, it's more interesting. You're not being replaced by the tool. You're designing the system in which the tool operates. That's a different skill, and it's worth developing deliberately.
