Mitesh Sharma

Posted on Jun 14

AI Coding at Scale: What Actually Works

#ai #programming #devops #productivity

I've been building Ethos, a personality-first AI agent, and like many engineers today, I rely heavily on AI coding tools. Claude Code, Codex, and others have become part of my daily workflow.

What I've learned is that simply adding AI to software development doesn't automatically make development faster or better.

Without structure, AI tends to:

Generate more code than necessary
Skip established conventions
Take implementation shortcuts
Create large changes that are difficult to review
Optimize for completion rather than maintainability

After experimenting with different workflows, a few of us sat down and discussed what actually works when AI becomes part of a real engineering process.

The conclusion was surprisingly simple:

Spend most of the effort on planning
Build automated feedback loops into execution
Enforce deterministic checks
Keep tasks and PRs small
Standardize engineering practices

None of these principles are new. The interesting part is how much more important they become when AI is writing a significant portion of the code.

1. Planning Is the Highest-Leverage Activity

The biggest shift for me has been treating planning as the primary activity rather than a precursor to coding.

Most implementation mistakes are not coding mistakes. They are understanding mistakes.

If the feature requirements are unclear, architecture decisions are incomplete, edge cases are unexplored, or test strategies are undefined, AI simply executes the ambiguity faster.

I now spend most of my effort up front:

Understanding the problem
Identifying architecture implications
Finding gaps in requirements
Thinking through failure scenarios
Defining validation and testing strategies

A problem discovered during planning is a discussion.

The same problem discovered after implementation is usually a rewrite.

Model selection matters here as well. Planning is where reasoning quality has the highest impact, so I use the strongest model available for planning. Execution can often be delegated to a cheaper and faster model once the direction is clear.
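
As a rough illustration of that split, here is a minimal routing sketch. The model names are placeholders rather than real identifiers, and the `plan_approved` flag is my own stand-in for "once the direction is clear"; neither comes from an actual tool.

```python
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    PLANNING = "planning"
    EXECUTION = "execution"


@dataclass(frozen=True)
class ModelChoice:
    name: str    # placeholder label, not a real model identifier
    reason: str


# Hypothetical tiers: substitute whatever models you actually have access to.
ROUTING = {
    Phase.PLANNING: ModelChoice("strongest-available-model", "reasoning quality matters most"),
    Phase.EXECUTION: ModelChoice("cheaper-faster-model", "direction is already clear"),
}


def pick_model(phase: Phase, plan_approved: bool = False) -> ModelChoice:
    """Return the model tier for a phase.

    Execution stays on the planning-tier model until a plan is approved,
    so ambiguous work is never handed to the cheaper tier.
    """
    if phase is Phase.EXECUTION and not plan_approved:
        return ROUTING[Phase.PLANNING]
    return ROUTING[phase]
```

The fallback branch is a design choice, not a requirement: it simply encodes the idea that cheap execution is only safe after the plan has been settled.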

2. Execution Should Include Feedback Loops

AI-generated code should not go directly from implementation to merge.

Execution needs review loops.

One approach that has worked well is using different models for implementation and review. If one model writes the code, another reviews it.

Different models tend to have different strengths and blind spots. Having a second model review the output often surfaces issues that would otherwise make it into a PR.

The goal is not to remove human review.

The goal is to ensure humans spend their time reviewing decisions and outcomes rather than catching avoidable mistakes.
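
The loop can be sketched in a few lines. This is an assumption-heavy outline, not a real integration: `implement` and `review` are hypothetical callables standing in for calls to two different models, and `max_rounds` is an arbitrary cap I picked.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-ins for model calls:
#   implement(task, feedback) -> code
#   review(code) -> list of issues (empty list means "no objections")
Implement = Callable[[str, List[str]], str]
Review = Callable[[str], List[str]]


def review_loop(
    task: str,
    implement: Implement,
    review: Review,
    max_rounds: int = 3,
) -> Tuple[str, List[str]]:
    """Alternate implementation and review until the reviewer has no issues.

    Returns the last code produced and any issues still open, so a human
    reviewer sees what the automated loop could not resolve.
    """
    feedback: List[str] = []
    code = ""
    for _ in range(max_rounds):
        code = implement(task, feedback)
        feedback = review(code)
        if not feedback:
            break
    return code, feedback
```

Whatever comes back with open issues still goes to a human; the loop only filters out the avoidable mistakes first.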

3. Deterministic Checks Matter More Than AI Reviews

AI reviews are helpful.

Deterministic checks are mandatory.

Linting, formatting, tests, type checks, and validation should run automatically and consistently.

The key insight is simple:

If something must happen every single time, do not rely on AI to remember it.

Automate it.

Start with one check if necessary. A single enforced linting rule provides more value than an elaborate workflow that nobody adopts.

Once that foundation exists, expand gradually.
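
A starting point can be as small as the sketch below. It assumes a Python project linted with Ruff (`ruff check .`); that choice is mine for illustration, so swap in whatever commands your project already uses.

```python
import subprocess
import sys
from typing import List, Sequence, Tuple

# Assumed example: a single lint check via Ruff. Replace with your own commands.
CHECKS: List[Tuple[str, List[str]]] = [
    ("lint", ["ruff", "check", "."]),
]


def run_checks(checks: Sequence[Tuple[str, List[str]]]) -> List[str]:
    """Run every check and return the names of the ones that failed."""
    failed: List[str] = []
    for name, command in checks:
        try:
            returncode = subprocess.run(command).returncode
        except FileNotFoundError:
            # A missing tool counts as a failure, not a silent pass.
            returncode = 1
        if returncode != 0:
            failed.append(name)
    return failed


def main() -> int:
    """Return a process exit code: 0 if all checks pass, 1 otherwise."""
    failed = run_checks(CHECKS)
    for name in failed:
        print(f"check failed: {name}", file=sys.stderr)
    return 1 if failed else 0
```

Calling `sys.exit(main())` from both a local hook and the CI job means the two run the same list, and expanding gradually is just a matter of appending to `CHECKS`.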

4. Small Tasks Produce Better Results

Large tasks create large outputs.

Large outputs create difficult reviews.

Whether the reviewer is a human or an AI model, review quality drops as scope increases.

Breaking work into milestones, small tickets, and focused PRs improves:

Implementation quality
Review quality
Iteration speed
Rollback safety

A smaller change is easier to reason about, easier to validate, and easier to ship.
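
One way to make "small" concrete is to measure the diff before opening a PR. The sketch below parses the tab-separated output of `git diff --numstat`; the thresholds of 10 files and 400 lines are arbitrary numbers I chose for the example, not a recommendation from any standard.

```python
from typing import Tuple


def diff_size(numstat: str) -> Tuple[int, int]:
    """Return (files_changed, lines_changed) from `git diff --numstat` output.

    Each row is "<added>\t<deleted>\t<path>". Binary files report "-" for
    both counts, so they add to the file count but not the line count.
    """
    files = 0
    lines = 0
    for row in numstat.splitlines():
        parts = row.split("\t")
        if len(parts) < 3:
            continue  # skip blank or malformed rows
        files += 1
        for count in parts[:2]:
            if count.isdigit():
                lines += int(count)
    return files, lines


def is_small_change(numstat: str, max_files: int = 10, max_lines: int = 400) -> bool:
    """True when the diff stays within both (hypothetical) limits."""
    files, lines = diff_size(numstat)
    return files <= max_files and lines <= max_lines
```

Feeding it the output of `git diff --numstat` against your base branch gives a quick signal that a task should be split before review rather than during it.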

5. Standardize Principles, Not Tools

One mistake teams often make is trying to standardize every tool and workflow.

What matters more is standardizing the engineering principles.

Different engineers can use different tools as long as they follow the same expectations:

Plan before implementing
Build feedback loops into execution
Use deterministic validation
Keep changes small
Follow shared engineering practices

Consistency of thinking matters more than consistency of tooling.

My Current Workflow

Today, my workflow is fairly straightforward.

I work inside a development sandbox and use both Claude Code and Codex depending on the task.

Before opening a PR:

Local hooks run validation
Tests execute locally
Linting and formatting are enforced
The same checks that run in CI run locally first

After implementation:

One model reviews the generated code
A second model reviews the review

The result is significantly less back-and-forth during code review and higher confidence before a human reviewer ever looks at the change.

I also use Git worktrees extensively to isolate parallel streams of work.

That led to one of the more important lessons I've learned about AI-assisted development.

Natural Language for Guidance. Hooks for Guarantees.

Initially, I documented a simple rule in AGENTS.md:

Always use worktrees.

For a while, it worked.

Then it didn't.

When I asked Claude why it occasionally ignored the instruction, the answer was surprisingly accurate: instructions are guidance, not enforcement.

A model can follow them.

A model can also drift from them.

This is the distinction that matters when building reliable AI workflows.

Use natural-language instructions for preferences.

Use deterministic systems for requirements.

If something is mandatory, enforce it with hooks, automation, validation, or policy.

Do not rely on a probabilistic system to provide deterministic guarantees.

Once I moved worktree enforcement into hooks, the issue disappeared.
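
The hook itself isn't shown here, so the following is only one possible shape for such a check, assuming it runs as a Git hook (for example `pre-commit`). It relies on the fact that in a linked worktree `git rev-parse --git-dir` points at a per-worktree directory, while `git rev-parse --git-common-dir` points at the shared repository directory; in the main checkout the two are the same.

```python
import os
import subprocess
import sys


def is_linked_worktree(git_dir: str, common_dir: str) -> bool:
    """True when the per-worktree git dir differs from the shared one."""
    return os.path.realpath(git_dir) != os.path.realpath(common_dir)


def _rev_parse(flag: str) -> str:
    """Run `git rev-parse <flag>` in the current directory and return its output."""
    result = subprocess.run(
        ["git", "rev-parse", flag],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def main() -> int:
    """Exit code for a hook: 0 inside a linked worktree, 1 in the main checkout."""
    if is_linked_worktree(_rev_parse("--git-dir"), _rev_parse("--git-common-dir")):
        return 0
    print("Refusing to continue: work in a linked worktree, not the main checkout.",
          file=sys.stderr)
    return 1
```

Wired into a hook with `sys.exit(main())`, the rule no longer depends on the model remembering it: a non-zero exit blocks the operation regardless of what the instructions file says.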

Closing Thoughts

The most useful mental model I've found is to treat AI like another engineer on the team.

A very fast engineer.

A very capable engineer.

But still an engineer operating within a system.

Good engineering organizations do not rely on individual engineers to remember every rule, every process, and every validation step.

They create systems that make the right thing easy and the wrong thing difficult.

The same principle applies to AI.

Use strong models for planning.

Use automated feedback loops during execution.

Enforce deterministic validation.

Keep changes small.

And whenever something absolutely must happen every time, automate it.

PS: I'm building Ethos in public — an AI agent with a soul: locked core values, evolving expression, governed self-improvement. If this kind of thing interests you, come say hi: @EthosAgentAI.
