DEV Community

Mitesh Sharma

AI Coding at Scale: What Actually Works

I've been building Ethos, a personality-first AI agent, and like many engineers today, I rely heavily on AI coding tools. Claude Code, Codex, and others have become part of my daily workflow.

What I've learned is that simply adding AI to software development doesn't automatically make development faster or better.

Without structure, AI tends to:

  • Generate more code than necessary
  • Skip established conventions
  • Take implementation shortcuts
  • Create large changes that are difficult to review
  • Optimize for completion rather than maintainability

After experimenting with different workflows, a few of us sat down and discussed what actually works when AI becomes part of a real engineering process.

The conclusion was surprisingly simple:

  1. Spend most of the effort on planning
  2. Build automated feedback loops into execution
  3. Enforce deterministic checks
  4. Keep tasks and PRs small
  5. Standardize engineering practices

None of these principles are new. The interesting part is how much more important they become when AI is writing a significant portion of the code.

1. Planning Is the Highest-Leverage Activity

The biggest shift for me has been treating planning as the primary activity rather than a precursor to coding.

Most implementation mistakes are not coding mistakes. They are understanding mistakes.

If the feature requirements are unclear, architecture decisions are incomplete, edge cases are unexplored, or test strategies are undefined, AI simply executes the ambiguity faster.

I now spend the majority of effort upfront:

  • Understanding the problem
  • Identifying architecture implications
  • Finding gaps in requirements
  • Thinking through failure scenarios
  • Defining validation and testing strategies

A problem discovered during planning is a discussion.

The same problem discovered after implementation is usually a rewrite.

Model selection matters here as well. Planning is where reasoning quality has the highest impact, so I use the strongest model available for planning. Execution can often be delegated to a cheaper and faster model once the direction is clear.
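
As a rough illustration of that split, the routing can be as simple as a lookup on the phase of work. This is only a sketch: the model identifiers and the `Task` type are hypothetical placeholders, not names from any real tool.

```python
from dataclasses import dataclass

# Hypothetical identifiers; substitute whatever your tools actually expose.
STRONG_MODEL = "strong-reasoning-model"
FAST_MODEL = "fast-cheap-model"


@dataclass(frozen=True)
class Task:
    description: str
    phase: str  # "plan" or "execute"


def pick_model(task: Task) -> str:
    """Send planning to the strongest model and execution to the cheaper one."""
    if task.phase == "plan":
        return STRONG_MODEL
    if task.phase == "execute":
        return FAST_MODEL
    # Fail loudly instead of silently defaulting to one of the models.
    raise ValueError(f"unknown phase: {task.phase!r}")
```

Raising on an unknown phase is a deliberate choice here: a silent default would hide exactly the kind of ambiguity that planning is supposed to remove.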

2. Execution Should Include Feedback Loops

AI-generated code should not go directly from implementation to merge.

Execution needs review loops.

One approach that has worked well is using different models for implementation and review. If one model writes the code, another reviews it.

Different models tend to have different strengths and blind spots. Having a second model review the output often surfaces issues that would otherwise make it into a PR.

The goal is not to remove human review.

The goal is to ensure humans spend their time reviewing decisions and outcomes rather than catching avoidable mistakes.
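
One way to picture the write-then-review loop is sketched below. It assumes the two models are wrapped as plain callables; `write`, `review`, and `max_rounds` are hypothetical names introduced for illustration, and no specific model API is implied.

```python
from typing import Callable, List, Tuple

# (task, feedback from the last review) -> generated code
Writer = Callable[[str, List[str]], str]
# (task, generated code) -> list of issues; empty means the reviewer is satisfied
Reviewer = Callable[[str, str], List[str]]


def implement_with_review(
    task: str,
    write: Writer,
    review: Reviewer,
    max_rounds: int = 3,
) -> Tuple[str, List[str]]:
    """Alternate between a writer and a different reviewer until the review is clean."""
    feedback: List[str] = []
    code = ""
    for _ in range(max_rounds):
        code = write(task, feedback)
        feedback = review(task, code)
        if not feedback:
            break
    # Any issues still open after max_rounds are returned for a human to decide on.
    return code, feedback
```

The loop is bounded on purpose. If two models cannot converge in a few rounds, the open issues are returned for a human to judge rather than retried indefinitely.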

3. Deterministic Checks Matter More Than AI Reviews

AI reviews are helpful.

Deterministic checks are mandatory.

Linting, formatting, tests, type checks, and validation should run automatically and consistently.

The key insight is simple:

If something must happen every single time, do not rely on AI to remember it.

Automate it.

Start with one check if necessary. A single enforced linting rule provides more value than an elaborate workflow that nobody adopts.

Once that foundation exists, expand gradually.
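
A minimal sketch of such a gate is shown below. It runs a list of commands and reports which ones failed. The commands in `EXAMPLE_CHECKS` are assumptions about a typical Python project (ruff and pytest installed), not a description of any particular setup.

```python
import subprocess
import sys
from typing import List, Sequence, Tuple

Check = Tuple[str, Sequence[str]]  # (name, command line)

# Example only: assumes ruff and pytest are installed in the project.
EXAMPLE_CHECKS: List[Check] = [
    ("lint", ["ruff", "check", "."]),
    ("tests", ["pytest", "-q"]),
]


def run_checks(checks: Sequence[Check]) -> List[str]:
    """Run every check and return the names of the ones that failed."""
    failed: List[str] = []
    for name, command in checks:
        try:
            returncode = subprocess.run(list(command)).returncode
        except OSError:
            # A missing or non-executable tool counts as a failure, not a skip.
            returncode = 1
        if returncode != 0:
            print(f"FAILED: {name}", file=sys.stderr)
            failed.append(name)
    return failed
```

A hook or CI step could call `run_checks(EXAMPLE_CHECKS)` and exit non-zero when the returned list is not empty. Starting with a single entry in the list matches the "start with one check" advice above.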

4. Small Tasks Produce Better Results

Large tasks create large outputs.

Large outputs create difficult reviews.

Whether the reviewer is a human or an AI model, review quality drops as scope increases.

Breaking work into milestones, small tickets, and focused PRs improves:

  • Implementation quality
  • Review quality
  • Iteration speed
  • Rollback safety

A smaller change is easier to reason about, easier to validate, and easier to ship.
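
Change size can itself become a deterministic check. The sketch below totals the added and deleted lines from the output of `git diff --numstat` and compares the total to a limit. The default of 400 lines is an arbitrary assumption, not a recommendation from this post.

```python
def count_changed_lines(numstat: str) -> int:
    """Sum added and deleted lines from `git diff --numstat` output."""
    total = 0
    for line in numstat.splitlines():
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        added, deleted = parts[0], parts[1]
        if added == "-" or deleted == "-":
            continue  # git prints "-" for binary files
        total += int(added) + int(deleted)
    return total


def is_small_change(numstat: str, max_lines: int = 400) -> bool:
    """Return True when the change stays within the (assumed) size budget."""
    return count_changed_lines(numstat) <= max_lines
```

The text passed in would come from something like `git diff --numstat main...HEAD`, where the base branch name depends on the repository.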

5. Standardize Principles, Not Tools

One mistake teams often make is trying to standardize every tool and workflow.

What matters more is standardizing the engineering principles.

Different engineers can use different tools as long as they follow the same expectations:

  • Plan before implementing
  • Build feedback loops into execution
  • Use deterministic validation
  • Keep changes small
  • Follow shared engineering practices

Consistency of thinking matters more than consistency of tooling.

My Current Workflow

Today, my workflow is fairly straightforward.

I work inside a development sandbox and use both Claude Code and Codex depending on the task.

Before opening a PR:

  • Local hooks run validation
  • Tests execute locally
  • Linting and formatting are enforced
  • The same checks that run in CI run locally first

After implementation:

  • One model reviews the generated code
  • A second model reviews the review

The result is significantly less back-and-forth during code review and higher confidence before a human reviewer ever looks at the change.

I also use Git worktrees extensively to isolate parallel streams of work.

That led to one of the more important lessons I've learned about AI-assisted development.

Natural Language for Guidance. Hooks for Guarantees.

Initially, I documented a simple rule in AGENTS.md:

Always use worktrees.

For a while, it worked.

Then it didn't.

When I asked Claude why it occasionally ignored the instruction, the answer was surprisingly accurate: instructions are guidance, not enforcement.

A model can follow them.

A model can also drift from them.

This is the distinction that matters when building reliable AI workflows.

Use natural-language instructions for preferences.

Use deterministic systems for requirements.

If something is mandatory, enforce it with hooks, automation, validation, or policy.

Do not rely on a probabilistic system to provide deterministic guarantees.

Once I moved worktree enforcement into hooks, the issue disappeared.
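
The hook itself is not shown in this post, so the following is only one possible shape for such a check, not the actual implementation. It relies on the fact that in a linked worktree `git rev-parse --git-dir` and `git rev-parse --git-common-dir` resolve to different directories, while in the main checkout they resolve to the same one. The function names are hypothetical.

```python
import os
import subprocess
import sys


def is_linked_worktree(git_dir: str, common_dir: str) -> bool:
    """True when the per-checkout git dir differs from the shared one."""
    return os.path.realpath(git_dir) != os.path.realpath(common_dir)


def git_path(flag: str) -> str:
    """Ask git for a repository path, e.g. --git-dir or --git-common-dir."""
    result = subprocess.run(
        ["git", "rev-parse", flag],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def check_current_checkout() -> int:
    """Return 0 inside a linked worktree, 1 in the main checkout."""
    if is_linked_worktree(git_path("--git-dir"), git_path("--git-common-dir")):
        return 0
    print("Refusing to continue: work in a git worktree, not the main checkout.",
          file=sys.stderr)
    return 1
```

A pre-commit hook or an agent tool hook could call `check_current_checkout()` and exit with its return value, which turns "always use worktrees" from an instruction into a condition that blocks the action.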

Closing Thoughts

The most useful mental model I've found is to treat AI like another engineer on the team.

A very fast engineer.

A very capable engineer.

But still an engineer operating within a system.

Good engineering organizations do not rely on individual engineers to remember every rule, every process, and every validation step.

They create systems that make the right thing easy and the wrong thing difficult.

The same principle applies to AI.

Use strong models for planning.

Use automated feedback loops during execution.

Enforce deterministic validation.

Keep changes small.

And whenever something absolutely must happen every time, automate it.

PS: I'm building Ethos in public — an AI agent with a soul: locked core values, evolving expression, governed self-improvement. If this kind of thing interests you, come say hi: @EthosAgentAI.
