Vikrant Bagal

Posted on May 8

Production-Grade Engineering Skills for AI Coding Agents

#ai #programming #claude #agentskills

AI coding agents have revolutionized how we write software. They can implement features, fix bugs, and review code at incredible speed. But there's a catch: AI agents default to the shortest path, which often means skipping specs, tests, security reviews, and the practices that make software reliable.

The solution? Production-grade engineering skills for AI coding agents—structured workflows that enforce the same discipline senior engineers bring to production code.

The Problem: AI Agents Need Guardrails

When you give an AI agent a vague prompt like "build a dashboard," it will produce something that looks functional. But will it be:

✅ Well-specified with clear success criteria?
✅ Tested with comprehensive coverage?
✅ Secure against common vulnerabilities?
✅ Performant and maintainable?

Without structured workflows, the answer is often "no." The agent optimizes for "looks right" rather than "is right."

The Solution: Agent Skills

Agent Skills is a collection of 20 structured workflows for AI coding agents. With 33,000+ stars on GitHub, it has seen wide adoption as a baseline for reliable AI-assisted development.

Each skill encodes hard-won engineering judgment from Google's engineering culture, including concepts from Software Engineering at Google and Google's engineering practices guide.

The 20 Skills Cover the Entire Lifecycle

Define Phase:

Idea Refinement - Structured divergent/convergent thinking
Spec-Driven Development - Write a PRD before any code

Plan Phase:

Planning and Task Breakdown - Decompose specs into verifiable tasks

Build Phase:

Incremental Implementation - Thin vertical slices with feature flags
Context Engineering - Feed agents the right information at the right time
Source-Driven Development - Ground decisions in official documentation
Frontend UI Engineering - Component architecture, design systems, accessibility
API and Interface Design - Contract-first design, error semantics
Test-Driven Development - RED-GREEN-REFACTOR workflow

Verify Phase:

Browser Testing with DevTools - Chrome DevTools MCP for runtime data
Debugging and Error Recovery - Five-step triage: reproduce, localize, reduce, fix, guard

Review Phase:

Code Review and Quality - Five-axis review, change sizing, severity labels
Code Simplification - Chesterton's Fence, Rule of 500
Security and Hardening - OWASP Top 10 prevention, auth patterns
Performance Optimization - Measure-first approach, Core Web Vitals

Ship Phase:

Git Workflow and Versioning - Trunk-based development, atomic commits
CI/CD and Automation - Shift Left, Faster is Safer, feature flags
Deprecation and Migration - Code-as-liability mindset
Documentation and ADRs - Architecture Decision Records
Shipping and Launch - Pre-launch checklists, staged rollouts

Deep Dive: Spec-Driven Development

The most critical skill is Spec-Driven Development. Before writing any code, the agent creates a specification from the template below.

The Six-Core Spec Template

# Spec: [Project/Feature Name]

## Objective
What we're building and why. User stories or acceptance criteria.

## Tech Stack
Framework, language, key dependencies with versions

## Commands
Build: npm run build
Test: npm test -- --coverage
Lint: npm run lint -- --fix
Dev: npm run dev

## Project Structure
src/ → Application source code
src/components → React components
src/lib → Shared utilities
tests/ → Unit and integration tests

## Code Style
Example snippet + key conventions

## Testing Strategy
Framework, test locations, coverage requirements

## Boundaries
- Always: Run tests before commits, follow naming conventions
- Ask first: Database schema changes, adding dependencies
- Never: Commit secrets, edit vendor directories

## Success Criteria
Specific, testable conditions for completion

## Open Questions
Anything unresolved that needs human input
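
One way to keep a spec honest is to check mechanically that every section of the template is present before work starts. The sketch below is a hypothetical helper, not part of Agent Skills: the name `findMissingSections` and the assumption that each section is written as a `## ` heading are illustrative only.

```typescript
// Hypothetical helper, not part of Agent Skills.
// Section names are copied from the template above.
const REQUIRED_SECTIONS = [
  "Objective",
  "Tech Stack",
  "Commands",
  "Project Structure",
  "Code Style",
  "Testing Strategy",
  "Boundaries",
  "Success Criteria",
  "Open Questions",
] as const;

// Returns the required sections that have no matching "## <name>" heading.
function findMissingSections(specMarkdown: string): string[] {
  const headings = new Set(
    specMarkdown
      .split("\n")
      .filter((line) => line.startsWith("## "))
      .map((line) => line.slice(3).trim().toLowerCase()),
  );
  return REQUIRED_SECTIONS.filter((section) => !headings.has(section.toLowerCase()));
}

// Usage: a draft that only has two sections so far.
const draft = "# Spec: Task API\n\n## Objective\nTrack tasks.\n\n## Commands\nTest: npm test\n";
const missing = findMissingSections(draft);
// missing holds the seven sections still to be written, starting with "Tech Stack"
```

A check like this could run before the planning step so that an incomplete spec is flagged to the human rather than silently filled in by the agent.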

Why This Works

Surfaces Assumptions Early - The spec forces clarity before code
Shared Source of Truth - Human and agent agree on what "done" means
Prevents Rework - A 15-minute spec can prevent hours of debugging
Living Document - Updated when decisions change, committed to version control

Deep Dive: Test-Driven Development

The Test-Driven Development skill enforces the RED-GREEN-REFACTOR cycle:

The Cycle

RED: Write a test that fails (a failing run proves the test actually detects the missing behavior)
GREEN: Write minimal code to make it pass
REFACTOR: Clean up the implementation
Repeat for each new behavior

Example: Task Service

// task-service.test.ts (file names and import path are illustrative)
// RED: This test fails because createTask doesn't exist yet
import * as taskService from './task-service';

describe('TaskService', () => {
  it('creates a task with title and default status', async () => {
    const task = await taskService.createTask({ title: 'Buy groceries' });
    expect(task.id).toBeDefined();
    expect(task.title).toBe('Buy groceries');
    expect(task.status).toBe('pending');
    expect(task.createdAt).toBeInstanceOf(Date);
  });
});

// task-service.ts
// GREEN: Minimal implementation
// Task, generateId, and db are assumed to be defined elsewhere in the project.
export async function createTask(input: { title: string }): Promise<Task> {
  const task = {
    id: generateId(),
    title: input.title,
    status: 'pending' as const,
    createdAt: new Date(),
  };
  await db.tasks.insert(task);
  return task;
}
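
The example stops at GREEN. As a sketch of what the next turn of the cycle might look like, the block below adds a second behavior (rejecting blank titles) and then refactors by extracting the validation. To run on its own it makes several assumptions: an in-memory array replaces `db`, a counter replaces `generateId`, `createTask` is synchronous, and `validateTitle` is a hypothetical name.

```typescript
// Self-contained sketch: an in-memory array stands in for the database.
type Task = {
  id: string;
  title: string;
  status: "pending" | "done";
  createdAt: Date;
};

const tasks: Task[] = [];
let nextId = 1;

// REFACTOR: validation was extracted into its own function after the
// blank-title check went green, so createTask stays short.
function validateTitle(title: string): string {
  const trimmed = title.trim();
  if (trimmed.length === 0) {
    throw new Error("Task title must not be empty");
  }
  return trimmed;
}

// Synchronous here only to keep the sketch short.
function createTask(input: { title: string }): Task {
  const task: Task = {
    id: String(nextId++),
    title: validateTitle(input.title),
    status: "pending",
    createdAt: new Date(),
  };
  tasks.push(task);
  return task;
}

// RED for the new behavior: written before validateTitle existed, this
// check returned false because no error was thrown.
function rejectsBlankTitle(): boolean {
  try {
    createTask({ title: "   " });
    return false;
  } catch {
    return true;
  }
}
```

The point of the refactor step is that both checks (default status and blank-title rejection) stay green while the structure of the code changes.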

Test Pyramid

80% Unit Tests (small, fast, isolated)
15% Integration Tests (component interactions, API boundaries)
5% E2E Tests (full user flows, real browser)
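
To make the ratios concrete, the sketch below turns the 80/15/5 split into target counts for a suite of a given size. `pyramidTargets` is a hypothetical helper, and the rounding rule (round the unit and integration counts, give the remainder to E2E) is an assumption.

```typescript
type PyramidTargets = { unit: number; integration: number; e2e: number };

// Splits a total test count using the 80/15/5 ratios from the pyramid above.
// E2E takes the remainder so the three numbers always sum to `total`.
function pyramidTargets(total: number): PyramidTargets {
  const unit = Math.round(total * 0.8);
  const integration = Math.round(total * 0.15);
  return { unit, integration, e2e: total - unit - integration };
}

const targets = pyramidTargets(200);
// targets is { unit: 160, integration: 30, e2e: 10 }
```

The numbers are a guide for where effort goes, not a quota to hit exactly.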

Real-World Examples

Example 1: Bug Fix with TDD

Bug Report: "Completing a task doesn't update completedAt timestamp"

Agent writes failing test that reproduces the bug
Test confirms bug exists (RED)
Agent implements fix (GREEN)
Agent runs full test suite to ensure no regressions
Result: Bug fixed, with a regression test that guards against it returning
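
The first three steps might look like the following sketch. The `Task` shape, the `"completed"` status value, and the `completeTask` function are assumptions inferred from the bug report; the post does not show this code. The current time is passed in as a parameter so the check is deterministic.

```typescript
type Task = {
  id: string;
  title: string;
  status: "pending" | "completed";
  completedAt?: Date;
};

// GREEN: the fix. The buggy version (assumed) changed only `status`
// and never set `completedAt`.
function completeTask(task: Task, now: Date = new Date()): Task {
  return { ...task, status: "completed", completedAt: now };
}

// RED first: this check reproduces the bug report. It fails against the
// buggy version and passes once completedAt is set.
function completedAtIsSet(): boolean {
  const now = new Date("2024-01-01T00:00:00Z");
  const pending: Task = { id: "1", title: "Buy groceries", status: "pending" };
  const done = completeTask(pending, now);
  return done.status === "completed" && done.completedAt?.getTime() === now.getTime();
}
```

Keeping the reproducing check in the suite is what turns a one-off fix into lasting protection.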

Example 2: Parallel Agent Workflow

Scenario: Full-stack feature implementation

Agent 1 (backend): Implements API endpoints in feature branch A
Agent 2 (frontend): Builds React components in feature branch B
Agent 3 (tests): Writes integration tests in feature branch C
Human: Reviews and merges all branches after parallel completion

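Result: The three workstreams run concurrently, so wall-clock time can approach a third of the sequential time when the tasks are independent. Review and merge effort still falls on the human.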

Example 3: Security Review

Agent Skills includes a security-and-hardening skill, which covers:

OWASP Top 10 prevention patterns
Authentication and authorization patterns
Secrets management and dependency auditing
Three-tier boundary system

Running the security skill before every code merge catches vulnerabilities early.
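
As a rough illustration of a three-tier boundary check, the sketch below sorts a proposed agent action into `proceed`, `ask-first`, or `never`. It assumes the tiers mirror the Always / Ask first / Never rules in the spec template's Boundaries section, with `proceed` meaning the action may go ahead under the Always rules. The `Action` type, the path patterns, and `classifyAction` are hypothetical and not taken from the skill itself.

```typescript
type Tier = "proceed" | "ask-first" | "never";

// A change the agent proposes to make (hypothetical shape).
type Action = { kind: "edit" | "add-dependency" | "schema-change"; path: string };

// Paths that commonly hold secrets or vendored code; patterns are illustrative.
const NEVER_PATTERNS: RegExp[] = [/(^|\/)\.env(\.|$)/, /(^|\/)vendor\//];

function classifyAction(action: Action): Tier {
  // "Never" wins over everything else.
  if (NEVER_PATTERNS.some((pattern) => pattern.test(action.path))) {
    return "never";
  }
  // Schema and dependency changes need a human decision first.
  if (action.kind === "schema-change" || action.kind === "add-dependency") {
    return "ask-first";
  }
  return "proceed";
}
```

For example, an edit to `src/lib/util.ts` would be classified as `proceed`, a new dependency in `package.json` as `ask-first`, and any edit under `vendor/` as `never`.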

Common Pitfalls (And How to Avoid Them)

1. Over-Prompting

Problem: Long, repetitive, contradictory prompts confuse agents
Solution: Say what you need once, clearly. Restate rather than append.

2. Under-Reviewing

Problem: Assuming AI-generated code is correct because it looks right
Solution: Review every diff like a pull request from a teammate

3. Skipping Git Isolation

Problem: Running agents on main branch leads to conflicts
Solution: Always use feature branches; use git worktrees for parallel agents
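
For parallel agents, each one can get its own worktree and branch so their working directories never collide. The sketch below only builds the `git worktree add` command strings and does not run them. The agent names, the sibling-directory layout, and the `worktreeCommands` helper are hypothetical.

```typescript
type AgentWorkspace = { agent: string; branch: string; path: string; command: string };

// Builds one `git worktree add -b <branch> <path> <base>` command per agent.
// Paths are siblings of the main checkout (an assumed layout).
function worktreeCommands(repoName: string, agents: string[], base = "main"): AgentWorkspace[] {
  return agents.map((agent) => {
    const branch = `feature/${agent}`;
    const path = `../${repoName}-${agent}`;
    return { agent, branch, path, command: `git worktree add -b ${branch} ${path} ${base}` };
  });
}

const workspaces = worktreeCommands("dashboard", ["backend", "frontend", "tests"]);
// workspaces[0].command is "git worktree add -b feature/backend ../dashboard-backend main"
```

Each agent then works in its own directory on its own branch, and the human merges the branches once review is done.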

4. Ignoring Session Boundaries

Problem: One long session leads to context bloat and inconsistent decisions
Solution: Start fresh sessions for new tasks; keep sessions focused

5. Testing Implementation Details

Problem: Tests break during refactoring even though behavior is unchanged
Solution: Test inputs and outputs, not internal structure
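
A small sketch of the difference: the check below asserts only on what `cartTotal` returns for a given input, so the implementation can move from `reduce` to a plain loop (or anything else) without breaking it. `cartTotal` and the `LineItem` type are hypothetical examples, not taken from Agent Skills.

```typescript
type LineItem = { priceCents: number; quantity: number };

// Implementation detail: today this uses reduce. A for-loop would be
// equally valid and must not break the behavior check below.
function cartTotal(items: LineItem[]): number {
  return items.reduce((sum, item) => sum + item.priceCents * item.quantity, 0);
}

// Behavior check: inputs in, output out. It does not spy on helper
// calls or inspect private state.
function totalsMatch(): boolean {
  const items: LineItem[] = [
    { priceCents: 250, quantity: 2 },
    { priceCents: 199, quantity: 1 },
  ];
  return cartTotal(items) === 699 && cartTotal([]) === 0;
}
```

A test that instead asserted "reduce was called once" would fail after a harmless refactor, which is exactly the pitfall described above.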

Getting Started

Install Agent Skills

For Claude Code:

/plugin marketplace add addyosmani/agent-skills
/plugin install agent-skills@addy-agent-skills

For Cursor: Copy SKILL.md files into .cursor/rules/

For Gemini CLI:

gemini skills install https://github.com/addyosmani/agent-skills.git --path skills

Create Your First AGENTS.md

Create an AGENTS.md file at your repository root with:

Project layout and important directories
Build, test, lint commands
Engineering conventions
Constraints and do-not rules

Start with Spec-Driven Development

Create a new feature branch
Run the spec-driven development skill
Write the spec with human review
Break into tasks with acceptance criteria
Implement incrementally with tests

The Bottom Line

AI coding agents are powerful but need guardrails. Production-grade engineering skills provide the structure, workflows, and best practices that make AI-assisted development reliable.

The engineers who write the best prompts aren't the most productive. The ones with the best processes around prompting are.

Start with spec-driven development. Add test-driven development. Review everything. Manage context like a resource. And watch your AI coding agents transform from fast code generators to reliable engineering partners.

Ready to level up your AI coding workflow? Start with the Agent Skills repository and implement one skill at a time. Your future self (and your production environment) will thank you.

LinkedIn: https://www.linkedin.com/in/vikrant-bagal

DEV Community