DEV Community

冉淦元

Aegis: A Method Pack for More Reliable AI Coding Agents

AI coding agents are getting much better at writing code.

But in real engineering work, the hard problems are often not just about whether the model can generate a patch.

The harder questions are:

  • Did it read the right project baseline first?
  • Did it understand the actual owner of the behavior?
  • Did it verify the fix before claiming completion?
  • Did it distinguish facts from assumptions?
  • Did it preserve architecture boundaries?
  • Did it leave enough evidence for a human or another agent to continue safely?

I built Aegis to work on that layer.

GitHub: https://github.com/GanyuanRan/Aegis

What is Aegis?

Aegis is an open-source method pack for AI coding agents.

It is not a new model.

It is not an IDE.

It is not a runtime core or an authoritative gate.

Instead, Aegis provides workflow discipline that can be installed into different AI coding hosts.

The current positioning is:

Aegis Method Pack: runtime-ready workflow discipline for AI coding agents.

That means Aegis focuses on the behaviors around the coding agent:

  • how it starts a task
  • how it reads project context
  • how it plans
  • how it debugs
  • how it applies TDD
  • how it verifies completion
  • how it reports residual risk
  • how it avoids treating a method-layer checklist as runtime authority

Why I built it

When working with AI coding agents, I kept seeing the same failure patterns.

The agent would often:

  • skip the architecture baseline
  • patch the consumer instead of the canonical owner
  • add another fallback instead of retiring the old path
  • pass one narrow test and claim the whole task was done
  • forget to report architecture drift
  • lose the user's requested language or output format
  • treat logs or tool output as prompt payload instead of evidence
  • confuse the target project with the installed method-pack support path

Better prompts alone do not solve these issues.

They need repeatable workflow pressure.

What Aegis includes

Aegis currently includes workflow guidance for:

  • baseline-first project context
  • brainstorming and design clarification
  • first-principles review
  • writing implementation plans
  • test-driven development
  • systematic debugging
  • long-task continuation
  • verification before completion
  • architecture alignment reporting
  • ADR backfill checks
  • cross-host method-pack installation guidance

The goal is not to make the agent more verbose.

The goal is to make the agent less likely to skip the boring steps that protect real projects.

Example: verification before completion

Aegis treats completion claims as something that must be backed by evidence.

Instead of saying:

Done, should work now.

Aegis pushes the agent toward a compact evidence shape:

Evidence Card:
- Command / Check:
- Exit Status:
- Covered:
- Not Covered:
- Residual Risk:
- Confidence:
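
To make the shape concrete, here is a minimal Python sketch of how such a card could be modeled and rendered. It is an illustration only, not code from the Aegis repo: the `EvidenceCard` class, its field names, the `supports_completion_claim` rule, and the sample `pytest` command are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class EvidenceCard:
    """Hypothetical container mirroring the Evidence Card fields."""

    command: str
    exit_status: int
    covered: list[str] = field(default_factory=list)
    not_covered: list[str] = field(default_factory=list)
    residual_risk: str = "unknown"
    confidence: str = "low"

    def supports_completion_claim(self) -> bool:
        # Assumed rule for this sketch: the check passed and covered something.
        return self.exit_status == 0 and bool(self.covered)

    def render(self) -> str:
        def join(items: list[str]) -> str:
            return ", ".join(items) if items else "none"

        return "\n".join([
            "Evidence Card:",
            f"- Command / Check: {self.command}",
            f"- Exit Status: {self.exit_status}",
            f"- Covered: {join(self.covered)}",
            f"- Not Covered: {join(self.not_covered)}",
            f"- Residual Risk: {self.residual_risk}",
            f"- Confidence: {self.confidence}",
        ])


# Sample values are invented for illustration.
card = EvidenceCard(
    command="pytest tests/test_parser.py",
    exit_status=0,
    covered=["parser unit tests"],
    not_covered=["end-to-end flow"],
    residual_risk="integration paths untested",
    confidence="medium",
)
print(card.render())
```

Keeping "Not Covered" as a required part of the rendered output is the point of the sketch: an empty list still prints as "none", so the gap is stated rather than silently omitted.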

For architecture-sensitive work, it also asks for an explicit architecture alignment result:

Architecture Alignment:
- Trigger:
- Scope:
- Baseline checked:
- Result: aligned | architecture drift | architecture defect
- Evidence:
- Residual architecture risk:
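
One way to keep the Result field honest is to accept only the three listed values. The following Python sketch shows that idea under stated assumptions: `AlignmentResult` and `parse_alignment_result` are hypothetical names, and Aegis itself may not parse reports this way.

```python
from enum import Enum


class AlignmentResult(Enum):
    """The three result values named in the template."""

    ALIGNED = "aligned"
    DRIFT = "architecture drift"
    DEFECT = "architecture defect"


def parse_alignment_result(report: str) -> AlignmentResult:
    """Find the '- Result:' line and map it to one allowed value."""
    prefix = "- Result:"
    for line in report.splitlines():
        stripped = line.strip()
        if stripped.startswith(prefix):
            value = stripped[len(prefix):].strip().lower()
            # Enum lookup raises ValueError for anything outside the three values,
            # including a template line that was never filled in.
            return AlignmentResult(value)
    raise ValueError("report has no '- Result:' line")


# Sample report text is invented for illustration.
report = """Architecture Alignment:
- Trigger: changed module boundary
- Result: architecture drift
"""
print(parse_alignment_result(report).value)
```

A check like this stays advisory: it only confirms that the agent committed to one of the allowed outcomes, not that the outcome is correct.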

This is still advisory method-pack discipline.

It does not grant final authority.

It does not become a runtime GateDecision.

It simply makes skipped reasoning harder to hide.

Example: workspace helper boundaries

One recent fix in Aegis was about a subtle but important boundary.

A method-pack helper belongs to the installed Aegis method-pack path, not to the target project.

The target project should be passed separately:

python <aegis-workspace-helper> check --root <target-project-root>
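
The ownership split behind that command can be sketched in Python. This is not the actual Aegis helper: the function names, the returned fields, and the temporary directories standing in for the install path and the project are assumptions for illustration. Only the `check --root` shape comes from the command above.

```python
import argparse
import tempfile
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="workspace-helper-sketch")
    subcommands = parser.add_subparsers(dest="command", required=True)
    check_cmd = subcommands.add_parser("check")
    check_cmd.add_argument("--root", required=True, type=Path,
                           help="target project root, always passed explicitly")
    return parser


def check_workspace(helper_file: Path, target_root: Path) -> dict:
    """Report the helper location and the target root as separate facts."""
    helper_dir = helper_file.resolve().parent
    root = target_root.resolve()
    return {
        "helper_dir": str(helper_dir),
        "target_root": str(root),
        "target_exists": root.is_dir(),
        # True only if the helper happens to live inside the target project.
        "helper_inside_target": root == helper_dir or root in helper_dir.parents,
    }


def main(argv: list[str], helper_file: Path) -> dict:
    args = build_parser().parse_args(argv)
    return check_workspace(helper_file, args.root)


# Two temporary directories stand in for the install path and the project.
with tempfile.TemporaryDirectory() as pack_dir, tempfile.TemporaryDirectory() as project_dir:
    helper = Path(pack_dir) / "workspace_helper.py"  # hypothetical helper file
    result = main(["check", "--root", project_dir], helper_file=helper)
    print(result["helper_inside_target"], result["target_exists"])
```

Because the target root is a required argument rather than something inferred from the helper's own location, the sketch never assumes the project ships its own copy of the script.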

That sounds small, but it prevents agents from assuming that every project repository must contain its own scripts/aegis-workspace.py.

This is the kind of problem Aegis tries to catch: not just whether a command exists, but whether the ownership model is correct.

What Aegis is not

Aegis deliberately does not claim to be a full runtime platform.

It does not own:

  • authoritative runtime core decisions
  • authoritative GateDecision
  • authoritative PolicySnapshot
  • final completion authority

That boundary matters.

A method pack can improve behavior, structure evidence, and make workflows more reliable.

But it should not pretend to be the final source of truth for a project.

The target project's rules, architecture baseline, ADRs, and human decisions still matter.

Who might find this useful?

Aegis may be useful if you:

  • use AI coding agents on real codebases
  • care about architecture drift
  • want stronger verification before completion claims
  • want repeatable debugging and TDD workflows
  • work across multiple agent hosts
  • want agents to preserve project-specific rules instead of inventing new owners

It is probably less useful if you only want a lightweight one-shot coding assistant for small isolated snippets.

Try it

The repo is here:

https://github.com/GanyuanRan/Aegis

The README includes host-specific install notes and verification commands.

A typical verification path includes:

python scripts/aegis-doctor.py --write-config --json
bash tests/e2e/layer1-fast-check.sh --host-profile none
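
If you want to capture the results of a verification path as evidence rather than reading them off the terminal, a small wrapper is enough. The sketch below is a generic Python example, not part of Aegis: `run_check` and `run_all` are hypothetical names, and the commands are harmless stand-ins so the example runs anywhere. In practice the list would hold the two commands above.

```python
import subprocess
import sys


def run_check(command: list[str]) -> dict:
    """Run one check and record what actually happened."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return {
        "command": " ".join(command),
        "exit_status": completed.returncode,
        # Keep only the last few lines so the record stays compact.
        "stdout_tail": completed.stdout.strip().splitlines()[-5:],
        "stderr_tail": completed.stderr.strip().splitlines()[-5:],
    }


def run_all(commands: list[list[str]]) -> list[dict]:
    """Run checks in order and stop at the first failure."""
    results = []
    for command in commands:
        result = run_check(command)
        results.append(result)
        if result["exit_status"] != 0:
            break  # later checks are reported as not run, not as passed
    return results


# Stand-in commands: one passes, one fails, one is never reached.
checks = [
    [sys.executable, "-c", "print('ok')"],
    [sys.executable, "-c", "import sys; sys.exit(3)"],
    [sys.executable, "-c", "print('never reached')"],
]
results = run_all(checks)
for entry in results:
    print(entry["exit_status"], entry["command"])
```

Stopping at the first failure means the result list is shorter than the command list, which makes "not run" visible instead of letting a skipped check look like a pass.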

The project is still evolving, and feedback is welcome.

I am especially interested in feedback on:

  • whether the method-pack boundary is clear
  • whether the install flow is understandable
  • which AI coding workflows should be hardened next
  • what failure modes people see most often in real agent-assisted development

Closing thought

AI coding agents are no longer just code generators.

They are becoming collaborators in planning, debugging, refactoring, verification, and handoff.

That means the surrounding workflow matters.

Aegis is my attempt to make that workflow more explicit, testable, and reusable.

Disclosure: I used AI assistance to draft and edit this post, then reviewed and adapted it before publishing.

Top comments (1)

冉淦元

Happy to answer questions about the method-pack boundary, host support, or the workflow checks. I am especially looking for feedback from people using coding agents on real codebases.