SetraTheX

Posted on May 31

I’m building a local-first multi-agent workflow for AI-assisted coding

#opensource #codex #ai #vibecoding


Over the last few days, I have been working heavily on an open-source project called Codex Engineering Workflow Pack (CEWP).

GitHub: https://github.com/SetraTheXX/Codex-Engineering-Workflow-Pack
npm: https://www.npmjs.com/package/@setrathex/codex-engineering-workflow-pack

It started as a simple idea:

What if Codex had a reusable engineering workflow instead of starting from scratch every time?

At first, the project was only a skill pack. The goal was to give Codex structured workflows for things like:

writing PRDs
slicing issues
debugging
TDD
handoffs
architecture review
codebase analysis

That was useful, but after using AI coding tools more seriously, I started seeing a bigger problem.

AI coding often becomes messy when the workflow is not structured.

A model can generate code quickly, but the process around it is usually unclear:

What exactly is the task?
Which files is the AI allowed to edit?
How do we isolate work?
How do we review the output?
What happens if two agents edit overlapping files?
How do we prevent accidental changes outside the task scope?
How do we keep an audit trail?

That is the problem I wanted to solve.

The goal

My goal with CEWP is not simply to make AI write more code.

The goal is to make AI-assisted development more structured, auditable, and closer to a real engineering workflow.

Instead of one AI model randomly editing the repo, I want a workflow where:

A manager role plans the work.
Worker roles implement isolated tasks.
A reviewer role checks the outputs.
File scopes are enforced.
Dangerous boundaries are guarded.
The user gets a final report before critical actions.

In other words:

AI should not just “code”. It should work inside a controlled engineering system.

What CEWP became in v0.2

With v0.2.0-beta.1, CEWP is no longer just a skill pack.

It now includes a local-first workflow runtime built around the cewp CLI.

Some of the current pieces are:

Codex skill pack
cewp CLI
Coordinator Mode runtime
Git worktree isolation
worker / reviewer roles
dispatch planning
guarded execution
sequential and parallel workers
reviewer gates
finalize / cleanup / prune helpers
operator policy modes
harness smoke tests

The project is still in beta, but it now behaves much more like a developer tooling product than a collection of prompt files.

Why local-first?

I wanted CEWP to be local-first because AI coding workflows usually happen inside a real repository.

The repo already contains the important context:

source code
README files
docs
roadmap files
PRDs
issues
tests
local scripts
Git history

So CEWP stores its runtime state inside the repo under .cewp/.

A run has its own folder:

.cewp/runs/<run-id>/

That run can contain things like:

run metadata
board/task state
prompts
reports
review packets
adapter output
event logs

This makes the workflow visible and inspectable. The user can see what happened instead of treating the AI as a black box.

Worktree isolation

One of the most important parts of CEWP is Git worktree isolation.

When workers run, they do not all edit the same working directory.

Each worker can get its own Git worktree.

That matters because parallel AI work can become dangerous very quickly if multiple agents edit the same files in the same directory.

With separate worktrees, each worker has a separate workspace. This makes it easier to inspect, collect, and validate changes.

The basic idea is:

worker-a -> separate worktree -> task A
worker-b -> separate worktree -> task B
reviewer -> checks collected output

This is the foundation for controlled parallel work.
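To make the isolation idea concrete, here is a minimal sketch of how one worktree per worker could be requested from Git. It is not CEWP's actual implementation: the function name, the `.cewp/worktrees/...` path layout, and the `cewp/<run>/<worker>` branch naming are all assumptions made for illustration. The only real interface used is `git worktree add -b <branch> <path> <commit-ish>`.

```python
from pathlib import PurePosixPath


def worktree_add_command(run_id: str, worker: str, base: str = "HEAD") -> list[str]:
    """Build the `git worktree add` argv for one worker.

    The path layout and branch naming are hypothetical; CEWP may use others.
    """
    path = PurePosixPath(".cewp") / "worktrees" / run_id / worker
    branch = f"cewp/{run_id}/{worker}"
    # `-b` creates a new branch and checks it out in its own directory,
    # so two workers never share a working tree.
    return ["git", "worktree", "add", "-b", branch, str(path), base]


# One command per worker; each one targets a different directory and branch.
commands = [worktree_add_command("run-001", w) for w in ("worker-a", "worker-b")]
for argv in commands:
    print(" ".join(argv))
```

The sketch only builds the commands. Running them (for example with `subprocess.run(argv, check=True)`) is left to the caller, which keeps the planning step separate from the step that touches the repository.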

Parallel workers

The part I spent the most effort on recently was the parallel worker system.

The idea is simple:

If two tasks are independent, two workers should be able to work at the same time.

But “parallel AI agents” only make sense if the system checks the risk first.

Before parallel execution, CEWP checks things like:

Are the workers using separate worktrees?
Are their file scopes overlapping?
Are allowedFiles defined?
Are forbiddenFiles respected?
Are output paths separate?
Are worktree paths safe?
Will one worker accidentally touch the other worker’s scope?

For example, these should be treated as overlapping:

docs/**
docs/install.md

They overlap because docs/install.md is inside docs/**.

CEWP now detects these cases and blocks unsafe parallel execution.
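One way to approximate this kind of overlap check is to compare the literal directory prefix of each pattern. The sketch below is an illustration only, not CEWP's detection logic. It assumes patterns use just `*`, `?`, and `[` as wildcards, and it is deliberately conservative: when in doubt it reports an overlap, which is the safer failure mode for a parallel-execution gate. The names `literal_prefix`, `scopes_overlap`, and `find_overlaps` are hypothetical.

```python
from itertools import combinations

WILDCARDS = set("*?[")


def literal_prefix(pattern: str) -> tuple[list[str], bool]:
    """Return the leading wildcard-free path segments and whether a wildcard follows."""
    parts: list[str] = []
    for segment in pattern.strip("/").split("/"):
        if WILDCARDS & set(segment):
            return parts, True
        parts.append(segment)
    return parts, False


def scopes_overlap(a: str, b: str) -> bool:
    """Conservatively decide whether two scope patterns could match the same file."""
    prefix_a, glob_a = literal_prefix(a)
    prefix_b, glob_b = literal_prefix(b)
    shared = min(len(prefix_a), len(prefix_b))
    if prefix_a[:shared] != prefix_b[:shared]:
        return False  # required leading segments diverge, so no common file
    if not glob_a and not glob_b:
        return prefix_a == prefix_b  # two literal paths overlap only if identical
    return True  # at least one wildcard under a shared prefix: assume overlap


def find_overlaps(scopes: dict[str, list[str]]) -> list[tuple[str, str, str, str]]:
    """List (worker, pattern, other_worker, other_pattern) for every overlapping pair."""
    hits = []
    for (wa, pats_a), (wb, pats_b) in combinations(scopes.items(), 2):
        for pa in pats_a:
            for pb in pats_b:
                if scopes_overlap(pa, pb):
                    hits.append((wa, pa, wb, pb))
    return hits


overlaps = find_overlaps({
    "worker-a": ["docs/**"],
    "worker-b": ["docs/install.md", "README.md"],
})
print(overlaps)
```

A gate built on this would refuse to start parallel workers whenever `find_overlaps` returns a non-empty list, and fall back to sequential execution or ask the user to narrow the scopes.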

The goal is not “run everything at once”.

The goal is:

Run things in parallel only when the workflow can prove that the tasks are isolated enough.

File scope guardrails

AI coding tools can accidentally edit files outside the intended task.

So CEWP uses task-level file scopes.

A worker task can define:

{
  "allowedFiles": ["README.md", "docs/install.md"],
  "forbiddenFiles": ["package.json"]
}

In v0.2.0-beta.1, real worker execution now requires explicit non-empty allowedFiles.

That means a worker cannot run with an empty scope and freely edit the repo.
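As a rough illustration of how such a scope could be applied to a list of changed files, here is a small sketch. It is an assumption-laden example rather than CEWP's code: the function name `check_scope` and the violation labels are invented here, and matching uses Python's `fnmatch`, whose `*` also matches `/`, which may differ from the glob rules CEWP applies.

```python
from fnmatch import fnmatchcase


def check_scope(changed: list[str], allowed: list[str], forbidden: list[str]) -> list[tuple[str, str]]:
    """Return (path, reason) for every changed file that violates the task scope."""
    if not allowed:
        # Mirrors the rule above: no worker runs with an empty scope.
        raise ValueError("allowedFiles must be a non-empty list")
    violations = []
    for path in changed:
        if any(fnmatchcase(path, pattern) for pattern in forbidden):
            violations.append((path, "forbidden"))
        elif not any(fnmatchcase(path, pattern) for pattern in allowed):
            violations.append((path, "outside allowedFiles"))
    return violations


scope = {
    "allowedFiles": ["README.md", "docs/install.md"],
    "forbiddenFiles": ["package.json"],
}
result = check_scope(
    ["README.md", "package.json", "src/index.js"],
    scope["allowedFiles"],
    scope["forbiddenFiles"],
)
print(result)
```

Forbidden patterns are checked first, so a file that matches both lists is still reported as forbidden.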

CEWP also checks both:

uncommitted changes
committed branch changes

This part is important.

At one point, an automated review pointed out a real problem: if a worker committed a file before returning, git status could look clean, and the scope check might miss the change.

That was fixed by recording a baseCommit and checking committed changes since that base.

So the worker cannot bypass scope checks just by committing its changes.
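The idea can be sketched with three standard Git commands: `git diff --name-only <base>..HEAD` for committed changes, `git diff --name-only HEAD` for uncommitted changes to tracked files, and `git ls-files --others --exclude-standard` for untracked files. The helper names below are hypothetical and this is not taken from CEWP's source. It only shows why recording a base commit closes the gap.

```python
import subprocess
from typing import Callable, Sequence


def run_git(args: Sequence[str], cwd: str = ".") -> str:
    """Run a git command and return its stdout (raises if git exits non-zero)."""
    completed = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return completed.stdout


def changed_files(base_commit: str, git: Callable[[Sequence[str]], str] = run_git) -> set[str]:
    """Collect every path a worker touched since base_commit, committed or not."""
    outputs = (
        git(["diff", "--name-only", f"{base_commit}..HEAD"]),   # committed on the branch
        git(["diff", "--name-only", "HEAD"]),                   # staged or unstaged edits
        git(["ls-files", "--others", "--exclude-standard"]),    # new untracked files
    )
    return {line for out in outputs for line in out.splitlines() if line}
```

The `git` parameter exists so the function can be exercised with a fake runner in tests. With the default, it would be called as `changed_files(base_commit)` from inside the worker's worktree, and the resulting set is what a scope check would then inspect.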

Reviewer gate

The workflow does not end after workers finish.

CEWP collects worker reports and creates a review packet.

The reviewer then checks the output and writes a decision.

Finalize requires:

Decision: PASS

This means finalize is not just “whatever the worker did is accepted”.

There is a separate review gate.

This is important because AI-generated changes should be inspected before becoming the final run state.
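A gate like this can be small. The sketch below assumes the reviewer's output is plain text containing a single line of the form `Decision: PASS`. The function names and the rule that ambiguous reviews (no decision line, or more than one) are rejected are assumptions for illustration, not a description of CEWP's parser.

```python
import re
from typing import Optional

# One decision per line, e.g. "Decision: PASS".
DECISION_RE = re.compile(r"^Decision:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def review_decision(review_text: str) -> Optional[str]:
    """Return the reviewer's decision, or None if it is missing or ambiguous."""
    matches = DECISION_RE.findall(review_text)
    if len(matches) != 1:
        return None
    return matches[0]


def can_finalize(review_text: str) -> bool:
    """Finalize is allowed only on an explicit, unambiguous PASS."""
    return review_decision(review_text) == "PASS"


print(can_finalize("Summary: scope respected.\nDecision: PASS\n"))
```

Failing closed on anything other than an exact `PASS` keeps a truncated or malformed review from being read as approval.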

Operator policy modes

Another thing I wanted was a permission model.

Some users want a safe, step-by-step flow.

Some users want to give more authority to the tool.

I personally often use AI coding tools with a lot of local permission, but only when the task is clear and the repository boundaries are well defined.

So CEWP now has policy modes:

safe
trusted
full-authority

The default is safe.

In safe mode, high-impact actions are blocked.

In full-authority mode, CEWP can run local workflow actions with fewer pauses, such as:

workers
reviewer
pipeline
finalize
cleanup
prune deletion

But full-authority does not disable the guardrails.

Even in full-authority mode:

allowedFiles still matters
forbiddenFiles still matters
worktree isolation still matters
reviewer gates still matter
target worktree safety still matters
no automatic push / publish / release happens

That distinction is important.

Full authority means:

The user trusts CEWP to run the local workflow.

It does not mean:

The AI can do anything with the repository.
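The distinction can be modeled as a small lookup table. This is a sketch under explicit assumptions, not CEWP's policy engine: the action labels are chosen here, the `safe` entry assumes that every action listed above counts as high-impact, and the `trusted` entry is a pure guess because this post does not describe what that mode permits.

```python
# Local workflow actions the post lists for full-authority mode.
# "prune-delete" is a label chosen here for "prune deletion".
LOCAL_ACTIONS = frozenset(
    {"workers", "reviewer", "pipeline", "finalize", "cleanup", "prune-delete"}
)

# Never automatic in any mode, according to the post.
NEVER_AUTOMATIC = frozenset({"push", "publish", "release"})

ALLOWED_BY_MODE = {
    "safe": frozenset(),  # assumption: all listed actions are blocked
    "trusted": frozenset({"workers", "reviewer", "pipeline"}),  # hypothetical subset
    "full-authority": LOCAL_ACTIONS,
}


def is_allowed(action: str, mode: str = "safe") -> bool:
    """Return True if the action may run without a pause in the given mode."""
    if action in NEVER_AUTOMATIC:
        return False  # holds even in full-authority mode
    if mode not in ALLOWED_BY_MODE:
        raise ValueError(f"unknown policy mode: {mode!r}")
    return action in ALLOWED_BY_MODE[mode]


for mode in ("safe", "full-authority"):
    print(mode, is_allowed("finalize", mode), is_allowed("push", mode))
```

Note that this table only decides whether an action may start. Scope checks, worktree isolation, and the reviewer gate are separate checks that still run afterwards in every mode.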

What I hardened in v0.2.0-beta.1

The 0.2.0-beta.1 release was mostly about hardening.

Some of the main improvements were:

runtime policy enforcement
required allowedFiles for real worker execution
better parallel scope overlap detection
external/absolute targetWorktree path blocking
safer cleanup behavior
stronger harness tests
npm scripts for validation
clearer docs and release notes
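As an illustration of the external/absolute path blocking item, a target path check might look like the sketch below. It assumes that a safe target is a relative path that stays inside the repository root after `..` segments and symlinks are resolved. The function name is hypothetical and CEWP's actual rules may be stricter or different.

```python
import os
from pathlib import Path


def is_safe_target_worktree(repo_root: str, target: str) -> bool:
    """Accept only relative targets that resolve to a location inside repo_root."""
    if os.path.isabs(target):
        return False  # absolute paths are rejected outright
    root = Path(repo_root).resolve()
    resolved = (root / target).resolve()
    # The target must be strictly below the root, not the root itself
    # and not a sibling reached through "..".
    return root in resolved.parents


for candidate in (".cewp/worktrees/worker-a", "../elsewhere", "/tmp/worktree"):
    print(candidate, is_safe_target_worktree(".", candidate))
```

Resolving before comparing matters: a string check on the raw path would accept something like `docs/../../elsewhere`, which escapes the repository.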

The package now has scripts like:

npm test
npm run smoke
npm run check
npm run pack:dry-run

The harness tests cover things like:

policy gates
worktree creation
committed diff visibility
outside-allowed file detection
parallel overlap detection
target worktree policy
package surface checks

What the workflow looks like

A simplified CEWP flow looks like this:

run init
  ↓
worktrees create
  ↓
dispatch plan
  ↓
dispatch check
  ↓
dispatch prompts
  ↓
workers execute
  ↓
collect
  ↓
reviewer executes
  ↓
finalize

The important part is not the command list.

The important part is the model:

plan -> isolate -> execute -> collect -> review -> finalize

That is the workflow I want AI coding tools to follow.
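The model above is essentially a strict ordering of stages. A minimal sketch of enforcing that order follows. The `Run` class and its behavior are assumptions for illustration and do not reflect how CEWP tracks run state.

```python
STAGES = ("plan", "isolate", "execute", "collect", "review", "finalize")


class Run:
    """Tracks which stages have completed and refuses to skip ahead."""

    def __init__(self) -> None:
        self.done: list[str] = []

    def advance(self, stage: str) -> None:
        if len(self.done) == len(STAGES):
            raise RuntimeError("run is already finalized")
        expected = STAGES[len(self.done)]
        if stage != expected:
            raise RuntimeError(f"cannot run {stage!r} before {expected!r}")
        self.done.append(stage)


run = Run()
for stage in ("plan", "isolate", "execute", "collect"):
    run.advance(stage)

try:
    run.advance("finalize")  # skipping review is rejected
except RuntimeError as error:
    print(error)
```

Encoding the order in one place makes it hard for a script or an agent to jump straight from execution to finalize.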

The longer-term vision

Right now CEWP is Codex-focused.

But I do not want the idea to stay limited to one model or one tool.

The long-term vision is an adapter-based system where different models can take different roles:

Codex as manager
Claude as reviewer
Gemini as worker
OpenCode or API-based models as workers
manual adapters for human-in-the-loop steps

The goal is not to blindly run many models.

The goal is to create a role-based workflow where each model can be used where it makes sense.

For example:

manager -> plans the tasks
worker-a -> implements one scope
worker-b -> implements another scope
reviewer -> checks the result
user -> approves critical boundaries

This also helps with limits and cost. If everything depends on one model, usage limits become a bottleneck. A future adapter system could make CEWP more flexible.
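Since the adapter system does not exist yet, the following is purely a thought experiment about what a model-independent contract could look like. Every name in it (`TaskRequest`, `TaskResult`, `Adapter`, `FakeAdapter`, `dispatch`) is hypothetical. The fake adapter echoes its input instead of calling a model, which is one way to get deterministic tests.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class TaskRequest:
    role: str                      # e.g. "manager", "worker-a", "reviewer"
    prompt: str
    allowed_files: tuple[str, ...]


@dataclass(frozen=True)
class TaskResult:
    adapter: str
    report: str


class Adapter(Protocol):
    """Anything that can take a task and return a report can fill a role."""

    name: str

    def run(self, request: TaskRequest) -> TaskResult: ...


class FakeAdapter:
    """Deterministic stand-in: echoes the request instead of calling a model."""

    name = "fake"

    def run(self, request: TaskRequest) -> TaskResult:
        return TaskResult(adapter=self.name, report=f"[{request.role}] {request.prompt}")


def dispatch(roles: dict[str, Adapter], request: TaskRequest) -> TaskResult:
    """Route a task to whichever adapter is assigned to its role."""
    return roles[request.role].run(request)


roles = {"worker-a": FakeAdapter(), "reviewer": FakeAdapter()}
result = dispatch(roles, TaskRequest("worker-a", "update install docs", ("docs/install.md",)))
print(result.report)
```

With a contract this narrow, swapping the model behind a role would mean changing one entry in the `roles` mapping, while the scope and review checks stay outside the adapter.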

What I learned

A few things became clear while building this:

1. AI coding needs boundaries

The more power we give to coding agents, the more important scope boundaries become.

allowedFiles, forbiddenFiles, worktrees, and reviewer gates are not optional details. They are the difference between a useful workflow and a risky one.

2. Parallelism is only useful with isolation

Running two agents at the same time sounds impressive, but it is only useful if their work is isolated.

Otherwise, it just creates confusion faster.

3. Local-first workflows are easier to audit

When the workflow writes reports, events, prompts, and review packets locally, the user can inspect what happened.

This is much better than relying only on chat history.

4. “Full authority” should not mean “no rules”

Some users really do want to give AI more permission.

That is valid.

But the system should still keep hard safety rules. CEWP’s full-authority mode is designed around that idea.

Current status

CEWP is still in beta.

The current version is:

0.2.0-beta.1

Published on npm:

npm install @setrathex/codex-engineering-workflow-pack

GitHub:

https://github.com/SetraTheXX/Codex-Engineering-Workflow-Pack

npm:

https://www.npmjs.com/package/@setrathex/codex-engineering-workflow-pack

What comes next

The next big direction is v0.3.

The main things I want to explore are:

adapter registry
fake adapter for deterministic tests
package install smoke tests
CI
better operator docs
commandless usage patterns
future Gemini / Claude / OpenCode adapter experiments

The bigger goal is still the same:

Make AI-assisted development more structured, auditable, and safe enough to use on real projects.

Feedback

I am especially interested in feedback from people who use:

Codex
Claude Code
Cursor
GitHub Copilot
Gemini
OpenCode
other AI coding tools

Questions I am thinking about:

How would you design file scope rules for AI workers?
Should full-authority mode ever include commit/push/publish?
What should a model-independent adapter contract look like?
How should multi-agent coding workflows be reviewed?
What would make this easier to try in a real repo?

If this topic interests you, I would love feedback on the project.

GitHub: https://github.com/SetraTheXX/Codex-Engineering-Workflow-Pack

DEV Community
