WonderLab

Workflow Series (10): Enterprise Architecture — Registry, Composition, and Governance

From One Workflow to a System

One workflow is easy to manage. At five, problems appear:

  • Users don't know a workflow exists, or can't find the right one
  • Two workflows overlap 70% in functionality; maintenance doubles
  • Someone modifies a shared workflow and breaks every dependent
  • A production incident leaves no trace of which workflow version caused it

A Workflow Registry solves discovery. Workflow composition solves duplication. Governance mechanisms handle modification rights and accountability.


Workflow Registry

A Skill Registry manages Skill discovery. A Workflow Registry manages workflow discovery.

# workflow-registry.yaml
workflows:
  - id: wf-bug-e2e
    name: End-to-End Bug Fix
    description: "Triggered by Jira ticket: auto-analyzes root cause, generates fix code, submits for review"
    version: "1.3.0"
    trigger_keywords:
      - fix bug
      - auto fix
      - process bug
    domain: engineering
    owner: "@chendongqi"
    status: active
    metrics:
      monthly_runs: 45
      e2e_success_rate: 0.82
      avg_fix_rounds: 1.3
      avg_cost_usd: 0.51

  - id: wf-sprint-planning
    name: Weekly Sprint Planning
    description: "Runs Monday morning: summarizes P0/P1 Bug status, generates weekly work plan"
    version: "1.0.0"
    trigger_keywords:
      - weekly plan
      - sprint planning
    domain: management
    owner: "@team-lead"
    status: active

  - id: wf-code-review-assist
    name: Code Review Assistant
    description: "Analyzes code quality after Gerrit change submission, generates review comments"
    version: "0.5.0"
    status: experimental

Three Uses for the Registry

Use 1: Workflow discovery

User input: "fix this bug." The system matches trigger_keywords, finds wf-bug-e2e, and routes automatically. No list to browse.
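
A minimal Python sketch of this routing step. The example input ("fix this bug.") does not contain the keyword "fix bug" as a literal substring, so the sketch assumes word-level matching: a keyword matches when all of its words appear in the input. The `route` function, the matching rule, and the in-memory `REGISTRY` list are assumptions for illustration, not the author's implementation.

```python
import re
from typing import Optional

# Hypothetical in-memory copy of two entries from workflow-registry.yaml
REGISTRY = [
    {"id": "wf-bug-e2e", "status": "active",
     "trigger_keywords": ["fix bug", "auto fix", "process bug"]},
    {"id": "wf-sprint-planning", "status": "active",
     "trigger_keywords": ["weekly plan", "sprint planning"]},
]

def _contains_all_words(text: str, keyword: str) -> bool:
    """True if every word of the keyword occurs somewhere in the text."""
    words = set(re.findall(r"\w+", text.lower()))
    return all(w in words for w in keyword.lower().split())

def route(user_input: str, registry: list) -> Optional[str]:
    """Return the id of the first active workflow with a matching keyword."""
    for wf in registry:
        if wf.get("status") != "active":
            continue  # skip experimental or retired workflows
        if any(_contains_all_words(user_input, kw)
               for kw in wf.get("trigger_keywords", [])):
            return wf["id"]
    return None

print(route("fix this bug.", REGISTRY))
```

Returning the first match keeps the sketch short; a real router would also need a rule for inputs that match several workflows.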

Use 2: Health monitoring

Keep the metrics field updated and build a cross-workflow health dashboard:

Workflow                  Runs/mo   Success   Avg cost
──────────────────────────────────────────────────────
wf-bug-e2e                45        82%       $0.51
wf-sprint-planning        4         100%      $0.08
wf-code-review-assist     12        67%       $0.23  ← needs attention
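
The flagging step of such a dashboard can be sketched in a few lines of Python. The 75% threshold is an assumption chosen so that the row marked above is the one flagged; the article does not state a cut-off. The function names are likewise hypothetical.

```python
# Rows mirror the dashboard above: (id, runs per month, success rate, avg cost USD)
METRICS = [
    ("wf-bug-e2e", 45, 0.82, 0.51),
    ("wf-sprint-planning", 4, 1.00, 0.08),
    ("wf-code-review-assist", 12, 0.67, 0.23),
]

ATTENTION_THRESHOLD = 0.75  # assumed cut-off, not taken from the article

def needs_attention(metrics, threshold=ATTENTION_THRESHOLD):
    """Return ids of workflows whose success rate is below the threshold."""
    return [wf_id for wf_id, _runs, success, _cost in metrics
            if success < threshold]

def render(metrics, threshold=ATTENTION_THRESHOLD):
    """Render a plain-text dashboard, marking rows below the threshold."""
    lines = [f"{'Workflow':<26}{'Runs/mo':<10}{'Success':<10}Avg cost"]
    for wf_id, runs, success, cost in metrics:
        flag = "  <- needs attention" if success < threshold else ""
        lines.append(f"{wf_id:<26}{runs:<10}{success:<10.0%}${cost:.2f}{flag}")
    return "\n".join(lines)

print(render(METRICS))
```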

Use 3: Version dependency tracking

When wf-sprint-planning calls wf-bug-e2e (see next section), the Registry records that dependency. Before wf-bug-e2e releases a MAJOR version, the system checks which workflows depend on it and notifies affected owners.
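
A sketch of that pre-release check, assuming the Registry stores dependencies in a `depends_on` list on each entry. That field name, like the helper functions, is hypothetical; the article does not show how the dependency is recorded.

```python
# Hypothetical registry entries; `depends_on` is an assumed field name
REGISTRY = [
    {"id": "wf-bug-e2e", "version": "1.3.0", "owner": "@chendongqi"},
    {"id": "wf-sprint-planning", "version": "1.0.0", "owner": "@team-lead",
     "depends_on": ["wf-bug-e2e"]},
]

def is_major_bump(old: str, new: str) -> bool:
    """Compare the MAJOR component of two MAJOR.MINOR.PATCH strings."""
    return int(new.split(".")[0]) > int(old.split(".")[0])

def owners_to_notify(registry: list, workflow_id: str, new_version: str) -> list:
    """Owners of workflows that depend on `workflow_id`, for MAJOR bumps only."""
    current = next(wf for wf in registry if wf["id"] == workflow_id)
    if not is_major_bump(current["version"], new_version):
        return []
    return sorted({wf["owner"] for wf in registry
                   if workflow_id in wf.get("depends_on", []) and "owner" in wf})

print(owners_to_notify(REGISTRY, "wf-bug-e2e", "2.0.0"))  # ['@team-lead']
```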


Workflow Composition

Workflow composition means one workflow calls another as a pipeline stage.

Scenario: wf-sprint-planning runs every Monday. For each P0 Bug found, it triggers wf-bug-e2e, waits for all results, then generates the weekly plan report.
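
As a rough picture of this scenario, the fan-out and wait could look like the following Python sketch. `run_child` is a hypothetical stand-in for launching wf-bug-e2e on one ticket, the thread pool is just one possible execution model, and "AE-1" is a made-up second ticket key. Children that miss the timeout are reported separately, so the parent can continue with whatever finished.

```python
from concurrent.futures import ThreadPoolExecutor, wait

def run_child(jira_key: str) -> dict:
    """Hypothetical stand-in for running wf-bug-e2e on one ticket."""
    return {"jira_key": jira_key, "passed": True}

def fan_out(jira_keys, child=run_child, timeout_s=4 * 3600):
    """Run one child per key; return (results, keys that timed out)."""
    pool = ThreadPoolExecutor(max_workers=4)
    futures = {pool.submit(child, key): key for key in jira_keys}
    done, not_done = wait(futures, timeout=timeout_s)

    results = {}
    for fut in done:
        err = fut.exception()
        # A failed child is recorded, not allowed to crash the parent
        results[futures[fut]] = {"error": repr(err)} if err else fut.result()
    timed_out = [futures[fut] for fut in not_done]

    # Drop queued work; children already running are not interrupted (Python 3.9+)
    pool.shutdown(wait=False, cancel_futures=True)
    return results, timed_out

results, timed_out = fan_out(["AE-33995", "AE-1"])
print(results, timed_out)
```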

# wf-sprint-planning/workflow.md

## Phase 2: Fix P0 Bugs
For each P0 Bug from Phase 1, trigger wf-bug-e2e:

phase_2_fix_p0:
  type: workflow_fanout
  child_workflow: wf-bug-e2e
  inputs:
    - source: phases.phase1.p0_bugs
      map_to: jira_key
  wait_strategy: collect-all
  timeout: 4h
  on_timeout: continue_with_available

## Phase 3: Generate Weekly Plan
Using Phase 2 results (fix status for each Bug), generate the weekly plan report.

Child workflow interface design:

Like subagents, child workflows need declared input and output contracts:

# wf-bug-e2e interface declaration (in SKILL.md)
interface:
  inputs:
    - name: jira_key
      type: string
      required: true
  outputs:
    - path: workflow_state.json
      fields:
        - fix_result.passed
        - fix_result.commit_sha
        - fix_result.summary
        - total_cost_usd

The parent workflow reads only declared output fields — not the child's internal state files.
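
One way to enforce that rule is to have the parent extract only the declared dotted paths from the child's workflow_state.json and ignore everything else. The sketch below assumes the state file is plain JSON; the `read_declared_outputs` helper, the sample values, and the `internal_scratch` key are hypothetical.

```python
import json

# Declared output fields, copied from the interface declaration above
DECLARED_FIELDS = [
    "fix_result.passed",
    "fix_result.commit_sha",
    "fix_result.summary",
    "total_cost_usd",
]

def read_declared_outputs(state: dict, declared: list) -> dict:
    """Return only the declared dotted-path fields; fail if one is missing."""
    result = {}
    for dotted in declared:
        node = state
        for key in dotted.split("."):
            if not isinstance(node, dict) or key not in node:
                raise KeyError(f"child did not produce declared output: {dotted}")
            node = node[key]
        result[dotted] = node
    return result

# Hypothetical child state; `internal_scratch` stands in for undeclared internals
child_state = json.loads("""{
  "fix_result": {"passed": true, "commit_sha": "abc123", "summary": "null check added"},
  "total_cost_usd": 0.51,
  "internal_scratch": {"attempts": 2}
}""")

outputs = read_declared_outputs(child_state, DECLARED_FIELDS)
print(outputs)
```

Failing loudly on a missing field surfaces a broken contract at the call site instead of later in the parent's report.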


Cross-Tool Portability

The same workflow definition can run in an OpenClaw environment that calls Claude Code tools and in another environment with different tool implementations. Only the tool bindings change; the workflow definition does not.

# config.yaml tool_bindings
tool_bindings:
  read_file:
    openclaw: "claude_code_read"
    generic: "python:open"

  create_cron:
    openclaw: "claude_code_cron"
    generic: "python:crontab"

  send_notification:
    openclaw: "lark-im"
    generic: "slack-webhook"

# workflow.md declares capabilities, not tool names
Phase 6 Step 6.1: Create polling cron job (capability: create_cron)
Phase 7 Step 7.2: Send closing notification (capability: send_notification)

import yaml  # PyYAML

config = yaml.safe_load(open("config.yaml"))  # contains tool_bindings

def resolve_tool(capability: str, runtime: str) -> str:
    return config["tool_bindings"][capability][runtime]

current_runtime = "openclaw"  # or "generic"
tool = resolve_tool("create_cron", current_runtime)  # "claude_code_cron"

Switching deployment environments requires only a config.yaml change — workflow logic stays identical.
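
Because a missing binding would only surface when the affected phase runs, it can help to check all bindings before starting. The following sketch assumes capabilities are written in workflow.md as `(capability: name)`, as in the excerpt above; the `missing_bindings` helper and the "ci-runner" runtime name are hypothetical.

```python
import re

# tool_bindings as they would look after loading config.yaml
TOOL_BINDINGS = {
    "read_file": {"openclaw": "claude_code_read", "generic": "python:open"},
    "create_cron": {"openclaw": "claude_code_cron", "generic": "python:crontab"},
    "send_notification": {"openclaw": "lark-im", "generic": "slack-webhook"},
}

WORKFLOW_MD = """\
Phase 6 Step 6.1: Create polling cron job (capability: create_cron)
Phase 7 Step 7.2: Send closing notification (capability: send_notification)
"""

def missing_bindings(workflow_text: str, bindings: dict, runtime: str) -> list:
    """List capabilities the workflow uses that have no binding for `runtime`."""
    used = re.findall(r"\(capability:\s*([\w-]+)\)", workflow_text)
    return sorted({cap for cap in used if runtime not in bindings.get(cap, {})})

print(missing_bindings(WORKFLOW_MD, TOOL_BINDINGS, "generic"))    # []
print(missing_bindings(WORKFLOW_MD, TOOL_BINDINGS, "ci-runner"))  # both capabilities
```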


Three Governance Questions

Question 1: Who can modify a workflow?

Personal workflow (single owner):
  → Free to modify, no review required

Team-shared workflow:
  → PATCH/MINOR: 1 reviewer + CI passing before merge
  → MAJOR: 2 reviewers + Gate 3 (end-to-end regression)

Production-critical workflow (directly affects business):
  → Any change requires owner + business stakeholder review
  → Production workflow files are read-only; updates go through a release process

Implementation: Git CODEOWNERS declares review permissions:

# .github/CODEOWNERS
skills/wf-bug-e2e/**          @chendongqi @team-lead
skills/wf-sprint-planning/**  @team-lead

Question 2: What did the workflow do, and who is accountable?

Write audit.json after each workflow completes (covered in W6), adding accountability chain fields:

{
  "workflow_id": "wf-bug-e2e-AE-33995-20260601",
  "workflow_version": "1.3.0",
  "triggered_by": "user:chendongqi",
  "triggered_at": "2026-06-01T10:00:00+08:00",
  "outcome": "success",
  "external_writes": [
    {"action": "git_push", "target": "gerrit/android-project", "phase": 5},
    {"action": "jira_comment", "target": "AE-33995", "phase": 7}
  ],
  "human_approvals": [
    {"gate": "gate_B", "approver": "user:zhang3", "at": "2026-06-01T11:30:00+08:00"}
  ],
  "cost_usd": 0.51
}

Accountability chain: triggerer → workflow version → phase → human approver. Every external write traces back to a specific person and workflow version.
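
To illustrate how the chain can be read back from an audit record, the sketch below expands each external write into a flat row carrying the triggerer, workflow version, phase, and approvers. The record is the example above; the `trace_external_writes` helper is hypothetical. Since approvals are recorded per gate and not per phase, the sketch attaches all approvers to every write.

```python
import json

AUDIT = json.loads("""{
  "workflow_id": "wf-bug-e2e-AE-33995-20260601",
  "workflow_version": "1.3.0",
  "triggered_by": "user:chendongqi",
  "triggered_at": "2026-06-01T10:00:00+08:00",
  "outcome": "success",
  "external_writes": [
    {"action": "git_push", "target": "gerrit/android-project", "phase": 5},
    {"action": "jira_comment", "target": "AE-33995", "phase": 7}
  ],
  "human_approvals": [
    {"gate": "gate_B", "approver": "user:zhang3", "at": "2026-06-01T11:30:00+08:00"}
  ],
  "cost_usd": 0.51
}""")

def trace_external_writes(audit: dict) -> list:
    """One row per external write, joined with who and what stood behind it."""
    approvers = [a["approver"] for a in audit.get("human_approvals", [])]
    return [
        {
            "action": write["action"],
            "target": write["target"],
            "phase": write["phase"],
            "triggered_by": audit["triggered_by"],
            "workflow_version": audit["workflow_version"],
            "approvers": approvers,
        }
        for write in audit.get("external_writes", [])
    ]

for row in trace_external_writes(AUDIT):
    print(row)
```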

Question 3: How to roll back quickly?

Version snapshot:

# Before releasing a new version, snapshot the old one
cp -r skills/wf-bug-e2e skills/wf-bug-e2e.v1.2.0.bak

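# On rollback: remove the current directory first; otherwise cp -r
# nests the snapshot inside the existing directory
rm -rf skills/wf-bug-e2e
cp -r skills/wf-bug-e2e.v1.2.0.bak skills/wf-bug-e2e
# Reset the version string if the snapshot does not already say 1.2.0
sed -i 's/version: "1.3.0"/version: "1.2.0"/' skills/wf-bug-e2e/SKILL.md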

A more systematic approach is to use git tags:

# Tag on release
git tag wf-bug-e2e-v1.3.0

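# Rollback: restore the old tag's files into the working tree
# (files added after v1.2.0 are left in place and must be removed by hand)
git checkout wf-bug-e2e-v1.2.0 -- skills/wf-bug-e2e/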

Handling in-flight instances:

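After a rollback, instances that are still running (phase != done) were started on the version that was just rolled back. Notify their owners so they can decide whether to let each instance finish or restart it on the previous version: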

import json
from pathlib import Path

def handle_in_flight_on_rollback(state_dir: Path, rolled_back_version: str, previous_version: str) -> None:
    # Scan every persisted state file for unfinished runs of the rolled-back version
    for state_file in state_dir.glob("**/workflow_state.json"):
        state = json.loads(state_file.read_text())
        if (state.get("workflow_version") == rolled_back_version
                and state.get("phase") != "done"):
            # notify_owner is the project's own notification hook (not defined here)
            notify_owner(
                f"In-flight workflow {state['workflow_id']} used rolled-back "
                f"version {rolled_back_version}. Options: "
                f"1) Let it complete  2) Abort and restart with {previous_version}"
            )

Implementation Roadmap

Level 1 — Individual (do now):
  □ Create workflow-registry.yaml listing all existing workflows
  □ Fill in domain, owner, status, trigger_keywords for each

Level 2 — Team (1-2 weeks):
  □ git CODEOWNERS declares review permissions for workflow files
  □ CHANGELOG before every change (W7), with reasons documented
  □ Team-shared MAJOR changes require two reviewers

Level 3 — Enterprise (ongoing):
  □ Registry metrics updated continuously; cross-workflow health dashboard
  □ Critical workflows have complete audit logs
  □ Version tags + snapshots; any version rollback within 1 hour

Summary

  1. The Registry is the foundation for discovery and monitoring: without it, workflows accumulate as black boxes; with it, each workflow's status, cost, and success rate are visible, and trigger_keywords enable automatic routing
  2. Composition gives the system hierarchy: a parent workflow (sprint planning) calls child workflows (bug fix), forming two tiers; child workflows declare input/output contracts using exactly the same principles as subagent design
  3. Three governance questions, three concrete answers: modification rights through CODEOWNERS + review process; accountability through audit.json with triggerer, version, and approver; rollback through git tags with in-flight instance notification
