What I Built
Copilot Guardian -- a deterministic safety layer for GitHub Copilot that turns CI failures into auditable diagnosis, risk-stratified patches, and fail-closed quality verdicts.
When CI breaks, most developers stare at logs and guess. Most AI tools give you one answer and hope it's right. Guardian takes a different approach:
Multi-hypothesis reasoning -- Copilot generates 3 competing root-cause theories with confidence scores and evidence. You see why it picked one, not just that it did.
3 patch strategies -- Conservative, Balanced, Aggressive. Different risk profiles for different situations. You choose.
Deterministic quality guard -- An independent guardrail layer checks every patch for scope violations, bypass anti-patterns (`continue-on-error`, `--insecure`, `NODE_TLS_REJECT_UNAUTHORIZED=0`), and slop signals. Enforces `NO_GO` when safety conditions are violated.
Forced abstain policy -- For non-patchable failure classes (401/403 auth, rate limits, infrastructure), Guardian emits `NOT_PATCHABLE` and refuses to generate unsafe patches.
Full artifact trail -- `analysis.json`, `patch_options.json`, `quality_review.*.json`, raw model traces. Every word Copilot said is auditable.
This is not a one-shot demo. It's an engineering system designed for real CI workflows.
Repository: github.com/flamehaven01/copilot-guardian
Judge Quick Test (90 seconds)
Prerequisites (10 seconds):
- `gh auth status` succeeds
- Copilot access is enabled for your account/session
# Fastest path (no install)
npx copilot-guardian@latest run \
--repo flamehaven01/copilot-guardian \
--last-failed \
--show-options \
--fast \
--max-log-chars 20000
Expected outputs: analysis.json, patch_options.json, quality_review.*.json
Demo
Runtime: 3m43s | Profile: --fast --max-log-chars 20000
Quick Repro Paths
Judge Quick Test (`npx ... run`) is already shown above.
Use the paths below when you prefer install/build workflows.
# 2) Global install path
npm install -g copilot-guardian@latest
copilot-guardian run \
--repo flamehaven01/copilot-guardian \
--last-failed \
--show-options \
--fast \
--max-log-chars 20000
# 3) Source path (clone + build)
gh repo clone flamehaven01/copilot-guardian
cd copilot-guardian
npm install
npm run build
copilot-guardian run \
--repo flamehaven01/copilot-guardian \
--last-failed \
--show-options \
--fast \
--max-log-chars 20000
Multi-Hypothesis Diagnosis:
Three competing theories with confidence scores, evidence, and disconfirming signals. Guardian selects the strongest hypothesis but preserves the full reasoning trace.
Patch Spectrum with Quality Verdicts:
Three strategies at different risk levels. Each gets an independent quality review. The deterministic guard can override the model verdict and force NO_GO when it detects bypass patterns or scope violations.
Output Artifacts
.copilot-guardian/
analysis.json # Multi-hypothesis diagnosis
reasoning_trace.json # Full hypothesis audit trail
patch_options.json # 3 strategies + verdicts
fix.*.patch # Generated strategy patches
quality_review.*.json # Per-strategy quality results
copilot.*.raw.txt # Raw model responses
abstain.report.json # Forced abstain (if triggered)
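If you want to script a post-run check against this directory, a minimal sketch is below. The helper name, the artifact subset, and the default directory are assumptions for illustration, not part of Guardian's API:

```typescript
import * as fs from "fs";
import * as path from "path";

// Core artifacts that every successful run should write (per the tree above).
const required = ["analysis.json", "reasoning_trace.json", "patch_options.json"];

// Returns the names of any expected artifacts missing from the output dir.
function missingArtifacts(dir: string = ".copilot-guardian"): string[] {
  return required.filter((name) => !fs.existsSync(path.join(dir, name)));
}
```

An empty result means the core trail is present; anything else tells you exactly which artifact a run failed to produce.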
Challenge Rubric Mapping
Use of Copilot CLI -- Guardian is operated from terminal flows; the project demonstrates reproducible `copilot-guardian` and `gh`-based workflows for CI failure recovery.
Usability / UX -- The 90-second Judge Quick Test plus explicit expected outputs make validation fast and deterministic.
Originality -- Multi-hypothesis diagnosis + risk-stratified patches + deterministic fail-closed guard + forced abstain policy.
Receipts: Structured + Fail-Closed Evidence
Representative analysis.json excerpt (redacted):
{
"selected_hypothesis": {
"id": "h1",
"summary": "Missing API_URL in workflow environment",
"confidence": 0.89,
"evidence": [
"CI log contains: API_URL is not defined",
"Failure reproduces in Actions context only"
],
"disconfirming_signals": [
"No Node version mismatch in failing run"
]
}
}
Representative quality_review.aggressive.json excerpt (redacted):
{
"strategy": "aggressive",
"verdict": "NO_GO",
"deterministic_flags": [
"bypass_pattern: continue-on-error: true"
],
"slop_score": 0.73,
"reason": "Safety policy violation detected by deterministic guard"
}
My Experience with GitHub Copilot CLI and SDK
Most people use Copilot to write code faster. I used it to build a reasoning engine.
Copilot Guardian is a terminal CLI tool. By default it calls Copilot through `@github/copilot-sdk`; optional `gh copilot` flows are provided for reproducible, terminal-first local operation.
Pattern 1: Multi-Hypothesis Prompting
Instead of asking "what's wrong?", I structured the prompt to force multiple competing explanations:
# prompts/analysis.v2.txt (excerpt)
You must explore multiple hypotheses before selecting
the most likely root cause.
Produce exactly 3 hypotheses in descending confidence order.
Each hypothesis must include: evidence, disconfirming signals,
and a next_check action.
This eliminates confirmation bias. Copilot can't jump to conclusions -- it has to show competing theories with evidence for and against each one. The result is a structured JSON object validated against a schema, not free-form text.
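The shape of that validated object can be sketched from the analysis.json excerpt shown later in this post. The interface and the hand-rolled checks below are illustrative; Guardian's actual schema validation may differ:

```typescript
// Field names mirror the analysis.json excerpt; validation logic is a sketch.
interface Hypothesis {
  id: string;
  summary: string;
  confidence: number; // 0..1
  evidence: string[];
  disconfirming_signals: string[];
}

function isHypothesis(x: any): x is Hypothesis {
  return (
    typeof x?.id === "string" &&
    typeof x?.summary === "string" &&
    typeof x?.confidence === "number" &&
    x.confidence >= 0 && x.confidence <= 1 &&
    Array.isArray(x?.evidence) &&
    Array.isArray(x?.disconfirming_signals)
  );
}

// Fail closed: if the batch is not exactly 3 well-formed hypotheses, reject it.
function selectStrongest(hypotheses: unknown[]): Hypothesis | null {
  if (hypotheses.length !== 3 || !hypotheses.every(isHypothesis)) return null;
  return (hypotheses as Hypothesis[])
    .slice()
    .sort((a, b) => b.confidence - a.confidence)[0];
}
```

Rejecting the whole batch on a single malformed hypothesis is the same fail-closed posture the quality guard applies to patches.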
Pattern 2: Risk-Stratified Generation
A single "fix" is never enough in production CI. I prompt Copilot to generate three strategies at once:
# prompts/patch.options.v1.txt (excerpt)
Generate THREE alternative patch strategies:
1) conservative: minimal, safest change
2) balanced: standard best practice fix
3) aggressive: broader change (often over-engineered)
SAFETY CONSTRAINTS:
- Only touch files in allowed_files.
- Do NOT weaken security (no disabling SSL,
no continue-on-error, no force installs).
This gives developers actual choice. A production hotfix needs Conservative. A planned refactor might pick Balanced. The Aggressive option often gets flagged -- which is itself useful data.
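In code, offering that choice reduces to presenting only the strategies the guard approved. A minimal sketch, with field names mirroring the quality_review excerpts below (the helper itself is illustrative, not Guardian's code):

```typescript
// Each generated strategy carries the verdict from its quality review.
type Strategy = "conservative" | "balanced" | "aggressive";

interface PatchOption {
  strategy: Strategy;
  verdict: "GO" | "NO_GO";
}

// Only GO-verdict strategies are offered to the developer.
function selectable(options: PatchOption[]): Strategy[] {
  return options.filter((o) => o.verdict === "GO").map((o) => o.strategy);
}
```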
Pattern 3: AI Auditing AI (Anti-Slop)
After patch generation, I send each strategy back through Copilot with a quality audit prompt:
# prompts/quality.v1.txt (excerpt)
ANTI-SLOP CHECKS (Critical):
- Detect placeholder code (TODO, FIXME)
- Detect over-abstraction (unnecessary layers)
- Detect complexity explosion (>3x LOC for minimal fix)
- Detect deprecated / suspicious Actions usage
If any anti-slop signals detected,
MUST set verdict to NO_GO and include slop_score.
But I don't trust the model alone. A deterministic quality guard runs before the model review, checking for 15+ bypass anti-patterns:
// src/engine/patch_options.ts
// deterministicQualityReview() - hard-coded bypass detection
const bypassPatterns: RegExp[] = [
/continue-on-error:\s*true/,
/NODE_TLS_REJECT_UNAUTHORIZED\s*=\s*['"]?0/,
/GIT_SSL_NO_VERIFY/,
/curl\s+(?:-k|--insecure)/,
/npm\s+--insecure|strict-ssl\s+false/,
/\|\|\s*true|set\s+\+e/
];
If the deterministic guard says NO_GO, the model verdict is overridden. Fail-closed, always.
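The override rule itself is a few lines. This sketch redeclares two of the patterns above for self-containment; names are illustrative, and the real implementation lives in `src/engine/patch_options.ts`:

```typescript
type Verdict = "GO" | "NO_GO";

// Two representative bypass patterns (subset of the full list above).
const bypass: RegExp[] = [
  /continue-on-error:\s*true/,
  /curl\s+(?:-k|--insecure)/,
];

function deterministicVerdict(patchText: string): Verdict {
  return bypass.some((re) => re.test(patchText)) ? "NO_GO" : "GO";
}

// Fail closed: the guard can veto the model's verdict, never the reverse.
function finalVerdict(patchText: string, modelVerdict: Verdict): Verdict {
  return deterministicVerdict(patchText) === "NO_GO" ? "NO_GO" : modelVerdict;
}
```

Note the asymmetry: a deterministic NO_GO always wins, but a deterministic GO cannot rescue a patch the model review rejected.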
Pattern 4: Transparent Artifact Trail
Every Copilot interaction is persisted as raw text:
copilot.analysis.raw.txt # Exact model response
copilot.patch.options.raw.txt # Patch generation response
copilot.quality.*.raw.txt # Quality review responses
reasoning_trace.json # Complete audit trail
You can diff what Copilot said versus what Guardian decided. No black boxes.
Honest Take: What Worked, What Didn't
I use Claude Code, Codex CLI, Gemini CLI, and Copilot CLI in parallel across different projects. So this isn't my first AI CLI tool -- and I'm not going to pretend Copilot CLI was flawless.
What genuinely worked well:
GitHub-native context. Copilot CLI understands repos, issues, PRs, and Actions logs without extra configuration. For a project that lives entirely in GitHub, this was a real advantage over general-purpose AI CLIs that need manual context feeding.
Structured output compliance. Once I locked the prompts to strict JSON-only constraints, Copilot reliably produced schema-valid responses. The structured reasoning quality -- especially disconfirming evidence in hypothesis generation -- was better than I expected.
Terminal-first workflow. No editor, no browser, no context switching. For CI debugging specifically, staying in the terminal felt natural and fast.
What was frustrating:
Session drops on long prompts. This was the most persistent issue. When the input context grew large (deep log analysis + source files + MCP context), the session would disconnect mid-generation. I had to implement retry logic with exponential backoff and a `--max-log-chars` cap specifically to work around this. It happened often enough that it shaped the architecture -- the `--fast` mode exists partly because shorter prompts are more stable.
SDK maturity. The `@github/copilot-sdk` is still early. Error messages are sometimes opaque, and the documentation was thin when I started. I spent real time reverse-engineering behavior that should have been documented.
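The retry workaround is conceptually simple. A minimal sketch, assuming the session call is an async function; the attempt count and delays are illustrative, not Guardian's exact values:

```typescript
// Retry a flaky async call with exponential backoff between attempts.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts: number = 3,
  baseDelayMs: number = 500
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Delays grow as baseDelayMs * 2^i: 500ms, 1000ms, 2000ms, ...
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** i));
    }
  }
  throw lastError; // all attempts exhausted: fail closed
}
```

Combined with the `--max-log-chars` cap, this turned session drops from a blocker into an occasional retry.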
Compared to alternatives:
Honestly, once the retry and timeout handling was solid, the actual reasoning quality was competitive. The GitHub integration advantage is real -- other CLIs can't natively pull Actions logs, workflow context, and repo metadata the way Copilot does. For this specific use case (CI failure diagnosis on GitHub repos), nothing else fit as naturally.
The session stability issue is the main thing holding it back. Fix that, and Copilot CLI becomes a genuinely strong tool for GitHub-centric automation.
What I Learned
Raw AI output is not enough. Copilot produces good reasoning, but CI automation requires schema validation and deterministic safety checks on top of it.
Fail-closed beats fail-open. Malformed responses, bypass patterns, and scope creep must be blocked by default. The deterministic guard caught issues the model review missed.
Multi-hypothesis prompting produces better reasoning. Forcing 3 theories with disconfirming evidence significantly improved diagnosis quality compared to single-answer prompts.
Build around the tool's limits, not against them. The session drop issue could have been a blocker. Instead, it pushed me toward better architecture: shorter prompts, explicit timeouts, retry logic, and a fast mode. The constraints made the tool more robust.
Net Impact
GitHub Copilot accelerated both implementation and iteration. But the biggest gain came from combining Copilot with strict guardrails and explicit runtime policies. This turned "AI-generated suggestions" into a controllable CI engineering workflow -- and the honest friction along the way made the result more production-ready than a smooth ride would have.
Built by Flamehaven (Yun) -- Trust is built on receipts, not magic.



