This is a submission for the GitHub Copilot CLI Challenge
What I Built
CI Guardian is implemented as a GitHub CLI extension (gh ci-guardian) and runs entirely from the terminal, integrating GitHub Actions logs with GitHub Copilot CLI for safe, human-in-the-loop remediation.
Instead of blindly applying AI-generated patches, CI Guardian analyzes real CI logs, summarizes the failure, and attempts a minimal fix only if it’s low-risk. If the fix is unclear or unsafe, it stops and leaves the decision to a human.
The tool can:
- Diagnose CI failures with structured root-cause analysis
- Attempt minimal, semantic fixes
- Automatically open PRs only when patches apply cleanly
- Refuse unsafe or low-confidence fixes and escalate to a human when necessary
I tested CI Guardian on both a small demo repo and a real fork of Flask, including scenarios with fork permissions, pull-request-only CI, and multiple workflows.
Demo
Repository:
https://github.com/sasubillis/gh-ci-guardian
The extension entrypoint maps directly to ci_guardian/cli.py, which handles run discovery, log extraction, Copilot prompting, patch validation, and PR creation.
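The run discovery and log extraction steps could look roughly like the sketch below. It is not taken from ci_guardian/cli.py. It assumes the `gh` CLI is installed and authenticated, and the helper names (`latest_failed_run_cmd`, `pick_run`, `failed_log_cmd`, `fetch_failed_log`) are hypothetical.

```python
import json
import subprocess


def latest_failed_run_cmd(branch=None, limit=20):
    """Build a `gh run list` command that returns recent failed runs as JSON."""
    cmd = [
        "gh", "run", "list",
        "--status", "failure",
        "--limit", str(limit),
        "--json", "databaseId,headBranch,workflowName,createdAt",
    ]
    # Treat "all" (as in `--branch all` above) as "do not filter by branch".
    if branch and branch != "all":
        cmd += ["--branch", branch]
    return cmd


def pick_run(runs_json):
    """Return the most recently created run from gh's JSON output, or None."""
    runs = json.loads(runs_json)
    if not runs:
        return None
    # createdAt is an ISO 8601 timestamp, so string comparison orders by time.
    return max(runs, key=lambda run: run["createdAt"])


def failed_log_cmd(run_id):
    """Build a `gh run view` command that prints only the failed steps' logs."""
    return ["gh", "run", "view", str(run_id), "--log-failed"]


def fetch_failed_log(branch=None):
    """Return the failed-step log of the newest failing run, or None if there is none."""
    listing = subprocess.run(
        latest_failed_run_cmd(branch), capture_output=True, text=True, check=True
    )
    run = pick_run(listing.stdout)
    if run is None:
        return None
    log = subprocess.run(
        failed_log_cmd(run["databaseId"]), capture_output=True, text=True, check=True
    )
    return log.stdout
```

Keeping the command builders separate from the `subprocess` calls makes the discovery logic testable without network access or a real repository.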
All screenshots below were captured against real repositories with real failing CI runs, including a fork of Flask to demonstrate behavior on a production-scale codebase.
Example usage:
# Diagnose the latest failing CI run
gh ci-guardian diagnose --latest --branch all
# Attempt a safe fix and open a PR if possible
gh ci-guardian fix --latest --branch all
What the demo shows:
- CI failures diagnosed into structured JSON
- Copilot-generated unified diffs
- Automatic PR creation when patches are safe
- Graceful refusal with preserved diffs when fixes are unsafe (human-in-the-loop)
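As an illustration of the structured diagnosis and the refusal path listed above, here is a minimal sketch of how free-form model output might be turned into a validated record. The schema (`root_cause`, `failing_step`, `confidence`), the 0.8 threshold, and the function names are assumptions made for this example, not CI Guardian's actual format.

```python
import json
from dataclasses import dataclass

# Hypothetical schema: the keys a diagnosis must contain to be accepted.
REQUIRED_KEYS = {"root_cause", "failing_step", "confidence"}


@dataclass
class Diagnosis:
    root_cause: str
    failing_step: str
    confidence: float


def parse_diagnosis(model_output):
    """Parse the span from the first '{' to the last '}' as a diagnosis, else None."""
    start = model_output.find("{")
    end = model_output.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        data = json.loads(model_output[start:end + 1])
    except json.JSONDecodeError:
        return None
    # Reject anything that is not an object with all required keys.
    if not isinstance(data, dict) or not REQUIRED_KEYS <= data.keys():
        return None
    try:
        confidence = float(data["confidence"])
    except (TypeError, ValueError):
        return None
    return Diagnosis(str(data["root_cause"]), str(data["failing_step"]), confidence)


def should_attempt_fix(diagnosis, threshold=0.8):
    """Attempt a fix only for a valid diagnosis at or above the assumed threshold."""
    return diagnosis is not None and diagnosis.confidence >= threshold
```

Returning `None` for anything malformed means the caller falls through to the human-review path by default instead of acting on a partial answer.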
This behavior was demonstrated on a real Flask fork where CI failures only surface on pull requests, not direct pushes.
Diagnosis on Failing CI with demo repo
Fix made by ci-guardian on demo repo
PR opened in GitHub by ci-guardian
When a fix is safe and minimal, CI Guardian automatically opens a remediation pull request.

Diagnosis on Failing CI on real repo (Flask)
CI Guardian converts a real failing GitHub Actions run into a structured, machine-readable diagnosis using GitHub Copilot CLI.

Human-in-the-loop Intervention
CI Guardian safely refuses to auto-fix an ambiguous CI failure on a real Flask fork and escalates to human review.
My Experience with GitHub Copilot CLI
GitHub Copilot CLI was used as a reasoning engine, not a blind code generator. I used copilot -p to:
- Summarize CI logs into structured root-cause explanations
- Generate minimal unified diffs grounded in real failure logs
- Draft concise pull request titles and descriptions
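A call of this kind could be sketched as follows. Only the `copilot -p` invocation comes from the description above; the prompt wording, the 8000-character log budget, the timeout, and the function names are assumptions for illustration.

```python
import subprocess

# Assumed budget: send only the tail of the log, where failures usually appear.
MAX_LOG_CHARS = 8000


def build_diagnosis_prompt(log_text, max_chars=MAX_LOG_CHARS):
    """Build a prompt that asks for a JSON diagnosis grounded in the log tail."""
    tail = log_text[-max_chars:]
    return (
        "You are diagnosing a failed GitHub Actions run.\n"
        "Reply with a single JSON object with the keys "
        "root_cause, failing_step, and confidence.\n"
        "Base the answer only on the log below.\n\n"
        "CI log (tail):\n" + tail
    )


def ask_copilot(prompt, timeout=120):
    """Run `copilot -p <prompt>` and return its stdout, or None on failure."""
    result = subprocess.run(
        ["copilot", "-p", prompt],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if result.returncode != 0:
        return None
    return result.stdout
```

Passing the prompt as a single list element avoids shell quoting problems when the log contains quotes or backticks.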
The key insight was that Copilot is most effective when paired with strict guardrails. CI Guardian treats Copilot output as a proposal, not a command, and enforces safety checks before applying any change. This results in automation that accelerates debugging without sacrificing trust or correctness.
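To make the idea of guardrails concrete, here is a minimal sketch of a pre-apply check on a proposed unified diff. The limits and protected path patterns are an assumed policy chosen for this example, and `summarize_diff` and `check_patch` are hypothetical names; the actual checks in CI Guardian may differ.

```python
from fnmatch import fnmatch

# Assumed policy for illustration only.
PROTECTED_GLOBS = [".github/workflows/*", "*.lock", "*.pem"]
MAX_CHANGED_LINES = 20
MAX_FILES = 2


def summarize_diff(diff_text):
    """Return (touched_files, changed_line_count) for a unified diff."""
    files = []
    changed = 0
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            # New-side file header, e.g. "+++ b/app.py".
            path = line[4:].split("\t")[0].strip()
            if path.startswith("b/"):
                path = path[2:]
            if path != "/dev/null":
                files.append(path)
        elif line.startswith("--- "):
            # Old-side file header. Simplification: a removed line whose
            # content starts with "-- " would also be skipped here.
            continue
        elif line.startswith(("+", "-")):
            changed += 1
    return files, changed


def check_patch(diff_text):
    """Return (ok, reasons); any reason means the patch is left for a human."""
    files, changed = summarize_diff(diff_text)
    reasons = []
    if not files:
        reasons.append("no files found in diff")
    if len(files) > MAX_FILES:
        reasons.append(f"touches {len(files)} files (limit {MAX_FILES})")
    if changed > MAX_CHANGED_LINES:
        reasons.append(f"changes {changed} lines (limit {MAX_CHANGED_LINES})")
    for path in files:
        if any(fnmatch(path, pattern) for pattern in PROTECTED_GLOBS):
            reasons.append(f"touches protected path: {path}")
    return (not reasons, reasons)
```

A static check like this complements, rather than replaces, verifying that the patch applies cleanly, for example with `git apply --check` before anything is committed.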

