This is a submission for the GitHub Copilot CLI Challenge
Copilot Sherlock — Governed Incident Investigation CLI
AI proposes. Humans decide. The system verifies.
What I Built
I built Sherlock, a CLI-based incident investigation system that demonstrates how to use GitHub Copilot CLI inside a real production-style workflow, not just as a code generator or chat assistant.
Sherlock turns raw production evidence (logs, metrics, deployments) into a governed, auditable incident decision with a strict lifecycle:
- AI performs bounded reasoning
- Humans explicitly decide (ACCEPT / MODIFY / REJECT)
- The system enforces authority and lifecycle rules
- Outcomes are recorded as append-only organizational memory
- All artifacts are cryptographically verifiable
AI never decides. Humans always decide.
This is not an AI demo.
It’s an incident lifecycle system with AI inside it.
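The lifecycle described above can be sketched as a small state machine. All names here are illustrative, not Sherlock's actual API; the point is that phase ordering is enforced by the system, so AI output can never skip the human gate.

```python
# Minimal sketch of a governed incident lifecycle (illustrative names,
# not Sherlock's actual code).
from enum import Enum, auto

class Phase(Enum):
    EVIDENCE = auto()   # evidence normalization and trust validation
    REASONING = auto()  # AI proposes, bounded
    DECISION = auto()   # human ACCEPT / MODIFY / REJECT
    MEMORY = auto()     # append-only organizational record
    DONE = auto()

# Legal transitions: the system, not the AI, enforces ordering.
TRANSITIONS = {
    Phase.EVIDENCE: {Phase.REASONING},
    Phase.REASONING: {Phase.DECISION},  # AI output always stops at a human gate
    Phase.DECISION: {Phase.MEMORY},
    Phase.MEMORY: {Phase.DONE},
}

def advance(current: Phase, target: Phase) -> Phase:
    """Move to the next phase only if the transition is legal."""
    if target not in TRANSITIONS.get(current, set()):
        raise PermissionError(f"illegal transition {current.name} -> {target.name}")
    return target
```

Because `REASONING` can only ever transition to `DECISION`, there is no path where an AI proposal reaches memory or execution without a human verdict in between.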
The 90-Second Pitch
Most incident tools do one of two things:
- Give you AI suggestions with no governance
- Give you checklists with no intelligence
Sherlock separates intelligence from authority:
- AI proposes multiple competing hypotheses with evidence and confidence
- Humans approve or reject the outcome
- The system enforces lifecycle gates, memory isolation, and execution rules
- Every decision is immutable and externally verifiable
There are no feedback loops, no auto-approval, and no adaptive behavior.
This is what production-grade AI assistance looks like.
Demo
Video Walkthrough (Full End-to-End Run)
GitHub Repository
https://github.com/kris70lesgo/sherlock-demo
One-Command Demo
./run-demo.sh
This executes the entire incident lifecycle:
- Evidence normalization and trust validation
- Scope reduction (thousands of log lines → a few signals)
- AI hypothesis reasoning using GitHub Copilot CLI
- Mandatory human governance (ACCEPT / MODIFY / REJECT)
- Organizational memory write (append-only, read-only thereafter)
- Operational integration (Slack / JIRA simulation)
- Cryptographic provenance and trust report generation
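The append-only memory and cryptographic provenance steps can be illustrated with a hash chain: each record's digest covers both its content and the previous digest, so any tampering breaks verification from that point on. This is a minimal sketch under assumed field names, not Sherlock's actual storage format.

```python
# Sketch of an append-only, hash-chained memory log (illustrative,
# not Sherlock's actual on-disk format).
import hashlib
import json

def append_record(log: list, record: dict) -> list:
    """Append a record whose digest chains to the previous entry."""
    prev = log[-1]["sha256"] if log else "0" * 64
    payload = json.dumps(record, sort_keys=True)
    digest = hashlib.sha256((prev + payload).encode()).hexdigest()
    log.append({"record": record, "prev": prev, "sha256": digest})
    return log

def verify(log: list) -> bool:
    """Recompute every digest; any edit or deletion breaks the chain."""
    prev = "0" * 64
    for entry in log:
        payload = json.dumps(entry["record"], sort_keys=True)
        expected = hashlib.sha256((prev + payload).encode()).hexdigest()
        if entry["prev"] != prev or entry["sha256"] != expected:
            return False
        prev = entry["sha256"]
    return True
```

Verification needs no trust in the writer: anyone holding the log can recompute the chain, which is what makes the artifacts externally verifiable rather than merely asserted.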
Fresh Incident IDs
To avoid append-only memory conflicts, you can run:
./sherlock investigate INC-999
Copilot Auth vs Offline Mode
Sherlock uses GitHub Copilot CLI for Phase 3 reasoning when authenticated.
If Copilot is not authenticated, Sherlock falls back to an offline post-mortem generator so the demo still completes.
To authenticate Copilot:
gh auth login
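The fallback decision can be sketched like this; the probe and function names are assumptions for illustration, not Sherlock's actual implementation, though `gh auth status` is the real GitHub CLI command for checking authentication.

```python
# Sketch of the Copilot-vs-offline fallback (illustrative; Sherlock's
# actual check may differ).
import shutil
import subprocess

def copilot_authenticated() -> bool:
    """Best-effort probe: gh is installed and `gh auth status` succeeds."""
    if shutil.which("gh") is None:
        return False
    result = subprocess.run(["gh", "auth", "status"], capture_output=True)
    return result.returncode == 0

def choose_engine(authenticated: bool) -> str:
    # Never hard-fail the demo: degrade to the offline post-mortem generator.
    return "copilot" if authenticated else "offline-postmortem"
```

Keeping the selection a pure function of the probe result makes the degradation path trivially testable without a live Copilot session.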
My Experience with GitHub Copilot CLI
Copilot CLI was used as a reasoning engine, not as an authority.
Specifically:
- Copilot generates multi-hypothesis RCAs from scoped evidence
- Each hypothesis includes:
  - Evidence FOR
  - Evidence AGAINST
  - Confidence contribution
- Confidence is budgeted and uncertainty is explicit
- Hypotheses can be ruled out with justification
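The shape of that output can be sketched as a small data structure with a confidence budget: active hypotheses may not claim more than the budget, and any shortfall is surfaced as explicit residual uncertainty. Field names here are illustrative assumptions, not Sherlock's schema.

```python
# Sketch of a budgeted multi-hypothesis structure (illustrative names,
# not Sherlock's actual schema).
from dataclasses import dataclass, field

@dataclass
class Hypothesis:
    claim: str
    evidence_for: list = field(default_factory=list)
    evidence_against: list = field(default_factory=list)
    confidence: float = 0.0  # share of a fixed budget, not a free-floating score
    ruled_out: str = ""      # non-empty string = justification for ruling out

def check_budget(hypotheses, budget=1.0, tol=1e-6):
    """Enforce the confidence budget; return explicit residual uncertainty."""
    total = sum(h.confidence for h in hypotheses if not h.ruled_out)
    if total > budget + tol:
        raise ValueError(f"confidence over budget: {total:.2f} > {budget:.2f}")
    return budget - total  # what the AI does NOT claim to know
```

The budget matters because it prevents the common failure mode of every hypothesis arriving at "high confidence": claiming more certainty about one cause mechanically forces less about the others.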
Crucially:
- Copilot cannot finalize anything
- It cannot approve itself
- It cannot modify governance rules
- Its output is preserved even when rejected
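Those constraints can be expressed as a single gate function: the only way a proposal becomes an outcome is through a human verdict, and the raw AI proposal survives verbatim even when rejected. This is a hedged sketch with assumed names, not Sherlock's actual code.

```python
# Sketch of the mandatory human gate (illustrative; not Sherlock's code).
ALLOWED_VERDICTS = {"ACCEPT", "MODIFY", "REJECT"}

def human_gate(proposal: dict, verdict: str, modified=None) -> dict:
    """Turn an AI proposal into an outcome only via an explicit human verdict."""
    if verdict not in ALLOWED_VERDICTS:
        raise ValueError(f"unknown verdict: {verdict}")
    outcome = modified if (verdict == "MODIFY" and modified) else proposal
    # The raw AI proposal is preserved verbatim, even on REJECT,
    # so the audit trail shows what the AI said, not just what humans kept.
    return {
        "ai_proposal": proposal,
        "verdict": verdict,
        "final": None if verdict == "REJECT" else outcome,
    }
```

Note there is no code path that produces a `final` outcome without a `verdict`, which is how human authority is enforced mechanically rather than by convention.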
The most valuable aspect of Copilot CLI here wasn’t speed; it was structured reasoning under constraints.
Once those constraints were clear, Copilot became a reliable component in a larger system instead of a black box.
Why This Matters
Sherlock demonstrates:
- How to use Copilot CLI inside a production workflow
- How to prevent AI feedback loops
- How to enforce human authority mechanically
- How to make AI output externally verifiable
- How to integrate AI without letting it run the system
Most tools ask you to trust them.
Sherlock gives you a way to verify them.
That’s the difference between a demo and a product.
Final Notes
- Language: Bash + Python
- Interface: CLI (interactive and non-interactive modes)
- Architecture: 7-phase, strictly isolated pipeline
- Status: Frozen, validated, demo ready
Thanks for checking it out.