Agastya Khati

Copilot Sherlock: Governed Incident Investigation CLI

This is a submission for the GitHub Copilot CLI Challenge

Copilot Sherlock — Governed Incident Investigation CLI

AI proposes. Humans decide. The system verifies.


What I Built

I built Sherlock, a CLI-based incident investigation system that demonstrates how to use GitHub Copilot CLI inside a real production-style workflow, not just as a code generator or chat assistant.

Sherlock turns raw production evidence (logs, metrics, deployments) into a governed, auditable incident decision with a strict lifecycle:

  • AI performs bounded reasoning
  • Humans explicitly decide (ACCEPT / MODIFY / REJECT)
  • The system enforces authority and lifecycle rules
  • Outcomes are recorded as append-only organizational memory
  • All artifacts are cryptographically verifiable

AI never decides. Humans always decide.

This is not an AI demo.
It’s an incident lifecycle system with AI inside it.


The 90-Second Pitch

Most incident tools do one of two things:

  • Give you AI suggestions with no governance
  • Give you checklists with no intelligence

Sherlock separates intelligence from authority:

  • AI proposes multiple competing hypotheses with evidence and confidence
  • Humans approve or reject the outcome
  • The system enforces lifecycle gates, memory isolation, and execution rules
  • Every decision is immutable and externally verifiable
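The split between intelligence and authority can be pictured as a small gate. This is a minimal sketch, not Sherlock's actual code: the names `Proposal`, `Outcome`, and `finalize` are hypothetical, and the only details taken from this post are the three decisions (ACCEPT / MODIFY / REJECT) and the rule that AI output is kept even when rejected.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    ACCEPT = "ACCEPT"
    MODIFY = "MODIFY"
    REJECT = "REJECT"


@dataclass(frozen=True)
class Proposal:
    """AI output: advisory only, never final (hypothetical shape)."""
    incident_id: str
    root_cause: str


@dataclass(frozen=True)
class Outcome:
    """Created only by an explicit human decision; frozen, so it cannot be edited later."""
    proposal: Proposal               # the AI text is preserved, even on REJECT
    decision: Decision
    decided_by: str
    final_root_cause: Optional[str]


def finalize(proposal, decision, decided_by, modified_root_cause=None):
    """Turn a proposal into an outcome. There is deliberately no default decision."""
    if not isinstance(decision, Decision):
        raise ValueError("decision must be ACCEPT, MODIFY, or REJECT")
    if not decided_by.strip():
        raise PermissionError("a named human reviewer is required")
    if decision is Decision.ACCEPT:
        final = proposal.root_cause
    elif decision is Decision.MODIFY:
        if not modified_root_cause:
            raise ValueError("MODIFY requires the reviewer's revised root cause")
        final = modified_root_cause
    else:
        final = None  # REJECT: nothing is adopted, but the proposal stays on record
    return Outcome(proposal, decision, decided_by, final)
```

Because nothing in this sketch constructs an `Outcome` without a `Decision` and a reviewer name, an auto-approval path simply has nowhere to live.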

There are no feedback loops, no auto-approval, and no adaptive behavior.

This is what production-grade AI assistance looks like.


Demo

Video Walkthrough (Full End-to-End Run)

GitHub Repository

https://github.com/kris70lesgo/sherlock-demo


One-Command Demo

./run-demo.sh

This executes the entire incident lifecycle:

  1. Evidence normalization and trust validation
  2. Scope reduction (thousands of log lines → a few signals)
  3. AI hypothesis reasoning using GitHub Copilot CLI
  4. Mandatory human governance (ACCEPT / MODIFY / REJECT)
  5. Organizational memory write (append-only; existing records are read-only)
  6. Operational integration (Slack / JIRA simulation)
  7. Cryptographic provenance and trust report generation
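Step 2 is the easiest one to picture in code. The sketch below is an illustration of the idea, not Sherlock's implementation: it assumes plain-text log lines that carry a level keyword such as ERROR or WARN, and the function name `reduce_scope` and the `<n>` placeholder are hypothetical. It collapses lines that differ only in numbers (timestamps, durations, counters) into one template and keeps the most frequent templates as signals.

```python
import re
from collections import Counter


def reduce_scope(log_lines, top_k=3):
    """Collapse noisy log lines into at most top_k (template, count) signals."""
    counts = Counter()
    for line in log_lines:
        # Assumption: only ERROR and WARN lines are worth keeping as evidence.
        if "ERROR" not in line and "WARN" not in line:
            continue
        # Replace every run of digits so near-identical lines share one template.
        template = re.sub(r"\d+", "<n>", line.strip())
        counts[template] += 1
    return counts.most_common(top_k)


if __name__ == "__main__":
    sample = [
        "2024-01-01T00:00:01 ERROR db timeout after 5000 ms",
        "2024-01-01T00:00:02 INFO request ok",
        "2024-01-01T00:00:03 ERROR db timeout after 5003 ms",
        "2024-01-01T00:00:04 WARN retry 2 of 3",
    ]
    for template, count in reduce_scope(sample):
        print(count, template)
```

Handing the reasoning phase a few counted templates instead of raw logs keeps the prompt small and makes it easy to see exactly which evidence the AI was shown.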

Fresh Incident IDs

Organizational memory is append-only, so reusing an incident ID that is already recorded causes a conflict. To investigate under a fresh ID, run:

./sherlock investigate INC-999
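Why a reused ID conflicts is easier to see with a toy store. The following is a sketch under stated assumptions, not Sherlock's storage format: the class `AppendOnlyMemory`, its record fields, and the use of a SHA-256 hash chain are all hypothetical choices made for illustration. A hash chain is one common way to make an append-only log verifiable; this post does not say which mechanism Sherlock uses.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(body):
    """Hash a record body deterministically (sorted keys give stable JSON)."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


class AppendOnlyMemory:
    """In-memory illustration: records can be added and read, never updated."""

    def __init__(self):
        self._records = []

    def append(self, incident_id, outcome):
        # An existing ID is never overwritten, which is why a rerun needs a fresh one.
        if any(r["incident_id"] == incident_id for r in self._records):
            raise ValueError(f"{incident_id} is already recorded; use a fresh incident ID")
        prev_hash = self._records[-1]["hash"] if self._records else GENESIS
        body = {"incident_id": incident_id, "outcome": outcome, "prev_hash": prev_hash}
        record = {**body, "hash": _digest(body)}
        self._records.append(record)
        return record["hash"]

    def verify(self):
        """Recompute every hash and link; any edited record breaks the chain."""
        prev_hash = GENESIS
        for r in self._records:
            body = {k: r[k] for k in ("incident_id", "outcome", "prev_hash")}
            if r["prev_hash"] != prev_hash or r["hash"] != _digest(body):
                return False
            prev_hash = r["hash"]
        return True
```

Each record commits to the one before it, so anyone holding the log can rerun `verify()` without having to trust the tool that wrote it.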

Copilot Auth vs Offline Mode

Sherlock uses GitHub Copilot CLI for Phase 3 reasoning when authenticated.

If Copilot is not authenticated, Sherlock falls back to an offline post-mortem generator so the demo still completes.
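A fallback like this boils down to one check before Phase 3. The sketch below is a guess at the shape, not the code in the repository: it assumes that a zero exit code from `gh auth status` is a good enough proxy for "Copilot is authenticated", and the function names `gh_is_authenticated` and `choose_reasoner` are hypothetical.

```python
import shutil
import subprocess


def gh_is_authenticated():
    """Assumption: treat a zero exit code from `gh auth status` as 'authenticated'."""
    if shutil.which("gh") is None:
        return False  # GitHub CLI is not installed at all
    result = subprocess.run(["gh", "auth", "status"], capture_output=True)
    return result.returncode == 0


def choose_reasoner(is_authenticated=gh_is_authenticated):
    """Pick the Phase 3 engine; the check is injectable so it can be tested offline."""
    return "copilot" if is_authenticated() else "offline"


if __name__ == "__main__":
    print("Phase 3 reasoner:", choose_reasoner())
```

Keeping the check separate from the choice means the demo path can be exercised without touching a real GitHub account.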

To authenticate Copilot:

gh auth login

My Experience with GitHub Copilot CLI

Copilot CLI was used as a reasoning engine, not as an authority.

Specifically:

  • Copilot generates multi-hypothesis RCAs from scoped evidence
  • Each hypothesis includes:
    • Evidence FOR
    • Evidence AGAINST
    • Confidence contribution
  • Confidence is budgeted and uncertainty is explicit
  • Hypotheses can be ruled out with justification
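One way to read "budgeted confidence" is that all hypotheses share a fixed pool of 100 points and whatever is left unspent is the stated uncertainty. That reading is an assumption, as are the names below (`Hypothesis`, `residual_uncertainty`) and the rule that a ruled-out hypothesis must hold zero points; the post does not spell out Sherlock's exact schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hypothesis:
    name: str
    evidence_for: List[str]
    evidence_against: List[str]
    confidence: int                          # points out of a shared budget
    ruled_out_because: Optional[str] = None  # justification, if ruled out


def residual_uncertainty(hypotheses, budget=100):
    """Validate the confidence budget and return the unallocated remainder."""
    for h in hypotheses:
        if h.confidence < 0:
            raise ValueError(f"{h.name}: confidence cannot be negative")
        if h.ruled_out_because is not None and h.confidence != 0:
            raise ValueError(f"{h.name}: a ruled-out hypothesis must carry 0 confidence")
    spent = sum(h.confidence for h in hypotheses)
    if spent > budget:
        raise ValueError(f"confidence budget exceeded: {spent} > {budget}")
    return budget - spent


if __name__ == "__main__":
    candidates = [
        Hypothesis("bad deploy", ["errors start right after rollout"], [], 60),
        Hypothesis("db saturation", ["slow queries"], ["CPU looks normal"], 25),
        Hypothesis("network blip", [], ["no packet loss"], 0, "contradicted by metrics"),
    ]
    print("explicit uncertainty:", residual_uncertainty(candidates))
```

A check like this runs on the AI's output before a human ever sees it, so an over-confident answer is rejected by the system and not by the reviewer's intuition.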

Crucially:

  • Copilot cannot finalize anything
  • It cannot approve itself
  • It cannot modify governance rules
  • Its output is preserved even when rejected

The most valuable aspect of Copilot CLI here wasn’t speed; it was structured reasoning under constraints.
Once those constraints were clear, Copilot became a reliable component in a larger system instead of a black box.


Why This Matters

Sherlock demonstrates:

  • How to use Copilot CLI inside a production workflow
  • How to prevent AI feedback loops
  • How to enforce human authority mechanically
  • How to make AI output externally verifiable
  • How to integrate AI without letting it run the system

Most tools ask you to trust them.

Sherlock gives you a way to verify them.

That’s the difference between a demo and a product.


Final Notes

  • Language: Bash + Python
  • Interface: CLI (interactive and non-interactive modes)
  • Architecture: 7-phase, strictly isolated pipeline
  • Status: Frozen, validated, demo-ready

Thanks for checking it out.
