This is a submission for the GitHub Copilot CLI Challenge
🍔 Hambugsy: Finding the Bug in Your Stack
What I Built
Hambugsy is a CLI tool that answers the question every developer asks when a test fails:
"Is my test wrong, or is my code wrong?"
Instead of spending 30+ minutes investigating, Hambugsy gives you an instant verdict with confidence scores and recommended fixes.
The Problem It Solves
Every developer has experienced this:
❌ FAILED: testCalculateDiscount
Expected: 90
Actual: 85
Now begins the investigation:
- Was the test written correctly?
- Did someone change the business logic?
- Is this a regression?
- Which file do I need to fix?
This investigation typically takes 30-60 minutes per failing test.
The Solution
$ hambugsy analyze ./src/OrderService.java
🔍 HAMBUGSY - Finding the bug in your stack...
┌──────────────────────────────────────────────────────────────┐
│ 📍 calculateTotal() - line 47                                │
│ ├── ❌ Test FAILS: testCalculateTotal_WithDiscount           │
│ ├── 🔬 Analysis:                                             │
│ │    • Test expects: 10% discount (written: 2025-03-15)      │
│ │    • Code applies: 15% discount (changed: 2026-01-22)      │
│ │    • Git blame: "Updated discount per new pricing policy"  │
│ │                                                            │
│ ├── 🎯 VERDICT: Code CHANGED → Test OUTDATED                 │
│ └── 💡 Fix: Update test assertion line 23: 90 → 85           │
└──────────────────────────────────────────────────────────────┘
📊 Summary: 1 outdated test | Time saved: ~45 minutes
Demo
Watch the Full Demo
The demo showcases:
- analyze - Diagnose test failures and determine if the test or code is wrong
- --run-tests - Execute real tests for accurate failure detection
- suggest - Find missing tests and generate suggestions
- fix - Auto-fix detected issues (with --dry-run preview)
- Multiple output formats: console, JSON, markdown, GitHub Actions
Try It Yourself
# Install
npm install -g hambugsy
# Analyze your project
hambugsy analyze ./src
How GitHub Copilot CLI Powers Hambugsy
Hambugsy is fundamentally built around GitHub Copilot CLI's capabilities. Here's how each feature uses Copilot:
1. Semantic Code Analysis
When Hambugsy needs to understand what a test expects vs what the code does, it uses Copilot:
// Using Copilot to analyze test intent
const testAnalysis = await exec(`
gh copilot explain "What behavior does this test verify: ${testCode}"
`);
// Using Copilot to analyze code behavior
const codeAnalysis = await exec(`
gh copilot explain "What does this function actually do: ${sourceCode}"
`);
2. Intelligent Fix Suggestions
Copilot generates the specific fix recommendations:
const fixSuggestion = await exec(`
gh copilot suggest -t code "
The test expects: ${testExpectation}
The code does: ${actualBehavior}
Generate a fix for the ${isTestWrong ? 'test' : 'code'}
"
`);
3. Commit Message Analysis
Copilot helps interpret whether code changes were intentional:
const intentAnalysis = await exec(`
gh copilot explain "
Was this change intentional or accidental based on the commit message:
'${commitMessage}'
"
`);
4. Natural Language Explanations
Every verdict includes a human-readable explanation generated by Copilot:
const explanation = await exec(`
gh copilot explain "
Explain why the test '${testName}' fails:
- Test expects: ${expected}
- Code returns: ${actual}
- Test was written: ${testDate}
- Code was changed: ${codeDate}
Explain in plain English for a developer.
"
`);
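As a rough illustration, that explanation prompt could be assembled from the gathered evidence like this (the field names are my assumption, not Hambugsy's actual API):

```javascript
// Sketch: assemble the plain-English explanation prompt from the
// evidence Hambugsy gathers (illustrative field names).
function buildExplanationPrompt({ testName, expected, actual, testDate, codeDate }) {
  return [
    `Explain why the test '${testName}' fails:`,
    `- Test expects: ${expected}`,
    `- Code returns: ${actual}`,
    `- Test was written: ${testDate}`,
    `- Code was changed: ${codeDate}`,
    'Explain in plain English for a developer.',
  ].join('\n');
}
```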
The Verdict System
Hambugsy classifies every failing test into four categories:
| Verdict | Icon | When Applied |
|---|---|---|
| Code Bug | 🐛 | Test is correct, code has defect |
| Outdated Test | 🔄 | Code changed intentionally, test needs update |
| Flaky Test | 🎲 | Test passes/fails inconsistently |
| Environment Issue | 🌐 | External dependency problem |
Decision Tree
                Test Failure
                     │
         ┌───────────┴───────────┐
         │                       │
   Code changed?           Code unchanged
         │                       │
    ┌────┴────┐             ┌────┴────┐
    │         │             │         │
Intentional? Regression  Test valid? Test invalid
    │         │             │         │
    ▼         ▼             ▼         ▼
 OUTDATED    CODE          CODE      TEST
   TEST      BUG           BUG       BUG
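The tree above can be sketched as a tiny classifier (a minimal sketch; the names are illustrative, not Hambugsy's internals):

```javascript
// Minimal sketch of the verdict decision tree.
// codeChanged / changeIntentional / testValid are the three questions
// the tree asks, in order.
function classifyFailure({ codeChanged, changeIntentional, testValid }) {
  if (codeChanged) {
    // Intentional change => the test is stale; otherwise a regression.
    return changeIntentional ? 'OUTDATED_TEST' : 'CODE_BUG';
  }
  // Code untouched: a valid test points at the code, an invalid one at itself.
  return testValid ? 'CODE_BUG' : 'TEST_BUG';
}
```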
Features
Multi-Language Support
# Java/JUnit
hambugsy analyze ./src/main/java/
# TypeScript/Jest
hambugsy analyze ./src/ --framework=jest
# Python/pytest
hambugsy analyze ./tests/
| Language | Frameworks | Status |
|---|---|---|
| Java | JUnit 4/5, TestNG | ✅ Full |
| TypeScript | Jest, Mocha, Vitest | ✅ Full |
| Python | pytest, unittest | ✅ Full |
| Go | go test, testify | ✅ Full |
| Rust | #[test], tokio::test | ✅ Full |
| C# | NUnit, xUnit, MSTest | ✅ Full |
🔍 Missing Test Suggestions
Beyond analyzing failures, Hambugsy proactively finds untested code paths:
$ hambugsy suggest ./src/PaymentService.java
🔍 Finding gaps in your test coverage...
📍 processPayment() @ line 5
├── ✅ TESTED: Happy path
├── ❌ MISSING: null request handling
├── ❌ MISSING: negative amount validation
└── ❌ MISSING: large amount threshold
💡 SUGGESTED TESTS: [generates actual test code]
This is what sets Hambugsy apart - it doesn't just analyze failures, it prevents future bugs by identifying missing test coverage.
CI/CD Integration
# GitHub Actions
- name: Analyze Tests
run: hambugsy analyze ./src --format=github
# The tool outputs GitHub Actions annotations
# ::error file=src/Service.java,line=47::CODE BUG: Missing null check
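The annotation shown is GitHub's standard workflow-command syntax, so emitting it is essentially a one-liner (a hypothetical helper, not necessarily how Hambugsy implements it):

```javascript
// Sketch: turn a verdict into a GitHub Actions error annotation.
// Format follows GitHub's "::error file=...,line=...::message" command.
function toGithubAnnotation({ file, line, verdict, message }) {
  return `::error file=${file},line=${line}::${verdict}: ${message}`;
}
```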
Rich Configuration
# .hambugsy.yml
patterns:
source: ["src/**/*.java"]
test: ["test/**/*.java"]
analysis:
git_history_days: 90
confidence_threshold: 0.7
ci:
fail_on_bugs: true
fail_on_outdated_tests: false
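To show how those `ci` settings could interact with the confidence scores, here is a minimal sketch of exit-code gating (illustrative logic and names, not Hambugsy's actual implementation):

```javascript
// Sketch: decide the CI exit code from verdicts and the .hambugsy.yml
// ci settings. Verdicts below the confidence threshold are ignored.
function exitCode(verdicts, { failOnBugs, failOnOutdatedTests, confidenceThreshold }) {
  const firm = verdicts.filter(v => v.confidence >= confidenceThreshold);
  const hasBug = firm.some(v => v.kind === 'CODE_BUG');
  const hasOutdated = firm.some(v => v.kind === 'OUTDATED_TEST');
  if (failOnBugs && hasBug) return 1;
  if (failOnOutdatedTests && hasOutdated) return 1;
  return 0;
}
```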
Architecture
┌─────────────────────────────────────────────┐
│                Hambugsy CLI                 │
├─────────────────────────────────────────────┤
│   ┌────────┐   ┌──────────┐   ┌──────────┐  │
│   │ Parser │   │ Analyzer │   │ Reporter │  │
│   └───┬────┘   └────┬─────┘   └────┬─────┘  │
│       └─────────────┼──────────────┘        │
│                     │                       │
│           ┌─────────▼─────────┐             │
│           │  Copilot Bridge   │             │
│           └─────────┬─────────┘             │
├─────────────────────┼───────────────────────┤
│                     │                       │
│  ┌─────────┐   ┌────▼─────┐   ┌─────────┐   │
│  │   Git   │   │ Copilot  │   │  File   │   │
│  │ History │   │   CLI    │   │ System  │   │
│  └─────────┘   └──────────┘   └─────────┘   │
└─────────────────────────────────────────────┘
Why "Hambugsy"?
🍔 Ham + 🐛 Bug + 🕵️ Bugsy
- Like a hamburger with layers - bugs hide between the test layer and code layer
- We hunt bugs - finding who's guilty
- Bugsy Siegel - the gangster who always found the guilty party
"Finding the bug in your stack"
Repository
Hambugsy
The CLI tool that tells you WHO is wrong: your test or your code.
📦 View on npm • 🌐 Website • ⚡ Quick Start
Quick Links
- 📦 npm Package
- 🌐 Website
- 📖 Documentation
- 🐛 Issue Tracker
What's Next
- [x] VS Code Extension (included!)
- [x] Auto-fix mode (hambugsy fix)
- [ ] IntelliJ Plugin
- [ ] Team analytics dashboard
- [ ] Slack/Teams notifications
Try It Now
# Prerequisites
gh extension install github/gh-copilot
# Install Hambugsy
npm install -g hambugsy
# Run on your project
hambugsy analyze ./src
# See what's really causing your test failures
Feedback Welcome!
I'd love to hear your thoughts:
- Is this useful for your workflow?
- What languages/frameworks would you need?
- Any features you'd want to see?
Drop a comment below or open an issue on GitHub! 🚀
Built with ❤️ for the GitHub Copilot CLI Challenge
