Adam Porkolab

Hambugsy: The CLI That Tells You WHO Is Wrong - Your Test or Your Code

GitHub Copilot CLI Challenge Submission

This is a submission for the GitHub Copilot CLI Challenge

πŸ” Hambugsy: Finding the Bug in Your Stack

What I Built

Hambugsy is a CLI tool that answers the question every developer asks when a test fails:

"Is my test wrong, or is my code wrong?"

Instead of leaving you to investigate for 30+ minutes, Hambugsy gives you an instant verdict with a confidence score and a recommended fix.

The Problem It Solves

Every developer has experienced this:

❌ FAILED: testCalculateDiscount
   Expected: 90
   Actual: 85

Now begins the investigation:

  • Was the test written correctly?
  • Did someone change the business logic?
  • Is this a regression?
  • Which file do I need to fix?

This investigation can easily take 30-60 minutes per failing test.

The Solution

$ hambugsy analyze ./src/OrderService.java

πŸ” HAMBUGSY - Finding the bug in your stack...

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  πŸ“ calculateTotal() - line 47                                  β”‚
β”‚  β”œβ”€β”€ ❌ Test FAILS: testCalculateTotal_WithDiscount             β”‚
β”‚  β”œβ”€β”€ πŸ”¬ Analysis:                                               β”‚
β”‚  β”‚   β€’ Test expects: 10% discount (written: 2025-03-15)         β”‚
β”‚  β”‚   β€’ Code applies: 15% discount (changed: 2026-01-22)         β”‚
β”‚  β”‚   β€’ Git blame: "Updated discount per new pricing policy"     β”‚
β”‚  β”‚                                                              β”‚
β”‚  └── 🎯 VERDICT: Code CHANGED β†’ Test OUTDATED                   β”‚
β”‚      └── πŸ’‘ Fix: Update test assertion line 23: 90 β†’ 85         β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

πŸ“Š Summary: 1 outdated test | Time saved: ~45 minutes
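The numbers in this example are consistent with a base price of 100 (an assumption; the post only shows the expected and actual values). A 10% discount then gives the 90 the test expects, and a 15% discount gives the 85 the code returns. A quick check, with `applyDiscount` as a hypothetical helper:

```javascript
// Assumed base price; the post does not state it.
const basePrice = 100;

// Apply a whole-number percentage discount to a price.
function applyDiscount(price, percent) {
  return (price * (100 - percent)) / 100;
}

console.log(applyDiscount(basePrice, 10)); // 90, the value the test expects
console.log(applyDiscount(basePrice, 15)); // 85, the value the code returns
```

This is why the suggested fix changes the assertion from 90 to 85 rather than touching the code.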

Demo

Watch the Full Demo

The demo showcases:

  • analyze - Diagnose test failures and determine if the test or code is wrong
  • --run-tests - Execute real tests for accurate failure detection
  • suggest - Find missing tests and generate suggestions
  • fix - Auto-fix detected issues (with --dry-run preview)
  • Multiple output formats: console, JSON, markdown, GitHub Actions

Try It Yourself

# Install
npm install -g hambugsy

# Analyze your project
hambugsy analyze ./src

How GitHub Copilot CLI Powers Hambugsy

Hambugsy is fundamentally built around GitHub Copilot CLI's capabilities. Here's how each feature uses Copilot:

1. Semantic Code Analysis

When Hambugsy needs to understand what a test expects vs what the code does, it uses Copilot:

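import { exec as execCallback } from "node:child_process";
import { promisify } from "node:util";

// Assumed helper (the post does not show how exec is defined):
// run a shell command and resolve to its stdout
const execAsync = promisify(execCallback);
const exec = async (command) => (await execAsync(command)).stdout;

// Using Copilot to analyze test intent
const testAnalysis = await exec(`
  gh copilot explain "What behavior does this test verify: ${testCode}"
`);

// Using Copilot to analyze code behavior
const codeAnalysis = await exec(`
  gh copilot explain "What does this function actually do: ${sourceCode}"
`);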

2. Intelligent Fix Suggestions

Copilot generates the specific fix recommendations:

const fixSuggestion = await exec(`
  gh copilot suggest -t code "
    The test expects: ${testExpectation}
    The code does: ${actualBehavior}
    Generate a fix for the ${isTestWrong ? 'test' : 'code'}
  "
`);

3. Commit Message Analysis

Copilot helps interpret whether code changes were intentional:

const intentAnalysis = await exec(`
  gh copilot explain "
    Was this change intentional or accidental based on the commit message: 
    '${commitMessage}'
  "
`);

4. Natural Language Explanations

Every verdict includes a human-readable explanation generated by Copilot:

const explanation = await exec(`
  gh copilot explain "
    Explain why the test '${testName}' fails:
    - Test expects: ${expected}
    - Code returns: ${actual}
    - Test was written: ${testDate}
    - Code was changed: ${codeDate}
    Explain in plain English for a developer.
  "
`);
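The snippets above interpolate test and source code straight into a shell command string, so a quote, backtick, or `$(...)` inside the code could break or alter the command. One way around that, sketched here under assumptions, is to build an argument list and skip the shell entirely. `buildExplainArgs` is a hypothetical helper, not part of Hambugsy or the Copilot CLI.

```javascript
// Hypothetical helper: build the argument list for `gh copilot explain`.
// The whole prompt becomes one argv element, so nothing inside the snippet
// is interpreted by a shell.
function buildExplainArgs(question, snippet) {
  const prompt = `${question}: ${snippet}`;
  return ["copilot", "explain", prompt];
}

// Sample input containing characters that are risky inside a shell string.
const testCode = 'assertEquals("90", total); // has "quotes" and $(subshell)';
const args = buildExplainArgs("What behavior does this test verify", testCode);

console.log(args.length); // 3
// With Node's child_process module, the list can be passed to execFile:
//   execFile("gh", args, (error, stdout) => { /* use stdout */ });
```

Because `execFile` does not spawn a shell by default, the prompt reaches the CLI exactly as written.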

The Verdict System

Hambugsy classifies every failing test into four categories:

| Verdict | Icon | When Applied |
|---|---|---|
| Code Bug | 🐛 | Test is correct, code has defect |
| Outdated Test | 📜 | Code changed intentionally, test needs update |
| Flaky Test | 🎲 | Test passes/fails inconsistently |
| Environment Issue | 🌐 | External dependency problem |

Decision Tree

                    Test Failure
                         │
            ┌────────────┴────────────┐
            │                         │
     Code changed?              Code unchanged
            │                         │
     ┌──────┴──────┐           ┌──────┴──────┐
     │             │           │             │
Intentional?   Regression   Test valid?  Test invalid
     │             │           │             │
     ▼             ▼           ▼             ▼
  OUTDATED      CODE        CODE          TEST
    TEST         BUG         BUG           BUG
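Read top to bottom, the tree reduces to three yes/no questions. A minimal sketch of that logic follows. The function and field names are hypothetical, and the returned labels are the leaf labels from the diagram; how Hambugsy actually answers each question is not shown in this post.

```javascript
// Hypothetical classifier mirroring the decision tree.
function classifyFailure({ codeChanged, changeIntentional, testValid }) {
  if (codeChanged) {
    // Left branch: an intentional change means the test lags behind the code;
    // an unintentional one is a regression in the code.
    return changeIntentional ? "OUTDATED TEST" : "CODE BUG";
  }
  // Right branch: the code is unchanged, so the verdict depends on the test.
  return testValid ? "CODE BUG" : "TEST BUG";
}

const outdated = classifyFailure({ codeChanged: true, changeIntentional: true, testValid: true });
const regression = classifyFailure({ codeChanged: true, changeIntentional: false, testValid: true });
const badTest = classifyFailure({ codeChanged: false, changeIntentional: false, testValid: false });

console.log(outdated);   // OUTDATED TEST
console.log(regression); // CODE BUG
console.log(badTest);    // TEST BUG
```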

Features

Multi-Language Support

# Java/JUnit
hambugsy analyze ./src/main/java/

# TypeScript/Jest
hambugsy analyze ./src/ --framework=jest

# Python/pytest
hambugsy analyze ./tests/

| Language | Frameworks | Status |
|---|---|---|
| Java | JUnit 4/5, TestNG | ✅ Full |
| TypeScript | Jest, Mocha, Vitest | ✅ Full |
| Python | pytest, unittest | ✅ Full |
| Go | go test, testify | ✅ Full |
| Rust | #[test], tokio::test | ✅ Full |
| C# | NUnit, xUnit, MSTest | ✅ Full |

🆕 Missing Test Suggestions

Beyond analyzing failures, Hambugsy proactively finds untested code paths:

$ hambugsy suggest ./src/PaymentService.java

πŸ” Finding gaps in your test coverage...

πŸ“ processPayment() @ line 5
β”œβ”€β”€ βœ… TESTED: Happy path
β”œβ”€β”€ ❌ MISSING: null request handling
β”œβ”€β”€ ❌ MISSING: negative amount validation
└── ❌ MISSING: large amount threshold

πŸ’‘ SUGGESTED TESTS: [generates actual test code]

This is what sets Hambugsy apart: it doesn't just analyze failures, it also helps prevent future bugs by identifying missing test coverage.

CI/CD Integration

# GitHub Actions
- name: Analyze Tests
  run: hambugsy analyze ./src --format=github

# The tool outputs GitHub Actions annotations
# ::error file=src/Service.java,line=47::CODE BUG: Missing null check
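The annotation line above follows the GitHub Actions workflow-command format `::error file=...,line=...::message`. As a rough sketch of how a verdict could be turned into such a line, the helper below applies the percent-escaping that workflow commands use. `toAnnotation` and the shape of its argument are hypothetical, not Hambugsy's actual reporter.

```javascript
// Escape the message part of a workflow command.
function escapeData(value) {
  return String(value)
    .replace(/%/g, "%25")
    .replace(/\r/g, "%0D")
    .replace(/\n/g, "%0A");
}

// Property values (file, line) must also escape ":" and ",".
function escapeProperty(value) {
  return escapeData(value).replace(/:/g, "%3A").replace(/,/g, "%2C");
}

// Hypothetical formatter: one verdict in, one annotation line out.
function toAnnotation({ file, line, verdict, message }) {
  const properties = `file=${escapeProperty(file)},line=${escapeProperty(line)}`;
  return `::error ${properties}::${escapeData(`${verdict}: ${message}`)}`;
}

const annotation = toAnnotation({
  file: "src/Service.java",
  line: 47,
  verdict: "CODE BUG",
  message: "Missing null check",
});

console.log(annotation);
// ::error file=src/Service.java,line=47::CODE BUG: Missing null check
```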

Rich Configuration

# .hambugsy.yml
patterns:
  source: ["src/**/*.java"]
  test: ["test/**/*.java"]

analysis:
  git_history_days: 90
  confidence_threshold: 0.7

ci:
  fail_on_bugs: true
  fail_on_outdated_tests: false
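The post does not say how these keys are evaluated, so the following is only one plausible reading: verdicts below `confidence_threshold` are ignored, and the `ci` flags decide whether the remaining verdicts fail the build. The verdict objects and the `exitCodeFor` function are hypothetical names used for illustration.

```javascript
// Values copied from the .hambugsy.yml example above.
const config = {
  analysis: { confidence_threshold: 0.7 },
  ci: { fail_on_bugs: true, fail_on_outdated_tests: false },
};

// Hypothetical: map a list of verdicts to a process exit code.
function exitCodeFor(verdicts, cfg) {
  // Keep only verdicts the tool is confident enough about.
  const confident = verdicts.filter(
    (v) => v.confidence >= cfg.analysis.confidence_threshold
  );
  const hasBug = confident.some((v) => v.type === "CODE BUG");
  const hasOutdated = confident.some((v) => v.type === "OUTDATED TEST");
  const shouldFail =
    (cfg.ci.fail_on_bugs && hasBug) ||
    (cfg.ci.fail_on_outdated_tests && hasOutdated);
  return shouldFail ? 1 : 0;
}

console.log(exitCodeFor([{ type: "OUTDATED TEST", confidence: 0.9 }], config)); // 0
console.log(exitCodeFor([{ type: "CODE BUG", confidence: 0.9 }], config)); // 1
console.log(exitCodeFor([{ type: "CODE BUG", confidence: 0.5 }], config)); // 0, below the threshold
```

Under this reading, the example configuration lets a pipeline block on real defects while treating outdated tests as warnings.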

Architecture

┌─────────────────────────────────────────────────┐
│                  Hambugsy CLI                   │
├─────────────────────────────────────────────────┤
│  ┌─────────┐  ┌─────────┐  ┌─────────┐          │
│  │ Parser  │  │Analyzer │  │Reporter │          │
│  └────┬────┘  └────┬────┘  └────┬────┘          │
│       └────────────┼────────────┘               │
│                    │                            │
│           ┌────────▼────────┐                   │
│           │  Copilot Bridge │                   │
│           └────────┬────────┘                   │
├────────────────────┼────────────────────────────┤
│                    │                            │
│  ┌─────────┐  ┌────▼────┐  ┌─────────┐          │
│  │   Git   │  │ Copilot │  │  File   │          │
│  │ History │  │   CLI   │  │ System  │          │
│  └─────────┘  └─────────┘  └─────────┘          │
└─────────────────────────────────────────────────┘

Why "Hambugsy"?

πŸ” Ham + πŸ› Bug + 🎩 Bugsy

  • Like a hamburger with layers - bugs hide between the test layer and code layer
  • We hunt bugs - finding who's guilty
  • Bugsy Siegel - the gangster who always found the guilty party

"Finding the bug in your stack"


Repository

Hambugsy

The CLI tool that tells you WHO is wrong: your test or your code.

License: MIT • Built with GitHub Copilot CLI • Node.js 18+

📦 View on npm • 🌐 Website • ⚑ Quick Start


Quick Install

npm install -g hambugsy
hambugsy analyze ./src


The Problem

Every developer knows this pain:

❌ FAILED: testCalculateDiscount
   Expected: 90
   Actual: 85

Now what? Is the test wrong? Is the code wrong? Did someone change the business logic? Is the test outdated?

You spend 30 minutes investigating only to find the test was written for the OLD discount logic.


The Solution

$ hambugsy analyze ./src/OrderService.java
πŸ” HAMBUGSY - Test Failure Diagnostics πŸ”
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  πŸ“ Method: calculateDiscount() @ line 32                          β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  ❌ FAILING TEST: testPremiumDiscount                              β”‚
β”‚                                                                    β”‚
β”‚  πŸ”¬ ANALYSIS:                                                      β”‚
β”‚  β”œβ”€β”€ Test expects: 10% discount (written: 2024-03-15)              β”‚
β”‚  └── Code applies: 15% discount (changed: 2024-11-22)              β”‚
β”‚                                                                    β”‚
β”‚  🎯 VERDICT: πŸ“œ OUTDATED TEST
…


What's Next

  • [x] VS Code Extension (included!)
  • [x] Auto-fix mode (hambugsy fix)
  • [ ] IntelliJ Plugin
  • [ ] Team analytics dashboard
  • [ ] Slack/Teams notifications

Try It Now

# Prerequisites
gh extension install github/gh-copilot

# Install Hambugsy
npm install -g hambugsy

# Run on your project
hambugsy analyze ./src

# See what's really causing your test failures

Feedback Welcome!

I'd love to hear your thoughts:

  • Is this useful for your workflow?
  • What languages/frameworks would you need?
  • Any features you'd want to see?

Drop a comment below or open an issue on GitHub! 👇


Built with ❀️ for the GitHub Copilot CLI Challenge
