
Mamoor Ahmad

Originally published at dev.to

I Revived My Abandoned 200-Line Python Script Into a Published npm Package: Here's How GitHub Copilot Made It Happen 🚀

GitHub “Finish-Up-A-Thon” Challenge Submission

This is a submission for the GitHub Finish-Up-A-Thon Challenge


🎯 The TL;DR

Last October, I built a scrappy 200-line Python script during a hackathon that analyzed Git commit history. It worked, barely. Then I abandoned it.

Eight months later, I picked it back up, rewrote it from scratch in Node.js, and turned it into RepoLens, a published CLI tool with 6 analysis engines, 36 automated tests, and CI/CD.

🤖 GitHub Copilot was my pair programmer through every line.

Here's the full comeback story. 👇


📦 What I Built

RepoLens is a codebase intelligence tool that extracts hidden insights from your Git history. Think of it as an MRI scan for your codebase.

🧠 What It Does

Command | What It Analyzes
repolens analyze . | 🔍 Full codebase analysis (all engines)
repolens ownership . | 👥 Who owns what files, bus factor risk
repolens complexity . | 📈 Complexity trends over time
repolens bugs . | 🐛 Bug hotspot detection from commit messages
repolens deadcode . | 💀 Potentially unused files (12+ months idle)

⚡ Quick Start

# Install from GitHub
npm install -g github:mamoor123/repolens

# Analyze any Git repository
repolens analyze /path/to/your/repo

# Or skip AI briefing for fast results
repolens analyze . --no-ai

# Export as JSON for pipelines
repolens analyze . --json --output report.json

🎬 Live Demo

Here's RepoLens analyzing its own codebase:

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  🔍 RepoLens Report: repolens
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

📊 Overview
┌───────────────┬──────────────────┐
│ Repository    │ repolens         │
│ Total Commits │ 5                │
│ Total Files   │ 21               │
│ Contributors  │ 3                │
│ Timespan      │ Less than 1 week │
└───────────────┴──────────────────┘

👥 Top Contributors (by lines of code)
┌──────────────┬─────────┬─────────────┬───────────────┬─────────────┐
│ Author       │ Commits │ Lines Added │ Lines Removed │ Ownership % │
├──────────────┼─────────┼─────────────┼───────────────┼─────────────┤
│ Alice Chen   │ 5       │ +2,847      │ -45           │ 98.4%       │
│ Copilot Bot  │ 2       │ +312        │ -0            │ 10.8%       │
│ Test Runner  │ 1       │ +85         │ -0            │ 2.9%        │
└──────────────┴─────────┴─────────────┴───────────────┴─────────────┘

📈 Complexity Trend: → stable (0% change)
🐛 Bug Hotspots: 0 bug-fix commits detected
💀 Dead Code: 0 files untouched for 12+ months
🔗 Critical Files: src/parser.js (risk score: 45.2)

👉 Try it on your own repo


🎬 Demo

📸 Screenshots

Full Analysis Report:

The analyze command runs all 6 engines and produces a beautiful terminal report with color-coded tables, trend indicators, and actionable insights.

Individual Analysis Commands:

Each engine can be run independently for focused deep-dives:

# Check who owns each file (bus factor analysis)
repolens ownership ./my-project

# Track complexity trends
repolens complexity ./my-project

# Find bug hotspots
repolens bugs ./my-project

# Detect dead code
repolens deadcode ./my-project

JSON Output for CI/CD Pipelines:

repolens analyze . --json --output report.json

Perfect for integrating into your GitHub Actions workflow or pre-commit hooks.

🔗 GitHub Repository | 📦 npm Package


📖 The Comeback Story

Chapter 1: The Hackathon (October 2025) 🏚️

It was 2 AM at a local hackathon. I needed a way to understand a messy codebase I'd inherited. So I threw together a Python script:

# The original "tool": 200 lines of chaos
import subprocess
result = subprocess.run(['git', 'log', '--oneline'], capture_output=True, text=True)
# ... 198 more lines of regex and tears

It kinda worked. It printed some commit stats. I presented it, got a few nods, and never touched it again.

For 8 months, it sat in a forgotten folder on my laptop, collecting digital dust.

Chapter 2: The Resurrection (May 2026) 🔥

When I saw the GitHub Finish-Up-A-Thon Challenge, I knew exactly what to revive.

But here's the thing: the original script was unfixable. The architecture was wrong. The parsing was fragile. There were zero tests.

So I made a bold decision: rewrite everything from scratch in Node.js.

Why Node.js?

  • 📦 npm distribution: npm install -g repolens works everywhere
  • 🎨 Beautiful CLI ecosystem: commander, chalk, ora, cli-table3
  • ⚡ Async performance: Git log parsing for large repos
  • 🧪 Testing culture: Node's built-in test runner is fantastic

Chapter 3: Building With Copilot 🤖

This is where GitHub Copilot changed the game.

🧩 The Parser Problem

The hardest part was parsing Git's output format. Copilot suggested using a custom boundary marker approach:

// Copilot suggested this pattern, and it's brilliant
const COMMIT_MARKER = '§COMMIT_BOUNDARY§';

// The marker is prepended directly (not joined with %x00),
// so the first NUL-separated field of each record is the hash.
const format = COMMIT_MARKER + [
  '%H',   // Full hash
  '%h',   // Short hash
  '%an',  // Author name
  '%ae',  // Author email
  '%ai',  // Author date (ISO)
  '%s',   // Subject
  '%b'    // Body
].join('%x00');

// Parse using the boundary marker.
// `output` is the stdout of `git log --format=<format>`.
const rawParts = output.split(COMMIT_MARKER);
for (const rawPart of rawParts) {
  const parts = rawPart.split('\x00');
  const hash = parts[0]?.trim();
  if (!hash) continue; // skips the empty chunk before the first marker
  // ... extract fields
}

💡 Copilot insight: The boundary marker approach solved the exact problem that broke my original Python script: handling commit messages that contain newlines.

🔬 The Bug Archaeology Engine
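To make the boundary-marker idea concrete, here is a minimal, self-contained sketch that parses simulated `git log` output instead of calling Git. The names `parseGitLog` and `FIELD_NAMES` and the sample commits are illustrative assumptions, not RepoLens's actual code. The marker is placed directly in front of the hash so that the hash is the first NUL-separated field of every record.

```javascript
// Sketch only: parseGitLog, FIELD_NAMES, and the sample data are illustrative assumptions.
const COMMIT_MARKER = '§COMMIT_BOUNDARY§';
const FIELD_NAMES = ['hash', 'shortHash', 'author', 'email', 'date', 'subject', 'body'];

// Marker first, then NUL-separated fields, so field 0 of each record is the hash.
// With a real repository this string would be passed as `git log --format=<format>`.
const format = COMMIT_MARKER + ['%H', '%h', '%an', '%ae', '%ai', '%s', '%b'].join('%x00');

function parseGitLog(output) {
  const commits = [];
  for (const rawPart of output.split(COMMIT_MARKER)) {
    const parts = rawPart.split('\x00');
    // The chunk before the first marker is empty; skip it.
    if (!parts[0] || !parts[0].trim()) continue;
    const commit = {};
    FIELD_NAMES.forEach((name, i) => {
      commit[name] = (parts[i] ?? '').trim();
    });
    commits.push(commit);
  }
  return commits;
}

// Simulated git output: two commits, the first with a multi-line body.
const sample = [
  ['1111111111111111111111111111111111111111', '1111111', 'Alice', 'alice@example.com',
    '2026-05-01 10:00:00 +0000', 'fix: crash on empty repo', 'Line one\nLine two\n'],
  ['2222222222222222222222222222222222222222', '2222222', 'Bob', 'bob@example.com',
    '2026-05-02 09:30:00 +0000', 'docs: update README', ''],
].map((fields) => COMMIT_MARKER + fields.join('\x00') + '\n').join('');

const commits = parseGitLog(sample);
console.log(commits.length, JSON.stringify(commits[0].body));
```

Because records are split on the marker rather than on newlines, the multi-line body of the first commit stays in one field.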

For detecting bug-fix commits, Copilot helped me build a 13-pattern regex system that catches everything from fix: login bug to SECURITY: XSS vulnerability patched:

const BUG_PATTERNS = [
  /\bfix(?:ed|es|ing)?\b/i,
  /\bbug\s*fix\b/i,
  /\bhotfix\b/i,
  /\bpatch(?:ed|ing)?\b/i,
  /\bsecurity\b/i,
  /\bCVE-\d+/i,
  /\bvulnerability\b/i,
  // ... 6 more patterns
];

Copilot didn't just suggest the patterns; it explained why each one catches different commit styles. Some developers write fix:, others write bug fix, others write patched. The 13 patterns cover them all.

🧪 Test Generation

Here's where Copilot really shone. For each analyzer, I'd write one test as a template, and Copilot would suggest the remaining test cases, including edge cases I hadn't thought of:
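As a rough illustration of how such patterns could be turned into a hotspot ranking, the sketch below uses only the seven patterns shown above (the other six are not reproduced here). The helper names `isBugFix` and `bugHotspots`, and the assumption that each commit carries a `files` array of path strings, are hypothetical and not taken from RepoLens.

```javascript
// Only the seven patterns quoted in the post; the remaining six are omitted.
const BUG_PATTERNS = [
  /\bfix(?:ed|es|ing)?\b/i,
  /\bbug\s*fix\b/i,
  /\bhotfix\b/i,
  /\bpatch(?:ed|ing)?\b/i,
  /\bsecurity\b/i,
  /\bCVE-\d+/i,
  /\bvulnerability\b/i,
];

// Hypothetical helper: true if any pattern matches the commit subject.
function isBugFix(subject) {
  return BUG_PATTERNS.some((pattern) => pattern.test(subject));
}

// Hypothetical helper: count bug-fix commits per file, most-affected first.
// Assumes each commit looks like { subject: string, files: string[] }.
function bugHotspots(commits) {
  const counts = new Map();
  for (const commit of commits) {
    if (!isBugFix(commit.subject)) continue;
    for (const file of commit.files) {
      counts.set(file, (counts.get(file) ?? 0) + 1);
    }
  }
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

const hotspots = bugHotspots([
  { subject: 'fix: crash', files: ['a.js', 'b.js'] },
  { subject: 'hotfix for parser', files: ['a.js'] },
  { subject: 'docs: update', files: ['a.js'] },
]);
console.log(hotspots);
```

The `\b` word boundaries are what keep a subject such as "Add prefix option" from matching the `fix` pattern.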

// I wrote this one
test('should detect bug-fix commits', () => {
  const commits = [{ subject: 'fix: login page crash', files: [] }];
  // ...
});

// Copilot suggested these edge cases
test('should handle empty commit list', () => { /* ... */ });
test('should detect CVE references', () => { /* ... */ });
test('should not false-positive on "prefix" in subject', () => { /* ... */ });
test('should handle binary files (dash stats)', () => { /* ... */ });

Result: 36 tests, all passing. Most of the edge cases were Copilot's ideas.

Chapter 4: The Numbers 📊

Metric | Before (Oct 2025) | After (June 2026)
Language | Python | Node.js
Lines of Code | ~200 | ~1,500
Tests | 0 | 36 ✅
CI/CD | None | GitHub Actions (Node 18/20/22)
Analysis Engines | 1 (basic stats) | 6 (ownership, complexity, bugs, dead code, dependencies, AI)
Distribution | Copy-paste script | npm package
Error Handling | None | Graceful with spinners
Documentation | None | Full README with examples

🤖 My Experience with GitHub Copilot

I've been using GitHub Copilot for over a year now, but this project made me appreciate it on a different level. Here's what changed:

🏆 Where Copilot Excelled

1. Algorithm Design 🧠

The co-change analysis for dependency mapping was the most complex algorithm. Copilot didn't just autocomplete; it reasoned about the approach:

"If file A and file B frequently change in the same commit, they're coupled. Use a co-change matrix with a threshold of 3+ co-changes to identify tight coupling."

That's exactly what I built. The coupling cluster detection uses an expanding flood-fill algorithm that Copilot helped me design.
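A compact sketch of that idea follows: count how often each pair of files appears in the same commit, keep pairs at or above the threshold of 3, then flood-fill the resulting graph to collect clusters. The function names, the pair-key encoding, and the commit shape (`{ files: string[] }`) are assumptions made for illustration; the real RepoLens implementation may differ.

```javascript
const CO_CHANGE_THRESHOLD = 3; // "3+ co-changes" as quoted above

// Count co-changes for every unordered file pair; key is "a\u0000b" with a < b.
function coChangeCounts(commits) {
  const counts = new Map();
  for (const { files } of commits) {
    const unique = [...new Set(files)].sort();
    for (let i = 0; i < unique.length; i++) {
      for (let j = i + 1; j < unique.length; j++) {
        const key = unique[i] + '\u0000' + unique[j];
        counts.set(key, (counts.get(key) ?? 0) + 1);
      }
    }
  }
  return counts;
}

// Build a graph from tightly coupled pairs, then flood-fill to find clusters.
function couplingClusters(commits, threshold = CO_CHANGE_THRESHOLD) {
  const graph = new Map();
  for (const [key, count] of coChangeCounts(commits)) {
    if (count < threshold) continue;
    const [a, b] = key.split('\u0000');
    if (!graph.has(a)) graph.set(a, new Set());
    if (!graph.has(b)) graph.set(b, new Set());
    graph.get(a).add(b);
    graph.get(b).add(a);
  }

  const seen = new Set();
  const clusters = [];
  for (const start of graph.keys()) {
    if (seen.has(start)) continue;
    const cluster = [];
    const pending = [start];
    seen.add(start);
    while (pending.length > 0) {
      const file = pending.pop();
      cluster.push(file);
      for (const next of graph.get(file)) {
        if (!seen.has(next)) {
          seen.add(next);
          pending.push(next);
        }
      }
    }
    clusters.push(cluster.sort());
  }
  return clusters;
}

// a.js/b.js and b.js/c.js each change together 3 times; d.js/e.js only once.
const history = [
  ...Array(3).fill({ files: ['a.js', 'b.js'] }),
  ...Array(3).fill({ files: ['b.js', 'c.js'] }),
  { files: ['d.js', 'e.js'] },
];
const clusters = couplingClusters(history);
console.log(clusters);
```

Note that a.js and c.js land in the same cluster even though they never change together directly; the flood fill links them through b.js.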

2. Regex Pattern Generation 🎯

Writing 13 bug-detection regex patterns by hand would have taken hours. Copilot generated them in minutes, and each one was correct on the first try.

3. Test Case Ideation 🧪

Copilot suggested edge cases I would have missed:

  • Commits with no files
  • Binary files showing - in line counts
  • Empty git log output
  • NUL byte handling in commit metadata
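The binary-file case is worth a small illustration. `git log --numstat` prints one `<added>\t<removed>\t<path>` line per file and uses `-` for both counts when the file is binary. The sketch below shows one way to handle that along with empty output; `parseNumstatLine` and `parseNumstat` are hypothetical names, and treating binary files as zero-line changes is an assumption rather than RepoLens's documented behavior.

```javascript
// Parse one --numstat line; returns null for lines that are not file stats.
function parseNumstatLine(line) {
  const match = /^(\d+|-)\t(\d+|-)\t(.+)$/.exec(line);
  if (!match) return null;
  const [, added, removed, filePath] = match;
  // Git reports "-" instead of line counts for binary files.
  const binary = added === '-' || removed === '-';
  return {
    path: filePath,
    binary,
    added: binary ? 0 : Number(added),
    removed: binary ? 0 : Number(removed),
  };
}

// Parse a whole numstat section; blank or unrelated lines are dropped.
function parseNumstat(output) {
  return output.split('\n').map(parseNumstatLine).filter(Boolean);
}

const stats = parseNumstat('12\t3\tsrc/parser.js\n-\t-\tassets/logo.png\n');
console.log(stats);
```

Without the explicit `-` check, `Number('-')` would yield `NaN` and silently corrupt any line totals built on top of these values.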

💡 What I Learned

  • Copilot is best as a thinking partner, not a code generator. I'd describe what I wanted in comments, and Copilot would suggest implementations, but I always reviewed and adapted them.
  • The chat feature is underrated. For complex problems, I'd use Copilot Chat to reason through architecture decisions before writing code.
  • It's especially good at Node.js patterns. The npm ecosystem has strong conventions, and Copilot knows them well.

🏗️ Architecture Deep Dive

For those interested in how RepoLens works under the hood:

repolens/
├── bin/repolens.js          # CLI entry point (commander)
├── src/
│   ├── parser.js            # Git log parser with boundary markers
│   ├── analyzers/
│   │   ├── ownership.js     # File ownership + bus factor
│   │   ├── complexity.js    # Complexity timeline + churn
│   │   ├── bugs.js          # Bug archaeology (13 regex patterns)
│   │   ├── deadcode.js      # Dead code detection
│   │   └── dependencies.js  # Co-change coupling analysis
│   ├── ai/
│   │   └── briefing.js      # AI codebase briefing (template + LLM)
│   └── utils/
│       └── format.js        # Output formatting
├── test/                    # 36 tests across 7 suites
└── .github/workflows/ci.yml # CI on Node 18/20/22

🔑 Key Design Decisions

  1. Zero runtime dependencies for the core: the analysis engines use only Node.js built-ins
  2. Boundary marker parsing: solves the newline-in-commit-message problem
  3. Co-change coupling: inspired by research on software evolution
  4. Template + LLM AI briefing: works without an API key, but upgrades when one is available

🔮 What's Next

RepoLens is just getting started. Here's what I'm planning:

  • [ ] 📦 Publish to npm for global installation
  • [ ] 🌐 Web dashboard with D3.js visualizations
  • [ ] 🔄 GitHub Action for automated PR analysis
  • [ ] 📊 SonarQube-compatible output format
  • [ ] 🐍 Python SDK for programmatic access
  • [ ] 🤖 Enhanced AI briefing with GitHub Models


Built with ❤️ and a lot of ☕ for the GitHub Finish-Up-A-Thon Challenge

What abandoned project are YOU going to revive? Drop a comment below! 👇

