The Problem That Started It All
It was 3 AM, and I was staring at yet another ChatGPT conversation, manually copying files from my project one by one. "Here's index.js... now here's utils.js... oh, and don't forget config.js..."
I was building an AI-powered feature and needed to give the model context about my entire codebase. But copy-pasting 20+ files? Every. Single. Time.
There had to be a better way.
That's when the idea hit me: What if I could merge my entire repository into a single file with one command?
Three months later, repomeld was born, and it's now helping thousands of developers prepare context for AI tools, conduct code reviews, and archive their projects.
The Journey: From 200 Lines to Production
Week 1: The MVP (Minimum Viable Product)
The first version was embarrassingly simple:
// Just 200 lines of synchronous code
function getAllFiles(dir) {
// Read all files recursively
// Skip node_modules
// Concatenate them
}
It worked... barely. It took 30 seconds to scan a medium-sized project and crashed on anything with binary files.
Week 3: Adding Real Features
I realized I was building something people actually wanted. The GitHub issues started rolling in:
- "Can you add .gitignore support?"
- "What about binary file detection?"
- "Make it faster!"
So I rebuilt everything.
The Technical Deep Dive
Here's what I learned building a production-grade CLI tool:
1. Performance Optimization: The 10x Improvement
The Problem: The initial version used synchronous fs.readdirSync, which blocked the event loop and made large scans painfully slow.
The Solution: Async iteration with intelligent caching.
// Before: Blocking and slow
const files = fs.readdirSync(dirPath);
for (const file of files) {
// Process each file...
}
// After: Async and 10x faster
const entries = await fs.readdir(currentDir, { withFileTypes: true });
await Promise.all(entries.map(async (entry) => {
// Process concurrently
}));
Result: Scanning 10,000 files went from 45 seconds to 3.2 seconds.
2. Binary Detection: The Tricky Part
Detecting binary files sounds simple until you realize UTF-8 text can contain null bytes and some binaries don't.
The Solution: A hybrid approach - a known-binary extension list plus content sampling, with results cached per file.
const binaryCache = new Map();
async function isBinaryFileFast(filePath) {
// Return the cached verdict if we have one
if (binaryCache.has(filePath)) return binaryCache.get(filePath);
// Quick extension check
const ext = path.extname(filePath).slice(1).toLowerCase();
if (BINARY_EXTENSIONS.has(ext)) {
binaryCache.set(filePath, true);
return true;
}
// Sample only the first 512 bytes instead of reading the whole file
const handle = await fs.open(filePath, 'r');
const { bytesRead, buffer } = await handle.read(Buffer.alloc(512), 0, 512, 0);
await handle.close();
const isBinary = buffer.subarray(0, bytesRead).includes(0); // Null byte = binary
binaryCache.set(filePath, isBinary);
return isBinary;
}
3. Cross-Platform Path Hell
Windows vs. Unix paths caused endless bugs. node_modules wasn't being ignored on Windows because the path separators were backslashes, so ignore patterns written with forward slashes never matched.
The Solution: Normalize everything.
const normalizePath = (p) => p.split(path.sep).join('/');
// Now "src\\utils\\index.js" becomes "src/utils/index.js"
4. The Recursion Problem
Users kept accidentally including repomeld's own output files, causing infinite loops and massive file bloat.
The Solution: Hard-coded ignore + pattern matching.
// Always ignore anything starting with "repomeld"
if (entry.name.startsWith('repomeld')) continue;
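That guard can be pulled out into a small predicate. The helper name below is illustrative, not repomeld's actual internal API:

```javascript
// Hypothetical helper: is this entry one of the tool's own artifacts?
// Covers repomeld_output.txt, repomeld_output__2.txt, repomeld_zips/, etc.
function isSelfOutput(name) {
  return name.startsWith('repomeld');
}
```

A prefix check is deliberately blunt: it is cheaper and safer than enumerating every output filename the tool might ever generate.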
Features That Made the Difference
1. Gitignore Support (Most Requested)
Respecting .gitignore was non-negotiable. I used the ignore package:
const ig = ignore();
ig.add(fs.readFileSync('.gitignore', 'utf8'));
if (ig.ignores('node_modules/lodash/index.js')) {
// Skip it
}
2. Three Output Styles
Users wanted flexibility:
- Banner: Clear visual separation with metadata
- Markdown: Perfect for pasting into AI tools
- Minimal: Just the code, nothing else
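A sketch of how the three styles might dispatch per file. The function and the exact banner layout are illustrative, not repomeld's actual formatter:

```javascript
// Render one file's contents in the chosen output style.
function renderFile(relPath, content, style) {
  switch (style) {
    case 'banner':
      return [
        '='.repeat(60),
        `FILE: ${relPath}`,
        '='.repeat(60),
        content,
      ].join('\n');
    case 'markdown':
      return `## ${relPath}\n\n\`\`\`\n${content}\n\`\`\``;
    case 'minimal':
    default:
      return content;
  }
}
```

Keeping each style a pure string transform makes adding a fourth style a one-case change.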
3. Smart Defaults with Overrides
{
"ignore": [
"node_modules", // Always ignored
"dist", // Build output
"package.json" // Can be overridden with --force-include
]
}
4. Auto-Numbered Backups
Never overwrite existing files:
repomeld_output.txt # First run
repomeld_output__2.txt # Second run
repomeld_output__3.txt # Third run
repomeld_zips/ # Automatic zip backups
Lessons Learned
1. Start with the CLI, Not the Library
Building the command-line interface first forced me to think about user experience from day one.
2. Test on Windows Early
Most of my early users were on Windows, but I developed on Mac. Big mistake. Add Windows to your CI pipeline immediately.
3. Feature Flags Are Your Friend
repomeld --dry-run # Preview without writing
repomeld --no-backup # Skip zip creation
repomeld --no-update-check # For CI/CD
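Boolean flags like these need very little machinery. A minimal parsing sketch (repomeld's real CLI may well use an argument-parser library instead):

```javascript
// Map the flags above onto an options object. Note the "no-" flags
// invert: their absence means the feature stays on by default.
function parseFlags(argv) {
  return {
    dryRun: argv.includes('--dry-run'),
    backup: !argv.includes('--no-backup'),
    updateCheck: !argv.includes('--no-update-check'),
  };
}
```

In a real entry point this would be called with `process.argv.slice(2)`.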
4. Documentation Is Not Optional
My initial README was three sentences. After expanding it to 400+ lines, downloads increased 5x.
What's Next?
repomeld v4.0 is in development with:
- Watch mode: Automatically rebuild when files change
- Diff views: Show what changed between runs
- Dependency graphs: Visualize file relationships
- AI prompts: Generate optimized prompts from your codebase
Try It Yourself
npm install -g repomeld
cd your-project
repomeld
That's it. You'll get a single file containing your entire codebase - perfect for AI context, code reviews, or archiving.
The Human Side
Building repomeld taught me that the best tools solve real problems simply. Not every CLI needs AI or blockchain or microservices. Sometimes, you just need to combine text files.
I'm currently available for freelance and full-time opportunities. If you need a developer who understands both the technical and human sides of building developer tools, let's talk.
Resources
Have a suggestion for repomeld? Open an issue on GitHub. Found a bug? PRs welcome. Want to hire me? Email me.
Happy coding!
Appendix: Architecture Diagram
repomeld/
├── bin/
│   └── cli.js               # CLI entry point
└── src/
    ├── core/
    │   ├── fileScanner.js   # Async file traversal
    │   ├── ignoreBuilder.js # Gitignore parsing
    │   ├── formatter.js     # Output formatting
    │   └── progress.js      # Progress indicator
    ├── utils/
    │   ├── helpers.js       # Utilities
    │   ├── constants.js     # Config
    │   └── backup.js        # Zip creation
    └── index.js             # Main orchestration
Key takeaway: Clean architecture and separation of concerns made the codebase maintainable as features grew.