DEV Community

OleksandraKordonets

Adding a Compress Feature to My CLI Tool

For my latest update, I added a --compress option to my Repository Context Packager, a CLI tool that packages repositories into readable text files for large language models. With this option, the tool compresses code content before packaging: it keeps mainly function signatures and their associated comments, and skips includes, imports, pragmas, and function bodies. Issue URL.
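
To make the idea concrete, here is a minimal Python sketch of this kind of compression for brace-based languages such as C++. It is not the tool's actual implementation. The function name `compress`, the prefix list, and the brace-counting approach are all assumptions chosen for illustration. The sketch keeps top-level comments and signatures, and drops includes, pragmas, imports, and anything inside braces.

```python
# Hypothetical sketch: line-based compression using brace depth.
# Limitation (by assumption): braces inside strings or comments are not handled.

SKIP_PREFIXES = ("#include", "#pragma", "import ")
COMMENT_PREFIXES = ("//", "/*", "*")


def compress(source: str) -> str:
    """Keep top-level comments and signatures; drop includes and bodies."""
    kept = []
    depth = 0  # brace nesting; > 0 means we are inside a body
    for line in source.splitlines():
        stripped = line.strip()
        if depth > 0:
            # Inside a body: only track braces, keep nothing.
            depth += stripped.count("{") - stripped.count("}")
            continue
        if not stripped or stripped.startswith(SKIP_PREFIXES):
            continue
        if stripped.startswith(COMMENT_PREFIXES):
            kept.append(stripped)
            continue
        if "{" in stripped:
            # The line opens a body: keep only the text before the first brace.
            signature = stripped.split("{", 1)[0].rstrip()
            if signature:
                kept.append(signature)
            depth += stripped.count("{") - stripped.count("}")
            continue
        kept.append(stripped)
    return "\n".join(kept)


SAMPLE = """#include <string>
#pragma once
// Adds two numbers.
int add(int a, int b) {
    return a + b;
}

/* Prints a greeting. */
void greet()
{
    // body comment
    std::cout << "hi";
}
"""

print(compress(SAMPLE))
```

Because everything inside braces is dropped, a sketch like this would also discard method declarations inside a class body, so a header that consists mostly of a class definition would come out nearly empty. A real implementation needs more than brace counting.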

I was inspired by a similar feature in Repomix. While exploring its web version, I noticed options for compressing code, removing comments, and stripping empty lines. I really liked how simple and effective those features were, so I decided to bring the same idea into my own project. To understand how Repomix achieved this, I studied its code, focusing mainly on the processContent and parseFile functions. That helped me trace the logic flow from reading a file to modifying its text and returning a processed version.

The first major challenge was that Repomix’s compression logic isn’t flawless, so I didn’t have a perfect reference to rely on. I tested its compression feature on my own repository, since I know my files well and could tell exactly what was being skipped. I noticed that it sometimes removed too much. For example, my Compressor.h file ended up completely empty after compression:
[Image: Repomix’s compression output]

This showed me that I needed to rethink how compression should work in my own implementation.

Another big issue came from trying to do too much at once. At first, I attempted to implement --compress, --remove-comments, and --remove-empty-lines all together, assuming they were closely related. In practice, that approach caused a huge number of bugs: fixing one thing would break another. Eventually, I decided to narrow my focus and implement only the --compress feature for now, leaving the other two for later. I’ve already created GitHub issues for those so I can revisit them in future updates.
Remove Comments Issue
Remove Empty Lines Issue
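
One way to keep such options from interfering with each other is to write each one as an independent pass over the text and chain only the passes the user asked for. The Python sketch below illustrates that idea for the two deferred options. The function names and the regex-based comment stripping are assumptions made for this example, not code from the tool, and the comment pass is naive: it does not account for comment markers inside string literals.

```python
import re


def remove_comments(source: str) -> str:
    """Strip /* ... */ block comments and // line comments (naive sketch)."""
    source = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)
    # Also consume spaces or tabs directly before a // comment.
    return re.sub(r"[ \t]*//[^\n]*", "", source)


def remove_empty_lines(source: str) -> str:
    """Drop lines that are empty or contain only whitespace."""
    return "\n".join(line for line in source.splitlines() if line.strip())


def apply_passes(source: str, passes) -> str:
    """Run each selected pass in order; every pass is text in, text out."""
    for transform in passes:
        source = transform(source)
    return source


code = "int x = 1; // counter\n\n/* note */\nint y = 2;\n"
print(apply_passes(code, [remove_comments, remove_empty_lines]))
```

Since each pass takes text and returns text, each can be tested alone before the passes are combined, which makes it easier to see which one introduced a bug.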

One thing I particularly like about my implementation, compared to Repomix’s, is the output format. I designed mine to be easier to read while still staying compact. For example:
My tool’s output:
[Image: My tool’s output]
Repomix’s output:
[Image: Repomix’s output]
While the Repomix approach might be more LLM-friendly in some ways, I personally prefer my version. It feels cleaner and easier for humans to read while still serving the purpose of summarizing code.

Overall, this feature taught me a lot about designing transformations carefully and not rushing to combine too many related ideas at once. My next step is to refine the compression logic further and eventually implement the other two options. Once those are in place, I plan to explore similar tools again to see what additional ideas I can bring into my project. My goal is to make the Repository Context Packager a truly powerful and customizable way to generate clean, readable summaries of any repository.
