DEV Community

Kyle Homen


Reading the code: Repomix

After spending some time with the [Repomix] tool and looking for features to implement in my own [project], I noticed a checkbox for removing comments from the final output. This seemed like a useful feature to have, so I read through the code to find out how it was implemented. I used AI to get an overview of the code first, then dug deeper on my own.

What does the feature do?

The comment removal feature strips comments from source code files when packing a repository. For example:

// This is a comment
function hello() {
  /* Another comment */
  return "Hello!";
}

becomes:

function hello() {
  return "Hello!";
}

This reduces the token count when feeding code into LLMs, so it's definitely a useful feature to implement.

How it works

removeComments is first defined in configSchema.ts as an optional boolean.

removeComments: z.boolean().optional(),

The config gets merged in configLoad.ts, where values are overwritten in order of precedence: the CLI argument (--remove-comments) overrides the config file (repomix.config.json), which overrides the default value. This is similar to how I implemented the TOML config in a previous lab for one of my lab partners.
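To make the precedence order concrete, here is a minimal sketch of that kind of merge. It is not Repomix's actual code: the `OutputOptions` type, the `mergeOutput` function, and the default of `false` are all assumptions made for illustration.

```typescript
// Hypothetical, simplified shape; the real config has many more fields.
interface OutputOptions {
  removeComments?: boolean;
}

// Assumed default value for this sketch.
const defaultOutput: Required<OutputOptions> = { removeComments: false };

// Later sources win, but only for keys they actually set.
// `??` falls through on undefined, so an explicit `false` from the CLI is kept.
function mergeOutput(
  fileConfig: OutputOptions,
  cliConfig: OutputOptions,
): Required<OutputOptions> {
  return {
    removeComments:
      cliConfig.removeComments ??
      fileConfig.removeComments ??
      defaultOutput.removeComments,
  };
}

// The config file enables the feature and the CLI says nothing,
// so the value from the file is used.
const merged = mergeOutput({ removeComments: true }, {});
console.log(merged.removeComments);
```

Using `??` rather than `||` matters here: with `||`, a deliberate `false` from a higher-precedence source would be ignored in favour of a lower-precedence `true`.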

I managed to stumble upon the project wiki, which gives a lot of valuable insight, such as the CLI's core components:

Core components

The wiki mentions that the CLI uses Commander.js for argument parsing and supports over 30 command-line options.

They also seem to use worker-based processing via Tinypool to process files in parallel. I'm not too familiar with parallel processing, so I don't fully understand this, but it's a neat bit of optimization even for something that already seems fast enough.

Moving on, I found that files are processed in fileProcessContent.ts, which checks whether comments should be removed and, if so, calls the manipulator.

  if (manipulator && config.output.removeComments) {
    processedContent = manipulator.removeComments(processedContent);
  }
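The check above only runs when a manipulator exists for the file. A rough sketch of how that could fit together is below. The `FileManipulator` interface, the extension lookup table, and the `getManipulator` and `processContent` names are my own assumptions for illustration, and the toy manipulator only drops whole-line `//` comments.

```typescript
// Hypothetical minimal interface for this sketch.
interface FileManipulator {
  removeComments(content: string): string;
}

// Toy manipulator: drops lines that contain nothing but a `//` comment.
const slashLineManipulator: FileManipulator = {
  removeComments: (content) =>
    content
      .split("\n")
      .filter((line) => !line.trim().startsWith("//"))
      .join("\n"),
};

// Assumed lookup table keyed by file extension.
const manipulators: Record<string, FileManipulator> = {
  ".js": slashLineManipulator,
  ".ts": slashLineManipulator,
};

// Returns undefined for file types with no registered manipulator.
function getManipulator(filePath: string): FileManipulator | undefined {
  const dot = filePath.lastIndexOf(".");
  return dot === -1 ? undefined : manipulators[filePath.slice(dot)];
}

function processContent(
  filePath: string,
  content: string,
  removeComments: boolean,
): string {
  const manipulator = getManipulator(filePath);
  let processedContent = content;
  // Same guard as the snippet above: both conditions must hold.
  if (manipulator && removeComments) {
    processedContent = manipulator.removeComments(processedContent);
  }
  return processedContent;
}
```

With this shape, files of an unknown type pass through untouched even when comment removal is switched on.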

The manipulators themselves live in fileManipulate.ts, with a different one for each file type. For example, this one handles docstrings and hash comments:

  removeComments(content: string): string {
    let result = this.removeDocStrings(content);
    result = this.removeHashComments(result);
    return rtrimLines(result);
  }

Each manipulator defines its own way to remove comments and docstrings.
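To get a feel for what those helper methods might do, I wrote a naive version for a language with `#` comments and triple-quoted docstrings. This is my own sketch, not the code from fileManipulate.ts: the class name `HashCommentManipulator` is made up, the bodies of `removeDocStrings`, `removeHashComments`, and `rtrimLines` are guesses at the behaviour, and it removes every triple-quoted string, even ones used as ordinary values.

```typescript
// Assumed behaviour: strip trailing whitespace from every line.
function rtrimLines(content: string): string {
  return content
    .split("\n")
    .map((line) => line.replace(/\s+$/, ""))
    .join("\n");
}

class HashCommentManipulator {
  // Naive: deletes every triple-quoted string, docstring or not.
  removeDocStrings(content: string): string {
    return content.replace(/"""[\s\S]*?"""|'''[\s\S]*?'''/g, "");
  }

  // Cuts each line at the first `#` that is not inside a quoted string.
  removeHashComments(content: string): string {
    return content
      .split("\n")
      .map((line) => {
        let quote: string | null = null;
        for (let i = 0; i < line.length; i++) {
          const ch = line.charAt(i);
          if (quote !== null) {
            if (ch === "\\") {
              i++; // skip the escaped character
            } else if (ch === quote) {
              quote = null; // closing quote
            }
          } else if (ch === '"' || ch === "'") {
            quote = ch; // opening quote
          } else if (ch === "#") {
            return line.slice(0, i);
          }
        }
        return line;
      })
      .join("\n");
  }

  // Same three steps as the snippet above.
  removeComments(content: string): string {
    let result = this.removeDocStrings(content);
    result = this.removeHashComments(result);
    return rtrimLines(result);
  }
}
```

The order matters in this sketch: docstrings go first so that a `#` inside one is never mistaken for a comment, and trimming goes last to clean up the whitespace left in front of removed comments.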
