DEV Community

ElshadHu


Token Count Tree Feature: From Repomix Inspiration to C++ Implementation

Here We Go Again

I added --token-count-tree and --token-count-tree [threshold] to my repo, inspired by Repomix. My current solution needs refactoring for better maintainability, so I kept one issue open that covers the need for tree-specific logic and new modules. I'll tackle it soon to make my feature as maintainable as Repomix's implementation. I closed the other two issues: one was about a utility function, and the other was about CLI parsing and display. Here's my current commit.

Similar Implementation Techniques

I added showTokenCountTree and tokenCountThreshold options to the cli module to parse the command. This part is very similar to Repomix's approach, which handles command-line option parsing in its cliRun module.
For orchestration, I'm using a similar pattern to Repomix: its defaultAction module accumulates options, much like I do in cli.cpp.
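
To make the optional threshold concrete, here is a minimal sketch of how a flag with an optional numeric argument could be parsed. It is not the code from cli.cpp: the CliOptions struct, parseThreshold(), and parseArgs() are hypothetical names, and only the two option names come from the feature described above.

```cpp
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical options struct; the two option fields mirror the names used in the post.
struct CliOptions {
    bool showTokenCountTree = false;
    std::size_t tokenCountThreshold = 0;  // 0 means "show every file"
    std::vector<std::string> paths;       // remaining positional arguments
};

// Returns the parsed value only if `s` consists entirely of decimal digits.
std::optional<std::size_t> parseThreshold(const std::string& s) {
    if (s.empty() || s.find_first_not_of("0123456789") != std::string::npos) {
        return std::nullopt;
    }
    try {
        return static_cast<std::size_t>(std::stoull(s));
    } catch (const std::out_of_range&) {
        return std::nullopt;  // too large to represent
    }
}

// Hypothetical parser: accumulates options while walking the argument list.
CliOptions parseArgs(const std::vector<std::string>& args) {
    CliOptions opts;
    for (std::size_t i = 0; i < args.size(); ++i) {
        if (args[i] == "--token-count-tree") {
            opts.showTokenCountTree = true;
            // The threshold is optional: consume the next argument only if it is numeric.
            if (i + 1 < args.size()) {
                if (auto threshold = parseThreshold(args[i + 1])) {
                    opts.tokenCountThreshold = *threshold;
                    ++i;
                }
            }
        } else {
            opts.paths.push_back(args[i]);
        }
    }
    return opts;
}
```

One trade-off of this sketch: a positional argument that happens to be all digits (say, a directory named 100) placed right after the flag would be read as the threshold.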

Future Enhancement and Differences

Repomix's approach:
Repomix uses a tree data structure to organize files by directory:

export interface TreeNode {
  _files?: FileTokenInfo[];
  _tokenSum?: number;
  [key: string]: TreeNode | FileTokenInfo[] | number | undefined;
}

This allows them to display nested directories with cumulative token counts.
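
For comparison, a rough C++ analogue of that structure might look like the sketch below. It is an assumption about how such a tree could be built, not a port of Repomix's code and not something that exists in my repo yet: TreeNode, insertFile(), and printTree() are hypothetical, and paths are assumed to be relative and already normalized.

```cpp
#include <cstddef>
#include <filesystem>
#include <map>
#include <memory>
#include <ostream>
#include <string>

// Hypothetical C++ counterpart of a directory tree with token counts.
struct TreeNode {
    std::map<std::string, std::size_t> files;                   // file name -> tokens
    std::map<std::string, std::unique_ptr<TreeNode>> children;  // subdirectory name -> node
    std::size_t tokenSum = 0;                                   // tokens in this directory and below
};

// Inserts one file and adds its tokens to every directory along its path.
// Assumes each file is inserted only once.
void insertFile(TreeNode& root, const std::filesystem::path& file, std::size_t tokens) {
    TreeNode* node = &root;
    node->tokenSum += tokens;
    for (const auto& part : file.parent_path()) {
        auto& child = node->children[part.string()];
        if (!child) child = std::make_unique<TreeNode>();
        node = child.get();
        node->tokenSum += tokens;
    }
    node->files[file.filename().string()] = tokens;
}

// Prints directories (with cumulative sums) before the files of each level.
void printTree(const TreeNode& node, std::ostream& out, std::size_t depth = 0) {
    const std::string indent(depth * 2, ' ');
    for (const auto& [name, child] : node.children) {
        out << indent << name << "/ (" << child->tokenSum << " tokens)\n";
        printTree(*child, out, depth + 1);
    }
    for (const auto& [name, tokens] : node.files) {
        out << indent << name << " (" << tokens << " tokens)\n";
    }
}
```

Updating the sums during insertion keeps printing a single pass; computing them in a separate post-order walk would work just as well.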

My current approach:

I used a simpler flat structure for the prototype. I added writeTokenCountTree() to the renderer module and used an ordered map to store file paths with their token counts. I also added two helper functions to the utils module: countTokens() and displayTokenTree().

std::map<std::filesystem::path, std::size_t> fileTokens;

I chose to ship a working feature quickly, accepting technical debt that I'll address through refactoring in future iterations. In my current implementation, the renderer module has too many responsibilities. I plan to fix this by creating a dedicated module that handles all token tree logic, from calculating tokens to building the tree structure, which should bring the design closer to Repomix's and improve maintainability.
