Background and Motivation
AI-assisted development with tools like Claude Code needs high-speed code analysis.
The Python version reached its performance limits in large-scale projects.
Re-implemented in C++17 to achieve a 10-100x performance improvement.
Technical Choices
Parser Selection
From std::regex to PEGTL (Parsing Expression Grammar Template Library)
Escaping "regex hell."
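To illustrate why a grammar-based parser helps here: nested braces are not a regular language, so a single regular expression cannot reliably find where a function body ends, while a recursive PEG rule can. The sketch below hand-writes one such rule in plain C++17. It does not use the PEGTL API, and `match_block` and `body_length` are hypothetical names chosen for this example, not functions from nekocode.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper, not part of PEGTL or nekocode.
// Hand-written equivalent of the PEG rule:
//   block <- '{' (block / !'}' .)* '}'
// Returns the index one past the matching '}', or npos if the block is unbalanced.
inline std::size_t match_block(std::string_view src, std::size_t pos) {
    assert(pos <= src.size());
    if (pos >= src.size() || src[pos] != '{') return std::string_view::npos;
    ++pos;  // consume the opening '{'
    while (pos < src.size()) {
        if (src[pos] == '}') return pos + 1;  // closing brace of this block
        if (src[pos] == '{') {                // ordered choice: try a nested block first
            const std::size_t next = match_block(src, pos);
            if (next == std::string_view::npos) return next;
            pos = next;
        } else {
            ++pos;                            // any other character
        }
    }
    return std::string_view::npos;            // input ended before the block closed
}

// Length of the block starting at body_start, or 0 if it is unbalanced.
inline std::size_t body_length(std::string_view src, std::size_t body_start) {
    const std::size_t end = match_block(src, body_start);
    return end == std::string_view::npos ? 0 : end - body_start;
}
```

This sketch ignores braces inside string literals and comments. A real grammar would add rules for those, which is where a library such as PEGTL becomes more convenient than hand-written code.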
Parallel Processing Design
Leveraging std::execution::par_unseq
Controlling I/O thread count (--io-threads)
Optimizing CPU-bound processing
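One way to cap the number of threads used for file I/O is a small worker pool that pulls indices from a shared atomic counter. The following is a minimal sketch under that assumption. `run_bounded` is a hypothetical helper, and its `io_threads` parameter stands in for the value passed via `--io-threads`; the actual nekocode implementation may differ.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper: runs task(i) for every i in [0, count)
// on at most io_threads worker threads.
inline void run_bounded(std::size_t count, unsigned io_threads,
                        const std::function<void(std::size_t)>& task) {
    assert(io_threads > 0);
    const std::size_t workers = std::min<std::size_t>(io_threads, count);
    std::atomic<std::size_t> next{0};  // next unclaimed index
    std::vector<std::thread> pool;
    pool.reserve(workers);
    for (std::size_t w = 0; w < workers; ++w) {
        pool.emplace_back([&] {
            // Each worker claims one index at a time until none are left.
            for (std::size_t i = next.fetch_add(1); i < count; i = next.fetch_add(1)) {
                task(i);
            }
        });
    }
    for (auto& t : pool) t.join();  // all tasks are finished after this loop
}
```

A call such as `run_bounded(paths.size(), 4, read_file_at)` would read files with at most four threads, where `paths` and `read_file_at` are placeholders. The CPU-bound stage could then be handed to `std::for_each` with `std::execution::par_unseq` from `<execution>`; with libstdc++ the parallel policies typically depend on Intel TBB being available.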
Implementation Innovations
Language-Specific Hybrid Strategy
TypeScript: PEGTL + string parsing fallback
C++: Support for template and macro parsing
Python: Special handling for indentation-based syntax
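The "grammar first, string parsing as fallback" idea can be sketched as follows. This is an illustration only: `grammar_pass` is a hypothetical stand-in for a strict parser (it merely checks that braces balance), and `fallback_scan` is a plain substring count. Neither reflects the real TypeScript analyzer.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>

struct ScanResult {
    std::size_t functions;  // number of detected functions
    bool used_fallback;     // true when the strict pass rejected the input
};

// Fallback: count non-overlapping occurrences of a keyword with plain string search.
inline std::size_t fallback_scan(std::string_view src, std::string_view kw) {
    assert(!kw.empty());
    std::size_t n = 0;
    for (std::size_t p = src.find(kw); p != std::string_view::npos;
         p = src.find(kw, p + kw.size())) {
        ++n;
    }
    return n;
}

// Hypothetical strict pass: succeeds only if the braces balance,
// standing in for a grammar that either matches the whole input or fails.
inline std::optional<std::size_t> grammar_pass(std::string_view src) {
    long depth = 0;
    for (char c : src) {
        if (c == '{') ++depth;
        else if (c == '}' && --depth < 0) return std::nullopt;
    }
    if (depth != 0) return std::nullopt;
    return fallback_scan(src, "function ");
}

// Try the strict pass first; fall back to string scanning when it fails.
inline ScanResult analyze(std::string_view src) {
    if (auto n = grammar_pass(src)) return {*n, false};
    return {fallback_scan(src, "function "), true};
}
```

The point of the design is that malformed or unusual input still yields a best-effort result instead of an empty one, and the `used_fallback` flag lets the caller know the result is less reliable.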
Memory Optimization
180x speed improvement through session management
Caching strategy
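A session can avoid repeated work by analyzing each file once and answering later queries from memory. The sketch below assumes that model. `Session`, `function_count`, and `misses` are hypothetical names for this example, and the code makes no claim about how nekocode reaches the speedup quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical session: runs the analyzer once per path and reuses the result.
class Session {
public:
    using Analyzer = std::function<std::size_t(const std::string&)>;

    explicit Session(Analyzer analyzer) : analyzer_(std::move(analyzer)) {
        assert(static_cast<bool>(analyzer_));
    }

    // Returns the cached result for `path`; runs the analyzer only on a cache miss.
    std::size_t function_count(const std::string& path, const std::string& source) {
        const auto it = cache_.find(path);
        if (it != cache_.end()) return it->second;
        ++misses_;
        const std::size_t n = analyzer_(source);
        cache_.emplace(path, n);
        return n;
    }

    // Number of times the analyzer actually ran.
    std::size_t misses() const { return misses_; }

private:
    Analyzer analyzer_;
    std::unordered_map<std::string, std::size_t> cache_;
    std::size_t misses_ = 0;
};
```

This version never invalidates an entry, so it would return stale results if a file changed during the session. Keying the cache on a content hash or modification time is one way to handle that.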
Results and Measured Performance
Project | Number of Files | Detected Functions | Processing Time
TypeScript Compiler | 735 files | 2,362 functions | 1.9 minutes
lodash.js | 1 file | 489 functions | 0.7 seconds
nlohmann/json | 1 file | 254 functions | 0.5 seconds
What We Learned
The limitations of regular expressions and alternative methods
The practicality of C++17 parallel processing
The importance of a language-specific parsing strategy
Future Outlook
Support for more languages
Optimization specifically for Claude Code
Source Code
GitHub: https://github.com/moe-charm/nekocode
Video Demo
See it in action here: https://www.youtube.com/watch?v=I9Nij1KgTPw