DEV Community

Roman Kondratev

Introducing RCParsing: a brand-new .NET framework for DSLs and data scrapers

If you've ever built a parser in .NET, you know the struggle: choosing between performance and readability, wrestling with token priorities, or debugging cryptic error messages. What if you could have it all: a parser that's both blazing fast and delightful to use?

Meet RCParsing, a revolutionary lexerless parsing library that puts developer experience first without compromising on performance.

Why Another Parsing Library?

Most parsing libraries force you to trade one strength for another. RCParsing breaks this pattern by offering:

  • ๐Ÿ Hybrid Power: Unique barrier token support for indent-sensitive languages like Python and YAML
  • โ˜„๏ธ Incremental Parsing: Edit large documents with instant feedback โ€“ perfect for LSP servers
  • ๐ŸŒ€ Lexerless Freedom: No more token priority headaches โ€“ parse directly from raw text
  • ๐ŸŽจ Fluent API: Write parsers that read like clean BNF grammars
  • ๐Ÿ› Superior Debugging: Get detailed error messages with stack/walk traces and precise source locations

See It in Action

Simple Math Parser

var builder = new ParserBuilder();
builder.Settings.SkipWhitespaces();

builder.CreateMainRule("expression")
    .Number<double>()
    .LiteralChoice("+", "-")
    .Number<double>()
    .Transform(v => {
        var value1 = v.GetValue<double>(index: 0);
        var op = v.GetText(index: 1);
        var value2 = v.GetValue<double>(index: 2);
        return op == "+" ? value1 + value2 : value1 - value2;
    });

var parser = builder.Build();
var result = parser.Parse("10 + 15").GetValue<double>();
Console.WriteLine(result); // 25

Python-like Indentation Parsing

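// Emit INDENT/DEDENT barrier tokens from leading whitespace (4 spaces per level)
builder.BarrierTokenizers
    .AddIndent(indentSize: 4, "INDENT", "DEDENT");

// A block is an INDENT, one or more statements, then a DEDENT.
// The "statement" rule is defined elsewhere in the grammar.
builder.CreateRule("block")
    .Token("INDENT")
    .OneOrMore(b => b.Rule("statement"))
    .Token("DEDENT");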

This killer feature lets you parse Python-style indentation naturally, something most .NET parsing libraries struggle with.
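To make the INDENT/DEDENT idea concrete, here is a minimal standalone sketch of how indentation changes can be turned into block markers before the grammar ever sees them. It is not RCParsing's implementation and does not use its API: `IndentSketch`, `Tokenize`, and the `LINE(...)` marker are hypothetical names made up for illustration, and the sketch assumes space-only indentation in fixed steps of `indentSize`.

```csharp
using System;
using System.Collections.Generic;

public static class IndentSketch
{
    // Hypothetical helper (not part of RCParsing): converts changes in
    // leading-space depth into INDENT/DEDENT markers, one per level.
    public static List<string> Tokenize(string source, int indentSize = 4)
    {
        var tokens = new List<string>();
        int level = 0; // current nesting depth

        foreach (var rawLine in source.Split('\n'))
        {
            var line = rawLine.TrimEnd('\r');
            if (line.Trim().Length == 0) continue; // blank lines do not change nesting

            int spaces = line.Length - line.TrimStart(' ').Length;
            if (spaces % indentSize != 0)
                throw new FormatException(
                    $"Indentation of {spaces} spaces is not a multiple of {indentSize}.");
            int newLevel = spaces / indentSize;

            // Open or close blocks until the depth matches this line.
            for (; level < newLevel; level++) tokens.Add("INDENT");
            for (; level > newLevel; level--) tokens.Add("DEDENT");

            tokens.Add("LINE(" + line.Trim() + ")");
        }

        // Close any blocks still open at the end of the input.
        for (; level > 0; level--) tokens.Add("DEDENT");
        return tokens;
    }

    public static void Main()
    {
        var source = "if x:\n    y = 1\n    if z:\n        w = 2\nprint(y)";
        Console.WriteLine(string.Join(" ", Tokenize(source)));
        // LINE(if x:) INDENT LINE(y = 1) LINE(if z:) INDENT LINE(w = 2) DEDENT DEDENT LINE(print(y))
    }
}
```

Once indentation has been reduced to explicit markers like these, a rule such as "block" only has to match INDENT, its statements, and DEDENT, which is the shape of the grammar shown above.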

Performance That Competes

Don't let the great API fool you: RCParsing is built for speed. Check out these benchmarks against established libraries:

JSON Parsing (Combinator Style)

Method               Mean       Ratio  Allocated
JsonShort_RCParsing   1,542 ns  1.00   2,280 B
JsonShort_Parlot      2,326 ns  1.51   1,960 B
JsonShort_Pidgin     11,427 ns  7.41   3,664 B

Expression Evaluation

Method                            Mean      Ratio  Allocated
ExpressionShort_RCParsing         2,527 ns  1.00   3,736 B
ExpressionShort_TokenCombination    474 ns  0.19     656 B
ExpressionShort_Parlot              591 ns  0.23     896 B

The "Token Combination" style shows what's possible when you bypass AST construction and compute results directly: about five times faster, with far fewer allocations (656 B versus 3,736 B here).

Python 3.13

Method                         Mean       Ratio  Allocated
PythonBig_RCParsing            35,644 us  1.00   37397.53 KB
PythonBig_RCParsing_Optimized   5,249 us  0.15    3863.07 KB
PythonBig_ANTLR                 5,631 us  0.16    6699.11 KB

Yes, you read that right: this library can handle grammars ranging from A+B to the entire Python 3.13 grammar!

Real-World Ready

Advanced Error Messages

RCParsing gives you incredibly detailed error information:

The line where the error occurred (position 130):
    "tags": ["tag1", "tag2", "tag3"],,
                   line 5, column 35 ^

',' is unexpected character, expected one of:
  'string'
  literal '}'

['string'] Stack trace (top call recently):
- Sequence 'pair':
    'string' <-- here
    literal ':'
    'value'

... 316 hidden parsing steps. Total: 356 ...
[ENTER]   pos:128   literal '//'
[FAIL]    pos:128   literal '//' failed to match: '],,\r\n\t"isActive...'
[FAIL]    pos:128   Sequence... failed to match: '],,\r\n\t"isActive...'
...
[FAIL]    pos:0     'value' failed to match: '{\r\n\t"id": 1,\r\n\t...'
[FAIL]    pos:0     'content' failed to match: '{\r\n\t"id": 1,\r\n\t...'
... End of walk trace ...

Incremental Reparsing

Perfect for editors and IDEs:

var ast = jsonParser.Parse(json);
// Later, when the document changes:
var changedAst = ast.Reparsed(changedJson);
// Only changed sections are re-parsed!

Beyond Traditional Parsing

Regex on Steroids (for data scraping)

var parser = builder.Build();
var prices = parser.FindAllMatches<dynamic>(input).ToList();
// Extract all "Price: X.XX USD" patterns with transformations

Combinator Style for Maximum Speed

Skip AST construction entirely when you need raw performance:

builder.CreateToken("value")
    .SkipWhitespaces(b => b.Choice(
        c => c.Token("string"),
        c => c.Token("number"),
        // ... more choices
    ));

// Direct matching without AST
var result = parser.MatchToken<Dictionary<string, object>>("value", json);

Who Should Use RCParsing?

  • Language Developers: Creating DSLs or full programming languages
  • Tooling Authors: Building LSP servers, linters, or formatters
  • Data Engineers: Extracting and transforming complex data formats
  • Library Maintainers: Implementing configuration parsers or query languages
  • Anyone Tired of Parser Pain: Who wants a delightful parsing experience

Getting Started

dotnet add package RCParsing

Explore the documentation and try the web demo to see RCParsing in action.

What's Next for RCParsing

RCParsing is actively developed with an exciting roadmap:

  • Grammar transformers for automatic optimization
  • Semantic analysis tools
  • NFA algorithm for complex rules
  • Visualization and debugging tools

Conclusion

Interested? Install the library and start parsing! Also check out the GitHub repo.

What's the biggest parsing pain you've faced in .NET? I'd love to hear about your cases in the comments below.
