DEV Community

Ian Cowley

I built a native C# Grep engine that's holding its own with ripgrep (with zero allocations)

Let’s be honest: the golden rule of modern software engineering is "never rewrite grep." Tools like ripgrep are written in native Rust, compiled straight to the metal, and aggressively optimized. For 99% of use cases, wrapping a CLI tool or pulling in a massive external dependency is what people do.

But when you are building an ultra-low-latency AI context layer where sub-20ms execution is the hard ceiling, spinning up external CLI processes, handling inter-process communication, and parsing standard output strings back onto the managed heap destroys performance. The Garbage Collector pressure alone kills your agentic execution loop.

So, I decided to see how close I could get with pure, unadulterated .NET 10.

The result? Glacier.Grep.

On a typical developer workload scanning a 257 MB workspace (590 files), it doesn't just rival ripgrep; it actually beats it on case-sensitive queries.


The Raw Benchmarks (Warmed)

  • Target: 590 files, 257.83 MB text data
  • OS: Windows (x64)

| Engine | Query | Execution Time | Performance Ratio |
|---|---|---|---|
| Ripgrep (Rust) | "public class" (Sensitive) | 134.9 ms | 1.12x |
| Glacier.Grep (.NET 10) | "public class" (Sensitive) | 120.4 ms | 1.00x (FASTER) |
| Ripgrep (Rust) | "public class" (Insensitive) | 210.4 ms | 1.52x |
| Ripgrep (Rust) | "ThreadIndependentReaderWriterLock" | 142.3 ms | 1.23x |

Going toe-to-toe with optimized Rust in a managed language requires treating memory like hot lava. Here is exactly how it's engineered under the hood.


1. Zero-Allocation Stack Traversal

The first place search engines lose time is evaluating filesystem metadata and parsing .gitignore rules. If you instantiate DirectoryInfo or materialize file paths as managed strings just to skip a hidden directory, you've already lost.

Glacier.Grep uses System.IO.Enumeration.FileSystemEnumerable<T>. Its predicates receive each entry as a ref FileSystemEntry, so we evaluate exclusion criteria entirely on the stack, using custom ref struct rules, before a single path string is allocated.
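To make that concrete, here is a minimal sketch of the enumeration pattern, not Glacier.Grep's actual code. The WorkspaceWalker class and the hard-coded exclusion list are hypothetical stand-ins for the real .gitignore rules. The FileSystemEnumerable<T> members are the standard .NET API.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Enumeration;

public static class WorkspaceWalker
{
    // Hypothetical fixed exclusions; a real engine would derive these from .gitignore.
    private static bool IsExcludedDirectory(ReadOnlySpan<char> name) =>
        name.Equals("bin", StringComparison.Ordinal) ||
        name.Equals("obj", StringComparison.Ordinal) ||
        name.Equals("node_modules", StringComparison.Ordinal) ||
        name.Equals(".git", StringComparison.Ordinal);

    public static IEnumerable<string> EnumerateFiles(string root)
    {
        var options = new EnumerationOptions
        {
            RecurseSubdirectories = true,
            IgnoreInaccessible = true
        };

        return new FileSystemEnumerable<string>(
            root,
            // The transform runs only for entries that pass ShouldIncludePredicate,
            // so a path string is allocated only for files we actually keep.
            (ref FileSystemEntry entry) => entry.ToFullPath(),
            options)
        {
            // entry.FileName is a ReadOnlySpan<char>: no string is created to test it.
            ShouldRecursePredicate = (ref FileSystemEntry entry) =>
                !IsExcludedDirectory(entry.FileName),
            ShouldIncludePredicate = (ref FileSystemEntry entry) =>
                !entry.IsDirectory
        };
    }
}
```

Excluded directories are rejected in ShouldRecursePredicate, so their contents are never enumerated at all. Note that EnumerationOptions skips hidden and system entries by default.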

The .gitignore hierarchy is compiled at startup into a lightweight prefix tree (trie). Matching a path is then an O(L) operation, where L is the length of the path, so folders like bin/, obj/, and node_modules are pruned before they ever touch the processing queue.
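As a rough illustration of the trie idea, the sketch below matches exact directory names only. It is a deliberate simplification: real .gitignore semantics (globs, negation, anchoring) are not handled, and the IgnoreTrie type is a hypothetical name, not part of Glacier.Grep.

```csharp
using System;
using System.Collections.Generic;

public sealed class IgnoreTrie
{
    private sealed class Node
    {
        public readonly Dictionary<char, Node> Children = new();
        public bool IsTerminal; // true if an ignored name ends at this node
    }

    private readonly Node _root = new();

    // Called once at startup for each ignored name.
    public void Add(string name)
    {
        Node node = _root;
        foreach (char c in name)
        {
            if (!node.Children.TryGetValue(c, out Node? next))
            {
                next = new Node();
                node.Children[c] = next;
            }
            node = next;
        }
        node.IsTerminal = true;
    }

    // One dictionary lookup per character: O(L) in the length of the name,
    // with no allocation because the input is a span.
    public bool Contains(ReadOnlySpan<char> name)
    {
        Node node = _root;
        foreach (char c in name)
        {
            if (!node.Children.TryGetValue(c, out Node? next)) return false;
            node = next;
        }
        return node.IsTerminal;
    }
}
```

Because Contains takes a ReadOnlySpan<char>, it can be fed a directory name straight from the filesystem entry without materializing a string first.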


2. The Hybrid I/O Dispatcher

There is no "one-size-fits-all" for disk I/O. Memory-mapped files are amazing for massive datasets, but forcing the OS to map page tables for thousands of tiny 4KB source files introduces major kernel overhead.

We use a dynamic dispatcher that checks the file length directly from the stack-allocated filesystem entry:

  • Small files (< 1MB): Handled via RandomAccess.Read straight into a chunk of memory rented from ArrayPool<byte>.Shared. This keeps data purely in the buffer pool and avoids virtual memory mapping overhead.
  • Large files (> 1MB): Handled via MemoryMappedFile.CreateFromFile. We grab an unsafe byte* pointer directly to the OS page cache, wrap it in a ReadOnlySpan<byte>, and feed it to the execution engine.
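The two paths above might be wired together roughly as follows. This is a sketch under assumptions, not the project's implementation: HybridReader, SpanConsumer, and the threshold constant are hypothetical names, files too large for a single span (2 GB or more) are not handled, and the mapped path requires compiling with unsafe code enabled.

```csharp
using System;
using System.Buffers;
using System.IO;
using System.IO.MemoryMappedFiles;
using Microsoft.Win32.SafeHandles;

// Custom delegate so callers can receive a span without copying it.
public delegate void SpanConsumer(ReadOnlySpan<byte> data);

public static class HybridReader
{
    private const long SmallFileThreshold = 1024 * 1024; // 1 MB cutoff from the article

    public static void Read(string path, long length, SpanConsumer consume)
    {
        if (length == 0) { consume(ReadOnlySpan<byte>.Empty); return; }
        if (length < SmallFileThreshold) ReadPooled(path, (int)length, consume);
        else ReadMapped(path, length, consume);
    }

    private static void ReadPooled(string path, int length, SpanConsumer consume)
    {
        byte[] buffer = ArrayPool<byte>.Shared.Rent(length);
        try
        {
            using SafeFileHandle handle = File.OpenHandle(path, FileMode.Open, FileAccess.Read, FileShare.Read);
            int total = 0;
            while (total < length)
            {
                // RandomAccess.Read may return fewer bytes than requested.
                int read = RandomAccess.Read(handle, buffer.AsSpan(total, length - total), total);
                if (read == 0) break;
                total += read;
            }
            consume(buffer.AsSpan(0, total));
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }

    private static unsafe void ReadMapped(string path, long length, SpanConsumer consume)
    {
        using var mmf = MemoryMappedFile.CreateFromFile(path, FileMode.Open, null, 0, MemoryMappedFileAccess.Read);
        using var view = mmf.CreateViewAccessor(0, length, MemoryMappedFileAccess.Read);
        byte* ptr = null;
        view.SafeMemoryMappedViewHandle.AcquirePointer(ref ptr);
        try
        {
            // A span is limited to int.MaxValue bytes; larger files would need windowed views.
            consume(new ReadOnlySpan<byte>(ptr + view.PointerOffset, checked((int)length)));
        }
        finally
        {
            view.SafeMemoryMappedViewHandle.ReleasePointer();
        }
    }
}
```

The consumer must finish with the span before it returns, since the pooled buffer is returned and the mapped view is released immediately afterwards.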

3. Hardware Acceleration via .NET 10 SearchValues<byte>

We don't convert bytes to characters, and we never split text into an array of strings. The entire search happens on raw UTF-8 bytes.

To find the needle in the haystack, we lean heavily on SearchValues<byte> as it ships in .NET 10. Its search routines are vectorized, using AVX2 or AVX-512 depending on the hardware, so each instruction compares 32 or 64 bytes of text rather than one byte at a time. SearchValues<byte> matches individual bytes, so it locates candidate positions that are then checked against the full pattern.

When a match is found, the engine avoids line-splitting allocations by scanning backward and forward to the nearest \n bytes, producing a ReadOnlySpan<byte> slice of the line without copying anything.

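// The .NET 10 hot loop
// Assumed fields: _searchValues is a SearchValues<byte> holding the pattern's
// possible first bytes; _needle is the full pattern as UTF-8 bytes.
int offset = 0;
while (offset < fileData.Length)
{
    // Hardware-accelerated SIMD scan for the next candidate position
    int matchIndex = fileData.Slice(offset).IndexOfAny(_searchValues);
    if (matchIndex < 0) break;

    offset += matchIndex;

    // SearchValues<byte> matches single bytes, so confirm the whole pattern
    if (!fileData.Slice(offset).StartsWith(_needle))
    {
        offset++;
        continue;
    }

    // Zero-allocation line slicing via byte boundaries
    int lineStart = fileData.Slice(0, offset).LastIndexOf((byte)'\n') + 1;
    int newline = fileData.Slice(offset).IndexOf((byte)'\n');
    int lineEnd = newline < 0 ? fileData.Length : offset + newline; // last line may lack '\n'
    ReadOnlySpan<byte> line = fileData.Slice(lineStart, lineEnd - lineStart);

    // Process match slice...
    offset += _needle.Length;
}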
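
For readers who want something they can run, here is a self-contained variant of the loop above for the case-insensitive path. It is a sketch, not the engine's code: LineScanner and FindLines are hypothetical names, the pattern is assumed to be ASCII, and the result list allocates purely so the output is easy to inspect.

```csharp
using System;
using System.Buffers;
using System.Collections.Generic;
using System.Text;

public static class LineScanner
{
    // Returns (start, length) for each line containing `needle`, ignoring ASCII case.
    public static List<(int Start, int Length)> FindLines(ReadOnlySpan<byte> data, ReadOnlySpan<byte> needle)
    {
        var hits = new List<(int Start, int Length)>();
        if (needle.IsEmpty) return hits;

        // Candidate filter: both ASCII cases of the pattern's first byte.
        // Creating SearchValues is not free; real code would cache it per query.
        byte first = needle[0];
        Span<byte> firstBytes = stackalloc byte[2];
        firstBytes[0] = first;
        firstBytes[1] = char.IsAsciiLetter((char)first) ? (byte)(first ^ 0x20) : first;
        SearchValues<byte> candidates = SearchValues.Create(firstBytes);

        int offset = 0;
        while (offset <= data.Length - needle.Length)
        {
            int index = data.Slice(offset).IndexOfAny(candidates);
            if (index < 0) break;
            offset += index;
            if (data.Length - offset < needle.Length) break;

            // Verify the full pattern at the candidate position.
            if (!Ascii.EqualsIgnoreCase(data.Slice(offset, needle.Length), needle))
            {
                offset++;
                continue;
            }

            // Expand to the enclosing line using '\n' byte boundaries.
            int lineStart = data.Slice(0, offset).LastIndexOf((byte)'\n') + 1;
            int newline = data.Slice(offset).IndexOf((byte)'\n');
            int lineEnd = newline < 0 ? data.Length : offset + newline;
            hits.Add((lineStart, lineEnd - lineStart));

            offset = lineEnd + 1; // report each line once, then resume on the next line
        }
        return hits;
    }
}
```

Calling LineScanner.FindLines(fileBytes, "public class"u8) yields offsets that can be sliced back out of the original buffer, so the matching line text itself is never copied.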


4. Built for Agentic Loops (MCP Native)

This isn’t just a CLI utility. Glacier.Grep is built specifically to serve as a Model Context Protocol (MCP) server for AI coding agents.

When an LLM agent needs to find code patterns, it can invoke the tool directly via JSON-RPC over standard I/O. Because the engine runs continuously as a persistent process, there's zero process-spawning penalty. Matches are streamed immediately to the agent's context window via a zero-allocation System.IO.Pipelines stream using Utf8JsonWriter.
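
A minimal sketch of that last step, assuming a simplified payload: the MatchWriter class and the path/line/text field names are illustrative and are not the MCP schema or Glacier.Grep's wire format. The useful property is that Utf8JsonWriter can write UTF-8 spans straight into any IBufferWriter<byte>, and PipeWriter is one.

```csharp
using System;
using System.Buffers;
using System.Text.Json;

public static class MatchWriter
{
    // Writes one match as a single line of JSON into the output buffer.
    // Hypothetical payload shape, chosen only for illustration.
    public static void WriteMatch(
        IBufferWriter<byte> output,
        ReadOnlySpan<byte> utf8Path,
        int lineNumber,
        ReadOnlySpan<byte> utf8Line)
    {
        // Allocating a writer per call keeps the sketch short; a long-running
        // server would keep one instance and call Reset(output) instead.
        using (var json = new Utf8JsonWriter(output))
        {
            json.WriteStartObject();
            json.WriteString("path", utf8Path);  // UTF-8 bytes in, no string created
            json.WriteNumber("line", lineNumber);
            json.WriteString("text", utf8Line);
            json.WriteEndObject();
            json.Flush();
        }

        // Newline-delimit messages so the reader can frame them.
        Span<byte> tail = output.GetSpan(1);
        tail[0] = (byte)'\n';
        output.Advance(1);
    }
}
```

In a server, output could be a PipeWriter created with PipeWriter.Create(Console.OpenStandardOutput()), followed by a FlushAsync call after each batch of matches.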


Conclusion

Managed languages aren't slow—heavy, framework-obsessed architectures are. When you drop the abstractions, bypass the heap, and write code with mechanical sympathy for the underlying CPU registers, .NET 10 is an absolute speed demon.

Glacier.Grep is open-source and part of the Glacier high-performance storage suite.

👉 Check out the repo here: github.com/ian-cowley/Glacier.Grep
