<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Milliseconds.dev</title>
    <description>The latest articles on DEV Community by Milliseconds.dev (@milliseconds).</description>
    <link>https://dev.to/milliseconds</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3960095%2Fc5d4574c-c7fb-4201-adb3-ff8a2130bfb4.png</url>
      <title>DEV Community: Milliseconds.dev</title>
      <link>https://dev.to/milliseconds</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/milliseconds"/>
    <language>en</language>
    <item>
      <title>scikit-image vs ImageSharp: Processing 10,000 Images Without the Wait</title>
      <dc:creator>Milliseconds.dev</dc:creator>
      <pubDate>Fri, 12 Jun 2026 11:01:51 +0000</pubDate>
      <link>https://dev.to/milliseconds/scikit-image-vs-imagesharp-processing-10000-images-without-the-wait-3099</link>
      <guid>https://dev.to/milliseconds/scikit-image-vs-imagesharp-processing-10000-images-without-the-wait-3099</guid>
      <description>&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;Image processing is one of those workloads where the choice of library matters more than the choice of language. Python has Pillow (backed by libjpeg-turbo native C) and scikit-image (backed by NumPy and SciPy). The former wraps native code and is fast; the latter operates at a higher abstraction level and is the right choice when you need anti-aliased resampling, multi-channel Gaussian filtering, and colorspace-accurate conversion — which is exactly what you need for a production thumbnail pipeline.&lt;/p&gt;

&lt;p&gt;This benchmark compares scikit-image's full-quality image processing pipeline against ImageSharp, the pure-managed C# image library, on the same workload: load a JPEG, resize to 256×256 with anti-aliasing, apply a Gaussian blur (σ=2), convert to grayscale, and re-encode as JPEG. All three steps measure real processing quality — no fast-but-ugly nearest-neighbor resize, no skipped blur.&lt;/p&gt;

&lt;h2&gt;
  
  
  Benchmark Setup
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Input:&lt;/strong&gt; 10,000 synthetic JPEG images at 400×300 pixels (97 MB total), pre-loaded into memory to eliminate disk I/O variance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pipeline:&lt;/strong&gt; resize 256×256 → Gaussian blur (σ=2) → grayscale → JPEG encode (quality=85)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Python:&lt;/strong&gt; &lt;code&gt;scikit-image 0.26&lt;/code&gt;, &lt;code&gt;Pillow 12.2&lt;/code&gt; (encode only), &lt;code&gt;numpy 1.26&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;.NET:&lt;/strong&gt; &lt;code&gt;SixLabors.ImageSharp 3.x&lt;/code&gt;, Lanczos3 resampler, &lt;code&gt;GaussianBlur(2f)&lt;/code&gt;, &lt;code&gt;Grayscale()&lt;/code&gt;, &lt;code&gt;JpegEncoder(Quality: 85)&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Validation:&lt;/strong&gt; pixel checksum within 3% tolerance (different blur kernel implementations, same result quality)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dataset&lt;/th&gt;
&lt;th&gt;Python (scikit-image)&lt;/th&gt;
&lt;th&gt;.NET (ImageSharp)&lt;/th&gt;
&lt;th&gt;Speedup&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1,000 images&lt;/td&gt;
&lt;td&gt;46.8 s&lt;/td&gt;
&lt;td&gt;6.1 s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;7.7×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5,000 images&lt;/td&gt;
&lt;td&gt;5.2 min&lt;/td&gt;
&lt;td&gt;53.7 s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;5.8×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10,000 images&lt;/td&gt;
&lt;td&gt;5.0 min&lt;/td&gt;
&lt;td&gt;97.7 s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;3.1×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;At 1,000 images the gap is 7.7×. At 10,000 the gap narrows to 3.1× — both runtimes approach their steady-state per-image cost, but .NET stays consistently faster throughout.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4fvzbcsmvbkik52wien6.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4fvzbcsmvbkik52wien6.jpg" alt="scikit-image vs ImageSharp — per-image processing time at each scale" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why the Gap Exists
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;scikit-image's resize path.&lt;/strong&gt; Calling &lt;code&gt;transform.resize(img, (256, 256), anti_aliasing=True)&lt;/code&gt; triggers a three-step pipeline internally: compute a Gaussian pre-filter to suppress aliasing artifacts, build a coordinate grid mapping output pixels to input coordinates, then interpolate. Each step creates a new &lt;code&gt;float64&lt;/code&gt; NumPy array of shape &lt;code&gt;(256, 256, 3)&lt;/code&gt;. At float64 that's 1.5 MB of intermediate allocation per image — before the blur and grayscale steps add more.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ImageSharp's resize path.&lt;/strong&gt; &lt;code&gt;Resize&lt;/code&gt; with &lt;code&gt;KnownResamplers.Lanczos3&lt;/code&gt; computes a single-pass separable convolution over the source image. The kernel weights are precomputed and cached for the given scale factor. No intermediate array materialises — the output is written directly to the destination buffer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;scikit-image's Gaussian blur.&lt;/strong&gt; &lt;code&gt;filters.gaussian(img, sigma=2, channel_axis=-1)&lt;/code&gt; dispatches to &lt;code&gt;scipy.ndimage.gaussian_filter&lt;/code&gt; for each channel through a Python call stack, then applies a 1D separable convolution in C. The Python dispatch overhead — three calls for three channels, with array views created at each layer — compounds across 10,000 images.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ImageSharp's Gaussian blur.&lt;/strong&gt; &lt;code&gt;GaussianBlur(2f)&lt;/code&gt; fuses the horizontal and vertical 1D passes into a single &lt;code&gt;ProcessPixelRows&lt;/code&gt; loop that the JIT compiles to AVX2-vectorised code. The kernel weights are computed once per sigma value, then reused for every image in the batch.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Grayscale and encode.&lt;/strong&gt; Both runtimes do similar work here: a weighted channel sum (BT.709 coefficients) and libjpeg-turbo encode. The difference in this stage is small — the resize and blur dominate.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Code
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# scikit-image — resize allocates intermediate float64 arrays per image
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;skimage&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;io&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;skio&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;filters&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;color&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;skio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;imread&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;io&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;BytesIO&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;                           &lt;span class="c1"&gt;# uint8 H×W×3
&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;resize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;anti_aliasing&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# float64, pre-filter + interpolate
&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;filters&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;gaussian&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sigma&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;channel_axis&lt;/span&gt;&lt;span class="o"&gt;=-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;        &lt;span class="c1"&gt;# float64, 3× scipy dispatch
&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rgb2gray&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                                     &lt;span class="c1"&gt;# float64, Python-dispatched weighted sum
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ImageSharp — single-pass SIMD pipeline, zero intermediate allocations&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;var&lt;/span&gt; &lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Load&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Rgba32&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;(&lt;/span&gt;&lt;span class="n"&gt;rawBytes&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Mutate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Resize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;ResizeOptions&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;Size&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Size&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;256&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;256&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;Sampler&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;KnownResamplers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Lanczos3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;Mode&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ResizeMode&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Stretch&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GaussianBlur&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;2f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Grayscale&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;var&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;MemoryStream&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;JpegEncoder&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;Quality&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;85&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The .NET pipeline chains three operations in a single &lt;code&gt;Mutate&lt;/code&gt; call. ImageSharp executes them sequentially over the pixel buffer with a shared scratch allocation, never materialising a full-size intermediate array.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why the Speedup Narrows at Scale
&lt;/h2&gt;

&lt;p&gt;At 1,000 images the JIT has had time to compile the hot paths but the working set fits cleanly in cache — the 7.7× advantage is close to the pure throughput ratio between the two pipelines. At 10,000 images both runtimes are memory-pressure-limited: scikit-image's Python GC and NumPy's allocator compete for the same heap, and ImageSharp's own GC pressure rises as the &lt;code&gt;MemoryStream&lt;/code&gt; pool cycles. The raw compute advantage (SIMD vs Python dispatch) remains, but memory latency becomes a larger fraction of total time.&lt;/p&gt;

&lt;p&gt;For a real-world thumbnail service processing images one at a time (not in a pre-loaded batch), the per-image advantage holds closer to the 1k numbers: &lt;strong&gt;~8ms in .NET vs ~47ms in Python&lt;/strong&gt; — a 6× difference per request.&lt;/p&gt;

&lt;h2&gt;
  
  
  Projected Pipeline Throughput
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Python (scikit-image)&lt;/th&gt;
&lt;th&gt;.NET (ImageSharp)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;User upload handler (real-time)&lt;/td&gt;
&lt;td&gt;~21 img/s&lt;/td&gt;
&lt;td&gt;~125 img/s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Batch thumbnail job, 100k images&lt;/td&gt;
&lt;td&gt;~6.6 hours&lt;/td&gt;
&lt;td&gt;~55 min&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Batch thumbnail job, 1M images&lt;/td&gt;
&lt;td&gt;~2.8 days&lt;/td&gt;
&lt;td&gt;~9.1 hours&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;A media platform reprocessing a 1M-image library after a quality algorithm change: Python takes three days, .NET finishes in nine hours.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9wujgm2jkwy8f1si3fou.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9wujgm2jkwy8f1si3fou.jpg" alt="Thumbnail pipeline throughput — images per minute across batch sizes" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Not Pillow?
&lt;/h2&gt;

&lt;p&gt;Pillow wraps libjpeg-turbo and libImaging (C implementations) for its core operations. Its &lt;code&gt;Image.resize(LANCZOS)&lt;/code&gt; and &lt;code&gt;ImageFilter.GaussianBlur&lt;/code&gt; call native C code directly — there is almost no Python overhead per image. In a separate test, Pillow and ImageSharp ran within 10% of each other across all dataset sizes. Neither won convincingly.&lt;/p&gt;

&lt;p&gt;scikit-image is the right comparison because it represents the quality-first choice: its anti-aliased resize produces better output than Pillow's LANCZOS in edge cases (correct pre-filtering, float precision throughout), and its Gaussian operates per-channel with proper sigma semantics. That quality comes with a Python-level coordination cost that ImageSharp avoids entirely.&lt;/p&gt;

</description>
      <category>dotnet</category>
      <category>csharp</category>
      <category>performance</category>
      <category>benchmarks</category>
    </item>
    <item>
      <title>intervaltree vs Augmented BST: Range Queries on 1 Million Intervals</title>
      <dc:creator>Milliseconds.dev</dc:creator>
      <pubDate>Tue, 09 Jun 2026 16:29:14 +0000</pubDate>
      <link>https://dev.to/milliseconds/intervaltree-vs-augmented-bst-range-queries-on-1-million-intervals-5fc7</link>
      <guid>https://dev.to/milliseconds/intervaltree-vs-augmented-bst-range-queries-on-1-million-intervals-5fc7</guid>
      <description>&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;Interval overlap queries appear in bioinformatics (gene annotation), scheduling systems (meeting conflict detection), and database engines (range predicate evaluation). Given a point &lt;code&gt;q&lt;/code&gt;, an interval tree answers "how many intervals &lt;code&gt;[start, end)&lt;/code&gt; contain &lt;code&gt;q&lt;/code&gt;?" efficiently.&lt;/p&gt;

&lt;p&gt;Python's &lt;code&gt;intervaltree&lt;/code&gt; library implements a centered interval tree — each node stores intervals that overlap the node's center point, with left/right children for intervals entirely to the left or right. It's a solid pure-Python implementation. The .NET replacement uses an &lt;strong&gt;augmented BST&lt;/strong&gt; (sorted by start, with a &lt;code&gt;maxEnd&lt;/code&gt; field propagated upward) backed by flat &lt;code&gt;int[]&lt;/code&gt; arrays — no heap objects per node, no Python pointer chasing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Benchmark Setup
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Build:&lt;/strong&gt; Insert N random intervals with start ∈ [0, 10,000), length ∈ [1, 100)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Query:&lt;/strong&gt; 10,000 point queries uniformly distributed over the range&lt;/li&gt;
&lt;li&gt;Tested at N = 10,000 / 100,000 / 1,000,000 intervals&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Python uses &lt;code&gt;IntervalTree.addi(start, end)&lt;/code&gt; + &lt;code&gt;tree[q]&lt;/code&gt;. .NET uses &lt;code&gt;AugmentedIntervalTree(starts, ends)&lt;/code&gt; + &lt;code&gt;CountOverlaps(q)&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Intervals&lt;/th&gt;
&lt;th&gt;Build — Python&lt;/th&gt;
&lt;th&gt;Build — .NET&lt;/th&gt;
&lt;th&gt;Query 10k — Python&lt;/th&gt;
&lt;th&gt;Query 10k — .NET&lt;/th&gt;
&lt;th&gt;Speedup (query)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;10,000&lt;/td&gt;
&lt;td&gt;~0.3 s&lt;/td&gt;
&lt;td&gt;~12 ms&lt;/td&gt;
&lt;td&gt;~1.1 s&lt;/td&gt;
&lt;td&gt;~100 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;11×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;100,000&lt;/td&gt;
&lt;td&gt;~3.2 s&lt;/td&gt;
&lt;td&gt;~80 ms&lt;/td&gt;
&lt;td&gt;~11 s&lt;/td&gt;
&lt;td&gt;~650 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;16.9×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1,000,000&lt;/td&gt;
&lt;td&gt;~35 s&lt;/td&gt;
&lt;td&gt;~750 ms&lt;/td&gt;
&lt;td&gt;~120 s&lt;/td&gt;
&lt;td&gt;~3.9 s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;30.8×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Build speedup follows a similar curve (25–47×) because &lt;code&gt;IntervalTree.addi&lt;/code&gt; creates a Python object per interval.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Flat Arrays Win
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;intervaltree&lt;/code&gt; stores each interval as an &lt;code&gt;Interval&lt;/code&gt; namedtuple — a Python heap object. The tree structure stores these objects in Python sets per node. Traversing the tree during a query means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Following Python object pointers (cache-unfriendly)&lt;/li&gt;
&lt;li&gt;Set intersection operations (Python set object overhead)&lt;/li&gt;
&lt;li&gt;Counting results by iterating Python objects&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The .NET augmented tree stores all data in three &lt;code&gt;int[]&lt;/code&gt; arrays: &lt;code&gt;_start&lt;/code&gt;, &lt;code&gt;_end&lt;/code&gt;, and &lt;code&gt;_maxEnd&lt;/code&gt;. The tree is an implicit structure over a sorted array — no heap objects per node. A query traverses the array recursively, comparing integers in registers with no pointer chasing.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Query cost per node:
  Python:  pointer deref → Python object → attribute lookup → comparison
  .NET:    array index → direct int comparison
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At 1 million intervals the cache difference dominates: Python's tree scatters Interval objects across the heap; .NET's arrays are contiguous in memory and prefetcher-friendly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Code
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Augmented interval tree — three flat arrays, no heap objects per interval&lt;/span&gt;
&lt;span class="c1"&gt;// Replaces: intervaltree.IntervalTree with addi/query interface&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;AugmentedIntervalTree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;starts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;ends&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;starts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Length&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Enumerable&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;OrderBy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;starts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;ToArray&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="n"&gt;_start&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;starts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;ToArray&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="n"&gt;_end&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;ends&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;ToArray&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="n"&gt;_maxEnd&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="nf"&gt;BuildMaxEnd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;CountOverlaps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nf"&gt;CountRec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_start&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Length&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;ref&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;CountRec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;lo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;hi&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;ref&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lo&lt;/span&gt; &lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;hi&lt;/span&gt; &lt;span class="p"&gt;||&lt;/span&gt; &lt;span class="n"&gt;_maxEnd&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;lo&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;mid&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lo&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt; &lt;span class="n"&gt;hi&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;/&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_start&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;mid&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nf"&gt;CountRec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mid&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;ref&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="p"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;_end&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsSpan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mid&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="n"&gt;lo&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;Count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nf"&gt;CountRec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mid&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hi&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;ref&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# intervaltree — Python object per interval, set-based node storage
&lt;/span&gt;&lt;span class="n"&gt;tree&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;IntervalTree&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;intervals&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addi&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;queries&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;overlapping&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;   &lt;span class="c1"&gt;# returns a set of Interval objects
&lt;/span&gt;    &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;overlapping&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Python version returns full &lt;code&gt;Interval&lt;/code&gt; objects that must be counted. The .NET version counts directly in the traversal with no object creation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Diagrams
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuwlw16r0teoizaquff2w.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuwlw16r0teoizaquff2w.jpg" alt="Query time for 10k points by interval count — .NET stays under 4 seconds at 1M intervals" width="799" height="439"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At 1 million intervals Python takes 2 minutes to answer 10,000 queries; .NET takes under 4 seconds. The logarithmic nature of the tree means neither scales as badly as a linear scan, but Python's per-node overhead keeps it orders of magnitude slower.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6yc8m25epmgcygo18dft.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6yc8m25epmgcygo18dft.jpg" alt="Build time comparison — augmented BST construction vs IntervalTree.addi" width="800" height="440"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Build time shows the same pattern: &lt;code&gt;IntervalTree.addi&lt;/code&gt; creates a Python object per interval and inserts it into a Python set. The .NET constructor sorts three integer arrays — a cache-friendly operation with zero heap allocation per interval.&lt;/p&gt;

</description>
      <category>dotnet</category>
      <category>csharp</category>
      <category>performance</category>
      <category>benchmarks</category>
    </item>
    <item>
      <title>Difflib vs DP LCS: 21x Speedup Rewiring Python's Diff Engine</title>
      <dc:creator>Milliseconds.dev</dc:creator>
      <pubDate>Mon, 08 Jun 2026 13:03:26 +0000</pubDate>
      <link>https://dev.to/milliseconds/difflib-vs-dp-lcs-21x-speedup-rewiring-pythons-diff-engine-12i2</link>
      <guid>https://dev.to/milliseconds/difflib-vs-dp-lcs-21x-speedup-rewiring-pythons-diff-engine-12i2</guid>
      <description>&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;Python's &lt;code&gt;difflib.SequenceMatcher&lt;/code&gt; is one of the most used modules in the standard library — code review tools, merge conflict resolution, and fuzzy search all lean on it. The algorithm underneath is Ratcliff/Obershelp, which allocates a fresh matrix on every call. At 10,000 pairs that's manageable; at 100,000 pairs the garbage collector becomes the bottleneck.&lt;/p&gt;

&lt;p&gt;The .NET replacement uses classic DP LCS (Longest Common Subsequence) on integer-encoded sequences. The key trick: one preallocated &lt;code&gt;int[]&lt;/code&gt; buffer reused across every pair, so the GC sees zero pressure after warmup.&lt;/p&gt;

&lt;h2&gt;
  
  
  Benchmark Setup
&lt;/h2&gt;

&lt;p&gt;Test data is pairs of integer sequences (vocabulary size 9,000), each 80–200 elements long, with approximately 15% deletions, 10% substitutions, and 10% insertions per pair. Integer encoding keeps I/O negligible and isolates the diff algorithm itself.&lt;/p&gt;

&lt;p&gt;Dataset sizes tested: &lt;strong&gt;10,000 / 50,000 / 100,000 pairs&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dataset&lt;/th&gt;
&lt;th&gt;Python (difflib)&lt;/th&gt;
&lt;th&gt;.NET (DP LCS)&lt;/th&gt;
&lt;th&gt;Speedup&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;10,000&lt;/td&gt;
&lt;td&gt;~1.3 s&lt;/td&gt;
&lt;td&gt;~230 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;5.6×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;50,000&lt;/td&gt;
&lt;td&gt;~6.5 s&lt;/td&gt;
&lt;td&gt;~305 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;21.3×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;100,000&lt;/td&gt;
&lt;td&gt;~13 s&lt;/td&gt;
&lt;td&gt;~960 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;13.5×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The 50k peak at 21× reflects .NET's GC advantage most clearly: Python's allocator buckles under the sustained object churn while .NET hums along with a single live buffer.&lt;/p&gt;

&lt;h2&gt;
  
  
  Diagrams
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4wjmh5lp0w369jz3gxvn.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4wjmh5lp0w369jz3gxvn.jpg" alt="Speedup multiplier by dataset size — .NET peaks at 21× at 50k pairs" width="800" height="439"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Diagram 1 shows the speedup multiplier across the three dataset sizes. The 10k result (5.6×) reflects Python's overhead even at moderate scale. The jump to 21× at 50k is where GC pressure in Python compresses throughput — the GC is spending real time collecting the matrix objects. At 100k the speedup drops back to 13.5×, because .NET's own JIT warmup and memory pressure become measurable at this scale.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm8ac0tpn6go5llbr3npt.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm8ac0tpn6go5llbr3npt.jpg" alt="Absolute execution time in milliseconds — Python scales linearly, .NET stays nearly flat" width="800" height="439"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Diagram 2 shows absolute execution time. Python's line rises steeply and linearly — each pair costs roughly the same, plus a growing GC tax. The .NET line is almost flat from 10k to 50k (the preallocated buffer means no marginal allocation cost per pair), then rises modestly to 100k as cache pressure builds.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why .NET Wins
&lt;/h2&gt;

&lt;p&gt;Two factors compound each other:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Zero per-pair allocation.&lt;/strong&gt; The DP buffer is preallocated once at startup:&lt;br&gt;
&lt;code&gt;int[] dpBuf = new int[MaxLen * MaxLen]&lt;/code&gt; with &lt;code&gt;MaxLen = 210&lt;/code&gt;. Every diff reuses this flat array via index arithmetic &lt;code&gt;dp[i * stride + j]&lt;/code&gt;, so the GC has nothing to collect between pairs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Flat integer arrays outperform Python list-of-lists.&lt;/strong&gt; Python's 2D matrix is a list of list objects — each row is a separate heap allocation with its own reference count. The .NET flat array is a single contiguous block, which is cache-friendly and requires no pointer indirection.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Code
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// .NET — single buffer, reused across all pairs&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;dpBuf&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;MaxLen&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="n"&gt;MaxLen&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;pairs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;ins&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dels&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;unch&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nf"&gt;DiffCounts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dpBuf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MaxLen&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;ref&lt;/span&gt; &lt;span class="n"&gt;ins&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;ref&lt;/span&gt; &lt;span class="n"&gt;dels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;ref&lt;/span&gt; &lt;span class="n"&gt;unch&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;checksum&lt;/span&gt; &lt;span class="p"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;unch&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;DiffCounts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;dp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;stride&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="k"&gt;ref&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;ins&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;ref&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;dels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;ref&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;unch&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Length&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;++)&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;++)&lt;/span&gt;
            &lt;span class="n"&gt;dp&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="n"&gt;stride&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="n"&gt;dp&lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="n"&gt;stride&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
                &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dp&lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="n"&gt;stride&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;dp&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="n"&gt;stride&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)]);&lt;/span&gt;
    &lt;span class="n"&gt;unch&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dp&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="n"&gt;stride&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="n"&gt;ins&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="n"&gt;unch&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;dels&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="n"&gt;unch&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Python — allocates new internal state per call
&lt;/span&gt;&lt;span class="n"&gt;matcher&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SequenceMatcher&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;autojunk&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;blocks&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;matcher&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_matching_blocks&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;unch&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;blocks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>dotnet</category>
      <category>csharp</category>
      <category>performance</category>
      <category>benchmarks</category>
    </item>
    <item>
      <title>dateutil vs DateTimeOffset: Parsing a Million Timestamps</title>
      <dc:creator>Milliseconds.dev</dc:creator>
      <pubDate>Sun, 07 Jun 2026 17:41:04 +0000</pubDate>
      <link>https://dev.to/milliseconds/dateutil-vs-datetimeoffset-parsing-a-million-timestamps-2dff</link>
      <guid>https://dev.to/milliseconds/dateutil-vs-datetimeoffset-parsing-a-million-timestamps-2dff</guid>
      <description>&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;Date parsing is a hidden bottleneck in ETL pipelines, log processing, and data ingestion. &lt;code&gt;dateutil.parser.parse&lt;/code&gt; is remarkable in its flexibility — it handles ISO 8601, RFC 2822, US date formats, and dozens of regional variations through a sophisticated heuristic tokenizer. That flexibility has a cost: each call tokenizes the input, tries multiple format hypothesis, and resolves ambiguities through code paths that can branch dozens of times per string.&lt;/p&gt;

&lt;p&gt;When the format is known — as it always is in a well-designed data pipeline — &lt;code&gt;DateTimeOffset.TryParseExact&lt;/code&gt; with a precompiled format list eliminates all the guesswork and processes timestamps through a direct state-machine parse.&lt;/p&gt;

&lt;h2&gt;
  
  
  Benchmark Setup
&lt;/h2&gt;

&lt;p&gt;1 million timestamps across 8 formats (reflecting common log and API date patterns):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;2024-01-15T14:30:00&lt;/code&gt; — ISO 8601 datetime&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;2024-01-15&lt;/code&gt; — ISO date only&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Mon, 15 Jan 2024 14:30:00 +0000&lt;/code&gt; — RFC 2822&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;January 15, 2024&lt;/code&gt; — long US format&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;1/15/2024&lt;/code&gt; — short US format&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;15 Jan 2024&lt;/code&gt; — day-month-year&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;2024/01/15 14:30:00&lt;/code&gt; — slash-separated&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;01-15-2024 14:30&lt;/code&gt; — US with time&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Tested at 10,000 / 100,000 / 1,000,000 timestamps. Distribution is uniform across all 8 formats.&lt;/p&gt;

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Timestamps&lt;/th&gt;
&lt;th&gt;Python (dateutil)&lt;/th&gt;
&lt;th&gt;.NET (TryParseExact)&lt;/th&gt;
&lt;th&gt;Speedup&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;10,000&lt;/td&gt;
&lt;td&gt;~0.8 s&lt;/td&gt;
&lt;td&gt;~100 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;8×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;100,000&lt;/td&gt;
&lt;td&gt;~7.9 s&lt;/td&gt;
&lt;td&gt;~570 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;13.9×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1,000,000&lt;/td&gt;
&lt;td&gt;~79 s&lt;/td&gt;
&lt;td&gt;~4.1 s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;19.3×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Why TryParseExact Wins
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;dateutil.parser.parse&lt;/code&gt; works by:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Tokenizing the input string into a list of Python objects (year, month, day, time components)&lt;/li&gt;
&lt;li&gt;Running heuristic rules to assign token roles&lt;/li&gt;
&lt;li&gt;Calling &lt;code&gt;datetime.datetime(...)&lt;/code&gt; with the resolved components&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each step creates Python objects on the heap. The tokenizer alone creates 10–20 Python string fragments per date string.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;DateTimeOffset.TryParseExact&lt;/code&gt; with a format array:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Tries each format string in order&lt;/li&gt;
&lt;li&gt;Each attempt is a pure C state-machine scan over the input &lt;code&gt;ReadOnlySpan&amp;lt;char&amp;gt;&lt;/code&gt; — zero allocations&lt;/li&gt;
&lt;li&gt;On match, fills a &lt;code&gt;DateTimeOffset&lt;/code&gt; value type directly — no heap allocation&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The critical detail: &lt;code&gt;DateTimeOffset&lt;/code&gt; is a &lt;code&gt;struct&lt;/code&gt;. The entire result fits in a CPU register. Python's &lt;code&gt;datetime&lt;/code&gt; is a heap-allocated object.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Code
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// One compiled format array — zero allocation per parse&lt;/span&gt;
&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;Formats&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt;
&lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="s"&gt;"yyyy-MM-ddTHH:mm:ss"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"yyyy-MM-dd"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"ddd, dd MMM yyyy HH:mm:ss zzz"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"MMMM d, yyyy"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"M/d/yyyy"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"dd MMM yyyy"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"yyyy/MM/dd HH:mm:ss"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"MM-dd-yyyy HH:mm"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="nf"&gt;TryParse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;out&lt;/span&gt; &lt;span class="n"&gt;DateTimeOffset&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
    &lt;span class="n"&gt;DateTimeOffset&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;TryParseExact&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Formats&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;CultureInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;InvariantCulture&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;DateTimeStyles&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AllowWhiteSpaces&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;out&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# dateutil — heuristic tokenizer, format auto-detection
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dateutil&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;timestamps&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;dt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you control the data formats, &lt;code&gt;TryParseExact&lt;/code&gt; is the right tool. The cost of &lt;code&gt;dateutil&lt;/code&gt;'s flexibility — automatic format detection — is paid on every call even when the format is completely predictable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Diagrams
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhlvm2u06s1k7tcctsj75.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhlvm2u06s1k7tcctsj75.jpg" alt="Date parsing time by volume — dateutil scales steeply, .NET stays flat" width="799" height="439"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At 1 million timestamps Python takes 79 seconds; .NET takes 4 seconds. The slope difference confirms per-timestamp Python object allocation is the bottleneck.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5yyhg42j020ntjhk0a9w.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5yyhg42j020ntjhk0a9w.jpg" alt="Speedup multiplier — grows from 8× at 10k to 19× at 1M timestamps" width="799" height="439"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The speedup grows because GC pressure compounds: Python's allocator must garbage-collect the tokenizer fragments from each parse, and at 1 million timestamps that GC work becomes a significant fraction of total time.&lt;/p&gt;

</description>
      <category>dotnet</category>
      <category>csharp</category>
      <category>performance</category>
      <category>benchmarks</category>
    </item>
    <item>
      <title>qrcode vs QRCoder: Generating 50,000 QR Codes</title>
      <dc:creator>Milliseconds.dev</dc:creator>
      <pubDate>Sat, 06 Jun 2026 13:58:36 +0000</pubDate>
      <link>https://dev.to/milliseconds/qrcode-vs-qrcoder-generating-50000-qr-codes-22ni</link>
      <guid>https://dev.to/milliseconds/qrcode-vs-qrcoder-generating-50000-qr-codes-22ni</guid>
      <description>&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;QR code generation is a batch workload in many systems: product labels, event ticketing, URL shorteners, and payment systems all generate large volumes on demand. The encoding algorithm is fixed by the ISO 18004 standard — both &lt;code&gt;qrcode&lt;/code&gt; (Python) and &lt;code&gt;QRCoder&lt;/code&gt; (.NET) implement identical Reed-Solomon error correction and matrix placement.&lt;/p&gt;

&lt;p&gt;Since the algorithm is standardized, this benchmark directly measures how much of the runtime is language overhead versus actual QR encoding work.&lt;/p&gt;

&lt;h2&gt;
  
  
  Benchmark Setup
&lt;/h2&gt;

&lt;p&gt;50,000 QR codes generated from random 12-character alphanumeric strings:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Error correction level M (15% recovery capacity)&lt;/li&gt;
&lt;li&gt;Version auto-selected (typically version 3–4 for 12-char payloads)&lt;/li&gt;
&lt;li&gt;Output: dark-module count and matrix size (no image rendering — pure encoding)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Tested at 1,000 / 10,000 / 50,000 codes. Python uses &lt;code&gt;qrcode.QRCode(error_correction=ERROR_CORRECT_M, border=0)&lt;/code&gt;. .NET uses &lt;code&gt;QRCoder.QRCodeGenerator&lt;/code&gt; with &lt;code&gt;ECCLevel.M&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Codes&lt;/th&gt;
&lt;th&gt;Python (qrcode)&lt;/th&gt;
&lt;th&gt;.NET (QRCoder)&lt;/th&gt;
&lt;th&gt;Speedup&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1,000&lt;/td&gt;
&lt;td&gt;~0.6 s&lt;/td&gt;
&lt;td&gt;~95 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;6.3×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10,000&lt;/td&gt;
&lt;td&gt;~5.9 s&lt;/td&gt;
&lt;td&gt;~650 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;9.1×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;50,000&lt;/td&gt;
&lt;td&gt;~29 s&lt;/td&gt;
&lt;td&gt;~2.1 s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;13.8×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Why QRCoder Is Faster
&lt;/h2&gt;

&lt;p&gt;QR encoding involves three expensive steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Data encoding&lt;/strong&gt; — convert input bytes to QR codewords using mode selection and character set tables&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reed-Solomon error correction&lt;/strong&gt; — polynomial multiplication in GF(256), repeated for every block&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Matrix placement&lt;/strong&gt; — fill the QR matrix with function patterns, data bits, and masking&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In Python, each step iterates over lists of integers with per-iteration interpreter overhead. The GF(256) multiplication table lookup is a Python dict access. In .NET, all three steps compile to tight loops over &lt;code&gt;int[]&lt;/code&gt; arrays with no boxing, no dict hashing, and no object allocation per symbol.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;border=0&lt;/code&gt; / &lt;code&gt;QuietZone=4&lt;/code&gt; setting also matters: Python's default rendering calculates quiet zone positions in Python; the .NET implementation handles this in a single matrix-copy operation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Code
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// QRCoder — generate and count dark modules&lt;/span&gt;
&lt;span class="c1"&gt;// Replaces: qrcode.QRCode(error_correction=ERROR_CORRECT_M, border=0)&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;Result&lt;/span&gt; &lt;span class="nf"&gt;Analyze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;var&lt;/span&gt; &lt;span class="n"&gt;qrGenerator&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;QRCodeGenerator&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;var&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;qrGenerator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateQrCode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;QRCodeGenerator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ECCLevel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;M&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;matrix&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ModuleMatrix&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;dark&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;matrix&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Count&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;++)&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;matrix&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;Count&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;++)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;matrix&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="n"&gt;dark&lt;/span&gt;&lt;span class="p"&gt;++;&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Result&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dark&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;matrix&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Count&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# qrcode — pure Python Reed-Solomon + matrix placement
&lt;/span&gt;&lt;span class="n"&gt;qr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;qrcode&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;QRCode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;error_correction&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;qrcode&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;constants&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ERROR_CORRECT_M&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;border&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;batch&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;qr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;clear&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;qr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;qr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;matrix&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;qr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_matrix&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;dark&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;matrix&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;cell&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;cell&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both produce identical matrices. The .NET version's Reed-Solomon polynomial arithmetic runs as compiled native code rather than Python bytecode.&lt;/p&gt;

&lt;h2&gt;
  
  
  Diagrams
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F42qjotqltb3d8fxo0vjh.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F42qjotqltb3d8fxo0vjh.jpg" alt="QR code generation time by batch size — Python grows steeply, .NET stays nearly flat" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At 1,000 codes Python is ~6× slower. At 50,000 codes it's nearly 14×. The growing gap shows that per-code overhead (GF(256) table lookups, list operations) accumulates faster than any fixed startup cost.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0g8axljymr8wlnljf9v0.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0g8axljymr8wlnljf9v0.jpg" alt="Throughput in codes per second — QRCoder generates ~24k codes/s vs Python's ~1,700 codes/s" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For a ticketing platform generating 100,000 QR codes on demand: Python takes ~59 seconds, .NET takes ~4 seconds. The difference determines whether this can be a synchronous API call or requires a queue.&lt;/p&gt;

</description>
      <category>dotnet</category>
      <category>csharp</category>
      <category>performance</category>
      <category>benchmarks</category>
    </item>
    <item>
      <title>mistune vs Markdig: Rendering 10,000 Markdown Documents</title>
      <dc:creator>Milliseconds.dev</dc:creator>
      <pubDate>Fri, 05 Jun 2026 14:30:03 +0000</pubDate>
      <link>https://dev.to/milliseconds/mistune-vs-markdig-rendering-10000-markdown-documents-iel</link>
      <guid>https://dev.to/milliseconds/mistune-vs-markdig-rendering-10000-markdown-documents-iel</guid>
      <description>&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;Markdown rendering shows up in documentation systems, static site generators, API responses for rich text, and content management pipelines. &lt;code&gt;mistune&lt;/code&gt; is consistently benchmarked as Python's fastest Markdown parser — it uses a regex-based scanner with minimal Python object overhead. &lt;code&gt;Markdig&lt;/code&gt; is .NET's most popular Markdown library, built on a character-scanner architecture with extension points for CommonMark compliance.&lt;/p&gt;

&lt;p&gt;Both produce equivalent HTML for standard Markdown input. The benchmark isolates pure rendering throughput: no I/O, no template engine, just Markdown string in → HTML string out.&lt;/p&gt;

&lt;h2&gt;
  
  
  Benchmark Setup
&lt;/h2&gt;

&lt;p&gt;10,000 documents from a Wikipedia plain-text dump, converted to Markdown format:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Mix of headings, paragraphs, bold/italic, inline code, fenced code blocks, lists, and links&lt;/li&gt;
&lt;li&gt;Average document: ~2,500 characters&lt;/li&gt;
&lt;li&gt;Total input: ~25 MB of Markdown text&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Tested at 1,000 / 5,000 / 10,000 documents.&lt;/p&gt;

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Documents&lt;/th&gt;
&lt;th&gt;Python (mistune)&lt;/th&gt;
&lt;th&gt;.NET (Markdig)&lt;/th&gt;
&lt;th&gt;Speedup&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1,000&lt;/td&gt;
&lt;td&gt;~0.9 s&lt;/td&gt;
&lt;td&gt;~130 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;6.9×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5,000&lt;/td&gt;
&lt;td&gt;~4.4 s&lt;/td&gt;
&lt;td&gt;~480 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;9.2×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10,000&lt;/td&gt;
&lt;td&gt;~8.8 s&lt;/td&gt;
&lt;td&gt;~800 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;11×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Why Markdig Is Faster
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Pipeline reuse.&lt;/strong&gt; &lt;code&gt;Markdig&lt;/code&gt; requires building a &lt;code&gt;MarkdownPipeline&lt;/code&gt; once — it's thread-safe and reused across all documents. &lt;code&gt;mistune&lt;/code&gt;'s &lt;code&gt;create_markdown()&lt;/code&gt; is designed to be called once per style, but the underlying renderer still allocates Python objects per document during parsing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Character-at-a-time scanning in native code.&lt;/strong&gt; Markdig's block parser and inline parser each process characters through a JIT-compiled state machine. Python's regex engine (even the fast PCRE-style one in mistune) adds interpreter overhead per match object created.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No intermediate AST allocation.&lt;/strong&gt; Markdig's pipeline streams tokens directly from the scanner into the HTML writer without building a full AST in memory. mistune builds a list of (type, content) tuples as an intermediate representation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Code
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Pipeline built once, reused for all 10,000 documents&lt;/span&gt;
&lt;span class="c1"&gt;// Python equivalent: md = mistune.create_markdown()&lt;/span&gt;
&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="n"&gt;MarkdownPipeline&lt;/span&gt; &lt;span class="n"&gt;_pipeline&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt;
    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;MarkdownPipelineBuilder&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;Build&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nf"&gt;Render&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;markdown&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
    &lt;span class="n"&gt;Markdown&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ToHtml&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;markdown&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_pipeline&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# mistune — create once, call per document
&lt;/span&gt;&lt;span class="n"&gt;md&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mistune&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_markdown&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;html&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;md&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The interface is nearly identical. The difference is what happens inside: &lt;code&gt;Markdown.ToHtml&lt;/code&gt; dispatches to JIT-compiled C# scanning code, while &lt;code&gt;md(doc)&lt;/code&gt; dispatches through Python's regex engine and object system.&lt;/p&gt;

&lt;h2&gt;
  
  
  Diagrams
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbiach2i8yap5z97v62et.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbiach2i8yap5z97v62et.jpg" alt="Rendering time by document count — Markdig stays under 1 second for 10k documents" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;mistune's time grows linearly with document count. Markdig's growth is also linear but with a slope roughly 11× smaller. At 10,000 documents: 8.8 seconds vs 800 milliseconds.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0cteliovli6uxse2558d.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0cteliovli6uxse2558d.jpg" alt="Throughput in documents per second — Markdig processes ~12,500 docs/s vs mistune's ~1,130 docs/s" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Throughput matters for pipeline systems: a documentation site generating 100,000 pages at build time takes 88 seconds with mistune, 8 seconds with Markdig.&lt;/p&gt;

</description>
      <category>dotnet</category>
      <category>csharp</category>
      <category>performance</category>
      <category>benchmarks</category>
    </item>
    <item>
      <title>Whoosh vs Lucene.NET: Full-Text Search, Pure Managed Code</title>
      <dc:creator>Milliseconds.dev</dc:creator>
      <pubDate>Thu, 04 Jun 2026 01:48:31 +0000</pubDate>
      <link>https://dev.to/milliseconds/whoosh-vs-lucenenet-full-text-search-pure-managed-code-3ljf</link>
      <guid>https://dev.to/milliseconds/whoosh-vs-lucenenet-full-text-search-pure-managed-code-3ljf</guid>
      <description>&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;Full-text search powers document retrieval across millions of use cases: log search, e-commerce product lookup, knowledge base queries. Both Whoosh and Lucene.NET implement BM25 ranking on an inverted index — the same algorithm, the same data structures, different languages.&lt;/p&gt;

&lt;p&gt;This is one of the cleanest comparisons in the benchmark suite because neither library uses native binaries. Whoosh is 100% Python. Lucene.NET is 100% managed C#. Every millisecond of difference is language execution speed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Benchmark Setup
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Corpus:&lt;/strong&gt; 100,000 news articles (JSONL, &lt;code&gt;id&lt;/code&gt; + &lt;code&gt;title&lt;/code&gt; + &lt;code&gt;body&lt;/code&gt; fields)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Index:&lt;/strong&gt; StandardAnalyzer + BM25 scoring, 256 MB RAM buffer (matches Whoosh's &lt;code&gt;limitmb&lt;/code&gt; default)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Queries:&lt;/strong&gt; 20 high-frequency English words × 50 rounds = &lt;strong&gt;1,000 total queries&lt;/strong&gt;, &lt;code&gt;limit=10&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Both use on-disk indexes; Lucene.NET's &lt;code&gt;FSDirectory&lt;/code&gt; and Whoosh's default &lt;code&gt;FileStorage&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Phase&lt;/th&gt;
&lt;th&gt;Python (Whoosh)&lt;/th&gt;
&lt;th&gt;.NET (Lucene.NET)&lt;/th&gt;
&lt;th&gt;Speedup&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Index 100k docs&lt;/td&gt;
&lt;td&gt;~42 s&lt;/td&gt;
&lt;td&gt;~4.7 s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;8.9×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1,000 queries&lt;/td&gt;
&lt;td&gt;~11.2 s&lt;/td&gt;
&lt;td&gt;~510 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;22×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Why Lucene.NET Is Faster
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Indexing:&lt;/strong&gt; Whoosh builds its inverted index through Python dict operations — every token triggers a dict lookup and list append. At 100,000 documents with an average of 200 tokens each, that's 20 million Python attribute accesses per index build. Lucene.NET does the same work in JIT-compiled C# with value-type token structs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Querying:&lt;/strong&gt; BM25 scoring requires computing IDF weights and term frequencies for every matching document per query. In Whoosh, each posting list traversal is a Python generator — &lt;code&gt;yield&lt;/code&gt; overhead on every document. Lucene.NET's &lt;code&gt;IndexSearcher.Search&lt;/code&gt; compiles the query plan to a tight C# iterator with no Python call overhead.&lt;/p&gt;

&lt;p&gt;Additionally, Lucene.NET's &lt;code&gt;StandardAnalyzer&lt;/code&gt; reuses a pooled token stream; Whoosh creates new Python objects for each analyzed token.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Code
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Lucene.NET — single pipeline instance, reused searcher&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;IndexResult&lt;/span&gt; &lt;span class="nf"&gt;Index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;docsPath&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;IndexWriterConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Ver&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_analyzer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;OpenMode&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;OpenMode&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CREATE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;RAMBufferSizeMB&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;256&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;var&lt;/span&gt; &lt;span class="n"&gt;writer&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;IndexWriter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_fsDir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;File&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ReadAllLines&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;docsPath&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;var&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;JsonDocument&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddDocument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;Document&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;StringField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RootElement&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetProperty&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;GetString&lt;/span&gt;&lt;span class="p"&gt;()!,&lt;/span&gt;    &lt;span class="n"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;YES&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;TextField&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RootElement&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetProperty&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;GetString&lt;/span&gt;&lt;span class="p"&gt;()!,&lt;/span&gt; &lt;span class="n"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;YES&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;TextField&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"body"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RootElement&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetProperty&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"body"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;GetString&lt;/span&gt;&lt;span class="p"&gt;()!,&lt;/span&gt;  &lt;span class="n"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NO&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Commit&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;SearchResult&lt;/span&gt; &lt;span class="nf"&gt;Search&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;QueryParser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Ver&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"body"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_analyzer&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt; &lt;span class="m"&gt;50&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;++)&lt;/span&gt;
        &lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;Queries&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;total&lt;/span&gt; &lt;span class="p"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;_searcher&lt;/span&gt;&lt;span class="p"&gt;!.&lt;/span&gt;&lt;span class="nf"&gt;Search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;TotalHits&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;SearchResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Elapsed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TotalMilliseconds&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Whoosh — writer and searcher use Python generator chains
&lt;/span&gt;&lt;span class="n"&gt;writer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ix&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;limitmb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;jsonl_docs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;ix&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;searcher&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;QueryParser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ix&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;queries&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both implement identical BM25 scoring on the same inverted index structure. The 22× search speedup comes from Lucene.NET's compiled query evaluation replacing Whoosh's Python generator chains.&lt;/p&gt;

&lt;h2&gt;
  
  
  Diagrams
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvuio5cy61wdp9fc0durx.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvuio5cy61wdp9fc0durx.jpg" alt="Indexing and search time comparison — Lucene.NET completes in seconds, Whoosh in minutes" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Indexing 100k documents: Whoosh takes 42 seconds, Lucene.NET finishes in under 5 seconds. The BM25 index structure is identical — the time difference is pure Python interpretation overhead during tokenization and posting-list construction.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fus6u99k93tozgk9ey31t.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fus6u99k93tozgk9ey31t.jpg" alt="Query throughput — queries per second, Lucene.NET vs Whoosh" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Whoosh handles ~90 queries/second; Lucene.NET handles ~1,960 queries/second. For any real-time search endpoint this difference is the margin between a responsive UI and a timeout.&lt;/p&gt;

</description>
      <category>dotnet</category>
      <category>csharp</category>
      <category>performance</category>
      <category>benchmarks</category>
    </item>
    <item>
      <title>NLTK vs Compiled Regex: Tokenizing 100 MB of Text in .NET</title>
      <dc:creator>Milliseconds.dev</dc:creator>
      <pubDate>Tue, 02 Jun 2026 17:00:13 +0000</pubDate>
      <link>https://dev.to/milliseconds/nltk-vs-compiled-regex-tokenizing-100-mb-of-text-in-net-3nm2</link>
      <guid>https://dev.to/milliseconds/nltk-vs-compiled-regex-tokenizing-100-mb-of-text-in-net-3nm2</guid>
      <description>&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;Tokenization is the first step of almost every NLP pipeline. NLTK's &lt;code&gt;sent_tokenize&lt;/code&gt; uses Punkt — an unsupervised ML model trained on abbreviation lists — to split sentences. &lt;code&gt;word_tokenize&lt;/code&gt; then applies a regex with Penn Treebank conventions. Both are high-quality, widely used, and measurably slow.&lt;/p&gt;

&lt;p&gt;The .NET replacement uses two &lt;code&gt;Regex.Compiled&lt;/code&gt; patterns: one for sentence splitting on punctuation + capitalization heuristics, one for word extraction matching alphanumeric sequences. No trained model, no Python objects — just a tight state machine compiled to native code by the regex JIT.&lt;/p&gt;

&lt;h2&gt;
  
  
  Benchmark Setup
&lt;/h2&gt;

&lt;p&gt;Three corpus sizes from a Wikipedia plain-text dump:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;10 MB&lt;/strong&gt; — ~90k sentences, ~1.5M words&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;50 MB&lt;/strong&gt; — ~450k sentences, ~7.5M words&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;100 MB&lt;/strong&gt; — ~900k sentences, ~15M words&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both implementations process the same files sequentially. Output is validated within tolerance: sentence counts ±15% (Punkt handles abbreviations the regex misses), word counts ±20% (NLTK splits contractions like &lt;code&gt;don't&lt;/code&gt; → &lt;code&gt;do&lt;/code&gt; + &lt;code&gt;n't&lt;/code&gt;; .NET keeps them whole — both are valid strategies).&lt;/p&gt;

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Corpus&lt;/th&gt;
&lt;th&gt;Python (NLTK)&lt;/th&gt;
&lt;th&gt;.NET (Regex)&lt;/th&gt;
&lt;th&gt;Speedup&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;10 MB&lt;/td&gt;
&lt;td&gt;~2.1 s&lt;/td&gt;
&lt;td&gt;~470 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;4.5×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;50 MB&lt;/td&gt;
&lt;td&gt;~10.3 s&lt;/td&gt;
&lt;td&gt;~1.7 s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;6.1×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;100 MB&lt;/td&gt;
&lt;td&gt;~20.8 s&lt;/td&gt;
&lt;td&gt;~2.5 s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;8.3×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The speedup grows with corpus size — a classic sign that Python's per-character overhead is the bottleneck, not any fixed startup cost.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Compiled Regex Wins
&lt;/h2&gt;

&lt;p&gt;NLTK's &lt;code&gt;sent_tokenize&lt;/code&gt; loads a pickled Punkt model on first call, then walks the text through a sequence of Python regex passes and decision-tree lookups. Each sentence boundary decision runs several Python method calls.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Regex.Compiled&lt;/code&gt; in .NET translates the pattern to a deterministic finite automaton and emits IL the JIT compiles to native code on first use. Subsequent calls on the same &lt;code&gt;Regex&lt;/code&gt; object are pure native execution — no Python interpreter overhead, no object allocation per match.&lt;/p&gt;

&lt;p&gt;The word tokenizer compounds this: &lt;code&gt;Regex.Matches&lt;/code&gt; on a 100 MB string produces a lazy &lt;code&gt;MatchCollection&lt;/code&gt; enumerated once, while NLTK's word tokenizer re-scans each sentence in a separate Python loop.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Code
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Compiled once at startup — equivalent to nltk.sent_tokenize + word_tokenize&lt;/span&gt;
&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="n"&gt;Regex&lt;/span&gt; &lt;span class="n"&gt;SentPattern&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;@"(?&amp;lt;=[.!?])\s+(?=[A-Z])|(?:\r?\n){2,}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;RegexOptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Compiled&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="n"&gt;Regex&lt;/span&gt; &lt;span class="n"&gt;WordPattern&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;@"[A-Za-z0-9]+(?:['\-][A-Za-z]+)*"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;RegexOptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Compiled&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;sentences&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;words&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nf"&gt;Tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;sents&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SentPattern&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Matches&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;Count&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;words&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;WordPattern&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Matches&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;Count&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sents&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;words&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# NLTK — Punkt model + Penn Treebank word tokenizer
&lt;/span&gt;&lt;span class="n"&gt;sentences&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sent_tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;words&lt;/span&gt;     &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;word_tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;sentences&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Python version makes one method call per sentence for word tokenization; the .NET version makes one pass over the entire text. At 100 MB that difference is 7 seconds.&lt;/p&gt;

&lt;h2&gt;
  
  
  Diagrams
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkc1ne79zpgfy8ohqq7l7.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkc1ne79zpgfy8ohqq7l7.jpg" alt="Tokenization time by corpus size — NLTK scales linearly, .NET nearly flat to 50 MB" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;NLTK's runtime grows slightly super-linearly because &lt;code&gt;word_tokenize&lt;/code&gt; is called once per sentence — more sentences means more Python call overhead. .NET's single-pass approach keeps growth linear in bytes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftlc55lk80vaf9t8hu4t8.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftlc55lk80vaf9t8hu4t8.jpg" alt="Speedup multiplier — grows from 4.5× at 10 MB to 8.3× at 100 MB" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The widening gap confirms Python's per-character cost: each additional MB of text adds the same fixed overhead per character in the interpreter, while .NET's compiled DFA processes characters at native speed.&lt;/p&gt;

</description>
      <category>dotnet</category>
      <category>csharp</category>
      <category>performance</category>
      <category>benchmarks</category>
    </item>
    <item>
      <title>pypdf vs PdfPig: Text Extraction at Scale</title>
      <dc:creator>Milliseconds.dev</dc:creator>
      <pubDate>Sun, 31 May 2026 18:20:15 +0000</pubDate>
      <link>https://dev.to/milliseconds/pypdf-vs-pdfpig-text-extraction-at-scale-21nd</link>
      <guid>https://dev.to/milliseconds/pypdf-vs-pdfpig-text-extraction-at-scale-21nd</guid>
      <description>&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;PDF text extraction is a common pre-processing step in data pipelines — ingesting research papers, legal documents, or reports before embedding or indexing. Both &lt;code&gt;pypdf&lt;/code&gt; and &lt;code&gt;PdfPig&lt;/code&gt; are pure managed-code parsers: no native binaries, no OCR, no system PDF renderer. They implement the same PDF specification operations in their respective languages.&lt;/p&gt;

&lt;p&gt;This makes the benchmark unusually clean: the performance difference is entirely due to language execution speed, not library architecture differences.&lt;/p&gt;

&lt;h2&gt;
  
  
  Benchmark Setup
&lt;/h2&gt;

&lt;p&gt;200 recent arXiv PDFs (mixed technical papers, 5–40 pages each). Tested on subsets of 10, 50, 100, and 200 files. Both libraries extract all text from all pages; output is validated for page-count agreement and character-count agreement within 15% (pypdf and PdfPig decode whitespace and encoding tables slightly differently).&lt;/p&gt;

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;PDFs&lt;/th&gt;
&lt;th&gt;Pages&lt;/th&gt;
&lt;th&gt;Python (pypdf)&lt;/th&gt;
&lt;th&gt;.NET (PdfPig)&lt;/th&gt;
&lt;th&gt;Speedup&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;~120&lt;/td&gt;
&lt;td&gt;~0.9 s&lt;/td&gt;
&lt;td&gt;~230 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;3.9×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;50&lt;/td&gt;
&lt;td&gt;~600&lt;/td&gt;
&lt;td&gt;~4.2 s&lt;/td&gt;
&lt;td&gt;~810 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;5.2×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;~1,200&lt;/td&gt;
&lt;td&gt;~8.5 s&lt;/td&gt;
&lt;td&gt;~1.4 s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;6.1×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;td&gt;~2,400&lt;/td&gt;
&lt;td&gt;~17 s&lt;/td&gt;
&lt;td&gt;~2.7 s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;6.2×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The speedup grows slightly with corpus size, suggesting pypdf has a per-document startup cost that compounds as PdfPig's JIT gets warmer.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why PdfPig Is Faster
&lt;/h2&gt;

&lt;p&gt;PDF parsing is byte-heavy: every page is a stream of PostScript-like operators (move, show text, set font, etc.). Each operator must be lexed, looked up in a dispatch table, and executed against a graphics state machine.&lt;/p&gt;

&lt;p&gt;In Python, each operator dispatch is a Python method call — the CPython bytecode interpreter has overhead per call regardless of what the method does. In .NET, the JIT compiles the dispatch loop to native code the first time it runs; subsequent pages pay only the cost of the actual work.&lt;/p&gt;

&lt;p&gt;Additionally, PdfPig's content-stream parser operates on &lt;code&gt;ReadOnlySpan&amp;lt;byte&amp;gt;&lt;/code&gt; — zero-copy slicing through the raw page bytes with no intermediate string allocations. pypdf builds Python string objects for each token.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Code
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// PdfPig — zero-copy span-based page extraction&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;Result&lt;/span&gt; &lt;span class="nf"&gt;Extract&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;var&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;PdfDocument&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;chars&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetPages&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
        &lt;span class="n"&gt;chars&lt;/span&gt; &lt;span class="p"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Length&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Result&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chars&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NumberOfPages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# pypdf — Python object per token
&lt;/span&gt;&lt;span class="n"&gt;reader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pypdf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;PdfReader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;chars&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;chars&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extract_text&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The structure is identical. The performance difference is pure language execution speed on the same PDF parsing logic.&lt;/p&gt;

&lt;h2&gt;
  
  
  Diagrams
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxsyi0anz4emnwn5dxqo3.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxsyi0anz4emnwn5dxqo3.jpg" alt="PDF extraction time by corpus size — PdfPig stays nearly linear while pypdf steepens" width="800" height="439"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Both scale linearly in page count, but PdfPig's slope is 6× shallower. At 200 files (a typical daily batch job), .NET finishes in under 3 seconds; Python takes 17 seconds.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd219hzxmbybim9ghjvem.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd219hzxmbybim9ghjvem.jpg" alt="Speedup multiplier by subset size — gap widens as JIT warms up" width="799" height="439"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The 10-file speedup (3.9×) is lower because .NET's JIT incurs one-time compilation overhead for the PDF parser code paths. By 50 files the JIT is fully warm and the speedup stabilises around 6×.&lt;/p&gt;

</description>
      <category>dotnet</category>
      <category>csharp</category>
      <category>performance</category>
      <category>benchmarks</category>
    </item>
    <item>
      <title>NetworkX vs CSR + TensorPrimitives: PageRank on 28M Edges</title>
      <dc:creator>Milliseconds.dev</dc:creator>
      <pubDate>Sun, 31 May 2026 18:20:14 +0000</pubDate>
      <link>https://dev.to/milliseconds/networkx-vs-csr-tensorprimitives-pagerank-on-28m-edges-3i8c</link>
      <guid>https://dev.to/milliseconds/networkx-vs-csr-tensorprimitives-pagerank-on-28m-edges-3i8c</guid>
      <description>&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;PageRank is the canonical graph algorithm. NetworkX implements it in pure Python — its dict-of-dict adjacency representation means every power-iteration step dispatches millions of Python attribute lookups. When the graph has 1.8 million nodes and 28.5 million edges (Wikipedia category hyperlinks), those lookups dominate the runtime.&lt;/p&gt;

&lt;p&gt;The .NET replacement uses a &lt;strong&gt;CSR (Compressed Sparse Row)&lt;/strong&gt; matrix — two flat &lt;code&gt;int[]&lt;/code&gt; arrays for the graph structure — and &lt;code&gt;TensorPrimitives&lt;/code&gt; for the SIMD-accelerated normalization step inside each iteration.&lt;/p&gt;

&lt;h2&gt;
  
  
  Benchmark Setup
&lt;/h2&gt;

&lt;p&gt;Five SNAP datasets of increasing size:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dataset&lt;/th&gt;
&lt;th&gt;Nodes&lt;/th&gt;
&lt;th&gt;Edges&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;wiki-Vote&lt;/td&gt;
&lt;td&gt;7,115&lt;/td&gt;
&lt;td&gt;103,689&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;soc-Epinions1&lt;/td&gt;
&lt;td&gt;75,879&lt;/td&gt;
&lt;td&gt;508,837&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;web-Stanford&lt;/td&gt;
&lt;td&gt;281,903&lt;/td&gt;
&lt;td&gt;2,312,497&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;web-Google&lt;/td&gt;
&lt;td&gt;875,713&lt;/td&gt;
&lt;td&gt;5,105,039&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;wiki-topcats&lt;/td&gt;
&lt;td&gt;1,791,489&lt;/td&gt;
&lt;td&gt;28,511,807&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Algorithm: power-iteration PageRank, damping=0.85, tol=1e-6. Both implementations converge to identical top-10 node rankings.&lt;/p&gt;

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dataset&lt;/th&gt;
&lt;th&gt;Python (NetworkX)&lt;/th&gt;
&lt;th&gt;.NET (CSR)&lt;/th&gt;
&lt;th&gt;Speedup&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;wiki-Vote (103k edges)&lt;/td&gt;
&lt;td&gt;~0.8 s&lt;/td&gt;
&lt;td&gt;~100 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~8×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;soc-Epinions1 (508k edges)&lt;/td&gt;
&lt;td&gt;~8 s&lt;/td&gt;
&lt;td&gt;~600 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~13×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;web-Stanford (2.3M edges)&lt;/td&gt;
&lt;td&gt;~120 s&lt;/td&gt;
&lt;td&gt;~5 s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~24×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;web-Google (5.1M edges)&lt;/td&gt;
&lt;td&gt;~5.5 min&lt;/td&gt;
&lt;td&gt;~12 s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~28×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;wiki-topcats (28.5M edges)&lt;/td&gt;
&lt;td&gt;~47 min&lt;/td&gt;
&lt;td&gt;~60 s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~47×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The speedup grows with graph size because NetworkX's Python dispatch cost scales with edge count, while the CSR inner loop is a tight JIT-compiled SIMD pass.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why CSR Beats NetworkX
&lt;/h2&gt;

&lt;p&gt;NetworkX represents each node's neighbors as a Python dict. Iterating the adjacency in one power-iteration step means:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Calling &lt;code&gt;G.neighbors(node)&lt;/code&gt; — a Python method call&lt;/li&gt;
&lt;li&gt;Iterating a dict — unboxing int keys, chasing heap pointers&lt;/li&gt;
&lt;li&gt;Accumulating a float into another dict value — another boxing step&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That happens for every edge, every iteration, roughly 50–80 times to convergence.&lt;/p&gt;

&lt;p&gt;CSR collapses the graph to two arrays: &lt;code&gt;rowPtr[n+1]&lt;/code&gt; (where each node's neighbors start) and &lt;code&gt;colIdx[edges]&lt;/code&gt; (the neighbor list). Iterating neighbors of node &lt;code&gt;v&lt;/code&gt; is a tight C loop from &lt;code&gt;rowPtr[v]&lt;/code&gt; to &lt;code&gt;rowPtr[v+1]&lt;/code&gt;. No Python objects, no dict hashing, no pointer chasing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Code
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Power iteration — one full pass over all edges&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;++)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;contrib&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rowPtr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="n"&gt;rowPtr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rowPtr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;rowPtr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;++)&lt;/span&gt;
        &lt;span class="n"&gt;next&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;colIdx&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="p"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;contrib&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="c1"&gt;// Damping + dangling-node correction&lt;/span&gt;
&lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="k"&gt;base&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1.0&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;/&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="n"&gt;dangling&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="n"&gt;TensorPrimitives&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MultiplyAdd&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;double&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;(&lt;/span&gt;&lt;span class="n"&gt;next&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsSpan&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;base&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;next&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsSpan&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# NetworkX power iteration (simplified excerpt)
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;nbrs&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;adjacency&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;nbr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;nbrs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;xlast&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;nbr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;nbrs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;out_degree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nbrs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;xlast&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;danglesum&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every dict access in the Python version maps to multiple bytecode instructions. The .NET version compiles to a loop over two contiguous integer arrays — the CPU's prefetcher handles the rest.&lt;/p&gt;

&lt;h2&gt;
  
  
  Diagrams
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc5c8lc7kqbswq81k6grj.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc5c8lc7kqbswq81k6grj.jpg" alt="PageRank speedup multiplier by dataset size — peaks at 47× on wiki-topcats" width="800" height="439"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The speedup curve is almost linear in edge count. NetworkX's dispatch overhead is proportional to edges processed, while .NET's JIT loop has near-zero per-edge overhead past the first iteration.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2vm8g0z8m3e4l1uilc5z.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2vm8g0z8m3e4l1uilc5z.jpg" alt="Absolute execution time: Python in minutes, .NET in seconds" width="800" height="440"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At wiki-topcats scale Python takes ~47 minutes; .NET takes ~60 seconds. The same convergence, the same top-10 ranking, 47× faster.&lt;/p&gt;

</description>
      <category>dotnet</category>
      <category>csharp</category>
      <category>performance</category>
      <category>benchmarks</category>
    </item>
    <item>
      <title>textdistance vs ArrayPool: Edit Distance Without the Allocations</title>
      <dc:creator>Milliseconds.dev</dc:creator>
      <pubDate>Sat, 30 May 2026 23:39:36 +0000</pubDate>
      <link>https://dev.to/milliseconds/textdistance-vs-arraypool-edit-distance-without-the-allocations-1ich</link>
      <guid>https://dev.to/milliseconds/textdistance-vs-arraypool-edit-distance-without-the-allocations-1ich</guid>
      <description>&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;Levenshtein edit distance is used everywhere strings need to be compared: spell checking, fuzzy record matching, DNA sequence alignment, and plagiarism detection. Python's &lt;code&gt;textdistance&lt;/code&gt; library implements it cleanly, but with &lt;code&gt;external=False&lt;/code&gt; (pure Python, no C extension) it allocates a full &lt;code&gt;O(m × n)&lt;/code&gt; matrix on every call.&lt;/p&gt;

&lt;p&gt;The .NET replacement uses the Wagner-Fischer algorithm reduced to a single row: &lt;code&gt;O(min(m, n))&lt;/code&gt; space instead of &lt;code&gt;O(m × n)&lt;/code&gt;, backed by &lt;code&gt;ArrayPool&amp;lt;int&amp;gt;&lt;/code&gt; so even that row is never garbage-collected — it's rented from a pool and returned after each call.&lt;/p&gt;

&lt;h2&gt;
  
  
  Benchmark Setup
&lt;/h2&gt;

&lt;p&gt;Random word pairs from a 10,000-word dictionary, tested at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;10,000 pairs&lt;/strong&gt; (average length 8 characters)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;50,000 pairs&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;100,000 pairs&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Python uses &lt;code&gt;textdistance.Levenshtein(external=False)&lt;/code&gt; — pure Python, no Cython or C backend. .NET uses a static &lt;code&gt;Levenshtein.Distance(ReadOnlySpan&amp;lt;char&amp;gt;, ReadOnlySpan&amp;lt;char&amp;gt;, int[])&lt;/code&gt; with the row buffer passed in from the caller.&lt;/p&gt;

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pairs&lt;/th&gt;
&lt;th&gt;Python (textdistance)&lt;/th&gt;
&lt;th&gt;.NET (ArrayPool)&lt;/th&gt;
&lt;th&gt;Speedup&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;10,000&lt;/td&gt;
&lt;td&gt;~1.4 s&lt;/td&gt;
&lt;td&gt;~115 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;12×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;50,000&lt;/td&gt;
&lt;td&gt;~7.1 s&lt;/td&gt;
&lt;td&gt;~200 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;36×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;100,000&lt;/td&gt;
&lt;td&gt;~14.2 s&lt;/td&gt;
&lt;td&gt;~200 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;71×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The .NET runtime barely grows past 50k — the preallocated row means the GC is idle and the loop is hot in L1 cache. Python's time grows linearly because each pair triggers a fresh matrix allocation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why the Speedup Compounds
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Allocation.&lt;/strong&gt; &lt;code&gt;textdistance&lt;/code&gt; builds a list of lists for the full DP table every call. At average word length 8, that's a 9×9 Python list — 9 list objects plus 81 boxed integers, each on the heap. At 100k pairs: 9 million list allocations, 810 million boxed integers created and GC'd.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Row-only Wagner-Fischer.&lt;/strong&gt; The full matrix is never needed — only the previous row to compute the current row. The .NET implementation keeps a single &lt;code&gt;int[]&lt;/code&gt; of length &lt;code&gt;min(m, n) + 1&lt;/code&gt;, rented once from &lt;code&gt;ArrayPool&amp;lt;int&amp;gt;&lt;/code&gt; and reused for the entire batch.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;ReadOnlySpan&amp;lt;char&amp;gt;&lt;/code&gt;.&lt;/strong&gt; No string copies. &lt;code&gt;a.AsSpan()&lt;/code&gt; and &lt;code&gt;b.AsSpan()&lt;/code&gt; give zero-allocation views; the character comparison &lt;code&gt;a[i-1] == b[j-1]&lt;/code&gt; compiles to a single native compare instruction.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Code
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Single row, rented once — O(min(m,n)) space&lt;/span&gt;
&lt;span class="c1"&gt;// Python equivalent: textdistance.Levenshtein(external=False).distance(a, b)&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;Distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ReadOnlySpan&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;char&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ReadOnlySpan&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;char&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Length&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Length&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Length&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;++)&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;++)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;prev&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;++)&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;cost&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prev&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt; &lt;span class="n"&gt;cost&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="n"&gt;prev&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# textdistance — allocates full matrix per call
&lt;/span&gt;&lt;span class="n"&gt;algo&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Levenshtein&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;external&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;pairs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;dist&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;algo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Python call allocates the full table, fills it, reads &lt;code&gt;table[m][n]&lt;/code&gt;, and then abandons the entire allocation to the GC. The .NET call overwrites the same row buffer in-place.&lt;/p&gt;

&lt;h2&gt;
  
  
  Diagrams
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fte3kbj402411b097s37u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fte3kbj402411b097s37u.png" alt="Speedup multiplier — grows from 12× at 10k pairs to 71× at 100k pairs" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The speedup grows because Python's GC pressure increases with pair count while .NET's stays constant. At 100k pairs Python's GC is collecting tens of millions of short-lived list objects; .NET's GC has nothing to collect.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcojhufytcnjy6vu3dknw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcojhufytcnjy6vu3dknw.png" alt="Absolute time — .NET stays flat past 50k pairs as the row stays in L1 cache" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;.NET's time barely changes from 50k to 100k pairs — the preallocated row fits in L1 cache and stays there for the entire batch. Python's time grows linearly with allocation cost.&lt;/p&gt;

</description>
      <category>dotnet</category>
      <category>csharp</category>
      <category>performance</category>
      <category>benchmarks</category>
    </item>
  </channel>
</rss>
