<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Milliseconds.dev</title>
    <description>The latest articles on DEV Community by Milliseconds.dev (@milliseconds).</description>
    <link>https://dev.to/milliseconds</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3960095%2Fc5d4574c-c7fb-4201-adb3-ff8a2130bfb4.png</url>
      <title>DEV Community: Milliseconds.dev</title>
      <link>https://dev.to/milliseconds</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/milliseconds"/>
    <language>en</language>
    <item>
      <title>pypdf vs PdfPig: Text Extraction at Scale</title>
      <dc:creator>Milliseconds.dev</dc:creator>
      <pubDate>Sun, 31 May 2026 18:20:15 +0000</pubDate>
      <link>https://dev.to/milliseconds/pypdf-vs-pdfpig-text-extraction-at-scale-21nd</link>
      <guid>https://dev.to/milliseconds/pypdf-vs-pdfpig-text-extraction-at-scale-21nd</guid>
      <description>&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;PDF text extraction is a common pre-processing step in data pipelines — ingesting research papers, legal documents, or reports before embedding or indexing. Both &lt;code&gt;pypdf&lt;/code&gt; and &lt;code&gt;PdfPig&lt;/code&gt; are pure managed-code parsers: no native binaries, no OCR, no system PDF renderer. They implement the same PDF specification operations in their respective languages.&lt;/p&gt;

&lt;p&gt;This makes the benchmark unusually clean: the performance difference is entirely due to language execution speed, not library architecture differences.&lt;/p&gt;

&lt;h2&gt;
  
  
  Benchmark Setup
&lt;/h2&gt;

&lt;p&gt;200 recent arXiv PDFs (mixed technical papers, 5–40 pages each). Tested on subsets of 10, 50, 100, and 200 files. Both libraries extract all text from all pages; output is validated for page-count agreement and character-count agreement within 15% (pypdf and PdfPig decode whitespace and encoding tables slightly differently).&lt;/p&gt;

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;PDFs&lt;/th&gt;
&lt;th&gt;Pages&lt;/th&gt;
&lt;th&gt;Python (pypdf)&lt;/th&gt;
&lt;th&gt;.NET (PdfPig)&lt;/th&gt;
&lt;th&gt;Speedup&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;~120&lt;/td&gt;
&lt;td&gt;~0.9 s&lt;/td&gt;
&lt;td&gt;~230 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;3.9×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;50&lt;/td&gt;
&lt;td&gt;~600&lt;/td&gt;
&lt;td&gt;~4.2 s&lt;/td&gt;
&lt;td&gt;~810 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;5.2×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;~1,200&lt;/td&gt;
&lt;td&gt;~8.5 s&lt;/td&gt;
&lt;td&gt;~1.4 s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;6.1×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;td&gt;~2,400&lt;/td&gt;
&lt;td&gt;~17 s&lt;/td&gt;
&lt;td&gt;~2.7 s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;6.2×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The speedup grows slightly with corpus size, suggesting pypdf has a per-document startup cost that compounds as PdfPig's JIT gets warmer.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why PdfPig Is Faster
&lt;/h2&gt;

&lt;p&gt;PDF parsing is byte-heavy: every page is a stream of PostScript-like operators (move, show text, set font, etc.). Each operator must be lexed, looked up in a dispatch table, and executed against a graphics state machine.&lt;/p&gt;

&lt;p&gt;In Python, each operator dispatch is a Python method call — the CPython bytecode interpreter has overhead per call regardless of what the method does. In .NET, the JIT compiles the dispatch loop to native code the first time it runs; subsequent pages pay only the cost of the actual work.&lt;/p&gt;

&lt;p&gt;Additionally, PdfPig's content-stream parser operates on &lt;code&gt;ReadOnlySpan&amp;lt;byte&amp;gt;&lt;/code&gt; — zero-copy slicing through the raw page bytes with no intermediate string allocations. pypdf builds Python string objects for each token.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Code
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// PdfPig — zero-copy span-based page extraction&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;Result&lt;/span&gt; &lt;span class="nf"&gt;Extract&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;var&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;PdfDocument&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;chars&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetPages&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
        &lt;span class="n"&gt;chars&lt;/span&gt; &lt;span class="p"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Length&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Result&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chars&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NumberOfPages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# pypdf — Python object per token
&lt;/span&gt;&lt;span class="n"&gt;reader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pypdf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;PdfReader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;chars&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;chars&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extract_text&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The structure is identical. The performance difference is pure language execution speed on the same PDF parsing logic.&lt;/p&gt;

&lt;h2&gt;
  
  
  Diagrams
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxsyi0anz4emnwn5dxqo3.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxsyi0anz4emnwn5dxqo3.jpg" alt="PDF extraction time by corpus size — PdfPig stays nearly linear while pypdf steepens" width="800" height="439"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Both scale linearly in page count, but PdfPig's slope is 6× shallower. At 200 files (a typical daily batch job), .NET finishes in under 3 seconds; Python takes 17 seconds.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd219hzxmbybim9ghjvem.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd219hzxmbybim9ghjvem.jpg" alt="Speedup multiplier by subset size — gap widens as JIT warms up" width="799" height="439"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The 10-file speedup (3.9×) is lower because .NET's JIT incurs one-time compilation overhead for the PDF parser code paths. By 50 files the JIT is fully warm and the speedup stabilises around 6×.&lt;/p&gt;

</description>
      <category>dotnet</category>
      <category>csharp</category>
      <category>performance</category>
      <category>benchmarks</category>
    </item>
    <item>
      <title>NetworkX vs CSR + TensorPrimitives: PageRank on 28M Edges</title>
      <dc:creator>Milliseconds.dev</dc:creator>
      <pubDate>Sun, 31 May 2026 18:20:14 +0000</pubDate>
      <link>https://dev.to/milliseconds/networkx-vs-csr-tensorprimitives-pagerank-on-28m-edges-3i8c</link>
      <guid>https://dev.to/milliseconds/networkx-vs-csr-tensorprimitives-pagerank-on-28m-edges-3i8c</guid>
      <description>&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;PageRank is the canonical graph algorithm. NetworkX implements it in pure Python — its dict-of-dict adjacency representation means every power-iteration step dispatches millions of Python attribute lookups. When the graph has 1.8 million nodes and 28.5 million edges (Wikipedia category hyperlinks), those lookups dominate the runtime.&lt;/p&gt;

&lt;p&gt;The .NET replacement uses a &lt;strong&gt;CSR (Compressed Sparse Row)&lt;/strong&gt; matrix — two flat &lt;code&gt;int[]&lt;/code&gt; arrays for the graph structure — and &lt;code&gt;TensorPrimitives&lt;/code&gt; for the SIMD-accelerated normalization step inside each iteration.&lt;/p&gt;

&lt;h2&gt;
  
  
  Benchmark Setup
&lt;/h2&gt;

&lt;p&gt;Five SNAP datasets of increasing size:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dataset&lt;/th&gt;
&lt;th&gt;Nodes&lt;/th&gt;
&lt;th&gt;Edges&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;wiki-Vote&lt;/td&gt;
&lt;td&gt;7,115&lt;/td&gt;
&lt;td&gt;103,689&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;soc-Epinions1&lt;/td&gt;
&lt;td&gt;75,879&lt;/td&gt;
&lt;td&gt;508,837&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;web-Stanford&lt;/td&gt;
&lt;td&gt;281,903&lt;/td&gt;
&lt;td&gt;2,312,497&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;web-Google&lt;/td&gt;
&lt;td&gt;875,713&lt;/td&gt;
&lt;td&gt;5,105,039&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;wiki-topcats&lt;/td&gt;
&lt;td&gt;1,791,489&lt;/td&gt;
&lt;td&gt;28,511,807&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Algorithm: power-iteration PageRank, damping=0.85, tol=1e-6. Both implementations converge to identical top-10 node rankings.&lt;/p&gt;

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dataset&lt;/th&gt;
&lt;th&gt;Python (NetworkX)&lt;/th&gt;
&lt;th&gt;.NET (CSR)&lt;/th&gt;
&lt;th&gt;Speedup&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;wiki-Vote (103k edges)&lt;/td&gt;
&lt;td&gt;~0.8 s&lt;/td&gt;
&lt;td&gt;~100 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~8×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;soc-Epinions1 (508k edges)&lt;/td&gt;
&lt;td&gt;~8 s&lt;/td&gt;
&lt;td&gt;~600 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~13×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;web-Stanford (2.3M edges)&lt;/td&gt;
&lt;td&gt;~120 s&lt;/td&gt;
&lt;td&gt;~5 s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~24×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;web-Google (5.1M edges)&lt;/td&gt;
&lt;td&gt;~5.5 min&lt;/td&gt;
&lt;td&gt;~12 s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~28×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;wiki-topcats (28.5M edges)&lt;/td&gt;
&lt;td&gt;~47 min&lt;/td&gt;
&lt;td&gt;~60 s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~47×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The speedup grows with graph size because NetworkX's Python dispatch cost scales with edge count, while the CSR inner loop is a tight JIT-compiled SIMD pass.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why CSR Beats NetworkX
&lt;/h2&gt;

&lt;p&gt;NetworkX represents each node's neighbors as a Python dict. Iterating the adjacency in one power-iteration step means:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Calling &lt;code&gt;G.neighbors(node)&lt;/code&gt; — a Python method call&lt;/li&gt;
&lt;li&gt;Iterating a dict — unboxing int keys, chasing heap pointers&lt;/li&gt;
&lt;li&gt;Accumulating a float into another dict value — another boxing step&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That happens for every edge, every iteration, roughly 50–80 times to convergence.&lt;/p&gt;

&lt;p&gt;CSR collapses the graph to two arrays: &lt;code&gt;rowPtr[n+1]&lt;/code&gt; (where each node's neighbors start) and &lt;code&gt;colIdx[edges]&lt;/code&gt; (the neighbor list). Iterating neighbors of node &lt;code&gt;v&lt;/code&gt; is a tight C loop from &lt;code&gt;rowPtr[v]&lt;/code&gt; to &lt;code&gt;rowPtr[v+1]&lt;/code&gt;. No Python objects, no dict hashing, no pointer chasing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Code
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Power iteration — one full pass over all edges&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;++)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;contrib&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rowPtr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="n"&gt;rowPtr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rowPtr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;rowPtr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;++)&lt;/span&gt;
        &lt;span class="n"&gt;next&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;colIdx&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="p"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;contrib&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="c1"&gt;// Damping + dangling-node correction&lt;/span&gt;
&lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="k"&gt;base&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1.0&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;/&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="n"&gt;dangling&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="n"&gt;TensorPrimitives&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MultiplyAdd&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;double&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;(&lt;/span&gt;&lt;span class="n"&gt;next&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsSpan&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;base&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;next&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsSpan&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# NetworkX power iteration (simplified excerpt)
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;nbrs&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;adjacency&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;nbr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;nbrs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;xlast&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;nbr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;nbrs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;out_degree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nbrs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;xlast&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;danglesum&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every dict access in the Python version maps to multiple bytecode instructions. The .NET version compiles to a loop over two contiguous integer arrays — the CPU's prefetcher handles the rest.&lt;/p&gt;

&lt;h2&gt;
  
  
  Diagrams
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc5c8lc7kqbswq81k6grj.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc5c8lc7kqbswq81k6grj.jpg" alt="PageRank speedup multiplier by dataset size — peaks at 47× on wiki-topcats" width="800" height="439"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The speedup curve is almost linear in edge count. NetworkX's dispatch overhead is proportional to edges processed, while .NET's JIT loop has near-zero per-edge overhead past the first iteration.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2vm8g0z8m3e4l1uilc5z.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2vm8g0z8m3e4l1uilc5z.jpg" alt="Absolute execution time: Python in minutes, .NET in seconds" width="800" height="440"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At wiki-topcats scale Python takes ~47 minutes; .NET takes ~60 seconds. The same convergence, the same top-10 ranking, 47× faster.&lt;/p&gt;

</description>
      <category>dotnet</category>
      <category>csharp</category>
      <category>performance</category>
      <category>benchmarks</category>
    </item>
    <item>
      <title>textdistance vs ArrayPool: Edit Distance Without the Allocations</title>
      <dc:creator>Milliseconds.dev</dc:creator>
      <pubDate>Sat, 30 May 2026 23:39:36 +0000</pubDate>
      <link>https://dev.to/milliseconds/textdistance-vs-arraypool-edit-distance-without-the-allocations-1ich</link>
      <guid>https://dev.to/milliseconds/textdistance-vs-arraypool-edit-distance-without-the-allocations-1ich</guid>
      <description>&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;Levenshtein edit distance is used everywhere strings need to be compared: spell checking, fuzzy record matching, DNA sequence alignment, and plagiarism detection. Python's &lt;code&gt;textdistance&lt;/code&gt; library implements it cleanly, but with &lt;code&gt;external=False&lt;/code&gt; (pure Python, no C extension) it allocates a full &lt;code&gt;O(m × n)&lt;/code&gt; matrix on every call.&lt;/p&gt;

&lt;p&gt;The .NET replacement uses the Wagner-Fischer algorithm reduced to a single row: &lt;code&gt;O(min(m, n))&lt;/code&gt; space instead of &lt;code&gt;O(m × n)&lt;/code&gt;, backed by &lt;code&gt;ArrayPool&amp;lt;int&amp;gt;&lt;/code&gt; so even that row is never garbage-collected — it's rented from a pool and returned after each call.&lt;/p&gt;

&lt;h2&gt;
  
  
  Benchmark Setup
&lt;/h2&gt;

&lt;p&gt;Random word pairs from a 10,000-word dictionary, tested at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;10,000 pairs&lt;/strong&gt; (average length 8 characters)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;50,000 pairs&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;100,000 pairs&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Python uses &lt;code&gt;textdistance.Levenshtein(external=False)&lt;/code&gt; — pure Python, no Cython or C backend. .NET uses a static &lt;code&gt;Levenshtein.Distance(ReadOnlySpan&amp;lt;char&amp;gt;, ReadOnlySpan&amp;lt;char&amp;gt;, int[])&lt;/code&gt; with the row buffer passed in from the caller.&lt;/p&gt;

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pairs&lt;/th&gt;
&lt;th&gt;Python (textdistance)&lt;/th&gt;
&lt;th&gt;.NET (ArrayPool)&lt;/th&gt;
&lt;th&gt;Speedup&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;10,000&lt;/td&gt;
&lt;td&gt;~1.4 s&lt;/td&gt;
&lt;td&gt;~115 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;12×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;50,000&lt;/td&gt;
&lt;td&gt;~7.1 s&lt;/td&gt;
&lt;td&gt;~200 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;36×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;100,000&lt;/td&gt;
&lt;td&gt;~14.2 s&lt;/td&gt;
&lt;td&gt;~200 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;71×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The .NET runtime barely grows past 50k — the preallocated row means the GC is idle and the loop is hot in L1 cache. Python's time grows linearly because each pair triggers a fresh matrix allocation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why the Speedup Compounds
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Allocation.&lt;/strong&gt; &lt;code&gt;textdistance&lt;/code&gt; builds a list of lists for the full DP table every call. At average word length 8, that's a 9×9 Python list — 9 list objects plus 81 boxed integers, each on the heap. At 100k pairs: 9 million list allocations, 810 million boxed integers created and GC'd.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Row-only Wagner-Fischer.&lt;/strong&gt; The full matrix is never needed — only the previous row to compute the current row. The .NET implementation keeps a single &lt;code&gt;int[]&lt;/code&gt; of length &lt;code&gt;min(m, n) + 1&lt;/code&gt;, rented once from &lt;code&gt;ArrayPool&amp;lt;int&amp;gt;&lt;/code&gt; and reused for the entire batch.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;ReadOnlySpan&amp;lt;char&amp;gt;&lt;/code&gt;.&lt;/strong&gt; No string copies. &lt;code&gt;a.AsSpan()&lt;/code&gt; and &lt;code&gt;b.AsSpan()&lt;/code&gt; give zero-allocation views; the character comparison &lt;code&gt;a[i-1] == b[j-1]&lt;/code&gt; compiles to a single native compare instruction.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Code
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Single row, rented once — O(min(m,n)) space&lt;/span&gt;
&lt;span class="c1"&gt;// Python equivalent: textdistance.Levenshtein(external=False).distance(a, b)&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;Distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ReadOnlySpan&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;char&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ReadOnlySpan&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;char&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Length&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Length&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Length&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;++)&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;++)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;prev&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;++)&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;cost&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prev&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt; &lt;span class="n"&gt;cost&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="n"&gt;prev&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# textdistance — allocates full matrix per call
&lt;/span&gt;&lt;span class="n"&gt;algo&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Levenshtein&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;external&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;pairs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;dist&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;algo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Python call allocates the full table, fills it, reads &lt;code&gt;table[m][n]&lt;/code&gt;, and then abandons the entire allocation to the GC. The .NET call overwrites the same row buffer in-place.&lt;/p&gt;

&lt;h2&gt;
  
  
  Diagrams
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fte3kbj402411b097s37u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fte3kbj402411b097s37u.png" alt="Speedup multiplier — grows from 12× at 10k pairs to 71× at 100k pairs" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The speedup grows because Python's GC pressure increases with pair count while .NET's stays constant. At 100k pairs Python's GC is collecting tens of millions of short-lived list objects; .NET's GC has nothing to collect.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcojhufytcnjy6vu3dknw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcojhufytcnjy6vu3dknw.png" alt="Absolute time — .NET stays flat past 50k pairs as the row stays in L1 cache" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;.NET's time barely changes from 50k to 100k pairs — the preallocated row fits in L1 cache and stays there for the entire batch. Python's time grows linearly with allocation cost.&lt;/p&gt;

</description>
      <category>dotnet</category>
      <category>csharp</category>
      <category>performance</category>
      <category>benchmarks</category>
    </item>
  </channel>
</rss>
