<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Mohammad Raziei</title>
    <description>The latest articles on DEV Community by Mohammad Raziei (@mohammadraziei).</description>
    <link>https://dev.to/mohammadraziei</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3598809%2F06207ea2-3a7b-4667-a9c1-d19355411337.jpeg</url>
      <title>DEV Community: Mohammad Raziei</title>
      <link>https://dev.to/mohammadraziei</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/mohammadraziei"/>
    <language>en</language>
    <item>
      <title>pygixml 0.10.0 released — A Faster, Smarter XML Parser for Python</title>
      <dc:creator>Mohammad Raziei</dc:creator>
      <pubDate>Sat, 11 Apr 2026 16:12:28 +0000</pubDate>
      <link>https://dev.to/mohammadraziei/pygixml-0100-released-a-faster-smarter-xml-parser-for-python-12p1</link>
      <guid>https://dev.to/mohammadraziei/pygixml-0100-released-a-faster-smarter-xml-parser-for-python-12p1</guid>
      <description>&lt;p&gt;XML parsing in Python has had three choices for over a decade: &lt;strong&gt;ElementTree&lt;/strong&gt; (slow but built-in), &lt;strong&gt;lxml&lt;/strong&gt; (fast but heavy), and &lt;strong&gt;minidom&lt;/strong&gt; (don't). I wanted something that sits at the intersection of speed, simplicity, and a small install footprint.&lt;/p&gt;

&lt;p&gt;That's what &lt;a href="https://github.com/MohammadRaziei/pygixml" rel="noopener noreferrer"&gt;pygixml&lt;/a&gt; is — a Cython wrapper around &lt;a href="https://pugixml.org/" rel="noopener noreferrer"&gt;pugixml&lt;/a&gt;, one of the fastest C++ XML parsers in existence.&lt;/p&gt;

&lt;p&gt;Version 0.10.0 just dropped, and it's the most significant release so far. Let's walk through what's new.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Numbers (50 iterations, 5 000 elements)
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Library&lt;/th&gt;
&lt;th&gt;Avg Time&lt;/th&gt;
&lt;th&gt;Speedup vs ElementTree&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;pygixml&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0.0009 s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;9.2× faster&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;lxml&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0.0041 s&lt;/td&gt;
&lt;td&gt;2.0× faster&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ElementTree&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0.0083 s&lt;/td&gt;
&lt;td&gt;1.0× (baseline)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Memory usage tells a similar story: pygixml uses &lt;strong&gt;0.67 MB&lt;/strong&gt; at 5 000 elements vs ElementTree's &lt;strong&gt;4.84 MB&lt;/strong&gt;. And the installed package is just &lt;strong&gt;0.45 MB&lt;/strong&gt;, vs lxml's 5.48 MB, according to the &lt;a href="https://dev.to/mohammadraziei/introducing-pip-size-see-the-real-cost-of-python-packages-5a58"&gt;pip-size&lt;/a&gt; report.&lt;/p&gt;

&lt;p&gt;If you care about these numbers, the full benchmark suite covers 6 XML sizes (100 to 10 000 elements) and is included in the repo. Run it yourself:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/MohammadRaziei/pygixml.git
&lt;span class="nb"&gt;cd &lt;/span&gt;pygixml

cmake &lt;span class="nt"&gt;-B&lt;/span&gt; build
cmake &lt;span class="nt"&gt;--build&lt;/span&gt; build &lt;span class="nt"&gt;--target&lt;/span&gt; run_full_benchmarks
&lt;span class="c"&gt;# or&lt;/span&gt;
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nb"&gt;.&lt;/span&gt;
python benchmarks/full_benchmark.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  What's New in 0.10.0
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. &lt;code&gt;children()&lt;/code&gt; — Iterate Direct Children (or All Descendants)
&lt;/h3&gt;

&lt;p&gt;Before 0.10.0, iterating over an element's children required manual sibling walking:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# The old way — walk siblings manually
&lt;/span&gt;&lt;span class="n"&gt;child&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;first_child&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;child&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;child&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;student&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;child&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;child&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;child&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;next_sibling&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you get a clean Pythonic iterator:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Direct children only (default)
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;child&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;children&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="nf"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;child&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# All descendants in depth-first order
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;descendant&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;children&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;recursive&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;descendant&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Text, comment, and processing-instruction nodes are automatically skipped — you only get element nodes.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. &lt;code&gt;text()&lt;/code&gt; — Recursive Text Extraction with Configurable Joins
&lt;/h3&gt;

&lt;p&gt;Getting text out of an XML element shouldn't require walking the tree yourself. &lt;code&gt;text()&lt;/code&gt; collects all text and CDATA nodes from the subtree and joins them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pygixml&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse_string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
&amp;lt;article&amp;gt;
    &amp;lt;p&amp;gt;Hello &amp;lt;b&amp;gt;world&amp;lt;/b&amp;gt;! This is &amp;lt;i&amp;gt;rich&amp;lt;/i&amp;gt; text.&amp;lt;/p&amp;gt;
&amp;lt;/article&amp;gt;
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;child&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;p&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;                     &lt;span class="c1"&gt;# "Hello\nworld!\nThis is\nrich\ntext."
&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;recursive&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;      &lt;span class="c1"&gt;# "Hello "  (direct text only)
&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;             &lt;span class="c1"&gt;# "Hello world! This is rich text."
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For simple cases where you just want the first child's text, &lt;code&gt;child_value("tag")&lt;/code&gt; is still there and is slightly faster.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. &lt;code&gt;element.value = "text"&lt;/code&gt; — Finally, This Works
&lt;/h3&gt;

&lt;p&gt;Element nodes in pugixml don't store text directly — they contain child text nodes. In 0.10.0, setting &lt;code&gt;.value&lt;/code&gt; on an element automatically creates or replaces that text child:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pygixml&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse_string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;root&amp;gt;&amp;lt;item/&amp;gt;&amp;lt;/root&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;child&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;item&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c1"&gt;# "Hello"  ✅
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;  &lt;span class="c1"&gt;# "Hello"  ✅
# XML: &amp;lt;item&amp;gt;Hello&amp;lt;/item&amp;gt;
&lt;/span&gt;
&lt;span class="c1"&gt;# Replaces existing text
&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;World&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c1"&gt;# "World"
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And reading back: &lt;code&gt;element.value&lt;/code&gt; now returns the first text child's value (or &lt;code&gt;None&lt;/code&gt; if there's no text), so &lt;code&gt;set&lt;/code&gt; and &lt;code&gt;get&lt;/code&gt; are symmetric.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. &lt;code&gt;from_mem_id_unsafe()&lt;/code&gt; — O(1) Node Lookup
&lt;/h3&gt;

&lt;p&gt;This is the most powerful — and most dangerous — feature in 0.10.0.&lt;/p&gt;

&lt;p&gt;Every &lt;code&gt;XMLNode&lt;/code&gt; exposes a &lt;code&gt;mem_id&lt;/code&gt; property: a unique numeric identifier derived from the node's internal address. You can use it to reconstruct a node later:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Fast: O(1), direct pointer cast
&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pygixml&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;XMLNode&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_mem_id_unsafe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Safe but O(n): walks the tree
&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find_mem_id&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The difference is &lt;strong&gt;O(1) vs O(n)&lt;/strong&gt;. But &lt;code&gt;from_mem_id_unsafe&lt;/code&gt; treats the identifier as a raw pointer: if the document was freed or the node deleted, the pointer dangles, and using it is undefined behavior that &lt;strong&gt;will typically crash the process with a segfault&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When to use it:&lt;/strong&gt; only in performance-critical paths where you've profiled and confirmed that &lt;code&gt;find_mem_id&lt;/code&gt;'s tree walk is a bottleneck. For most code, &lt;code&gt;find_mem_id&lt;/code&gt; is the right choice.&lt;/p&gt;

&lt;p&gt;A &lt;code&gt;mem_id&lt;/code&gt; is a plain integer, so it is &lt;strong&gt;hashable&lt;/strong&gt; and works well as a key for dictionary-based caching.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why aren't &lt;code&gt;XMLNode&lt;/code&gt; objects hashable?
&lt;/h3&gt;

&lt;p&gt;You might wonder why you can't just do &lt;code&gt;cache[node] = data&lt;/code&gt;. This is intentional. &lt;code&gt;XMLNode&lt;/code&gt; objects are &lt;strong&gt;mutable&lt;/strong&gt;: you can rename them, change their content, add children, and so on. By Python convention, mutable objects with value-based equality shouldn't be hashable, because their hash would change the moment you modify them and dictionary lookups would silently miss. Using &lt;code&gt;mem_id&lt;/code&gt; as the key makes the contract explicit: the integer is stable and hashable, while the node wrapper is transient.&lt;/p&gt;
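&lt;p&gt;A small, library-independent sketch of the problem. The &lt;code&gt;Tag&lt;/code&gt; class below is hypothetical and not part of pygixml; it hashes by its mutable name, and Python's &lt;code&gt;id()&lt;/code&gt; stands in for &lt;code&gt;mem_id&lt;/code&gt; as a stable integer key.&lt;/p&gt;

```python
class Tag:
    """Hypothetical mutable object whose hash depends on its content."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, Tag) and self.name == other.name

    def __hash__(self):
        return hash(self.name)


tag = Tag("student")
by_object = {tag: "metadata"}
by_id = {id(tag): "metadata"}  # stable integer key, analogous to mem_id

tag.name = "teacher"  # mutate after insertion

# The entry was stored under hash("student"), so a lookup with the
# mutated object uses a different hash and misses it.
found_by_object = tag in by_object
found_by_id = id(tag) in by_id
print(found_by_object, found_by_id)
```

&lt;p&gt;The integer key keeps working because nothing about it depends on the object's current content.&lt;/p&gt;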

&lt;h3&gt;
  
  
  Using nodes in dictionaries (the right way)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Store node data by mem_id (a stable, hashable integer)
&lt;/span&gt;&lt;span class="n"&gt;cache&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mem_id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;xpath&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xpath&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;depth&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xpath&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Later, reconstruct the node (O(1) but unsafe)
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;mem_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pygixml&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;XMLNode&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_mem_id_unsafe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mem_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="c1"&gt;# Always check if the node is still valid
&lt;/span&gt;        &lt;span class="nf"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For safety, use &lt;code&gt;find_mem_id&lt;/code&gt; (O(n) but returns &lt;code&gt;None&lt;/code&gt; for deleted nodes):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find_mem_id&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mem_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
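&lt;p&gt;To illustrate why a tree walk is safe but linear, here is a rough stand-in built on the standard library. It is not pygixml's implementation: &lt;code&gt;find_by_id&lt;/code&gt; is a hypothetical helper, and Python's &lt;code&gt;id()&lt;/code&gt; plays the role of &lt;code&gt;mem_id&lt;/code&gt;.&lt;/p&gt;

```python
import xml.etree.ElementTree as ET


def find_by_id(root, wanted_id):
    """Depth-first walk; returns the element whose id() matches, else None."""
    for element in root.iter():
        if id(element) == wanted_id:
            return element
    return None


# Build a tiny tree through the API
root = ET.Element("root")
kept = ET.SubElement(root, "kept")
dropped = ET.SubElement(root, "dropped")

kept_id, dropped_id = id(kept), id(dropped)
root.remove(dropped)  # detach the node from the tree

print(find_by_id(root, kept_id) is kept)
print(find_by_id(root, dropped_id))
```

&lt;p&gt;Because the walk only ever visits nodes that are still attached, a stale identifier simply produces &lt;code&gt;None&lt;/code&gt; instead of touching freed memory.&lt;/p&gt;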



&lt;h3&gt;
  
  
  5. &lt;code&gt;xpath&lt;/code&gt; Property — Generate Absolute XPath to Any Node
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pygixml&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse_string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;root&amp;gt;&amp;lt;book&amp;gt;&amp;lt;title&amp;gt;Gatsby&amp;lt;/title&amp;gt;&amp;lt;/book&amp;gt;&amp;lt;/root&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;child&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;book&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;child&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xpath&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# /root[1]/book[1]/title[1]
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is a custom O(depth) algorithm that walks from the node up to the root, counting same-name siblings to produce accurate positional predicates. pugixml doesn't provide this natively — it's pygixml's own addition.&lt;/p&gt;
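&lt;p&gt;The walk-up-and-count idea can be sketched with the standard library. This is an approximation under stated assumptions, not pygixml's code: &lt;code&gt;absolute_xpath&lt;/code&gt; and &lt;code&gt;parent_of&lt;/code&gt; are hypothetical names, ElementTree needs an explicit parent map because it has no parent pointers, and the sketch makes no attempt to match pygixml's cost.&lt;/p&gt;

```python
import xml.etree.ElementTree as ET


def absolute_xpath(node, parent_of):
    """Build /a[1]/b[2]/... by walking from node up to the root."""
    steps = []
    while node is not None:
        parent = parent_of.get(node)
        if parent is None:
            position = 1  # the root element has no siblings
        else:
            same_name = [c for c in parent if c.tag == node.tag]
            # 1-based index among same-name siblings, matched by identity
            position = next(i for i, c in enumerate(same_name, 1) if c is node)
        steps.append(f"{node.tag}[{position}]")
        node = parent
    return "/" + "/".join(reversed(steps))


# Build the tree through the API: two books, a title under the second
root = ET.Element("root")
book1 = ET.SubElement(root, "book")
book2 = ET.SubElement(root, "book")
title = ET.SubElement(book2, "title")

# Map every child to its parent so the walk can go upward
parent_of = {child: parent for parent in root.iter() for child in parent}

print(absolute_xpath(title, parent_of))  # /root[1]/book[2]/title[1]
```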

&lt;h3&gt;
  
  
  6. &lt;code&gt;xml&lt;/code&gt; Property — One-Liner XML Serialization
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xml&lt;/span&gt;  &lt;span class="c1"&gt;# same as node.to_string() with 2-space indent
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  7. &lt;code&gt;ParseFlags&lt;/code&gt; Enum
&lt;/h3&gt;

&lt;p&gt;All 18 pugixml parse flags are now available as a proper &lt;code&gt;IntFlag&lt;/code&gt; enum:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Fastest parse — skip escapes, EOL normalization, whitespace
&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pygixml&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse_string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;xml&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pygixml&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ParseFlags&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MINIMAL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Combine specific flags
&lt;/span&gt;&lt;span class="n"&gt;flags&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pygixml&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ParseFlags&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;COMMENTS&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;pygixml&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ParseFlags&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CDATA&lt;/span&gt;
&lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pygixml&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse_string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;xml&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;flags&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
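&lt;p&gt;For readers new to &lt;code&gt;IntFlag&lt;/code&gt;, here is a minimal sketch of how combining and testing flags behaves. &lt;code&gt;DemoParseFlags&lt;/code&gt; is hypothetical; its member names and values are illustrative and are not pygixml's actual flag values.&lt;/p&gt;

```python
from enum import IntFlag


class DemoParseFlags(IntFlag):
    """Hypothetical flags; names and values are illustrative only."""
    COMMENTS = 1
    CDATA = 2
    ESCAPES = 4


flags = DemoParseFlags.COMMENTS | DemoParseFlags.CDATA

has_cdata = DemoParseFlags.CDATA in flags      # membership test
has_escapes = DemoParseFlags.ESCAPES in flags
as_int = int(flags)                            # IntFlag members are ints
print(has_cdata, has_escapes, as_int)          # True False 3
```

&lt;p&gt;Since &lt;code&gt;IntFlag&lt;/code&gt; members are integers, a combined value can be handed to an API that expects a plain integer bitmask.&lt;/p&gt;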



&lt;h3&gt;
  
  
  8. Python 3.6–3.13 Support
&lt;/h3&gt;

&lt;p&gt;pygixml works with every Python from 3.6 through 3.13. &lt;code&gt;.pyi&lt;/code&gt; stub generation via &lt;code&gt;stubgen-pyx&lt;/code&gt; is only enabled on Python 3.9+ (where the package is available), so older versions still build fine — just without type stubs.&lt;/p&gt;




&lt;h2&gt;
  
  
  Full Feature Summary
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;pygixml&lt;/th&gt;
&lt;th&gt;lxml&lt;/th&gt;
&lt;th&gt;ElementTree&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Parse speed (5K elements)&lt;/td&gt;
&lt;td&gt;0.0009 s&lt;/td&gt;
&lt;td&gt;0.0041 s&lt;/td&gt;
&lt;td&gt;0.0083 s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Memory (5K elements)&lt;/td&gt;
&lt;td&gt;0.67 MB&lt;/td&gt;
&lt;td&gt;0.67 MB&lt;/td&gt;
&lt;td&gt;4.84 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Runtime Dependencies&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;libxml2, libxslt&lt;/td&gt;
&lt;td&gt;None (stdlib)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Package size&lt;/td&gt;
&lt;td&gt;0.45 MB&lt;/td&gt;
&lt;td&gt;5.48 MB&lt;/td&gt;
&lt;td&gt;built-in&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;XPath 1.0&lt;/td&gt;
&lt;td&gt;✅ full&lt;/td&gt;
&lt;td&gt;✅ full&lt;/td&gt;
&lt;td&gt;❌ limited&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;XSLT&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Schema validation&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;children()&lt;/code&gt; iterator&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;text()&lt;/code&gt; recursive&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;element.value = "text"&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;xpath&lt;/code&gt; property&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;mem_id&lt;/code&gt; caching&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;pygixml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Zero Runtime Dependencies
&lt;/h3&gt;

&lt;p&gt;This is a huge advantage that often gets overlooked. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;&lt;code&gt;lxml&lt;/code&gt;&lt;/strong&gt; depends on system libraries (&lt;code&gt;libxml2&lt;/code&gt;, &lt;code&gt;libxslt&lt;/code&gt;). If those have security vulnerabilities or version conflicts, your environment breaks.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;&lt;code&gt;pygixml&lt;/code&gt;&lt;/strong&gt; bundles &lt;code&gt;pugixml&lt;/code&gt; directly into the Python extension.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As a result, pygixml has &lt;strong&gt;zero runtime dependencies&lt;/strong&gt;. No &lt;code&gt;libxml2&lt;/code&gt;, no external binaries, no transitive dependency chains. Just a single install that works.&lt;/p&gt;

&lt;p&gt;Pre-compiled wheels are available for Windows, Linux, and macOS.&lt;/p&gt;




&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/MohammadRaziei/pygixml" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mohammadraziei.github.io/pygixml/" rel="noopener noreferrer"&gt;Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pypi.org/project/pygixml/" rel="noopener noreferrer"&gt;PyPI&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If this project helps you, a star on GitHub goes a long way. Thanks for reading.&lt;/p&gt;

</description>
      <category>python</category>
      <category>showdev</category>
      <category>xml</category>
      <category>cython</category>
    </item>
    <item>
      <title>How to Parse XML Fast in 2026 (Python)</title>
      <dc:creator>Mohammad Raziei</dc:creator>
      <pubDate>Wed, 08 Apr 2026 23:53:13 +0000</pubDate>
      <link>https://dev.to/mohammadraziei/how-to-parse-xml-fast-in-2026-python-20fd</link>
      <guid>https://dev.to/mohammadraziei/how-to-parse-xml-fast-in-2026-python-20fd</guid>
      <description>&lt;p&gt;JSON won the internet. We all know that. But XML never left; it just moved into the places where &lt;em&gt;reliability matters more than trendiness&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;If you work with Maven configs, Android manifests, Office Open XML (&lt;code&gt;.docx&lt;/code&gt;/&lt;code&gt;.xlsx&lt;/code&gt;), SVG, RSS feeds, DocBook, SOAP services, or any enterprise integration layer, you're still parsing XML. And in 2026, there's no excuse for it being slow.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Problem with XML Parsing in Python
&lt;/h2&gt;

&lt;p&gt;Python's standard library ships with &lt;code&gt;xml.etree.ElementTree&lt;/code&gt;. It works. It's fine for small files. But the moment your XML grows beyond a few hundred elements, ElementTree becomes a bottleneck, because it builds a full Python object for &lt;em&gt;every single node, attribute, and text node&lt;/em&gt; in the tree.&lt;/p&gt;

&lt;p&gt;The usual answer is &lt;code&gt;lxml&lt;/code&gt;, which wraps the C library libxml2. It's fast and feature-rich. But it's also a 5.5 MB install with a heavy dependency chain, and its Python bindings add overhead on every call.&lt;/p&gt;

&lt;p&gt;So what if you want the fastest possible parse, a tiny footprint, and a clean Python API?&lt;/p&gt;

&lt;p&gt;That's the question that led me to build &lt;strong&gt;&lt;a href="https://github.com/MohammadRaziei/pygixml" rel="noopener noreferrer"&gt;pygixml&lt;/a&gt;&lt;/strong&gt;, a Cython wrapper around &lt;a href="https://pugixml.org/" rel="noopener noreferrer"&gt;pugixml&lt;/a&gt;, one of the fastest C++ XML parsers in existence.&lt;/p&gt;

&lt;p&gt;Let me show you the numbers first, then we'll get into the code.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Numbers
&lt;/h2&gt;

&lt;p&gt;Here's what happens when you parse a 5,000-element XML document with the three most common Python XML libraries:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Library&lt;/th&gt;
&lt;th&gt;Parse Time&lt;/th&gt;
&lt;th&gt;Speedup vs ElementTree&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;pygixml&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0.0009 s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;8.6× faster&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;lxml&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0.0041 s&lt;/td&gt;
&lt;td&gt;1.9× faster&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ElementTree&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0.0076 s&lt;/td&gt;
&lt;td&gt;1.0× (baseline)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffx7k3b4pqn0k1bmjv9tu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffx7k3b4pqn0k1bmjv9tu.png" alt=" " width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And memory usage during the same parse:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Library&lt;/th&gt;
&lt;th&gt;Peak Memory&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;pygixml&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0.67 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;lxml&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0.67 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ElementTree&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;4.84 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

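&lt;p&gt;ElementTree shows roughly &lt;strong&gt;7× more memory&lt;/strong&gt; because it materializes every node as a&lt;br&gt;
full Python object. pygixml and lxml keep the tree in C/C++ memory until you&lt;br&gt;
explicitly access data. Keep in mind that these figures are &lt;code&gt;tracemalloc&lt;/code&gt; peaks, and&lt;br&gt;
&lt;code&gt;tracemalloc&lt;/code&gt; only counts allocations made through Python's allocator, so native&lt;br&gt;
memory held by the C/C++ libraries is likely not reflected in them.&lt;/p&gt;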
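
&lt;p&gt;A minimal sketch of how a Python-side peak like this can be measured with &lt;code&gt;tracemalloc&lt;/code&gt;. It uses only the standard library's ElementTree so it runs anywhere; the helper names &lt;code&gt;build_document&lt;/code&gt; and &lt;code&gt;peak_parse_memory&lt;/code&gt; and the synthetic document are assumptions for illustration, not the project's benchmark code.&lt;/p&gt;

```python
import tracemalloc
import xml.etree.ElementTree as ET


def build_document(n_items):
    """Build a synthetic XML string with n_items child elements."""
    root = ET.Element("root")
    for i in range(n_items):
        ET.SubElement(root, "item", id=str(i)).text = f"value {i}"
    return ET.tostring(root, encoding="unicode")


def peak_parse_memory(parse, xml_text):
    """Return the peak traced allocation, in bytes, during one parse."""
    tracemalloc.start()
    try:
        tree = parse(xml_text)  # keep a reference so the tree stays alive
        _, peak = tracemalloc.get_traced_memory()
        return peak
    finally:
        tracemalloc.stop()


xml_text = build_document(5000)
peak = peak_parse_memory(ET.fromstring, xml_text)
print(f"ElementTree peak: {peak / 1024 / 1024:.2f} MB")
```

&lt;p&gt;Any other parser can be passed in as the &lt;code&gt;parse&lt;/code&gt; callable. Memory that a C or C++ library allocates outside Python's allocator will not show up in the result.&lt;/p&gt;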

&lt;p&gt;The installed package size tells its own story:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Package&lt;/th&gt;
&lt;th&gt;Size&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;pygixml&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0.43 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;lxml&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;5.48 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That's nearly a 13× difference. If you're building a Docker image, Lambda function,&lt;br&gt;
or anything where size matters, it adds up.&lt;/p&gt;
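
&lt;p&gt;If you want to check installed sizes in your own environment, here is one possible approach using &lt;code&gt;importlib.metadata&lt;/code&gt; from the standard library. The helper name &lt;code&gt;installed_size_bytes&lt;/code&gt; is made up for this sketch, and it may not match how the figures above were measured.&lt;/p&gt;

```python
from importlib import metadata


def installed_size_bytes(dist_name):
    """Sum the on-disk size of every file recorded for a distribution."""
    dist = metadata.distribution(dist_name)
    total = 0
    for entry in dist.files or []:
        path = entry.locate()
        if path.is_file():
            total += path.stat().st_size
    return total


for name in ("pygixml", "lxml"):
    try:
        size_mb = installed_size_bytes(name) / (1024 * 1024)
        print(f"{name}: {size_mb:.2f} MB")
    except metadata.PackageNotFoundError:
        print(f"{name}: not installed")
```

&lt;p&gt;The result depends on platform and version, since compiled wheels differ in size between operating systems.&lt;/p&gt;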

&lt;p&gt;All benchmarks run on the same machine with &lt;code&gt;time.perf_counter()&lt;/code&gt; across 5&lt;br&gt;
warmed-up iterations. You can reproduce them yourself — the code is in the&lt;br&gt;
&lt;a href="https://github.com/MohammadRaziei/pygixml/tree/master/benchmarks" rel="noopener noreferrer"&gt;&lt;code&gt;benchmarks/&lt;/code&gt;&lt;/a&gt; directory.&lt;/p&gt;
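
&lt;p&gt;For readers who want the gist of that timing method without cloning the repository, here is a small sketch. It times the standard library's ElementTree; the helper names, the synthetic document, and the single untimed warm-up call are assumptions, and the scripts in &lt;code&gt;benchmarks/&lt;/code&gt; may be organized differently.&lt;/p&gt;

```python
import time
import xml.etree.ElementTree as ET


def build_document(n_items):
    """Build a synthetic XML string with n_items child elements."""
    root = ET.Element("root")
    for i in range(n_items):
        item = ET.SubElement(root, "item", id=str(i))
        item.text = f"value {i}"
    return ET.tostring(root, encoding="unicode")


def time_parse(parse, xml_text, iterations=5):
    """Return (average, minimum) seconds over `iterations` timed parses."""
    parse(xml_text)  # warm-up call, not timed
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        parse(xml_text)
        samples.append(time.perf_counter() - start)
    return sum(samples) / len(samples), min(samples)


xml_text = build_document(5000)
avg, best = time_parse(ET.fromstring, xml_text)
print(f"ElementTree: avg {avg:.6f} s, min {best:.6f} s")
```

&lt;p&gt;Passing another library's parse function as the first argument gives a like-for-like comparison on the same input.&lt;/p&gt;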
&lt;h2&gt;
  
  
  How pygixml Works Under the Hood
&lt;/h2&gt;

&lt;p&gt;Here's the architecture:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F57am46ezlwdhzpb7x4xa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F57am46ezlwdhzpb7x4xa.png" alt=" " width="800" height="320"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Three things make this fast:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;No Python object per node&lt;/strong&gt; — the entire parsed tree lives in C++ memory.
pygixml only creates a Python wrapper when you explicitly access a node.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero-copy Cython bridge&lt;/strong&gt; — data doesn't get copied between C++ and
Python. Strings are encoded in-place.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;pugixml's custom allocator&lt;/strong&gt; — pugixml uses a block-based memory pool
instead of per-node &lt;code&gt;malloc&lt;/code&gt;, which means fewer allocator calls and better cache
locality.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;pygixml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;One dependency-free install, 430 KB.&lt;/p&gt;
&lt;h3&gt;
  
  
  Parsing XML
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pygixml&lt;/span&gt;

&lt;span class="n"&gt;xml&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
&amp;lt;library&amp;gt;
    &amp;lt;book id=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; category=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fiction&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;gt;
        &amp;lt;title&amp;gt;The Great Gatsby&amp;lt;/title&amp;gt;
        &amp;lt;author&amp;gt;F. Scott Fitzgerald&amp;lt;/author&amp;gt;
        &amp;lt;year&amp;gt;1925&amp;lt;/year&amp;gt;
    &amp;lt;/book&amp;gt;
    &amp;lt;book id=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; category=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fiction&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;gt;
        &amp;lt;title&amp;gt;1984&amp;lt;/title&amp;gt;
        &amp;lt;author&amp;gt;George Orwell&amp;lt;/author&amp;gt;
        &amp;lt;year&amp;gt;1949&amp;lt;/year&amp;gt;
    &amp;lt;/book&amp;gt;
&amp;lt;/library&amp;gt;
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

&lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pygixml&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse_string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;xml&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;root&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;root&lt;/span&gt;

&lt;span class="c1"&gt;# Access children
&lt;/span&gt;&lt;span class="n"&gt;book&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;child&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;book&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;book&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                      &lt;span class="c1"&gt;# book
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;book&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;attribute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;     &lt;span class="c1"&gt;# 1
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;book&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;child&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;     &lt;span class="c1"&gt;# The Great Gatsby
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The API is deliberately simple. &lt;strong&gt;Properties&lt;/strong&gt; for simple access&lt;br&gt;
(&lt;code&gt;node.name&lt;/code&gt;, &lt;code&gt;node.value&lt;/code&gt;, &lt;code&gt;node.type&lt;/code&gt;), &lt;strong&gt;methods&lt;/strong&gt; for lookups and&lt;br&gt;
other operations (&lt;code&gt;node.child(name)&lt;/code&gt;, &lt;code&gt;node.text()&lt;/code&gt;). No surprises.&lt;/p&gt;
&lt;h3&gt;
  
  
  XPath Queries
&lt;/h3&gt;

&lt;p&gt;This is where pygixml really shines. pugixml's XPath engine is fast,&lt;br&gt;
standards-compliant (XPath 1.0), and fully exposed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# All fiction books
&lt;/span&gt;&lt;span class="n"&gt;fiction&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;select_nodes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;book[@category=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;fiction&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Found &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fiction&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; fiction books&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Single match
&lt;/span&gt;&lt;span class="n"&gt;match&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;select_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;book[@id=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;2&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;child&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;   &lt;span class="c1"&gt;# 1984
&lt;/span&gt;
&lt;span class="c1"&gt;# Pre-compile for repeated use
&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pygixml&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;XPathQuery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;book[year &amp;gt; 1950]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;recent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate_node_set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

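&lt;span class="c1"&gt;# Scalar evaluations (assumes each book has a &amp;lt;price&amp;gt; child;
# the sample above has none, so the average is 0 there)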
&lt;/span&gt;&lt;span class="n"&gt;avg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pygixml&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;XPathQuery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sum(book/price) div count(book)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;evaluate_number&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Average price: $&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;avg&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;has_orwell&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pygixml&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;XPathQuery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;book[author=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;George Orwell&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;evaluate_boolean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Has Orwell: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;has_orwell&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Creating XML
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pygixml&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;XMLDocument&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;root&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append_child&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;catalog&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append_child&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;product&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append_child&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;set_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Laptop&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append_child&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;price&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;set_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;999.99&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;catalog.xml&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Modifying XML
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pygixml&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse_string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;person&amp;gt;&amp;lt;name&amp;gt;John&amp;lt;/name&amp;gt;&amp;lt;/person&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;root&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;root&lt;/span&gt;

&lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;child&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;set_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Jane&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;child&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;full_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append_child&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;age&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;set_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;30&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xml&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# &amp;lt;person&amp;gt;
#   &amp;lt;full_name&amp;gt;Jane&amp;lt;/full_name&amp;gt;
#   &amp;lt;age&amp;gt;30&amp;lt;/age&amp;gt;
# &amp;lt;/person&amp;gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Performance Tuning: Parse Flags
&lt;/h2&gt;

&lt;p&gt;Here's a feature most Python XML libraries don't expose: &lt;strong&gt;parse flags&lt;/strong&gt;.&lt;br&gt;
pygixml gives you a &lt;code&gt;ParseFlags&lt;/code&gt; enum with 18 options to control exactly&lt;br&gt;
how pugixml processes your input.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Fastest possible parse — skip everything optional
&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pygixml&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse_string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;xml&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pygixml&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ParseFlags&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MINIMAL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Pick exactly what you need
&lt;/span&gt;&lt;span class="n"&gt;flags&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pygixml&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ParseFlags&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;COMMENTS&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;pygixml&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ParseFlags&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CDATA&lt;/span&gt;
&lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pygixml&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse_string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;xml&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;flags&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftipdidg61vzaoi45e7j3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftipdidg61vzaoi45e7j3.png" alt=" " width="800" height="320"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;ParseFlags.MINIMAL&lt;/code&gt; skips escape processing, EOL normalization, and&lt;br&gt;
attribute whitespace conversion. On real-world XML with lots of escaped&lt;br&gt;
content (&lt;code&gt;&amp;amp;amp;&lt;/code&gt;, &lt;code&gt;&amp;amp;lt;&lt;/code&gt;, etc.), this can give you a noticeable speed boost&lt;br&gt;
over the default.&lt;/p&gt;
&lt;h2&gt;
  
  
  Which Library Should You Use?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F136h1wxrjr04e5mp0d4f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F136h1wxrjr04e5mp0d4f.png" alt=" " width="800" height="1410"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;pygixml&lt;/th&gt;
&lt;th&gt;lxml&lt;/th&gt;
&lt;th&gt;ElementTree&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Parse speed&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fastest&lt;/td&gt;
&lt;td&gt;Fast&lt;/td&gt;
&lt;td&gt;Slowest&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Memory&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;High (7×)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Package size&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0.43 MB&lt;/td&gt;
&lt;td&gt;5.48 MB&lt;/td&gt;
&lt;td&gt;Built-in&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;XPath&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1.0&lt;/td&gt;
&lt;td&gt;1.0 (+ EXSLT extensions)&lt;/td&gt;
&lt;td&gt;Limited&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;XSLT&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Schema validation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Dependencies&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;libxml2, libxslt&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
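
&lt;p&gt;To make the "Limited" XPath entry for ElementTree concrete, here is a short standard-library sketch. The document is built with the ElementTree API and mirrors the library example used earlier; the variable names are illustrative only.&lt;/p&gt;

```python
import xml.etree.ElementTree as ET

# Build a small library document with the ElementTree API.
library = ET.Element("library")
for book_id, title, year in [("1", "The Great Gatsby", "1925"), ("2", "1984", "1949")]:
    book = ET.SubElement(library, "book", id=book_id, category="fiction")
    ET.SubElement(book, "title").text = title
    ET.SubElement(book, "year").text = year

# Attribute and child-text equality predicates are supported.
fiction = library.findall("book[@category='fiction']")
published_1949 = library.findall("book[year='1949']")
print(len(fiction), len(published_1949))

# Comparison operators are outside ElementTree's XPath subset.
try:
    library.findall("book[year > 1930]")
    comparison_supported = True
except SyntaxError:
    comparison_supported = False
print("numeric comparison supported:", comparison_supported)
```

&lt;p&gt;Queries that need numeric comparisons or XPath functions call for a full XPath 1.0 engine, which is what pygixml and lxml provide.&lt;/p&gt;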
&lt;h2&gt;
  
  
  The Full Benchmark
&lt;/h2&gt;

&lt;p&gt;If you want to run the numbers yourself:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/MohammadRaziei/pygixml.git
&lt;span class="nb"&gt;cd &lt;/span&gt;pygixml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



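&lt;p&gt;The project uses CMake for its build system, so the benchmarks are exposed as build targets (the commands below assume an already configured &lt;code&gt;build&lt;/code&gt; directory):&lt;br&gt;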
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Full suite: parsing (6 sizes), memory, package size&lt;/span&gt;
cmake &lt;span class="nt"&gt;--build&lt;/span&gt; build &lt;span class="nt"&gt;--target&lt;/span&gt; run_full_benchmarks

&lt;span class="c"&gt;# Legacy parsing-only benchmark&lt;/span&gt;
cmake &lt;span class="nt"&gt;--build&lt;/span&gt; build &lt;span class="nt"&gt;--target&lt;/span&gt; run_benchmarks

&lt;span class="c"&gt;# Or directly with Python&lt;/span&gt;
python benchmarks/full_benchmark.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here's the actual output from a recent run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;=====================================================================
PARSING PERFORMANCE
=====================================================================
    Size | Library      |    Avg (s) |    Min (s) |  Speedup vs ET
----------------------------------------------------------------------
     100 | pygixml      |   0.000008 |   0.000008 |          14.4x
     100 | lxml         |   0.000094 |   0.000088 |           1.2x
     100 | elementtree  |   0.000112 |   0.000108 |           1.0x
----------------------------------------------------------------------
     500 | pygixml      |   0.000097 |   0.000096 |           5.8x
     500 | lxml         |   0.000394 |   0.000385 |           1.4x
     500 | elementtree  |   0.000558 |   0.000542 |           1.0x
----------------------------------------------------------------------
    1000 | pygixml      |   0.000147 |   0.000143 |           7.8x
    1000 | lxml         |   0.001127 |   0.001052 |           1.0x
    1000 | elementtree  |   0.001146 |   0.001114 |           1.0x
----------------------------------------------------------------------
    5000 | pygixml      |   0.000883 |   0.000880 |           8.6x
    5000 | lxml         |   0.004108 |   0.003907 |           1.9x
    5000 | elementtree  |   0.007614 |   0.006634 |           1.0x
----------------------------------------------------------------------
   10000 | pygixml      |   0.001649 |   0.001635 |           9.8x
   10000 | lxml         |   0.009095 |   0.008174 |           1.8x
   10000 | elementtree  |   0.016108 |   0.013917 |           1.0x
----------------------------------------------------------------------
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Memory usage (tracemalloc peak):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Size&lt;/th&gt;
&lt;th&gt;pygixml&lt;/th&gt;
&lt;th&gt;lxml&lt;/th&gt;
&lt;th&gt;ElementTree&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1,000&lt;/td&gt;
&lt;td&gt;0.13 MB&lt;/td&gt;
&lt;td&gt;0.13 MB&lt;/td&gt;
&lt;td&gt;1.01 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5,000&lt;/td&gt;
&lt;td&gt;0.67 MB&lt;/td&gt;
&lt;td&gt;0.67 MB&lt;/td&gt;
&lt;td&gt;4.84 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10,000&lt;/td&gt;
&lt;td&gt;1.34 MB&lt;/td&gt;
&lt;td&gt;1.34 MB&lt;/td&gt;
&lt;td&gt;9.68 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Package size:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Package&lt;/th&gt;
&lt;th&gt;Size&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;pygixml&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0.43 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;lxml&lt;/td&gt;
&lt;td&gt;5.48 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Wrap-Up
&lt;/h2&gt;

&lt;p&gt;XML isn't going anywhere. The tools we use to process it matter more than&lt;br&gt;
we think — especially when that XML is on the critical path of a request,&lt;br&gt;
a batch job, or a data pipeline.&lt;/p&gt;

&lt;p&gt;pygixml brings one of the fastest C++ XML parsers to Python with minimal&lt;br&gt;
friction. Same API patterns you already know. Same XPath you already use.&lt;br&gt;
Just faster.&lt;/p&gt;

&lt;p&gt;If you try it out, I'd love to hear about your use case. And if the project&lt;br&gt;
helps you, a star on &lt;a href="https://github.com/MohammadRaziei/pygixml" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;br&gt;
goes a long way.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Links:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/MohammadRaziei/pygixml" rel="noopener noreferrer"&gt;pygixml on GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mohammadraziei.github.io/pygixml/" rel="noopener noreferrer"&gt;Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pypi.org/project/pygixml/" rel="noopener noreferrer"&gt;pygixml on PyPI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pugixml.org/" rel="noopener noreferrer"&gt;pugixml&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Have a different XML parsing strategy? Drop it in the comments — I'm&lt;br&gt;
always looking for better approaches.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>xml</category>
      <category>showdev</category>
      <category>programming</category>
    </item>
    <item>
      <title>Introducing pip-size: See the Real Cost of Python Packages</title>
      <dc:creator>Mohammad Raziei</dc:creator>
      <pubDate>Wed, 08 Apr 2026 20:10:59 +0000</pubDate>
      <link>https://dev.to/mohammadraziei/introducing-pip-size-see-the-real-cost-of-python-packages-5a58</link>
      <guid>https://dev.to/mohammadraziei/introducing-pip-size-see-the-real-cost-of-python-packages-5a58</guid>
      <description>&lt;h2&gt;
  
  
  Why Package Size Matters More Than You Think
&lt;/h2&gt;

&lt;p&gt;Every day, thousands of Python packages are uploaded to PyPI. Many of us check the wheel size before installing and think "oh, it's lightweight!" — but that's just the tip of the iceberg.&lt;/p&gt;

&lt;p&gt;A package might only be 50 KB on its own, but when you install it, you could be pulling in hundreds of megabytes of transitive dependencies. The package advertises itself as "lightweight," but what your users actually download is something entirely different.&lt;/p&gt;

&lt;p&gt;This is exactly the problem &lt;strong&gt;pip-size&lt;/strong&gt; solves.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is pip-size?
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;pip-size&lt;/code&gt; calculates the real download size of PyPI packages — including all their dependencies — without actually downloading anything. It uses the PyPI JSON API to resolve the entire dependency tree and shows you the full picture before you run &lt;code&gt;pip install&lt;/code&gt;.&lt;/p&gt;
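
&lt;p&gt;To give a feel for the data involved, here is a rough sketch of reading file sizes from the PyPI JSON API. It is not pip-size's implementation: the function names are invented for this example, the sample payload is made up, and it looks at a single project without resolving any dependencies.&lt;/p&gt;

```python
import json
from urllib.request import urlopen


def fetch_release_payload(name):
    """Fetch a project's metadata from the PyPI JSON API (needs network)."""
    with urlopen(f"https://pypi.org/pypi/{name}/json") as response:
        return json.load(response)


def wheel_sizes(payload):
    """Map each wheel filename in the latest release to its size in bytes."""
    return {
        f["filename"]: f["size"]
        for f in payload["urls"]
        if f["packagetype"] == "bdist_wheel"
    }


# Hypothetical payload trimmed to the fields used here, so the
# example runs offline; real responses carry many more fields.
sample_payload = {
    "urls": [
        {"filename": "demo-1.0.tar.gz", "packagetype": "sdist", "size": 131000},
        {"filename": "demo-1.0-py3-none-any.whl", "packagetype": "bdist_wheel", "size": 64000},
    ]
}
print(wheel_sizes(sample_payload))
```

&lt;p&gt;Calling &lt;code&gt;wheel_sizes(fetch_release_payload("requests"))&lt;/code&gt; would do the same against live data for one package.&lt;/p&gt;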

&lt;h3&gt;
  
  
  Quick Example
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;pip-size requests
🔍 Resolving &lt;span class="s1"&gt;'requests'&lt;/span&gt;...
  ✓ &lt;span class="nv"&gt;requests&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;2.32.5  →  requests-2.32.5-py3-none-any.whl
    ✓ &lt;span class="nv"&gt;urllib3&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;2.3.0  →  urllib3-2.3.0-py3-none-any.whl
    ✓ charset-normalizer&lt;span class="o"&gt;==&lt;/span&gt;3.4.1  →  charset_normalizer-3.4.1-py3-none-any.whl
    ✓ &lt;span class="nv"&gt;certifi&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;2025.1.31  →  certifi-2025.1.31-py3-none-any.whl
    ✓ &lt;span class="nv"&gt;idna&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;3.10  →  idna-3.10-py3-none-any.whl
  &lt;span class="nv"&gt;requests&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;2.32.5  63.2 KB  &lt;span class="o"&gt;(&lt;/span&gt;total: 834.8 KB&lt;span class="o"&gt;)&lt;/span&gt;
  ├── &lt;span class="nv"&gt;urllib3&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;2.3.0  341.8 KB
  ├── charset-normalizer&lt;span class="o"&gt;==&lt;/span&gt;3.4.1  204.8 KB
  ├── &lt;span class="nv"&gt;certifi&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;2025.1.31  164.0 KB
  └── &lt;span class="nv"&gt;idna&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;3.10  61.4 KB
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;See? &lt;code&gt;requests&lt;/code&gt; itself is only 63.2 KB, but the total cost is &lt;strong&gt;834.8 KB&lt;/strong&gt; — over 13x more than the package alone!&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Matters
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Fair Comparison Between Alternatives
&lt;/h3&gt;

&lt;p&gt;Want to compare &lt;code&gt;httpx&lt;/code&gt; vs &lt;code&gt;requests&lt;/code&gt; vs &lt;code&gt;aiohttp&lt;/code&gt;? Don't just look at their individual sizes — compare the full dependency tree:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip-size httpx
pip-size requests
pip-size aiohttp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you can make an informed decision based on what users will actually download.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Audit Your Own Packages
&lt;/h3&gt;

&lt;p&gt;If you maintain a package, you might be surprised by what your "lightweight" library actually ships. Run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip-size your-package
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Spot Heavy Dependencies
&lt;/h3&gt;

&lt;p&gt;Ever wondered why a simple CLI tool pulls in 200 MB? &lt;code&gt;pip-size&lt;/code&gt; shows you exactly which dependency is responsible for the bulk of the size.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. CI Automation
&lt;/h3&gt;

&lt;p&gt;Use &lt;code&gt;--quiet&lt;/code&gt; or &lt;code&gt;--bytes&lt;/code&gt; to integrate size checks into your CI pipeline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip-size mypackage &lt;span class="nt"&gt;--quiet&lt;/span&gt;
&lt;span class="c"&gt;# Output: 1234567&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
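
&lt;p&gt;One way to turn that number into a CI gate is a small budget check. The sketch below assumes the &lt;code&gt;--quiet&lt;/code&gt; output is a single integer byte count, as in the example above; the function name and the budget values are hypothetical.&lt;/p&gt;

```python
def within_budget(quiet_output, budget_bytes):
    """Parse the byte count printed by the tool and compare it to a budget.

    Assumes the text is one integer, e.g. the stdout of
    `pip-size mypackage --quiet` as shown in the example above.
    """
    total_bytes = int(quiet_output.strip())
    return total_bytes, budget_bytes >= total_bytes


# In CI the text would come from the command's stdout, for example via
# subprocess.run([...], capture_output=True, text=True).stdout
print(within_budget("1234567\n", 2_000_000))  # (1234567, True)
print(within_budget("1234567\n", 1_000_000))  # (1234567, False)
```

&lt;p&gt;A CI job could exit with a non-zero status whenever the second value is &lt;code&gt;False&lt;/code&gt;.&lt;/p&gt;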






&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;pip-size
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Key Features
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Zero downloads&lt;/strong&gt; — uses PyPI JSON API only&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Full dependency tree&lt;/strong&gt; — includes all transitive dependencies&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Extras support&lt;/strong&gt; — see how &lt;code&gt;requests[security]&lt;/code&gt; affects size&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Proxy support&lt;/strong&gt; — works with HTTP, SOCKS4, and SOCKS5 proxies&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Caching&lt;/strong&gt; — 24-hour cache to avoid repeated API calls&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JSON output&lt;/strong&gt; — integrate with your own tools&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Bigger Picture
&lt;/h2&gt;

&lt;p&gt;We often obsess over code performance, but &lt;strong&gt;install size&lt;/strong&gt; is an overlooked dimension of developer experience. Every megabyte you force users to download:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Slows down CI/CD pipelines&lt;/li&gt;
&lt;li&gt;Increases container image sizes&lt;/li&gt;
&lt;li&gt;Wastes bandwidth, especially in regions with limited connectivity&lt;/li&gt;
&lt;li&gt;Frustrates users on slow connections&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;pip-size&lt;/code&gt; is my small step toward raising awareness about this issue. I hope it helps you make better decisions when choosing dependencies — and when publishing your own packages.&lt;/p&gt;

&lt;p&gt;Give it a try and let me know what you think!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/mohammadraziei/pip-size" rel="noopener noreferrer"&gt;mohammadraziei/pip-size&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>showdev</category>
      <category>devtool</category>
      <category>programming</category>
    </item>
    <item>
      <title>Why I Built pip-size: A Story About Obsession with Performance</title>
      <dc:creator>Mohammad Raziei</dc:creator>
      <pubDate>Mon, 06 Apr 2026 21:07:01 +0000</pubDate>
      <link>https://dev.to/mohammadraziei/why-i-built-pip-size-a-story-about-obsession-with-performance-o40</link>
      <guid>https://dev.to/mohammadraziei/why-i-built-pip-size-a-story-about-obsession-with-performance-o40</guid>
      <description>&lt;h2&gt;
  
  
  It Started with a Simple Question
&lt;/h2&gt;

&lt;p&gt;"How fast is it?"&lt;/p&gt;

&lt;p&gt;That's the question I always ask when I write a Python package. Not "does it work?" — because obviously it works. The real question is: &lt;strong&gt;how fast is it compared to what already exists?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I've been building high-performance Python libraries for years. Libraries like:&lt;/p&gt;

&lt;p&gt;&lt;a href="http://github.com/mohammadraziei/yyaml" rel="noopener noreferrer"&gt;&lt;strong&gt;yyaml&lt;/strong&gt;&lt;/a&gt;, &lt;a href="http://github.com/mohammadraziei/pygixml" rel="noopener noreferrer"&gt;&lt;strong&gt;pygixml&lt;/strong&gt;&lt;/a&gt;, &lt;a href="http://github.com/mohammadraziei/serin" rel="noopener noreferrer"&gt;&lt;strong&gt;serin&lt;/strong&gt;&lt;/a&gt;, &lt;a href="http://github.com/mohammadraziei/ctoon" rel="noopener noreferrer"&gt;&lt;strong&gt;ctoon&lt;/strong&gt;&lt;/a&gt;, &lt;a href="http://github.com/mohammadraziei/novasvg" rel="noopener noreferrer"&gt;&lt;strong&gt;novasvg&lt;/strong&gt;&lt;/a&gt;, &lt;a href="http://github.com/mohammadraziei/liburlparser" rel="noopener noreferrer"&gt;&lt;strong&gt;liburlparser&lt;/strong&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And the results? In many cases, &lt;strong&gt;20x to 100x faster&lt;/strong&gt; than the mainstream alternatives.&lt;/p&gt;

&lt;p&gt;I have the benchmarks to prove it. I've spent countless hours profiling, optimizing, and benchmarking. I know exactly how fast my code runs.&lt;/p&gt;

&lt;p&gt;But there was one question I couldn't answer easily:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"How big is it?"&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem Nobody Talks About
&lt;/h2&gt;

&lt;p&gt;When you compare Python packages, everyone talks about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Features&lt;/li&gt;
&lt;li&gt;API simplicity&lt;/li&gt;
&lt;li&gt;Community support&lt;/li&gt;
&lt;li&gt;GitHub stars&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But nobody talks about &lt;strong&gt;download size&lt;/strong&gt;. And that's a problem.&lt;/p&gt;

&lt;p&gt;Here's why: a package might be "lightweight" in source code, but its dependencies tell a different story.&lt;/p&gt;

&lt;p&gt;Let me give you a real example. A few months ago, I was comparing HTTP libraries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;requests==2.33.1  63.4 KB  (total: 620.4 KB)
httpx==0.28.1  71.8 KB  (total: 560.0 KB)
aiohttp==3.13.5  1.7 MB  (total: 2.6 MB)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For &lt;code&gt;requests&lt;/code&gt;, the package itself is small, but the total is almost ten times larger.&lt;/p&gt;

&lt;p&gt;Now imagine you're choosing between two libraries:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Library A: 50 KB package, but pulls in 500 KB of dependencies&lt;/li&gt;
&lt;li&gt;Library B: 200 KB package, but zero dependencies&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Which one is really "lighter"?&lt;/p&gt;

&lt;p&gt;That's the question I wanted to answer. But there was no tool to do it.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Search for a Solution
&lt;/h2&gt;

&lt;p&gt;I searched for existing tools. I found:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;pip show&lt;/code&gt; only reports metadata for packages that are already installed, and it does not report a download size&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;pip download&lt;/code&gt; — downloads everything to measure it (wasteful!)&lt;/li&gt;
&lt;li&gt;Various size calculators — none of them considered the full dependency tree&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The problem? &lt;strong&gt;You have to download or install the package before you can see its size.&lt;/strong&gt; That's insane!&lt;/p&gt;

&lt;p&gt;I wanted to know the size &lt;strong&gt;before&lt;/strong&gt; installing. I wanted to see the full picture — the package plus every dependency, transitively.&lt;/p&gt;

&lt;p&gt;So I did what any developer would do: I built it myself.&lt;/p&gt;




&lt;h2&gt;
  
  
  Introducing pip-size
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/mohammadraziei/pip-size" rel="noopener noreferrer"&gt;pip-size&lt;/a&gt; calculates the real download size of PyPI packages and their dependencies. Zero downloads. No pip subprocess. Pure PyPI JSON API.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;pip-size
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip-size requests
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;🔍 Resolving 'requests'...
  ✓ requests==2.33.1  →  requests-2.33.1-py3-none-any.whl
    ✓ idna==3.11  →  idna-3.11-py3-none-any.whl
    ✓ certifi==2026.2.25  →  certifi-2026.2.25-py3-none-any.whl
    ✓ charset_normalizer==3.4.7  →  charset_normalizer-3.4.7-py3-none-any.whl
    ✓ urllib3==2.6.3  →  urllib3-2.6.3-py3-none-any.whl
  requests==2.33.1  63.4 KB  (total: 620.4 KB)
  ├── idna==3.11  69.3 KB
  ├── certifi==2026.2.25  150.1 KB
  ├── charset_normalizer==3.4.7  209.0 KB
  └── urllib3==2.6.3  128.5 KB
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you can see:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The package size (63.4 KB)&lt;/li&gt;
&lt;li&gt;The total size including all dependencies (620.4 KB)&lt;/li&gt;
&lt;li&gt;The breakdown of each dependency&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Features
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Full dependency tree&lt;/strong&gt; — see every transitive dependency&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Extras support&lt;/strong&gt; — check &lt;code&gt;requests[security]&lt;/code&gt; or &lt;code&gt;fastapi[standard]&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JSON output&lt;/strong&gt; — integrate with scripts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Proxy support&lt;/strong&gt; — for restricted networks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Caching&lt;/strong&gt; — 24-hour cache to avoid repeated requests&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Why This Matters
&lt;/h2&gt;

&lt;p&gt;When I'm developing high-performance libraries, size matters for several reasons:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Deployment
&lt;/h3&gt;

&lt;p&gt;If you're shipping to edge devices, every megabyte counts. A library that claims to be "lightweight" but pulls in 500 MB of dependencies is not lightweight — it's a liability.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Cold Starts
&lt;/h3&gt;

&lt;p&gt;In serverless environments (AWS Lambda, Google Cloud Functions), cold start time correlates with package size. Smaller packages = faster cold starts.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. CI/CD
&lt;/h3&gt;

&lt;p&gt;Smaller packages mean faster pip installs in your CI pipeline. Over hundreds of builds, this adds up.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. User Trust
&lt;/h3&gt;

&lt;p&gt;As a package maintainer, I want to be transparent about what I'm shipping. If my package is 100 KB but pulls in 50 MB of dependencies, users deserve to know.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Bigger Picture
&lt;/h2&gt;

&lt;p&gt;Building pip-size made me realize something: &lt;strong&gt;we've been comparing packages wrong.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When we see "package X is 50 KB" and "package Y is 200 KB," we assume X is lighter. But that's only half the story.&lt;/p&gt;

&lt;p&gt;The real cost of a package is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package size + size of all dependencies + size of their dependencies + ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's what pip-size reveals.&lt;/p&gt;
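
&lt;p&gt;The formula above is just a recursive sum over the dependency graph. Here is a minimal sketch of it, using the sizes from the &lt;code&gt;requests&lt;/code&gt; output earlier in this post. The function and variable names are hypothetical, and this is an illustration of the arithmetic, not pip-size's implementation.&lt;/p&gt;

```python
# Sizes in KB and the dependency edges, copied from the `requests` output above.
SIZES_KB = {
    "requests": 63.4,
    "idna": 69.3,
    "certifi": 150.1,
    "charset_normalizer": 209.0,
    "urllib3": 128.5,
}
DEPENDS_ON = {
    "requests": ["idna", "certifi", "charset_normalizer", "urllib3"],
}


def total_size(package, seen=None):
    """Sum a package and its transitive dependencies, counting each one once."""
    seen = set() if seen is None else seen
    if package in seen:
        return 0.0  # shared dependency already counted
    seen.add(package)
    children = DEPENDS_ON.get(package, [])
    return SIZES_KB[package] + sum(total_size(dep, seen) for dep in children)


print(round(total_size("requests"), 1))  # 620.3
```

&lt;p&gt;The result is 620.3 KB rather than the reported 620.4 KB, most likely because the per-package sizes shown are already rounded. The &lt;code&gt;seen&lt;/code&gt; set matters in larger trees, where a shared dependency should only be counted once.&lt;/p&gt;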




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;I'm continuing to improve pip-size. Some ideas:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Compare multiple packages side-by-side&lt;/li&gt;
&lt;li&gt;Show size trends over time&lt;/li&gt;
&lt;li&gt;Integrate with dependency security tools&lt;/li&gt;
&lt;li&gt;Add "size budget" warnings for CI&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you have ideas or want to contribute, the repo is open: &lt;a href="https://github.com/mohammadraziei/pip-size" rel="noopener noreferrer"&gt;github.com/mohammadraziei/pip-size&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;I've spent years optimizing for speed. Now I'm obsessed with size too.&lt;/p&gt;

&lt;p&gt;Because at the end of the day, &lt;strong&gt;performance isn't just about how fast code runs — it's about how efficiently it reaches your users.&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Have you ever been surprised by a package's hidden size? Let me know in the comments!&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Links:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/mohammadraziei/pip-size" rel="noopener noreferrer"&gt;github.com/mohammadraziei/pip-size&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;PyPI: &lt;a href="https://pypi.org/project/pip-size" rel="noopener noreferrer"&gt;pypi.org/project/pip-size&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>python</category>
      <category>programming</category>
      <category>showdev</category>
      <category>opensource</category>
    </item>
    <item>
      <title>The Real Size of AI Frameworks: A Wake-Up Call</title>
      <dc:creator>Mohammad Raziei</dc:creator>
      <pubDate>Mon, 06 Apr 2026 18:56:05 +0000</pubDate>
      <link>https://dev.to/mohammadraziei/the-real-size-of-ai-frameworks-a-wake-up-call-2he</link>
      <guid>https://dev.to/mohammadraziei/the-real-size-of-ai-frameworks-a-wake-up-call-2he</guid>
      <description>&lt;h2&gt;
  
  
  You Think You Know What You're Installing
&lt;/h2&gt;

&lt;p&gt;When someone says "just install PyTorch," you probably think "how bad can it be?" It's a deep learning library, right? A few hundred megabytes, maybe?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Think again.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I built &lt;a href="https://github.com/mohammadraziei/pip-size" rel="noopener noreferrer"&gt;pip-size&lt;/a&gt; to expose the hidden cost of Python packages. And what I found in the AI ecosystem is... shocking.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Numbers Don't Lie
&lt;/h2&gt;

&lt;p&gt;I ran pip-size on the most popular AI frameworks. Here are the results:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Framework&lt;/th&gt;
&lt;th&gt;Package Size&lt;/th&gt;
&lt;th&gt;Total (with deps)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;torch&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;506.0 MB&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;2.5 GB&lt;/strong&gt; 🤯&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;tensorflow&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;545.9 MB&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;611.9 MB&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;paddlepaddle&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;185.8 MB&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;212.1 MB&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;jax&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;3.0 MB&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;137.1 MB&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;onnxruntime&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;16.4 MB&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;39.5 MB&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;transformers&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;9.8 MB&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;38.4 MB&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;keras&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1.6 MB&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;29.5 MB&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
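
&lt;p&gt;Another way to read this table is as a multiplier: how many times larger is the full install than the package alone? The sketch below computes that from the numbers above. The names are hypothetical, and converting 2.5 GB as 2.5 × 1024 MB is an assumption about the units.&lt;/p&gt;

```python
# (package size, total size) in MB, taken from the table above.
# 2.5 GB is converted as 2.5 * 1024 MB, which is an assumption about the units.
FRAMEWORKS = {
    "torch": (506.0, 2.5 * 1024),
    "tensorflow": (545.9, 611.9),
    "paddlepaddle": (185.8, 212.1),
    "jax": (3.0, 137.1),
    "onnxruntime": (16.4, 39.5),
    "transformers": (9.8, 38.4),
    "keras": (1.6, 29.5),
}


def overhead(name):
    """Return how many times larger the full install is than the package alone."""
    package_mb, total_mb = FRAMEWORKS[name]
    return total_mb / package_mb


# Print frameworks from the largest multiplier to the smallest.
for name in sorted(FRAMEWORKS, key=overhead, reverse=True):
    print(f"{name:14s} {overhead(name):5.1f}x")
```

&lt;p&gt;By this measure &lt;code&gt;jax&lt;/code&gt; has the largest multiplier (about 45.7x), while &lt;code&gt;tensorflow&lt;/code&gt; has the smallest (about 1.1x) because almost all of its weight is in its own wheel.&lt;/p&gt;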




&lt;h2&gt;
  
  
  The PyTorch Surprise
&lt;/h2&gt;

&lt;p&gt;Here's what happens when you &lt;code&gt;pip install torch&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;torch==2.11.0  506.0 MB  (total: 2.5 GB)
├── nvidia-cudnn-cu13==9.19.0.56  349.1 MB
├── nvidia-cublas==13.3.0.5  384.6 MB  [extra: cublas]
├── nvidia-nccl-cu13==2.28.9  187.4 MB
├── triton==3.6.0  179.5 MB
├── nvidia-cusparse==12.7.9.17  143.9 MB  [extra: cusparse]
├── nvidia-cusparselt-cu13==0.8.0  162.0 MB
├── nvidia-curand==10.4.2.51  57.1 MB  [extra: curand]
├── nvidia-cusolver==12.1.0.51  192.4 MB  [extra: cusolver]
└── ... (more CUDA libs)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2.5 GB.&lt;/strong&gt; For a "simple" deep learning library.&lt;/p&gt;

&lt;p&gt;The package itself is 506 MB, but CUDA dependencies add another ~2 GB. This is why your Docker images are huge. This is why your CI takes forever. This is why you need a 100GB disk just to do machine learning.&lt;/p&gt;




&lt;h2&gt;
  
  
  TensorFlow: The Heavy Champion
&lt;/h2&gt;

&lt;p&gt;TensorFlow's own wheel is even larger than PyTorch's, although its total is much smaller:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;tensorflow==2.21.0  545.9 MB  (total: 611.9 MB)
├── keras==3.14.0  1.6 MB  (total: 8.3 MB)
│   └── ml-dtypes==0.5.4  4.8 MB
├── numpy==2.4.4  16.1 MB
├── h5py==3.14.0  4.3 MB
└── grpcio==1.80.0  6.5 MB
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;612 MB total, and 545.9 MB of that is the TensorFlow wheel itself. Keras now comes along as a dependency, but it adds only 8.3 MB. TensorFlow still brings a lot of baggage.&lt;/p&gt;




&lt;h2&gt;
  
  
  JAX: The Lightweight Contender?
&lt;/h2&gt;

&lt;p&gt;JAX looks small at first glance — just 3 MB! But look closer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;jax==0.9.2  3.0 MB  (total: 137.1 MB)
├── jaxlib==0.9.2  79.4 MB
├── scipy==1.17.1  33.7 MB
└── numpy==2.4.4  16.1 MB
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;137 MB&lt;/strong&gt; when you count everything. Still smaller than PyTorch and TensorFlow, but not "lightweight" by any means.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Hidden Gems
&lt;/h2&gt;

&lt;h3&gt;
  
  
  ONNX Runtime: Only 39.5 MB
&lt;/h3&gt;

&lt;p&gt;If you're deploying models and don't need the full training stack, ONNX Runtime is surprisingly compact:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;onnxruntime==1.24.4  16.4 MB  (total: 39.5 MB)
├── numpy==2.4.4  16.1 MB
└── sympy==1.14.0  6.0 MB
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's &lt;strong&gt;65x smaller than PyTorch&lt;/strong&gt;. For inference, this is a game-changer.&lt;/p&gt;

&lt;h3&gt;
  
  
  Keras: Just 29.5 MB
&lt;/h3&gt;

&lt;p&gt;Keras (the standalone version, not bundled with TensorFlow) is the lightest option:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;keras==3.14.0  1.6 MB  (total: 29.5 MB)
├── numpy==2.4.4  16.1 MB
├── h5py==3.16.0  4.8 MB
└── ml-dtypes==0.5.4  4.8 MB
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Perfect for when you want something simple without the enterprise overhead.&lt;/p&gt;




&lt;h2&gt;
  
  
  What This Means for You
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Docker Images
&lt;/h3&gt;

&lt;p&gt;If you're shipping PyTorch in a Docker image, plan for at least &lt;strong&gt;3 GB&lt;/strong&gt;. TensorFlow? &lt;strong&gt;700 MB&lt;/strong&gt;. ONNX Runtime? &lt;strong&gt;50 MB&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Choose wisely based on your deployment constraints.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. CI/CD
&lt;/h3&gt;

&lt;p&gt;Every &lt;code&gt;pip install torch&lt;/code&gt; in your CI pipeline costs time and bandwidth. Consider:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Caching wheels&lt;/li&gt;
&lt;li&gt;Using lighter alternatives for testing&lt;/li&gt;
&lt;li&gt;Installing only what's needed&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. Local Development
&lt;/h3&gt;

&lt;p&gt;That "quick experiment" with PyTorch? It's 2.5 GB. Maybe JAX at 137 MB is enough for your use case.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The AI ecosystem is massive — literally. Before you &lt;code&gt;pip install&lt;/code&gt; your next ML library, know what you're getting into.&lt;/p&gt;

&lt;p&gt;Use &lt;a href="https://github.com/mohammadraziei/pip-size" rel="noopener noreferrer"&gt;pip-size&lt;/a&gt; to see the full picture:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;pip-size
pip-size torch
pip-size tensorflow
pip-size jax
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your disk space will thank you.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Links:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/mohammadraziei/pip-size" rel="noopener noreferrer"&gt;github.com/mohammadraziei/pip-size&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;PyPI: &lt;a href="https://pypi.org/project/pip-size" rel="noopener noreferrer"&gt;pypi.org/project/pip-size&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;What's the biggest package surprise you've encountered? Let me know in the comments!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>python</category>
      <category>opensource</category>
    </item>
    <item>
      <title>How I Discovered the Hidden Cost of "Lightweight" Python Packages</title>
      <dc:creator>Mohammad Raziei</dc:creator>
      <pubDate>Mon, 06 Apr 2026 18:28:08 +0000</pubDate>
      <link>https://dev.to/mohammadraziei/how-i-discovered-the-hidden-cost-of-lightweight-python-packages-1f7e</link>
      <guid>https://dev.to/mohammadraziei/how-i-discovered-the-hidden-cost-of-lightweight-python-packages-1f7e</guid>
      <description>&lt;h2&gt;
  
  
  The "It's Just a Small Library" Trap
&lt;/h2&gt;

&lt;p&gt;We've all been there. You find a Python package that promises to solve your problem with minimal overhead. The README says "lightweight," the GitHub stars look good, and the developer swears it's "just a few kilobytes."&lt;/p&gt;

&lt;p&gt;So you install it, run your project, and wonder why your Docker image grew by 200MB.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What happened?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The package &lt;em&gt;is&lt;/em&gt; small. But its dependencies aren't. And those dependencies have dependencies. And those... you get the idea.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Moment I Realized Something Was Missing
&lt;/h2&gt;

&lt;p&gt;I was comparing HTTP libraries for a new project. &lt;code&gt;requests&lt;/code&gt; is popular, but everyone says it's "heavy." Then I found a library that claimed to be a "lightweight alternative."&lt;/p&gt;

&lt;p&gt;But something in my gut said "let me check." So I built &lt;strong&gt;&lt;a href="https://github.com/mohammadraziei/pip-size" rel="noopener noreferrer"&gt;pip-size&lt;/a&gt;&lt;/strong&gt; — a tool that calculates the real download size of PyPI packages and their dependencies, using only the PyPI JSON API. No downloads. No pip subprocess. Just data.&lt;/p&gt;

&lt;h3&gt;
  
  
  Install it:
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;pip-size
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Compare HTTP libraries fairly:
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip-size requests
pip-size httpx
pip-size aiohttp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The results might surprise you:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Package&lt;/th&gt;
&lt;th&gt;Package Size&lt;/th&gt;
&lt;th&gt;Total (with deps)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;requests&lt;/td&gt;
&lt;td&gt;63.4 KB&lt;/td&gt;
&lt;td&gt;620.4 KB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;httpx&lt;/td&gt;
&lt;td&gt;71.8 KB&lt;/td&gt;
&lt;td&gt;560.0 KB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;aiohttp&lt;/td&gt;
&lt;td&gt;1.7 MB&lt;/td&gt;
&lt;td&gt;2.6 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;code&gt;httpx&lt;/code&gt; is often marketed as a "modern" alternative to &lt;code&gt;requests&lt;/code&gt;, but the totals are close: 560.0 KB versus 620.4 KB, a difference of about 10%. Meanwhile, &lt;code&gt;aiohttp&lt;/code&gt; is over &lt;strong&gt;4x larger&lt;/strong&gt;, which makes sense since it's a full async framework, not just a client.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Flask vs FastAPI Myth
&lt;/h2&gt;

&lt;p&gt;Here's where it gets interesting. Flask is often called "lightweight" while FastAPI is labeled as "heavy." Let's verify:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip-size flask
pip-size fastapi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Results:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Framework&lt;/th&gt;
&lt;th&gt;Package Size&lt;/th&gt;
&lt;th&gt;Total (with deps)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Flask&lt;/td&gt;
&lt;td&gt;101.0 KB&lt;/td&gt;
&lt;td&gt;606.2 KB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FastAPI&lt;/td&gt;
&lt;td&gt;115.0 KB&lt;/td&gt;
&lt;td&gt;2.9 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Flask is indeed smaller — about &lt;strong&gt;5x smaller&lt;/strong&gt; than FastAPI when you count everything.&lt;/p&gt;

&lt;p&gt;But here's the nuance: FastAPI's size comes from &lt;code&gt;pydantic&lt;/code&gt; (2.4 MB), which brings powerful data validation and automatic API documentation. You're not just getting a web framework — you're getting a &lt;em&gt;complete API solution&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;So "lightweight" depends on what you need. If you want simplicity and control, Flask wins. If you want automatic docs, validation, and type hints, FastAPI's "weight" is a feature, not a bug.&lt;/p&gt;




&lt;h2&gt;
  
  
  Real-World Use Cases
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Compare Alternatives Fairly
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip-size httpx
pip-size requests
pip-size aiohttp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you can compare apples to apples — not just the package size, but the entire dependency tree.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Audit Your Own Packages
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip-size mypackage
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;See what you're actually shipping to your users. Sometimes you'll be surprised.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Spot the Heavy Culprit
&lt;/h3&gt;

&lt;p&gt;When your project grows unexpectedly, run pip-size on your dependencies. You'll find which one is dragging in the bulk of the weight.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Understand Optional Extras
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip-size &lt;span class="s2"&gt;"requests[security]"&lt;/span&gt;
pip-size &lt;span class="s2"&gt;"fastapi[standard]"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;See exactly how much each extra adds over the base package.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Matters
&lt;/h2&gt;

&lt;p&gt;In a world where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Docker images need to be small&lt;/li&gt;
&lt;li&gt;CI/CD pipelines need to be fast&lt;/li&gt;
&lt;li&gt;Bandwidth isn't free (especially in developing countries)&lt;/li&gt;
&lt;li&gt;Cold starts in serverless matter&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Knowing the real cost of a dependency before you install it isn't a luxury — it's a necessity.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;pip-size is open source (MIT license) and available on PyPI. It uses the PyPI JSON API, caches responses for 24 hours, and supports proxies if you need them.&lt;/p&gt;

&lt;p&gt;Next time you see a package advertised as "lightweight," run pip-size first. Your future self (and your users) will thank you.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Have you ever been surprised by a package's hidden dependencies? Let me know in the comments!&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Links:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/mohammadraziei/pip-size" rel="noopener noreferrer"&gt;github.com/mohammadraziei/pip-size&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;PyPI: &lt;a href="https://pypi.org/project/pip-size" rel="noopener noreferrer"&gt;pypi.org/project/pip-size&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>python</category>
      <category>cli</category>
      <category>pipsize</category>
      <category>devtool</category>
    </item>
    <item>
      <title>Stop Writing Boilerplate Wrappers for C++ Bindings — Meet polybind</title>
      <dc:creator>Mohammad Raziei</dc:creator>
      <pubDate>Sun, 05 Apr 2026 21:24:43 +0000</pubDate>
      <link>https://dev.to/mohammadraziei/stop-writing-boilerplate-wrappers-for-c-bindings-meet-polybind-18nd</link>
      <guid>https://dev.to/mohammadraziei/stop-writing-boilerplate-wrappers-for-c-bindings-meet-polybind-18nd</guid>
      <description>&lt;p&gt;If you've spent time writing Python bindings for a C++ library with template classes, you know the pattern. You expose &lt;code&gt;Box&amp;lt;int32_t&amp;gt;&lt;/code&gt; as &lt;code&gt;Box_int32&lt;/code&gt;, &lt;code&gt;Box&amp;lt;double&amp;gt;&lt;/code&gt; as &lt;code&gt;Box_float64&lt;/code&gt;, and then you spend an afternoon writing the same dispatch logic in Python to pretend they're one class. And then you do it again for &lt;code&gt;Matrix&lt;/code&gt;, &lt;code&gt;Tensor&lt;/code&gt;, &lt;code&gt;Pair&lt;/code&gt;, and every other template in the library.&lt;/p&gt;

&lt;p&gt;This post is about why that happens, and how one command fixes it.&lt;/p&gt;
&lt;h2&gt;
  
  
  The root of the problem
&lt;/h2&gt;

&lt;p&gt;C++ templates don't exist at runtime; they're resolved at compile time. When you use nanobind, pybind11, or Cython to expose a &lt;code&gt;Box&amp;lt;T&amp;gt;&lt;/code&gt;, the binding layer has no generic &lt;code&gt;T&lt;/code&gt; to offer Python. You register each specialisation separately:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// nanobind&lt;/span&gt;
&lt;span class="n"&gt;nb&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class_&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Box&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;int32_t&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"_Box__int32"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;def&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nb&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;init&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;int32_t&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;def&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;Box&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;int32_t&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;::&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;nb&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class_&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Box&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;double&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"_Box__float64"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;def&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nb&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;init&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;double&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;def&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;Box&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;double&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;::&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So Python gets two completely separate classes. &lt;code&gt;isinstance&lt;/code&gt;, &lt;code&gt;type()&lt;/code&gt;, and&lt;br&gt;
every other type check in your codebase see them as unrelated:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;_mylib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_Box__int32&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_mylib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_Box__float64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# False
&lt;/span&gt;&lt;span class="nf"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;_mylib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_Box__int32&lt;/span&gt;        &lt;span class="c1"&gt;# True
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The usual fix is a hand-written dispatcher:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;_MAP&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;_mylib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_Box__int32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;_mylib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_Box__float64&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Box&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__new__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;_MAP&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;)](&lt;/span&gt;&lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This breaks &lt;code&gt;type(b) is Box&lt;/code&gt;, loses docstrings, kills IDE autocomplete, and&lt;br&gt;
needs to be written again for every template class in the project.&lt;/p&gt;
&lt;h2&gt;
  
  
  Multi-parametric templates make it worse
&lt;/h2&gt;

&lt;p&gt;When you have &lt;code&gt;Pair&amp;lt;T1, T2&amp;gt;&lt;/code&gt;, you now need a two-dimensional dispatch table:&lt;br&gt;
&lt;code&gt;_Pair__float64__int32&lt;/code&gt;, &lt;code&gt;_Pair__int32__int64&lt;/code&gt;, and one more entry for every other combination you expose. The&lt;br&gt;
hand-written approach becomes a maintenance problem very quickly.&lt;/p&gt;
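
&lt;p&gt;To get a feel for how fast the table grows, here is a rough sketch that enumerates the variant names for a template. The dtype list and the &lt;code&gt;variant_names&lt;/code&gt; helper are made up for illustration; a real library would usually register only a subset of the combinations:&lt;/p&gt;

```python
from itertools import product

# Hypothetical set of element types a library might support.
DTYPES = ["int32", "int64", "float32", "float64"]


def variant_names(template, n_params):
    """List every variant class name for a template with n_params parameters."""
    return [
        "_{}__{}".format(template, "__".join(combo))
        for combo in product(DTYPES, repeat=n_params)
    ]


print(len(variant_names("Box", 1)))   # 4
print(len(variant_names("Pair", 2)))  # 16
print(variant_names("Pair", 2)[0])    # _Pair__int32__int32
```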
&lt;h2&gt;
  
  
  The polybind approach
&lt;/h2&gt;

&lt;p&gt;polybind solves this by reading the &lt;code&gt;.pyi&lt;/code&gt; stub your binding tool already&lt;br&gt;
produces and generating the wrapper for you.&lt;/p&gt;

&lt;p&gt;The naming convention is intentional: use &lt;strong&gt;double underscores&lt;/strong&gt; to separate the template name from&lt;br&gt;
each template parameter, and name each parameter after its numpy scalar type:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;_Box__int32           →  Box&amp;lt;int32_t&amp;gt;
_Box__float64         →  Box&amp;lt;double&amp;gt;
_Pair__float64__int32 →  Pair&amp;lt;double, int32_t&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python &lt;span class="nt"&gt;-m&lt;/span&gt; nanobind.stubgen &lt;span class="nt"&gt;-m&lt;/span&gt; _mylib &lt;span class="nt"&gt;-o&lt;/span&gt; _mylib.pyi
polybind _mylib.pyi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. &lt;code&gt;mylib.py&lt;/code&gt; is written and ready to import.&lt;/p&gt;

&lt;h2&gt;
  
  
  What you get
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mylib&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Box&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Pair&lt;/span&gt;

&lt;span class="c1"&gt;# single-type: auto-detect from argument
&lt;/span&gt;&lt;span class="n"&gt;b_int&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Box&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;b_float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Box&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;3.14&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;b_str&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Box&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hello&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b_int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;Box&lt;/span&gt;          &lt;span class="c1"&gt;# True  ✅
&lt;/span&gt;&lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b_float&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Box&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;    &lt;span class="c1"&gt;# True  ✅
&lt;/span&gt;&lt;span class="n"&gt;b_int&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;value&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;               &lt;span class="c1"&gt;# 42    ✅
&lt;/span&gt;
&lt;span class="c1"&gt;# multi-type: auto-detect from both arguments
&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Pair&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;3.14&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;first&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;                   &lt;span class="c1"&gt;# 3.14
&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;second&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;                  &lt;span class="c1"&gt;# 5
&lt;/span&gt;&lt;span class="nf"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;Pair&lt;/span&gt;             &lt;span class="c1"&gt;# True
&lt;/span&gt;
&lt;span class="c1"&gt;# explicit dtype when auto-detect isn't enough
&lt;/span&gt;&lt;span class="nc"&gt;Box&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dtypes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;float64&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="nc"&gt;Pair&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dtypes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;int32&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;int64&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="c1"&gt;# partial dict — specify what matters, rest is auto
&lt;/span&gt;&lt;span class="nc"&gt;Pair&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dtypes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;first&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;float64&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;# subscript to get the raw C++ class
&lt;/span&gt;&lt;span class="n"&gt;Box&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;int32&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;                &lt;span class="c1"&gt;# → _mylib._Box__int32
&lt;/span&gt;&lt;span class="n"&gt;Pair&lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;float64&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;int32&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;  &lt;span class="c1"&gt;# → _mylib._Pair__float64__int32
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  How the type map works
&lt;/h2&gt;

&lt;p&gt;The generated wrapper stores a map keyed by &lt;strong&gt;suffix tuples&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;_type_map_box&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ClassVar&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;int32&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,):&lt;/span&gt;   &lt;span class="n"&gt;_mylib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_Box__int32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;float64&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,):&lt;/span&gt; &lt;span class="n"&gt;_mylib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_Box__float64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;str_&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,):&lt;/span&gt;    &lt;span class="n"&gt;_mylib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_Box__str_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;_type_map_pair&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ClassVar&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;float64&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;int32&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="n"&gt;_mylib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_Pair__float64__int32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;int32&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;int64&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="n"&gt;_mylib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_Pair__int32__int64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;__new__&lt;/code&gt; maps each argument's Python type to a suffix via &lt;code&gt;_NUMPY_TYPE_MAP&lt;/code&gt;, joins the suffixes into a tuple, and&lt;br&gt;
looks that tuple up in the type map. For &lt;code&gt;Pair(3.14, 5)&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nf"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;3.14&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;float&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;  &lt;span class="err"&gt;→&lt;/span&gt;  &lt;span class="n"&gt;suffix&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;float64&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
&lt;span class="nf"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt;    &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;int&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;    &lt;span class="err"&gt;→&lt;/span&gt;  &lt;span class="n"&gt;suffix&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;int32&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
&lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;float64&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;int32&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;      &lt;span class="err"&gt;→&lt;/span&gt;  &lt;span class="n"&gt;_Pair__float64__int32&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Binding-method agnostic
&lt;/h2&gt;

&lt;p&gt;polybind never imports your C++ module at generation time. It only reads the&lt;br&gt;
&lt;code&gt;.pyi&lt;/code&gt; stub — plain text that every binding tool can produce:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Stub command&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;nanobind&lt;/td&gt;
&lt;td&gt;&lt;code&gt;python -m nanobind.stubgen -m _mylib&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;pybind11&lt;/td&gt;
&lt;td&gt;&lt;code&gt;pybind11-stubgen _mylib&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cython&lt;/td&gt;
&lt;td&gt;&lt;code&gt;stubgen&lt;/code&gt; (ships with mypy)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Switch tools tomorrow — the polybind command stays the same.&lt;/p&gt;

&lt;h2&gt;
  
  
  What else is preserved
&lt;/h2&gt;

&lt;p&gt;Beyond dispatch, polybind also:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reproduces &lt;code&gt;@staticmethod&lt;/code&gt;, &lt;code&gt;@classmethod&lt;/code&gt;, &lt;code&gt;@property&lt;/code&gt; decorators from
the stub — returning wrapper instances, not raw C++ objects&lt;/li&gt;
&lt;li&gt;Carries docstrings through and rewrites variant class names
(&lt;code&gt;_Box__int32&lt;/code&gt; → &lt;code&gt;Box&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Generates full type annotations using &lt;code&gt;typing.Union&lt;/code&gt; for method signatures&lt;/li&gt;
&lt;li&gt;Accepts &lt;code&gt;np.dtype&lt;/code&gt; objects in the &lt;code&gt;dtypes&lt;/code&gt; argument if numpy is installed&lt;/li&gt;
&lt;li&gt;Registers all C++ classes as virtual subclasses of the wrapper via
&lt;code&gt;ABC.register()&lt;/code&gt;, so &lt;code&gt;isinstance(raw_cpp_obj, Box)&lt;/code&gt; is also &lt;code&gt;True&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  A note on when dtypes is required
&lt;/h2&gt;

&lt;p&gt;polybind infers template types from constructor arguments by matching Python&lt;br&gt;
type annotations. If a template parameter isn't represented in the&lt;br&gt;
constructor (a tag-dispatch pattern, for example), auto-detection isn't&lt;br&gt;
possible. The generated wrapper will raise a clear &lt;code&gt;TypeError&lt;/code&gt; at runtime&lt;br&gt;
asking for an explicit &lt;code&gt;dtypes&lt;/code&gt; list.&lt;/p&gt;

&lt;p&gt;This is a deliberate design choice: fail loudly at construction time rather&lt;br&gt;
than silently select the wrong variant.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting started
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;polybind
polybind _mylib.pyi          &lt;span class="c"&gt;# generates mylib.py&lt;/span&gt;
polybind _mylib.pyi &lt;span class="nt"&gt;--dry-run&lt;/span&gt;  &lt;span class="c"&gt;# preview without writing&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Source and docs: &lt;a href="https://github.com/mohammadraziei/polybind" rel="noopener noreferrer"&gt;github.com/mohammadraziei/polybind&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Feedback is very welcome, especially from projects using less common binding&lt;br&gt;
tools or unusual template patterns. Open an issue with a sample &lt;code&gt;.pyi&lt;/code&gt; and&lt;br&gt;
I'll make sure it's handled correctly.&lt;/p&gt;

</description>
      <category>python</category>
      <category>cpp</category>
      <category>pybind11</category>
      <category>showdev</category>
    </item>
    <item>
      <title>When Constraints Build Tools</title>
      <dc:creator>Mohammad Raziei</dc:creator>
      <pubDate>Sun, 05 Apr 2026 00:52:08 +0000</pubDate>
      <link>https://dev.to/mohammadraziei/when-constraints-build-tools-59ag</link>
      <guid>https://dev.to/mohammadraziei/when-constraints-build-tools-59ag</guid>
      <description>&lt;p&gt;The office network had rules. Strict ones.&lt;/p&gt;

&lt;p&gt;No &lt;code&gt;apt-get&lt;/code&gt;. No &lt;code&gt;brew&lt;/code&gt;. No &lt;code&gt;npm&lt;/code&gt;. No downloading binaries from the internet. If it wasn't on PyPI, it didn't exist. The IT policy was clear, the firewall was clearer, and the list of exceptions was empty.&lt;/p&gt;

&lt;p&gt;I had one job: automate the documentation pipeline. Diagrams, architecture charts, flow diagrams — all written in Mermaid, all living as &lt;code&gt;.mmd&lt;/code&gt; files in the repo, all needing to be rendered to SVG on every build. Simple enough, in theory.&lt;/p&gt;




&lt;p&gt;The first thing I found was &lt;code&gt;mermaid-cli&lt;/code&gt;. The official tool. Maintained by the Mermaid team themselves. I opened the installation docs, and the first line was:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @mermaid-js/mermaid-cli
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Closed the tab.&lt;/p&gt;




&lt;p&gt;I kept searching. There was a Python package — &lt;code&gt;mermaid-cli&lt;/code&gt; on PyPI. I felt a small rush of hope. I ran &lt;code&gt;pip install&lt;/code&gt;. It installed. I ran it.&lt;/p&gt;

&lt;p&gt;It printed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;playwright &lt;span class="nb"&gt;install &lt;/span&gt;chromium
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Of course. Under the hood, it needed a browser. And installing a browser meant downloading a binary from the internet, outside of PyPI, which the network blocked. Even if it hadn't — I didn't &lt;em&gt;want&lt;/em&gt; a browser. A browser meant hundreds of megabytes of dependency for what was, at its core, a text-to-SVG conversion.&lt;/p&gt;

&lt;p&gt;The hope disappeared.&lt;/p&gt;




&lt;p&gt;I sat with the problem for a while.&lt;/p&gt;

&lt;p&gt;What does Mermaid.js actually need? I read the source. It needs a DOM. Not a full browser with tabs and network requests and a GPU process — just a DOM. &lt;code&gt;document.createElement&lt;/code&gt;. &lt;code&gt;querySelector&lt;/code&gt;. CSS computed styles. The ability to measure text. That's it.&lt;/p&gt;

&lt;p&gt;The reason everyone reaches for a browser is that browsers are where DOMs live. But a DOM and a browser aren't the same thing.&lt;/p&gt;

&lt;p&gt;I remembered PhantomJS.&lt;/p&gt;




&lt;p&gt;Most people think PhantomJS is dead. And for what it was originally built for (web scraping, UI testing, automated screenshots of modern sites) it is. Headless Chrome and, later, Playwright took over those use cases, and development was suspended in 2018.&lt;/p&gt;

&lt;p&gt;But PhantomJS is, underneath all of that, a self-contained WebKit binary. It has a JavaScript engine. It has a real DOM. And it ships as a single executable file — no installation, no system dependencies, no apt-get required.&lt;/p&gt;

&lt;p&gt;More importantly: it was on PyPI. Wrapped, bundled, ready to pip install.&lt;/p&gt;

&lt;p&gt;The question was whether I could build something thin and clean on top of it. Not a web scraping tool. Not a browser automation framework. Just: &lt;em&gt;run this JavaScript file, give it a DOM, capture what it prints to stdout&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;That was &lt;strong&gt;phasma&lt;/strong&gt;.&lt;/p&gt;




&lt;p&gt;The first version was small. Almost embarrassingly small. A Python class that started a PhantomJS subprocess, wrote a JS file to a temp directory, ran it, and captured the output. No async, no fancy API, no browser context abstraction. Just &lt;code&gt;driver.exec&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;phasma.driver&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="k"&gt;exec&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;run_js&lt;/span&gt;
&lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;run_js&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;render_diagram.js&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;capture_output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
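
&lt;p&gt;The pattern behind that call is small enough to sketch with the standard library alone. This is not phasma's source: &lt;code&gt;run_script&lt;/code&gt; is a hypothetical helper, and the Python interpreter stands in for the PhantomJS binary so the example runs anywhere:&lt;/p&gt;

```python
import os
import subprocess
import sys
import tempfile


def run_script(binary, source, suffix=".js"):
    """Write source to a temp file, run binary on it, and return its stdout."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "script" + suffix)
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(source)
        # check=True raises CalledProcessError if the process exits non-zero.
        result = subprocess.run(
            [binary, path], capture_output=True, text=True, check=True
        )
    return result.stdout


# Stand-in run: a real call would pass the path to the PhantomJS binary
# and a JavaScript source string instead.
print(run_script(sys.executable, "print('svg goes here')", suffix=".py"))
```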



&lt;p&gt;I pointed it at Mermaid.js. Wrote a small script that loaded the library, created a DOM element, called &lt;code&gt;mermaid.render()&lt;/code&gt;, and printed the SVG to stdout.&lt;/p&gt;

&lt;p&gt;It worked.&lt;/p&gt;

&lt;p&gt;The whole thing — PhantomJS starting up, loading Mermaid, rendering the diagram, printing SVG — took about 800 milliseconds. For a CI pipeline that ran once per push, that was completely acceptable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;mmdc&lt;/strong&gt; was maybe two hundred lines of Python on top of that. Read the &lt;code&gt;.mmd&lt;/code&gt; file. Pass the content to phasma. Capture the SVG. Write it to disk. Done.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;mmdc
mmdc &lt;span class="nt"&gt;--input&lt;/span&gt; architecture.mmd &lt;span class="nt"&gt;--output&lt;/span&gt; architecture.svg
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No Node.js. No npm. No browser. No apt-get. Just pip — the one thing the network allowed.&lt;/p&gt;




&lt;p&gt;There's a version of this story where I found a better solution. Where someone had already built the right thing and I just hadn't searched hard enough. Where the constraint turned out to be navigable with an existing tool.&lt;/p&gt;

&lt;p&gt;That version didn't happen.&lt;/p&gt;

&lt;p&gt;What happened instead is that the constraint — &lt;em&gt;only PyPI, nothing else&lt;/em&gt; — pushed me into a corner narrow enough that the only way out was to build something. And the thing I built turned out to be useful beyond the original problem.&lt;/p&gt;

&lt;p&gt;People use mmdc now in Docker containers where they don't want a browser. In CI pipelines where Node.js isn't available. In air-gapped environments where the internet doesn't exist. The constraint that created the tool turns out to be a constraint a lot of people have.&lt;/p&gt;




&lt;p&gt;phasma grew a little after that. A Playwright-inspired async API got added — not because mmdc needed it, but because the lower layer was interesting enough to build on. That part is still rough around the edges, still needs work, still has edge cases that aren't handled cleanly. It's the part of the project that's most alive, and most in need of people who want to dig into Python async internals. The door is open.&lt;/p&gt;

&lt;p&gt;But the core — &lt;code&gt;driver.exec&lt;/code&gt;, a bundled PhantomJS binary, a DOM you can use from Python with nothing but pip — that part works. It works because it had to.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;The firewall never had to open. The diagrams appeared in the documentation. The pipeline ran.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The constraint didn't block the solution — it was the solution.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Links:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/MohammadRaziei/phasma" rel="noopener noreferrer"&gt;phasma on GitHub&lt;/a&gt; — if the async API interests you, PRs are open&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/MohammadRaziei/mmdc" rel="noopener noreferrer"&gt;mmdc on GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://pypi.org/project/phasma/" rel="noopener noreferrer"&gt;phasma on PyPI&lt;/a&gt; · &lt;a href="https://pypi.org/project/mmdc/" rel="noopener noreferrer"&gt;mmdc on PyPI&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>python</category>
      <category>opensource</category>
      <category>javascript</category>
      <category>devtools</category>
    </item>
    <item>
      <title>I Needed to Run Mermaid.js in Python. So I Built Two Libraries.</title>
      <dc:creator>Mohammad Raziei</dc:creator>
      <pubDate>Sat, 04 Apr 2026 23:09:59 +0000</pubDate>
      <link>https://dev.to/mohammadraziei/i-needed-to-run-mermaidjs-in-python-so-i-built-two-libraries-5elc</link>
      <guid>https://dev.to/mohammadraziei/i-needed-to-run-mermaidjs-in-python-so-i-built-two-libraries-5elc</guid>
      <description>&lt;p&gt;It started with a single line in a requirements doc:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Diagrams should be auto-generated as part of the build pipeline."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Simple enough, right? I was building a documentation automation tool in Python. The diagrams were written in Mermaid — clean, text-based, version-controlled. All I needed was to convert &lt;code&gt;.mmd&lt;/code&gt; files to SVG during the build.&lt;/p&gt;

&lt;p&gt;I looked up the standard way to do it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @mermaid-js/mermaid-cli
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I stared at that line for a moment. This was a Python project. A &lt;em&gt;pure&lt;/em&gt; Python project. And now I needed Node.js, npm, and — I kept reading — Chromium running headlessly in the background, just to turn a text file into an SVG.&lt;/p&gt;

&lt;p&gt;I closed the tab.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Search for Alternatives
&lt;/h2&gt;

&lt;p&gt;Surely someone had solved this already. I started digging through PyPI.&lt;/p&gt;

&lt;p&gt;The Python packages that claimed to render Mermaid either called out to the npm tool under the hood (so you still needed Node.js), hit a third-party API (so you needed internet access and an API key), or just... generated the Mermaid syntax and left the rendering to you.&lt;/p&gt;

&lt;p&gt;None of them actually rendered diagrams. Locally. In Python. Without external dependencies.&lt;/p&gt;

&lt;p&gt;I went deeper. What does Mermaid.js actually need to render? I read through the source. It needs a real DOM — &lt;code&gt;document.createElement&lt;/code&gt;, CSS computed styles, SVG measurement APIs. It's not just parsing text; it's doing real browser-level layout to figure out where nodes go.&lt;/p&gt;

&lt;p&gt;That's why everyone reaches for a browser. Mermaid genuinely needs one.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Realization
&lt;/h2&gt;

&lt;p&gt;At some point I remembered PhantomJS.&lt;/p&gt;

&lt;p&gt;Most people think of PhantomJS as a dead project, and for web scraping and UI testing, it is. Playwright killed it for those use cases. But PhantomJS is, at its core, a &lt;strong&gt;self-contained WebKit binary with a full DOM implementation&lt;/strong&gt;. Its last stable release dates from 2016 and development was suspended in 2018, but it also hasn't needed an update for what I needed it for. It's frozen in time, which for a reproducible build environment is actually a &lt;em&gt;feature&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;The question was: could I build a clean Python interface around it that would let me inject Mermaid.js into PhantomJS and capture the SVG output?&lt;/p&gt;

&lt;p&gt;I started building &lt;strong&gt;&lt;a href="https://github.com/MohammadRaziei/phasma" rel="noopener noreferrer"&gt;phasma&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Building Phasma
&lt;/h2&gt;

&lt;p&gt;The first design goal was zero setup. If you had to install PhantomJS separately, I hadn't actually solved the original problem. So phasma bundles the PhantomJS binary directly — it ships with the package, across Windows, Linux, and macOS.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;phasma
&lt;span class="c"&gt;# That's it. PhantomJS is included.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The core of phasma is &lt;code&gt;driver.exec&lt;/code&gt; — a way to run JavaScript files directly through the bundled PhantomJS binary, with full DOM and WebKit support:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;phasma.driver&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="k"&gt;exec&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;phantomjs_exec&lt;/span&gt;

&lt;span class="c1"&gt;# Run any JS file with full PhantomJS/WebKit environment
&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;phantomjs_exec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my_script.js&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;capture_output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This was actually all that &lt;code&gt;mmdc&lt;/code&gt; needed. I just needed to run a JS file that loaded Mermaid, rendered a diagram, and printed the SVG. The &lt;code&gt;driver.exec&lt;/code&gt; interface handled it cleanly.&lt;/p&gt;

&lt;p&gt;But while I was at it, I kept going.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Playwright-like API
&lt;/h2&gt;

&lt;p&gt;Once the core worked, the API felt obvious: make it look like Playwright. If you've used modern browser automation in Python, Playwright's API is the gold standard. Clean async, familiar method names, intuitive page/browser hierarchy.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;phasma&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;launch&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;browser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;launch&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;page&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new_page&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://example.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text_content&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;h1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;screenshot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;capture.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pdf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;page.pdf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The async implementation still has rough edges — this is an area where &lt;strong&gt;contributions are genuinely welcome&lt;/strong&gt;. If you're comfortable with Python async internals and want to help bring the Playwright-like API to full stability, &lt;a href="https://github.com/MohammadRaziei/phasma" rel="noopener noreferrer"&gt;the repo is open&lt;/a&gt; and PRs are very much appreciated.&lt;/p&gt;




&lt;h2&gt;
  
  
  Then Building mmdc
&lt;/h2&gt;

&lt;p&gt;With phasma working, building &lt;strong&gt;&lt;a href="https://github.com/MohammadRaziei/mmdc" rel="noopener noreferrer"&gt;mmdc&lt;/a&gt;&lt;/strong&gt; took surprisingly little code. The hard problem was already solved.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mmdc&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MermaidConverter&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pathlib&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;

&lt;span class="n"&gt;converter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MermaidConverter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;converter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_svg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;graph TD&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;  A --&amp;gt; B --&amp;gt; C&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output_file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;diagram.svg&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;converter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_png&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;graph TD&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;  A --&amp;gt; B --&amp;gt; C&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output_file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;diagram.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;converter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_pdf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;graph TD&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;  A --&amp;gt; B --&amp;gt; C&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output_file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;diagram.pdf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No Node.js. No npm. No Chromium. Just:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;mmdc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The CLI mirrors the official &lt;code&gt;mermaid-cli&lt;/code&gt; syntax, so if you're already familiar with it, switching is trivial:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mmdc &lt;span class="nt"&gt;--input&lt;/span&gt; diagram.mmd &lt;span class="nt"&gt;--output&lt;/span&gt; diagram.svg
mmdc &lt;span class="nt"&gt;--input&lt;/span&gt; diagram.mmd &lt;span class="nt"&gt;--output&lt;/span&gt; diagram.png &lt;span class="nt"&gt;--timeout&lt;/span&gt; 60
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
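
&lt;p&gt;For the build-pipeline case that started all this, a whole folder of &lt;code&gt;.mmd&lt;/code&gt; files can be converted in one pass. The following is only a minimal sketch: &lt;code&gt;convert_folder&lt;/code&gt; is a hypothetical helper, not part of mmdc, and it assumes nothing beyond a callable with the same call shape as &lt;code&gt;converter.to_svg&lt;/code&gt; shown earlier (Mermaid text plus an &lt;code&gt;output_file&lt;/code&gt; path).&lt;/p&gt;

```python
from pathlib import Path


def convert_folder(src_dir, out_dir, render):
    """Render every .mmd file in src_dir to an .svg of the same name in out_dir.

    `render` is any callable taking (mermaid_text, output_file=...), which is
    the call shape of MermaidConverter.to_svg in the example above.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    # sorted() keeps the build order deterministic across platforms
    for source in sorted(Path(src_dir).glob("*.mmd")):
        target = out / (source.stem + ".svg")
        render(source.read_text(encoding="utf-8"), output_file=target)
        written.append(target)
    return written
```

&lt;p&gt;With mmdc installed, the call would look something like &lt;code&gt;convert_folder("docs/diagrams", "build/diagrams", MermaidConverter().to_svg)&lt;/code&gt;, where both folder names are placeholders.&lt;/p&gt;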






&lt;h2&gt;
  
  
  What I Actually Shipped
&lt;/h2&gt;

&lt;p&gt;Two packages, one problem:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;phasma&lt;/strong&gt; — a Python interface for PhantomJS with a bundled binary, &lt;code&gt;driver.exec&lt;/code&gt; for direct JS execution, and a Playwright-inspired async API (in active development). For anyone who needs to run JavaScript with real DOM support inside Python.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;mmdc&lt;/strong&gt; — a Mermaid diagram converter built on top of phasma. Converts &lt;code&gt;.mmd&lt;/code&gt; files to SVG, PNG, and PDF. Fully offline, no system dependencies beyond &lt;code&gt;pip install mmdc&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Both are on PyPI. Both are MIT licensed. And both genuinely do what they claim.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Honest Part
&lt;/h2&gt;

&lt;p&gt;PhantomJS is old. Its JavaScript engine supports little beyond ES5, let alone ES2020+. Its async story required careful handling. And the Playwright-like API in phasma is still maturing: the sync paths are solid, but the full async implementation needs more work and testing.&lt;/p&gt;

&lt;p&gt;But for the original problem — render Mermaid diagrams from Python with zero external dependencies — it works. Reliably. On every platform.&lt;/p&gt;

&lt;p&gt;If you're building documentation tooling, CI pipelines, or any Python project that needs diagrams without Node.js, give mmdc a try. And if you're interested in the lower-level plumbing — running arbitrary JavaScript with a real DOM inside Python — phasma is the piece you want.&lt;/p&gt;




&lt;p&gt;⭐ If either of these saves you from writing &lt;code&gt;npm install&lt;/code&gt; in a Python project, a star on GitHub goes a long way:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/MohammadRaziei/phasma" rel="noopener noreferrer"&gt;github.com/MohammadRaziei/phasma&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/MohammadRaziei/mmdc" rel="noopener noreferrer"&gt;github.com/MohammadRaziei/mmdc&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And if you want to help stabilize the async Playwright-like API in phasma — PRs are open and very welcome.&lt;/p&gt;

</description>
      <category>python</category>
      <category>javascript</category>
      <category>opensource</category>
      <category>showdev</category>
    </item>
    <item>
      <title>Your Package Is Not As Lightweight As You Think</title>
      <dc:creator>Mohammad Raziei</dc:creator>
      <pubDate>Sat, 04 Apr 2026 19:34:35 +0000</pubDate>
      <link>https://dev.to/mohammadraziei/your-package-is-not-as-lightweight-as-you-think-b2k</link>
      <guid>https://dev.to/mohammadraziei/your-package-is-not-as-lightweight-as-you-think-b2k</guid>
      <description>&lt;p&gt;There's a claim you've probably seen in a README before:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Zero dependencies. Lightweight. Minimal footprint."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It sounds great. But most of the time, it's only half the story.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Weight You Don't See
&lt;/h2&gt;

&lt;p&gt;When you run &lt;code&gt;pip install some-package&lt;/code&gt;, you're not installing one thing. You're installing that package &lt;em&gt;plus&lt;/em&gt; every library it depends on, plus every library &lt;em&gt;those&lt;/em&gt; libraries depend on. The size printed on PyPI is just the tip of the iceberg.&lt;/p&gt;

&lt;p&gt;This matters a lot in constrained environments: embedded systems, Docker containers you want to keep slim, serverless functions with cold start sensitivity, CI pipelines that install fresh on every run, or HPC clusters where storage quotas are real and network bandwidth costs time.&lt;/p&gt;

&lt;p&gt;And yet, almost no one measures this before claiming their package is "lightweight."&lt;/p&gt;




&lt;h2&gt;
  
  
  A Real Example: XML Parsing in Python
&lt;/h2&gt;

&lt;p&gt;I was building &lt;a href="https://pypi.org/project/pygixml/" rel="noopener noreferrer"&gt;pygixml&lt;/a&gt;, a Python binding for the &lt;a href="https://pugixml.org/" rel="noopener noreferrer"&gt;pugixml&lt;/a&gt; C++ library, aimed at high-performance XML parsing. At some point, I claimed it was lighter than the alternatives.&lt;/p&gt;

&lt;p&gt;But &lt;em&gt;lighter&lt;/em&gt; compared to what, exactly? And measured how?&lt;/p&gt;

&lt;p&gt;I wrote a small tool called &lt;a href="https://pypi.org/project/pip-size/" rel="noopener noreferrer"&gt;pip-size&lt;/a&gt; to find out. It queries the PyPI JSON API and calculates the real download size of a package — the wheel file itself — along with the complete transitive dependency tree. No downloads, no installs, no guesswork.&lt;/p&gt;

&lt;p&gt;Here's what it showed for three Python XML parsing libraries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;pip-size pygixml
&lt;span class="go"&gt;  pygixml==0.6.0  167.3 KB

&lt;/span&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;pip-size pugixml
&lt;span class="go"&gt;  pugixml==0.7.0  375.1 KB

&lt;/span&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;pip-size lxml
&lt;span class="go"&gt;  lxml==6.0.2  5.0 MB
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The comparison holds up: &lt;code&gt;pygixml&lt;/code&gt; is about 2.2× lighter than &lt;code&gt;pugixml&lt;/code&gt; and roughly 30× lighter than &lt;code&gt;lxml&lt;/code&gt;. In this case none of the three have significant Python-level dependencies, so the package itself &lt;em&gt;is&lt;/em&gt; the story.&lt;/p&gt;

&lt;p&gt;But that's not always the case.&lt;/p&gt;




&lt;h2&gt;
  
  
  When Dependencies Change Everything
&lt;/h2&gt;

&lt;p&gt;Let me show you a more dramatic scenario. Imagine you're choosing an HTTP client for a minimal service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;pip-size httpie
&lt;span class="go"&gt;  httpie==3.2.4  119.2 KB  (total: 4.1 MB)
  ├── requests==2.32.5  63.2 KB  (total: 834.8 KB)
  │   ├── urllib3==2.3.0  341.8 KB
  │   ├── charset-normalizer==3.4.1  204.8 KB
  │   ├── certifi==2025.1.31  164.0 KB
  │   └── idna==3.10  61.4 KB
  ├── rich==13.9.4  238.1 KB  (total: 1.2 MB)
  │   ├── markdown-it-py==3.0.0  87.3 KB
  │   └── pygments==2.19.1  4.4 MB
  └── ...
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The package itself is 119 KB. Its total footprint is 4.1 MB. That's a 34× multiplier hidden behind a single &lt;code&gt;pip install&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This is not a criticism of httpie — it's a fully-featured CLI tool and those dependencies are justified. The point is that &lt;strong&gt;the number on the PyPI page is almost never the number that matters&lt;/strong&gt;.&lt;/p&gt;
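
&lt;p&gt;The multiplier quoted above is plain arithmetic on the two numbers from the listing. A quick check, assuming the report uses 1 MB = 1000 KB (an assumption on my part; with 1024 the ratio comes out closer to 35):&lt;/p&gt;

```python
own_kb = 119.2    # httpie's own wheel, from the listing above
total_mb = 4.1    # total footprint reported for the whole tree

# Assumption: 1 MB = 1000 KB. With 1 MB = 1024 KB the ratio is closer to 35.
multiplier = total_mb * 1000 / own_kb
print(f"{multiplier:.1f}x")
```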




&lt;h2&gt;
  
  
  The Fairness Problem
&lt;/h2&gt;

&lt;p&gt;Here's the thing that bothered me when I started thinking about this:&lt;/p&gt;

&lt;p&gt;If library A claims to be "lightweight" and library B doesn't make that claim, but A pulls in 800 KB of dependencies while B pulls in 200 KB — who's actually lighter?&lt;/p&gt;

&lt;p&gt;The "lightweight" claim is often made based on the package's own size, or the number of dependencies, rather than the actual bytes that land on disk. Neither of those is a fair measure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A fair comparison looks at the full dependency tree.&lt;/strong&gt; And that's what &lt;code&gt;pip-size&lt;/code&gt; is designed to make easy — no installation required, just a quick query against PyPI's public API.&lt;/p&gt;




&lt;h2&gt;
  
  
  How pip-size Works
&lt;/h2&gt;

&lt;p&gt;The tool uses PyPI's JSON API (&lt;code&gt;https://pypi.org/pypi/{package}/json&lt;/code&gt;) to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Resolve the correct version based on your specifier&lt;/li&gt;
&lt;li&gt;Select the right wheel for your platform, using the same priority logic as pip itself&lt;/li&gt;
&lt;li&gt;Walk the &lt;code&gt;requires_dist&lt;/code&gt; metadata to find all dependencies&lt;/li&gt;
&lt;li&gt;Resolve each dependency recursively, in concurrent BFS layers&lt;/li&gt;
&lt;li&gt;Report the size at every level of the tree&lt;/li&gt;
&lt;/ol&gt;
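
&lt;p&gt;The steps above can be sketched with nothing but the standard library. This is a simplified illustration, not pip-size's actual implementation: the function names are made up for the example, it always looks at the latest release, it takes the first wheel listed instead of replicating pip's platform priority, it ignores environment markers other than &lt;code&gt;extra&lt;/code&gt;, and it walks each layer sequentially rather than concurrently. The &lt;code&gt;urls&lt;/code&gt;, &lt;code&gt;size&lt;/code&gt;, &lt;code&gt;packagetype&lt;/code&gt;, and &lt;code&gt;info.requires_dist&lt;/code&gt; fields are part of PyPI's JSON response.&lt;/p&gt;

```python
import json
import re
from urllib.request import urlopen

PYPI_URL = "https://pypi.org/pypi/{name}/json"  # endpoint quoted in the article


def fetch_metadata(name):
    """Download the JSON metadata for the latest release of `name`."""
    with urlopen(PYPI_URL.format(name=name)) as resp:
        return json.load(resp)


def wheel_size(meta):
    """Size in bytes of the first wheel listed, else the first file (simplified)."""
    files = meta.get("urls", [])
    wheels = [f for f in files if f.get("packagetype") == "bdist_wheel"]
    chosen = wheels or files
    return chosen[0]["size"] if chosen else 0


def required_names(meta):
    """Names of dependencies that are not gated on an extra."""
    names = []
    # requires_dist is null in the JSON when a package has no dependencies
    for spec in meta["info"].get("requires_dist") or []:
        requirement, _, marker = spec.partition(";")
        if "extra" in marker:
            continue  # optional dependency, skipped by default
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement.strip())
        if match:
            # Lowercasing only; full PyPI name normalization is left out
            names.append(match.group(0).lower())
    return names


def total_size(root, fetch=fetch_metadata):
    """Breadth-first walk of the dependency graph, counting each package once."""
    seen = {root.lower()}
    layer = [root.lower()]
    total = 0
    while layer:
        next_layer = []
        for name in layer:
            meta = fetch(name)
            total += wheel_size(meta)
            for dep in required_names(meta):
                if dep not in seen:
                    seen.add(dep)
                    next_layer.append(dep)
        layer = next_layer
    return total
```

&lt;p&gt;Passing &lt;code&gt;fetch&lt;/code&gt; as a parameter keeps the walk testable offline: any function that maps a package name to a metadata dictionary will do.&lt;/p&gt;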

&lt;p&gt;The output is a tree where every intermediate node shows both its own size and the total weight of its subtree — so you can see at a glance which dependency is responsible for the bulk of the footprint.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  fastapi==0.115.12  276.3 KB  (total: 1.1 MB)
  ├── starlette==0.46.1  254.0 KB  (total: 481.2 KB)
  │   └── anyio==4.9.0  227.2 KB
  ├── pydantic==2.11.3  440.5 KB  (total: 1.6 MB)  ← here's your culprit
  │   ├── pydantic-core==2.33.1  1.8 MB
  │   └── ...
  └── ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
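
&lt;p&gt;The "own size plus subtree total" layout is easy to reproduce. Here is a rough sketch with made-up packages and sizes (none of these names or numbers come from PyPI, and &lt;code&gt;subtree_total&lt;/code&gt; and &lt;code&gt;render&lt;/code&gt; are illustrative helpers, not pip-size functions):&lt;/p&gt;

```python
def subtree_total(node):
    """Own size plus the total of every descendant, in KB."""
    return node["size"] + sum(subtree_total(child) for child in node.get("deps", []))


def render(node, prefix="", is_last=True, is_root=True):
    """Return the lines of a tree report for `node`."""
    label = f"{node['name']}  {node['size']:.1f} KB"
    if node.get("deps"):
        # Only intermediate nodes get a subtotal; leaves show their own size
        label += f"  (total: {subtree_total(node):.1f} KB)"
    if is_root:
        lines = [label]
        child_prefix = ""
    else:
        lines = [prefix + ("└── " if is_last else "├── ") + label]
        child_prefix = prefix + ("    " if is_last else "│   ")
    deps = node.get("deps", [])
    for index, child in enumerate(deps):
        lines += render(child, child_prefix, index == len(deps) - 1, is_root=False)
    return lines


# Hypothetical data, purely for illustration
example = {"name": "app==1.0", "size": 100.0, "deps": [
    {"name": "http-lib==2.0", "size": 50.0, "deps": [
        {"name": "tls-lib==0.3", "size": 300.0},
    ]},
    {"name": "cli-lib==4.1", "size": 25.0},
]}

print("\n".join(render(example)))
```

&lt;p&gt;In this toy tree the heaviest node is two levels down, which is exactly the kind of thing the subtotals make visible.&lt;/p&gt;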






&lt;h2&gt;
  
  
  When to Use This
&lt;/h2&gt;

&lt;p&gt;A few concrete situations where this kind of measurement is useful:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Before publishing a library.&lt;/strong&gt; If you're telling users your library is lightweight, measure it. Run &lt;code&gt;pip-size your-package&lt;/code&gt; and check whether the claim survives contact with reality.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Choosing between alternatives.&lt;/strong&gt; &lt;code&gt;pip-size requests&lt;/code&gt; vs &lt;code&gt;pip-size httpx&lt;/code&gt; vs &lt;code&gt;pip-size aiohttp&lt;/code&gt; gives you a side-by-side cost comparison without installing anything.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Auditing a project's dependencies.&lt;/strong&gt; &lt;code&gt;pip-size your-project&lt;/code&gt; before a Docker build tells you where the size is coming from and which dependency is worth optimizing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CI size budgets.&lt;/strong&gt; The &lt;code&gt;--quiet --bytes&lt;/code&gt; flags output a raw number, which you can compare against a threshold in a shell script or GitHub Action.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;SIZE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;pip-size my-package &lt;span class="nt"&gt;--quiet&lt;/span&gt; &lt;span class="nt"&gt;--bytes&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SIZE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;-gt&lt;/span&gt; 5000000 &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Package exceeds 5 MB size budget"&lt;/span&gt;
  &lt;span class="nb"&gt;exit &lt;/span&gt;1
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
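
&lt;p&gt;If the rest of your CI tooling is already Python, the same budget check can live there too. A minimal sketch, assuming &lt;code&gt;pip-size&lt;/code&gt; is on the &lt;code&gt;PATH&lt;/code&gt; and that &lt;code&gt;--quiet --bytes&lt;/code&gt; prints a single integer as described above; the function names are hypothetical:&lt;/p&gt;

```python
import subprocess


def parse_bytes(stdout):
    """Turn the raw number printed by the CLI into an int."""
    return int(stdout.strip())


def measure_bytes(package):
    """Run pip-size with the flags from the article and return the byte count."""
    result = subprocess.run(
        ["pip-size", package, "--quiet", "--bytes"],
        capture_output=True, text=True, check=True,
    )
    return parse_bytes(result.stdout)


def over_budget(size_bytes, budget_bytes=5_000_000):
    """True when the measured size exceeds the budget (5 MB by default)."""
    return size_bytes > budget_bytes
```

&lt;p&gt;A CI step would then call &lt;code&gt;over_budget(measure_bytes("my-package"))&lt;/code&gt; and fail the job when it returns &lt;code&gt;True&lt;/code&gt;.&lt;/p&gt;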






&lt;h2&gt;
  
  
  Optional Dependencies
&lt;/h2&gt;

&lt;p&gt;One more thing worth mentioning: pip-size handles optional dependencies (extras) correctly.&lt;/p&gt;

&lt;p&gt;By default, it only includes dependencies that are always required — the same ones pip would install for a plain &lt;code&gt;pip install package&lt;/code&gt;. If you want to see the cost of enabling specific extras:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;pip-size &lt;span class="s2"&gt;"requests[security]"&lt;/span&gt;
&lt;span class="go"&gt;  requests==2.32.5  63.2 KB  (total: 1.2 MB)
  ├── urllib3==2.3.0  341.8 KB
  ├── ...
  ├── cryptography==44.0.3  518.2 KB  [extra: security]
  └── pyOpenSSL==25.0.0  112.4 KB    [extra: security]
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or if you want the absolute worst-case footprint — every optional dependency across the entire tree — use &lt;code&gt;--all-extras&lt;/code&gt;.&lt;/p&gt;
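
&lt;p&gt;Extras are visible in the metadata itself: in &lt;code&gt;requires_dist&lt;/code&gt;, an optional dependency carries an &lt;code&gt;extra == "name"&lt;/code&gt; marker. Below is a rough sketch of the selection logic, assuming simple markers; a real tool would more likely lean on a proper requirement parser such as the &lt;code&gt;packaging&lt;/code&gt; library. &lt;code&gt;select_requirements&lt;/code&gt; is a hypothetical name, not pip-size's code.&lt;/p&gt;

```python
import re

# Matches the extra name inside markers such as: extra == "security"
EXTRA_MARKER = re.compile(r"""extra\s*==\s*['"]([^'"]+)['"]""")


def select_requirements(requires_dist, extras=()):
    """Keep unconditional requirements plus those gated on a requested extra."""
    wanted = {e.lower() for e in extras}
    selected = []
    for spec in requires_dist or []:
        requirement, _, marker = spec.partition(";")
        gate = EXTRA_MARKER.search(marker)
        # No extra marker means the dependency is always installed
        if gate is None or gate.group(1).lower() in wanted:
            selected.append(requirement.strip())
    return selected
```

&lt;p&gt;Called with no extras, this returns only the always-required entries; passing &lt;code&gt;extras=["security"]&lt;/code&gt; adds the ones gated on that extra.&lt;/p&gt;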




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The size shown on a PyPI package page is that package's size. The size that actually matters is the size of everything it brings with it.&lt;/p&gt;

&lt;p&gt;Before claiming a package is lightweight, measure it. Before choosing between libraries, compare their full footprint. Before shipping a container, know what's in it.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;pip-size&lt;/code&gt; is available on PyPI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;pip-size
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Source: &lt;a href="https://github.com/mohammadraziei/pip-size" rel="noopener noreferrer"&gt;github.com/mohammadraziei/pip-size&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;The numbers in this article were obtained on Python 3.11 / Linux x86_64. Sizes vary by platform and Python version because pip selects different wheels.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>pipsize</category>
      <category>python</category>
      <category>cicd</category>
      <category>packaging</category>
    </item>
    <item>
      <title>🎉 Big News for Python Developers &amp; Mermaid Fans: "mmdc" Makes Mermaid Diagrams Easy as Python! 🚀</title>
      <dc:creator>Mohammad Raziei</dc:creator>
      <pubDate>Thu, 08 Jan 2026 09:02:36 +0000</pubDate>
      <link>https://dev.to/mohammadraziei/big-news-for-python-developers-mermaid-fans-mmdc-makes-mermaid-diagrams-easy-as-python-1gok</link>
      <guid>https://dev.to/mohammadraziei/big-news-for-python-developers-mermaid-fans-mmdc-makes-mermaid-diagrams-easy-as-python-1gok</guid>
      <description>&lt;p&gt;If you &lt;em&gt;love&lt;/em&gt; Mermaid diagrams — flowcharts, sequence diagrams, Gantt charts, pie charts, and more — but you’ve ever felt stuck because you had to install Node.js, npm, browsers, or other system tools just to generate diagram files, &lt;strong&gt;today is your day&lt;/strong&gt;!&lt;br&gt;
Say hello to &lt;strong&gt;&lt;code&gt;mmdc&lt;/code&gt;&lt;/strong&gt;, the &lt;strong&gt;Python‑native Mermaid diagram converter&lt;/strong&gt; that finally lets you generate beautiful diagrams &lt;em&gt;straight from Python&lt;/em&gt;, with &lt;strong&gt;no external installs&lt;/strong&gt;, no system packages, and no extra runtime hassles! 🙌 (&lt;a href="https://github.com/mohammadraziei/mmdc" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;)&lt;/p&gt;


&lt;h2&gt;
  
  
  🧠 First — What &lt;em&gt;Is&lt;/em&gt; Mermaid?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Mermaid&lt;/strong&gt; is an open‑source diagramming tool that lets you define diagrams using &lt;strong&gt;simple, text‑based syntax&lt;/strong&gt; — very similar to Markdown — and render them into real diagrams.&lt;br&gt;
You write plain text like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;graph TD
  A --&amp;gt; B
  B --&amp;gt; C
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;…and Mermaid turns it into a visual flowchart you can embed in docs, wikis, blogs, or technical writing. It’s fast to learn, version‑control friendly, and integrates with many tools.&lt;/p&gt;

&lt;p&gt;Mermaid has become super popular in documentation, engineering teams, and developer blogs precisely because &lt;strong&gt;diagrams become code&lt;/strong&gt;: no GUI drag‑and‑drop tools, no files to manage manually, just text that lives with your project. (&lt;a href="https://mermaid.js.org" rel="noopener noreferrer"&gt;Mermaid&lt;/a&gt;)&lt;/p&gt;




&lt;h2&gt;
  
  
  🌟 Why &lt;strong&gt;mmdc&lt;/strong&gt; Is a Game Changer
&lt;/h2&gt;

&lt;p&gt;Traditionally, if you wanted to convert Mermaid into SVG, PNG, or PDF, you needed:&lt;/p&gt;

&lt;p&gt;✔ Node.js&lt;br&gt;
✔ npm&lt;br&gt;
✔ Mermaid CLI&lt;br&gt;
✔ Browsers or headless workers&lt;br&gt;
✔ Extra system tools&lt;/p&gt;

&lt;p&gt;That always felt like overkill for something as simple as &lt;em&gt;turning text into a diagram&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;mmdc&lt;/strong&gt; changes all that. It’s a &lt;strong&gt;pip‑only solution&lt;/strong&gt;: installable with a single &lt;code&gt;pip install&lt;/code&gt;, and it works &lt;strong&gt;without installing ANY external tools&lt;/strong&gt; like system packages or browsers.&lt;/p&gt;

&lt;p&gt;It uses the powerhouse library &lt;strong&gt;Phasma&lt;/strong&gt;, which runs a bundled PhantomJS binary under the hood to render Mermaid code into real diagram outputs, &lt;strong&gt;yet you never have to install anything else yourself&lt;/strong&gt;. This makes it perfect for Python environments, automation, docs pipelines, and CI/CD workflows.&lt;/p&gt;


&lt;h2&gt;
  
  
  🚀 Installation
&lt;/h2&gt;

&lt;p&gt;Just run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;mmdc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That’s all — you’re ready to go! No Node.js, npm, apt installs, or browsers required. &lt;/p&gt;




&lt;h2&gt;
  
  
  🌈 Use It from the Command Line
&lt;/h2&gt;

&lt;p&gt;Convert a simple Mermaid file to SVG:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mmdc &lt;span class="nt"&gt;--input&lt;/span&gt; my_diagram.mmd &lt;span class="nt"&gt;--output&lt;/span&gt; my_diagram.svg
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create a PNG or PDF just by changing the output file's extension:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mmdc &lt;span class="nt"&gt;--input&lt;/span&gt; my_diagram.mmd &lt;span class="nt"&gt;--output&lt;/span&gt; my_diagram.png
mmdc &lt;span class="nt"&gt;--input&lt;/span&gt; my_diagram.mmd &lt;span class="nt"&gt;--output&lt;/span&gt; my_diagram.pdf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Perfect for automated doc builds, static site generators, or even blog pipelines!&lt;/p&gt;




&lt;h2&gt;
  
  
  🐍 Use It in Python Too
&lt;/h2&gt;

&lt;p&gt;Want to generate diagrams right inside your Python code? No problem:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mmdc&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MermaidConverter&lt;/span&gt;

&lt;span class="n"&gt;converter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MermaidConverter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;mermaid_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
graph TD
    A[Start] --&amp;gt; B{Is it cool?}
    B --&amp;gt;|Yes| C[Love it!]
    B ----&amp;gt;|No| D[Try again]
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

&lt;span class="n"&gt;converter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_svg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mermaid_text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output_file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cool_diagram.svg&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It’s simple and powerful, and it integrates cleanly with Python applications, docs generators, notebook workflows, and automation scripts!&lt;/p&gt;




&lt;h2&gt;
  
  
  📊 Example Mermaid Code Snippets
&lt;/h2&gt;

&lt;p&gt;Here are a few Mermaid diagrams you can try:&lt;/p&gt;

&lt;h3&gt;
  
  
  📈 Flowchart
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;graph LR
    A[Idea] --&amp;gt; B[Develop]
    B --&amp;gt; C[Test]
    C --&amp;gt; D[Deploy]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  🔁 Simple Loop
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;flowchart TD
    Start --&amp;gt; Process
    Process --&amp;gt; Review
    Review --&amp;gt;|OK| End
    Review --&amp;gt;|Fix| Process
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  ⏱️ Sequence Diagram
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sequenceDiagram
    Alice-&amp;gt;&amp;gt;Bob: Hello Bob!
    Bob--&amp;gt;&amp;gt;Alice: Hi Alice!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  💡 Why This Matters
&lt;/h2&gt;

&lt;p&gt;🧩 &lt;strong&gt;No external setup:&lt;/strong&gt; Python devs finally get Mermaid without any extra installs. &lt;br&gt;
🛠 &lt;strong&gt;Fits docs automation:&lt;/strong&gt; Great for Sphinx, MkDocs, Jupyter notebooks, and CI/CD. &lt;br&gt;
📦 &lt;strong&gt;Python‑centric workflows:&lt;/strong&gt; Treat diagrams as first‑class parts of your codebase. &lt;/p&gt;




&lt;h2&gt;
  
  
  🎉 Wrap Up
&lt;/h2&gt;

&lt;p&gt;If you’ve ever wanted a &lt;strong&gt;clean, Python‑only way to generate Mermaid diagrams&lt;/strong&gt;, &lt;strong&gt;mmdc&lt;/strong&gt; is &lt;em&gt;huge news&lt;/em&gt;. It brings a beloved text‑based diagramming approach straight into the Python ecosystem — all with a single &lt;code&gt;pip install&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Now diagrams truly can be &lt;strong&gt;code first&lt;/strong&gt; — versioned, automated, lightweight, and beautiful — without the weight of external toolchains. 💥&lt;/p&gt;




</description>
      <category>mermaid</category>
      <category>python</category>
      <category>javascript</category>
      <category>programming</category>
    </item>
    <item>
      <title>Beyond lxml: Faster and More Pythonic Parsing with pygixml and selectolax</title>
      <dc:creator>Mohammad Raziei</dc:creator>
      <pubDate>Sat, 08 Nov 2025 08:00:50 +0000</pubDate>
      <link>https://dev.to/mohammadraziei/beyond-lxml-faster-and-more-pythonic-parsing-with-pygixml-and-selectolax-278h</link>
      <guid>https://dev.to/mohammadraziei/beyond-lxml-faster-and-more-pythonic-parsing-with-pygixml-and-selectolax-278h</guid>
      <description>&lt;p&gt;For almost two decades, &lt;strong&gt;lxml&lt;/strong&gt; has been the go-to choice for parsing XML and HTML in Python.&lt;br&gt;
It’s fast, reliable, and feature-rich — a powerful C-based library that has served the ecosystem extremely well.&lt;/p&gt;

&lt;p&gt;But the world has changed.&lt;br&gt;
XML and HTML parsing have new performance demands, and developers expect cleaner, faster, and more Pythonic APIs.&lt;/p&gt;

&lt;p&gt;That’s where &lt;strong&gt;pygixml&lt;/strong&gt; (for XML) and &lt;strong&gt;selectolax&lt;/strong&gt; (for HTML) come in: two modern parsing libraries built with &lt;strong&gt;Cython&lt;/strong&gt; that pair low-level speed with high-level usability.&lt;/p&gt;


&lt;h2&gt;
  
  
  🕰️ A Brief Look Back at lxml
&lt;/h2&gt;

&lt;p&gt;Let’s give credit where it’s due.&lt;br&gt;
&lt;strong&gt;lxml&lt;/strong&gt; revolutionized parsing when it came out — it combined the power of the C-based &lt;strong&gt;libxml2&lt;/strong&gt; with a clean Python API.&lt;br&gt;
For years, it was the de facto standard for working with XML and HTML.&lt;/p&gt;

&lt;p&gt;A simple example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;lxml&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;etree&lt;/span&gt;

&lt;span class="n"&gt;xml&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;root&amp;gt;&amp;lt;item&amp;gt;Alpha&amp;lt;/item&amp;gt;&amp;lt;item&amp;gt;Beta&amp;lt;/item&amp;gt;&amp;lt;/root&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="n"&gt;tree&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;etree&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fromstring&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;xml&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;xpath&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;//item&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Alpha
Beta
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Nothing wrong here — it works!&lt;br&gt;
But as XML and HTML documents grow larger, lxml starts to struggle. Its API is also somewhat verbose and its performance, though C-based, is not fully optimized for modern multi-megabyte data workloads.&lt;/p&gt;


&lt;h2&gt;
  
  
  ⚡ Meet pygixml — XML at C++ Speed
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/mohammadraziei/pygixml" rel="noopener noreferrer"&gt;pygixml&lt;/a&gt;&lt;/strong&gt; is a modern XML parser for Python, powered by &lt;strong&gt;pugixml (C++)&lt;/strong&gt; and &lt;strong&gt;Cython&lt;/strong&gt;.&lt;br&gt;
It’s not just a wrapper — it’s a reimagined XML API that combines raw C++ speed with Pythonic usability.&lt;/p&gt;

&lt;p&gt;Benchmarks show &lt;strong&gt;pygixml is 16× to 33× faster than ElementTree&lt;/strong&gt;, and around &lt;strong&gt;5× faster than lxml&lt;/strong&gt;, depending on input size.&lt;/p&gt;
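
&lt;p&gt;Speedups like these depend heavily on the document and the machine, so it is worth measuring on your own data. Below is a rough timing sketch, not the benchmark behind the numbers above. It builds a synthetic document with the standard library, times &lt;code&gt;ElementTree&lt;/code&gt; as the baseline, and adds &lt;code&gt;pygixml.parse_string&lt;/code&gt; only if pygixml is installed. The helpers &lt;code&gt;make_document&lt;/code&gt; and &lt;code&gt;time_parser&lt;/code&gt; and the document size are illustrative choices:&lt;/p&gt;

```python
import timeit
import xml.etree.ElementTree as ET


def make_document(item_count):
    """Build a synthetic XML string containing item_count item elements."""
    root = ET.Element("root")
    for index in range(item_count):
        item = ET.SubElement(root, "item", id=str(index))
        item.text = "value " + str(index)
    return ET.tostring(root, encoding="unicode")


def time_parser(parse, xml_text, repeats=5):
    """Return the best wall-clock time in seconds over several single runs."""
    return min(timeit.repeat(lambda: parse(xml_text), number=1, repeat=repeats))


# The standard-library parser is always available as the baseline.
parsers = {"ElementTree": ET.fromstring}
try:
    import pygixml  # optional: only timed when it is installed

    parsers["pygixml"] = pygixml.parse_string
except ImportError:
    pass

xml_text = make_document(10_000)
timings = {name: time_parser(parse, xml_text) for name, parse in parsers.items()}
for name, seconds in timings.items():
    print(name, round(seconds * 1000, 2), "ms")
```

&lt;p&gt;Taking the minimum over several runs reduces noise from other processes. For a fair comparison, repeat the measurement with documents that resemble your real workload.&lt;/p&gt;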

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pygixml&lt;/span&gt;

&lt;span class="n"&gt;xml&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;root&amp;gt;&amp;lt;item id=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;gt;Alpha&amp;lt;/item&amp;gt;&amp;lt;item id=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;gt;Beta&amp;lt;/item&amp;gt;&amp;lt;/root&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pygixml&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse_string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;xml&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;select_nodes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;//item&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xpath&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/root/item[1] Alpha
/root/item[2] Beta
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each node exposes an &lt;code&gt;.xpath&lt;/code&gt; property (lxml offers the equivalent only through &lt;code&gt;ElementTree.getpath()&lt;/code&gt;, not on the node itself), and every node has a unique &lt;code&gt;mem_id&lt;/code&gt; that lets you find or reference elements instantly.&lt;/p&gt;
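
&lt;p&gt;To make the idea of a per-node path concrete, here is a small standard-library sketch that derives paths in the same &lt;code&gt;/root/item[1]&lt;/code&gt; shape as the output above. It only illustrates the concept with &lt;code&gt;ElementTree&lt;/code&gt;; it is not how pygixml computes &lt;code&gt;.xpath&lt;/code&gt;, and the helper &lt;code&gt;walk_with_paths&lt;/code&gt; is a made-up name:&lt;/p&gt;

```python
import xml.etree.ElementTree as ET


def walk_with_paths(element, path):
    """Yield (path, element) pairs, indexing children among same-named siblings."""
    yield path, element
    seen = {}
    for child in element:
        # XPath positions are 1-based and counted per tag name.
        seen[child.tag] = seen.get(child.tag, 0) + 1
        child_path = path + "/" + child.tag + "[" + str(seen[child.tag]) + "]"
        yield from walk_with_paths(child, child_path)


# Same shape as the sample document: a root with two item children.
root = ET.Element("root")
for label in ("Alpha", "Beta"):
    ET.SubElement(root, "item").text = label

paths = [(path, node.text) for path, node in walk_with_paths(root, "/" + root.tag)]
for path, text in paths:
    print(path, text)
```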

&lt;p&gt;Need to iterate?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;recursive&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Want the full text recursively?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;first_child&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;recursive&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
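
&lt;p&gt;If the difference between the two modes is unclear, the standard library has a close analogue. In the sketch below, &lt;code&gt;element.text&lt;/code&gt; plays the role of the non-recursive call and &lt;code&gt;itertext()&lt;/code&gt; the recursive one. This assumes pygixml’s &lt;code&gt;recursive&lt;/code&gt; flag behaves the same way, which is worth confirming in the API documentation:&lt;/p&gt;

```python
import xml.etree.ElementTree as ET

# A root with its own leading text plus two children that carry text.
root = ET.Element("root")
root.text = "intro "
for label in ("Alpha", "Beta"):
    ET.SubElement(root, "item").text = label

# Direct text: only what sits in the element itself, before its first child.
direct_text = root.text or ""

# Recursive text: the element's text plus the text of every descendant.
recursive_text = "".join(root.itertext())

print(repr(direct_text))
print(repr(recursive_text))
```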



&lt;p&gt;pygixml is designed for developers who work with &lt;strong&gt;huge XML files&lt;/strong&gt;, &lt;strong&gt;complex XPath queries&lt;/strong&gt;, and &lt;strong&gt;high-performance data pipelines&lt;/strong&gt;.&lt;br&gt;
It’s production-ready, thread-safe, and ridiculously fast.&lt;/p&gt;

&lt;p&gt;📘 Full API documentation:&lt;br&gt;
&lt;a href="https://mohammadraziei.github.io/pygixml/api.html" rel="noopener noreferrer"&gt;https://mohammadraziei.github.io/pygixml/api.html&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  🌐 And for HTML? Meet selectolax
&lt;/h2&gt;

&lt;p&gt;When it comes to HTML parsing, &lt;strong&gt;&lt;a href="https://github.com/rushter/selectolax" rel="noopener noreferrer"&gt;selectolax&lt;/a&gt;&lt;/strong&gt; fills the same niche that pygixml does for XML.&lt;br&gt;
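It’s a &lt;strong&gt;Cython&lt;/strong&gt; binding to C-based HTML5 engines, &lt;strong&gt;lexbor&lt;/strong&gt; and the older Modest, and offers a familiar, Pythonic interface. The &lt;code&gt;selectolax.parser.HTMLParser&lt;/code&gt; class used in this article runs on Modest; &lt;code&gt;selectolax.lexbor.LexborHTMLParser&lt;/code&gt; offers a very similar interface on lexbor.&lt;/p&gt;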

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;selectolax.parser&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;HTMLParser&lt;/span&gt;

&lt;span class="n"&gt;html&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;html&amp;gt;&amp;lt;body&amp;gt;&amp;lt;div class=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;post&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;&amp;gt;Hello&amp;lt;/div&amp;gt;&amp;lt;div class=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;post&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;&amp;gt;World&amp;lt;/div&amp;gt;&amp;lt;/body&amp;gt;&amp;lt;/html&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="n"&gt;tree&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;HTMLParser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;css&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;div.post&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Hello
World
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It even supports &lt;strong&gt;CSS selectors&lt;/strong&gt; natively, making it ideal for scraping and lightweight DOM manipulation.&lt;/p&gt;

&lt;p&gt;In spirit, &lt;strong&gt;selectolax&lt;/strong&gt; feels like &lt;strong&gt;pygixml’s sibling&lt;/strong&gt; — both written in Cython, both extremely fast, both with modern and minimal APIs.&lt;/p&gt;




&lt;h2&gt;
  
  
  🧠 Why Modern Parsers Make Sense Today
&lt;/h2&gt;

&lt;p&gt;Here’s a summary of how these libraries stack up:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Library&lt;/th&gt;
&lt;th&gt;Domain&lt;/th&gt;
&lt;th&gt;Language Core&lt;/th&gt;
&lt;th&gt;Speed&lt;/th&gt;
&lt;th&gt;API Style&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;ElementTree&lt;/td&gt;
&lt;td&gt;XML&lt;/td&gt;
&lt;td&gt;Python + C accelerator (expat)&lt;/td&gt;
&lt;td&gt;Slow&lt;/td&gt;
&lt;td&gt;Standard&lt;/td&gt;
&lt;td&gt;Built-in, minimal features&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;lxml&lt;/td&gt;
&lt;td&gt;XML/HTML&lt;/td&gt;
&lt;td&gt;C (libxml2)&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;Verbose&lt;/td&gt;
&lt;td&gt;Mature but aging&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;pygixml&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;XML&lt;/td&gt;
&lt;td&gt;C++ (pugixml) + Cython&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Very Fast&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Pythonic&lt;/td&gt;
&lt;td&gt;Full XPath support, mem_id, recursive text&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;selectolax&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;HTML&lt;/td&gt;
&lt;td&gt;C (lexbor) + Cython&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Very Fast&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Pythonic&lt;/td&gt;
&lt;td&gt;CSS selectors, minimal overhead&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Both &lt;strong&gt;pygixml&lt;/strong&gt; and &lt;strong&gt;selectolax&lt;/strong&gt; take the same philosophy:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use low-level performance engines (pugixml, lexbor)&lt;/li&gt;
&lt;li&gt;Expose a simple, modern, Pythonic API&lt;/li&gt;
&lt;li&gt;Strip away decades of legacy overhead&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🚀 Time to Modernize Your Stack
&lt;/h2&gt;

&lt;p&gt;If you’re still using lxml for XML or HTML in new projects, it might be time to consider a faster, cleaner alternative.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;For &lt;strong&gt;XML&lt;/strong&gt;: try &lt;strong&gt;&lt;a href="https://github.com/mohammadraziei/pygixml" rel="noopener noreferrer"&gt;pygixml&lt;/a&gt;&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;For &lt;strong&gt;HTML&lt;/strong&gt;: try &lt;strong&gt;&lt;a href="https://github.com/rushter/selectolax" rel="noopener noreferrer"&gt;selectolax&lt;/a&gt;&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Parsing in Python no longer has to be slow or clunky.&lt;br&gt;
Modern Cython-powered parsers give you the best of both worlds — &lt;strong&gt;the speed of C/C++&lt;/strong&gt; and &lt;strong&gt;the elegance of Python&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  🏁 Final Thoughts
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;lxml&lt;/strong&gt; was a legend — and it still works fine.&lt;br&gt;
But libraries like &lt;strong&gt;pygixml&lt;/strong&gt; and &lt;strong&gt;selectolax&lt;/strong&gt; show what parsing can feel like in 2025:&lt;br&gt;
leaner, faster, and built for the way modern Python developers actually work.&lt;/p&gt;

&lt;p&gt;If you deal with big XML or HTML workloads, give these tools a spin.&lt;br&gt;
You might never look back.&lt;/p&gt;




</description>
      <category>lxml</category>
      <category>xml</category>
      <category>html</category>
      <category>python</category>
    </item>
  </channel>
</rss>
