<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Sergey Dobrov</title>
    <description>The latest articles on DEV Community by Sergey Dobrov (@jbinary).</description>
    <link>https://dev.to/jbinary</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3617962%2Fd0a00be4-2bbf-47c8-9520-da23346408e4.jpeg</url>
      <title>DEV Community: Sergey Dobrov</title>
      <link>https://dev.to/jbinary</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jbinary"/>
    <language>en</language>
    <item>
      <title>40 Lines of Python to Fake a Serial Mouse</title>
      <dc:creator>Sergey Dobrov</dc:creator>
      <pubDate>Tue, 10 Mar 2026 15:41:55 +0000</pubDate>
      <link>https://dev.to/jbinary/40-lines-of-python-to-fake-a-serial-mouse-5cob</link>
      <guid>https://dev.to/jbinary/40-lines-of-python-to-fake-a-serial-mouse-5cob</guid>
      <description>&lt;p&gt;During the COVID lockdowns my son and I started playing &lt;em&gt;The Settlers II&lt;/em&gt; in DOSBox.&lt;/p&gt;

&lt;p&gt;One of the coolest features of the game is that two players can play on the same computer in split screen, each controlling their own cursor — a surprisingly social multiplayer mode for a 1996 strategy game.&lt;/p&gt;

&lt;p&gt;The trick is that the second player uses a serial mouse.&lt;/p&gt;

&lt;p&gt;Unfortunately modern operating systems don't really expose the concept of multiple independent mice anymore — they all get merged into a single pointer.&lt;/p&gt;

&lt;p&gt;So if I wanted that old-school multiplayer experience back, I needed an adapter.&lt;/p&gt;

&lt;p&gt;My first thought was simple: maybe I can fake a serial mouse?&lt;/p&gt;

&lt;p&gt;In Unix systems everything is &lt;em&gt;a file&lt;/em&gt;. If DOSBox expects a serial device, perhaps I can generate the right byte stream and feed it to the emulator?&lt;/p&gt;

&lt;p&gt;Reading through the Linux docs and running &lt;code&gt;hexdump&lt;/code&gt; on the mouse device, I realized I needed to convert the PS/2 mouse protocol into the Microsoft serial mouse protocol. Descriptions of both are easy to find online: &lt;a href="https://roborooter.com/post/serial-mice/" rel="noopener noreferrer"&gt;https://roborooter.com/post/serial-mice/&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;A few key differences stand out:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Both communicate via groups of three bytes;&lt;/li&gt;
&lt;li&gt;Both encode movement as signed deltas, but they lay the bits out differently: PS/2 keeps the sign bits in the status byte, while the Microsoft protocol reserves the most significant bit of each byte for framing and packs the two high bits of each coordinate into the first byte;&lt;/li&gt;
&lt;li&gt;As a result PS/2 effectively has one extra bit of precision per axis (9-bit deltas versus 8-bit).&lt;/li&gt;
&lt;/ul&gt;
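
&lt;p&gt;To make the packing concrete, here is a small sketch of my own (not part of the original script, and with button bits omitted) that builds a Microsoft-protocol packet from signed deltas:&lt;/p&gt;

```python
def ms_serial_packet(dx, dy):
    """Encode one movement as a Microsoft serial mouse packet (buttons omitted).

    Each byte carries 7 useful bits; the top bit is the framing bit,
    set only on the first byte of a packet.
    """
    dx %= 256                    # reinterpret the signed delta as a byte
    dy %= 256
    x_hi, x_lo = divmod(dx, 64)  # split into high 2 bits / low 6 bits
    y_hi, y_lo = divmod(dy, 64)
    byte1 = 0b01000000 + y_hi * 4 + x_hi
    return bytes([byte1, x_lo, y_lo])

# Moving 5 left and 3 down still fits in a single three-byte packet:
packet = ms_serial_packet(-5, 3)
```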

&lt;p&gt;The first idea was to just write the packets into a pipe and connect DOSBox to it.&lt;/p&gt;

&lt;p&gt;Unfortunately, DOSBox expects something that behaves like a real serial device, not just a pipe.&lt;/p&gt;

&lt;p&gt;Eventually I landed on &lt;code&gt;socat&lt;/code&gt;, which can create a pair of pseudo-terminals:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;socat &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; pty,raw,echo&lt;span class="o"&gt;=&lt;/span&gt;0 pty,raw,echo&lt;span class="o"&gt;=&lt;/span&gt;0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates two linked devices:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/dev/pts/24
/dev/pts/25
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Whatever you write to one appears on the other.&lt;/p&gt;
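
&lt;p&gt;You can try the same idea from Python alone with &lt;code&gt;os.openpty()&lt;/code&gt;, which hands you one linked master/slave pair — a rough stand-in for the two devices &lt;code&gt;socat&lt;/code&gt; links together:&lt;/p&gt;

```python
import os
import tty

# os.openpty() returns a connected master/slave pair, similar in spirit
# to the two pts devices socat creates.
master, slave = os.openpty()
tty.setraw(slave)          # raw mode: no echo, no line buffering
print(os.ttyname(slave))   # a /dev/pts/N path another program could open

os.write(master, b"hello")
assert os.read(slave, 5) == b"hello"
```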

&lt;p&gt;Now DOSBox can connect to one side:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;&lt;span class="n"&gt;serial1&lt;/span&gt;=&lt;span class="n"&gt;directserial&lt;/span&gt; &lt;span class="n"&gt;realport&lt;/span&gt;:&lt;span class="n"&gt;pts&lt;/span&gt;/&lt;span class="m"&gt;25&lt;/span&gt; &lt;span class="n"&gt;rxdelay&lt;/span&gt;:&lt;span class="m"&gt;0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And the script writes to the other.&lt;/p&gt;

&lt;p&gt;The first version worked — but the mouse felt strange: movements were extremely smooth and continued even after I stopped moving the mouse.&lt;/p&gt;

&lt;p&gt;Modern mice have extremely high DPI, which means the adapter was sending a huge number of tiny movements.&lt;/p&gt;

&lt;p&gt;DOSBox replayed them with a delay.&lt;/p&gt;

&lt;p&gt;The solution was simple: accumulate movement and send it in larger steps.&lt;/p&gt;

&lt;p&gt;I deliberately didn’t try to make it neat or reusable — it only needed to run on my laptop:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;struct&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;struct&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pack&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;b&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;struct&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pack&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;b&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;#
&lt;/span&gt;    &lt;span class="n"&gt;byte1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mb"&gt;0b01000000&lt;/span&gt;
    &lt;span class="n"&gt;byte1&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="mb"&gt;0b11000000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;
    &lt;span class="n"&gt;byte1&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="mb"&gt;0b11000000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;
    &lt;span class="c1"&gt;#
&lt;/span&gt;    &lt;span class="n"&gt;buff&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;byte1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;buff&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="mb"&gt;0b00111111&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;buff&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="mb"&gt;0b00111111&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;bytes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buff&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;


&lt;span class="n"&gt;BUNDLING&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;SENSITIVITY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;


&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/dev/pts/24&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;wb&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;acc_dx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;acc_dy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/dev/input/mouse1&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;rb&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;mouse&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;let&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s get it started!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;b1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mouse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;b1&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="mb"&gt;0b00010000&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;dx&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;256&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;b1&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="mb"&gt;0b00100000&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;dy&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;256&lt;/span&gt;
            &lt;span class="n"&gt;acc_dx&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;dx&lt;/span&gt;
            &lt;span class="n"&gt;acc_dy&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;dy&lt;/span&gt;

            &lt;span class="n"&gt;dx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;acc_dx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;BUNDLING&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;dx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;acc_dx&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="n"&gt;SENSITIVITY&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;acc_dx&lt;/span&gt; &lt;span class="o"&gt;-=&lt;/span&gt; &lt;span class="n"&gt;dx&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;SENSITIVITY&lt;/span&gt;

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;acc_dy&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;BUNDLING&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;dy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;acc_dy&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="n"&gt;SENSITIVITY&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;acc_dy&lt;/span&gt; &lt;span class="o"&gt;-=&lt;/span&gt; &lt;span class="n"&gt;dy&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;SENSITIVITY&lt;/span&gt;

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;dx&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;dy&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;dy&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As you can see, the script is very simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Main loop reads PS/2 packets from &lt;code&gt;/dev/input/mouse1&lt;/code&gt;;&lt;/li&gt;
&lt;li&gt;Decodes movement deltas;&lt;/li&gt;
&lt;li&gt;Accumulates them over time;&lt;/li&gt;
&lt;li&gt;Applies &lt;code&gt;BUNDLING&lt;/code&gt; to avoid overwhelming DOSBox with packets;&lt;/li&gt;
&lt;li&gt;Applies &lt;code&gt;SENSITIVITY&lt;/code&gt; so that a high-DPI mouse feels right at the low resolution of the old game;&lt;/li&gt;
&lt;li&gt;Every now and then the &lt;code&gt;send&lt;/code&gt; routine builds a Microsoft serial mouse packet from the accumulated movement and writes it to one &lt;code&gt;pts&lt;/code&gt; device, so that &lt;code&gt;socat&lt;/code&gt; mirrors it into the other one, which DOSBox reads.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Note that the Microsoft protocol has slightly lower resolution for movement deltas than PS/2, so very large movements would technically need to be split across multiple packets. In practice this isn't an issue because mice report movement in small increments.&lt;/p&gt;
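
&lt;p&gt;If you did want to handle that edge case, the splitting could be sketched like this (&lt;code&gt;split_delta&lt;/code&gt; is a hypothetical helper, not part of the original script):&lt;/p&gt;

```python
def split_delta(delta, limit=127):
    """Split a large movement delta into chunks that each fit a signed byte."""
    chunks = []
    while abs(delta) > limit:
        step = limit if delta > 0 else -limit
        chunks.append(step)
        delta -= step
    chunks.append(delta)   # remainder always fits
    return chunks
```

&lt;p&gt;Each chunk would then be sent as its own packet; the deltas sum back to the original movement.&lt;/p&gt;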

&lt;h2&gt;
  
  
  The UNIX way
&lt;/h2&gt;

&lt;p&gt;In the end the entire adapter was about 40 lines of Python, no third-party libraries — just the standard library and a bit of Unix plumbing.&lt;/p&gt;

&lt;p&gt;Interestingly, modern DOSBox builds now support this feature natively: &lt;a href="https://www.dosbox-staging.org/releases/release-notes/0.80.0/#dual-mouse-gaming" rel="noopener noreferrer"&gt;https://www.dosbox-staging.org/releases/release-notes/0.80.0/#dual-mouse-gaming&lt;/a&gt;, so the little adapter is no longer necessary.&lt;/p&gt;

&lt;p&gt;But solving the problem was half the fun — and it’s a nice illustration of how Unix-style abstractions let you insert a tiny translator between two pieces of software from completely different eras.&lt;/p&gt;

&lt;p&gt;Even if one of them thinks it's talking to a serial mouse from 1995.&lt;/p&gt;

</description>
      <category>linux</category>
      <category>python</category>
      <category>beginners</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>The Best Engineers Can Move Between Abstraction Layers</title>
      <dc:creator>Sergey Dobrov</dc:creator>
      <pubDate>Wed, 18 Feb 2026 10:57:14 +0000</pubDate>
      <link>https://dev.to/jbinary/the-best-engineers-can-move-between-abstraction-layers-ee8</link>
      <guid>https://dev.to/jbinary/the-best-engineers-can-move-between-abstraction-layers-ee8</guid>
      <description>&lt;p&gt;One pattern I’ve noticed over the years: the strongest engineers can move between abstraction layers almost effortlessly.&lt;/p&gt;

&lt;p&gt;They write high-level code comfortably — but when something behaves strangely, they instinctively descend a few layers, reason about what’s happening underneath, and come back with a fix.&lt;/p&gt;

&lt;p&gt;Here’s a simple thought experiment.&lt;/p&gt;

&lt;p&gt;Ask someone why Python code using dense numeric arrays is much faster than similar logic written with regular Python objects.&lt;/p&gt;

&lt;p&gt;A surface-level answer is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Because NumPy is written in C.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A deeper answer mentions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;contiguous memory&lt;/li&gt;
&lt;li&gt;cache locality&lt;/li&gt;
&lt;li&gt;pointer indirection&lt;/li&gt;
&lt;li&gt;object metadata overhead&lt;/li&gt;
&lt;li&gt;how CPUs actually operate on memory&lt;/li&gt;
&lt;/ul&gt;
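
&lt;p&gt;You can see part of that overhead directly from the standard library — the exact sizes below are CPython-specific, so treat the numbers as illustrative:&lt;/p&gt;

```python
import sys
from array import array

n = 1000
as_objects = list(range(n))        # a list stores pointers to int objects
as_dense = array('q', range(n))    # one contiguous block of 8-byte ints

# Each Python int is a full heap object with a header...
print(sys.getsizeof(12345))        # roughly 28 bytes on 64-bit CPython
# ...while the dense array spends just its item size per element:
print(as_dense.itemsize)           # 8
```

&lt;p&gt;Contiguous 8-byte slots are what lets the CPU stream through cache lines instead of chasing pointers.&lt;/p&gt;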

&lt;p&gt;That explanation travels from Python syntax down to memory layout and hardware behavior — and back.&lt;/p&gt;

&lt;p&gt;That ability to move across layers isn’t academic. It shows up in very ordinary work.&lt;/p&gt;

&lt;h2&gt;
  
  
  A Very Ordinary API Problem
&lt;/h2&gt;

&lt;p&gt;Recently I was integrating with an external REST API that was painfully slow.&lt;/p&gt;

&lt;p&gt;It returned paginated results and also reported the total number of matching records. The task was straightforward: fetch everything.&lt;/p&gt;

&lt;p&gt;The original implementation used broad filters over the entire time range and fetched page by page with large page sizes.&lt;/p&gt;

&lt;p&gt;At the HTTP layer, that approach makes sense:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fewer requests&lt;/li&gt;
&lt;li&gt;Larger pages&lt;/li&gt;
&lt;li&gt;Simpler control flow&lt;/li&gt;
&lt;li&gt;“Minimize round trips”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It feels efficient. But performance was terrible.&lt;/p&gt;

&lt;p&gt;Instead of tweaking page size or adding concurrency, I asked a different question:&lt;/p&gt;

&lt;p&gt;— Why is the API returning a total count?&lt;/p&gt;

&lt;p&gt;— That likely means a &lt;code&gt;COUNT(*)&lt;/code&gt; somewhere. And pagination with &lt;code&gt;LIMIT … OFFSET&lt;/code&gt; usually hides a cost:&lt;/p&gt;

&lt;p&gt;To serve page N, the database must scan and discard the previous N × page_size rows.&lt;/p&gt;

&lt;p&gt;So with broad filters over a large time range, each successive page becomes more expensive than the previous one.&lt;/p&gt;
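
&lt;p&gt;A back-of-the-envelope model makes the quadratic growth obvious (pure arithmetic, my own illustration — no database needed):&lt;/p&gt;

```python
def rows_scanned(total, page_size):
    """Rows a database touches to serve every page via LIMIT/OFFSET.

    Page k starts at offset k * page_size, and the database must walk
    past all skipped rows before returning the page, so the total work
    grows quadratically with the number of pages.
    """
    pages = (total + page_size - 1) // page_size
    return sum(k * page_size + page_size for k in range(pages))

# 100,000 rows in pages of 1,000: about 5 million rows scanned
# just to return 100,000.
cost = rows_scanned(100_000, 1_000)
```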

&lt;p&gt;At the REST layer, everything looked clean.&lt;/p&gt;

&lt;p&gt;Underneath, it was probably:&lt;/p&gt;

&lt;p&gt;REST&lt;br&gt;
→ controller&lt;br&gt;
→ SQL with &lt;code&gt;COUNT(*)&lt;/code&gt;&lt;br&gt;
→ &lt;code&gt;LIMIT/OFFSET&lt;/code&gt;&lt;br&gt;
→ large scans and discarded rows&lt;/p&gt;

&lt;p&gt;So instead of fetching everything broadly, I changed the access pattern.&lt;/p&gt;

&lt;p&gt;Fetch day by day:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Smaller slices.&lt;/li&gt;
&lt;li&gt;Better index selectivity.&lt;/li&gt;
&lt;li&gt;Much smaller offsets.&lt;/li&gt;
&lt;li&gt;Less discarded work.&lt;/li&gt;
&lt;/ul&gt;
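
&lt;p&gt;The reshaped access pattern is just a loop over days. A sketch — &lt;code&gt;fetch_page&lt;/code&gt; and its parameters are stand-ins for whatever the real client exposes:&lt;/p&gt;

```python
from datetime import date, timedelta

def fetch_all(fetch_page, start: date, end: date):
    """Fetch every record one day at a time instead of one huge range."""
    records = []
    for offset in range((end - start).days + 1):
        day = start + timedelta(days=offset)
        page = 0
        while True:
            batch = fetch_page(day=day, page=page)  # hypothetical client call
            if not batch:
                break                # no more pages for this day
            records.extend(batch)
            page += 1
    return records
```

&lt;p&gt;Each day's query hits a narrow, index-friendly slice, and the offsets within a day stay tiny.&lt;/p&gt;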

&lt;p&gt;Result: roughly a 10× speedup.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No access to their database.&lt;/li&gt;
&lt;li&gt;No vendor escalation.&lt;/li&gt;
&lt;li&gt;No architectural rewrite.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Just switching layers mentally, adjusting the shape of the query, and moving back up.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Matters
&lt;/h2&gt;

&lt;p&gt;Most of our daily work lives at high levels of abstraction: frameworks, APIs, cloud services.&lt;/p&gt;

&lt;p&gt;And that’s good — that’s where productivity lives.&lt;/p&gt;

&lt;p&gt;But performance issues, strange behavior, and cost explosions rarely originate at the same layer where they surface.&lt;/p&gt;

&lt;p&gt;The engineers who consistently deliver under pressure are usually the ones who can descend a few layers, reason about what’s actually happening, and then come back with a pragmatic fix.&lt;/p&gt;

&lt;p&gt;Not because they love low-level code.&lt;/p&gt;

&lt;p&gt;But because they’re not confined to a single abstraction.&lt;/p&gt;

&lt;p&gt;And this becomes even more important as more code is generated for us and more infrastructure is abstracted away. When AI writes the boilerplate and platforms hide the machinery, the differentiator isn’t how quickly you can produce code — it’s how well you understand what happens underneath it.&lt;/p&gt;

&lt;p&gt;That ability to move across layers becomes rarer.&lt;/p&gt;

&lt;p&gt;And that skill compounds.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>performance</category>
      <category>architecture</category>
      <category>career</category>
    </item>
    <item>
<title>Rediscovering Unix Pipelines: Two Backup Problems, One Mindset</title>
      <dc:creator>Sergey Dobrov</dc:creator>
      <pubDate>Mon, 08 Dec 2025 11:31:08 +0000</pubDate>
      <link>https://dev.to/jbinary/rediscovering-unix-pipelines-two-backup-problems-one-mindset-59n7</link>
      <guid>https://dev.to/jbinary/rediscovering-unix-pipelines-two-backup-problems-one-mindset-59n7</guid>
      <description>&lt;p&gt;Modern engineers often reach for JSON parsing, temporary files, or orchestration tools. Unix pipelines still outperform them more often than you'd expect. Two recent backup tasks reminded me how often we over-engineer simple data-movement problems.&lt;/p&gt;

&lt;p&gt;Two backup problems, same pattern: in both cases the first instinct was a complex solution involving temporary files, JSON parsing, and extra dependencies. Both collapsed into simple pipelines.&lt;/p&gt;

&lt;h2&gt;
  
  
  Problem 1: Pruning Old Backups
&lt;/h2&gt;

&lt;p&gt;I needed to keep the latest seven backups in object storage and delete everything older.&lt;/p&gt;

&lt;p&gt;First instinct: treat it like an application problem.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mc &lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="nt"&gt;--recursive&lt;/span&gt; &lt;span class="nt"&gt;--json&lt;/span&gt; bucket/ &lt;span class="se"&gt;\&lt;/span&gt;
  | jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'. | "\(.lastModified) \(.key)"'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  | &lt;span class="nb"&gt;sort&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  | … &lt;span class="c"&gt;# extract keys, build array, loop, delete&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Parse JSON, build arrays, extract timestamps, loop through objects.&lt;/p&gt;

&lt;p&gt;But backups already encode timestamps in filenames. &lt;code&gt;mc find&lt;/code&gt; already prints one file per line. &lt;code&gt;sort&lt;/code&gt; already orders strings. &lt;code&gt;head&lt;/code&gt; already drops lines.&lt;/p&gt;

&lt;p&gt;The whole task collapses:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mc find bucket/ &lt;span class="nt"&gt;--name&lt;/span&gt; &lt;span class="s2"&gt;"backup-*.gz"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  | &lt;span class="nb"&gt;sort&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  | &lt;span class="nb"&gt;head&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="nt"&gt;-7&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  | &lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="nb"&gt;read&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; file&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do &lt;/span&gt;mc &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$file&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No JSON. No arrays. No parsing. Just text flowing through composable tools.&lt;/p&gt;

&lt;h2&gt;
  
  
  Problem 2: Creating Backups
&lt;/h2&gt;

&lt;p&gt;A teammate needed to dump PostgreSQL, compress it, and upload to object storage.&lt;/p&gt;

&lt;p&gt;His approach (roughly):&lt;/p&gt;

&lt;p&gt;First, upload a script to the server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;scp backup.sh db-server:/tmp/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then run it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ssh db-server /tmp/backup.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The script itself:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
pg_dump mydb &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /tmp/backup.sql
&lt;span class="nb"&gt;gzip&lt;/span&gt; /tmp/backup.sql
mc &lt;span class="nb"&gt;cp&lt;/span&gt; /tmp/backup.sql.gz s3://bucket/backup-&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%Y%m%d&lt;span class="si"&gt;)&lt;/span&gt;.sql.gz
&lt;span class="nb"&gt;rm&lt;/span&gt; /tmp/backup.sql.gz
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This requires:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Uploading the script first&lt;/li&gt;
&lt;li&gt;Installing &lt;code&gt;mc&lt;/code&gt; on the database server&lt;/li&gt;
&lt;li&gt;Managing temporary files&lt;/li&gt;
&lt;li&gt;Cleanup logic&lt;/li&gt;
&lt;li&gt;Disk space for both uncompressed and compressed data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each of these steps adds friction, state, and failure modes.&lt;br&gt;
But none of them are actually required:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ssh db-server &lt;span class="s2"&gt;"pg_dump mydb | gzip"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  | mc pipe s3://bucket/backup-&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%Y%m%d&lt;span class="si"&gt;)&lt;/span&gt;.sql.gz
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From two commands plus a bash script to one line, with nothing to install or clean up on the database server.&lt;/p&gt;

&lt;h2&gt;
  
  
  Bonus: Free Parallelism
&lt;/h2&gt;

&lt;p&gt;That backup pipeline runs three processes simultaneously: &lt;code&gt;pg_dump&lt;/code&gt; generating data, &lt;code&gt;gzip&lt;/code&gt; compressing it, &lt;code&gt;mc&lt;/code&gt; uploading it. No threading code. No coordination.&lt;/p&gt;

&lt;p&gt;The temporary-file version? Strictly sequential. Each step waits for the previous to finish.&lt;/p&gt;

&lt;p&gt;Pipes give you streaming parallelism by default.&lt;/p&gt;

&lt;p&gt;Need more? &lt;code&gt;xargs&lt;/code&gt; and GNU &lt;code&gt;parallel&lt;/code&gt; scale across cores:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Compress 100 files with 4 workers&lt;/span&gt;
find &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;"*.log"&lt;/span&gt; | xargs &lt;span class="nt"&gt;-P&lt;/span&gt; 4 &lt;span class="nt"&gt;-I&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt; &lt;span class="nb"&gt;gzip&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same pipeline thinking, multiplied across CPUs.&lt;/p&gt;

&lt;h2&gt;
  
  
  When NOT to Use Pipes
&lt;/h2&gt;

&lt;p&gt;Pipelines aren't always the answer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Complex state management&lt;/strong&gt; - multiple passes over data, tracking relationships&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Explicit error handling&lt;/strong&gt; - pipelines can fail silently&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Extreme scale&lt;/strong&gt; - terabytes need Spark/Hadoop&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bash quirkiness&lt;/strong&gt; - arcane quoting, clunky error handling&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Framework requirements&lt;/strong&gt; - ETL tools gain orchestration but lose composability&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Team familiarity&lt;/strong&gt; - if pipelines are cryptic to your team, write Python&lt;/li&gt;
&lt;/ul&gt;
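
&lt;p&gt;The silent-failure point deserves one concrete mitigation. By default a pipeline's exit status is the last command's, so an early failure vanishes; in bash, &lt;code&gt;set -o pipefail&lt;/code&gt; surfaces it:&lt;/p&gt;

```shell
# Default: the pipeline "succeeds" even though the first command failed.
false | cat
echo "without pipefail: $?"    # prints 0

# With pipefail the pipeline reports the first failing command.
set -o pipefail
false | cat
echo "with pipefail: $?"       # prints 1
```

&lt;p&gt;Combined with &lt;code&gt;set -e&lt;/code&gt;, this turns a quietly wrong backup job into one that fails loudly.&lt;/p&gt;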

&lt;p&gt;Unix pipes work best for glue code: moving data between systems, transforming formats, filtering streams.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Pattern to Look For
&lt;/h2&gt;

&lt;p&gt;Whenever you find yourself:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Loading data into arrays&lt;/li&gt;
&lt;li&gt;Writing temporary files&lt;/li&gt;
&lt;li&gt;Parsing structured data just to transform it&lt;/li&gt;
&lt;li&gt;Installing tools on systems just to move data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;…there's probably a pipeline waiting to be discovered.&lt;/p&gt;

&lt;p&gt;Not because pipelines are "better." Because they're simpler. Simple solutions are easier to debug, modify, and maintain.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Fifty years later, Unix pipelines still win through elimination.&lt;/p&gt;

&lt;p&gt;They eliminate temporary state. They eliminate dependencies. They eliminate the complexity of treating data movement as a programming exercise.&lt;/p&gt;

&lt;p&gt;Your turn: What's a problem you recently solved with a pipeline instead of code? Or where you wrote code when a pipeline would have worked?&lt;/p&gt;

</description>
      <category>bash</category>
      <category>devops</category>
      <category>productivity</category>
      <category>cli</category>
    </item>
    <item>
      <title>You don't understand GIL</title>
      <dc:creator>Sergey Dobrov</dc:creator>
      <pubDate>Thu, 27 Nov 2025 16:14:07 +0000</pubDate>
      <link>https://dev.to/jbinary/you-dont-understand-gil-2ce7</link>
      <guid>https://dev.to/jbinary/you-dont-understand-gil-2ce7</guid>
      <description>&lt;p&gt;Some time ago I was chatting with a friend about programming languages, and the conversation drifted — inevitably — to why Python is “bad.”&lt;/p&gt;

&lt;p&gt;The first argument was that Python has “no types,” which makes it error-prone.&lt;br&gt;
I pushed back: Python’s modern type system is surprisingly expressive, and meanwhile Java manages to produce endless null-pointer exceptions despite all its ceremony. That point didn’t land well.&lt;/p&gt;

&lt;p&gt;So the next argument came out:&lt;/p&gt;

&lt;p&gt;“Anyway, Python can’t even use more than one CPU core because of the GIL.”&lt;/p&gt;

&lt;p&gt;I didn’t try to debate it.&lt;br&gt;
Instead, I opened a terminal, ran a data-processing script I had lying around, and showed them &lt;code&gt;top&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;One Python process was using exactly 300% CPU. They did not believe it.&lt;/p&gt;

&lt;p&gt;And that moment captures something important: the GIL is one of the most confidently misunderstood ideas in all of programming.&lt;/p&gt;
&lt;h2&gt;Why the GIL Seems Simple but Isn’t&lt;/h2&gt;

&lt;p&gt;What makes this even trickier is that the GIL actually exists to make Python simple.&lt;br&gt;
It lets most of the runtime behave like a friendly, high-level environment where you never have to think about memory management, object lifetimes, or thread safety inside the interpreter.&lt;/p&gt;

&lt;p&gt;But concurrency is where that abstraction finally hits its boundary.&lt;br&gt;
The GIL doesn’t behave the same way in every situation — it takes different forms under different workloads.&lt;/p&gt;

&lt;p&gt;That’s why so many explanations are technically correct yet still misleading.&lt;/p&gt;

&lt;p&gt;So let’s peel it back one layer at a time.&lt;/p&gt;
&lt;h3&gt;Layer 1: “Threads Are Useless in Python”&lt;/h3&gt;

&lt;p&gt;A common starting point is the idea that Python threads are “useless” because of the GIL.&lt;/p&gt;

&lt;p&gt;If that were true, CPython wouldn’t bother with real OS threads — it could have simulated concurrency with simple user-level green threads.&lt;/p&gt;

&lt;p&gt;But CPython uses real &lt;code&gt;pthreads&lt;/code&gt; for a reason: threads matter in Python.&lt;/p&gt;
&lt;h3&gt;Layer 2: “Threads Only Help for I/O”&lt;/h3&gt;

&lt;p&gt;A slightly more sophisticated version of this is:&lt;/p&gt;

&lt;p&gt;“Threads are only useful when something blocks on I/O. Otherwise the GIL stops everything.”&lt;/p&gt;

&lt;p&gt;This sounds reasonable — but it’s still not right.&lt;/p&gt;

&lt;p&gt;And we already saw that in the very beginning: our Python process was happily using 300% CPU, with no I/O involved at all.&lt;/p&gt;

&lt;p&gt;So clearly something else is going on.&lt;/p&gt;
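&lt;p&gt;You can reproduce the effect yourself with nothing but the standard library. This sketch hashes four independent buffers in four threads; &lt;code&gt;hashlib&lt;/code&gt; drops the GIL around its digest loop, so while it runs, &lt;code&gt;top&lt;/code&gt; shows the process well above 100% CPU with zero I/O involved:&lt;/p&gt;

```python
import hashlib
import threading

def hash_chunk(data: bytes, out: dict, i: int) -> None:
    # sha256 over a large buffer: pure native computation, GIL released inside
    out[i] = hashlib.sha256(data).hexdigest()

# Four independent 20 MB buffers; scale them up to watch the effect in top.
chunks = [bytes([i]) * 20_000_000 for i in range(4)]
results = {}
threads = [threading.Thread(target=hash_chunk, args=(c, results, i))
           for i, c in enumerate(chunks)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The threaded run produces exactly what a serial run would.
assert [results[i] for i in range(4)] == [hashlib.sha256(c).hexdigest() for c in chunks]
```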
&lt;h3&gt;Layer 3: “Threads Help for I/O &lt;em&gt;or&lt;/em&gt; Native Code — So Problem Solved?”&lt;/h3&gt;

&lt;p&gt;Once people realize threads aren’t limited to I/O, they usually expand the model:&lt;/p&gt;

&lt;p&gt;“Okay, fine — if a thread is blocked on I/O or running C code, another thread can run. That’s the whole story.”&lt;/p&gt;

&lt;p&gt;Closer, but still not exactly. Because here’s another catch:&lt;/p&gt;

&lt;p&gt;Not &lt;em&gt;all&lt;/em&gt; C libraries release the GIL. &lt;em&gt;Some&lt;/em&gt; hold it for the entire call.&lt;/p&gt;

&lt;p&gt;And when they do, the interpreter goes right back to single-threaded behavior, even though no Python code is executing.&lt;/p&gt;

&lt;p&gt;This is why you can see one native operation scale beautifully across cores, while another operation — also written in C — completely freezes out every other thread in the process.&lt;/p&gt;
&lt;h3&gt;Layer 4: Why Some C Libraries Must Hold the GIL&lt;/h3&gt;

&lt;p&gt;So, the “I/O or C computation frees another thread” intuition is still incomplete.&lt;br&gt;
It depends entirely on what the C library is doing under the hood... and whether it needs exclusive access to Python objects or interpreter state.&lt;/p&gt;

&lt;p&gt;Every Python object — even something as small as an &lt;code&gt;int&lt;/code&gt; or a &lt;code&gt;bytes&lt;/code&gt; object — carries a reference count that tracks how many places are using it.&lt;br&gt;
Incrementing and decrementing these counts must happen one at a time.&lt;br&gt;
If two threads updated them independently, you’d get leaked objects, prematurely freed objects, or corrupted memory.&lt;/p&gt;
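&lt;p&gt;You can watch these counts from Python itself. A small illustration, assuming CPython (since &lt;code&gt;sys.getrefcount&lt;/code&gt; is an implementation detail):&lt;/p&gt;

```python
import sys

x = object()
r1 = sys.getrefcount(x)  # the count includes the temporary reference
                         # created by passing x into the call itself
y = x                    # binding a second name adds one more reference
r2 = sys.getrefcount(x)
assert r2 == r1 + 1      # this bookkeeping happens on every assignment,
                         # which is exactly what the GIL keeps consistent
```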

&lt;p&gt;But refcounts are only the beginning.&lt;/p&gt;

&lt;p&gt;Any C code that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;creates or destroys Python objects&lt;/li&gt;
&lt;li&gt;mutates a Python object (appending to a &lt;code&gt;list&lt;/code&gt;, updating a &lt;code&gt;dict&lt;/code&gt;, modifying a &lt;code&gt;set&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;raises exceptions&lt;/li&gt;
&lt;li&gt;calls back into Python code&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;…relies on parts of the interpreter that assume exclusive access.&lt;/p&gt;

&lt;p&gt;And the only way to guarantee that is to hold the GIL.&lt;/p&gt;

&lt;p&gt;Whether a C extension runs happily across multiple cores or blocks every thread in the process comes down to one question: does it need to interact with Python objects, or can it work on its own data structures without consulting the interpreter?&lt;/p&gt;
&lt;h3&gt;Layer 5: Why “Just Use Multiprocessing” Is an Oversimplification&lt;/h3&gt;

&lt;p&gt;When people get frustrated with the GIL, the next instinct is usually: “Okay, forget threads. Just use multiprocessing.”&lt;/p&gt;

&lt;p&gt;On paper it sounds perfect — each process has its own GIL, so you get true parallelism, and sometimes that is the right approach.&lt;/p&gt;

&lt;p&gt;But as a general rule, it’s still an oversimplification.&lt;/p&gt;

&lt;p&gt;Multiprocessing has real costs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Most data still has to be serialized to move between processes (shared memory only works for a limited set of data types)&lt;/li&gt;
&lt;li&gt;Even with copy-on-write on Unix, large Python objects often end up duplicated anyway — refcount updates alone are enough to break CoW&lt;/li&gt;
&lt;li&gt;Starting a process has much higher overhead than starting a thread (a new interpreter instance, new memory space, new OS structures — not just a new stack)&lt;/li&gt;
&lt;li&gt;Context switching between processes is heavier for the OS than between threads&lt;/li&gt;
&lt;li&gt;Coordinating state is harder when it can’t be freely shared&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So yes — multiprocessing sidesteps the GIL for CPU-bound Python bytecode.&lt;br&gt;
But it brings fully separate runtimes, higher startup costs, heavier context switches, and more complicated data sharing.&lt;/p&gt;
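&lt;p&gt;The serialization cost in particular is easy to underestimate. A quick sketch of what crossing a process boundary actually involves, using &lt;code&gt;pickle&lt;/code&gt; directly (which is essentially what &lt;code&gt;multiprocessing&lt;/code&gt; does under the hood for most objects):&lt;/p&gt;

```python
import pickle

# A modest payload: one million Python ints.
payload = list(range(1_000_000))
blob = pickle.dumps(payload, protocol=pickle.HIGHEST_PROTOCOL)

# Every transfer pays this twice: dumps in the sender, loads in the receiver.
restored = pickle.loads(blob)
print(f"pickled size: {len(blob) / 1e6:.1f} MB")
assert restored == payload
```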
&lt;h3&gt;How to Tell Whether a C Library Releases the GIL&lt;/h3&gt;

&lt;p&gt;At this point a natural question comes up:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;“So how do I know whether a C library actually releases the GIL?”&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Unfortunately, you can’t always tell from the outside — two libraries that look identical in Python can have completely different GIL behavior.&lt;br&gt;
But there are a few practical ways to reason about it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. If a C extension touches Python objects, it must hold the GIL — but only for those parts&lt;/strong&gt;&lt;br&gt;
This doesn’t mean it needs the GIL for the entire function.&lt;br&gt;
A well-written C extension typically does this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Acquire GIL&lt;/li&gt;
&lt;li&gt;Inspect or convert Python inputs (refcounts, type checks, copies, etc.)&lt;/li&gt;
&lt;li&gt;Release GIL&lt;/li&gt;
&lt;li&gt;Run the heavy native computation&lt;/li&gt;
&lt;li&gt;Reacquire GIL&lt;/li&gt;
&lt;li&gt;Build Python output objects&lt;/li&gt;
&lt;li&gt;Return&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;touching Python objects → requires the GIL&lt;/li&gt;
&lt;li&gt;heavy inner computation → often does not&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is why you can see “C code + threads” giving real multi-core speedups despite Python objects being involved at the boundaries.&lt;/p&gt;

&lt;p&gt;Some of the standard library’s own C extensions follow this pattern.&lt;br&gt;
For example, &lt;code&gt;zlib&lt;/code&gt;, &lt;code&gt;bz2&lt;/code&gt;, and &lt;code&gt;hashlib&lt;/code&gt; all parse Python arguments and allocate Python objects while holding the GIL, then drop the GIL around the inner compression or hashing loop, and reacquire it only to wrap up the result.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Pure native computation usually releases the GIL cleanly&lt;/strong&gt;&lt;br&gt;
Libraries like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;zlib / bz2 / lzma&lt;/li&gt;
&lt;li&gt;hashing&lt;/li&gt;
&lt;li&gt;crypto&lt;/li&gt;
&lt;li&gt;NumPy (when you do &lt;code&gt;array1 + array2&lt;/code&gt;, NumPy releases the GIL around the actual arithmetic loop. But when you do something like &lt;code&gt;array.tolist()&lt;/code&gt;, it can't because it's creating Python objects)&lt;/li&gt;
&lt;li&gt;many image codecs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;operate on raw buffers or on their own internal data structures.&lt;br&gt;
They typically wrap the expensive part with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;Py_BEGIN_ALLOW_THREADS&lt;/span&gt;
    &lt;span class="cm"&gt;/* heavy computation */&lt;/span&gt;
&lt;span class="n"&gt;Py_END_ALLOW_THREADS&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is how you get things like “Python using 300% CPU” from a single process.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Some native libraries can’t release the GIL much — because their core work is Python interaction&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;regex engines working on Python string objects (the standard &lt;code&gt;re&lt;/code&gt; module &lt;a href="https://bugs.python.org/issue1366311" rel="noopener noreferrer"&gt;doesn't release the GIL&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Python-level parsing loops (e.g. &lt;code&gt;json&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;libraries that mutate lists or dicts internally (e.g. &lt;code&gt;py-radix&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;things that allocate many intermediate Python objects (e.g. &lt;code&gt;csv&lt;/code&gt;, &lt;code&gt;json&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These must hold the GIL almost the whole time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Docs might mention it — but inconsistently&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Some libraries explicitly document their GIL behavior (&lt;a href="https://numpy.org/doc/stable/reference/thread_safety.html" rel="noopener noreferrer"&gt;NumPy&lt;/a&gt; and &lt;a href="https://docs.scipy.org/doc/scipy-1.15.0/tutorial/thread_safety.html" rel="noopener noreferrer"&gt;SciPy&lt;/a&gt;, for example), but many don’t.&lt;/p&gt;

&lt;p&gt;If documentation does say it, you can trust it.&lt;br&gt;
If it doesn’t, no conclusion can be drawn.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. You can measure it&lt;/strong&gt;&lt;br&gt;
A simple test:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;run the operation in many threads&lt;/li&gt;
&lt;li&gt;watch CPU usage in top or htop&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If you see a single core saturated → it's likely holding the GIL&lt;br&gt;
If you see multiple cores fully used → it's releasing the GIL&lt;br&gt;
If you see partial scaling → it's releasing the GIL only part of the time (very common)&lt;/p&gt;

&lt;h3&gt;With Great Power Comes Great Responsibility&lt;/h3&gt;

&lt;p&gt;Releasing the GIL — or running code that doesn’t use it — doesn’t magically make concurrency “safe.” It just gives you more freedom and more ways to shoot yourself in the foot.&lt;/p&gt;

&lt;p&gt;If two threads start modifying the same NumPy array, or the same shared buffer, or the same Python object through a C extension, nothing protects you anymore. You can get classic race conditions, torn writes, and inconsistent data just like in any other multithreaded language.&lt;/p&gt;

&lt;p&gt;The GIL wasn’t only a limitation — it was also a guardrail.&lt;br&gt;
Once it’s out of the way, you have to be just as careful as you would be in C++ or Java.&lt;/p&gt;
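&lt;p&gt;A minimal sketch of what that care looks like in practice: a plain &lt;code&gt;counter += 1&lt;/code&gt; is a read-modify-write sequence, not an atomic step, so shared mutation needs an explicit lock even in ordinary threaded Python:&lt;/p&gt;

```python
import threading

counter = 0
lock = threading.Lock()

def bump(times: int) -> None:
    global counter
    for _ in range(times):
        with lock:           # serialize the read-modify-write explicitly
            counter += 1

threads = [threading.Thread(target=bump, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Exact only because of the lock; without it, interleaved threads
# can overwrite each other's updates and silently lose increments.
assert counter == 400_000
```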

&lt;h2&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;The GIL has sharp edges and real limitations — but it also hides a lot of complexity and keeps Python usable without turning every piece of code into a concurrency puzzle.&lt;/p&gt;

&lt;p&gt;Once you understand the layers behind it, the whole picture becomes much clearer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;threads aren’t “useless,”&lt;/li&gt;
&lt;li&gt;I/O isn’t the whole story,&lt;/li&gt;
&lt;li&gt;native code can run in parallel,&lt;/li&gt;
&lt;li&gt;native code sometimes can’t,&lt;/li&gt;
&lt;li&gt;and “just use multiprocessing” solves one problem while introducing several others.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In practice, Python can make excellent use of multiple cores — it just depends on what kind of work you’re doing and which libraries you’re using.&lt;/p&gt;

&lt;p&gt;I’d love to hear your own experiences:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;times when &lt;code&gt;multiprocessing&lt;/code&gt; was total overkill,&lt;/li&gt;
&lt;li&gt;or when a single Python process maxed out all your cores,&lt;/li&gt;
&lt;li&gt;or situations where threads surprised you (for better or worse)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Leave a comment, share a story, or correct me if I got something wrong — the whole point of this post is to make the conversation around the GIL more honest and less magical.&lt;/p&gt;

&lt;h2&gt;tl;dr&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Your workload is:
├─ Pure Python computation → multiprocessing
├─ I/O bound (network, disk) → threading or asyncio
├─ Calling C libraries
│  ├─ Don't know if GIL-safe → measure with top/htop or check docs
│  ├─ Releases GIL → threading is fine
│  └─ Requests GIL → multiprocessing or asyncio
└─ Many small tasks → consider overhead cost
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>python</category>
      <category>gil</category>
      <category>performance</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
