Joseph Boone

A Secret I Will Never Reveal

My Secret:

I may or may not know the exact way to release the GIL on Python 3.10+.

I might also know how to recreate this minimally and easily, without C extensions.

It's not a magic trick, or some weird offbeat code that doesn't work.

It's functional, typed code. What it really is, I'll never reveal directly, because I already did by releasing TokenGate: the abstraction itself is simple, readable, and traceable in that codebase. (Once you see it, you know why.)

When I first unlocked the GIL, I tested it and traced the exact requirements. You can do it with or without serialization (serializing makes it easier, for obvious reasons once you know, but the physics doesn't change). Threading reaches roughly 45× overlap across 32 workers on my system under varied, normal task distributions. Take a look at these results:

Wave   Tokens    OK      Fail    Time      Tok/s     Lat(ms)   Conc    Overlap
1      4         4       0       0.003s    1386.2    0.721     1.00×   1.44×
2      8         8       0       0.003s    2391.2    0.418     1.72×   2.48×
3      16        16      0       0.006s    2744.8    0.364     1.98×   4.82×
4      32        32      0       0.011s    2812.7    0.356     2.03×   11.32×
5      64        64      0       0.022s    2880.0    0.347     2.08×   22.01×
6      128       128     0       0.044s    2907.6    0.344     2.10×   29.78×
7      256       256     0       0.090s    2846.8    0.351     2.05×   37.98×
8      512       512     0       0.182s    2811.5    0.356     2.03×   41.81×
9      1024      1024    0       0.364s    2813.9    0.355     2.03×   44.18× <-
10     2048      2048    0       0.775s    2644.3    0.378     1.91×   44.86× <-
11     4096      4096    0       1.454s    2816.3    0.355     2.03×   38.34×
12     8192      8192    0       2.905s    2819.9    0.355     2.03×   32.64×
13     16384     16384   0       5.925s    2765.0    0.362     1.99×   27.92×
14     32768     32768   0       12.102s   2707.7    0.369     1.95×   24.96× <-
15     65536     65536   0       23.494s   2789.5    0.358     2.01×   24.21× <- 

Waves 9 and 10 (1024 to 2048 tasks in flight at any given moment): 40+ times faster by the Overlap column.

Waves 14 and 15: massive overload, and still holding GIL-free status.

TOTAL  131,068   131,068  0      89.091s
Avg latency : 0.386 ms/token
Peak overlap: 44.86×
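
The summary lines can be cross-checked against the per-wave rows. Here is a small sketch with the values copied from the table; the variable names are illustrative and not taken from TokenGate. The token total is the sum of the doubling wave sizes, and the 0.386 ms figure matches the unweighted mean of the Lat(ms) column.

```python
# Per-wave values copied from the results table above.
# Wave sizes double from 4 up to 65536 (15 waves).
tokens = [4 * 2**i for i in range(15)]

# Lat(ms) column, waves 1 through 15.
lat_ms = [
    0.721, 0.418, 0.364, 0.356, 0.347,
    0.344, 0.351, 0.356, 0.355, 0.378,
    0.355, 0.355, 0.362, 0.369, 0.358,
]

total_tokens = sum(tokens)
mean_lat = sum(lat_ms) / len(lat_ms)  # unweighted mean across waves

print(total_tokens)        # 131068
print(round(mean_lat, 3))  # 0.386
```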

All tokens were submitted instantly, per batch.
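
For readers who want to produce a table of the same shape, here is a minimal sketch of a wave harness. It is not TokenGate's method, which this post leaves unrevealed. It assumes a plain `ThreadPoolExecutor` with 32 workers and a stand-in task that blocks in `time.sleep`, a call during which CPython already releases the GIL. The names `timed_task` and `run_wave` are hypothetical. `tok_per_s` and `lat_ms` follow the relation visible in the table (tokens/time and time/tokens). The `overlap` definition, summed per-task time divided by wall time, is an assumption, since the post does not define that column.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_task(sleep_s: float) -> tuple[float, float]:
    """Stand-in workload: record start and end around a blocking sleep."""
    start = time.perf_counter()
    time.sleep(sleep_s)  # blocking call; CPython releases the GIL here
    return start, time.perf_counter()


def run_wave(n_tasks: int, workers: int = 32, sleep_s: float = 0.005) -> dict[str, float]:
    """Submit n_tasks in one batch and summarise the wave."""
    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # All tasks are submitted up front, before any result is read.
        futures = [pool.submit(timed_task, sleep_s) for _ in range(n_tasks)]
        spans = [f.result() for f in futures]
    wall = time.perf_counter() - wall_start

    busy = sum(end - start for start, end in spans)  # summed per-task time
    return {
        "tokens": n_tasks,
        "time_s": wall,
        "tok_per_s": n_tasks / wall,
        "lat_ms": 1000 * wall / n_tasks,
        "overlap": busy / wall,  # assumed definition, not from the post
    }


if __name__ == "__main__":
    for n in (4, 8, 16, 32, 64):
        print(run_wave(n))
```

Because the stand-in task only sleeps, this shows overlap of blocking work. A pure-Python CPU-bound task in the same harness would not overlap this way on a standard CPython build.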

I'll just say this: there are domains, there is physics, and the CPU must obey both. So think about that, and read TokenGate.

(Bonus hint: there are exactly three components required to unlock the GIL.)
