<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ritesh</title>
    <description>The latest articles on DEV Community by Ritesh (@ritesh_fcb86fb4b3890c81e4).</description>
    <link>https://dev.to/ritesh_fcb86fb4b3890c81e4</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3907454%2Fdcc1a247-6567-4725-bd77-82d452ed4fab.jpeg</url>
      <title>DEV Community: Ritesh</title>
      <link>https://dev.to/ritesh_fcb86fb4b3890c81e4</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ritesh_fcb86fb4b3890c81e4"/>
    <language>en</language>
    <item>
      <title>Build a BPE tokenizer in 30 lines of Python and you will never read a prompt the same way again</title>
      <dc:creator>Ritesh</dc:creator>
      <pubDate>Mon, 04 May 2026 17:57:26 +0000</pubDate>
      <link>https://dev.to/ritesh_fcb86fb4b3890c81e4/build-a-bpe-tokenizer-in-30-lines-of-python-and-you-will-never-read-a-prompt-the-same-way-again-4o65</link>
      <guid>https://dev.to/ritesh_fcb86fb4b3890c81e4/build-a-bpe-tokenizer-in-30-lines-of-python-and-you-will-never-read-a-prompt-the-same-way-again-4o65</guid>
      <description>&lt;p&gt;Most engineers who use language models every day cannot, on a blank piece of paper, describe what their tokenizer is doing to the prompts they send. The library returns a list of integers; the integers are passed to the API call; the model produces a response; the engineer goes back to work. Most of the time, the abstraction holds, and there is no reason to look underneath it. Then one day, there is a reason. The model breaks on a name with an unusual suffix. The cost of a Korean prompt is twice that of an English one. A jailbreak works because of the way an emoji decomposes into raw UTF-8 bytes. None of these are mysterious once you have built a tokenizer; all of them are mysterious if you have not.&lt;/p&gt;

&lt;p&gt;This piece is the version of the chapter on Byte-Pair Encoding (BPE) that fits in a blog post. The full chapter is in Volume I of the book, but the algorithm is small enough that a thirty-line implementation is enough to change what you see when you look at a prompt. Twenty minutes, a calculator-sized corpus, and a Python REPL are all you need.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem BPE was invented to solve
&lt;/h2&gt;

&lt;p&gt;There are two obvious ways to turn text into integers, and both of them are bad. You can split on whitespace and assign each unique word an integer; the vocabulary explodes, every typo is a new token, and the model never sees &lt;code&gt;unhappy&lt;/code&gt; and &lt;code&gt;unhappily&lt;/code&gt; as related. Or you can split on characters; the vocabulary stays tiny, but every word becomes a long sequence of integers and the model has to learn the structure of &lt;code&gt;the&lt;/code&gt; from scratch every time it sees the three letters in a row.&lt;/p&gt;
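
&lt;p&gt;The two naive schemes can be sketched in a few lines. This is only an illustration: the function names &lt;code&gt;word_tokenize&lt;/code&gt; and &lt;code&gt;char_tokenize&lt;/code&gt; and the sample text (with a deliberate typo) are assumptions made for the comparison, not part of any library.&lt;/p&gt;

```python
# Two naive tokenizers, for comparison only (both names are hypothetical).
def word_tokenize(text):
    """Split on whitespace; every distinct spelling becomes its own token."""
    return text.split()

def char_tokenize(text):
    """Split into single characters; tiny vocabulary, long sequences."""
    return list(text)

text = "unhappy unhappily unhapy"   # the last word is a deliberate typo
word_tokens = word_tokenize(text)
char_tokens = char_tokenize(text)

word_vocab = set(word_tokens)   # three unrelated entries, typo included
char_vocab = set(char_tokens)   # a handful of letters plus the space

print(len(word_tokens), len(word_vocab))
print(len(char_tokens), len(char_vocab))
```

&lt;p&gt;On this text the word-level scheme needs one vocabulary entry per spelling, while the character-level scheme needs only nine symbols but spends twenty-four tokens on three words.&lt;/p&gt;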

&lt;p&gt;What you want is the middle: a vocabulary of frequent subword pieces, where common words come out as one token, common prefixes and suffixes come out as their own tokens, and rare or invented words decompose into a handful of pieces the model has seen before. BPE is one of the cheapest ways to build such a vocabulary, and it serves as the basis for the tokenizers in GPT-2, GPT-3, GPT-4, Llama, and most open-weight models you might fine-tune.&lt;/p&gt;

&lt;h2&gt;
  
  
  The algorithm in plain English
&lt;/h2&gt;

&lt;p&gt;Start with each word as a list of characters, plus a special end-of-word marker to indicate where one word ends and the next begins. Count every adjacent pair of symbols across the entire corpus. Find the most frequent pair. Merge that pair, everywhere in the corpus, into a single new symbol. Repeat until the vocabulary reaches the size you want.&lt;/p&gt;

&lt;p&gt;That is it. Four steps. The vocabulary that emerges has the property that the most common letter combinations in the training data become single tokens first, and the rare combinations are left as smaller pieces. &lt;code&gt;the&lt;/code&gt; becomes one token because &lt;code&gt;t h&lt;/code&gt;, and then &lt;code&gt;th e&lt;/code&gt;, are frequent pairs in your corpus. &lt;code&gt;understandable&lt;/code&gt; might become &lt;code&gt;under&lt;/code&gt;, &lt;code&gt;stand&lt;/code&gt;, and &lt;code&gt;able&lt;/code&gt; because each of those substrings is frequent across the corpus, even if the full word is not.&lt;/p&gt;

&lt;h2&gt;
  
  
  Three deterministic merges, on three small words
&lt;/h2&gt;

&lt;p&gt;Take a corpus of three words: &lt;code&gt;low lower lowest&lt;/code&gt;. After the initial split into characters with an end-of-word marker (we will use &lt;code&gt;_&lt;/code&gt;), the corpus looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;l o w _
l o w e r _
l o w e s t _
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Count the adjacent pairs across all three words. The pair &lt;code&gt;(l, o)&lt;/code&gt; appears three times, and so does &lt;code&gt;(o, w)&lt;/code&gt;. After that, &lt;code&gt;(w, e)&lt;/code&gt; appears twice; every other pair appears once. We pick &lt;code&gt;(l, o)&lt;/code&gt; as the first merge (a tie is broken by ordering; the choice does not matter for the example).&lt;/p&gt;
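
&lt;p&gt;The counts above can be checked with a few lines of Python. This is a minimal sketch using the same three words and the same &lt;code&gt;_&lt;/code&gt; marker; the variable names are chosen for illustration.&lt;/p&gt;

```python
from collections import Counter

# The three words from the example, split into characters plus the "_" marker.
words = [tuple(w) + ("_",) for w in "low lower lowest".split()]

pair_counts = Counter()
for word in words:
    # zip(word, word[1:]) walks every adjacent pair of symbols in the word.
    pair_counts.update(zip(word, word[1:]))

for pair, count in pair_counts.most_common(3):
    print(pair, count)
```

&lt;p&gt;Nine distinct pairs come out of the count, and only the three named above occur more than once.&lt;/p&gt;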

&lt;p&gt;After merge 1, the corpus is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;lo w _
lo w e r _
lo w e s t _
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now &lt;code&gt;(lo, w)&lt;/code&gt; is the most frequent pair, with three occurrences. Merge it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;low _
low e r _
low e s t _
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;(low, e)&lt;/code&gt; is the most frequent now, with two occurrences. Merge it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;low _
lowe r _
lowe s t _
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After three merges, the word &lt;code&gt;low&lt;/code&gt; is the single token &lt;code&gt;low&lt;/code&gt; followed by the end-of-word marker, the word &lt;code&gt;lowest&lt;/code&gt; is the four tokens &lt;code&gt;lowe s t _&lt;/code&gt;, and &lt;code&gt;lower&lt;/code&gt; is the three tokens &lt;code&gt;lowe r _&lt;/code&gt;. The shared prefix &lt;code&gt;lowe&lt;/code&gt; is what the algorithm has discovered. The algorithm did not know in advance that &lt;code&gt;lowe&lt;/code&gt; would be a useful token; it found it by counting.&lt;/p&gt;

&lt;h2&gt;
  
  
  The thirty-line implementation
&lt;/h2&gt;

&lt;p&gt;The whole algorithm fits in a small Python file. The following script trains BPE on a corpus and prints each merge as it happens. Save it, run it, change the corpus, watch the merges change.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;collections&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Counter&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_pairs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Return the list of adjacent pairs in a tokenized word.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;merge_pair&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pair&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;replacement&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Replace every occurrence of pair in the tokenized word.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;out&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;pair&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;replacement&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

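&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;train_bpe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;corpus&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_merges&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Learn up to num_merges merges; return the merge list and the tokenized corpus.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;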
    &lt;span class="n"&gt;words&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;_&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;corpus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;()]&lt;/span&gt;
    &lt;span class="n"&gt;merges&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_merges&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;counts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;words&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;pair&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;get_pairs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="n"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;pair&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;
        &lt;span class="n"&gt;best&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;most_common&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;replacement&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;best&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;best&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;words&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;merge_pair&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;best&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;replacement&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;words&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;merges&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;best&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;replacement&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Merge &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;step&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;best&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; -&amp;gt; &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;replacement&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;merges&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;words&lt;/span&gt;

&lt;span class="n"&gt;corpus&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;low lower lowest&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;merges&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tokenized&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;train_bpe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;corpus&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_merges&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Final tokenization:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tokenized&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run it on the corpus from the worked example and the first three lines of output will match the merges we computed by hand. Run it on a larger corpus, say a paragraph from a book, and the merges become a small autobiography of which letter combinations are common in your text.&lt;/p&gt;
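
&lt;p&gt;Training is only half of a tokenizer; the learned merges also have to be applied to text the trainer never saw. The sketch below is one simple way to do that, assuming the three merges from the worked example are hard-coded and replayed in training order. The &lt;code&gt;tokenize&lt;/code&gt; name and its signature are illustrative and may differ from the version in the companion repository.&lt;/p&gt;

```python
# Merges learned in the worked example, in the order they were learned.
merges = [("l", "o"), ("lo", "w"), ("low", "e")]

def tokenize(word, merges):
    """Apply learned merges, in training order, to one new word."""
    symbols = list(word) + ["_"]          # same end-of-word marker as training
    for left, right in merges:
        merged = []
        i = 0
        while i != len(symbols):
            # A two-element slice only matches when a full pair is present.
            if symbols[i:i + 2] == [left, right]:
                merged.append(left + right)
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols

print(tokenize("lowly", merges))
print(tokenize("slower", merges))
```

&lt;p&gt;Neither word was in the training corpus, yet both reuse the pieces learned from it: &lt;code&gt;lowly&lt;/code&gt; picks up &lt;code&gt;low&lt;/code&gt; and &lt;code&gt;slower&lt;/code&gt; picks up &lt;code&gt;lowe&lt;/code&gt;, with the unseen letters left as single characters.&lt;/p&gt;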

&lt;h2&gt;
  
  
  What this changes about your prompts tomorrow
&lt;/h2&gt;

&lt;p&gt;Once you have built the tokenizer, four things stop being mysterious. The first is why an English prompt and a Korean prompt of the same character count cost different amounts; English text dominates the GPT tokenizer's training distribution and decomposes into roughly one token per common word, while Korean has to fall back to longer subword splits. The second is why a typo costs more tokens than the correctly spelled word; the typo is rare and decomposes into pieces. The third is why models sometimes treat compound words as if they were two unrelated concepts; the merge that would have unified them never reached the cutoff. The fourth is why putting variable content at the end of a prompt makes prefix caching cheap; the tokenizer is deterministic, and an unchanged prefix produces an unchanged sequence of tokens, which is what the cache keys on.&lt;/p&gt;
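
&lt;p&gt;Part of the English versus Korean gap is visible before any merge is learned. Assuming a byte-level tokenizer that starts from UTF-8 bytes (the fallback real tokenizers use), the sketch below compares two five-character strings. The sample strings are illustrative, and the final token counts in a production tokenizer depend on its learned merges, which this does not model.&lt;/p&gt;

```python
english = "hello"
korean = "안녕하세요"   # also five characters

english_bytes = english.encode("utf-8")
korean_bytes = korean.encode("utf-8")

# Same character count, very different starting sequence length.
print(len(english), len(english_bytes))
print(len(korean), len(korean_bytes))
```

&lt;p&gt;Each Hangul syllable takes three bytes in UTF-8, so the Korean string starts three times longer than the English one and needs more merges to get back to one token per word.&lt;/p&gt;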

&lt;p&gt;None of these is mysterious once the four-step algorithm has gone through your fingers on a tiny corpus. All of them are mysterious if it has not.&lt;/p&gt;

&lt;p&gt;The book this excerpt is drawn from is two volumes long, and the BPE chapter is one of eighteen. &lt;/p&gt;

&lt;p&gt;The companion repository is at &lt;a href="https://github.com/ritesh-modi/inside-llm" rel="noopener noreferrer"&gt;github.com/ritesh-modi/inside-llm&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The script above is in the chapter-three folder, with a longer corpus and a few extensions (vocabulary serialization, a &lt;code&gt;tokenize&lt;/code&gt; function for new text, and the byte-level fallback that real tokenizers use). Clone it, run it on your own paragraph, and the next prompt you write will look different.&lt;/p&gt;

</description>
      <category>llm</category>
      <category>genai</category>
      <category>ai</category>
      <category>chatgpt</category>
    </item>
    <item>
      <title>Build a working asyncio event loop in 30 lines of plain Python</title>
      <dc:creator>Ritesh</dc:creator>
      <pubDate>Mon, 04 May 2026 17:51:25 +0000</pubDate>
      <link>https://dev.to/ritesh_fcb86fb4b3890c81e4/build-a-working-asyncio-event-loop-in-30-lines-of-plain-python-5gjb</link>
      <guid>https://dev.to/ritesh_fcb86fb4b3890c81e4/build-a-working-asyncio-event-loop-in-30-lines-of-plain-python-5gjb</guid>
      <description>&lt;p&gt;I will keep saying this until it stops being controversial: asyncio is small. The reason it feels large is that every tutorial introduces the keywords before the runtime they describe, leaving you to guess at what &lt;code&gt;await&lt;/code&gt; is actually doing. Strip the keywords away, and the runtime fits in 30 lines of plain Python with no asyncio import.&lt;/p&gt;

&lt;p&gt;This post walks through the toy event loop end to end. The code runs. Paste it, save, execute. By the end, you will have built a working asyncio-style runtime in your terminal and watched it interleave three jobs on a single thread.&lt;/p&gt;

&lt;h2&gt;
  
  
  The shape of the problem
&lt;/h2&gt;

&lt;p&gt;Picture a program with three jobs, each of which spends most of its time waiting. The naive version runs them one after the other and pays the sum of the waits. A program that overlaps the waits pays only the longest one.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;serial:     [job A 2s] -&amp;gt; [job B 1s] -&amp;gt; [job C 3s]   total 6s
concurrent: [A 2s] [B 1s] [C 3s]   all overlapping   total 3s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Six seconds versus three. The work is identical. The only thing that changes is whether the jobs queue up behind each other or share the time.&lt;/p&gt;
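
&lt;p&gt;The arithmetic behind the two totals is a sum versus a maximum. The sketch below assumes the waits are the only cost and overlap perfectly, which is the idealized case in the diagram.&lt;/p&gt;

```python
# Wait time per job, in seconds, taken from the diagram.
waits = {"A": 2, "B": 1, "C": 3}

serial_total = sum(waits.values())       # jobs queue up behind each other
concurrent_total = max(waits.values())   # waits overlap; the longest one wins

print(serial_total, concurrent_total)
```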

&lt;h2&gt;
  
  
  A job is a generator that yields
&lt;/h2&gt;

&lt;p&gt;A generator is a function that can pause itself and be resumed by its caller. The pause point is &lt;code&gt;yield&lt;/code&gt;. The caller advances the generator with &lt;code&gt;next(...)&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;example&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;step 1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;yield&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;step 2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;yield&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;step 3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;&amp;gt;&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; g &lt;span class="o"&gt;=&lt;/span&gt; example&lt;span class="o"&gt;()&lt;/span&gt;
&lt;span class="gp"&gt;&amp;gt;&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; next&lt;span class="o"&gt;(&lt;/span&gt;g&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="go"&gt;step 1
&lt;/span&gt;&lt;span class="gp"&gt;&amp;gt;&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; next&lt;span class="o"&gt;(&lt;/span&gt;g&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="go"&gt;step 2
&lt;/span&gt;&lt;span class="gp"&gt;&amp;gt;&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; next&lt;span class="o"&gt;(&lt;/span&gt;g&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="go"&gt;step 3
Traceback (most recent call last):
&lt;/span&gt;&lt;span class="c"&gt;  ...
&lt;/span&gt;&lt;span class="go"&gt;StopIteration
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is the entire mechanism. A &lt;code&gt;yield&lt;/code&gt; is a bookmark. The caller picks up a different generator, runs it for a while, and comes back to the bookmarked one when it feels like it. Hold on to that picture; it is what &lt;code&gt;await&lt;/code&gt; will do later, in fewer letters.&lt;/p&gt;
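
&lt;p&gt;To see the bookmark picture in action before any timing is involved, here is a minimal round-robin sketch. The &lt;code&gt;job&lt;/code&gt; generator and the shared &lt;code&gt;log&lt;/code&gt; list are illustrative names, and this is a simplification of the loop built later: it has no deadlines and simply gives each generator one step per turn.&lt;/p&gt;

```python
from collections import deque

def job(name, steps, log):
    """Record one entry per step, pausing after each one."""
    for step in range(1, steps + 1):
        log.append(f"{name}{step}")
        yield                      # bookmark: hand control back to the caller

log = []
queue = deque([job("A", 2, log), job("B", 3, log)])
while queue:
    current = queue.popleft()
    try:
        next(current)              # run the job up to its next bookmark
        queue.append(current)      # not finished: back of the line
    except StopIteration:
        pass                       # finished: drop it from the queue

print(log)
```

&lt;p&gt;The log comes out interleaved, A1, B1, A2, B2, B3, even though nothing here runs in parallel; the caller just switches between bookmarks.&lt;/p&gt;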

&lt;h2&gt;
  
  
  A timer that needs three ticks
&lt;/h2&gt;

&lt;p&gt;For the toy loop, a "wait" is a generator that yields a target wake-up time. The loop checks the time on each pass; once the wake-up time has passed, the generator is allowed to advance.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;deadline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;seconds&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;deadline&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="n"&gt;deadline&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A job that "waits two seconds" is a generator that yields the deadline &lt;code&gt;now + 2&lt;/code&gt; until that deadline passes, then returns. The loop watches deadlines; jobs advance when their deadline arrives.&lt;/p&gt;
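
&lt;p&gt;It can help to drive the timer by hand once before handing it to a loop. The sketch below repeats the &lt;code&gt;sleep&lt;/code&gt; generator (with the comparison written the other way round, which behaves identically) and steps it manually; the 0.05 and 0.06 second values are arbitrary choices for the demonstration.&lt;/p&gt;

```python
import time

def sleep(seconds):
    deadline = time.time() + seconds
    while deadline > time.time():
        yield deadline

timer = sleep(0.05)
first = next(timer)        # deadline not reached yet: the timer yields it
time.sleep(0.06)           # stand-in for the loop running other jobs

finished = False
try:
    next(timer)            # deadline has passed, so the generator returns
except StopIteration:
    finished = True

print(finished)
```

&lt;p&gt;The first &lt;code&gt;next&lt;/code&gt; hands back the deadline; the second, made after the deadline has passed, ends the generator, which is the signal the loop uses to retire a job.&lt;/p&gt;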

&lt;h2&gt;
  
  
  The loop in 30 lines
&lt;/h2&gt;

&lt;p&gt;Here it is. Save it as &lt;code&gt;toy_loop.py&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;collections&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;deque&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;jobs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Run the given jobs until all of them finish.

    Each job is a generator. A job yields a deadline (a time.time()
    value) to mean &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;wake me up at or after this time&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;. When a job
    returns (StopIteration), it is removed from the queue.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;ready&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;deque&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;job&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;jobs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;ready&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;wake_at&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ready&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;popleft&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;wake_at&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;ready&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;wake_at&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.001&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c1"&gt;# avoid a tight CPU spin
&lt;/span&gt;            &lt;span class="k"&gt;continue&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;new_wake_at&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;
            &lt;span class="n"&gt;ready&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;new_wake_at&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;StopIteration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;pass&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;deadline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;seconds&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;deadline&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="n"&gt;deadline&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Twenty-seven lines, counting the docstring and blank lines. No imports beyond &lt;code&gt;time&lt;/code&gt; and &lt;code&gt;deque&lt;/code&gt;. No &lt;code&gt;asyncio&lt;/code&gt;. No threads. No futures. The whole runtime is a queue and a while loop.&lt;/p&gt;

&lt;p&gt;The while loop pops the front entry. If the job's wake-up time is in the future, push it to the back, sleep one millisecond, continue. If the job is ready, advance it with &lt;code&gt;next(...)&lt;/code&gt;; the job runs until its next &lt;code&gt;yield&lt;/code&gt;, hands back the new wake-up time, and goes back on the queue. When &lt;code&gt;next(job)&lt;/code&gt; raises &lt;code&gt;StopIteration&lt;/code&gt;, the job is finished and does not return to the queue.&lt;/p&gt;

&lt;p&gt;That is the whole runtime.&lt;/p&gt;

&lt;h2&gt;
  
  
  Run it
&lt;/h2&gt;

&lt;p&gt;Add three jobs and a main block.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;delay&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; started, waiting &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;delay&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;delay&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; done&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
        &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;2.0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;B&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;C&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;3.0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;total: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;yield from sleep(...)&lt;/code&gt; delegates to another generator. It is the same idea that &lt;code&gt;await&lt;/code&gt; expresses in &lt;code&gt;asyncio&lt;/code&gt;. Each &lt;code&gt;yield&lt;/code&gt; from &lt;code&gt;sleep&lt;/code&gt; flows up through &lt;code&gt;fetch&lt;/code&gt; to the &lt;code&gt;next(job)&lt;/code&gt; call in the loop.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;python toy_loop.py
&lt;span class="go"&gt;  A started, waiting 2.0s
  B started, waiting 1.0s
  C started, waiting 3.0s
  B done
  A done
  C done
total: 3.00s
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three seconds. Not six. Three jobs whose waits sum to six seconds finished in the time of the longest one. Single thread. CPU idle for almost all of it. That is concurrency.&lt;/p&gt;
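
&lt;p&gt;To see the &lt;code&gt;yield from&lt;/code&gt; delegation on its own, here is a minimal sketch that is separate from the toy loop. The names &lt;code&gt;inner&lt;/code&gt;, &lt;code&gt;outer&lt;/code&gt;, and &lt;code&gt;collect&lt;/code&gt; are hypothetical and exist only for illustration. The point is that every value &lt;code&gt;inner&lt;/code&gt; yields reaches whoever calls &lt;code&gt;next(...)&lt;/code&gt; on &lt;code&gt;outer&lt;/code&gt;.&lt;/p&gt;

```python
def inner():
    # Stands in for sleep(): yields values the scheduler should see.
    yield "wake at t1"
    yield "wake at t2"
    return "inner result"


def outer():
    # yield from forwards each of inner's yields to our caller and
    # evaluates to inner's return value once inner finishes.
    result = yield from inner()
    yield f"outer saw: {result}"


def collect():
    # Drive outer by hand, the way the loop's next(job) call does.
    job = outer()
    seen = []
    while True:
        try:
            seen.append(next(job))
        except StopIteration:
            return seen


if __name__ == "__main__":
    for value in collect():
        print(value)
```

&lt;p&gt;The caller never touches &lt;code&gt;inner&lt;/code&gt; directly, yet it receives both of its yields before the one from &lt;code&gt;outer&lt;/code&gt;.&lt;/p&gt;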

&lt;h2&gt;
  
  
  This is what asyncio is
&lt;/h2&gt;

&lt;p&gt;Now look at what we built and what &lt;code&gt;asyncio&lt;/code&gt; adds on top.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;asyncio&lt;/code&gt; replaces the &lt;code&gt;time.sleep(0.001)&lt;/code&gt; polling with a blocking wait on the operating system's I/O notification mechanism (&lt;code&gt;epoll&lt;/code&gt; on Linux, &lt;code&gt;kqueue&lt;/code&gt; on macOS, and &lt;code&gt;IOCP&lt;/code&gt; through the proactor event loop on Windows). The loop tells the OS, "wake me up when any of these sockets has data, or when the next deadline arrives", so it sleeps until there is work to do instead of waking every millisecond to check. That is the main mechanical difference between this toy loop and the real one.&lt;/p&gt;
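
&lt;p&gt;Python's standard &lt;code&gt;selectors&lt;/code&gt; module exposes this kind of wait directly. The sketch below is an illustration, not how &lt;code&gt;asyncio&lt;/code&gt; is wired internally. The helper name &lt;code&gt;wait_demo&lt;/code&gt; is hypothetical, and a local socket pair stands in for a real network connection. It shows one call that blocks until either a deadline passes or a socket becomes readable.&lt;/p&gt;

```python
import selectors
import socket
import time


def wait_demo(timeout=0.05):
    # A connected pair of sockets stands in for a network connection.
    reader, writer = socket.socketpair()
    selector = selectors.DefaultSelector()
    selector.register(reader, selectors.EVENT_READ)
    try:
        # Nothing to read yet: the OS blocks us until the timeout (the
        # "next deadline") expires, with no polling loop in Python.
        started = time.monotonic()
        idle_events = selector.select(timeout)
        waited = time.monotonic() - started

        # Once data is available, select returns without waiting out
        # the (deliberately generous) timeout.
        writer.send(b"ping")
        ready_events = selector.select(1.0)
        key = ready_events[0][0]
        payload = key.fileobj.recv(4)
        return len(idle_events), waited, payload
    finally:
        selector.unregister(reader)
        selector.close()
        reader.close()
        writer.close()


if __name__ == "__main__":
    idle_count, waited, payload = wait_demo()
    print(f"idle wait returned {idle_count} events after {waited:.3f}s")
    print(f"after send: {payload!r}")
```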

&lt;p&gt;&lt;code&gt;async def&lt;/code&gt; is &lt;code&gt;def&lt;/code&gt; with one extra property: calling the function returns a coroutine object instead of running its body. The coroutine object has the same shape as a generator: it is advanced with &lt;code&gt;send(...)&lt;/code&gt;, and it can pause at &lt;code&gt;await&lt;/code&gt;.&lt;/p&gt;
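
&lt;p&gt;A quick way to check this is to call an &lt;code&gt;async def&lt;/code&gt; function and advance the result by hand, with no event loop involved. In this sketch the names &lt;code&gt;answer&lt;/code&gt; and &lt;code&gt;drive&lt;/code&gt; are hypothetical. &lt;code&gt;send(None)&lt;/code&gt; plays the role that &lt;code&gt;next(job)&lt;/code&gt; played in the toy loop.&lt;/p&gt;

```python
import inspect


async def answer():
    return 42


def drive(coro):
    # Advance the coroutine by hand. With no await inside, the first send
    # runs the body to completion and the return value rides on StopIteration.
    try:
        coro.send(None)
    except StopIteration as stop:
        return stop.value
    coro.close()
    raise RuntimeError("coroutine paused instead of finishing")


if __name__ == "__main__":
    coro = answer()  # the body has not run yet
    print(inspect.iscoroutine(coro))  # True
    print(drive(coro))  # 42
```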

&lt;p&gt;&lt;code&gt;await x&lt;/code&gt; is, in effect, &lt;code&gt;yield from x.__await__()&lt;/code&gt;. It is &lt;code&gt;yield from&lt;/code&gt; with a different name and a slightly tighter contract: &lt;code&gt;x&lt;/code&gt; must be an awaitable.&lt;/p&gt;
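
&lt;p&gt;The following sketch makes that visible. It assumes a hypothetical awaitable class, &lt;code&gt;WakeAt&lt;/code&gt;, whose &lt;code&gt;__await__&lt;/code&gt; is written as a plain generator, and a hypothetical helper, &lt;code&gt;step_through&lt;/code&gt;, that drives the coroutine by hand. The value yielded inside &lt;code&gt;__await__&lt;/code&gt; travels up through &lt;code&gt;await&lt;/code&gt; to the caller of &lt;code&gt;send&lt;/code&gt;, which is the position the toy loop occupies.&lt;/p&gt;

```python
class WakeAt:
    """Hypothetical awaitable that hands a deadline to the driver."""

    def __init__(self, deadline):
        self.deadline = deadline

    def __await__(self):
        # A bare yield here surfaces at the driver's send() call, just as
        # the yield inside sleep() surfaced at next(job) in the toy loop.
        yield self.deadline
        return "woken"


async def job():
    status = await WakeAt(123.0)
    return f"job finished after being {status}"


def step_through():
    coro = job()
    first = coro.send(None)  # runs until the yield inside __await__
    try:
        coro.send(None)  # resumes; the coroutine then returns
    except StopIteration as stop:
        return first, stop.value
    coro.close()
    raise RuntimeError("expected the coroutine to finish")


if __name__ == "__main__":
    deadline, result = step_through()
    print(deadline)
    print(result)
```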

&lt;p&gt;&lt;code&gt;asyncio.run(main())&lt;/code&gt; is the same as the &lt;code&gt;while ready:&lt;/code&gt; loop, with proper exception handling, signal handling, and the OS-level wait mentioned above.&lt;/p&gt;
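
&lt;p&gt;For comparison, here is roughly what the three-job example looks like on the real library. It is a sketch under a few assumptions: the delays are scaled down by a factor of ten so it runs quickly, the jobs record their finish order in a list instead of printing, and &lt;code&gt;run_demo&lt;/code&gt; is a hypothetical wrapper added for timing.&lt;/p&gt;

```python
import asyncio
import time


async def fetch(name, delay, finished):
    await asyncio.sleep(delay)  # plays the role of: yield from sleep(delay)
    finished.append(name)


async def main():
    finished = []
    # gather schedules all three coroutines on the same loop and waits
    # until every one of them has completed.
    await asyncio.gather(
        fetch("A", 0.2, finished),
        fetch("B", 0.1, finished),
        fetch("C", 0.3, finished),
    )
    return finished


def run_demo():
    start = time.monotonic()
    order = asyncio.run(main())
    return order, time.monotonic() - start


if __name__ == "__main__":
    order, elapsed = run_demo()
    print("finish order:", order)
    print(f"total: {elapsed:.2f}s")
```

&lt;p&gt;As with the toy loop, the jobs finish in order of their delays, and the total should land near the longest delay, not the sum.&lt;/p&gt;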

&lt;p&gt;The asyncio source code is more than 30 lines long, but the extra lines cover corner cases (cancellation propagation, exception groups, the lost-task trap), not new mechanisms. The mechanism is what you just built.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where to go next
&lt;/h2&gt;

&lt;p&gt;If you want to go deeper, my book &lt;em&gt;Asyncio from Ground Up&lt;/em&gt; opens with the same toy loop, then takes the lid off at the bytecode level (&lt;code&gt;__await__&lt;/code&gt;, &lt;code&gt;coro.send&lt;/code&gt;, what a coroutine frame holds), then introduces the real &lt;code&gt;asyncio&lt;/code&gt; API as a polished version of what you already have.&lt;/p&gt;

&lt;p&gt;The book is on Leanpub: &lt;a href="https://leanpub.com/author/book/asyncio/home" rel="noopener noreferrer"&gt;https://leanpub.com/author/book/asyncio/home&lt;/a&gt;. More about my work at &lt;a href="https://www.riteshmodi.com" rel="noopener noreferrer"&gt;www.riteshmodi.com&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The technology is small. The runtime fits on a single screen of code. The barrier was the order in which it was taught to you.&lt;/p&gt;

</description>
      <category>asyncio</category>
      <category>python</category>
      <category>softwareengineering</category>
      <category>genai</category>
    </item>
  </channel>
</rss>
