<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Fedor Nikolaev</title>
    <description>The latest articles on DEV Community by Fedor Nikolaev (@frodo).</description>
    <link>https://dev.to/frodo</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F992292%2F39040d6b-f335-4180-8e17-402350c5b0b0.jpg</url>
      <title>DEV Community: Fedor Nikolaev</title>
      <link>https://dev.to/frodo</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/frodo"/>
    <language>en</language>
    <item>
      <title>What Is MLIR and Why Does It Exist?</title>
      <dc:creator>Fedor Nikolaev</dc:creator>
      <pubDate>Wed, 01 Jul 2026 09:00:00 +0000</pubDate>
      <link>https://dev.to/frodo/what-is-mlir-and-why-does-it-exist-4d78</link>
      <guid>https://dev.to/frodo/what-is-mlir-and-why-does-it-exist-4d78</guid>
      <description>&lt;p&gt;If you've never written a compiler, the word "MLIR" probably looks like alphabet soup. This article is for you. By the end you'll understand, in plain language, &lt;em&gt;what&lt;/em&gt; problem MLIR solves and &lt;em&gt;why&lt;/em&gt; it had to exist at all.&lt;/p&gt;

&lt;p&gt;Let's start with the origin story — because where something comes from tells you almost everything about what it's for.&lt;/p&gt;




&lt;h2&gt;
  
  
  The origin story: from TensorFlow to a universal framework
&lt;/h2&gt;

&lt;p&gt;The story of MLIR starts in 2018 at Google. Chris Lattner, one of the most influential figures in compiler engineering, set out to solve a problem that had been bothering the industry for years — there was no common way to represent and transform code across different hardware targets and programming models. MLIR was his answer, and it went public in 2019 under the LLVM umbrella.&lt;/p&gt;

&lt;p&gt;Imagine you work on TensorFlow, Google's machine learning library. Your job is to take a model someone wrote in Python and make it run &lt;em&gt;fast&lt;/em&gt; — on a laptop CPU, on a phone, on a GPU, and on Google's custom TPU chips. To do that, the model has to be translated, step by step, into instructions each piece of hardware understands. That translation-and-optimization process is, fundamentally, a &lt;strong&gt;compiler&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The trouble was that there wasn't &lt;em&gt;one&lt;/em&gt; compiler. There were many. One team built a tool to optimize graphs. Another built a separate tool to target TPUs. Another for mobile. Another for a specific hardware accelerator. Each tool had its own way of representing the program internally, its own bugs, its own optimization tricks that couldn't be shared with the others. The ecosystem was &lt;strong&gt;siloed&lt;/strong&gt; — a pile of separate, half-overlapping compilers all reinventing the same wheels.&lt;/p&gt;

&lt;p&gt;And this wasn't unique to Google. Across the industry, the same pattern kept repeating: a new chip, a new language, or a new ML framework would appear, and someone would sit down to build &lt;em&gt;yet another&lt;/em&gt; compiler from scratch to support it. Everybody was paying the same enormous bill, over and over.&lt;/p&gt;

&lt;p&gt;Chris Lattner moved to Google in 2017 to lead the TensorFlow infrastructure team, walked straight into that fragmentation mess, and built MLIR to fix it.&lt;/p&gt;

&lt;p&gt;MLIR stands for &lt;strong&gt;Multi-Level Intermediate Representation&lt;/strong&gt;. Hold onto that name — every word in it is doing real work, and we'll unpack it as we go. The official paper describes the goals directly: reduce software fragmentation, improve compilation for the wild variety of modern hardware, dramatically lower the cost of building domain-specific compilers, and help existing compilers connect to one another.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;A small but telling detail:&lt;/strong&gt; MLIR doesn't live in its own separate project. It was added &lt;em&gt;inside&lt;/em&gt; the LLVM monorepo (&lt;code&gt;llvm-project&lt;/code&gt;) in a folder literally called &lt;code&gt;mlir/&lt;/code&gt;. Why? Because LLVM already had nearly two decades of battle-tested, reusable building blocks (data structures, error handling, a testing framework), and Lattner, who had co-created LLVM, knew that codebase inside out. Starting from zero would have meant rebuilding all of that. Sitting inside the monorepo, MLIR could borrow it on day one.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Before we get to the machine-learning payoff, we need a shared mental model of what a compiler actually &lt;em&gt;does&lt;/em&gt;. Let's build that with the simplest possible program.&lt;/p&gt;




&lt;h2&gt;
  
  
  A quick tour: how a compiler works under the hood
&lt;/h2&gt;

&lt;p&gt;When you compile a program, your code goes on a journey through several stages:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Source code
   → Frontend (parsing)
      → AST (a tree of your program)
         → IR (intermediate representation)
            → Optimization passes (run one after another)
               → Lowering (toward the machine)
                  → Backend (per-CPU details)
                     → Code generation (actual machine code)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Don't worry about memorizing it. The three ideas that matter are:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;The frontend&lt;/strong&gt; reads your text and understands its structure.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The IR&lt;/strong&gt; is a clean, internal representation the compiler does its real thinking in.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The backend&lt;/strong&gt; turns that into instructions for a specific chip (x86, ARM, etc.).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Let's trace a single expression — &lt;code&gt;x = 1 + 2&lt;/code&gt; — through all three.&lt;/p&gt;




&lt;h3&gt;
  
  
  1. The frontend: reading your text
&lt;/h3&gt;

&lt;p&gt;Python makes this stage easy to watch. When you run a &lt;code&gt;.py&lt;/code&gt; file, the very first thing CPython does is break the raw text into &lt;strong&gt;tokens&lt;/strong&gt;, the smallest meaningful chunks of the language.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tokenize&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;io&lt;/span&gt;

&lt;span class="n"&gt;source&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;x = 1 + 2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tokenize&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate_tokens&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;io&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;StringIO&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;readline&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;tok&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tok&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



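&lt;p&gt;Output (abbreviated: positions are elided as &lt;code&gt;...&lt;/code&gt;, the trailing &lt;code&gt;NEWLINE&lt;/code&gt; and &lt;code&gt;ENDMARKER&lt;/code&gt; tokens are left out, and the numeric type codes can differ between Python versions):&lt;br&gt;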
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nc"&gt;TokenInfo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NAME&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;   &lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;x&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="p"&gt;...)&lt;/span&gt;
&lt;span class="nc"&gt;TokenInfo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;54&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;OP&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;     &lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="p"&gt;...)&lt;/span&gt;
&lt;span class="nc"&gt;TokenInfo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NUMBER&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;1&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="p"&gt;...)&lt;/span&gt;
&lt;span class="nc"&gt;TokenInfo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;54&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;OP&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;     &lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;+&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="p"&gt;...)&lt;/span&gt;
&lt;span class="nc"&gt;TokenInfo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NUMBER&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;2&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="p"&gt;...)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So &lt;code&gt;x = 1 + 2&lt;/code&gt; stops being an opaque string and becomes a flat list of typed pieces. The tokenizer doesn't care about &lt;em&gt;meaning&lt;/em&gt; yet — it just answers: &lt;strong&gt;"what kind of thing is this character sequence?"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Next, the &lt;strong&gt;parser&lt;/strong&gt; takes that flat list of tokens and builds an &lt;strong&gt;AST&lt;/strong&gt; (Abstract Syntax Tree) — a nested structure that captures the &lt;em&gt;grammar&lt;/em&gt; of your program.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ast&lt;/span&gt;

&lt;span class="n"&gt;tree&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ast&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;x = 1 + 2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ast&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dump&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nc"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="nc"&gt;Assign&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;targets&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;Name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;x&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
    &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;BinOp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="n"&gt;left&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;Constant&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="n"&gt;op&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;Add&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
      &lt;span class="n"&gt;right&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;Constant&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)))])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The flat sequence &lt;code&gt;1 + 2&lt;/code&gt; became a &lt;code&gt;BinOp&lt;/code&gt; node with an &lt;code&gt;Add&lt;/code&gt; operator and two children. The structure of the expression is now &lt;em&gt;explicit&lt;/em&gt; in the shape of the tree — not buried in the order of characters. This tree is what gets handed off to the next stage. The compiler never looks at your source text again.&lt;/p&gt;
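&lt;p&gt;To see why the tree is enough on its own, here is a minimal sketch that computes the right-hand side by walking the AST, with no access to the source string. The &lt;code&gt;evaluate&lt;/code&gt; helper is a made-up name for this illustration and handles only constants and addition; it is not how CPython executes code.&lt;/p&gt;

```python
import ast


def evaluate(node):
    """Recursively evaluate a tiny subset of expression nodes (hypothetical helper)."""
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        # The tree shape tells us what to add; no text parsing is needed here.
        return evaluate(node.left) + evaluate(node.right)
    raise NotImplementedError(type(node).__name__)


tree = ast.parse("x = 1 + 2")
assign = tree.body[0]              # the Assign node
target = assign.targets[0].id      # 'x'
result = evaluate(assign.value)    # walks BinOp(Constant(1), Add, Constant(2))
print(target, "=", result)         # prints: x = 3
```

&lt;p&gt;Every later stage of a compiler works in this spirit: it pattern-matches on node types and recurses into children, rather than re-reading characters.&lt;/p&gt;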




&lt;h3&gt;
  
  
  2. The IR: where the real thinking happens
&lt;/h3&gt;

&lt;p&gt;Next, &lt;code&gt;compile()&lt;/code&gt; takes the AST and produces &lt;strong&gt;bytecode&lt;/strong&gt; — CPython's IR. The optimizer runs between the two, applying any transformations it can find. Here it applied &lt;strong&gt;constant folding&lt;/strong&gt;: since both operands are literals, &lt;code&gt;1 + 2&lt;/code&gt; can be solved at compile time. The runtime never sees the addition at all.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ast&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dis&lt;/span&gt;

&lt;span class="n"&gt;source&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;x = 1 + 2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;tree&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ast&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;       &lt;span class="c1"&gt;# Stage 1: source text to AST
&lt;/span&gt;&lt;span class="n"&gt;code&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;string&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;exec&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Stage 2: AST to bytecode
&lt;/span&gt;&lt;span class="n"&gt;dis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;code&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



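&lt;p&gt;Output (from CPython 3.12; opcode names and offsets vary between versions, and the arrow is an annotation, not part of the real output):&lt;br&gt;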
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  1           0 RESUME          0
              2 LOAD_CONST      0 (3)   ← already computed
              4 STORE_NAME      0 (x)
              6 RETURN_CONST    1 (None)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;1&lt;/code&gt; and &lt;code&gt;2&lt;/code&gt; are gone. Only &lt;code&gt;3&lt;/code&gt; remains.&lt;/p&gt;
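&lt;p&gt;Folding only works when every operand is known at compile time, and you can check that yourself. The sketch below is a rough test, assuming a recent CPython: &lt;code&gt;has_binary_op&lt;/code&gt; is a hypothetical helper name, and it relies on the convention that runtime arithmetic opcodes start with &lt;code&gt;BINARY_&lt;/code&gt; (&lt;code&gt;BINARY_ADD&lt;/code&gt; in older versions, &lt;code&gt;BINARY_OP&lt;/code&gt; in newer ones).&lt;/p&gt;

```python
import dis


def has_binary_op(source):
    """Return True if the compiled bytecode still performs a binary operation at runtime."""
    code = compile(source, "example.py", "exec")
    return any(ins.opname.startswith("BINARY_") for ins in dis.get_instructions(code))


literal_case = has_binary_op("x = 1 + 2")   # both operands are literals, so it folds
variable_case = has_binary_op("x = a + 2")  # 'a' is only known at runtime
print(literal_case, variable_case)          # prints: False True
```

&lt;p&gt;The second program keeps its addition because the compiler cannot know what &lt;code&gt;a&lt;/code&gt; will hold when the code finally runs.&lt;/p&gt;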




&lt;h3&gt;
  
  
  3. The backend: a glimpse
&lt;/h3&gt;

&lt;p&gt;The backend is the most complex part of any compiler and deserves its own article. For now, just one thing worth seeing: if the same computation lived in a natively compiled function that simply returns &lt;code&gt;1 + 2&lt;/code&gt;, an optimizing compiler would typically boil it down to &lt;strong&gt;two x86 instructions&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mov eax, 3   ; load the result (already computed at compile time)
ret          ; return it
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. The CPU never sees &lt;code&gt;1&lt;/code&gt; or &lt;code&gt;2&lt;/code&gt; — only &lt;code&gt;3&lt;/code&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;CPython itself doesn't go this far. It stops at bytecode and interprets it via a virtual machine in &lt;code&gt;ceval.c&lt;/code&gt;. JIT compilers like PyPy or Numba go all the way to machine code like the snippet above.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Going deeper: dead code elimination in C++
&lt;/h2&gt;

&lt;p&gt;The Python example showed the pipeline from the outside. Let's now watch the optimizer do something slightly more interesting — remove code that will never matter.&lt;/p&gt;

&lt;p&gt;Here's a small C++ program with a deliberate mistake:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;iostream&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;string&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;dead&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"I am never used"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// created, then never read&lt;/span&gt;
    &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;cout&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;"Hello world&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That &lt;code&gt;dead&lt;/code&gt; variable is &lt;strong&gt;dead code&lt;/strong&gt;: we build it, then never read it. A human reviewer would say "just delete that line." We're going to watch the compiler figure that out on its own.&lt;/p&gt;

&lt;p&gt;The AST captures the &lt;em&gt;structure&lt;/em&gt; of your code with all the punctuation and formatting stripped away. For brevity, the &lt;code&gt;#include&lt;/code&gt; machinery is omitted — it expands into a lot of generated declarations. The meaningful structure of &lt;code&gt;main&lt;/code&gt; looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FunctionDecl: main -&amp;gt; int
└── CompoundStmt
    ├── DeclStmt
    │   └── VarDecl: dead : std::string = "I am never used"
    ├── CallExpr: operator&amp;lt;&amp;lt;
    │   └── (std::cout &amp;lt;&amp;lt; "Hello world\n")
    └── ReturnStmt
        └── IntegerLiteral: 0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The tree is faithful to what you wrote — warts and all. The &lt;code&gt;dead&lt;/code&gt; variable is still there. Cleanup comes later.&lt;/p&gt;

&lt;h3&gt;
  
  
  The IR: before optimization
&lt;/h3&gt;

&lt;p&gt;The compiler then converts the AST into &lt;strong&gt;Intermediate Representation (IR)&lt;/strong&gt;. Real IR for a &lt;code&gt;std::string&lt;/code&gt; program is genuinely noisy, so let's switch to a simpler version of the same idea:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;compute&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;unused&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;99&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;   &lt;span class="c1"&gt;// dead variable&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With optimizations &lt;strong&gt;off&lt;/strong&gt;, the LLVM IR looks like this (simplified):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight llvm"&gt;&lt;code&gt;&lt;span class="k"&gt;define&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt; &lt;span class="vg"&gt;@compute&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="nl"&gt;entry:&lt;/span&gt;
  &lt;span class="nv"&gt;%unused&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;alloca&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;
  &lt;span class="nv"&gt;%a&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;alloca&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;
  &lt;span class="nv"&gt;%b&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;alloca&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;
  &lt;span class="k"&gt;store&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt; &lt;span class="m"&gt;99&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;ptr&lt;/span&gt; &lt;span class="nv"&gt;%unused&lt;/span&gt;    &lt;span class="c1"&gt;; unused = 99&lt;/span&gt;
  &lt;span class="k"&gt;store&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="kt"&gt;ptr&lt;/span&gt; &lt;span class="nv"&gt;%a&lt;/span&gt;         &lt;span class="c1"&gt;; a = 2&lt;/span&gt;
  &lt;span class="k"&gt;store&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="kt"&gt;ptr&lt;/span&gt; &lt;span class="nv"&gt;%b&lt;/span&gt;         &lt;span class="c1"&gt;; b = 3&lt;/span&gt;
  &lt;span class="nv"&gt;%0&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;load&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;ptr&lt;/span&gt; &lt;span class="nv"&gt;%a&lt;/span&gt;
  &lt;span class="nv"&gt;%1&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;load&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;ptr&lt;/span&gt; &lt;span class="nv"&gt;%b&lt;/span&gt;
  &lt;span class="nv"&gt;%add&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;add&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt; &lt;span class="nv"&gt;%0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;%1&lt;/span&gt;        &lt;span class="c1"&gt;; a + b&lt;/span&gt;
  &lt;span class="k"&gt;ret&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt; &lt;span class="nv"&gt;%add&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Verbose, but readable: reserve some slots, store numbers, add two of them, return the result. Every line of your source has a faithful echo — including the pointless &lt;code&gt;unused = 99&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  The optimizer fires
&lt;/h3&gt;

&lt;p&gt;Now we turn optimizations &lt;strong&gt;on&lt;/strong&gt;. The compiler runs a pipeline of &lt;strong&gt;optimization passes&lt;/strong&gt;: small, focused transformations applied one after another, each cleaning up the IR for the next (some passes are scheduled more than once). Two matter here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Constant folding&lt;/strong&gt; — &lt;code&gt;2 + 3&lt;/code&gt; is always &lt;code&gt;5&lt;/code&gt;. No reason to compute it at runtime.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dead code elimination&lt;/strong&gt; — &lt;code&gt;unused&lt;/code&gt; is written but never read. No one depends on it, so it's deleted.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The result:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight llvm"&gt;&lt;code&gt;&lt;span class="k"&gt;define&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt; &lt;span class="vg"&gt;@compute&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="nl"&gt;entry:&lt;/span&gt;
  &lt;span class="k"&gt;ret&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The whole function became "return 5." The dead variable vanished and the arithmetic was solved at compile time. &lt;em&gt;That&lt;/em&gt; is what the compiler's middle stage is for — and it's exactly the kind of work MLIR is built to make easy across many different kinds of programs.&lt;/p&gt;
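&lt;p&gt;Both passes are simple enough to sketch in a few lines. The toy below is only an illustration: the tuple-based instruction format and the function names &lt;code&gt;fold_constants&lt;/code&gt; and &lt;code&gt;eliminate_dead_code&lt;/code&gt; are invented for this article, every variable is assumed to be assigned exactly once, and LLVM's real passes are far more general.&lt;/p&gt;

```python
def fold_constants(program):
    """Replace an 'add' of two known constants with a single 'const' instruction."""
    known = {}      # variable name mapped to its compile-time value
    folded = []
    for instr in program:
        op = instr[0]
        if op == "const":
            _, dest, value = instr
            known[dest] = value
            folded.append(instr)
        elif op == "add" and instr[2] in known and instr[3] in known:
            _, dest, lhs, rhs = instr
            known[dest] = known[lhs] + known[rhs]
            folded.append(("const", dest, known[dest]))
        else:
            folded.append(instr)
    return folded


def eliminate_dead_code(program):
    """Walk backwards from 'ret' and keep only instructions whose result is needed."""
    live = set()    # variables that some later instruction still reads
    kept = []
    for instr in reversed(program):
        if instr[0] == "ret":
            live.add(instr[1])
            kept.append(instr)
        elif instr[1] in live:
            live.discard(instr[1])
            # Operands that are variable names become live in turn.
            live.update(arg for arg in instr[2:] if isinstance(arg, str))
            kept.append(instr)
    kept.reverse()
    return kept


# The same shape as compute(): one dead variable, two constants, one addition.
program = [
    ("const", "unused", 99),
    ("const", "a", 2),
    ("const", "b", 3),
    ("add", "sum", "a", "b"),
    ("ret", "sum"),
]
optimized = eliminate_dead_code(fold_constants(program))
print(optimized)  # prints: [('const', 'sum', 5), ('ret', 'sum')]
```

&lt;p&gt;Notice that folding makes more code dead: once &lt;code&gt;sum&lt;/code&gt; is a constant, nothing reads &lt;code&gt;a&lt;/code&gt; or &lt;code&gt;b&lt;/code&gt; anymore, so the second pass removes them along with &lt;code&gt;unused&lt;/code&gt;.&lt;/p&gt;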

&lt;h3&gt;
  
  
  A tool worth bookmarking: Compiler Explorer
&lt;/h3&gt;

&lt;p&gt;Go to &lt;strong&gt;&lt;a href="https://godbolt.org/" rel="noopener noreferrer"&gt;godbolt.org&lt;/a&gt;&lt;/strong&gt;. Paste in C++ (or dozens of other languages), pick a compiler, and watch the output update in real time as you toggle between &lt;code&gt;-O0&lt;/code&gt; (no optimization) and &lt;code&gt;-O2&lt;/code&gt; (optimize hard). Watching dead code evaporate is the fastest way to build intuition for everything above. It's the single best companion to this article.&lt;/p&gt;




&lt;h2&gt;
  
  
  Back to machine learning: why LLVM alone wasn't enough
&lt;/h2&gt;

&lt;p&gt;So if LLVM is such a great compiler infrastructure, why couldn't TensorFlow just &lt;em&gt;use&lt;/em&gt; it directly?&lt;/p&gt;

&lt;p&gt;Here's the catch. LLVM's IR was designed to describe programs at the level of &lt;strong&gt;CPU instructions&lt;/strong&gt; — load this number, add these two registers, jump to that address. That's the right level for compiling C or Rust. But it's far too &lt;em&gt;low&lt;/em&gt; for machine learning.&lt;/p&gt;

&lt;p&gt;A neural network doesn't think in "add two registers." It thinks in operations like &lt;strong&gt;"do a 2D convolution"&lt;/strong&gt; or &lt;strong&gt;"apply softmax"&lt;/strong&gt; or &lt;strong&gt;"multiply these two matrices."&lt;/strong&gt; If you flatten all of that down to individual CPU instructions too early, you throw away the high-level meaning — and with it, the chance to do the &lt;em&gt;big&lt;/em&gt; optimizations that only make sense when you can still see "oh, these two matrix multiplications could be fused together."&lt;/p&gt;

&lt;p&gt;This is the core insight behind the &lt;strong&gt;"Multi-Level"&lt;/strong&gt; in MLIR. Instead of one fixed IR, MLIR lets you have &lt;strong&gt;many IRs at different levels of abstraction&lt;/strong&gt;, and lower your program gradually:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;High level:   "matmul", "convolution", "softmax"   ← ML-shaped operations
    ↓
Mid level:    loops, array indexing, linear algebra
    ↓
Low level:    LLVM IR  →  actual CPU / GPU / TPU instructions
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each level is called a &lt;strong&gt;dialect&lt;/strong&gt; in MLIR — a self-contained vocabulary of operations suited to one kind of reasoning. You optimize at the level where it's natural, &lt;em&gt;then&lt;/em&gt; lower to the next. The philosophy in one sentence: a big compiler should be broken into many small compilers between intermediate languages, each designed to make one kind of optimization easy to express.&lt;/p&gt;
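&lt;p&gt;A toy version of one lowering step shows what is gained and lost along the way. This is plain Python, not MLIR's API: the &lt;code&gt;("matmul", ...)&lt;/code&gt; and &lt;code&gt;("mul_acc", ...)&lt;/code&gt; tuples and both function names are made up for this sketch.&lt;/p&gt;

```python
def lower_matmul(rows, inner, cols):
    """Lower one high-level 'matmul' op into scalar multiply-accumulate steps."""
    steps = []
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                # Meaning of one step: out[i][j] += lhs[i][k] * rhs[k][j]
                steps.append(("mul_acc", (i, j), (i, k), (k, j)))
    return steps


def run_steps(steps, lhs, rhs, rows, cols):
    """Execute the low-level steps one by one, like a very naive backend."""
    out = [[0] * cols for _ in range(rows)]
    for _, (oi, oj), (li, lk), (rk, rj) in steps:
        out[oi][oj] += lhs[li][lk] * rhs[rk][rj]
    return out


high_level = [("matmul", "A", "B")]                  # one op, intent still visible
low_level = lower_matmul(rows=2, inner=2, cols=2)    # the same work as scalar steps

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
product = run_steps(low_level, A, B, rows=2, cols=2)
print(len(high_level), len(low_level), product)      # prints: 1 8 [[19, 22], [43, 50]]
```

&lt;p&gt;Both forms compute the same thing, but only the high-level one still says "this is a matrix multiplication". Any optimization that needs that knowledge has to run before this step.&lt;/p&gt;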

&lt;p&gt;LLVM couldn't be stretched to do this: it was designed for CPUs, sat at too low a level of abstraction, and carried years of incidental baggage. But it had all those reusable pieces worth keeping. MLIR is what you get when you keep the good parts and add the missing "multi-level" idea on top.&lt;/p&gt;




&lt;h2&gt;
  
  
  What this looks like in practice
&lt;/h2&gt;

&lt;p&gt;Let's make it concrete. Suppose we're training a network to recognize handwritten &lt;strong&gt;letters of the alphabet&lt;/strong&gt; (26 classes, A–Z). In Keras the model is just a few lines:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tensorflow&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;tf&lt;/span&gt;

&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;keras&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Sequential&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="n"&gt;tf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;keras&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;layers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Flatten&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_shape&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;28&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;28&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
    &lt;span class="n"&gt;tf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;keras&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;layers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Dense&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;activation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;relu&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;tf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;keras&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;layers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Dense&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;26&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;activation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;softmax&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Innocent-looking. But under the hood, running this model is a chain of math operations on large grids of numbers. To make it fast on real hardware, a compiler has to take it through exactly the kind of multi-level lowering we just described.&lt;/p&gt;

&lt;h3&gt;
  
  
  What's a tensor?
&lt;/h3&gt;

&lt;p&gt;Quick detour, because the word is everywhere (it's literally in "TensorFlow"). A &lt;strong&gt;tensor&lt;/strong&gt; is just a container of numbers with a shape:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A single number (&lt;code&gt;7&lt;/code&gt;) → &lt;strong&gt;scalar&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;A list of numbers (&lt;code&gt;[1, 2, 3]&lt;/code&gt;) → &lt;strong&gt;vector&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;A grid of numbers (rows × columns) → &lt;strong&gt;matrix&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;tensor&lt;/strong&gt; generalizes all of these to &lt;em&gt;any&lt;/em&gt; number of dimensions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For our purposes: &lt;strong&gt;a tensor is a grid of numbers with any number of dimensions, and in a neural network both the input data and the weights the model learned during training are stored as tensors.&lt;/strong&gt; When the model recognizes a letter, your input image (a tensor) gets multiplied by weight tensors, over and over, until it produces 26 scores, one per letter.&lt;/p&gt;
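
&lt;p&gt;A minimal sketch of the scalar, vector, matrix, tensor ladder, using plain nested Python lists instead of a real tensor library. The helper &lt;code&gt;shape&lt;/code&gt; is a hypothetical name introduced here, and it assumes the nesting is rectangular (every row has the same length).&lt;/p&gt;

```python
def shape(value):
    """Return the shape of a rectangular nested list of numbers as a tuple."""
    dims = []
    while isinstance(value, list):
        dims.append(len(value))
        value = value[0] if value else None
    return tuple(dims)


scalar = 7                                        # zero dimensions
vector = [1, 2, 3]                                # one dimension
matrix = [[1, 2, 3], [4, 5, 6]]                   # two dimensions
image_batch = [[[0.0] * 28 for _ in range(28)]]   # three: one 28x28 image

for name, value in [("scalar", scalar), ("vector", vector),
                    ("matrix", matrix), ("image_batch", image_batch)]:
    dims = shape(value)
    print(name, "shape", dims, "rank", len(dims))
```

&lt;p&gt;The number of entries in the shape is the number of dimensions, which is all the word "tensor" adds over "matrix".&lt;/p&gt;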

&lt;h3&gt;
  
  
  The model expressed in MLIR
&lt;/h3&gt;

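&lt;p&gt;When that Keras model is fed into an MLIR-based compiler, the high-level operations get represented in a dialect with explicit tensor types. Below is a simplified sketch of the first &lt;code&gt;Dense&lt;/code&gt; layer, written in MLIR's generic operation syntax: a matrix multiply followed by a bias add. (The real &lt;code&gt;tosa.matmul&lt;/code&gt; expects batched, rank-3 tensors; the batch dimension is dropped here to keep the shapes readable, and the ReLU activation is left out.)&lt;br&gt;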
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Input: one flattened image (784 = 28×28 numbers)
func.func @dense(%input:   tensor&amp;lt;1x784xf32&amp;gt;,
                 %weights: tensor&amp;lt;784x128xf32&amp;gt;,
                 %bias:    tensor&amp;lt;1x128xf32&amp;gt;) -&amp;gt; tensor&amp;lt;1x128xf32&amp;gt; {

  %0 = "tosa.matmul"(%input, %weights)
        : (tensor&amp;lt;1x784xf32&amp;gt;, tensor&amp;lt;784x128xf32&amp;gt;) -&amp;gt; tensor&amp;lt;1x128xf32&amp;gt;

  %1 = "tosa.add"(%0, %bias)
        : (tensor&amp;lt;1x128xf32&amp;gt;, tensor&amp;lt;1x128xf32&amp;gt;) -&amp;gt; tensor&amp;lt;1x128xf32&amp;gt;

  return %1 : tensor&amp;lt;1x128xf32&amp;gt;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Look at the types: &lt;code&gt;tensor&amp;lt;1x784xf32&amp;gt;&lt;/code&gt; means "a tensor shaped 1 × 784 of 32-bit floats." The compiler can &lt;em&gt;see&lt;/em&gt; the shapes and the high-level operations (&lt;code&gt;matmul&lt;/code&gt;, &lt;code&gt;add&lt;/code&gt;), which means it can reason about them — fuse operations, reorder them, choose the optimal memory layout for a TPU — all &lt;em&gt;before&lt;/em&gt; lowering everything down to LLVM IR and finally to machine code.&lt;/p&gt;

&lt;p&gt;That's the whole point. The dead-code-elimination trick we watched earlier was a tiny optimization on a tiny program. MLIR is the framework that lets you apply that same &lt;em&gt;style&lt;/em&gt; of optimization to machine-learning-shaped programs, at the right level of abstraction, for whatever hardware you're targeting — without building a brand-new compiler from scratch every single time.&lt;/p&gt;
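
&lt;p&gt;To see what the &lt;code&gt;matmul&lt;/code&gt; and &lt;code&gt;add&lt;/code&gt; operations in the MLIR sketch above actually compute, here is a plain-Python version with the same shapes (1×784 input, 784×128 weights, 1×128 bias). It is a sketch for intuition only: the constant values are made up, and a real compiler would emit something far faster than nested Python loops.&lt;/p&gt;

```python
def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both as nested lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner))
             for j in range(cols)]
            for i in range(rows)]


def add(a, b):
    """Element-wise addition of two matrices with identical shapes."""
    return [[x + y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def dense(inp, weights, bias):
    """A Dense layer without activation: matmul followed by a bias add."""
    return add(matmul(inp, weights), bias)


# Made-up constant values, chosen so the result is easy to check by hand.
inp = [[1.0] * 784]                          # 1 x 784: one flattened image
weights = [[0.5] * 128 for _ in range(784)]  # 784 x 128
bias = [[0.25] * 128]                        # 1 x 128

out = dense(inp, weights, bias)
print(len(out), len(out[0]))  # 1 128
print(out[0][0])              # 784 * 1.0 * 0.5 + 0.25 = 392.25
```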




&lt;h2&gt;
  
  
  Where we're headed next
&lt;/h2&gt;

&lt;p&gt;We've covered the &lt;em&gt;why&lt;/em&gt; — deliberately staying at altitude:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The fragmentation problem that made MLIR necessary.&lt;/li&gt;
&lt;li&gt;How a compiler flows from frontend → AST → IR → optimized machine code.&lt;/li&gt;
&lt;li&gt;Why a single fixed IR (like LLVM's) isn't enough for machine learning.&lt;/li&gt;
&lt;li&gt;What "multi-level," "dialect," and "tensor" actually mean.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In the next articles we'll get our hands dirty: setting up an MLIR project, reading and writing real dialects, running an actual lowering pass, and seeing the &lt;code&gt;mlir-opt&lt;/code&gt; tool transform code live.&lt;/p&gt;

&lt;p&gt;If you want a head start, the &lt;a href="https://www.jeremykun.com/2023/08/10/mlir-getting-started/" rel="noopener noreferrer"&gt;MLIR tutorial series by Jeremy Kun&lt;/a&gt; and the &lt;a href="https://mlir.llvm.org/" rel="noopener noreferrer"&gt;official MLIR docs&lt;/a&gt; are excellent next stops.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The one idea worth keeping:&lt;/strong&gt; MLIR exists because the world kept building the same compiler over and over. It's the reusable, multi-level foundation that makes that stop.&lt;/p&gt;

</description>
      <category>mlir</category>
      <category>compilers</category>
      <category>machinelearning</category>
      <category>programming</category>
    </item>
    <item>
      <title>Comparison: GitHub Actions vs Bitbucket Pipelines vs GitLab CI/CD</title>
      <dc:creator>Fedor Nikolaev</dc:creator>
      <pubDate>Tue, 20 Dec 2022 14:59:08 +0000</pubDate>
      <link>https://dev.to/frodo/comparison-github-actions-vs-bitbucket-pipelines-vs-gitlab-cicd-1141</link>
      <guid>https://dev.to/frodo/comparison-github-actions-vs-bitbucket-pipelines-vs-gitlab-cicd-1141</guid>
      <description>&lt;p&gt;Today we will look into the development process and compare the three most popular platforms for continuous integration, testing, and rolling out your code. The article will help you choose one if you have not done so yet, or learn more about the others if you have already mastered one.&lt;/p&gt;

&lt;h3&gt;
  
  
  Brief summary:
&lt;/h3&gt;

&lt;p&gt;The barrier to entry matters more than most other factors. So:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitLab CI/CD&lt;/strong&gt; - the optimal choice for most. It balances simplicity with a huge community of users;&lt;br&gt;
&lt;strong&gt;GitHub Actions&lt;/strong&gt; - the most popular platform and, in my opinion, one of the most confusing. It is no coincidence that GitHub has the highest number of users and open source projects compared to other platforms;&lt;br&gt;
&lt;strong&gt;Bitbucket Pipelines&lt;/strong&gt; - Atlassian's integrated toolkit lets you create integrations with other Atlassian products on the fly. It has its own peculiarities and is designed for confident users;&lt;/p&gt;
&lt;h3&gt;
  
  
  What do these platforms have in common:
&lt;/h3&gt;

&lt;p&gt;Any of the tools discussed here will allow you to implement a working CI/CD pipeline, but the costs of adopting and operating each one differ.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;All of them:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;are designed for development;&lt;/li&gt;
&lt;li&gt;have a comfortable development environment with support in the browser;&lt;/li&gt;
&lt;li&gt;have components for encapsulation;&lt;/li&gt;
&lt;li&gt;have decent performance;&lt;/li&gt;
&lt;li&gt;support &lt;em&gt;Docker&lt;/em&gt;;&lt;/li&gt;
&lt;li&gt;support self-hosted runners;&lt;/li&gt;
&lt;li&gt;use &lt;em&gt;YAML&lt;/em&gt;;&lt;/li&gt;
&lt;li&gt;allow you to create pipelines for the application quickly.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let's settle the question of &lt;strong&gt;supported operating systems&lt;/strong&gt; right away:&lt;/p&gt;

&lt;p&gt;Not every platform supports every environment you may need across the software development lifecycle. Usually a Linux environment is enough, but in some cases you will need other environment presets, such as &lt;em&gt;Windows&lt;/em&gt; or &lt;em&gt;macOS&lt;/em&gt;, to build applications for &lt;em&gt;iOS/Windows&lt;/em&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Bitbucket Pipelines&lt;/strong&gt; supports Linux environments. At the time of writing, &lt;em&gt;Windows Server&lt;/em&gt; support has existed for about half a year and &lt;em&gt;macOS&lt;/em&gt; support for a few months, and both are still quite raw for use in a production system.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;GitLab CI/CD&lt;/strong&gt; supports Linux, macOS, and Windows Server.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;GitHub Actions&lt;/strong&gt; supports Linux, macOS, and Windows Server.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this article I will try to focus on the functionality and features of each platform.&lt;/p&gt;
&lt;h3&gt;
  
  
  Bitbucket Pipelines &lt;br&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Pros&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Built-in integrations between their products and third-party services using the Atlassian Marketplace;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Only Linux support is production-ready;&lt;/li&gt;
&lt;li&gt;A &lt;a href="https://bitbucket.org/blog/automatically-refresh-caches-when-build-dependencies-are-updated" rel="noopener noreferrer"&gt;cache&lt;/a&gt; is used that sometimes interferes with work;&lt;/li&gt;
&lt;li&gt;You can rerun only the whole pipeline or its failed steps; a single step cannot be started on its own.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is an example of a Bitbucket Pipelines file (Bitbucket Pipelines makes use of a YAML file &lt;code&gt;bitbucket-pipelines.yml&lt;/code&gt; located at the root directory of the repository):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;image: node:19.3.0-slim

pipelines:
  default:
    - step:
        name: delete cache if changes in the build dependencies
        script:
          - pipe: atlassian/bitbucket-clear-cache:3.1.1
            variables:
              BITBUCKET_USERNAME: $BITBUCKET_USER_NAME
              BITBUCKET_APP_PASSWORD: $BITBUCKET_APP_PASSWORD
              CACHES: ["node"]
        condition:
          changesets:
            includePaths:
              - package.json
              - src/*
    - step:
        name: Build and test
        script:
          - npm install
          - npm test
  branches:
    main:
      - step:
          name: Hello
          script:
            - echo "Hello from Dev.to"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  GitLab CI/CD &lt;br&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Pros&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Simple to write, with a lot of examples for building a full-fledged pipeline;&lt;/li&gt;
&lt;li&gt;Auto DevOps;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A weak role model for controlling who can roll out to production;&lt;/li&gt;
&lt;li&gt;Has quite a lot of bugs;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;default:
  image: node:19.3.0-slim

stages:
  - build
  - test

build-app:
  stage: build
  script:
    - npm install
    - echo "Hello from Dev.to"

test-app:
  stage: test
  script:
    - npm test
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  GitHub Actions &lt;br&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Pros&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Has a large community. Because it is the most popular solution, someone has probably already run into your problem, so you will most likely find a way to solve it;&lt;/li&gt;
&lt;li&gt;Actions marketplace;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It is not very feature-rich;&lt;/li&gt;
&lt;li&gt;The API is not very pleasant to develop against;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;name: Tests

on:
  workflow_dispatch:
  push:
    branches:
    - 'main'
  pull_request:
    branches:
    - 'main'

permissions:
  contents: read

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v3
      - name: Install Dependencies
        run: sudo ./.github/workflows/node-apt.sh
      - name: Install
        run: npm install
      - name: Test
        run: |
          npm test
          echo "Hello from Dev.to"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Syntax &lt;br&gt;
&lt;/h3&gt;

&lt;p&gt;The syntax is similar in all three cases, but the devil is in the details. Each platform has its own nuances, and switching from one platform to another will cost not only money but also the time needed to absorb the documentation and its authors' ideas.&lt;/p&gt;

&lt;h4&gt;
  
  
  Significant points that I would highlight:
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Bitbucket&lt;/strong&gt;&lt;br&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Variables in custom pipelines are handled implicitly. This may not work the way you intended, and you will have to pass environment variables through artifacts to every step that needs them.&lt;/li&gt;
&lt;li&gt;Bitbucket is finicky when building Docker images. The settings below should not confuse anyone who has experience with Bitbucket; refer to the official documentation for more details.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;options:
  docker: true
  size: 2x

definitions:
  services:
   docker:
     memory: 6144
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So let's talk about the features of &lt;strong&gt;GitLab&lt;/strong&gt;:&lt;br&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When a business outgrows startup scale and begins to store a lot of personal information, the laws of the company's country and information security certifications start to apply. That calls for a role model: a separation of duties between those who write the code, those who test it, and those who roll it out (which is not a common GitOps approach).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;GitLab has only a few &lt;a href="https://docs.gitlab.com/ee/user/permissions.html" rel="noopener noreferrer"&gt;permissions and roles&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;What about &lt;strong&gt;GitHub&lt;/strong&gt;:&lt;br&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Some valuable features of GitHub are paid, which is why some teams look elsewhere.&lt;/li&gt;
&lt;li&gt;The hardest part comes if you need to read and use the GitHub API: large nested JSON responses have to be picked apart and transformed into a usable object.&lt;/li&gt;
&lt;/ul&gt;
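
&lt;p&gt;As a rough sketch of what "transforming a large JSON response into a working object" can look like, the snippet below parses a trimmed, made-up payload shaped like GitHub's "list workflow runs" response and keeps only a handful of fields. The &lt;code&gt;Run&lt;/code&gt; class, the &lt;code&gt;parse_runs&lt;/code&gt; helper and the sample values are assumptions introduced for illustration; a real response carries many more fields per run.&lt;/p&gt;

```python
import json
from dataclasses import dataclass
from typing import Optional

# Trimmed, made-up sample shaped like a "list workflow runs" response.
SAMPLE = """
{
  "total_count": 2,
  "workflow_runs": [
    {"id": 1, "name": "Tests", "head_branch": "main",
     "status": "completed", "conclusion": "success"},
    {"id": 2, "name": "Tests", "head_branch": "feature-x",
     "status": "in_progress", "conclusion": null}
  ]
}
"""


@dataclass
class Run:
    """The few fields we care about, pulled out of one workflow run."""
    id: int
    name: str
    branch: str
    status: str
    conclusion: Optional[str]  # None while the run is still in progress


def parse_runs(payload):
    """Turn the raw JSON text into a list of small Run objects."""
    data = json.loads(payload)
    return [
        Run(
            id=item["id"],
            name=item["name"],
            branch=item["head_branch"],
            status=item["status"],
            conclusion=item.get("conclusion"),
        )
        for item in data.get("workflow_runs", [])
    ]


runs = parse_runs(SAMPLE)
# Ids of runs that have not (yet) concluded successfully.
unfinished = [run.id for run in runs if run.conclusion != "success"]
print(unfinished)  # [2]
```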

&lt;h3&gt;
  
  
  Conclusion:&lt;br&gt;
&lt;/h3&gt;

&lt;p&gt;Now that you've figured out the platforms, let's see which one is right for you and your company.&lt;/p&gt;

&lt;p&gt;If you are going to work only on personal or open-source projects, GitHub will without a doubt be an excellent choice.&lt;br&gt;
GitLab, in turn, can be used in an enterprise. If you want to host multiple repositories and work with many colleagues, GitLab may be a good choice for you. GitLab is the youngest project of the three and follows the GitOps approach.&lt;/p&gt;

&lt;p&gt;Bitbucket is popular among many large organizations because of its user interface and integrations.&lt;/p&gt;

&lt;p&gt;But, to be honest, this question should be answered by you. It depends on your requirements, the size of the team and your niche.&lt;/p&gt;

&lt;p&gt;If you are a development, outsourcing, or consulting company, you will need good integration with a project management tool, an error reporting tool, a text editor, and so on. Check whether your project management tool, such as Trello or Jira, can be integrated before you make a decision.&lt;/p&gt;

&lt;p&gt;Judging by syntax, interface, and features, GitLab CI/CD is the youngest and most fashionable of the three. It is commonly used in startups and medium-sized companies. A large enterprise tends to choose something more reliable, stable, and battle-tested to avoid the problems described above.&lt;/p&gt;

</description>
      <category>devops</category>
      <category>programming</category>
      <category>git</category>
      <category>codenewbie</category>
    </item>
  </channel>
</rss>
