<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: BMJ</title>
    <description>The latest articles on DEV Community by BMJ (@bare_metal_junkie).</description>
    <link>https://dev.to/bare_metal_junkie</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3950922%2F78c58dc4-8696-4783-b57b-3b7cd89eac18.jpg</url>
      <title>DEV Community: BMJ</title>
      <link>https://dev.to/bare_metal_junkie</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/bare_metal_junkie"/>
    <language>en</language>
    <item>
      <title>ForgeZero Now Supports musl Cross-Compilation and Objective-C on Linux</title>
      <dc:creator>BMJ</dc:creator>
      <pubDate>Sat, 06 Jun 2026 08:55:36 +0000</pubDate>
      <link>https://dev.to/bare_metal_junkie/forgezero-now-supports-musl-cross-compilation-and-objective-c-on-linux-1025</link>
      <guid>https://dev.to/bare_metal_junkie/forgezero-now-supports-musl-cross-compilation-and-objective-c-on-linux-1025</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8na16bu6yqevfcsm9c55.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8na16bu6yqevfcsm9c55.png" alt=" " width="799" height="309"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  🚀 ForgeZero Now Supports musl Cross-Compilation and Objective-C on Linux
&lt;/h1&gt;

&lt;p&gt;I've been spending the last few weeks improving &lt;strong&gt;ForgeZero&lt;/strong&gt;, adding support for more toolchains and making cross-platform development a little easier.&lt;/p&gt;

&lt;p&gt;The latest update introduces two features that I have wanted for quite some time:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ &lt;strong&gt;musl cross-compilation&lt;/strong&gt; powered by the Zig toolchain&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Objective-C support on Linux&lt;/strong&gt; with automatic compiler and linker selection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The best part? &lt;strong&gt;No additional configuration is required.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Just write your code and let ForgeZero handle the rest.&lt;/p&gt;




&lt;h1&gt;
  
  
  Static musl Builds
&lt;/h1&gt;

&lt;p&gt;ForgeZero can now generate &lt;strong&gt;fully static musl binaries&lt;/strong&gt;, making cross-compilation almost effortless.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;fz &lt;span class="nt"&gt;-cc&lt;/span&gt; main.c &lt;span class="nt"&gt;-musl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;riscv64 &lt;span class="nt"&gt;-toolchain&lt;/span&gt; zig
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The resulting binary:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;main: ELF 64-bit LSB executable, UCB RISC-V, RVC, double-float ABI, version 1 (SYSV), statically linked, with debug_info, not stripped
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Static binaries are incredibly useful when building:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🐳 Minimal Docker images (&lt;code&gt;scratch&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;📦 Portable standalone executables&lt;/li&gt;
&lt;li&gt;🔌 Embedded &amp;amp; IoT applications&lt;/li&gt;
&lt;li&gt;🖥️ Systems without glibc&lt;/li&gt;
&lt;li&gt;🌍 Cross-platform deployment pipelines&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Using &lt;strong&gt;Zig&lt;/strong&gt; as the backend compiler makes targeting different architectures surprisingly simple while keeping the ForgeZero interface exactly the same.&lt;/p&gt;




&lt;h1&gt;
  
  
  Objective-C on Linux
&lt;/h1&gt;

&lt;p&gt;This is probably the feature I'm most excited about.&lt;/p&gt;

&lt;p&gt;ForgeZero now automatically detects &lt;strong&gt;&lt;code&gt;.m&lt;/code&gt; Objective-C source files&lt;/strong&gt; and switches to the correct compilation pipeline without requiring any flags.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;fz &lt;span class="nt"&gt;-cc&lt;/span&gt; main.m
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Verbose output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Objective-C detected!
Bypassing Zig linker to use Clang with -lobjc

Running:
clang main.o -o main -lobjc -Wl,--build-id=none

Built: main
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No custom scripts.&lt;/p&gt;

&lt;p&gt;No Makefiles.&lt;/p&gt;

&lt;p&gt;No manually remembering linker flags.&lt;/p&gt;

&lt;p&gt;ForgeZero simply detects the language and invokes the correct backend automatically.&lt;/p&gt;
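
&lt;p&gt;To make the detection step concrete, here is a minimal Python sketch of suffix-based pipeline selection. It is not ForgeZero's code: the &lt;code&gt;PIPELINES&lt;/code&gt; table and the &lt;code&gt;select_pipeline&lt;/code&gt; name are hypothetical, and only the pairing of &lt;code&gt;.m&lt;/code&gt; with &lt;code&gt;-lobjc&lt;/code&gt; is taken from the verbose output above.&lt;/p&gt;

```python
from pathlib import Path

# Hypothetical table: file suffix -> (language label, extra linker flags).
# Only the ".m" -> "-lobjc" pairing comes from the verbose output in the post.
PIPELINES = {
    ".c": ("c", []),
    ".m": ("objective-c", ["-lobjc"]),
    ".asm": ("assembly", []),
}


def select_pipeline(source):
    """Return (language, extra_link_flags) for a source file, chosen by suffix."""
    suffix = Path(source).suffix
    if suffix not in PIPELINES:
        raise ValueError(f"unsupported source type: {source}")
    language, flags = PIPELINES[suffix]
    # Copy the flag list so callers cannot mutate the shared table.
    return language, list(flags)


print(select_pipeline("main.m"))
```

&lt;p&gt;A lookup table like this keeps the command line identical for every language; supporting a new one means adding a row, not a new flag.&lt;/p&gt;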




&lt;h1&gt;
  
  
  Why Objective-C?
&lt;/h1&gt;

&lt;p&gt;Most developers associate Objective-C exclusively with macOS and Apple's ecosystem.&lt;/p&gt;

&lt;p&gt;However, &lt;strong&gt;GNU Objective-C works perfectly fine on Linux&lt;/strong&gt; through Clang and &lt;code&gt;libobjc&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Supporting it means ForgeZero can now build another systems programming language using exactly the same interface as C or Assembly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;fz &lt;span class="nt"&gt;-cc&lt;/span&gt; hello.c
fz &lt;span class="nt"&gt;-cc&lt;/span&gt; hello.m
fz &lt;span class="nt"&gt;-asm&lt;/span&gt; boot.asm
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The command stays the same—the build pipeline adapts automatically.&lt;/p&gt;




&lt;h1&gt;
  
  
  Where ForgeZero is Going
&lt;/h1&gt;

&lt;p&gt;ForgeZero started as a tiny utility to avoid typing endless compiler and linker commands.&lt;/p&gt;

&lt;p&gt;Over time, it has evolved into a unified build frontend capable of orchestrating multiple toolchains while automatically selecting the right backend for the current source language.&lt;/p&gt;

&lt;p&gt;The philosophy remains simple:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Write code, run one command, and let the build system figure out the details.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;There's still a lot of work ahead, but I'm happy with the direction the project is taking.&lt;/p&gt;




&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;p&gt;⭐ &lt;strong&gt;GitHub &amp;amp; Documentation:&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://github.com/forgezero-cli/forgezero" rel="noopener noreferrer"&gt;https://github.com/forgezero-cli/forgezero&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;👤 &lt;strong&gt;Author:&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://github.com/alexvoste" rel="noopener noreferrer"&gt;https://github.com/alexvoste&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Feedback, ideas, and contributions are always welcome.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Gloria JIT v4.4.0 — Bare-Metal Control, Memory Primitives, and Structured Flow</title>
      <dc:creator>BMJ</dc:creator>
      <pubDate>Mon, 01 Jun 2026 22:23:30 +0000</pubDate>
      <link>https://dev.to/bare_metal_junkie/gloria-jit-v440-bare-metal-control-memory-primitives-and-structured-flow-5oc</link>
      <guid>https://dev.to/bare_metal_junkie/gloria-jit-v440-bare-metal-control-memory-primitives-and-structured-flow-5oc</guid>
      <description>&lt;h2&gt;
  
  
  What is Gloria JIT?
&lt;/h2&gt;

&lt;p&gt;Gloria JIT is a low-level programming language and compiler that is part of the &lt;a href="https://dev.to/alexvoste"&gt;ForgeZero&lt;/a&gt; ecosystem. It is written in Go and compiles source code directly to x86-64 machine code — no LLVM, no GCC, no Clang in the middle.&lt;/p&gt;

&lt;p&gt;The goal is straightforward: give the programmer direct, unmediated control over the machine. No intermediate representation handed off to a third-party backend. No optimizer making decisions you didn't ask for. The compiler emits raw bytes, and those bytes run.&lt;/p&gt;

&lt;p&gt;This makes Gloria JIT an interesting project for anyone curious about how compilers actually work, or for developers who want to explore bare-metal programming without the abstraction layers that most toolchains introduce.&lt;/p&gt;

&lt;p&gt;v4.4.0 expands the language significantly, adding structured control flow, direct memory and I/O access, and a bare-metal output path for VGA framebuffer writing.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;code&gt;while&lt;/code&gt; Loops — Structured Control Flow
&lt;/h2&gt;

&lt;p&gt;Before this release, repeated execution required manual branching. v4.4.0 adds proper &lt;code&gt;while&lt;/code&gt; loops.&lt;/p&gt;

&lt;p&gt;A loop runs as long as its condition is non-zero. Inside the loop body you can use standard assignment and mutation operators (&lt;code&gt;=&lt;/code&gt;, &lt;code&gt;+=&lt;/code&gt;, &lt;code&gt;-=&lt;/code&gt;), and built-in calls are permitted as well. This is a meaningful step toward the kind of control flow you'd expect from a general-purpose language.&lt;/p&gt;




&lt;h2&gt;
  
  
  Memory Primitives — &lt;code&gt;peek&lt;/code&gt; and &lt;code&gt;poke&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;Two new built-in functions expose direct memory access:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;peek(address)&lt;/code&gt; — reads a 16-bit value from the given address&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;poke(address, value)&lt;/code&gt; — writes a 16-bit value to the given address&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both accept either immediate values or runtime variables as arguments.&lt;/p&gt;

&lt;p&gt;This is the kind of primitive that exists in very few high-level languages, but is essential for bare-metal work — writing to hardware registers, inspecting memory-mapped I/O, or building your own allocator from scratch. Handle with care.&lt;/p&gt;
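
&lt;p&gt;The semantics can be modelled in a few lines of Python. This is only a simulation over a byte buffer, not Gloria JIT code: the &lt;code&gt;Memory&lt;/code&gt; class is hypothetical, and truncating oversized values to their low 16 bits is an assumption. The little-endian byte order is simply how x86 stores multi-byte values.&lt;/p&gt;

```python
class Memory:
    """A flat byte buffer standing in for physical memory."""

    def __init__(self, size):
        self.cells = bytearray(size)

    def _check(self, address):
        # A 16-bit access touches address and address + 1.
        if address not in range(len(self.cells) - 1):
            raise IndexError(f"16-bit access out of bounds at {address:#x}")

    def poke(self, address, value):
        """Store the low 16 bits of value, low byte first (little-endian)."""
        self._check(address)
        self.cells[address:address + 2] = (value % 0x10000).to_bytes(2, "little")

    def peek(self, address):
        """Read two bytes and combine them into an unsigned 16-bit value."""
        self._check(address)
        return int.from_bytes(self.cells[address:address + 2], "little")


mem = Memory(16)
mem.poke(4, 0x1234)
print(hex(mem.peek(4)))
```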




&lt;h2&gt;
  
  
  x86-64 Port I/O
&lt;/h2&gt;

&lt;p&gt;For environments that use port-mapped I/O (common in older PC hardware and embedded x86 systems), two new built-ins are available:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;in8(port)&lt;/code&gt; — reads a single byte from the given I/O port (zero-extended)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;out8(port, value)&lt;/code&gt; — writes a byte to the given I/O port&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This enables direct hardware interaction at a level that most programming languages simply do not expose.&lt;/p&gt;




&lt;h2&gt;
  
  
  VGA Framebuffer Output
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;print(string)&lt;/code&gt; now supports a bare-metal execution path that writes directly to the VGA text buffer at memory address &lt;code&gt;0xB8000&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;For context: on x86 PCs, this address is where VGA text-mode video memory lives. Each screen cell takes two bytes, a character code followed by a color attribute, so writing bytes there places characters directly on screen, with no operating system, drivers, or system calls involved. This is how early PC software (and bootloaders that still run in VGA text mode) produce output.&lt;/p&gt;

&lt;p&gt;Details of the implementation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Default text color is light green on black (&lt;code&gt;0x0A&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Register &lt;code&gt;R15&lt;/code&gt; is reserved as a cursor offset and is preserved across calls&lt;/li&gt;
&lt;li&gt;Escape sequences &lt;code&gt;\n&lt;/code&gt; and &lt;code&gt;\t&lt;/code&gt; are resolved at compile time into the appropriate control characters&lt;/li&gt;
&lt;/ul&gt;
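
&lt;p&gt;As a rough illustration of what ends up in the text buffer, the Python sketch below encodes a string as VGA character/attribute pairs and copies it into a stand-in buffer. The function names are hypothetical and this is not the compiler's emitted code. The returned offset plays the role the post assigns to &lt;code&gt;R15&lt;/code&gt;; how Gloria JIT moves the cursor for &lt;code&gt;\n&lt;/code&gt; and &lt;code&gt;\t&lt;/code&gt; is not described, so the sketch handles plain characters only.&lt;/p&gt;

```python
VGA_TEXT_BASE = 0xB8000  # start of VGA text-mode memory on x86 PCs
DEFAULT_ATTR = 0x0A      # attribute byte from the list above (light green on black)
SCREEN_BYTES = 80 * 25 * 2  # standard 80x25 text mode, two bytes per cell


def encode_cells(text, attr=DEFAULT_ATTR):
    """Encode ASCII text as (character, attribute) byte pairs."""
    out = bytearray()
    for byte in text.encode("ascii"):
        out.append(byte)  # even offset: character code
        out.append(attr)  # odd offset: colour attribute
    return bytes(out)


def write_text(buffer, cursor, text, attr=DEFAULT_ATTR):
    """Copy encoded cells into the buffer and return the new cursor offset."""
    cells = encode_cells(text, attr)
    if cursor + len(cells) > len(buffer):
        raise ValueError("text does not fit in the buffer")
    buffer[cursor:cursor + len(cells)] = cells
    return cursor + len(cells)


screen = bytearray(SCREEN_BYTES)  # stand-in for the memory at VGA_TEXT_BASE
cursor = write_text(screen, 0, "Hi")
print(cursor, screen[:4].hex())
```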

&lt;p&gt;To support testing without real hardware, two utilities are included: &lt;code&gt;patchVGA()&lt;/code&gt; swaps the real framebuffer address for a heap-allocated buffer, and &lt;code&gt;dumpVGA()&lt;/code&gt; renders that buffer to stdout via a direct syscall, with zero heap allocations.&lt;/p&gt;




&lt;h2&gt;
  
  
  Register Constants
&lt;/h2&gt;

&lt;p&gt;To reduce ambiguity in the backend IR and make generated code easier to reason about, named constants have been introduced for all general-purpose x86-64 registers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;regRAX(0), regRCX(1), regRDX(2), regRBX(3),
regRSP(4), regRBP(5), regRSI(6), regRDI(7),
regR8(8) ... regR15(15)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Internal Changes
&lt;/h2&gt;

&lt;p&gt;A few backend improvements shipped alongside the language features:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;emitLowLevelPrint&lt;/code&gt; now accepts a &lt;code&gt;kernelMode&lt;/code&gt; flag to switch between syscall-based output and direct VGA writes&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;emitPushReg&lt;/code&gt; and &lt;code&gt;emitPopReg&lt;/code&gt; now cover the full register set including R8–R15&lt;/li&gt;
&lt;li&gt;New operations: &lt;code&gt;emitMovMemToReg64&lt;/code&gt;, &lt;code&gt;emitMovRegToMem64&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;parseStringLiteral&lt;/code&gt; handles escape sequences at compile time rather than at runtime&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;emitBareMetalPrint&lt;/code&gt; introduced for the VGA output path&lt;/li&gt;
&lt;/ul&gt;
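
&lt;p&gt;Covering R8–R15 matters because those registers need an extra prefix byte. The Python sketch below shows the standard x86-64 encodings of &lt;code&gt;push&lt;/code&gt; and &lt;code&gt;pop&lt;/code&gt; for a 64-bit register, using the same 0 to 15 numbering as the register constants. The function names are illustrative and are not the actual &lt;code&gt;emitPushReg&lt;/code&gt; / &lt;code&gt;emitPopReg&lt;/code&gt; implementations.&lt;/p&gt;

```python
REX_B = 0x41  # REX prefix with only the B bit set; it extends the register field


def _encode_stack_op(base_opcode, reg):
    if reg not in range(16):
        raise ValueError(f"register number out of range: {reg}")
    if reg >= 8:
        # R8..R15: the low three bits go in the opcode, the fourth in REX.B.
        return bytes([REX_B, base_opcode + reg - 8])
    return bytes([base_opcode + reg])


def encode_push(reg):
    """Machine-code bytes for `push r64` (opcode 0x50 + register)."""
    return _encode_stack_op(0x50, reg)


def encode_pop(reg):
    """Machine-code bytes for `pop r64` (opcode 0x58 + register)."""
    return _encode_stack_op(0x58, reg)


print(encode_push(0).hex(), encode_push(15).hex())
```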




&lt;h2&gt;
  
  
  Where This Is Heading
&lt;/h2&gt;

&lt;p&gt;With v4.4.0, Gloria JIT supports two targets side by side: userspace programs that produce output through syscalls, and bare-metal code that runs with no operating system underneath it.&lt;/p&gt;

&lt;p&gt;The next focus areas are optimization passes and IR stability. If you're interested in how compilers work from the ground up — or in low-level x86 programming without giving up a proper language — this is worth following.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Gloria JIT is part of the ForgeZero project. Follow along on &lt;a href="https://dev.to/alexvoste"&gt;dev.to/alexvoste&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>compilers</category>
      <category>lowlevel</category>
      <category>x86</category>
      <category>go</category>
    </item>
    <item>
      <title>ForgeZero 4.1 vs GNU Make: Up to 4.5x Faster Build Performance</title>
      <dc:creator>BMJ</dc:creator>
      <pubDate>Tue, 26 May 2026 13:37:57 +0000</pubDate>
      <link>https://dev.to/bare_metal_junkie/forgezero-41-vs-gnu-make-up-to-45x-faster-build-performance-2664</link>
      <guid>https://dev.to/bare_metal_junkie/forgezero-41-vs-gnu-make-up-to-45x-faster-build-performance-2664</guid>
      <description>&lt;h1&gt;
  
  
  ForgeZero 4.1 vs GNU Make: Up to 4.5x Faster Build Performance
&lt;/h1&gt;

&lt;p&gt;I've been working on &lt;strong&gt;ForgeZero&lt;/strong&gt;, a modern build system designed to replace traditional &lt;code&gt;make&lt;/code&gt; with a faster, zero-config approach.&lt;/p&gt;

&lt;p&gt;With the release of &lt;strong&gt;ForgeZero 4.1&lt;/strong&gt;, I benchmarked it against GNU Make on multiple machines to answer one simple question:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can a modern build system still significantly outperform &lt;code&gt;make&lt;/code&gt; in 2026?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Turns out: &lt;strong&gt;yes&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Benchmark setup
&lt;/h2&gt;

&lt;h3&gt;
  
  
  GNU Make
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;make &lt;span class="nt"&gt;-j4&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  ForgeZero
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./fzt &lt;span class="nt"&gt;-dir&lt;/span&gt; &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;-out&lt;/span&gt; fz_out
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Measurement tool
&lt;/h3&gt;

&lt;p&gt;Benchmarks were run using &lt;a href="https://github.com/sharkdp/hyperfine" rel="noopener noreferrer"&gt;&lt;code&gt;hyperfine&lt;/code&gt;&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;hyperfine &lt;span class="s1"&gt;'./fzt -dir . -out fz_out'&lt;/span&gt; &lt;span class="s1"&gt;'make -j4'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa5hmv2hlqh7zeojmrfth.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa5hmv2hlqh7zeojmrfth.png" alt=" " width="800" height="408"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h1&gt;
  
  
  Results
&lt;/h1&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;ForgeZero&lt;/th&gt;
&lt;th&gt;GNU Make&lt;/th&gt;
&lt;th&gt;Speedup&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Ryzen 9 7950X3D (KVM, 1 vCPU)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~80–84 ms&lt;/td&gt;
&lt;td&gt;~350–364 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;4.1x–4.5x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Intel Core i5-10310U&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;82.2 ms&lt;/td&gt;
&lt;td&gt;291.1 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;3.54x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AMD FX-8370E (AM3+)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;111.0 ms&lt;/td&gt;
&lt;td&gt;238.5 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2.15x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Across very different CPUs and environments, ForgeZero consistently outperformed GNU Make.&lt;/p&gt;
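
&lt;p&gt;The speedup column is the GNU Make mean divided by the ForgeZero mean. The short Python check below recomputes it for the two rows that list exact means; the &lt;code&gt;speedup&lt;/code&gt; helper is just an illustration.&lt;/p&gt;

```python
def speedup(baseline_ms, candidate_ms):
    """How many times faster the candidate is than the baseline."""
    if candidate_ms == 0:
        raise ValueError("candidate time must be non-zero")
    return baseline_ms / candidate_ms


# (GNU Make mean, ForgeZero mean) in milliseconds, taken from the table.
results = {
    "Intel Core i5-10310U": (291.1, 82.2),
    "AMD FX-8370E": (238.5, 111.0),
}

for platform, (make_ms, fz_ms) in results.items():
    print(f"{platform}: {speedup(make_ms, fz_ms):.2f}x")
```

&lt;p&gt;This prints 3.54x and 2.15x, matching the table.&lt;/p&gt;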




&lt;h1&gt;
  
  
  Example benchmark output
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Ryzen 9 7950X3D
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Benchmark 1: ./fzt -dir . -out fz_out
Time (mean ± σ): 84.5 ms ± 7.6 ms

Benchmark 2: make -j4
Time (mean ± σ): 350.4 ms ± 16.4 ms

Summary:
4.14x faster than make -j4
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Intel i5-10310U
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Benchmark 1: ./fzt -dir . -out fz_out
Time (mean ± σ): 82.2 ms ± 4.2 ms

Benchmark 2: make -j4
Time (mean ± σ): 291.1 ms ± 11.2 ms

Summary:
3.54x faster than make -j4
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  AMD FX-8370E
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Benchmark 1: ./fzt -dir . -out fz_out
Time (mean ± σ): 111.0 ms ± 17.9 ms

Benchmark 2: make -j4
Time (mean ± σ): 238.5 ms ± 24.4 ms

Summary:
2.15x faster than make -j4
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h1&gt;
  
  
  Why is ForgeZero faster?
&lt;/h1&gt;

&lt;h2&gt;
  
  
  1. No shell overhead
&lt;/h2&gt;

&lt;p&gt;GNU Make spends a surprising amount of time spawning shell processes (&lt;code&gt;fork/exec&lt;/code&gt;) for build commands.&lt;/p&gt;

&lt;p&gt;ForgeZero executes the build pipeline directly.&lt;/p&gt;

&lt;p&gt;No unnecessary shell orchestration.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Lightweight dependency graph
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;make&lt;/code&gt; still carries decades of legacy behavior:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Makefile parsing&lt;/li&gt;
&lt;li&gt;implicit rules&lt;/li&gt;
&lt;li&gt;pattern matching&lt;/li&gt;
&lt;li&gt;recursive variable expansion&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;ForgeZero builds a direct dependency graph and updates only what actually changed.&lt;/p&gt;
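
&lt;p&gt;One common way to decide what "actually changed" is to compare content fingerprints between builds. The Python sketch below shows that idea only. Whether ForgeZero uses hashes, timestamps, or something else is not stated here, and the helper names are hypothetical. It also ignores edges between files (such as headers), which a real dependency graph has to track.&lt;/p&gt;

```python
import hashlib


def fingerprint(content):
    """SHA-256 of a file's bytes, used as its change fingerprint."""
    return hashlib.sha256(content).hexdigest()


def stale_targets(sources, cache):
    """Return the names whose content differs from the cached fingerprint.

    sources: dict of name -> current file bytes
    cache:   dict of name -> fingerprint from the previous build
             (updated in place so the next call sees this build's state)
    """
    stale = []
    for name, content in sorted(sources.items()):
        digest = fingerprint(content)
        if cache.get(name) != digest:
            stale.append(name)
            cache[name] = digest
    return stale


cache = {}
sources = {"main.c": b"int main(void) { return 0; }", "util.c": b"int f(void);"}
print(stale_targets(sources, cache))  # first build: everything is stale
print(stale_targets(sources, cache))  # nothing changed: empty list
```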




&lt;h2&gt;
  
  
  3. Better task scheduling
&lt;/h2&gt;

&lt;p&gt;Instead of the classic &lt;code&gt;make -j&lt;/code&gt; job model, ForgeZero uses a lightweight internal scheduler with lower synchronization overhead.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Zero-config design
&lt;/h2&gt;

&lt;p&gt;No giant Makefiles.&lt;/p&gt;

&lt;p&gt;No boilerplate.&lt;/p&gt;

&lt;p&gt;ForgeZero analyzes project structure automatically and starts building immediately.&lt;/p&gt;




&lt;h1&gt;
  
  
  Why this matters
&lt;/h1&gt;

&lt;p&gt;&lt;code&gt;make&lt;/code&gt; was introduced in &lt;strong&gt;1976&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Almost 50 years later, many developers still accept build-system latency as unavoidable.&lt;/p&gt;

&lt;p&gt;These benchmarks suggest otherwise.&lt;/p&gt;

&lt;p&gt;If your team runs hundreds of builds per day—locally and in CI—even saving &lt;strong&gt;100–250 ms per build&lt;/strong&gt; adds up quickly.&lt;/p&gt;




&lt;h1&gt;
  
  
  ForgeZero 4.1 is available
&lt;/h1&gt;

&lt;p&gt;GitHub:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/forgezero-cli/forgezero" rel="noopener noreferrer"&gt;https://github.com/forgezero-cli/forgezero&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I'd love feedback from developers still using &lt;code&gt;make&lt;/code&gt;, &lt;code&gt;ninja&lt;/code&gt;, or other build systems.&lt;/p&gt;

</description>
      <category>devops</category>
      <category>performance</category>
      <category>showdev</category>
      <category>tooling</category>
    </item>
    <item>
      <title>Zero Heap Allocations at 1.18 GB/s: Deep Dive into ForgeZero 4.0.x</title>
      <dc:creator>BMJ</dc:creator>
      <pubDate>Mon, 25 May 2026 15:14:42 +0000</pubDate>
      <link>https://dev.to/bare_metal_junkie/zero-heap-allocations-at-118-gbs-deep-dive-into-forgezero-40x-3emp</link>
      <guid>https://dev.to/bare_metal_junkie/zero-heap-allocations-at-118-gbs-deep-dive-into-forgezero-40x-3emp</guid>
      <description>&lt;p&gt;What happens when you migrate a system tool from pure Node.js to Go, strip out the standard GC-heavy paths, and force a file system engine to hit &lt;strong&gt;0 allocs/op&lt;/strong&gt;?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fylu2eezgd8y4yfesopx5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fylu2eezgd8y4yfesopx5.png" alt=" " width="800" height="468"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa1sbrfbnfivlh7dro7dy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa1sbrfbnfivlh7dro7dy.png" alt=" " width="800" height="528"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You get &lt;strong&gt;ForgeZero&lt;/strong&gt; (&lt;code&gt;fz&lt;/code&gt;) — an open-source bare-metal system software builder created by &lt;a href="https://github.com/AlexVoste" rel="noopener noreferrer"&gt;@AlexVoste&lt;/a&gt;. Designed to eliminate bloated Makefiles for low-level developers, it orchestrates NASM, GAS, FASM, GCC, and Clang concurrently under a single unified &lt;code&gt;.fz.yaml&lt;/code&gt; configuration.&lt;/p&gt;

&lt;p&gt;With the recent launch of &lt;strong&gt;version 4.0&lt;/strong&gt; and its subsequent &lt;strong&gt;4.0.1 patch&lt;/strong&gt;, the project underwent a radical low-level optimization sprint targeting Go's runtime overhead.&lt;/p&gt;

&lt;p&gt;Here's a technical breakdown of how it keeps runtime overhead that low.&lt;/p&gt;




&lt;h2&gt;
  
  
  ⚡ The Benchmark Reality Check
&lt;/h2&gt;

&lt;p&gt;Running on an Arch Linux testbed (Intel i5-10310U), the updated engine delivers striking performance metrics:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Result&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Data throughput&lt;/td&gt;
&lt;td&gt;~1.18 GB/s steady state&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;File hashing (100 MB payload)&lt;/td&gt;
&lt;td&gt;~78–84 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Memory footprint&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;0 allocs/op&lt;/strong&gt; across all hot-path runs&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;goos: linux
goarch: amd64
BenchmarkHadesEngine/Process100MB-8   14   78411200 ns/op   0 B/op   0 allocs/op
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By avoiding heap allocations on critical execution paths, the hot path creates no garbage for Go's collector to reclaim. The GC is still part of the process, but this code gives it nothing to do, which keeps latency &lt;strong&gt;far more predictable&lt;/strong&gt; and closer to what you would expect from C or Rust.&lt;/p&gt;




&lt;h2&gt;
  
  
  🛠️ The Architecture: Under the Hood of HADES
&lt;/h2&gt;

&lt;p&gt;To pull off &lt;code&gt;0 allocs/op&lt;/code&gt; while scanning deeply nested directory structures and executing multiple sub-processes, ForgeZero's architecture leans on three internal layers.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. The HADES Engine &amp;amp; Memory Re-use
&lt;/h3&gt;

&lt;p&gt;The file system sub-engine (&lt;code&gt;fs&lt;/code&gt;, &lt;code&gt;seal&lt;/code&gt;, and the linker/assembler modules) was fully overhauled. Instead of spawning new byte slices or strings during recursive scans, ForgeZero:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pre-allocates &lt;strong&gt;localized memory arenas&lt;/strong&gt; and sliding ring buffers&lt;/li&gt;
&lt;li&gt;Handles path strings via direct &lt;code&gt;string&lt;/code&gt;-to-&lt;code&gt;[]byte&lt;/code&gt; headers (&lt;code&gt;unsafe.Pointer&lt;/code&gt;), dodging the typical heap allocation penalty associated with dynamic string manipulation in Go&lt;/li&gt;
&lt;/ul&gt;
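
&lt;p&gt;The buffer-reuse half of this can be shown outside Go. The Python sketch below hashes a file through one pre-allocated buffer instead of creating a new chunk object per read. It is an analogy rather than ForgeZero's code: Python has no equivalent of the &lt;code&gt;unsafe.Pointer&lt;/code&gt; string headers, it makes no zero-allocation claim, and the function name and chunk size are assumptions.&lt;/p&gt;

```python
import hashlib


def hash_file_reusing_buffer(path, chunk_size=64 * 1024):
    """SHA-256 a file while reading every chunk into the same buffer."""
    buf = bytearray(chunk_size)  # allocated once, reused for every read
    view = memoryview(buf)       # lets us pass a slice without copying it
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            n = f.readinto(buf)  # fills buf in place, returns bytes read
            if not n:
                break
            digest.update(view[:n])
    return digest.hexdigest()
```

&lt;p&gt;Calling &lt;code&gt;hash_file_reusing_buffer("some_file.bin")&lt;/code&gt; returns the same digest as hashing the whole file at once, while the working set stays at a single chunk.&lt;/p&gt;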

&lt;h3&gt;
  
  
  2. Multi-Engine Concurrency &amp;amp; Automated Fallbacks
&lt;/h3&gt;

&lt;p&gt;ForgeZero dynamically parallelizes multi-file assembly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Single file:&lt;/strong&gt; matches input files directly to object targets (&lt;code&gt;fz -asm boot.asm&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Directory:&lt;/strong&gt; parses whole structures recursively (&lt;code&gt;fz -dir ./src&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The engine also implements an aggressive &lt;strong&gt;link-level degradation system&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Try &lt;code&gt;gcc&lt;/code&gt; compilation&lt;/li&gt;
&lt;li&gt;Fall back to &lt;code&gt;gcc -no-pie&lt;/code&gt; if the default position-independent (PIE) link fails&lt;/li&gt;
&lt;li&gt;Degrade cleanly to a bare &lt;code&gt;ld&lt;/code&gt; link for completely naked environments&lt;/li&gt;
&lt;/ol&gt;
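
&lt;p&gt;The three steps above amount to "try each strategy in order and stop at the first success". A minimal Python sketch of that control flow follows. The function names and the exact argument order are assumptions, not ForgeZero's implementation, and the &lt;code&gt;runner&lt;/code&gt; parameter exists only so the logic can be exercised without the real tools installed.&lt;/p&gt;

```python
import subprocess


def default_runner(command):
    """Run a command quietly; True when it exits with status 0."""
    try:
        done = subprocess.run(command, capture_output=True)
    except FileNotFoundError:
        return False  # tool not installed: treat it as a failed attempt
    return done.returncode == 0


def link_with_fallback(objects, output, runner=default_runner):
    """Try each link strategy in order and return the first that succeeds."""
    attempts = [
        ["gcc", *objects, "-o", output],             # 1. plain gcc
        ["gcc", "-no-pie", *objects, "-o", output],  # 2. gcc without PIE
        ["ld", *objects, "-o", output],              # 3. bare linker
    ]
    for command in attempts:
        if runner(command):
            return command
    raise RuntimeError("all link strategies failed")


# Pretend only the bare `ld` step works, to show the degradation path.
print(link_with_fallback(["boot.o"], "boot", runner=lambda cmd: cmd[0] == "ld"))
```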

&lt;h3&gt;
  
  
  3. Explicit Mode Switches
&lt;/h3&gt;

&lt;p&gt;For strict bare-metal control, devs can override automated link behaviors via targeted CLI flags:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;-mode c&lt;/code&gt;: route the build strictly through GCC&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-mode raw&lt;/code&gt;: bypass safety overrides and link unmanaged binaries directly with raw &lt;code&gt;ld&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🚀 What's New in Patch 4.0.1?
&lt;/h2&gt;

&lt;p&gt;While 4.0 laid the groundwork for memory optimization, the 4.0.1 hotfix secures edge cases in bare-metal pipeline execution.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Silent-by-Default Pipeline&lt;/strong&gt;&lt;br&gt;
Hides external noise from standard tooling (like &lt;code&gt;nasm&lt;/code&gt; or &lt;code&gt;gcc&lt;/code&gt;), displaying a clean single-line state block: &lt;code&gt;Built: program.out&lt;/code&gt;. Errors are trapped and viewable in full via the &lt;code&gt;-verbose&lt;/code&gt; flag.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Collision Resolution&lt;/strong&gt;&lt;br&gt;
Fixes object-name collisions between source files that share a base name but use different assembly extensions. For example, &lt;code&gt;main.asm&lt;/code&gt; and &lt;code&gt;main.s&lt;/code&gt; now map to separate &lt;code&gt;main_asm.o&lt;/code&gt; and &lt;code&gt;main_s.o&lt;/code&gt; objects and no longer interfere with each other.&lt;/p&gt;
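
&lt;p&gt;The naming scheme in that example can be expressed as a small function. The Python sketch below reproduces the mapping from the two file names mentioned (&lt;code&gt;main.asm&lt;/code&gt; and &lt;code&gt;main.s&lt;/code&gt;). The &lt;code&gt;object_name&lt;/code&gt; helper is hypothetical, and how ForgeZero treats files without an extension is an assumption.&lt;/p&gt;

```python
from pathlib import PurePosixPath


def object_name(source):
    """Map a source file to an object name that keeps its extension."""
    path = PurePosixPath(source)
    ext = path.suffix.lstrip(".")
    if not ext:
        # Assumption: a file with no extension just gets ".o" appended.
        return f"{path.name}.o"
    return f"{path.stem}_{ext}.o"


for src in ("main.asm", "main.s"):
    print(src, "->", object_name(src))
```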

&lt;p&gt;&lt;strong&gt;Garbage Cleanup&lt;/strong&gt;&lt;br&gt;
Refined &lt;code&gt;-clean&lt;/code&gt; runtime structures to ensure all cross-compilation objects (&lt;code&gt;.fz_objs&lt;/code&gt; temporary workspaces) are recursively pruned using zero-allocation OS system calls.&lt;/p&gt;


&lt;h2&gt;
  
  
  💻 Getting Started
&lt;/h2&gt;

&lt;p&gt;For system engineers moving away from manually typed, multi-stage assembly toolchains:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Pull the latest bare-metal builder package directly via Go&lt;/span&gt;
go &lt;span class="nb"&gt;install &lt;/span&gt;github.com/forgezero-cli/forgezero@latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Make sure your underlying assembly tools (&lt;code&gt;nasm&lt;/code&gt;, &lt;code&gt;fasm&lt;/code&gt;, &lt;code&gt;ld&lt;/code&gt;, etc.) are available on your system &lt;code&gt;$PATH&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Check out the fully-tested source tree, architecture specs, and documentation over at the &lt;strong&gt;&lt;a href="https://github.com/forgezero-cli/forgezero" rel="noopener noreferrer"&gt;official ForgeZero GitHub Repository&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;

</description>
      <category>go</category>
      <category>assembly</category>
      <category>lowlevel</category>
      <category>performance</category>
    </item>
  </channel>
</rss>
