<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Stjepan</title>
    <description>The latest articles on DEV Community by Stjepan (@stjepan86).</description>
    <link>https://dev.to/stjepan86</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3935981%2F6fc2fd86-a222-4abd-a4c4-8471e94ebf6b.png</url>
      <title>DEV Community: Stjepan</title>
      <link>https://dev.to/stjepan86</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/stjepan86"/>
    <language>en</language>
    <item>
      <title>Learning KVM by Reverse-Engineering QEMU with strace</title>
      <dc:creator>Stjepan</dc:creator>
      <pubDate>Fri, 05 Jun 2026 18:42:53 +0000</pubDate>
      <link>https://dev.to/stjepan86/learning-kvm-by-reverse-engineering-qemu-with-strace-445a</link>
      <guid>https://dev.to/stjepan86/learning-kvm-by-reverse-engineering-qemu-with-strace-445a</guid>
      <description>&lt;h2&gt;Motivation&lt;/h2&gt;

&lt;p&gt;I work a lot with virtual machines in QEMU/KVM environments. Debugging, optimizing and customizing these VMs requires in-depth knowledge of both QEMU and KVM, the Linux kernel virtualization subsystem that exposes hardware virtualization features such as Intel VT-x and AMD-V to userspace applications like QEMU. I also work on a lot of hobby projects that require quick&lt;br&gt;
bare-metal boot-ups and debugging workflows and, to be honest, QEMU is often overkill for these sorts of tasks.&lt;/p&gt;
&lt;h2&gt;KVM vs TCG&lt;/h2&gt;

&lt;p&gt;When it comes to QEMU, it's worth noting that we always have the option of using TCG, which serves a completely different purpose than KVM. TCG is short for Tiny Code Generator; it translates guest instructions into host instructions at runtime, which is quite slow compared to running code without&lt;br&gt;
translation overhead. So, if we want to test our bare-metal code, we may want to run it on our own real CPU at native speed. This is where KVM comes in. Unlike TCG, KVM does not emulate the CPU itself. Instead, it allows guest code to execute directly on the host processor while the Linux kernel manages&lt;br&gt;
transitions between guest and host execution.&lt;/p&gt;
&lt;h2&gt;How KVM works in Linux&lt;/h2&gt;

&lt;p&gt;I already know some basics. The KVM driver exposes a character device,&lt;br&gt;
&lt;code&gt;/dev/kvm&lt;/code&gt;, and userspace communicates with the driver through the&lt;br&gt;
&lt;code&gt;ioctl()&lt;/code&gt; system call on a file descriptor. What we need to find out is how QEMU&lt;br&gt;
communicates with the Linux kernel, and we will try to follow the QEMU logic without&lt;br&gt;
reading the QEMU source code or the KVM API documentation (both can be a bit more intimidating than&lt;br&gt;
just seeing how it works under the hood).&lt;/p&gt;
&lt;h2&gt;Reverse-engineering KVM&lt;/h2&gt;

&lt;p&gt;Now we can take a lightweight Debian Linux image and boot it in QEMU&lt;br&gt;
with KVM enabled:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;QEMU_IMAGE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;./debian-12-nocloud-amd64.qcow2

qemu-system-x86_64                                      &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-m&lt;/span&gt; 1024                                             &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-drive&lt;/span&gt; &lt;span class="nv"&gt;file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;QEMU_IMAGE&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;,if&lt;span class="o"&gt;=&lt;/span&gt;virtio,cache&lt;span class="o"&gt;=&lt;/span&gt;none    &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-serial&lt;/span&gt; stdio                                       &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-enable-kvm&lt;/span&gt;                                         &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-cpu&lt;/span&gt; host                                           &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-nodefaults&lt;/span&gt;                                         &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-nographic&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is a pretty straightforward way to run QEMU with minimal setup. The most&lt;br&gt;
relevant options for us are &lt;code&gt;-enable-kvm&lt;/code&gt; and &lt;code&gt;-cpu host&lt;/code&gt;: the first enables&lt;br&gt;
KVM and the second passes the host CPU model through to the guest instead of emulating some specific CPU model.&lt;/p&gt;
&lt;h3&gt;Tracing QEMU/KVM with strace&lt;/h3&gt;

&lt;p&gt;Now, we want to see what QEMU is really doing by utilizing &lt;code&gt;strace&lt;/code&gt;. We can put the QEMU command above in a &lt;code&gt;start-qemu.sh&lt;/code&gt; script and trace it with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;strace &lt;span class="nt"&gt;-yy&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="nt"&gt;-X&lt;/span&gt; verbose                                &lt;span class="se"&gt;\&lt;/span&gt;
       &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;trace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;ioctl,openat,read,write,mmap            &lt;span class="se"&gt;\&lt;/span&gt;
       &lt;span class="nt"&gt;-o&lt;/span&gt; kvm.log                                       &lt;span class="se"&gt;\&lt;/span&gt;
       ./start-qemu.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command will trace all &lt;code&gt;ioctl&lt;/code&gt;, &lt;code&gt;openat&lt;/code&gt;, &lt;code&gt;read&lt;/code&gt;, &lt;code&gt;write&lt;/code&gt; and &lt;code&gt;mmap&lt;/code&gt; system&lt;br&gt;
calls. Although I mentioned only &lt;code&gt;ioctl&lt;/code&gt; calls so far, I always like to include&lt;br&gt;
some other common system calls that could be used. As far as we know, &lt;code&gt;/dev/kvm&lt;/code&gt;&lt;br&gt;
is the interface to the KVM driver, and QEMU will probably use &lt;code&gt;openat&lt;/code&gt; on it.&lt;br&gt;
Similarly, we also want to see what QEMU is doing with memory and what it's&lt;br&gt;
reading and writing in general.&lt;/p&gt;

&lt;p&gt;Note: Information on the &lt;code&gt;strace&lt;/code&gt; arguments above can be found in &lt;code&gt;strace --help&lt;/code&gt;&lt;br&gt;
or &lt;code&gt;man strace&lt;/code&gt;. In short, &lt;code&gt;-yy&lt;/code&gt; tells strace to print all available&lt;br&gt;
information when decoding file descriptors, and &lt;code&gt;-f&lt;/code&gt; follows forks (we need this one&lt;br&gt;
as we're wrapping QEMU in a script and QEMU might also create child processes or threads of its own).&lt;br&gt;
&lt;code&gt;-X verbose&lt;/code&gt; prints both the raw values and the names of constants and flags (very important when&lt;br&gt;
analyzing &lt;code&gt;ioctl&lt;/code&gt; calls).&lt;/p&gt;
&lt;h3&gt;Interpreting the logs&lt;/h3&gt;

&lt;p&gt;Now, we start the above command and, as soon as the system boots, we can kill it&lt;br&gt;
with &lt;code&gt;CTRL+C&lt;/code&gt;. This will be quite sufficient to see how QEMU/KVM works without&lt;br&gt;
spamming our logs with redundant information. When we read the &lt;code&gt;kvm.log&lt;/code&gt; file,&lt;br&gt;
we will see a lot of traces that are not really interesting. However, we already&lt;br&gt;
have some knowledge: we know QEMU should be opening &lt;code&gt;/dev/kvm&lt;/code&gt; so a quick&lt;br&gt;
search for &lt;code&gt;kvm&lt;/code&gt; reveals exactly what we need:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;140900 openat(-100 /* AT_FDCWD */&amp;lt;/home/stjepan/Develop/KVM&amp;gt;, "/dev/kvm", 0x80002 /* O_RDWR|O_CLOEXEC */) = 3&amp;lt;/dev/kvm&amp;lt;char 10:232&amp;gt;&amp;gt;
140900 ioctl(3&amp;lt;/dev/kvm&amp;lt;char 10:232&amp;gt;&amp;gt;, 0xae00 /* KVM_GET_API_VERSION */, 0) = 12
140900 ioctl(3&amp;lt;/dev/kvm&amp;lt;char 10:232&amp;gt;&amp;gt;, 0xae03 /* KVM_CHECK_EXTENSION */, 0x88 /* KVM_CAP_IMMEDIATE_EXIT */) = 1
140900 ioctl(3&amp;lt;/dev/kvm&amp;lt;char 10:232&amp;gt;&amp;gt;, 0xae03 /* KVM_CHECK_EXTENSION */, 0xa /* KVM_CAP_NR_MEMSLOTS */) = 32764
140900 ioctl(3&amp;lt;/dev/kvm&amp;lt;char 10:232&amp;gt;&amp;gt;, 0xae03 /* KVM_CHECK_EXTENSION */, 0x76 /* KVM_CAP_MULTI_ADDRESS_SPACE */) = 2
140900 ioctl(3&amp;lt;/dev/kvm&amp;lt;char 10:232&amp;gt;&amp;gt;, 0xae01 /* KVM_CREATE_VM */, 0) = 9&amp;lt;anon_inode:kvm-vm&amp;gt;
140900 ioctl(9&amp;lt;anon_inode:kvm-vm&amp;gt;, 0xae03 /* KVM_CHECK_EXTENSION */, 0x9 /* KVM_CAP_NR_VCPUS */) = 4
140900 ioctl(3&amp;lt;/dev/kvm&amp;lt;char 10:232&amp;gt;&amp;gt;, 0xae03 /* KVM_CHECK_EXTENSION */, 0x42 /* KVM_CAP_MAX_VCPUS */) = 4096
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can see that QEMU is opening &lt;code&gt;/dev/kvm&lt;/code&gt; and that it's checking API version&lt;br&gt;
and various extensions. We may skip these checks and focus on the calls that&lt;br&gt;
look most important; one of these here is &lt;code&gt;KVM_CREATE_VM&lt;/code&gt; which also returns a&lt;br&gt;
file descriptor &lt;code&gt;9&amp;lt;anon_inode:kvm-vm&amp;gt;&lt;/code&gt; which we can use as a further reference.&lt;/p&gt;
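
&lt;p&gt;Since the log is long, a small script can summarize which KVM requests appear and how often. The following is a minimal sketch of that idea, assuming the log was written with &lt;code&gt;-X verbose&lt;/code&gt; as above, so that every request number is followed by its name in a comment. The function names &lt;code&gt;count_kvm_requests&lt;/code&gt; and &lt;code&gt;summarize&lt;/code&gt; are made up for this example.&lt;/p&gt;

```python
import os
import re
from collections import Counter

# Matches the request argument of an ioctl line as printed by
# "strace -X verbose", e.g. "0xae03 /* KVM_CHECK_EXTENSION */".
REQUEST_RE = re.compile(r"ioctl\([^,]*, 0x[0-9a-f]+ /\* (KVM_[A-Z0-9_]+) \*/")


def count_kvm_requests(lines):
    """Count how often each KVM ioctl request name appears in the log."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def summarize(path="kvm.log"):
    """Print one line per request name, most frequent first."""
    with open(path, encoding="utf-8", errors="replace") as log:
        for name, count in count_kvm_requests(log).most_common():
            print(f"{count:8d}  {name}")


if __name__ == "__main__":
    # "kvm.log" is the file name passed to "strace -o" earlier.
    if os.path.exists("kvm.log"):
        summarize("kvm.log")
```

&lt;p&gt;Only the second &lt;code&gt;ioctl&lt;/code&gt; argument is counted, so capability names such as &lt;code&gt;KVM_CAP_NR_VCPUS&lt;/code&gt; do not show up as requests. This makes it easier to spot the rare calls among the many capability checks.&lt;/p&gt;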
&lt;h4&gt;Setting up memory regions&lt;/h4&gt;

&lt;p&gt;We know QEMU must eventually load firmware and guest memory into the VM. Looking&lt;br&gt;
for file operations after &lt;code&gt;KVM_CREATE_VM&lt;/code&gt;, we quickly encounter SeaBIOS being&lt;br&gt;
loaded:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;140900 openat(-100 /* AT_FDCWD */&amp;lt;/home/stjepan/Develop/KVM&amp;gt;, "/usr/share/seabios/bios-256k.bin", 0 /* O_RDONLY */) = 12&amp;lt;/usr/share/seabios/bios-256k.bin&amp;gt;
140900 mmap(NULL, 2359296, 0 /* PROT_NONE */, 0x22 /* MAP_PRIVATE|MAP_ANONYMOUS */, -1, 0) = 0x776900dc1000
140900 mmap(0x776900e00000, 262144, 0x3 /* PROT_READ|PROT_WRITE */, 0x32 /* MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS */, -1, 0) = 0x776900e00000
140900 openat(-100 /* AT_FDCWD */&amp;lt;/home/stjepan/Develop/KVM&amp;gt;, "/usr/share/seabios/bios-256k.bin", 0 /* O_RDONLY */) = 12&amp;lt;/usr/share/seabios/bios-256k.bin&amp;gt;
140900 mmap(NULL, 266240, 0x3 /* PROT_READ|PROT_WRITE */, 0x22 /* MAP_PRIVATE|MAP_ANONYMOUS */, -1, 0) = 0x776900fc0000
140900 read(12&amp;lt;/usr/share/seabios/bios-256k.bin&amp;gt;, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 262144) = 262144
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can see QEMU opening and reading the SeaBIOS binary and reserving memory along&lt;br&gt;
with it. We can take the return value of the &lt;code&gt;read&lt;/code&gt; system call (the size of the&lt;br&gt;
SeaBIOS binary, 262144 bytes) and search the log for it to find out&lt;br&gt;
what QEMU asks KVM to do with SeaBIOS:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;140900 ioctl(9&amp;lt;anon_inode:kvm-vm&amp;gt;, 0x4020ae46 /* KVM_SET_USER_MEMORY_REGION */, {slot=3, flags=0x2 /* KVM_MEM_READONLY */, guest_phys_addr=0xfffc0000, memory_size=262144, userspace_addr=0x776900e00000}) = 0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So we see that QEMU registers a memory region with KVM at guest&lt;br&gt;
physical address &lt;code&gt;0xfffc0000&lt;/code&gt;, backed by the userspace address &lt;code&gt;0x776900e00000&lt;/code&gt; that was&lt;br&gt;
returned by the second &lt;code&gt;mmap&lt;/code&gt; call in the trace above. In other words, KVM does not&lt;br&gt;
allocate guest RAM itself; userspace applications such as QEMU remain&lt;br&gt;
responsible for managing the backing memory.&lt;/p&gt;
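
&lt;p&gt;The request number &lt;code&gt;0x4020ae46&lt;/code&gt; in that trace line is not arbitrary. The sketch below takes it apart, assuming the generic Linux ioctl encoding used on x86 (8 bits for the number, 8 bits for the type, 14 bits for the argument size, 2 bits for the direction) and assuming the argument is a structure with two 32-bit and three 64-bit fields, matching the five values strace prints. The helper names &lt;code&gt;decode_ioctl&lt;/code&gt; and &lt;code&gt;encode_ioctl&lt;/code&gt; are made up for this example.&lt;/p&gt;

```python
import struct

# Field widths of the generic Linux ioctl encoding (asm-generic/ioctl.h).
NR_BITS, TYPE_BITS, SIZE_BITS = 8, 8, 14
IOC_WRITE = 1  # userspace writes the argument, the kernel reads it

KVMIO = 0xAE  # the "type" byte shared by all KVM requests in the traces

# slot, flags (32-bit each), then guest_phys_addr, memory_size and
# userspace_addr (64-bit each), as printed by strace.
MEMORY_REGION = struct.Struct("=IIQQQ")


def decode_ioctl(request):
    """Split an ioctl request number into (direction, size, type, nr)."""
    request, nr = divmod(request, 2 ** NR_BITS)
    request, type_ = divmod(request, 2 ** TYPE_BITS)
    direction, size = divmod(request, 2 ** SIZE_BITS)
    return direction, size, type_, nr


def encode_ioctl(direction, size, type_, nr):
    """Inverse of decode_ioctl()."""
    value = direction * 2 ** SIZE_BITS + size
    value = value * 2 ** TYPE_BITS + type_
    return value * 2 ** NR_BITS + nr


if __name__ == "__main__":
    # Values taken from the KVM_SET_USER_MEMORY_REGION trace line.
    print(decode_ioctl(0x4020AE46))
    region = MEMORY_REGION.pack(3, 0x2, 0xFFFC0000, 262144, 0x776900E00000)
    print(len(region), hex(encode_ioctl(IOC_WRITE, len(region), KVMIO, 0x46)))
```

&lt;p&gt;Decoding yields direction 1 (write), size 32, type &lt;code&gt;0xae&lt;/code&gt; and number &lt;code&gt;0x46&lt;/code&gt;. The packed structure is 32 bytes long, so encoding those four fields again reproduces &lt;code&gt;0x4020ae46&lt;/code&gt;.&lt;/p&gt;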
&lt;h4&gt;Creating vCPU and running&lt;/h4&gt;

&lt;p&gt;Now, it gets very busy in the logs, but most of the stuff we see is still just&lt;br&gt;
checking for extensions and capabilities. However, if we take a look at the&lt;br&gt;
tail of the log, we will see a lot of these &lt;code&gt;ioctl&lt;/code&gt; calls:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;140904 ioctl(10&amp;lt;anon_inode:kvm-vcpu:0&amp;gt;, 0xae80 /* KVM_RUN */, 0) = 0
140904 ioctl(10&amp;lt;anon_inode:kvm-vcpu:0&amp;gt;, 0xaeb7 /* KVM_SMI */, 0) = 0
140904 ioctl(10&amp;lt;anon_inode:kvm-vcpu:0&amp;gt;, 0xae80 /* KVM_RUN */, 0) = 0
140904 ioctl(10&amp;lt;anon_inode:kvm-vcpu:0&amp;gt;, 0xae80 /* KVM_RUN */, 0) = 0
140904 ioctl(10&amp;lt;anon_inode:kvm-vcpu:0&amp;gt;, 0xae80 /* KVM_RUN */, 0) = 0
140904 ioctl(10&amp;lt;anon_inode:kvm-vcpu:0&amp;gt;, 0xae80 /* KVM_RUN */, 0) = 0
140904 ioctl(10&amp;lt;anon_inode:kvm-vcpu:0&amp;gt;, 0xaeb7 /* KVM_SMI */, 0) = 0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now both &lt;code&gt;KVM_RUN&lt;/code&gt; and &lt;code&gt;KVM_SMI&lt;/code&gt; are operating on a &lt;code&gt;kvm-vcpu&lt;/code&gt; file descriptor,&lt;br&gt;
something we haven't yet seen. So if we search the logs for it, we can actually&lt;br&gt;
see where it's created:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;140904 ioctl(9&amp;lt;anon_inode:kvm-vm&amp;gt;, 0xae41 /* KVM_CREATE_VCPU */, 0) = 10&amp;lt;anon_inode:kvm-vcpu:0&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;Now we have a more complete picture of how QEMU sets up KVM. First,&lt;br&gt;
&lt;code&gt;/dev/kvm&lt;/code&gt; is opened to obtain a file descriptor representing the KVM subsystem.&lt;br&gt;
From it we create a new virtual machine and get a &lt;code&gt;kvm-vm&lt;/code&gt; file descriptor. On&lt;br&gt;
this file descriptor we set up memory regions and later use it to create&lt;br&gt;
a vCPU, on which we can call &lt;code&gt;KVM_RUN&lt;/code&gt;. The following diagram summarizes the hierarchy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/dev/kvm   kvm fd
    |
    +--&amp;gt; KVM_CREATE_VM   kvm-vm fd
            |
            +--&amp;gt; KVM_SET_USER_MEMORY_REGION
            |
            +--&amp;gt; KVM_CREATE_VCPU   kvm-vcpu fd
                            |
                            +--&amp;gt; KVM_RUN (loop)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As we can see from the logs, &lt;code&gt;KVM_RUN&lt;/code&gt; appears repeatedly during guest&lt;br&gt;
execution, while occasional &lt;code&gt;KVM_SMI&lt;/code&gt; calls inject System Management Interrupts&lt;br&gt;
into the guest. This repeated interaction between userspace and KVM is what&lt;br&gt;
ultimately drives virtual CPU execution.&lt;/p&gt;
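
&lt;p&gt;The file descriptor chain can be checked with a few lines of Python before writing anything more serious. This is only a sketch: it reuses the request numbers exactly as strace printed them, passes &lt;code&gt;0&lt;/code&gt; as the argument just like QEMU did in the traces, and stops before memory setup and &lt;code&gt;KVM_RUN&lt;/code&gt;. The function name &lt;code&gt;create_vm_and_vcpu&lt;/code&gt; is made up for this example.&lt;/p&gt;

```python
import fcntl
import os

# Request numbers as printed by strace in the traces above.
KVM_GET_API_VERSION = 0xAE00
KVM_CREATE_VM = 0xAE01
KVM_CREATE_VCPU = 0xAE41


def create_vm_and_vcpu(path="/dev/kvm"):
    """Walk the fd chain: /dev/kvm, then a VM fd, then a vCPU fd.

    Returns the KVM API version, or None when KVM is unavailable.
    """
    try:
        kvm_fd = os.open(path, os.O_RDWR | os.O_CLOEXEC)
    except OSError:
        return None  # no /dev/kvm, or no permission to open it
    fds = [kvm_fd]
    try:
        # With an integer argument, fcntl.ioctl returns the ioctl result.
        version = fcntl.ioctl(kvm_fd, KVM_GET_API_VERSION, 0)
        vm_fd = fcntl.ioctl(kvm_fd, KVM_CREATE_VM, 0)
        fds.append(vm_fd)
        vcpu_fd = fcntl.ioctl(vm_fd, KVM_CREATE_VCPU, 0)  # vCPU id 0
        fds.append(vcpu_fd)
        # KVM_SET_USER_MEMORY_REGION and KVM_RUN would follow here.
        return version
    except OSError:
        return None  # KVM refused one of the requests
    finally:
        for fd in reversed(fds):
            os.close(fd)


if __name__ == "__main__":
    print("KVM API version:", create_vm_and_vcpu())
```

&lt;p&gt;On a machine where the current user may open &lt;code&gt;/dev/kvm&lt;/code&gt;, the printed version should match the value returned by &lt;code&gt;KVM_GET_API_VERSION&lt;/code&gt; in the trace; otherwise the function returns &lt;code&gt;None&lt;/code&gt;.&lt;/p&gt;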

&lt;p&gt;Next time we will recreate this exact behavior in Rust and fill in the few&lt;br&gt;
missing pieces needed to get our first virtual machine running on KVM.&lt;/p&gt;

</description>
      <category>kvm</category>
      <category>linux</category>
      <category>qemu</category>
    </item>
    <item>
      <title>Automating Stack Corruption Analysis in GDB with Python</title>
      <dc:creator>Stjepan</dc:creator>
      <pubDate>Mon, 25 May 2026 12:12:18 +0000</pubDate>
      <link>https://dev.to/stjepan86/automating-stack-corruption-analysis-in-gdb-with-python-1h3</link>
      <guid>https://dev.to/stjepan86/automating-stack-corruption-analysis-in-gdb-with-python-1h3</guid>
      <description>&lt;h2&gt;A bug in my operating system&lt;/h2&gt;

&lt;p&gt;During a recent visit to my wife's family in Sarajevo, I decided to revisit my&lt;br&gt;
hobby operating system in QEMU. I discovered that the boot process consistently&lt;br&gt;
froze while printing the BIOS memory map. What initially looked like a&lt;br&gt;
protected-mode issue eventually turned into a useful exercise in automating&lt;br&gt;
debugging using GDB's Python scripting.&lt;/p&gt;
&lt;h2&gt;Manual debugging failure&lt;/h2&gt;

&lt;p&gt;After re-checking some common pitfalls in my protected-mode setup, I became more&lt;br&gt;
confident that the issue was stack corruption in my &lt;code&gt;print_memory_map&lt;/code&gt; function.&lt;br&gt;
If I had an unmatched push or pop, it could have corrupted return addresses and&lt;br&gt;
eventually redirected execution flow into invalid memory. Single-stepping&lt;br&gt;
through the routine manually quickly became impractical. The function mixed BIOS&lt;br&gt;
interrupt handling, memory map parsing, and multiple helper calls, making it&lt;br&gt;
difficult to reason about stack state over time.&lt;/p&gt;
&lt;h2&gt;Setting up GDB scripting in Python&lt;/h2&gt;

&lt;p&gt;I needed to automate this and GDB's integration with Python was the most&lt;br&gt;
promising route I could take. The idea was to do exactly what I started&lt;br&gt;
manually: break at a specific (suspicious) function and then start single&lt;br&gt;
stepping while inspecting how the stack pointer behaved.&lt;/p&gt;

&lt;p&gt;First things first, we import the &lt;code&gt;gdb&lt;/code&gt; module in Python, connect to the remote&lt;br&gt;
target and set up a breakpoint (this is done by subclassing&lt;br&gt;
&lt;code&gt;gdb.Breakpoint&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;gdb&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;StackTraceBreakpoint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gdb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Breakpoint&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;func_name&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nf"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;StackTraceBreakpoint&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;active&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;stop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;active&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Recursion detected - stopping.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;active&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;gdb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;target remote :5555&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;gdb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;symbol-file ../build/arch/x86/bios-legacy/boot-stage1-5.elf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;tracer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StackTraceBreakpoint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;print_memory_map&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;gdb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;continue&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;StackTraceBreakpoint&lt;/h3&gt;

&lt;p&gt;This is the bare-bones version of what I wanted to do. The code simply adds&lt;br&gt;
a breakpoint with custom logic after connecting to QEMU and loading symbols.&lt;br&gt;
As soon as we call &lt;code&gt;gdb.execute("continue")&lt;/code&gt;, GDB resumes the target and, if and when it hits&lt;br&gt;
our breakpoint, runs whatever we wrote in the &lt;code&gt;stop()&lt;/code&gt; method. For&lt;br&gt;
now I only added a guard that refuses to analyze recursion (I didn't&lt;br&gt;
use any in my code anyway, and the logic would be a bit more complex).&lt;/p&gt;
&lt;h3&gt;Single-stepping&lt;/h3&gt;

&lt;p&gt;Now what we need to do is start single-stepping after we hit our breakpoint. So&lt;br&gt;
we add a while loop with &lt;code&gt;gdb.execute("stepi")&lt;/code&gt; after &lt;code&gt;gdb.execute("continue")&lt;/code&gt;.&lt;br&gt;
Note that the &lt;code&gt;stepi&lt;/code&gt; command executes a single machine instruction at a time, not a&lt;br&gt;
source code statement. Also note that we cannot start single-stepping in the&lt;br&gt;
&lt;code&gt;stop()&lt;/code&gt; method, because GDB won't be in a state which can accept these kinds of&lt;br&gt;
debugging requests.&lt;/p&gt;
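
&lt;p&gt;The shape of that loop can be sketched as follows. This is not the code from my script: the names &lt;code&gt;run_trace&lt;/code&gt; and &lt;code&gt;on_instruction&lt;/code&gt; and the &lt;code&gt;max_steps&lt;/code&gt; limit are assumptions for this example, and the command runner is passed in as a parameter so the loop can also be tried outside GDB.&lt;/p&gt;

```python
def run_trace(execute, on_instruction, max_steps=100000):
    """Continue to the breakpoint, then single-step until told to stop.

    execute        -- callable taking a GDB command string (gdb.execute
                      when running inside GDB)
    on_instruction -- hypothetical callback; inspects the current
                      instruction and returns False once tracing should end
    """
    execute("continue")  # returns when the target stops, e.g. at the breakpoint
    steps = 0
    while steps != max_steps and on_instruction():
        execute("stepi")  # advance by exactly one machine instruction
        steps += 1
    return steps


if __name__ == "__main__":
    # Dry run with stand-ins: print the commands instead of sending them.
    remaining = iter([True, True, False])
    print(run_trace(print, lambda: next(remaining)))
```

&lt;p&gt;Inside GDB, the call would look like &lt;code&gt;run_trace(gdb.execute, callback)&lt;/code&gt;, where &lt;code&gt;callback&lt;/code&gt; is whatever function inspects the current instruction. The step limit is there only so that a target stuck in a loop cannot keep the script running forever.&lt;/p&gt;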
&lt;h2&gt;Detecting stack imbalance&lt;/h2&gt;

&lt;p&gt;Furthermore, I have wrapped this single-stepping logic in a &lt;code&gt;trace_step()&lt;/code&gt; method&lt;br&gt;
in our breakpoint class. This method is not part of the GDB breakpoint API, but&lt;br&gt;
rather a convenience for tracking the number of pushes and pops in a&lt;br&gt;
consistent manner. To run this script we need to call:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gdb &lt;span class="nt"&gt;-ex&lt;/span&gt; &lt;span class="s1"&gt;'source debug-stack.py'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Another thing we need to track is whether we have entered another function (in&lt;br&gt;
which case we won't be counting pushes and pops) and if we have returned from&lt;br&gt;
it. If I ever wanted to inspect routines being called, I would just run the same&lt;br&gt;
script for them (my intention here is not creating a custom emulator on top of&lt;br&gt;
GDB). So I added a counter &lt;code&gt;func_count&lt;/code&gt; which will increase on &lt;code&gt;call&lt;/code&gt; and&lt;br&gt;
decrease on &lt;code&gt;ret&lt;/code&gt; instruction. Here is a rough idea:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;trace_step&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;insn_full&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gdb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;x/i $pc&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;to_string&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;insn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;^=&amp;gt;.*:\s*([^\s].*)$&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;insn_full&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;group&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="n"&gt;match&lt;/span&gt; &lt;span class="n"&gt;insn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;case&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;push&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;func_count&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pushes&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="n"&gt;case&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pop&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;func_count&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pops&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="n"&gt;case&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;call&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;func_count&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="n"&gt;case&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ret&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;func_count&lt;/span&gt; &lt;span class="o"&gt;-=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here you can see how I'm extracting the instruction mnemonic from GDB's output. What is left&lt;br&gt;
is improving the logic and fetching registers like PC, SP and CS for&lt;br&gt;
debugging. I log everything into a file, as GDB can get really noisy on&lt;br&gt;
standard output (I didn't find a way to turn off all logging in GDB completely&lt;br&gt;
when single-stepping). The script is available in my GitHub repository:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/StjepanPoljak/raspios/tree/master/scripts/debug-stack.py" rel="noopener noreferrer"&gt;https://github.com/StjepanPoljak/raspios/tree/master/scripts/debug-stack.py&lt;/a&gt;&lt;/p&gt;
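
&lt;p&gt;The counting idea itself can be tried without GDB by feeding it a plain list of mnemonics. The following is a simplified sketch and not the script from the repository: the names &lt;code&gt;StackBalance&lt;/code&gt;, &lt;code&gt;feed&lt;/code&gt; and &lt;code&gt;check&lt;/code&gt; and the verdict strings are made up for this example, and only the exact mnemonics &lt;code&gt;push&lt;/code&gt;, &lt;code&gt;pop&lt;/code&gt;, &lt;code&gt;call&lt;/code&gt; and &lt;code&gt;ret&lt;/code&gt; are recognized.&lt;/p&gt;

```python
class StackBalance:
    """Track pushes and pops at call depth 0 of the traced function."""

    def __init__(self):
        self.balance = 0     # pushes minus pops at call depth 0
        self.func_count = 0  # depth of nested calls we stepped into

    def feed(self, insn):
        """Process one mnemonic; return a verdict string or None."""
        if insn == "call":
            self.func_count += 1
        elif insn == "ret":
            if self.func_count == 0:
                # Leaving the traced function: every push needs its pop.
                return "ok" if self.balance == 0 else "missing pop"
            self.func_count -= 1
        elif self.func_count == 0:
            if insn == "push":
                self.balance += 1
            elif insn == "pop":
                self.balance -= 1
                if self.balance == -1:
                    return "extra pop"
        return None


def check(mnemonics):
    """Feed mnemonics in order and return the first verdict."""
    tracker = StackBalance()
    for insn in mnemonics:
        verdict = tracker.feed(insn)
        if verdict is not None:
            return verdict
    return "no ret seen"


if __name__ == "__main__":
    # Four pushes, one helper call, four matching pops.
    balanced = ["push"] * 4 + ["call", "push", "pop", "ret"] + ["pop"] * 4
    print(check(balanced + ["ret"]))
    # The same routine with one additional pop before the epilogue.
    print(check(["push"] * 4 + ["pop"] + ["pop"] * 4 + ["ret"]))
```

&lt;p&gt;The first sequence is reported as &lt;code&gt;ok&lt;/code&gt; and the second as &lt;code&gt;extra pop&lt;/code&gt;. Pushes and pops inside called helpers are ignored on purpose, in line with the &lt;code&gt;func_count&lt;/code&gt; logic described above.&lt;/p&gt;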
&lt;h2&gt;Finding the root cause&lt;/h2&gt;

&lt;p&gt;Finally, you can see an example output detecting my very issue:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[START] print_memory_map SP=fffd
[0000:8183] push %ax (SP=0xfffd)
[0000:8185] push %bx (SP=0xfff9)
[0000:8187] push %cx (SP=0xfff5)
[0000:8189] push %dx (SP=0xfff1)
[0000:81a5] call 0x66e980a5 (SP=0xffed)
[0000:80a3] push %ax (SP=0xffeb)
[0000:80a7] call 0xab18056 (SP=0xffe7)
(...)
[0000:806e] ret  (SP=0xffef)
[0000:8098] call 0xf6ec8056 (SP=0xfff1)
[0000:8054] push %ax (SP=0xffef)
[0000:8056] push %bx (SP=0xffeb)
[0000:806a] pop %bx (SP=0xffe7)
[0000:806c] pop %ax (SP=0xffeb)
[0000:806e] ret  (SP=0xffef)
[0000:8098] call 0xf6ec8056 (SP=0xfff1)
[0000:8054] push %ax (SP=0xffef)
[0000:8056] push %bx (SP=0xffeb)
[0000:806a] pop %bx (SP=0xffe7)
[0000:806c] pop %ax (SP=0xffeb)
[0000:806e] ret  (SP=0xffef)
[0000:809d] pop %esi (SP=0xfff1)
[0000:809e] pop %bx (SP=0xfff3)
[0000:80a0] pop %ax (SP=0xfff7)
[0000:80a2] ret  (SP=0xfffb)
[FAIL] Extra pop detected at [0000:81fa].
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So the real culprit was an extra &lt;code&gt;pop eax&lt;/code&gt; in my &lt;code&gt;print_memory_map&lt;/code&gt; routine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;print_memory_map:
    push eax
    push ebx
    push ecx
    push edx

; --- omitted print loop logic ---

.noprint_newline:
    pop eax 

    cmp ecx, [memory_map_size]
    jne .print_memory_map_loop

    pop edx
    pop ecx
    pop ebx
    pop eax
    ret
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Removing this extra &lt;code&gt;pop eax&lt;/code&gt; makes my debugging script pass.&lt;/p&gt;

&lt;h2&gt;Try it out yourself&lt;/h2&gt;

&lt;p&gt;You can try it out yourself: just check out my operating system, &lt;code&gt;raspios&lt;/code&gt;, on&lt;br&gt;
GitHub:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/StjepanPoljak/raspios" rel="noopener noreferrer"&gt;https://github.com/StjepanPoljak/raspios&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Build and run it with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir &lt;/span&gt;build
&lt;span class="nb"&gt;cd &lt;/span&gt;build
&lt;span class="nv"&gt;ARCH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;x86 cmake ..
make
make qemu_debug
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, in the &lt;code&gt;scripts&lt;/code&gt; folder run &lt;code&gt;gdb -ex 'source debug-stack.py'&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;This was a good reminder that low-level debugging often benefits from&lt;br&gt;
lightweight tooling tailored to the problem at hand. In this case, a small&lt;br&gt;
amount of Python automation around GDB made stack corruption analysis&lt;br&gt;
significantly more manageable than manual instruction tracing.&lt;/p&gt;

</description>
      <category>python</category>
      <category>gdb</category>
      <category>osdev</category>
      <category>assembly</category>
    </item>
  </channel>
</rss>
