<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ignat Korchagin</title>
    <description>The latest articles on DEV Community by Ignat Korchagin (@ignatk).</description>
    <link>https://dev.to/ignatk</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F369299%2Fc2deae55-453d-4f21-bcb0-7f03cd2b6a91.jpeg</url>
      <title>DEV Community: Ignat Korchagin</title>
      <link>https://dev.to/ignatk</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ignatk"/>
    <language>en</language>
    <item>
      <title>A Business Idea to Give Away: Fuel Delivery</title>
      <dc:creator>Ignat Korchagin</dc:creator>
      <pubDate>Wed, 13 Oct 2021 15:11:23 +0000</pubDate>
      <link>https://dev.to/ignatk/a-business-idea-to-give-away-fuel-delivery-3c9c</link>
      <guid>https://dev.to/ignatk/a-business-idea-to-give-away-fuel-delivery-3c9c</guid>
      <description>&lt;h2&gt;
  
  
  It’s like Uber Eats but for car fuel
&lt;/h2&gt;

&lt;p&gt;The recent &lt;a href="https://www.bbc.co.uk/news/business-58747281"&gt;petrol (or gasoline, for our US readers) crisis in the UK&lt;/a&gt; made me wonder whether we could introduce new ways to refuel our cars. With so many things delivered to our doorsteps at the snap of our fingers (or, more accurately, with a tap of a finger on a mobile screen), why can’t car fuel be one of them?&lt;/p&gt;

&lt;h3&gt;
  
  
  UK fuel shortage revisited
&lt;/h3&gt;

&lt;p&gt;The crisis in the UK is caused mainly not by a lack of fuel itself, but by a supply chain failure. A shortage of HGV (heavy goods vehicle) drivers is causing many fuel stations to run out of fuel, because they are not refilled in time. This in turn creates an avalanche effect: drivers, seeing many stations closed, begin stockpiling fuel, pushing demand and consumption even higher.&lt;/p&gt;

&lt;p&gt;Despite repeated calls from the UK Government not to panic buy, drivers keep forming lengthy queues at fuel stations and try to fill every available fuel can, as well as their car’s tank, with petrol or diesel. The market need is apparent here: drivers want to be able to refuel their cars whenever they want and with however much they want. Anything less is simply not tolerated.&lt;/p&gt;

&lt;h3&gt;
  
  
  Solving the HGV driver shortage
&lt;/h3&gt;

&lt;p&gt;A textbook approach to dealing with unavailable goods or services is to look for substitutes. HGV drivers are a scarce resource, as one needs a special license and training to drive huge trucks, especially ones transporting flammable materials. But unlike cars or containers, for example, fuel is a liquid and can be divided into any number of smaller shipments. So there is no reason why fuel can’t be transported by smaller vehicles the size of a large van. Of course, these “large vans” have to be adapted for fuel transportation, which might require a big upfront investment. However, that investment might be justifiable for a country-wide venture.&lt;/p&gt;

&lt;p&gt;This is where fuel delivery comes into play: sending a fleet of “large vans” instead of one big HGV to refuel a single petrol station might not be economically feasible. However, it might make sense to do so to cover a whole neighbourhood of scattered customers who want their fuel delivered to their doorsteps.&lt;/p&gt;

&lt;h3&gt;
  
  
  Taming the surging fuel demand
&lt;/h3&gt;

&lt;p&gt;Another advantage of home fuel delivery is that it can ease drivers’ anxiety about not getting their fuel in time. As we established above, if drivers can’t refill their tanks when they want to, they begin to panic. However, the surging demand can be partially addressed not only by immediate delivery, but also by a promise to deliver the fuel at a certain later time.&lt;/p&gt;

&lt;p&gt;Imagine you’re a driver who needs fuel “soon”. You know you might have to spend some time finding a petrol station, and then face hours in the queue once you get there. If you had an alternative, an app that lets you book a fuel delivery for later today, you would probably seriously consider it.&lt;/p&gt;

&lt;p&gt;With some kind of slot system, such a booking app can flatten surging demand over a longer period of time, as well as distribute consumption more evenly across space, instead of all drivers in a densely populated area going to the single closest petrol station at once. And by offering advance bookings we mostly solve the panic buying problem, as being able to prebook a fuel delivery reduces drivers’ anxiety.&lt;/p&gt;

&lt;h3&gt;
  
  
  Will this venture be useful when the crisis passes?
&lt;/h3&gt;

&lt;p&gt;Probably. Many people are willing to pay extra to have their food delivered to their doorsteps. And we are not talking only about fancy restaurant dishes, but also about low-end fast food chains and coffee shops. Most of the time petrol stations are located close to shopping malls and food places, so if people are not willing to go to food courts themselves, they might think twice about spending their time going to a petrol station, should they get an alternative.&lt;/p&gt;

&lt;p&gt;It is worth noting that even if fuel home delivery ends up not being viable, the fuel booking system may be useful on its own and probably requires much less upfront investment. How many times, even before the crisis, did you show up at a petrol station before the weekend or holidays just to find yourself in a queue, because everyone else did the same? What if instead you could just book a time slot at the station in the same way we book doctor appointments, and get your fuel without any waiting or other inconvenience? That definitely sounds more appealing to me…&lt;/p&gt;

&lt;h3&gt;
  
  
  Is the idea new?
&lt;/h3&gt;

&lt;p&gt;Not really. Emergency and breakdown services already deliver small amounts of fuel when needed. However, their use case is limited to emergency roadside assistance.&lt;/p&gt;

&lt;p&gt;Surprisingly, there was a UK startup, Zebra Fuel, which attempted exactly this idea in the past; however, they &lt;a href="https://techcrunch.com/2019/11/28/zebra-fuel-no-longer-delivering-london/"&gt;ceased their operations in 2019&lt;/a&gt;. Somehow I had never heard of them before, so it seems they never reached the point of viral marketing. And it is not clear why they pulled out of this &lt;a href="https://techcrunch.com/2018/02/05/zebra-fuel/"&gt;seemingly successful business&lt;/a&gt;. Perhaps the timing just wasn’t right, and the current fuel crisis would have given them a rocket boost.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Do you think this business is viable? Why/why not? (Leave a comment)&lt;/em&gt;&lt;/p&gt;

</description>
      <category>startup</category>
      <category>business</category>
      <category>fuel</category>
      <category>delivery</category>
    </item>
    <item>
      <title>How to execute an object file: Part 3</title>
      <dc:creator>Ignat Korchagin</dc:creator>
      <pubDate>Mon, 13 Sep 2021 16:21:53 +0000</pubDate>
      <link>https://dev.to/ignatk/how-to-execute-an-object-file-part-3-ab0</link>
      <guid>https://dev.to/ignatk/how-to-execute-an-object-file-part-3-ab0</guid>
      <description>&lt;h2&gt;
  
  
  Dealing with external libraries
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;This is a repost of my post from the &lt;a href="https://blog.cloudflare.com/how-to-execute-an-object-file-part-3/"&gt;Cloudflare Blog&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In &lt;a href="https://dev.to/ignatk/how-to-execute-an-object-file-part-2-2fmo"&gt;part 2 of our series&lt;/a&gt; we learned how to process relocations in object files in order to properly wire up internal dependencies in the code. In this post we will look into what happens if the code has external dependencies, that is, if it tries to call functions from external libraries. As before, we will be building upon &lt;a href="https://github.com/cloudflare/cloudflare-blog/tree/master/2021-03-obj-file/2"&gt;the code from part 2&lt;/a&gt;. Let's add another function to our toy object file:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;obj.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="cp"&gt;#include &amp;lt;stdio.h&amp;gt;
&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;say_hello&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;puts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Hello, world!"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the above scenario our &lt;code&gt;say_hello&lt;/code&gt; function now depends on the &lt;code&gt;puts&lt;/code&gt; &lt;a href="https://man7.org/linux/man-pages/man3/puts.3.html"&gt;function from the C standard library&lt;/a&gt;. To try it out we also need to modify our &lt;code&gt;loader&lt;/code&gt; to import the new function and execute it:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;loader.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;execute_funcs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="cm"&gt;/* pointers to imported functions */&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;add5&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;add10&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;get_hello&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;get_var&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;set_var&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;say_hello&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;

    &lt;span class="n"&gt;say_hello&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lookup_function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"say_hello"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;say_hello&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to find say_hello function&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ENOENT&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;puts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Executing say_hello..."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;say_hello&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's run it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gcc &lt;span class="nt"&gt;-c&lt;/span&gt; obj.c
&lt;span class="nv"&gt;$ &lt;/span&gt;gcc &lt;span class="nt"&gt;-o&lt;/span&gt; loader loader.c
&lt;span class="nv"&gt;$ &lt;/span&gt;./loader
No runtime base address &lt;span class="k"&gt;for &lt;/span&gt;section
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It seems something went wrong when the &lt;code&gt;loader&lt;/code&gt; tried to process relocations, so let's check the relocations table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;readelf &lt;span class="nt"&gt;--relocs&lt;/span&gt; obj.o

Relocation section &lt;span class="s1"&gt;'.rela.text'&lt;/span&gt; at offset 0x3c8 contains 7 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000020  000a00000004 R_X86_64_PLT32    0000000000000000 add5 - 4
00000000002d  000a00000004 R_X86_64_PLT32    0000000000000000 add5 - 4
00000000003a  000500000002 R_X86_64_PC32     0000000000000000 .rodata - 4
000000000046  000300000002 R_X86_64_PC32     0000000000000000 .data - 4
000000000058  000300000002 R_X86_64_PC32     0000000000000000 .data - 4
000000000066  000500000002 R_X86_64_PC32     0000000000000000 .rodata - 4
00000000006b  001100000004 R_X86_64_PLT32    0000000000000000 puts - 4
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The compiler generated a relocation for the &lt;code&gt;puts&lt;/code&gt; invocation. The relocation type is &lt;code&gt;R_X86_64_PLT32&lt;/code&gt; and our &lt;code&gt;loader&lt;/code&gt; already knows how to process these, so the problem is elsewhere. The above entry shows that the relocation references entry 17 (&lt;code&gt;0x11&lt;/code&gt; in hex) in the symbol table, so let's check that:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;readelf &lt;span class="nt"&gt;--symbols&lt;/span&gt; obj.o

Symbol table &lt;span class="s1"&gt;'.symtab'&lt;/span&gt; contains 18 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS obj.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    3
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    4
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    5
     6: 0000000000000000     4 OBJECT  LOCAL  DEFAULT    3 var
     7: 0000000000000000     0 SECTION LOCAL  DEFAULT    7
     8: 0000000000000000     0 SECTION LOCAL  DEFAULT    8
     9: 0000000000000000     0 SECTION LOCAL  DEFAULT    6
    10: 0000000000000000    15 FUNC    GLOBAL DEFAULT    1 add5
    11: 000000000000000f    36 FUNC    GLOBAL DEFAULT    1 add10
    12: 0000000000000033    13 FUNC    GLOBAL DEFAULT    1 get_hello
    13: 0000000000000040    12 FUNC    GLOBAL DEFAULT    1 get_var
    14: 000000000000004c    19 FUNC    GLOBAL DEFAULT    1 set_var
    15: 000000000000005f    19 FUNC    GLOBAL DEFAULT    1 say_hello
    16: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND _GLOBAL_OFFSET_TABLE_
    17: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND puts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Oh! The section index for the &lt;code&gt;puts&lt;/code&gt; function is &lt;code&gt;UND&lt;/code&gt; (essentially &lt;code&gt;0&lt;/code&gt; in the code), which makes total sense: unlike previous symbols, &lt;code&gt;puts&lt;/code&gt; is an external dependency, and it is not implemented in our &lt;code&gt;obj.o&lt;/code&gt; file. Therefore, it can't be a part of any section within &lt;code&gt;obj.o&lt;/code&gt;.&lt;br&gt;
So how do we resolve this relocation? We need to somehow make the code jump to a &lt;code&gt;puts&lt;/code&gt; implementation. Our &lt;code&gt;loader&lt;/code&gt; actually already has access to the C library &lt;code&gt;puts&lt;/code&gt; function (because it is written in C and we've already used &lt;code&gt;puts&lt;/code&gt; in the &lt;code&gt;loader&lt;/code&gt; code itself), but technically it doesn't have to be the C library &lt;code&gt;puts&lt;/code&gt;, just some &lt;code&gt;puts&lt;/code&gt; implementation. For completeness, let's implement our own custom &lt;code&gt;puts&lt;/code&gt; function in the &lt;code&gt;loader&lt;/code&gt;, which is just a wrapper around the C library &lt;code&gt;puts&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;loader.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="cm"&gt;/* external dependencies for obj.o */&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;my_puts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;puts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"my_puts executed"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;puts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now that we have a &lt;code&gt;puts&lt;/code&gt; implementation (and thus its runtime address) we should just write logic in the &lt;code&gt;loader&lt;/code&gt; to resolve the relocation by instructing the code to jump to the correct function. However, there is one complication: in &lt;a href="https://dev.to/ignatk/how-to-execute-an-object-file-part-2-2fmo"&gt;part 2 of our series&lt;/a&gt;, when we processed relocations for constants and global variables, we learned we're mostly dealing with 32-bit relative relocations and that the code or data we're referencing needs to be no more than 2147483647 (&lt;code&gt;0x7fffffff&lt;/code&gt; in hex) bytes away from the relocation itself. &lt;code&gt;R_X86_64_PLT32&lt;/code&gt; is also a 32-bit relative relocation, so it has the same requirements, but unfortunately we can't reuse the trick from &lt;a href="https://dev.to/ignatk/how-to-execute-an-object-file-part-2-2fmo"&gt;part 2&lt;/a&gt; as our &lt;code&gt;my_puts&lt;/code&gt; function is part of the &lt;code&gt;loader&lt;/code&gt; itself and we don't have control over where in the address space the operating system places the &lt;code&gt;loader&lt;/code&gt; code.&lt;/p&gt;
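
&lt;p&gt;To make the range requirement concrete, here is a minimal sketch of such a check. It is not part of the series code: the &lt;code&gt;fits_rel32&lt;/code&gt; helper and its parameter names are hypothetical, and it assumes the usual formula for 32-bit relative relocations (target address plus addend minus the patched location):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;#include &amp;lt;stdint.h&amp;gt;

/* hypothetical helper, not from the original loader:
 * returns 1 if target is reachable from patch_addr with a signed
 * 32-bit relative offset, 0 otherwise */
static int fits_rel32(const uint8_t *target, const uint8_t *patch_addr, int64_t addend)
{
    /* assumed formula: target + addend - patched location */
    int64_t offset = (int64_t)(intptr_t)target + addend - (int64_t)(intptr_t)patch_addr;

    return offset &amp;gt;= INT32_MIN &amp;amp;&amp;amp; offset &amp;lt;= INT32_MAX;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For a function like &lt;code&gt;my_puts&lt;/code&gt;, which lives wherever the operating system mapped the &lt;code&gt;loader&lt;/code&gt;, a check like this may well fail when called with the address of the relocation inside our copied &lt;code&gt;.text&lt;/code&gt; section.&lt;/p&gt;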

&lt;p&gt;Luckily, we don't have to come up with any new solutions and can just borrow the approach used in shared libraries.&lt;/p&gt;

&lt;h3&gt;
  
  
  Exploring PLT/GOT
&lt;/h3&gt;

&lt;p&gt;Real-world ELF executables and shared libraries have the same problem: executables often depend on shared libraries, and shared libraries depend on other shared libraries. All of the different pieces of a complete runtime program may be mapped to random ranges in the process address space. When a shared library or an ELF executable is linked together, the linker enumerates all the external references and creates two or more additional sections (for a refresher on ELF sections check out &lt;a href="https://dev.to/ignatk/how-to-execute-an-object-file-part-1-2n9f"&gt;part 1 of our series&lt;/a&gt;) in the ELF file. The two mandatory ones are &lt;a href="https://refspecs.linuxfoundation.org/ELF/zSeries/lzsabi0_zSeries/x2251.html"&gt;the Procedure Linkage Table (PLT) and the Global Offset Table (GOT)&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;We will not deep-dive into specifics of the standard PLT/GOT implementation as there are many other great resources online, but in a nutshell PLT/GOT is just a jumptable for external code. At the linking stage the linker resolves all external 32-bit relative relocations with respect to a locally generated PLT/GOT table. It can do that, because this table would become part of the final ELF file itself, so it will be "close" to the main code, when the file is mapped into memory at runtime. Later, at runtime &lt;a href="https://man7.org/linux/man-pages/man8/ld.so.8.html"&gt;the dynamic loader&lt;/a&gt; populates PLT/GOT tables for every loaded ELF file (both the executable and the shared libraries) with the runtime addresses of all the dependencies. Eventually, when the program code calls some external library function, the CPU "jumps" through the local PLT/GOT table to the final code:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--naNuogHJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.cloudflare.com/content/images/2021/09/image2-5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--naNuogHJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.cloudflare.com/content/images/2021/09/image2-5.png" alt="simplified PLT/GOT call flow"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Why do we need two ELF sections to implement one jumptable, you may ask? Well, because the real-world PLT/GOT is a bit more complex than described above. It turns out that resolving all external references up front may significantly slow down program startup, so symbol resolution is implemented via a "lazy approach": a reference is resolved by &lt;a href="https://man7.org/linux/man-pages/man8/ld.so.8.html"&gt;the dynamic loader&lt;/a&gt; only when the code actually tries to call a particular function. If the main application code never calls a library function, that reference will never be resolved.&lt;/p&gt;

&lt;h3&gt;
  
  
  Implementing a simplified PLT/GOT
&lt;/h3&gt;

&lt;p&gt;For learning and demonstration purposes, though, we will not reimplement a full-blown PLT/GOT with lazy resolution, but rather a simple jumptable that resolves external references when the object file is loaded and parsed. First of all we need to know the size of the table: for ELF executables and shared libraries the linker counts the external references at the link stage and creates appropriately sized PLT and GOT sections. Because we are dealing with raw object files, we have to do another pass over the &lt;code&gt;.rela.text&lt;/code&gt; section and count all the relocations that point to an entry in the symbol table with an undefined section index (or &lt;code&gt;0&lt;/code&gt; in code). Let's add a function for this and store the number of external references in a global variable:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;loader.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="cm"&gt;/* number of relocations that reference external symbols */&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;num_ext_symbols&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;count_external_symbols&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Elf64_Shdr&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;rela_text_hdr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lookup_section&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".rela.text"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;rela_text_hdr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to find .rela.text&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ENOEXEC&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;num_relocations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rela_text_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_size&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;rela_text_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_entsize&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Elf64_Rela&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;relocations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Elf64_Rela&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;rela_text_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_offset&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;num_relocations&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;symbol_idx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ELF64_R_SYM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;relocations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;r_info&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="cm"&gt;/* if there is no section associated with a symbol, it is probably
         * an external reference */&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;symbols&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;symbol_idx&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;st_shndx&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;SHN_UNDEF&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;num_ext_symbols&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This function is very similar to our &lt;code&gt;do_text_relocations&lt;/code&gt; function, except that instead of actually performing relocations it just counts the number of external symbol references.&lt;/p&gt;

&lt;p&gt;Next we need to decide the actual size in bytes of our jumptable. &lt;code&gt;num_ext_symbols&lt;/code&gt; holds the number of external symbol references in the object file, but how many bytes should we allocate per symbol? To figure this out we need to design our jumptable format. As we established above, in its simplest form our jumptable should be just a collection of unconditional CPU jump instructions, one for each external symbol. Unfortunately, the modern x64 CPU architecture &lt;a href="https://www.felixcloutier.com/x86/jmp"&gt;does not provide a jump instruction&lt;/a&gt; where a 64-bit address pointer can be a direct operand. Instead, the jump address needs to be stored in memory somewhere "close" (that is, within a 32-bit offset), and the offset is the actual operand. So, for each external symbol we need to store the jump address (64 bits or 8 bytes on a 64-bit CPU system) and the actual jump instruction with an offset operand (&lt;a href="https://www.felixcloutier.com/x86/jmp"&gt;6 bytes for x64 architecture&lt;/a&gt;). We can represent an entry in our jumptable with the following C structure:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;loader.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;ext_jump&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="cm"&gt;/* address to jump to */&lt;/span&gt;
    &lt;span class="kt"&gt;uint8_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;addr&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="cm"&gt;/* unconditional x64 JMP instruction */&lt;/span&gt;
    &lt;span class="cm"&gt;/* should always be {0xff, 0x25, 0xf2, 0xff, 0xff, 0xff} */&lt;/span&gt;
    &lt;span class="cm"&gt;/* so it would jump to an address stored at addr above */&lt;/span&gt;
    &lt;span class="kt"&gt;uint8_t&lt;/span&gt; &lt;span class="n"&gt;instr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;ext_jump&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;jumptable&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We've also added a global variable to store the base address of the jumptable, which will be allocated later. Notice that with the above approach the actual jump instruction will always be constant for every external symbol. Since we allocate a dedicated entry for each external symbol with this structure, the &lt;code&gt;addr&lt;/code&gt; member will always be at the same offset from the end of the jump instruction in &lt;code&gt;instr&lt;/code&gt;: &lt;code&gt;-14&lt;/code&gt; bytes (8 bytes for &lt;code&gt;addr&lt;/code&gt; plus 6 bytes for the instruction itself), or &lt;code&gt;0xfffffff2&lt;/code&gt; in hex for a 32-bit operand. So &lt;code&gt;instr&lt;/code&gt; will always be &lt;code&gt;{0xff, 0x25, 0xf2, 0xff, 0xff, 0xff}&lt;/code&gt;: &lt;code&gt;0xff&lt;/code&gt; and &lt;code&gt;0x25&lt;/code&gt; are the encoding of the x64 jump instruction and its modifier, and &lt;code&gt;0xf2, 0xff, 0xff, 0xff&lt;/code&gt; is the operand offset &lt;code&gt;0xfffffff2&lt;/code&gt; in little-endian format.&lt;/p&gt;
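
&lt;p&gt;As a quick sanity check of the arithmetic above, the sketch below recomputes the displacement from the structure layout and encodes it as bytes. The helpers &lt;code&gt;jmp_disp&lt;/code&gt; and &lt;code&gt;encode_disp32&lt;/code&gt; and the &lt;code&gt;JMP_LEN&lt;/code&gt; constant are hypothetical names that are not part of &lt;code&gt;loader.c&lt;/code&gt;, and the sketch assumes an x64 system with 8-byte pointers:&lt;/p&gt;

```c
#include "assert.h"
#include "stddef.h"
#include "stdint.h"

/* length of the x64 indirect JMP used in the jumptable (assumed name) */
#define JMP_LEN 6

/* same layout as the ext_jump structure shown above */
struct ext_jump {
    uint8_t *addr;
    uint8_t instr[JMP_LEN];
};

/* hypothetical helper: displacement from the end of instr back to addr */
static int32_t jmp_disp(void)
{
    /* the CPU measures the operand from the end of the instruction */
    size_t instr_end = offsetof(struct ext_jump, instr) + JMP_LEN;

    return (int32_t)offsetof(struct ext_jump, addr) - (int32_t)instr_end;
}

/* hypothetical helper: store a 32-bit displacement in little-endian order */
static void encode_disp32(int32_t disp, uint8_t out[4])
{
    uint32_t bits = (uint32_t)disp; /* two's complement bit pattern */

    assert(out != NULL);
    for (int i = 0; i != 4; i++) {
        out[i] = (uint8_t)(bits % 256); /* lowest byte goes first */
        bits /= 256;
    }
}
```

&lt;p&gt;On such a system &lt;code&gt;jmp_disp()&lt;/code&gt; should evaluate to &lt;code&gt;-14&lt;/code&gt;, and passing that value to &lt;code&gt;encode_disp32&lt;/code&gt; should produce the bytes &lt;code&gt;0xf2, 0xff, 0xff, 0xff&lt;/code&gt;, matching the operand part of &lt;code&gt;instr&lt;/code&gt;.&lt;/p&gt;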

&lt;p&gt;Now that we have defined the entry format for our jumptable, we can allocate and populate it when parsing the object file. First of all, let's not forget to call our new &lt;code&gt;count_external_symbols&lt;/code&gt; function from &lt;code&gt;parse_obj&lt;/code&gt; to populate &lt;code&gt;num_ext_symbols&lt;/code&gt; (it has to be done before we allocate the jumptable):&lt;/p&gt;

&lt;p&gt;&lt;em&gt;loader.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;parse_obj&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;

    &lt;span class="n"&gt;count_external_symbols&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="cm"&gt;/* allocate memory for `.text`, `.data` and `.rodata` copies rounding up each section to whole pages */&lt;/span&gt;
    &lt;span class="n"&gt;text_runtime_base&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mmap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;page_align&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_size&lt;/span&gt;&lt;span class="p"&gt;)...&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next we need to allocate memory for the jumptable and store the pointer in the &lt;code&gt;jumptable&lt;/code&gt; global variable for later use. Just a reminder that in order to resolve 32-bit relocations from the &lt;code&gt;.text&lt;/code&gt; section to this table, it has to be "close" in memory to the main code. So we need to allocate it in the same &lt;code&gt;mmap&lt;/code&gt; call as the rest of the object sections. Since we defined the table's entry format in &lt;code&gt;struct ext_jump&lt;/code&gt; and have &lt;code&gt;num_ext_symbols&lt;/code&gt;, the size of the table would simply be &lt;code&gt;sizeof(struct ext_jump) * num_ext_symbols&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;loader.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;parse_obj&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;

    &lt;span class="n"&gt;count_external_symbols&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="cm"&gt;/* allocate memory for `.text`, `.data` and `.rodata` copies and the jumptable for external symbols, rounding up each section to whole pages */&lt;/span&gt;
    &lt;span class="n"&gt;text_runtime_base&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mmap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;page_align&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; \
                                   &lt;span class="n"&gt;page_align&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; \
                                   &lt;span class="n"&gt;page_align&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rodata_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; \
                                   &lt;span class="n"&gt;page_align&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;ext_jump&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;num_ext_symbols&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                                   &lt;span class="n"&gt;PROT_READ&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;PROT_WRITE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MAP_PRIVATE&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;MAP_ANONYMOUS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text_runtime_base&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;MAP_FAILED&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;perror&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to allocate memory"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;errno&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;
    &lt;span class="n"&gt;rodata_runtime_base&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data_runtime_base&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;page_align&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_size&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="cm"&gt;/* jumptable will come after .rodata */&lt;/span&gt;
    &lt;span class="n"&gt;jumptable&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;ext_jump&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;rodata_runtime_base&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;page_align&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rodata_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_size&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
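
&lt;p&gt;To make the size calculation above more concrete, here is a minimal sketch of how many bytes the jumptable would add to the mapping. &lt;code&gt;round_up_to_page&lt;/code&gt; and &lt;code&gt;jumptable_bytes&lt;/code&gt; are hypothetical helpers (the real loader uses its own &lt;code&gt;page_align&lt;/code&gt;), and the page size is passed in as a parameter purely for illustration:&lt;/p&gt;

```c
#include "assert.h"
#include "stddef.h"
#include "stdint.h"

/* same layout as the ext_jump structure defined earlier */
struct ext_jump {
    uint8_t *addr;
    uint8_t instr[6];
};

/* hypothetical stand-in for page_align, with an explicit page size */
static size_t round_up_to_page(size_t n, size_t page_size)
{
    assert(page_size != 0);
    return (n + page_size - 1) / page_size * page_size;
}

/* bytes reserved for the jumptable inside the single mmap region */
static size_t jumptable_bytes(size_t num_ext_symbols, size_t page_size)
{
    return round_up_to_page(sizeof(struct ext_jump) * num_ext_symbols, page_size);
}
```

&lt;p&gt;Note that on a typical x64 Linux system &lt;code&gt;sizeof(struct ext_jump)&lt;/code&gt; is likely to be 16 rather than 14, because the compiler pads the structure to keep the pointer aligned. With 4096-byte pages, three external symbols would therefore occupy 48 bytes, which is rounded up to one whole page.&lt;/p&gt;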



&lt;p&gt;Finally, because the CPU will actually be executing the jump instructions stored in the &lt;code&gt;instr&lt;/code&gt; fields of our jumptable, we need to mark this memory read-only and executable (after &lt;code&gt;do_text_relocations&lt;/code&gt;, which is called earlier in this function and populates the jumptable, has completed):&lt;/p&gt;

&lt;p&gt;&lt;em&gt;loader.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;parse_obj&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;

    &lt;span class="n"&gt;do_text_relocations&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;

    &lt;span class="cm"&gt;/* make the jumptable readonly and executable */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mprotect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;jumptable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;page_align&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;ext_jump&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;num_ext_symbols&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;PROT_READ&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;PROT_EXEC&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;perror&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to make the jumptable executable"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;errno&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At this stage we have our jumptable allocated and usable; all that is left to do is to populate it properly. We’ll do this by improving the &lt;code&gt;do_text_relocations&lt;/code&gt; implementation to handle the case of external symbols. The &lt;code&gt;No runtime base address for section&lt;/code&gt; error from the beginning of this post is actually caused by this line in &lt;code&gt;do_text_relocations&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;loader.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;do_text_relocations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;num_relocations&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;
        &lt;span class="cm"&gt;/* symbol, with respect to which the relocation is performed */&lt;/span&gt;
        &lt;span class="kt"&gt;uint8_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;symbol_address&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;section_runtime_base&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;sections&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;symbols&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;symbol_idx&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;st_shndx&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;symbols&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;symbol_idx&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;st_value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Currently we try to determine the runtime symbol address for the relocation by looking up the symbol's section runtime address and adding the symbol's offset. But we have established above that external symbols do not have an associated section, so their handling needs to be a special case. Let's update the implementation to reflect this:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;loader.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;do_text_relocations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;num_relocations&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;
        &lt;span class="cm"&gt;/* symbol, with respect to which the relocation is performed */&lt;/span&gt;
        &lt;span class="kt"&gt;uint8_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;symbol_address&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="cm"&gt;/* if this is an external symbol */&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;symbols&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;symbol_idx&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;st_shndx&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;SHN_UNDEF&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;curr_jmp_idx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="cm"&gt;/* get external symbol/function address by name */&lt;/span&gt;
            &lt;span class="n"&gt;jumptable&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;curr_jmp_idx&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;addr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lookup_ext_function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;strtab&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;symbols&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;symbol_idx&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;st_name&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

            &lt;span class="cm"&gt;/* x64 unconditional JMP with address stored at -14 bytes offset */&lt;/span&gt;
            &lt;span class="cm"&gt;/* will use the address stored in addr above */&lt;/span&gt;
            &lt;span class="n"&gt;jumptable&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;curr_jmp_idx&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;instr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mh"&gt;0xff&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="n"&gt;jumptable&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;curr_jmp_idx&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;instr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mh"&gt;0x25&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="n"&gt;jumptable&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;curr_jmp_idx&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;instr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mh"&gt;0xf2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="n"&gt;jumptable&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;curr_jmp_idx&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;instr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mh"&gt;0xff&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="n"&gt;jumptable&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;curr_jmp_idx&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;instr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mh"&gt;0xff&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="n"&gt;jumptable&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;curr_jmp_idx&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;instr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mh"&gt;0xff&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="cm"&gt;/* resolve the relocation with respect to this unconditional JMP */&lt;/span&gt;
            &lt;span class="n"&gt;symbol_address&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;uint8_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;jumptable&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;curr_jmp_idx&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;instr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

            &lt;span class="n"&gt;curr_jmp_idx&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;symbol_address&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;section_runtime_base&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;sections&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;symbols&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;symbol_idx&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;st_shndx&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;symbols&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;symbol_idx&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;st_value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If a relocation symbol does not have an associated section, we consider it external and call a helper function to look up the symbol's runtime address by its name. We store this address in the next available jumptable entry, populate the x64 jump instruction with our fixed operand and store the address of the instruction in the &lt;code&gt;symbol_address&lt;/code&gt; variable. Later, the existing code in &lt;code&gt;do_text_relocations&lt;/code&gt; will resolve the &lt;code&gt;.text&lt;/code&gt; relocation with respect to the address in &lt;code&gt;symbol_address&lt;/code&gt; in the same way it does for local symbols in &lt;a href="https://dev.to/ignatk/how-to-execute-an-object-file-part-2-2fmo"&gt;part 2 of our series&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The only missing bit now is the implementation of the newly introduced &lt;code&gt;lookup_ext_function&lt;/code&gt; helper. Real-world loaders may have complicated logic for finding and resolving symbols in memory at runtime. But for the purposes of this article we'll provide a simple, naive implementation, which can only resolve the &lt;code&gt;puts&lt;/code&gt; function:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;loader.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nf"&gt;lookup_ext_function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;name_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;strlen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name_len&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;strlen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"puts"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;strcmp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"puts"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;my_puts&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"No address for function %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ENOENT&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice though that because we control the &lt;code&gt;loader&lt;/code&gt; logic we are free to implement resolution as we please. In the above case we actually "divert" the object file to use our own "custom" &lt;code&gt;my_puts&lt;/code&gt; function instead of the C library one. Let's recompile the &lt;code&gt;loader&lt;/code&gt; and see if it works:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gcc &lt;span class="nt"&gt;-o&lt;/span&gt; loader loader.c
&lt;span class="nv"&gt;$ &lt;/span&gt;./loader
Executing add5...
add5&lt;span class="o"&gt;(&lt;/span&gt;42&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 47
Executing add10...
add10&lt;span class="o"&gt;(&lt;/span&gt;42&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 52
Executing get_hello...
get_hello&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; Hello, world!
Executing get_var...
get_var&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 5
Executing set_var&lt;span class="o"&gt;(&lt;/span&gt;42&lt;span class="o"&gt;)&lt;/span&gt;...
Executing get_var again...
get_var&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 42
Executing say_hello...
my_puts executed
Hello, world!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Hooray! We not only fixed our &lt;code&gt;loader&lt;/code&gt; to handle external references in object files — we have also learned how to "hook" any such external function call and divert the code to a custom implementation, which might be useful in some cases, like malware research.&lt;/p&gt;
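
&lt;p&gt;The hooking idea above could be extended to more than one function by keeping the replacements in a small table. The following is only a sketch under stated assumptions: &lt;code&gt;struct ext_hook&lt;/code&gt;, &lt;code&gt;ext_hooks&lt;/code&gt; and &lt;code&gt;lookup_ext_hook&lt;/code&gt; are hypothetical names that do not exist in &lt;code&gt;loader.c&lt;/code&gt;, the &lt;code&gt;my_puts&lt;/code&gt; body is a stand-in for the loader's own hook, and the lookup returns &lt;code&gt;NULL&lt;/code&gt; for unknown names instead of exiting:&lt;/p&gt;

```c
#include "assert.h"
#include "stddef.h"
#include "stdio.h"
#include "string.h"

/* stand-in for the loader's my_puts hook (assumed shape, for illustration) */
static int my_puts(const char *s)
{
    puts("my_puts executed");
    return puts(s);
}

/* hypothetical table entry: external symbol name and the address to divert to */
struct ext_hook {
    const char *name;
    void *addr;
};

/* more entries can be added here to hook more functions */
static const struct ext_hook ext_hooks[] = {
    { "puts", (void *)my_puts },
};

/* hypothetical variant of lookup_ext_function: NULL means "not found" */
static void *lookup_ext_hook(const char *name)
{
    size_t count = sizeof(ext_hooks) / sizeof(ext_hooks[0]);

    assert(name != NULL);
    for (size_t i = 0; i != count; i++) {
        if (strcmp(name, ext_hooks[i].name) == 0)
            return ext_hooks[i].addr;
    }
    return NULL;
}
```

&lt;p&gt;With a table like this, diverting another function would only need a new entry, while the caller decides how to handle a &lt;code&gt;NULL&lt;/code&gt; result (for example, by reporting the missing symbol and exiting, as the loader does today).&lt;/p&gt;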

&lt;p&gt;As in the previous posts, the complete source code from this post is &lt;a href="https://github.com/cloudflare/cloudflare-blog/tree/master/2021-03-obj-file/3"&gt;available on GitHub&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>linux</category>
      <category>programming</category>
      <category>showdev</category>
      <category>computerscience</category>
    </item>
    <item>
      <title>How to execute an object file: Part 2</title>
      <dc:creator>Ignat Korchagin</dc:creator>
      <pubDate>Fri, 02 Apr 2021 20:56:32 +0000</pubDate>
      <link>https://dev.to/ignatk/how-to-execute-an-object-file-part-2-2fmo</link>
      <guid>https://dev.to/ignatk/how-to-execute-an-object-file-part-2-2fmo</guid>
      <description>&lt;h2&gt;
  
  
  Handling relocations
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;This is a repost of my post from the &lt;a href="https://blog.cloudflare.com/how-to-execute-an-object-file-part-2/" rel="noopener noreferrer"&gt;Cloudflare Blog&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In the &lt;a href="https://dev.to/ignatk/how-to-execute-an-object-file-part-1-2n9f"&gt;previous post&lt;/a&gt;, we learned how to parse an object file and import and execute some functions from it. However, the functions in our toy object file were simple and self-contained: they computed their output solely based on their inputs and didn't have any external code or data dependencies. In this post we will build upon &lt;a href="https://github.com/cloudflare/cloudflare-blog/tree/master/2021-03-obj-file/1" rel="noopener noreferrer"&gt;the code from part 1&lt;/a&gt;, exploring additional steps needed to handle code with some dependencies.&lt;/p&gt;

&lt;p&gt;As an example, we may notice that we can actually rewrite our &lt;code&gt;add10&lt;/code&gt; function using our &lt;code&gt;add5&lt;/code&gt; function:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;obj.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;add5&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;add10&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;add5&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;add5&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's recompile the object file and try to use it as a library with our &lt;code&gt;loader&lt;/code&gt; program:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gcc &lt;span class="nt"&gt;-c&lt;/span&gt; obj.c
&lt;span class="nv"&gt;$ &lt;/span&gt;./loader
Executing add5...
add5&lt;span class="o"&gt;(&lt;/span&gt;42&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 47
Executing add10...
add10&lt;span class="o"&gt;(&lt;/span&gt;42&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 42
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Whoa! Something is not right here. &lt;code&gt;add5&lt;/code&gt; still produces the correct result, but &lt;code&gt;add10&lt;/code&gt; does not. Depending on your environment and code composition, you may even see the &lt;code&gt;loader&lt;/code&gt; program crashing instead of outputting incorrect results. To understand what happened, let's investigate the machine code generated by the compiler. We can do that by asking the &lt;a href="https://man7.org/linux/man-pages/man1/objdump.1.html" rel="noopener noreferrer"&gt;objdump tool&lt;/a&gt; to disassemble the &lt;code&gt;.text&lt;/code&gt; section from our &lt;code&gt;obj.o&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;objdump &lt;span class="nt"&gt;--disassemble&lt;/span&gt; &lt;span class="nt"&gt;--section&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;.text obj.o

obj.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 &amp;lt;add5&amp;gt;:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   89 7d &lt;span class="nb"&gt;fc                &lt;/span&gt;mov    %edi,-0x4&lt;span class="o"&gt;(&lt;/span&gt;%rbp&lt;span class="o"&gt;)&lt;/span&gt;
   7:   8b 45 &lt;span class="nb"&gt;fc                &lt;/span&gt;mov    &lt;span class="nt"&gt;-0x4&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;%rbp&lt;span class="o"&gt;)&lt;/span&gt;,%eax
   a:   83 c0 05                add    &lt;span class="nv"&gt;$0x5&lt;/span&gt;,%eax
   d:   5d                      pop    %rbp
   e:   c3                      retq

000000000000000f &amp;lt;add10&amp;gt;:
   f:   55                      push   %rbp
  10:   48 89 e5                mov    %rsp,%rbp
  13:   48 83 ec 08             sub    &lt;span class="nv"&gt;$0x8&lt;/span&gt;,%rsp
  17:   89 7d &lt;span class="nb"&gt;fc                &lt;/span&gt;mov    %edi,-0x4&lt;span class="o"&gt;(&lt;/span&gt;%rbp&lt;span class="o"&gt;)&lt;/span&gt;
  1a:   8b 45 &lt;span class="nb"&gt;fc                &lt;/span&gt;mov    &lt;span class="nt"&gt;-0x4&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;%rbp&lt;span class="o"&gt;)&lt;/span&gt;,%eax
  1d:   89 c7                   mov    %eax,%edi
  1f:   e8 00 00 00 00          callq  24 &amp;lt;add10+0x15&amp;gt;
  24:   89 45 &lt;span class="nb"&gt;fc                &lt;/span&gt;mov    %eax,-0x4&lt;span class="o"&gt;(&lt;/span&gt;%rbp&lt;span class="o"&gt;)&lt;/span&gt;
  27:   8b 45 &lt;span class="nb"&gt;fc                &lt;/span&gt;mov    &lt;span class="nt"&gt;-0x4&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;%rbp&lt;span class="o"&gt;)&lt;/span&gt;,%eax
  2a:   89 c7                   mov    %eax,%edi
  2c:   e8 00 00 00 00          callq  31 &amp;lt;add10+0x22&amp;gt;
  31:   c9                      leaveq
  32:   c3                      retq
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You don't have to understand the full output above. There are only two relevant lines here: &lt;code&gt;1f:   e8 00 00 00 00&lt;/code&gt; and &lt;code&gt;2c:   e8 00 00 00 00&lt;/code&gt;. These correspond to the two &lt;code&gt;add5&lt;/code&gt; function invocations we have in the source code and &lt;a href="https://man7.org/linux/man-pages/man1/objdump.1.html" rel="noopener noreferrer"&gt;objdump&lt;/a&gt; even conveniently decodes the instruction for us as &lt;code&gt;callq&lt;/code&gt;. By looking at descriptions of the &lt;code&gt;callq&lt;/code&gt; instruction online (like &lt;a href="https://www.felixcloutier.com/x86/call" rel="noopener noreferrer"&gt;this one&lt;/a&gt;), we can further see we're dealing with a "near, relative call", because of the &lt;code&gt;0xe8&lt;/code&gt; prefix:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Call near, relative, displacement relative to next instruction.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;According to the &lt;a href="https://www.felixcloutier.com/x86/call" rel="noopener noreferrer"&gt;description&lt;/a&gt;, this variant of the &lt;code&gt;callq&lt;/code&gt; instruction consists of 5 bytes: the &lt;code&gt;0xe8&lt;/code&gt; prefix and a 4-byte (32 bit) argument. This is where "relative" comes from: the argument should contain the “distance” between the function we want to call and the current position. Because of how x86 works, this distance is calculated from the next instruction and not from the current &lt;code&gt;callq&lt;/code&gt; instruction. &lt;a href="https://man7.org/linux/man-pages/man1/objdump.1.html" rel="noopener noreferrer"&gt;objdump&lt;/a&gt; conveniently prints each machine instruction's offset in the listing above, so we can easily calculate the needed argument. For example, for the first &lt;code&gt;callq&lt;/code&gt; instruction (&lt;code&gt;1f:   e8 00 00 00 00&lt;/code&gt;) the next instruction is at offset &lt;code&gt;0x24&lt;/code&gt;. We know we should be calling the &lt;code&gt;add5&lt;/code&gt; function, which starts at offset &lt;code&gt;0x0&lt;/code&gt; (beginning of our &lt;code&gt;.text&lt;/code&gt; section). So the relative offset is &lt;code&gt;0x0 - 0x24 = -0x24&lt;/code&gt;. Notice that the argument is negative: the &lt;code&gt;add5&lt;/code&gt; function is located before our calling instruction, so we would be instructing the CPU to "jump backwards" from its current position. Lastly, we have to remember that negative numbers (at least on x86 systems) are represented in &lt;a href="https://en.wikipedia.org/wiki/Two%27s_complement" rel="noopener noreferrer"&gt;two's complement&lt;/a&gt;, so a 4-byte (32 bit) representation of &lt;code&gt;-0x24&lt;/code&gt; would be &lt;code&gt;0xffffffdc&lt;/code&gt;.
In the same way we can calculate the &lt;code&gt;callq&lt;/code&gt; argument for the second &lt;code&gt;add5&lt;/code&gt; call: &lt;code&gt;0x0 - 0x31 = -0x31&lt;/code&gt;, which in two's complement is &lt;code&gt;0xffffffcf&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs3ywpw0pgosrwxjm3h3n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs3ywpw0pgosrwxjm3h3n.png" alt="relative calls"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It seems the compiler does not generate the right &lt;code&gt;callq&lt;/code&gt; arguments for us. We've calculated the expected arguments to be &lt;code&gt;0xffffffdc&lt;/code&gt; and &lt;code&gt;0xffffffcf&lt;/code&gt;, but the compiler has just left &lt;code&gt;0x00000000&lt;/code&gt; in both places. Let's check first if our expectations are correct by patching our loaded &lt;code&gt;.text&lt;/code&gt; copy before trying to execute it:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;loader.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;parse_obj&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;
    &lt;span class="cm"&gt;/* copy the contents of `.text` section from the ELF file */&lt;/span&gt;
    &lt;span class="n"&gt;memcpy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text_runtime_base&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;text_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_offset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_size&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="cm"&gt;/* the first add5 callq argument is located at offset 0x20 and should be 0xffffffdc:
     * 0x1f is the instruction offset + 1 byte instruction prefix
     */&lt;/span&gt;
    &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="kt"&gt;uint32_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;text_runtime_base&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mh"&gt;0x1f&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mh"&gt;0xffffffdc&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="cm"&gt;/* the second add5 callq argument is located at offset 0x2d and should be 0xffffffcf */&lt;/span&gt;
    &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="kt"&gt;uint32_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;text_runtime_base&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mh"&gt;0x2c&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mh"&gt;0xffffffcf&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="cm"&gt;/* make the `.text` copy readonly and executable */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mprotect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text_runtime_base&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;page_align&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_size&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;PROT_READ&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;PROT_EXEC&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And now let's test it out:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gcc &lt;span class="nt"&gt;-o&lt;/span&gt; loader loader.c 
&lt;span class="nv"&gt;$ &lt;/span&gt;./loader 
Executing add5...
add5&lt;span class="o"&gt;(&lt;/span&gt;42&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 47
Executing add10...
add10&lt;span class="o"&gt;(&lt;/span&gt;42&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 52
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Clearly our monkey-patching helped: &lt;code&gt;add10&lt;/code&gt; executes fine now and produces the correct output. This means the &lt;code&gt;callq&lt;/code&gt; arguments we calculated are correct. So why did the compiler emit wrong &lt;code&gt;callq&lt;/code&gt; arguments?&lt;/p&gt;

&lt;h3&gt;
  
  
  Relocations
&lt;/h3&gt;

&lt;p&gt;The problem with our toy object file is that both functions are declared with external linkage — the default setting for all functions and global variables in C. And, although both functions are declared in the same file, the compiler is not sure where the &lt;code&gt;add5&lt;/code&gt; code will end up in the target binary. So the compiler avoids making any assumptions and doesn’t calculate the relative offset argument of the &lt;code&gt;callq&lt;/code&gt; instructions. Let's verify this by removing our monkey patching and declaring the &lt;code&gt;add5&lt;/code&gt; function as &lt;code&gt;static&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;loader.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

    &lt;span class="cm"&gt;/* the first add5 callq argument is located at offset 0x20 and should be 0xffffffdc:
     * 0x1f is the instruction offset + 1 byte instruction prefix
     */&lt;/span&gt;
    &lt;span class="cm"&gt;/* *((uint32_t *)(text_runtime_base + 0x1f + 1)) = 0xffffffdc; */&lt;/span&gt;

    &lt;span class="cm"&gt;/* the second add5 callq argument is located at offset 0x2d and should be 0xffffffcf */&lt;/span&gt;
    &lt;span class="cm"&gt;/* *((uint32_t *)(text_runtime_base + 0x2c + 1)) = 0xffffffcf; */&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;obj.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="cm"&gt;/* int add5(int num) */&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;add5&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Recompiling and disassembling &lt;code&gt;obj.o&lt;/code&gt; gives us the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gcc &lt;span class="nt"&gt;-c&lt;/span&gt; obj.c
&lt;span class="nv"&gt;$ &lt;/span&gt;objdump &lt;span class="nt"&gt;--disassemble&lt;/span&gt; &lt;span class="nt"&gt;--section&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;.text obj.o

obj.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 &amp;lt;add5&amp;gt;:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   89 7d &lt;span class="nb"&gt;fc                &lt;/span&gt;mov    %edi,-0x4&lt;span class="o"&gt;(&lt;/span&gt;%rbp&lt;span class="o"&gt;)&lt;/span&gt;
   7:   8b 45 &lt;span class="nb"&gt;fc                &lt;/span&gt;mov    &lt;span class="nt"&gt;-0x4&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;%rbp&lt;span class="o"&gt;)&lt;/span&gt;,%eax
   a:   83 c0 05                add    &lt;span class="nv"&gt;$0x5&lt;/span&gt;,%eax
   d:   5d                      pop    %rbp
   e:   c3                      retq

000000000000000f &amp;lt;add10&amp;gt;:
   f:   55                      push   %rbp
  10:   48 89 e5                mov    %rsp,%rbp
  13:   48 83 ec 08             sub    &lt;span class="nv"&gt;$0x8&lt;/span&gt;,%rsp
  17:   89 7d &lt;span class="nb"&gt;fc                &lt;/span&gt;mov    %edi,-0x4&lt;span class="o"&gt;(&lt;/span&gt;%rbp&lt;span class="o"&gt;)&lt;/span&gt;
  1a:   8b 45 &lt;span class="nb"&gt;fc                &lt;/span&gt;mov    &lt;span class="nt"&gt;-0x4&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;%rbp&lt;span class="o"&gt;)&lt;/span&gt;,%eax
  1d:   89 c7                   mov    %eax,%edi
  1f:   e8 dc ff ff ff          callq  0 &amp;lt;add5&amp;gt;
  24:   89 45 &lt;span class="nb"&gt;fc                &lt;/span&gt;mov    %eax,-0x4&lt;span class="o"&gt;(&lt;/span&gt;%rbp&lt;span class="o"&gt;)&lt;/span&gt;
  27:   8b 45 &lt;span class="nb"&gt;fc                &lt;/span&gt;mov    &lt;span class="nt"&gt;-0x4&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;%rbp&lt;span class="o"&gt;)&lt;/span&gt;,%eax
  2a:   89 c7                   mov    %eax,%edi
  2c:   e8 cf ff ff ff          callq  0 &amp;lt;add5&amp;gt;
  31:   c9                      leaveq
  32:   c3                      retq
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Because we re-declared the &lt;code&gt;add5&lt;/code&gt; function with internal linkage, the compiler is more confident now and calculates &lt;code&gt;callq&lt;/code&gt; arguments correctly (note that x86 systems are &lt;a href="https://en.wikipedia.org/wiki/Endianness" rel="noopener noreferrer"&gt;little-endian&lt;/a&gt;, so multibyte numbers like &lt;code&gt;0xffffffdc&lt;/code&gt; will be represented with least significant byte first). We can double check this by recompiling and running our &lt;code&gt;loader&lt;/code&gt; test tool:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gcc &lt;span class="nt"&gt;-o&lt;/span&gt; loader loader.c
&lt;span class="nv"&gt;$ &lt;/span&gt;./loader
Executing add5...
add5&lt;span class="o"&gt;(&lt;/span&gt;42&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 47
Executing add10...
add10&lt;span class="o"&gt;(&lt;/span&gt;42&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 52
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
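
&lt;p&gt;To make the byte order noted above concrete, here is a minimal sketch of how a 32-bit value ends up as the byte sequence seen in the disassembly. The helper name &lt;code&gt;put_le32&lt;/code&gt; is hypothetical and only serves as an illustration.&lt;/p&gt;

```c
#include "stdint.h"

/* Store a 32-bit value least significant byte first, which is how x86
 * lays it out in memory (hypothetical helper, not part of loader.c). */
void put_le32(uint8_t *dst, uint32_t value)
{
    dst[0] = (uint8_t)value;
    dst[1] = (uint8_t)(value >> 8);
    dst[2] = (uint8_t)(value >> 16);
    dst[3] = (uint8_t)(value >> 24);
}
```

&lt;p&gt;Passing &lt;code&gt;0xffffffdc&lt;/code&gt; fills the buffer with &lt;code&gt;dc ff ff ff&lt;/code&gt;, matching the bytes after &lt;code&gt;e8&lt;/code&gt; in the first &lt;code&gt;callq&lt;/code&gt; instruction.&lt;/p&gt;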



&lt;p&gt;Even though the &lt;code&gt;add5&lt;/code&gt; function is declared as &lt;code&gt;static&lt;/code&gt;, we can still call it from the &lt;code&gt;loader&lt;/code&gt; tool, basically ignoring the fact that it is an "internal" function now. Because of this, the &lt;code&gt;static&lt;/code&gt; keyword should not be used as a security feature to hide APIs from potentially malicious users.&lt;/p&gt;

&lt;p&gt;But let's step back and revert our &lt;code&gt;add5&lt;/code&gt; function in &lt;code&gt;obj.c&lt;/code&gt; to the one with external linkage:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;obj.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;add5&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gcc &lt;span class="nt"&gt;-c&lt;/span&gt; obj.c
&lt;span class="nv"&gt;$ &lt;/span&gt;./loader
Executing add5...
add5&lt;span class="o"&gt;(&lt;/span&gt;42&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 47
Executing add10...
add10&lt;span class="o"&gt;(&lt;/span&gt;42&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 42
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As we have established above, the compiler did not compute proper &lt;code&gt;callq&lt;/code&gt; arguments for us because it didn't have enough information. But later stages (namely the linker) will have that information, so instead the compiler leaves some clues on how to fix those arguments. These clues, or instructions for the later stages, are called &lt;strong&gt;relocations&lt;/strong&gt;. We can inspect them with our friend, the &lt;a href="https://man7.org/linux/man-pages/man1/readelf.1.html" rel="noopener noreferrer"&gt;readelf&lt;/a&gt; utility. Let's examine the section table of &lt;code&gt;obj.o&lt;/code&gt; again:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;readelf &lt;span class="nt"&gt;--sections&lt;/span&gt; obj.o
There are 12 section headers, starting at offset 0x2b0:

Section Headers:
  &lt;span class="o"&gt;[&lt;/span&gt;Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  &lt;span class="o"&gt;[&lt;/span&gt; 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  &lt;span class="o"&gt;[&lt;/span&gt; 1] .text             PROGBITS         0000000000000000  00000040
       0000000000000033  0000000000000000  AX       0     0     1
  &lt;span class="o"&gt;[&lt;/span&gt; 2] .rela.text        RELA             0000000000000000  000001f0
       0000000000000030  0000000000000018   I       9     1     8
  &lt;span class="o"&gt;[&lt;/span&gt; 3] .data             PROGBITS         0000000000000000  00000073
       0000000000000000  0000000000000000  WA       0     0     1
  &lt;span class="o"&gt;[&lt;/span&gt; 4] .bss              NOBITS           0000000000000000  00000073
       0000000000000000  0000000000000000  WA       0     0     1
  &lt;span class="o"&gt;[&lt;/span&gt; 5] .comment          PROGBITS         0000000000000000  00000073
       000000000000001d  0000000000000001  MS       0     0     1
  &lt;span class="o"&gt;[&lt;/span&gt; 6] .note.GNU-stack   PROGBITS         0000000000000000  00000090
       0000000000000000  0000000000000000           0     0     1
  &lt;span class="o"&gt;[&lt;/span&gt; 7] .eh_frame         PROGBITS         0000000000000000  00000090
       0000000000000058  0000000000000000   A       0     0     8
  &lt;span class="o"&gt;[&lt;/span&gt; 8] .rela.eh_frame    RELA             0000000000000000  00000220
       0000000000000030  0000000000000018   I       9     7     8
  &lt;span class="o"&gt;[&lt;/span&gt; 9] .symtab           SYMTAB           0000000000000000  000000e8
       00000000000000f0  0000000000000018          10     8     8
  &lt;span class="o"&gt;[&lt;/span&gt;10] .strtab           STRTAB           0000000000000000  000001d8
       0000000000000012  0000000000000000           0     0     1
  &lt;span class="o"&gt;[&lt;/span&gt;11] .shstrtab         STRTAB           0000000000000000  00000250
       0000000000000059  0000000000000000           0     0     1
Key to Flags:
  W &lt;span class="o"&gt;(&lt;/span&gt;write&lt;span class="o"&gt;)&lt;/span&gt;, A &lt;span class="o"&gt;(&lt;/span&gt;alloc&lt;span class="o"&gt;)&lt;/span&gt;, X &lt;span class="o"&gt;(&lt;/span&gt;execute&lt;span class="o"&gt;)&lt;/span&gt;, M &lt;span class="o"&gt;(&lt;/span&gt;merge&lt;span class="o"&gt;)&lt;/span&gt;, S &lt;span class="o"&gt;(&lt;/span&gt;strings&lt;span class="o"&gt;)&lt;/span&gt;, I &lt;span class="o"&gt;(&lt;/span&gt;info&lt;span class="o"&gt;)&lt;/span&gt;,
  L &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;link &lt;/span&gt;order&lt;span class="o"&gt;)&lt;/span&gt;, O &lt;span class="o"&gt;(&lt;/span&gt;extra OS processing required&lt;span class="o"&gt;)&lt;/span&gt;, G &lt;span class="o"&gt;(&lt;/span&gt;group&lt;span class="o"&gt;)&lt;/span&gt;, T &lt;span class="o"&gt;(&lt;/span&gt;TLS&lt;span class="o"&gt;)&lt;/span&gt;,
  C &lt;span class="o"&gt;(&lt;/span&gt;compressed&lt;span class="o"&gt;)&lt;/span&gt;, x &lt;span class="o"&gt;(&lt;/span&gt;unknown&lt;span class="o"&gt;)&lt;/span&gt;, o &lt;span class="o"&gt;(&lt;/span&gt;OS specific&lt;span class="o"&gt;)&lt;/span&gt;, E &lt;span class="o"&gt;(&lt;/span&gt;exclude&lt;span class="o"&gt;)&lt;/span&gt;,
  l &lt;span class="o"&gt;(&lt;/span&gt;large&lt;span class="o"&gt;)&lt;/span&gt;, p &lt;span class="o"&gt;(&lt;/span&gt;processor specific&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We see that the compiler created a new section called &lt;code&gt;.rela.text&lt;/code&gt;. By convention, a section with relocations for a section named &lt;code&gt;.foo&lt;/code&gt; will be called &lt;code&gt;.rela.foo&lt;/code&gt;, so we can see that the compiler created a section with relocations for the &lt;code&gt;.text&lt;/code&gt; section. We can examine the relocations further:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;readelf &lt;span class="nt"&gt;--relocs&lt;/span&gt; obj.o

Relocation section &lt;span class="s1"&gt;'.rela.text'&lt;/span&gt; at offset 0x1f0 contains 2 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000020  000800000004 R_X86_64_PLT32    0000000000000000 add5 - 4
00000000002d  000800000004 R_X86_64_PLT32    0000000000000000 add5 - 4

Relocation section &lt;span class="s1"&gt;'.rela.eh_frame'&lt;/span&gt; at offset 0x220 contains 2 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000020  000200000002 R_X86_64_PC32     0000000000000000 .text + 0
000000000040  000200000002 R_X86_64_PC32     0000000000000000 .text + f
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's ignore the relocations from the &lt;code&gt;.rela.eh_frame&lt;/code&gt; section because they are outside the scope of this post. Instead, let’s try to understand the relocations from the &lt;code&gt;.rela.text&lt;/code&gt; section:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;Offset&lt;/code&gt; column tells us exactly where in the target section (&lt;code&gt;.text&lt;/code&gt; in this case) the fix/adjustment is needed. Note that these offsets are exactly the same as in our self-calculated monkey-patching above.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Info&lt;/code&gt; is a combined value: the upper 32 bits (only 16 of which are shown in the output above) hold the symbol table index of the symbol with respect to which the relocation is performed. In our example it is &lt;code&gt;8&lt;/code&gt; and if we run &lt;code&gt;readelf --symbols obj.o&lt;/code&gt; we will see that it points to an entry corresponding to the &lt;code&gt;add5&lt;/code&gt; function. The lower 32 bits (&lt;code&gt;4&lt;/code&gt; in our case) hold the relocation type (see &lt;code&gt;Type&lt;/code&gt; below).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Type&lt;/code&gt; describes the relocation type. This is a pseudo-column: &lt;code&gt;readelf&lt;/code&gt; actually generates it from the lower 32 bits of the &lt;code&gt;Info&lt;/code&gt; field. Different relocation types have different formulas we need to apply to perform the relocation.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Sym. Value&lt;/code&gt; may mean different things depending on the relocation type, but most of the time it is the symbol offset with respect to which we perform the relocation. The offset is calculated from the beginning of that symbol’s section.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Addend&lt;/code&gt; is a constant we may need to use in the relocation formula. Depending on the relocation type, &lt;a href="https://man7.org/linux/man-pages/man1/readelf.1.html" rel="noopener noreferrer"&gt;readelf&lt;/a&gt; actually adds the decoded symbol name to the output, so the column name is &lt;code&gt;Sym. Name + Addend&lt;/code&gt; above but the actual field stores the addend only.&lt;/li&gt;
&lt;/ul&gt;
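
&lt;p&gt;As a small illustration of how the &lt;code&gt;Info&lt;/code&gt; value splits into its two halves, the sketch below uses the &lt;code&gt;ELF64_R_SYM&lt;/code&gt; and &lt;code&gt;ELF64_R_TYPE&lt;/code&gt; macros from &lt;code&gt;elf.h&lt;/code&gt;. The two wrapper function names are hypothetical and exist only for this example.&lt;/p&gt;

```c
#include "elf.h"
#include "stdint.h"

/* Symbol table index: the upper 32 bits of r_info
 * (hypothetical wrapper around the elf.h macro). */
uint32_t reloc_symbol_index(uint64_t r_info)
{
    return (uint32_t)ELF64_R_SYM(r_info);
}

/* Relocation type: the lower 32 bits of r_info
 * (hypothetical wrapper around the elf.h macro). */
uint32_t reloc_type(uint64_t r_info)
{
    return (uint32_t)ELF64_R_TYPE(r_info);
}
```

&lt;p&gt;For the &lt;code&gt;Info&lt;/code&gt; value &lt;code&gt;000800000004&lt;/code&gt; from the output above, the first function returns &lt;code&gt;8&lt;/code&gt; (the symbol index) and the second returns &lt;code&gt;4&lt;/code&gt; (the relocation type).&lt;/p&gt;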

&lt;p&gt;In a nutshell, these entries tell us that we need to patch the &lt;code&gt;.text&lt;/code&gt; section at offsets &lt;code&gt;0x20&lt;/code&gt; and &lt;code&gt;0x2d&lt;/code&gt;. To calculate what to put there, we need to apply the formula for the &lt;code&gt;R_X86_64_PLT32&lt;/code&gt; relocation type. Searching online, we can find different ELF specifications — like &lt;a href="https://refspecs.linuxfoundation.org/elf/x86_64-abi-0.95.pdf" rel="noopener noreferrer"&gt;this one&lt;/a&gt; — which will tell us how to implement the &lt;code&gt;R_X86_64_PLT32&lt;/code&gt; relocation. The specification mentions that the result of this relocation is &lt;code&gt;word32&lt;/code&gt; — which is what we expect because &lt;code&gt;callq&lt;/code&gt; arguments are 32 bit in our case — and the formula we need to apply is &lt;code&gt;L + A - P&lt;/code&gt;, where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;L&lt;/code&gt; is the address of the symbol, with respect to which the relocation is performed (&lt;code&gt;add5&lt;/code&gt; in our case)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;A&lt;/code&gt; is the constant addend (&lt;code&gt;-4&lt;/code&gt; in our case, shown by &lt;code&gt;readelf&lt;/code&gt; as &lt;code&gt;add5 - 4&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;P&lt;/code&gt; is the address/offset, where we store the result of the relocation&lt;/li&gt;
&lt;/ul&gt;
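
&lt;p&gt;Before wiring this into the loader, the formula can be checked in isolation. The sketch below assumes the simplified reading used here, where &lt;code&gt;L&lt;/code&gt; is just the address of the symbol; the function name &lt;code&gt;reloc_plt32&lt;/code&gt; and its parameters are hypothetical. The addend is the one &lt;code&gt;readelf&lt;/code&gt; prints as &lt;code&gt;add5 - 4&lt;/code&gt;, that is &lt;code&gt;-4&lt;/code&gt;.&lt;/p&gt;

```c
#include "stdint.h"

/* L + A - P, truncated to a 32-bit word.
 * sym_addr   (L): address or offset of the symbol
 * addend     (A): constant taken from the relocation entry
 * patch_addr (P): address or offset of the location being patched
 * Hypothetical helper for illustration, not part of loader.c. */
uint32_t reloc_plt32(uint64_t sym_addr, int64_t addend, uint64_t patch_addr)
{
    /* unsigned arithmetic wraps around, so a negative result comes out
     * in two's complement form after truncation */
    return (uint32_t)(sym_addr + (uint64_t)addend - patch_addr);
}
```

&lt;p&gt;With section offsets, &lt;code&gt;reloc_plt32(0x0, -4, 0x20)&lt;/code&gt; yields &lt;code&gt;0xffffffdc&lt;/code&gt; and &lt;code&gt;reloc_plt32(0x0, -4, 0x2d)&lt;/code&gt; yields &lt;code&gt;0xffffffcf&lt;/code&gt;, the same values we patched in by hand. Adding one and the same base address to both the symbol and the patch location does not change the result, because the base cancels out in &lt;code&gt;L - P&lt;/code&gt;.&lt;/p&gt;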

&lt;p&gt;When the relocation formula references some symbol addresses or offsets, we should use the actual — runtime in our case — addresses in the calculations. For example, we will be using &lt;code&gt;text_runtime_base + 0x2d&lt;/code&gt; as &lt;code&gt;P&lt;/code&gt; for the second relocation and not just &lt;code&gt;0x2d&lt;/code&gt;. So let's try to implement this relocation logic in our object loader:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;loader.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="cm"&gt;/* from https://elixir.bootlin.com/linux/v5.11.6/source/arch/x86/include/asm/elf.h#L51 */&lt;/span&gt;
&lt;span class="cp"&gt;#define R_X86_64_PLT32 4
&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;uint8_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nf"&gt;section_runtime_base&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Elf64_Shdr&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;section&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;section_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;shstrtab&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;section&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;section_name_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;strlen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;section_name&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="cm"&gt;/* we only mmap .text section so far */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;strlen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".text"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;section_name_len&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;strcmp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".text"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;section_name&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;text_runtime_base&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"No runtime base address for section %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;section_name&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ENOENT&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;do_text_relocations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="cm"&gt;/* we actually cheat here - the name .rela.text is a convention, but not a
     * rule: to figure out which section should be patched by these relocations
     * we would need to examine the rela_text_hdr, but we skip it for simplicity
     */&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Elf64_Shdr&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;rela_text_hdr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lookup_section&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".rela.text"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;rela_text_hdr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to find .rela.text&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ENOEXEC&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;num_relocations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rela_text_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_size&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;rela_text_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_entsize&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Elf64_Rela&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;relocations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Elf64_Rela&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;rela_text_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_offset&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;num_relocations&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;symbol_idx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ELF64_R_SYM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;relocations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;r_info&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ELF64_R_TYPE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;relocations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;r_info&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="cm"&gt;/* where to patch .text */&lt;/span&gt;
        &lt;span class="kt"&gt;uint8_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;patch_offset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text_runtime_base&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;relocations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;r_offset&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="cm"&gt;/* symbol, with respect to which the relocation is performed */&lt;/span&gt;
        &lt;span class="kt"&gt;uint8_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;symbol_address&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;section_runtime_base&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;sections&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;symbols&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;symbol_idx&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;st_shndx&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;symbols&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;symbol_idx&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;st_value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="k"&gt;switch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;R_X86_64_PLT32&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="cm"&gt;/* L + A - P, 32 bit output */&lt;/span&gt;
            &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="kt"&gt;uint32_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;patch_offset&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;symbol_address&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;relocations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;r_addend&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;patch_offset&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Calculated relocation: 0x%08x&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="kt"&gt;uint32_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;patch_offset&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;parse_obj&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;

    &lt;span class="cm"&gt;/* copy the contents of `.text` section from the ELF file */&lt;/span&gt;
    &lt;span class="n"&gt;memcpy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text_runtime_base&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;text_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_offset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_size&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="n"&gt;do_text_relocations&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="cm"&gt;/* make the `.text` copy readonly and executable */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mprotect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text_runtime_base&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;page_align&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_size&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;PROT_READ&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;PROT_EXEC&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We are now calling the &lt;code&gt;do_text_relocations&lt;/code&gt; function before marking our &lt;code&gt;.text&lt;/code&gt; copy executable. We have also added some debugging output to inspect the result of the relocation calculations. Let's try it out:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gcc &lt;span class="nt"&gt;-o&lt;/span&gt; loader loader.c 
&lt;span class="nv"&gt;$ &lt;/span&gt;./loader 
Calculated relocation: 0xffffffdc
Calculated relocation: 0xffffffcf
Executing add5...
add5&lt;span class="o"&gt;(&lt;/span&gt;42&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 47
Executing add10...
add10&lt;span class="o"&gt;(&lt;/span&gt;42&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 52
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Great! Our imported code works as expected now. By following the relocation hints left for us by the compiler, we got the same results as in our monkey-patching calculations at the beginning of this post. Our relocation calculations also involved the &lt;code&gt;text_runtime_base&lt;/code&gt; address, which is not available at compile time. That's why the compiler could not calculate the &lt;code&gt;callq&lt;/code&gt; arguments in the first place and had to emit the relocations instead.&lt;/p&gt;
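
&lt;p&gt;To see where the two debug values come from, the formula can be replayed by hand. The snippet below is only an illustrative sketch: &lt;code&gt;rel32_value&lt;/code&gt; is a hypothetical helper that is not part of &lt;code&gt;loader.c&lt;/code&gt;, and the inputs are the offsets (&lt;code&gt;0x20&lt;/code&gt; and &lt;code&gt;0x2d&lt;/code&gt;), the addend (&lt;code&gt;-4&lt;/code&gt;) and the symbol value (&lt;code&gt;0&lt;/code&gt; for &lt;code&gt;add5&lt;/code&gt;) that &lt;code&gt;readelf&lt;/code&gt; reports for the two &lt;code&gt;R_X86_64_PLT32&lt;/code&gt; entries.&lt;/p&gt;

```c
/* Hypothetical helper, not part of the original loader.c.
 * Evaluates L + A - P and keeps the low 32 bits, which is what the loader
 * stores at the patched location (unsigned int is 32 bits wide on x86-64 Linux). */
static unsigned int rel32_value(unsigned long long symbol_address,
                                long long addend,
                                unsigned long long patch_address)
{
    return (unsigned int)(symbol_address + addend - patch_address);
}

/* With base standing for any text_runtime_base value:
 *   rel32_value(base + 0x0, -4, base + 0x20) gives 0xffffffdc
 *   rel32_value(base + 0x0, -4, base + 0x2d) gives 0xffffffcf
 */
```

&lt;p&gt;Both results match the &lt;code&gt;Calculated relocation&lt;/code&gt; lines printed by the loader.&lt;/p&gt;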

&lt;h3&gt;
  
  
  Handling constant data and global variables
&lt;/h3&gt;

&lt;p&gt;So far, we have been dealing with object files containing only executable code with no state. That is, the imported functions could compute their output solely based on the inputs. Let's see what happens if we add dependencies on constant data and global variables to our imported code. First, we add some more functions to our &lt;code&gt;obj.o&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;obj.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nf"&gt;get_hello&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"Hello, world!"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;get_var&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;set_var&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;var&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;get_hello&lt;/code&gt; returns a constant string, and &lt;code&gt;get_var&lt;/code&gt;/&lt;code&gt;set_var&lt;/code&gt; get and set a global variable, respectively. Next, let's recompile &lt;code&gt;obj.o&lt;/code&gt; and run our loader:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gcc &lt;span class="nt"&gt;-c&lt;/span&gt; obj.c
&lt;span class="nv"&gt;$ &lt;/span&gt;./loader 
Calculated relocation: 0xffffffdc
Calculated relocation: 0xffffffcf
No runtime base address &lt;span class="k"&gt;for &lt;/span&gt;section .rodata
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It looks like our loader tried to process more relocations but could not find the runtime address for the &lt;code&gt;.rodata&lt;/code&gt; section. Previously, we didn't even have a &lt;code&gt;.rodata&lt;/code&gt; section, but it has been added now because our &lt;code&gt;obj.o&lt;/code&gt; needs somewhere to store the constant string &lt;code&gt;Hello, world!&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;readelf &lt;span class="nt"&gt;--sections&lt;/span&gt; obj.o
There are 13 section headers, starting at offset 0x478:

Section Headers:
  &lt;span class="o"&gt;[&lt;/span&gt;Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  &lt;span class="o"&gt;[&lt;/span&gt; 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  &lt;span class="o"&gt;[&lt;/span&gt; 1] .text             PROGBITS         0000000000000000  00000040
       000000000000005f  0000000000000000  AX       0     0     1
  &lt;span class="o"&gt;[&lt;/span&gt; 2] .rela.text        RELA             0000000000000000  00000320
       0000000000000078  0000000000000018   I      10     1     8
  &lt;span class="o"&gt;[&lt;/span&gt; 3] .data             PROGBITS         0000000000000000  000000a0
       0000000000000004  0000000000000000  WA       0     0     4
  &lt;span class="o"&gt;[&lt;/span&gt; 4] .bss              NOBITS           0000000000000000  000000a4
       0000000000000000  0000000000000000  WA       0     0     1
  &lt;span class="o"&gt;[&lt;/span&gt; 5] .rodata           PROGBITS         0000000000000000  000000a4
       000000000000000d  0000000000000000   A       0     0     1
  &lt;span class="o"&gt;[&lt;/span&gt; 6] .comment          PROGBITS         0000000000000000  000000b1
       000000000000001d  0000000000000001  MS       0     0     1
  &lt;span class="o"&gt;[&lt;/span&gt; 7] .note.GNU-stack   PROGBITS         0000000000000000  000000ce
       0000000000000000  0000000000000000           0     0     1
  &lt;span class="o"&gt;[&lt;/span&gt; 8] .eh_frame         PROGBITS         0000000000000000  000000d0
       00000000000000b8  0000000000000000   A       0     0     8
  &lt;span class="o"&gt;[&lt;/span&gt; 9] .rela.eh_frame    RELA             0000000000000000  00000398
       0000000000000078  0000000000000018   I      10     8     8
  &lt;span class="o"&gt;[&lt;/span&gt;10] .symtab           SYMTAB           0000000000000000  00000188
       0000000000000168  0000000000000018          11    10     8
  &lt;span class="o"&gt;[&lt;/span&gt;11] .strtab           STRTAB           0000000000000000  000002f0
       000000000000002c  0000000000000000           0     0     1
  &lt;span class="o"&gt;[&lt;/span&gt;12] .shstrtab         STRTAB           0000000000000000  00000410
       0000000000000061  0000000000000000           0     0     1
Key to Flags:
  W &lt;span class="o"&gt;(&lt;/span&gt;write&lt;span class="o"&gt;)&lt;/span&gt;, A &lt;span class="o"&gt;(&lt;/span&gt;alloc&lt;span class="o"&gt;)&lt;/span&gt;, X &lt;span class="o"&gt;(&lt;/span&gt;execute&lt;span class="o"&gt;)&lt;/span&gt;, M &lt;span class="o"&gt;(&lt;/span&gt;merge&lt;span class="o"&gt;)&lt;/span&gt;, S &lt;span class="o"&gt;(&lt;/span&gt;strings&lt;span class="o"&gt;)&lt;/span&gt;, I &lt;span class="o"&gt;(&lt;/span&gt;info&lt;span class="o"&gt;)&lt;/span&gt;,
  L &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;link &lt;/span&gt;order&lt;span class="o"&gt;)&lt;/span&gt;, O &lt;span class="o"&gt;(&lt;/span&gt;extra OS processing required&lt;span class="o"&gt;)&lt;/span&gt;, G &lt;span class="o"&gt;(&lt;/span&gt;group&lt;span class="o"&gt;)&lt;/span&gt;, T &lt;span class="o"&gt;(&lt;/span&gt;TLS&lt;span class="o"&gt;)&lt;/span&gt;,
  C &lt;span class="o"&gt;(&lt;/span&gt;compressed&lt;span class="o"&gt;)&lt;/span&gt;, x &lt;span class="o"&gt;(&lt;/span&gt;unknown&lt;span class="o"&gt;)&lt;/span&gt;, o &lt;span class="o"&gt;(&lt;/span&gt;OS specific&lt;span class="o"&gt;)&lt;/span&gt;, E &lt;span class="o"&gt;(&lt;/span&gt;exclude&lt;span class="o"&gt;)&lt;/span&gt;,
  l &lt;span class="o"&gt;(&lt;/span&gt;large&lt;span class="o"&gt;)&lt;/span&gt;, p &lt;span class="o"&gt;(&lt;/span&gt;processor specific&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We also have more &lt;code&gt;.text&lt;/code&gt; relocations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;readelf &lt;span class="nt"&gt;--relocs&lt;/span&gt; obj.o

Relocation section &lt;span class="s1"&gt;'.rela.text'&lt;/span&gt; at offset 0x320 contains 5 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000020  000a00000004 R_X86_64_PLT32    0000000000000000 add5 - 4
00000000002d  000a00000004 R_X86_64_PLT32    0000000000000000 add5 - 4
00000000003a  000500000002 R_X86_64_PC32     0000000000000000 .rodata - 4
000000000046  000300000002 R_X86_64_PC32     0000000000000000 .data - 4
000000000058  000300000002 R_X86_64_PC32     0000000000000000 .data - 4
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The compiler emitted three new relocations this time, all of type &lt;code&gt;R_X86_64_PC32&lt;/code&gt;. They reference the symbols with indexes &lt;code&gt;3&lt;/code&gt; and &lt;code&gt;5&lt;/code&gt;, so let's find out what they are:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;readelf &lt;span class="nt"&gt;--symbols&lt;/span&gt; obj.o

Symbol table &lt;span class="s1"&gt;'.symtab'&lt;/span&gt; contains 15 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS obj.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    3
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    4
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    5
     6: 0000000000000000     4 OBJECT  LOCAL  DEFAULT    3 var
     7: 0000000000000000     0 SECTION LOCAL  DEFAULT    7
     8: 0000000000000000     0 SECTION LOCAL  DEFAULT    8
     9: 0000000000000000     0 SECTION LOCAL  DEFAULT    6
    10: 0000000000000000    15 FUNC    GLOBAL DEFAULT    1 add5
    11: 000000000000000f    36 FUNC    GLOBAL DEFAULT    1 add10
    12: 0000000000000033    13 FUNC    GLOBAL DEFAULT    1 get_hello
    13: 0000000000000040    12 FUNC    GLOBAL DEFAULT    1 get_var
    14: 000000000000004c    19 FUNC    GLOBAL DEFAULT    1 set_var
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Entries &lt;code&gt;3&lt;/code&gt; and &lt;code&gt;5&lt;/code&gt; don't have any names attached, but they reference something in the sections with index &lt;code&gt;3&lt;/code&gt; and &lt;code&gt;5&lt;/code&gt; respectively. In the output of the section table above, we can see that the section with index &lt;code&gt;3&lt;/code&gt; is &lt;code&gt;.data&lt;/code&gt; and the section with index &lt;code&gt;5&lt;/code&gt; is &lt;code&gt;.rodata&lt;/code&gt;. For a refresher on the most common sections in an ELF file, check out our &lt;a href="https://dev.to/ignatk/how-to-execute-an-object-file-part-1-2n9f"&gt;previous post&lt;/a&gt;. To import our newly added code and make it work, we also need to map the &lt;code&gt;.data&lt;/code&gt; and &lt;code&gt;.rodata&lt;/code&gt; sections in addition to the &lt;code&gt;.text&lt;/code&gt; section and process these &lt;code&gt;R_X86_64_PC32&lt;/code&gt; relocations.&lt;/p&gt;

&lt;p&gt;There is one caveat, though. If we check &lt;a href="https://refspecs.linuxfoundation.org/elf/x86_64-abi-0.95.pdf" rel="noopener noreferrer"&gt;the specification&lt;/a&gt;, we'll see that the &lt;code&gt;R_X86_64_PC32&lt;/code&gt; relocation produces a 32-bit output similar to the &lt;code&gt;R_X86_64_PLT32&lt;/code&gt; relocation. This means that the "distance" in memory between the patched position in &lt;code&gt;.text&lt;/code&gt; and the referenced symbol has to be small enough to fit into a signed 32-bit value (1 bit for the positive/negative sign and 31 bits for the actual data, so at most 2147483647 bytes in the positive direction and 2147483648 bytes in the negative one). Our &lt;code&gt;loader&lt;/code&gt; program uses the &lt;a href="https://man7.org/linux/man-pages/man2/mmap.2.html" rel="noopener noreferrer"&gt;mmap system call&lt;/a&gt; to allocate memory for the object section copies, but &lt;a href="https://man7.org/linux/man-pages/man2/mmap.2.html" rel="noopener noreferrer"&gt;mmap&lt;/a&gt; may allocate the mapping almost anywhere in the process address space. If we modify the &lt;code&gt;loader&lt;/code&gt; program to call &lt;a href="https://man7.org/linux/man-pages/man2/mmap.2.html" rel="noopener noreferrer"&gt;mmap&lt;/a&gt; for each section separately, we may end up with the &lt;code&gt;.rodata&lt;/code&gt; or &lt;code&gt;.data&lt;/code&gt; section mapped too far away from the &lt;code&gt;.text&lt;/code&gt; section and will not be able to process the &lt;code&gt;R_X86_64_PC32&lt;/code&gt; relocations. In other words, we need to ensure that the &lt;code&gt;.data&lt;/code&gt; and &lt;code&gt;.rodata&lt;/code&gt; sections are located relatively close to the &lt;code&gt;.text&lt;/code&gt; section at runtime:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fko75vz57kc7e9g2dlx70.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fko75vz57kc7e9g2dlx70.png" alt="runtime diff"&gt;&lt;/a&gt;&lt;/p&gt;
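
&lt;p&gt;The range limit described above can be expressed as a simple check. This is a minimal sketch under stated assumptions rather than code from the loader: &lt;code&gt;fits_in_rel32&lt;/code&gt;, &lt;code&gt;REL32_MAX&lt;/code&gt; and &lt;code&gt;REL32_MIN&lt;/code&gt; are hypothetical names, and addresses are passed as plain integers to keep the arithmetic explicit.&lt;/p&gt;

```c
/* Hypothetical helper, not part of the original loader.c. */

/* Limits of a signed 32-bit field, spelled out so the sketch needs no headers */
#define REL32_MAX 2147483647LL
#define REL32_MIN (-REL32_MAX - 1)

/* Returns 1 if S + A - P is representable in the signed 32-bit field that
 * R_X86_64_PC32 and R_X86_64_PLT32 patch, 0 otherwise. */
static int fits_in_rel32(long long symbol_address,
                         long long addend,
                         long long patch_address)
{
    long long value = symbol_address + addend - patch_address;

    if (value > REL32_MAX)
        return 0;   /* target is too far above the patched location */
    if (REL32_MIN > value)
        return 0;   /* target is too far below the patched location */
    return 1;
}
```

&lt;p&gt;A loader could run a check like this before writing the patched value and bail out when it fails, instead of silently truncating the displacement.&lt;/p&gt;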

&lt;p&gt;One way to achieve that would be to allocate the memory we need for all the sections with one &lt;a href="https://man7.org/linux/man-pages/man2/mmap.2.html" rel="noopener noreferrer"&gt;mmap call&lt;/a&gt;. Then, we'd break it into chunks and assign proper access permissions to each chunk. Let's modify our &lt;code&gt;loader&lt;/code&gt; program to do just that:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;loader.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="cm"&gt;/* runtime base address of the imported code */&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;uint8_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;text_runtime_base&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="cm"&gt;/* runtime base of the .data section */&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;uint8_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;data_runtime_base&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="cm"&gt;/* runtime base of the .rodata section */&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;uint8_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;rodata_runtime_base&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;parse_obj&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;

    &lt;span class="cm"&gt;/* find the `.text` entry in the sections table */&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Elf64_Shdr&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;text_hdr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lookup_section&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".text"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;text_hdr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to find .text&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ENOEXEC&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* find the `.data` entry in the sections table */&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Elf64_Shdr&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;data_hdr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lookup_section&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".data"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;data_hdr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to find .data&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ENOEXEC&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* find the `.rodata` entry in the sections table */&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Elf64_Shdr&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;rodata_hdr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lookup_section&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".rodata"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;rodata_hdr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to find .rodata&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ENOEXEC&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* allocate memory for `.text`, `.data` and `.rodata` copies rounding up each section to whole pages */&lt;/span&gt;
    &lt;span class="n"&gt;text_runtime_base&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mmap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;page_align&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;page_align&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;page_align&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rodata_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_size&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;PROT_READ&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;PROT_WRITE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MAP_PRIVATE&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;MAP_ANONYMOUS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text_runtime_base&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;MAP_FAILED&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;perror&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to allocate memory"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;errno&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* .data will come right after .text */&lt;/span&gt;
    &lt;span class="n"&gt;data_runtime_base&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text_runtime_base&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;page_align&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_size&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="cm"&gt;/* .rodata will come after .data */&lt;/span&gt;
    &lt;span class="n"&gt;rodata_runtime_base&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data_runtime_base&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;page_align&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_size&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="cm"&gt;/* copy the contents of `.text` section from the ELF file */&lt;/span&gt;
    &lt;span class="n"&gt;memcpy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text_runtime_base&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;text_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_offset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_size&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="cm"&gt;/* copy .data */&lt;/span&gt;
    &lt;span class="n"&gt;memcpy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data_runtime_base&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;data_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_offset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_size&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="cm"&gt;/* copy .rodata */&lt;/span&gt;
    &lt;span class="n"&gt;memcpy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rodata_runtime_base&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;rodata_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_offset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rodata_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_size&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="n"&gt;do_text_relocations&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="cm"&gt;/* make the `.text` copy readonly and executable */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mprotect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text_runtime_base&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;page_align&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_size&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;PROT_READ&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;PROT_EXEC&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;perror&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to make .text executable"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;errno&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* we don't need to do anything with .data - it should remain read/write */&lt;/span&gt;

    &lt;span class="cm"&gt;/* make the `.rodata` copy readonly */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mprotect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rodata_runtime_base&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;page_align&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rodata_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_size&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;PROT_READ&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;perror&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to make .rodata readonly"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;errno&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
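
&lt;p&gt;The layout arithmetic in the block above can be sketched in isolation. This is a minimal illustration rather than the loader's actual code: &lt;code&gt;page_align_sketch&lt;/code&gt;, &lt;code&gt;compute_layout&lt;/code&gt;, the 4096-byte page size and the section sizes are all assumptions made up for the example. A real loader would typically query the page size at runtime, for example with &lt;code&gt;sysconf(_SC_PAGESIZE)&lt;/code&gt;.&lt;/p&gt;

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round n up to the next multiple of page_size. */
static uint64_t page_align_sketch(uint64_t n, uint64_t page_size)
{
    return ((n + page_size - 1) / page_size) * page_size;
}

/* Offsets of each section copy from the start of the mapping (illustrative). */
struct layout_sketch {
    uint64_t text_off;   /* .text starts the mapping */
    uint64_t data_off;   /* .data starts on the page after .text */
    uint64_t rodata_off; /* .rodata starts on the page after .data */
    uint64_t total;      /* total bytes to request from mmap */
};

static struct layout_sketch compute_layout(uint64_t text_size,
                                           uint64_t data_size,
                                           uint64_t rodata_size,
                                           uint64_t page_size)
{
    struct layout_sketch l;

    l.text_off = 0;
    l.data_off = l.text_off + page_align_sketch(text_size, page_size);
    l.rodata_off = l.data_off + page_align_sketch(data_size, page_size);
    l.total = l.rodata_off + page_align_sketch(rodata_size, page_size);
    return l;
}
```

&lt;p&gt;With made-up sizes of 0x5e, 4 and 14 bytes and a 4096-byte page, each section occupies exactly one page, so the three copies would start at offsets 0, 0x1000 and 0x2000. Keeping every section on its own pages is what lets &lt;code&gt;mprotect&lt;/code&gt; apply different permissions to each of them.&lt;/p&gt;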



&lt;p&gt;Now that we have the runtime addresses of &lt;code&gt;.data&lt;/code&gt; and &lt;code&gt;.rodata&lt;/code&gt;, we can update the function that looks up a section's runtime base address for relocations:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;loader.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;uint8_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nf"&gt;section_runtime_base&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Elf64_Shdr&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;section&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;section_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;shstrtab&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;section&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;section_name_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;strlen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;section_name&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;strlen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".text"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;section_name_len&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;strcmp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".text"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;section_name&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;text_runtime_base&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;strlen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".data"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;section_name_len&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;strcmp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".data"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;section_name&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;data_runtime_base&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;strlen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".rodata"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;section_name_len&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;strcmp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".rodata"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;section_name&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;rodata_runtime_base&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"No runtime base address for section %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;section_name&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ENOENT&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
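
&lt;p&gt;As the number of supported sections grows, the same lookup could be written as a small table. The sketch below is only an illustration under assumptions: &lt;code&gt;section_base_sketch&lt;/code&gt; and &lt;code&gt;lookup_base_sketch&lt;/code&gt; are hypothetical names that do not exist in &lt;code&gt;loader.c&lt;/code&gt;, and unlike the function above it returns &lt;code&gt;NULL&lt;/code&gt; instead of exiting. It relies on &lt;code&gt;strcmp&lt;/code&gt; returning 0 only when both strings match in full, so no separate length check is needed.&lt;/p&gt;

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical table entry: a section name and where its copy lives at runtime. */
struct section_base_sketch {
    const char *name;
    uint8_t *base;
};

/* Return the runtime base for `name`, or NULL if no copy of that section exists. */
static uint8_t *lookup_base_sketch(const struct section_base_sketch *table,
                                   size_t count, const char *name)
{
    for (size_t i = 0; i < count; i++) {
        /* strcmp compares whole strings, so ".dat" does not match ".data" */
        if (strcmp(table[i].name, name) == 0)
            return table[i].base;
    }
    return NULL;
}
```

&lt;p&gt;A caller would fill the table once the runtime bases are known and treat a &lt;code&gt;NULL&lt;/code&gt; result the same way the function above treats an unknown section.&lt;/p&gt;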



&lt;p&gt;Finally, we can import and execute our new functions:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;loader.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;execute_funcs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="cm"&gt;/* pointers to imported functions */&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;add5&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;add10&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;get_hello&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;get_var&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;set_var&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;

    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"add10(%d) = %d&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;add10&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

    &lt;span class="n"&gt;get_hello&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lookup_function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"get_hello"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;get_hello&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to find get_hello function&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ENOENT&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;puts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Executing get_hello..."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"get_hello() = %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;get_hello&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

    &lt;span class="n"&gt;get_var&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lookup_function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"get_var"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;get_var&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to find get_var function&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ENOENT&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;puts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Executing get_var..."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"get_var() = %d&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;get_var&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

    &lt;span class="n"&gt;set_var&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lookup_function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"set_var"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;set_var&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to find set_var function&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ENOENT&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;puts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Executing set_var(42)..."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;set_var&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="n"&gt;puts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Executing get_var again..."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"get_var() = %d&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;get_var&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's try it out:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gcc &lt;span class="nt"&gt;-o&lt;/span&gt; loader loader.c 
&lt;span class="nv"&gt;$ &lt;/span&gt;./loader 
Calculated relocation: 0xffffffdc
Calculated relocation: 0xffffffcf
Executing add5...
add5&lt;span class="o"&gt;(&lt;/span&gt;42&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 47
Executing add10...
add10&lt;span class="o"&gt;(&lt;/span&gt;42&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 52
Executing get_hello...
get_hello&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;�UH��
Executing get_var...
get_var&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 1213580125
Executing set_var&lt;span class="o"&gt;(&lt;/span&gt;42&lt;span class="o"&gt;)&lt;/span&gt;...
Segmentation fault
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Uh-oh! We forgot to implement the new &lt;code&gt;R_X86_64_PC32&lt;/code&gt; relocation type. The &lt;a href="https://refspecs.linuxfoundation.org/elf/x86_64-abi-0.95.pdf" rel="noopener noreferrer"&gt;relocation formula&lt;/a&gt; here is &lt;code&gt;S + A - P&lt;/code&gt;. We already know about &lt;code&gt;A&lt;/code&gt; and &lt;code&gt;P&lt;/code&gt;. As for &lt;code&gt;S&lt;/code&gt; (quoting from &lt;a href="https://refspecs.linuxfoundation.org/elf/x86_64-abi-0.95.pdf" rel="noopener noreferrer"&gt;the spec&lt;/a&gt;):&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“the value of the symbol whose index resides in the relocation entry”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In our case, it is essentially the same as &lt;code&gt;L&lt;/code&gt; for &lt;code&gt;R_X86_64_PLT32&lt;/code&gt;. We can just reuse the implementation and remove the debug output in the process:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;loader.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="cm"&gt;/* from https://elixir.bootlin.com/linux/v5.11.6/source/arch/x86/include/asm/elf.h#L51 */&lt;/span&gt;
&lt;span class="cp"&gt;#define R_X86_64_PC32 2
#define R_X86_64_PLT32 4
&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;do_text_relocations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="cm"&gt;/* we actually cheat here - the name .rela.text is a convention, but not a
     * rule: to figure out which section should be patched by these relocations
     * we would need to examine the rela_text_hdr, but we skip it for simplicity
     */&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Elf64_Shdr&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;rela_text_hdr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lookup_section&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".rela.text"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;rela_text_hdr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to find .rela.text&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ENOEXEC&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;num_relocations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rela_text_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_size&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;rela_text_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_entsize&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Elf64_Rela&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;relocations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Elf64_Rela&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;rela_text_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_offset&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;num_relocations&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;symbol_idx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ELF64_R_SYM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;relocations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;r_info&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ELF64_R_TYPE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;relocations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;r_info&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="cm"&gt;/* where to patch .text */&lt;/span&gt;
        &lt;span class="kt"&gt;uint8_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;patch_offset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text_runtime_base&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;relocations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;r_offset&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="cm"&gt;/* symbol, with respect to which the relocation is performed */&lt;/span&gt;
        &lt;span class="kt"&gt;uint8_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;symbol_address&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;section_runtime_base&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;sections&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;symbols&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;symbol_idx&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;st_shndx&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;symbols&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;symbol_idx&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;st_value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="k"&gt;switch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;R_X86_64_PC32&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="cm"&gt;/* S + A - P, 32 bit output, S == L here */&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;R_X86_64_PLT32&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="cm"&gt;/* L + A - P, 32 bit output */&lt;/span&gt;
            &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="kt"&gt;uint32_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;patch_offset&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;symbol_address&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;relocations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;r_addend&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;patch_offset&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
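
&lt;p&gt;To make the &lt;code&gt;S + A - P&lt;/code&gt; arithmetic concrete, here is a small worked sketch. The helper names &lt;code&gt;pc32_value_sketch&lt;/code&gt; and &lt;code&gt;as_displacement&lt;/code&gt; are hypothetical, and the addresses used with them below are invented for illustration; they are not the addresses our loader actually gets from &lt;code&gt;mmap&lt;/code&gt;.&lt;/p&gt;

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: value to store for a 32-bit PC-relative relocation.
 *   s - runtime address of the referenced symbol (S)
 *   a - addend taken from the relocation entry (A)
 *   p - runtime address of the 4 bytes being patched (P)
 */
static uint32_t pc32_value_sketch(uint64_t s, int64_t a, uint64_t p)
{
    /* unsigned arithmetic wraps modulo 2^64; the low 32 bits are the
     * two's-complement encoding of the (possibly negative) distance */
    return (uint32_t)(s + (uint64_t)a - p);
}

/* Reinterpret the stored 32 bits as the signed displacement. */
static int32_t as_displacement(uint32_t v)
{
    int32_t d;

    memcpy(&d, &v, sizeof(d));
    return d;
}
```

&lt;p&gt;For example, with an invented symbol address of &lt;code&gt;0x7f0000001000&lt;/code&gt;, an addend of -4 and a patch address of &lt;code&gt;0x7f0000001020&lt;/code&gt;, the stored value is &lt;code&gt;0xffffffdc&lt;/code&gt;, which reads back as a displacement of -36: the target lies before the patched location. Values with the high bits set, like the ones in the earlier debug output, are therefore just negative 32-bit offsets.&lt;/p&gt;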



&lt;p&gt;Now we should be done. Another try:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gcc &lt;span class="nt"&gt;-o&lt;/span&gt; loader loader.c 
&lt;span class="nv"&gt;$ &lt;/span&gt;./loader 
Executing add5...
add5&lt;span class="o"&gt;(&lt;/span&gt;42&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 47
Executing add10...
add10&lt;span class="o"&gt;(&lt;/span&gt;42&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 52
Executing get_hello...
get_hello&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; Hello, world!
Executing get_var...
get_var&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 5
Executing set_var&lt;span class="o"&gt;(&lt;/span&gt;42&lt;span class="o"&gt;)&lt;/span&gt;...
Executing get_var again...
get_var&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 42
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This time we can successfully import functions that reference static constant data and global variables. We can even manipulate the object file’s internal state through the defined accessor interface. As before, the complete source code for this post is &lt;a href="https://github.com/cloudflare/cloudflare-blog/tree/master/2021-03-obj-file/2" rel="noopener noreferrer"&gt;available on GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In the next post, we will look into importing and executing object code with references to external libraries. Stay tuned!&lt;/p&gt;

</description>
      <category>programming</category>
      <category>computerscience</category>
      <category>linux</category>
      <category>showdev</category>
    </item>
    <item>
      <title>How to execute an object file: Part 1</title>
      <dc:creator>Ignat Korchagin</dc:creator>
      <pubDate>Mon, 08 Mar 2021 21:21:03 +0000</pubDate>
      <link>https://dev.to/ignatk/how-to-execute-an-object-file-part-1-2n9f</link>
      <guid>https://dev.to/ignatk/how-to-execute-an-object-file-part-1-2n9f</guid>
      <description>&lt;h2&gt;
  
  
  Calling a simple function without linking
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;This is a repost of my post from the &lt;a href="https://blog.cloudflare.com/how-to-execute-an-object-file-part-1/"&gt;Cloudflare Blog&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;When we write software using a high-level compiled programming language, there are usually a number of steps involved in transforming our source code into the final executable binary:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--MNt9La-p--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/2nx8oaoi4im1d7vryhby.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--MNt9La-p--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/2nx8oaoi4im1d7vryhby.png" alt="compile and link"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;First, our source files are compiled by a &lt;em&gt;compiler&lt;/em&gt; translating the high-level programming language into machine code. The output of the compiler is a number of &lt;em&gt;object&lt;/em&gt; files. If the project contains multiple source files, we usually get as many object files. The next step is the &lt;em&gt;linker&lt;/em&gt;: since the code in different object files may reference each other, the linker is responsible for assembling all these object files into one big program and binding these references together. The output of the linker is usually our target executable, so only one file.&lt;/p&gt;

&lt;p&gt;However, at this point, our executable might still be incomplete. These days, most executables on Linux are dynamically linked: the executable itself does not have all the code it needs to run a program. Instead it expects to "borrow" part of the code at runtime from &lt;a href="https://en.wikipedia.org/wiki/Library_(computing)#Shared_libraries"&gt;shared libraries&lt;/a&gt; for some of its functionality:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--4s70hbMF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/5uemqwl6sxpty6sg4zmr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--4s70hbMF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/5uemqwl6sxpty6sg4zmr.png" alt="dynamic loader"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This process is called &lt;em&gt;runtime linking&lt;/em&gt;: when our executable is being started, the operating system will invoke the &lt;em&gt;dynamic loader&lt;/em&gt;, which should find all the needed libraries, copy/map their code into our target process address space, and resolve all the dependencies our code has on them.&lt;/p&gt;

&lt;p&gt;One interesting thing to note about this overall process is that we get the executable machine code directly from step 1 (compiling the source code), but if any of the later steps fail, we still can't execute our program. So, in this series of blog posts we will investigate whether it is possible to execute machine code directly from object files, skipping all the later steps.&lt;/p&gt;

&lt;h4&gt;
  
  
  Why would we want to execute an object file?
&lt;/h4&gt;

&lt;p&gt;There may be many reasons. Perhaps we're writing an open-source replacement for a proprietary Linux driver or application and want to check whether some piece of code behaves the same way. Or we have a piece of a rare, obscure program that we can't link to, because it was compiled with a rare, obscure compiler. Maybe we have a source file but cannot create a full-featured executable because of missing build-time or runtime dependencies. Malware analysis, code from a different operating system, and so on: all these scenarios may put us in a position where either linking is not possible or the runtime environment is not suitable.&lt;/p&gt;

&lt;h3&gt;
  
  
  A simple toy object file
&lt;/h3&gt;

&lt;p&gt;For the purposes of this article, let's create a simple toy object file, so we can use it in our experiments:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;obj.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;add5&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;add10&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Our source file contains only 2 functions, &lt;code&gt;add5&lt;/code&gt; and &lt;code&gt;add10&lt;/code&gt;, which add 5 or 10 respectively to their only input parameter. It's a small but fully functional piece of code, and we can easily compile it into an object file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gcc &lt;span class="nt"&gt;-c&lt;/span&gt; obj.c 
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;ls
&lt;/span&gt;obj.c  obj.o
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Loading an object file into the process memory
&lt;/h3&gt;

&lt;p&gt;Now we will try to import the &lt;code&gt;add5&lt;/code&gt; and &lt;code&gt;add10&lt;/code&gt; functions from the object file and execute them. When we talk about executing an object file, we mean using an object file as some sort of a library. As we learned above, when we have an executable that utilises external shared libraries, the &lt;em&gt;dynamic loader&lt;/em&gt; loads these libraries into the process address space for us. With object files, however, we have to do this manually, because ultimately we can't execute machine code that doesn't reside in the operating system's RAM. So, to execute object files we still need some kind of a wrapper program:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;loader.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="cp"&gt;#include &amp;lt;stdio.h&amp;gt;
#include &amp;lt;stdint.h&amp;gt;
#include &amp;lt;stdlib.h&amp;gt;
#include &amp;lt;string.h&amp;gt;
&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;load_obj&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="cm"&gt;/* load obj.o into memory */&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;parse_obj&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="cm"&gt;/* parse an object file and find add5 and add10 functions */&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;execute_funcs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="cm"&gt;/* execute add5 and add10 with some inputs */&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;load_obj&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="n"&gt;parse_obj&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="n"&gt;execute_funcs&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Above is a self-contained object loader program with some functions as placeholders. We will be implementing these functions (and adding more) in the course of this post.&lt;/p&gt;

&lt;p&gt;First, as we established already, we need to load our object file into the process address space. We could just read the whole file into a buffer, but that would not be very efficient. Real-world object files might be big, but as we will see later, we don't need all of the object file's contents. So it is better to &lt;a href="https://man7.org/linux/man-pages/man2/mmap.2.html"&gt;&lt;code&gt;mmap&lt;/code&gt;&lt;/a&gt; the file instead: this way the operating system will lazily read the parts of the file we need at the time we need them. Let's implement the &lt;code&gt;load_obj&lt;/code&gt; function:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;loader.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="cm"&gt;/* for open(2), fstat(2) */&lt;/span&gt;
&lt;span class="cp"&gt;#include &amp;lt;sys/types.h&amp;gt;
#include &amp;lt;sys/stat.h&amp;gt;
#include &amp;lt;fcntl.h&amp;gt;
&lt;/span&gt;
&lt;span class="cm"&gt;/* for close(2), fstat(2) */&lt;/span&gt;
&lt;span class="cp"&gt;#include &amp;lt;unistd.h&amp;gt;
&lt;/span&gt;
&lt;span class="cm"&gt;/* for mmap(2) */&lt;/span&gt;
&lt;span class="cp"&gt;#include &amp;lt;sys/mman.h&amp;gt;
&lt;/span&gt;
&lt;span class="cm"&gt;/* parsing ELF files */&lt;/span&gt;
&lt;span class="cp"&gt;#include &amp;lt;elf.h&amp;gt;
&lt;/span&gt;
&lt;span class="cm"&gt;/* for errno */&lt;/span&gt;
&lt;span class="cp"&gt;#include &amp;lt;errno.h&amp;gt;
&lt;/span&gt;
&lt;span class="k"&gt;typedef&lt;/span&gt; &lt;span class="k"&gt;union&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Elf64_Ehdr&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;hdr&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;uint8_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;objhdr&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="cm"&gt;/* obj.o memory address */&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="n"&gt;objhdr&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;load_obj&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="nc"&gt;stat&lt;/span&gt; &lt;span class="n"&gt;sb&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;fd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"obj.o"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;O_RDONLY&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fd&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;perror&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Cannot open obj.o"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;errno&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* we need obj.o size for mmap(2) */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fstat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;sb&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;perror&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to get obj.o info"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;errno&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* mmap obj.o into memory */&lt;/span&gt;
    &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mmap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;st_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PROT_READ&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MAP_PRIVATE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;MAP_FAILED&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;perror&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Mapping obj.o failed"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;errno&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fd&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If we don't encounter any errors, after &lt;code&gt;load_obj&lt;/code&gt; executes the &lt;code&gt;obj&lt;/code&gt; global variable should hold the memory address of the beginning of our &lt;code&gt;obj.o&lt;/code&gt;. It is worth noting we have created a special union type for the &lt;code&gt;obj&lt;/code&gt; variable: we will be parsing &lt;code&gt;obj.o&lt;/code&gt; later (and peeking ahead - object files are actually &lt;a href="https://en.wikipedia.org/wiki/Executable_and_Linkable_Format"&gt;ELF files&lt;/a&gt;), so we will be referring to the address both as a pointer to &lt;code&gt;Elf64_Ehdr&lt;/code&gt; (the ELF header structure in C) and as a byte pointer (parsing ELF files involves calculating byte offsets from the beginning of the file).&lt;/p&gt;

&lt;h3&gt;
  
  
  A peek inside an object file
&lt;/h3&gt;

&lt;p&gt;To use some code from an object file, we need to find it first. As hinted above, object files are actually &lt;a href="https://en.wikipedia.org/wiki/Executable_and_Linkable_Format"&gt;ELF files&lt;/a&gt; (the same format as Linux executables and shared libraries) and luckily they’re easy to parse on Linux with the help of the standard &lt;code&gt;elf.h&lt;/code&gt; header, which includes many useful definitions related to the ELF file structure. But we need to know what we’re looking for, so a high-level understanding of an ELF file is needed.&lt;/p&gt;

&lt;h4&gt;
  
  
  ELF segments and sections
&lt;/h4&gt;

&lt;p&gt;Segments (also known as program headers) and sections are probably the main parts of an ELF file and usually the starting point of any ELF tutorial. However, there is often some confusion between the two. Different sections contain different types of ELF data: executable code (our main interest in this post), constant data, global variables etc. Segments, on the other hand, do not contain any data themselves - they just describe to the operating system how to properly load sections into RAM for the executable to work correctly. Some tutorials say "a segment may include 0 or more sections", which is not entirely accurate: segments do not contain sections, rather they just indicate to the OS where in memory a particular section should be loaded and what the access permissions for this memory are (read, write or execute):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--m7rPe6A---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/vii0urb5inufo20kqsbe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--m7rPe6A---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/vii0urb5inufo20kqsbe.png" alt="segments and sections"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Furthermore, object files do not contain any segments at all: an object file is not meant to be directly loaded by the OS. Instead, it is assumed it will be linked with some other code, so ELF segments are usually generated by the linker, not the compiler. We can check this by using the &lt;a href="https://man7.org/linux/man-pages/man1/readelf.1.html"&gt;readelf command&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;readelf &lt;span class="nt"&gt;--segments&lt;/span&gt; obj.o

There are no program headers &lt;span class="k"&gt;in &lt;/span&gt;this file.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Object file sections
&lt;/h4&gt;

&lt;p&gt;The same &lt;a href="https://man7.org/linux/man-pages/man1/readelf.1.html"&gt;readelf command&lt;/a&gt; can be used to get all the sections from our object file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;readelf &lt;span class="nt"&gt;--sections&lt;/span&gt; obj.o
There are 11 section headers, starting at offset 0x268:

Section Headers:
  &lt;span class="o"&gt;[&lt;/span&gt;Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  &lt;span class="o"&gt;[&lt;/span&gt; 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  &lt;span class="o"&gt;[&lt;/span&gt; 1] .text             PROGBITS         0000000000000000  00000040
       000000000000001e  0000000000000000  AX       0     0     1
  &lt;span class="o"&gt;[&lt;/span&gt; 2] .data             PROGBITS         0000000000000000  0000005e
       0000000000000000  0000000000000000  WA       0     0     1
  &lt;span class="o"&gt;[&lt;/span&gt; 3] .bss              NOBITS           0000000000000000  0000005e
       0000000000000000  0000000000000000  WA       0     0     1
  &lt;span class="o"&gt;[&lt;/span&gt; 4] .comment          PROGBITS         0000000000000000  0000005e
       000000000000001d  0000000000000001  MS       0     0     1
  &lt;span class="o"&gt;[&lt;/span&gt; 5] .note.GNU-stack   PROGBITS         0000000000000000  0000007b
       0000000000000000  0000000000000000           0     0     1
  &lt;span class="o"&gt;[&lt;/span&gt; 6] .eh_frame         PROGBITS         0000000000000000  00000080
       0000000000000058  0000000000000000   A       0     0     8
  &lt;span class="o"&gt;[&lt;/span&gt; 7] .rela.eh_frame    RELA             0000000000000000  000001e0
       0000000000000030  0000000000000018   I       8     6     8
  &lt;span class="o"&gt;[&lt;/span&gt; 8] .symtab           SYMTAB           0000000000000000  000000d8
       00000000000000f0  0000000000000018           9     8     8
  &lt;span class="o"&gt;[&lt;/span&gt; 9] .strtab           STRTAB           0000000000000000  000001c8
       0000000000000012  0000000000000000           0     0     1
  &lt;span class="o"&gt;[&lt;/span&gt;10] .shstrtab         STRTAB           0000000000000000  00000210
       0000000000000054  0000000000000000           0     0     1
Key to Flags:
  W &lt;span class="o"&gt;(&lt;/span&gt;write&lt;span class="o"&gt;)&lt;/span&gt;, A &lt;span class="o"&gt;(&lt;/span&gt;alloc&lt;span class="o"&gt;)&lt;/span&gt;, X &lt;span class="o"&gt;(&lt;/span&gt;execute&lt;span class="o"&gt;)&lt;/span&gt;, M &lt;span class="o"&gt;(&lt;/span&gt;merge&lt;span class="o"&gt;)&lt;/span&gt;, S &lt;span class="o"&gt;(&lt;/span&gt;strings&lt;span class="o"&gt;)&lt;/span&gt;, I &lt;span class="o"&gt;(&lt;/span&gt;info&lt;span class="o"&gt;)&lt;/span&gt;,
  L &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;link &lt;/span&gt;order&lt;span class="o"&gt;)&lt;/span&gt;, O &lt;span class="o"&gt;(&lt;/span&gt;extra OS processing required&lt;span class="o"&gt;)&lt;/span&gt;, G &lt;span class="o"&gt;(&lt;/span&gt;group&lt;span class="o"&gt;)&lt;/span&gt;, T &lt;span class="o"&gt;(&lt;/span&gt;TLS&lt;span class="o"&gt;)&lt;/span&gt;,
  C &lt;span class="o"&gt;(&lt;/span&gt;compressed&lt;span class="o"&gt;)&lt;/span&gt;, x &lt;span class="o"&gt;(&lt;/span&gt;unknown&lt;span class="o"&gt;)&lt;/span&gt;, o &lt;span class="o"&gt;(&lt;/span&gt;OS specific&lt;span class="o"&gt;)&lt;/span&gt;, E &lt;span class="o"&gt;(&lt;/span&gt;exclude&lt;span class="o"&gt;)&lt;/span&gt;,
  l &lt;span class="o"&gt;(&lt;/span&gt;large&lt;span class="o"&gt;)&lt;/span&gt;, p &lt;span class="o"&gt;(&lt;/span&gt;processor specific&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There are different tutorials online describing the most popular ELF sections in detail. Another great reference is the &lt;a href="https://man7.org/linux/man-pages/man5/elf.5.html"&gt;Linux manpages project&lt;/a&gt;. It is handy because it describes both sections’ purpose as well as C structure definitions from &lt;code&gt;elf.h&lt;/code&gt;, which makes it a one-stop shop for parsing ELF files. However, for completeness, below is a short description of the most popular sections one may encounter in an ELF file:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;.text&lt;/code&gt;: this section contains the executable code (the actual machine code, which was created by the compiler from our source code). This section is the primary area of interest for this post as it should contain the &lt;code&gt;add5&lt;/code&gt; and &lt;code&gt;add10&lt;/code&gt; functions we want to use.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.data&lt;/code&gt; and &lt;code&gt;.bss&lt;/code&gt;: these sections contain global and static local variables. The difference is: &lt;code&gt;.data&lt;/code&gt; has variables with an initial value (defined like &lt;code&gt;int foo = 5;&lt;/code&gt;) and &lt;code&gt;.bss&lt;/code&gt; just reserves space for variables with no initial value (defined like &lt;code&gt;int bar;&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.rodata&lt;/code&gt;: this section contains constant data (mostly strings or byte arrays). For example, if we use a string literal in the code (say, for &lt;code&gt;printf&lt;/code&gt; or some error message), it will be stored here. Note that &lt;code&gt;.rodata&lt;/code&gt; is missing from the output above as we didn't use any string literals or constant byte arrays in &lt;code&gt;obj.c&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.symtab&lt;/code&gt;: this section contains information about the symbols in the object file: functions, global variables, constants etc. It may also contain information about external symbols the object file needs, like needed functions from the external libraries.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.strtab&lt;/code&gt; and &lt;code&gt;.shstrtab&lt;/code&gt;: contain packed strings for the ELF file. Note that these are not the strings we may define in our source code (those go to the &lt;code&gt;.rodata&lt;/code&gt; section). These are the strings describing the names of other ELF structures, like symbols from &lt;code&gt;.symtab&lt;/code&gt; or even section names from the table above. The ELF binary format aims to make its structures compact and of a fixed size, so all strings are stored in one place and the respective data structures just reference them by an offset into either the &lt;code&gt;.shstrtab&lt;/code&gt; or the &lt;code&gt;.strtab&lt;/code&gt; section instead of storing the full string locally.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  The &lt;code&gt;.symtab&lt;/code&gt; section
&lt;/h4&gt;

&lt;p&gt;At this point, we know that the code we want to import and execute is located in the &lt;code&gt;obj.o&lt;/code&gt;'s &lt;code&gt;.text&lt;/code&gt; section. But we have two functions, &lt;code&gt;add5&lt;/code&gt; and &lt;code&gt;add10&lt;/code&gt;, remember? At this level the &lt;code&gt;.text&lt;/code&gt; section is just a byte blob - how do we know where each of these functions is located? This is where the &lt;code&gt;.symtab&lt;/code&gt; (the "symbol table") comes in handy. It is so important that it has its own dedicated parameter in &lt;a href="https://man7.org/linux/man-pages/man1/readelf.1.html"&gt;readelf&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;readelf &lt;span class="nt"&gt;--symbols&lt;/span&gt; obj.o

Symbol table &lt;span class="s1"&gt;'.symtab'&lt;/span&gt; contains 10 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS obj.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    2
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    3
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    5
     6: 0000000000000000     0 SECTION LOCAL  DEFAULT    6
     7: 0000000000000000     0 SECTION LOCAL  DEFAULT    4
     8: 0000000000000000    15 FUNC    GLOBAL DEFAULT    1 add5
     9: 000000000000000f    15 FUNC    GLOBAL DEFAULT    1 add10
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's ignore the other entries for now and just focus on the last two lines, because they conveniently have &lt;code&gt;add5&lt;/code&gt; and &lt;code&gt;add10&lt;/code&gt; as their symbol names. And indeed, this is the info about our functions. Apart from the names, the symbol table provides us with some additional metadata:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;Ndx&lt;/code&gt; column tells us the index of the section, where the symbol is located. We can cross-check it with the section table above and confirm that indeed these functions are located in &lt;code&gt;.text&lt;/code&gt; (section with the index &lt;code&gt;1&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Type&lt;/code&gt; being set to &lt;code&gt;FUNC&lt;/code&gt; confirms that these are indeed functions.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Size&lt;/code&gt; tells us the size of each function, but this information is not very useful in our context. The same goes for &lt;code&gt;Bind&lt;/code&gt; and &lt;code&gt;Vis&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Probably the most useful piece of information is &lt;code&gt;Value&lt;/code&gt;. The name is misleading, because in this context it is actually an offset from the start of the containing section. That is, the &lt;code&gt;add5&lt;/code&gt; function starts right at the beginning of &lt;code&gt;.text&lt;/code&gt; and &lt;code&gt;add10&lt;/code&gt; starts at offset &lt;code&gt;0xf&lt;/code&gt;, that is 15 bytes into the section.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So now we have all the pieces on how to parse an ELF file and find the functions we need.&lt;/p&gt;

&lt;h3&gt;
  
  
  Finding and executing a function from an object file
&lt;/h3&gt;

&lt;p&gt;Given what we have learned so far, let's define a plan on how to proceed to import and execute a function from an object file:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Find the ELF sections table and &lt;code&gt;.shstrtab&lt;/code&gt; section (we need &lt;code&gt;.shstrtab&lt;/code&gt; later to look up sections in the section table by name).&lt;/li&gt;
&lt;li&gt;Find the &lt;code&gt;.symtab&lt;/code&gt; and &lt;code&gt;.strtab&lt;/code&gt; sections (we need &lt;code&gt;.strtab&lt;/code&gt; to look up symbols by name in &lt;code&gt;.symtab&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Find the &lt;code&gt;.text&lt;/code&gt; section and copy it into RAM with executable permissions.&lt;/li&gt;
&lt;li&gt;Find &lt;code&gt;add5&lt;/code&gt; and &lt;code&gt;add10&lt;/code&gt; function offsets from the &lt;code&gt;.symtab&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Execute &lt;code&gt;add5&lt;/code&gt; and &lt;code&gt;add10&lt;/code&gt; functions.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Let's start by adding some more global variables and implementing the &lt;code&gt;parse_obj&lt;/code&gt; function:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;loader.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="cm"&gt;/* sections table */&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Elf64_Shdr&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;sections&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;shstrtab&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="cm"&gt;/* symbols table */&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Elf64_Sym&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;symbols&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="cm"&gt;/* number of entries in the symbols table */&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;num_symbols&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;strtab&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;parse_obj&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="cm"&gt;/* the sections table offset is encoded in the ELF header */&lt;/span&gt;
    &lt;span class="n"&gt;sections&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Elf64_Shdr&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;e_shoff&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="cm"&gt;/* the index of `.shstrtab` in the sections table is encoded in the ELF header
     * so we can find it without actually using a name lookup
     */&lt;/span&gt;
    &lt;span class="n"&gt;shstrtab&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;sections&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;e_shstrndx&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;sh_offset&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now that we have references to both the sections table and the &lt;code&gt;.shstrtab&lt;/code&gt; section, we can look up other sections by their name. Let's create a helper function for that:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;loader.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Elf64_Shdr&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nf"&gt;lookup_section&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;name_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;strlen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="cm"&gt;/* number of entries in the sections table is encoded in the ELF header */&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Elf64_Half&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;e_shnum&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="cm"&gt;/* sections table entry does not contain the string name of the section
         * instead, the `sh_name` parameter is an offset in the `.shstrtab`
         * section, which points to a string name
         */&lt;/span&gt;
        &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;section_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;shstrtab&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;sections&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;sh_name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;section_name_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;strlen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;section_name&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name_len&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;section_name_len&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;strcmp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;section_name&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="cm"&gt;/* we ignore sections with 0 size */&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sections&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;sh_size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;sections&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
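&lt;p&gt;The helper above trusts that every &lt;code&gt;sh_name&lt;/code&gt; offset lands inside &lt;code&gt;.shstrtab&lt;/code&gt; and that the name is NUL-terminated. As a minimal sketch of how that assumption could be checked, the hypothetical function below (&lt;code&gt;strtab_get&lt;/code&gt; and its parameters are illustrative names, not part of the original &lt;em&gt;loader.c&lt;/em&gt;) returns a name only if the offset is within the table and a terminator is found before the table ends:&lt;/p&gt;

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the original loader.c.
 * Returns a pointer to the NUL-terminated string at `offset` inside the
 * string table `tab` of `tab_size` bytes, or NULL if the offset is out of
 * range or the string is not terminated inside the table.
 */
static const char *strtab_get(const char *tab, size_t tab_size, size_t offset)
{
    if (offset >= tab_size)
        return NULL;

    /* look for the terminator only within the remaining bytes of the table */
    if (!memchr(tab + offset, '\0', tab_size - offset))
        return NULL;

    return tab + offset;
}
```

&lt;p&gt;A caller would pass the section's &lt;code&gt;sh_size&lt;/code&gt; as the table size and treat a &lt;code&gt;NULL&lt;/code&gt; result as a malformed object file.&lt;/p&gt;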



&lt;p&gt;Using our new helper function, we can now find the &lt;code&gt;.symtab&lt;/code&gt; and &lt;code&gt;.strtab&lt;/code&gt; sections:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;loader.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;parse_obj&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;

    &lt;span class="cm"&gt;/* find the `.symtab` entry in the sections table */&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Elf64_Shdr&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;symtab_hdr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lookup_section&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".symtab"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;symtab_hdr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to find .symtab&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ENOEXEC&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* the symbols table */&lt;/span&gt;
    &lt;span class="n"&gt;symbols&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Elf64_Sym&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;symtab_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_offset&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="cm"&gt;/* number of entries in the symbols table = table size / entry size */&lt;/span&gt;
    &lt;span class="n"&gt;num_symbols&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;symtab_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_size&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;symtab_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_entsize&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Elf64_Shdr&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;strtab_hdr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lookup_section&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".strtab"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;strtab_hdr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to find .strtab&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ENOEXEC&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;strtab&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;strtab_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_offset&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, let's focus on the &lt;code&gt;.text&lt;/code&gt; section. We noted earlier in our plan that it is not enough to just locate the &lt;code&gt;.text&lt;/code&gt; section in the object file, like we did with other sections. We would need to copy it over to a different location in RAM with executable permissions. There are several reasons for that, but these are the main ones:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Many CPU architectures either don't allow execution of machine code that is &lt;a href="https://en.wikipedia.org/wiki/Page_(computer_memory)"&gt;unaligned in memory&lt;/a&gt; (4 kilobytes for x86 systems), or execute it with a performance penalty. However, the &lt;code&gt;.text&lt;/code&gt; section in an ELF file is not guaranteed to be positioned at a page-aligned offset, because the on-disk version of the ELF file aims to be compact rather than convenient.&lt;/li&gt;
&lt;li&gt;We may need to modify some bytes in the &lt;code&gt;.text&lt;/code&gt; section to perform relocations (we don't need to do it in this case, but will be dealing with relocations in future posts). If, for example, we forget to use the &lt;code&gt;MAP_PRIVATE&lt;/code&gt; flag when mapping the ELF file, our modifications may propagate to the underlying file and corrupt it.&lt;/li&gt;
&lt;li&gt;Finally, the sections needed at runtime, like &lt;code&gt;.text&lt;/code&gt;, &lt;code&gt;.data&lt;/code&gt;, &lt;code&gt;.bss&lt;/code&gt; and &lt;code&gt;.rodata&lt;/code&gt;, require different memory permission bits: the &lt;code&gt;.text&lt;/code&gt; section memory needs to be both readable and executable, but not writable (it is considered a bad security practice to have memory both writable and executable). The &lt;code&gt;.data&lt;/code&gt; and &lt;code&gt;.bss&lt;/code&gt; sections need to be readable and writable to support global variables, but not executable. The &lt;code&gt;.rodata&lt;/code&gt; section should be read-only, because its purpose is to hold constant data. To support this, each section must be allocated on a page boundary, as we can only set memory permission bits on whole pages and not on custom ranges. Therefore, we need to create new, page-aligned memory ranges for these sections and copy the data there.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To create a page-aligned copy of the &lt;code&gt;.text&lt;/code&gt; section, we first need to know the page size. Many programs just hardcode the page size to 4096 (4 kilobytes), but we shouldn't rely on that. While it's accurate for most x86 systems, other CPU architectures, like arm64, might have a different page size, so hardcoding it may make our program non-portable. Let's find the page size and store it in another global variable:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;loader.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;uint64_t&lt;/span&gt; &lt;span class="n"&gt;page_size&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kr"&gt;inline&lt;/span&gt; &lt;span class="kt"&gt;uint64_t&lt;/span&gt; &lt;span class="nf"&gt;page_align&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;uint64_t&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page_size&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;~&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page_size&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;parse_obj&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;

    &lt;span class="cm"&gt;/* get system page size */&lt;/span&gt;
    &lt;span class="n"&gt;page_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sysconf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_SC_PAGESIZE&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
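&lt;p&gt;The &lt;code&gt;page_align&lt;/code&gt; helper in the listing above rounds a size up to a whole number of pages. To make the arithmetic concrete, here is a small sketch that takes the page size as an explicit parameter (the &lt;code&gt;round_up_to_page&lt;/code&gt; name is an assumption for illustration; the loader itself uses the global &lt;code&gt;page_size&lt;/code&gt;). The bit trick relies on the page size being a power of two:&lt;/p&gt;

```c
#include <stdint.h>

/* Hypothetical variant of page_align with an explicit page size.
 * Assumes page_size is a power of two, so (page_size - 1) is a mask of the
 * low bits: adding it pushes n past the next boundary unless n is already
 * aligned, and clearing those bits rounds the result down to that boundary.
 */
static uint64_t round_up_to_page(uint64_t n, uint64_t page_size)
{
    return (n + (page_size - 1)) & ~(page_size - 1);
}
```

&lt;p&gt;With a 4096-byte page, &lt;code&gt;round_up_to_page(1, 4096)&lt;/code&gt; and &lt;code&gt;round_up_to_page(4096, 4096)&lt;/code&gt; both give 4096, while &lt;code&gt;round_up_to_page(4097, 4096)&lt;/code&gt; gives 8192.&lt;/p&gt;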



&lt;p&gt;Notice that we have also added a convenience function, &lt;code&gt;page_align&lt;/code&gt;, which rounds the number passed in up to the next page-aligned boundary. Next, back to the &lt;code&gt;.text&lt;/code&gt; section. As a reminder, we need to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Find the &lt;code&gt;.text&lt;/code&gt; section metadata in the sections table.&lt;/li&gt;
&lt;li&gt;Allocate a chunk of memory to hold the &lt;code&gt;.text&lt;/code&gt; section copy.&lt;/li&gt;
&lt;li&gt;Actually copy the &lt;code&gt;.text&lt;/code&gt; section to the newly allocated memory.&lt;/li&gt;
&lt;li&gt;Make the &lt;code&gt;.text&lt;/code&gt; section executable, so we can later call functions from it.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Here is the implementation of the above steps:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;loader.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="cm"&gt;/* runtime base address of the imported code */&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;uint8_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;text_runtime_base&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;parse_obj&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;

    &lt;span class="cm"&gt;/* find the `.text` entry in the sections table */&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Elf64_Shdr&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;text_hdr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lookup_section&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".text"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;text_hdr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to find .text&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ENOEXEC&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* allocate memory for `.text` copy rounding it up to whole pages */&lt;/span&gt;
    &lt;span class="n"&gt;text_runtime_base&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mmap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;page_align&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_size&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;PROT_READ&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;PROT_WRITE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MAP_PRIVATE&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;MAP_ANONYMOUS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text_runtime_base&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;MAP_FAILED&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;perror&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to allocate memory for .text"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;errno&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* copy the contents of `.text` section from the ELF file */&lt;/span&gt;
    &lt;span class="n"&gt;memcpy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text_runtime_base&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;text_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_offset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_size&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="cm"&gt;/* make the `.text` copy readonly and executable */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mprotect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text_runtime_base&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;page_align&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text_hdr&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sh_size&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;PROT_READ&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;PROT_EXEC&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;perror&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to make .text executable"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;errno&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
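&lt;p&gt;The same allocate, copy, protect sequence would apply to other sections with different permission bits. As a rough sketch (the &lt;code&gt;copy_to_pages&lt;/code&gt; helper and its signature are assumptions for illustration, not part of the original &lt;em&gt;loader.c&lt;/em&gt;), the three steps could be wrapped like this:&lt;/p&gt;

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS when compiling in strict ISO C mode */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper, not part of the original loader.c.
 * Copies `size` bytes from `src` into freshly mapped, page-aligned memory
 * and then switches the mapping to the final protection `prot`
 * (for example PROT_READ | PROT_EXEC for code, PROT_READ for constants).
 * Returns NULL on failure or when size is 0.
 */
static void *copy_to_pages(const void *src, size_t size, int prot)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || size == 0)
        return NULL;

    /* round the length up to whole pages */
    size_t len = (size + (size_t)page - 1) / (size_t)page * (size_t)page;

    /* the mapping has to be writable while we fill it */
    void *dst = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (dst == MAP_FAILED)
        return NULL;

    memcpy(dst, src, size);

    /* drop write access and apply the final permissions */
    if (mprotect(dst, len, prot)) {
        munmap(dst, len);
        return NULL;
    }

    return dst;
}
```

&lt;p&gt;Keeping the mapping writable only while it is being filled means the memory is never writable and executable at the same time.&lt;/p&gt;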



&lt;p&gt;Now we have all the pieces we need to locate the address of a function. Let's write a helper for it:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;loader.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nf"&gt;lookup_function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;name_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;strlen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="cm"&gt;/* loop through all the symbols in the symbol table */&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;num_symbols&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="cm"&gt;/* consider only function symbols */&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ELF64_ST_TYPE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;symbols&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;st_info&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;STT_FUNC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="cm"&gt;/* symbol table entry does not contain the string name of the symbol
             * instead, the `st_name` parameter is an offset in the `.strtab`
             * section, which points to a string name
             */&lt;/span&gt;
            &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;function_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;strtab&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;symbols&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;st_name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;function_name_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;strlen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;function_name&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name_len&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;function_name_len&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;strcmp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;function_name&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="cm"&gt;/* st_value is an offset in bytes of the function from the
                 * beginning of the `.text` section
                 */&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;text_runtime_base&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;symbols&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;st_value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And finally we can implement the &lt;code&gt;execute_funcs&lt;/code&gt; function to import and execute code from an object file:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;loader.c&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;execute_funcs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="cm"&gt;/* pointers to imported add5 and add10 functions */&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;add5&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;add10&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="n"&gt;add5&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lookup_function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"add5"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;add5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to find add5 function&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ENOENT&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;puts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Executing add5..."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"add5(%d) = %d&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;add5&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

    &lt;span class="n"&gt;add10&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lookup_function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"add10"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;add10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to find add10 function&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ENOENT&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;puts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Executing add10..."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"add10(%d) = %d&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;add10&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's compile our loader and make sure it works as expected:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gcc &lt;span class="nt"&gt;-o&lt;/span&gt; loader loader.c 
&lt;span class="nv"&gt;$ &lt;/span&gt;./loader 
Executing add5...
add5&lt;span class="o"&gt;(&lt;/span&gt;42&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 47
Executing add10...
add10&lt;span class="o"&gt;(&lt;/span&gt;42&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 52
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Voila! We have successfully imported code from &lt;code&gt;obj.o&lt;/code&gt; and executed it. Of course, the example above is simplified: the code in the object file is self-contained, does not reference any global variables or constants, and does not have any external dependencies. In future posts we will look into more complex code and how to handle such cases.&lt;/p&gt;

&lt;h4&gt;
  
  
  Security considerations
&lt;/h4&gt;

&lt;p&gt;Processing external inputs, like parsing an ELF file from the disk above, should be handled with care. The code from &lt;em&gt;loader.c&lt;/em&gt; omits a lot of bounds checking and additional ELF integrity checks when parsing the object file. The code is simplified for the purposes of this post and is most likely not production ready, as it can probably be exploited by specially crafted malicious inputs. Use it only for educational purposes!&lt;/p&gt;
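&lt;p&gt;As one example of the kind of check that is missing, a loader could verify that a section's &lt;code&gt;sh_offset&lt;/code&gt; and &lt;code&gt;sh_size&lt;/code&gt; stay inside the mapped file before dereferencing anything. The sketch below is only an illustration: &lt;code&gt;range_in_file&lt;/code&gt; is a hypothetical name, and it assumes the caller knows the size of the mapped file.&lt;/p&gt;

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check, not part of the original loader.c.
 * Returns true if the byte range [offset, offset + size) lies entirely
 * inside a file of `file_size` bytes. Written without computing
 * offset + size, which could wrap around for malicious values.
 */
static bool range_in_file(uint64_t file_size, uint64_t offset, uint64_t size)
{
    return offset <= file_size && size <= file_size - offset;
}
```

&lt;p&gt;A real loader would need many more checks of this kind, for example on &lt;code&gt;e_shnum&lt;/code&gt;, &lt;code&gt;sh_entsize&lt;/code&gt; and every string table offset.&lt;/p&gt;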

&lt;p&gt;The complete source code from this post can be found &lt;a href="https://github.com/cloudflare/cloudflare-blog/tree/master/2021-03-obj-file/1"&gt;here&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>linux</category>
      <category>programming</category>
      <category>computerscience</category>
      <category>showdev</category>
    </item>
    <item>
      <title>Probably the simplest way to install Debian/Ubuntu in QEMU</title>
      <dc:creator>Ignat Korchagin</dc:creator>
      <pubDate>Tue, 12 Jan 2021 23:30:58 +0000</pubDate>
      <link>https://dev.to/ignatk/probably-the-simplest-way-to-install-debian-ubuntu-in-qemu-1j7c</link>
      <guid>https://dev.to/ignatk/probably-the-simplest-way-to-install-debian-ubuntu-in-qemu-1j7c</guid>
      <description>&lt;h2&gt;
  
  
  Without downloading any installation media
&lt;/h2&gt;

&lt;p&gt;Sometimes you might need to quickly spin up a VM for some testing or safe experiments. The typical way to do this is to download the installation media (usually some ISO image), attach it to the VM instance and start the installation process. However, there is a much simpler way to bootstrap a Debian/Ubuntu installation in QEMU over the network without downloading a single ISO.&lt;/p&gt;

&lt;h3&gt;
  
  
  Debian/Ubuntu installation media types
&lt;/h3&gt;

&lt;p&gt;Debian (and its derivative distribution Ubuntu) provides several installation media types for different needs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;full installation CD and DVD images&lt;/li&gt;
&lt;li&gt;CD images for network install&lt;/li&gt;
&lt;li&gt;even smaller CD images for network install (also known as &lt;code&gt;mini.iso&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Full CDs/DVDs
&lt;/h4&gt;

&lt;p&gt;Debian generally offers &lt;a href="https://www.debian.org/CD/http-ftp/#stable"&gt;two flavours of these&lt;/a&gt;: a CD image, which is around 650 MB (actually, for &lt;code&gt;amd64&lt;/code&gt; the current stable "Buster" image is 694 MB) and a bunch of DVD images up to 4.4 GB in size. For a simple standard installation we should be fine with a CD image. Even if we require extra packages and need to use the DVD, Debian recommends downloading only the first DVD image (1 of 3 at the time of this writing) and downloading the rest only if the installer requires them.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://ubuntu.com/download"&gt;Ubuntu images&lt;/a&gt; are categorised a bit differently: the project offers a "Desktop" image, which is around 2.6 GB for the current 20.04.1 LTS "Focal Fossa" release and a "Server" image, which is around 914 MB! Desktop images are... well, for installing Ubuntu on a desktop, and server images, as you may have guessed already, are for server installations. The main difference is that desktop images install a graphical user interface by default, while server images do not (although we can install one later). It is also worth mentioning that Ubuntu installation images are &lt;a href="https://en.wikipedia.org/wiki/Live_CD"&gt;live CDs&lt;/a&gt;: we can run Ubuntu directly from the CD without needing to install it. Finally, Ubuntu also offers variations of the "Desktop" image with alternative graphical user interfaces: Kubuntu with a KDE based desktop, Xubuntu with an Xfce based desktop etc. The default desktop for Ubuntu is GNOME, by the way.&lt;/p&gt;

&lt;h4&gt;
  
  
  Network install images
&lt;/h4&gt;

&lt;p&gt;Using a full CD/DVD is usually useful only in specific scenarios, for example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;we want to provision many machines at once (so we download the software once and use it multiple times).&lt;/li&gt;
&lt;li&gt;we are installing Linux on a machine not connected to the Internet (a very unlikely use case in today's connected world, probably only useful for some security-critical setups).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In almost all other cases, such as a simple one-off setup, we may be better off installing over the network. There are two major reasons for this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;with network install we only download packages we actually need: for example, if we want to run a simple HTTP server, we don't need to download the GNOME desktop (which would be included in the full installation CD).&lt;/li&gt;
&lt;li&gt;software updates: modern Linux distributions regularly publish software updates (usually much more often than they produce installation CDs), so right after the installation from the CD is complete we would likely need to check for updates and download new versions of some packages. With network install we get the latest and greatest from the start and don't download packages just to immediately overwrite them with a newer version.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So if the network install is much better, why do we have "network install CDs"? Well, an empty machine (or a VM) can't just do complex OS installation on its own, because it is empty (has no software). We need the installer software to solve the chicken and egg problem of installing software. These "network install CDs" provide just that: a minimal live operating system (Linux as well) to launch the installer, which will download and install the full operating system for us (often allowing us to customise and personalise the installation in the process).&lt;/p&gt;

&lt;p&gt;If we search for Debian network install, most likely we'll find &lt;a href="https://www.debian.org/CD/netinst/"&gt;this page&lt;/a&gt;, which offers a "minimal bootable CD" for Debian installation over the network. Debian claims these CDs should be between 150 MB and 300 MB in size depending on the architecture, but the current "Buster" image for &lt;code&gt;amd64&lt;/code&gt; is 336 MB (still a lot, but less than half the size of the full CD).&lt;/p&gt;

&lt;h4&gt;
  
  
  The mini.iso
&lt;/h4&gt;

&lt;p&gt;If you're wondering why we need a 336 MB CD image just to bootstrap a simple installer, which downloads packages from the Internet, you're not alone. Fortunately, both Debian and Ubuntu provide another variant of the installation media, known as &lt;code&gt;mini.iso&lt;/code&gt; (you can get the Debian Buster one &lt;a href="https://ftp.debian.org/debian/dists/buster/main/installer-amd64/current/images/netboot/mini.iso"&gt;here&lt;/a&gt;). These are indeed much smaller (only 48 MB for Debian Buster installation on &lt;code&gt;amd64&lt;/code&gt;), but do the job of network installation equally well. They are a bit tricky to find though, as there are not many links from the official documentation pointing to those images. Nevertheless, if you need to install Debian/Ubuntu and you have to use an ISO, I recommend the &lt;code&gt;mini.iso&lt;/code&gt; as it provides a truly minimal installer bootstrap.&lt;/p&gt;

&lt;h3&gt;
  
  
  Installing Debian/Ubuntu in QEMU without any installation media
&lt;/h3&gt;

&lt;p&gt;48 MB is good, but what is better? 0 MB, of course! In addition to downloading less, we get the benefit of not having leftover ISO files lying around after the installation is complete. This is not always possible, but for the specific use case of "bootstrapping a Debian/Ubuntu VM in QEMU" it is quite doable.&lt;/p&gt;

&lt;p&gt;Let's create a QEMU disk image, which will host our installation (we're creating a 16 GB image below, but you may adjust the capacity for your needs):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;qemu-img create &lt;span class="nt"&gt;-f&lt;/span&gt; qcow2 test.img 16G
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now if we start a QEMU VM with this disk image, we would see something like below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;qemu-system-x86_64 &lt;span class="nt"&gt;-nographic&lt;/span&gt; &lt;span class="nt"&gt;-m&lt;/span&gt; 4G &lt;span class="nt"&gt;-hda&lt;/span&gt; test.img
SeaBIOS &lt;span class="o"&gt;(&lt;/span&gt;version 1.12.0-1&lt;span class="o"&gt;)&lt;/span&gt;


iPXE &lt;span class="o"&gt;(&lt;/span&gt;http://ipxe.org&lt;span class="o"&gt;)&lt;/span&gt; 00:03.0 C980 PCI2.10 PnP PMM+BFF900F0+BFED00F0 C980



Booting from Hard Disk...
Boot failed: not a bootable disk

Booting from Floppy...
Boot failed: could not &lt;span class="nb"&gt;read &lt;/span&gt;the boot disk

Booting from DVD/CD...
Boot failed: Could not &lt;span class="nb"&gt;read &lt;/span&gt;from CDROM &lt;span class="o"&gt;(&lt;/span&gt;code 0003&lt;span class="o"&gt;)&lt;/span&gt;
Booting from ROM...
iPXE &lt;span class="o"&gt;(&lt;/span&gt;PCI 00:03.0&lt;span class="o"&gt;)&lt;/span&gt; starting execution...ok
iPXE initialising devices...ok



iPXE 1.0.0+git-20190125.36a4c85-1 &lt;span class="nt"&gt;--&lt;/span&gt; Open Source Network Boot Firmware &lt;span class="nt"&gt;--&lt;/span&gt; http:/
/ipxe.org
Features: DNS HTTP iSCSI NFS TFTP AoE ELF MBOOT PXE bzImage Menu PXEXT

net0: 52:54:00:12:34:56 using 82540em on 0000:00:03.0 &lt;span class="o"&gt;(&lt;/span&gt;open&lt;span class="o"&gt;)&lt;/span&gt;
  &lt;span class="o"&gt;[&lt;/span&gt;Link:up, TX:0 TXE:0 RX:0 RXE:0]
Configuring &lt;span class="o"&gt;(&lt;/span&gt;net0 52:54:00:12:34:56&lt;span class="o"&gt;)&lt;/span&gt;...... ok
net0: 10.0.2.15/255.255.255.0 gw 10.0.2.2
net0: fec0::5054:ff:fe12:3456/64 gw fe80::2
net0: fe80::5054:ff:fe12:3456/64
Nothing to boot: No such file or directory &lt;span class="o"&gt;(&lt;/span&gt;http://ipxe.org/2d03e13b&lt;span class="o"&gt;)&lt;/span&gt;
No more network devices

No bootable device.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Our QEMU VM tried to boot from the disk first, but obviously failed, because the disk is new and empty. Next, it tried to boot from a... floppy 💾! Although there is even an emoji for it, I will not cover anything about floppies here (and if someone doesn't know what a floppy is, just click &lt;a href="https://en.wikipedia.org/wiki/Floppy_disk"&gt;here&lt;/a&gt;). Next, it tried to boot from a CD/DVD, but we did not attach any ISO images to this VM, so it failed again. Finally, it tried to &lt;a href="https://en.wikipedia.org/wiki/Preboot_Execution_Environment"&gt;PXE-boot&lt;/a&gt; - a standardised form of machine network boot, but it requires supporting infrastructure on the local network: properly configured DHCP and TFTP servers. QEMU can actually emulate those for us with the &lt;code&gt;tftp=&lt;/code&gt; and &lt;code&gt;bootfile=&lt;/code&gt; options (for details, see &lt;a href="https://www.qemu.org/docs/master/system/invocation.html#hxtool-5"&gt;QEMU networking documentation&lt;/a&gt;), but we still need to provide the actual boot files, so in the end this option is no better than just attaching a &lt;code&gt;mini.iso&lt;/code&gt; to our VM.&lt;/p&gt;
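
&lt;p&gt;For comparison, a rough sketch of that local PXE route using QEMU's built-in TFTP server might look like the command below. The directory &lt;code&gt;/srv/tftp&lt;/code&gt; and the boot file name &lt;code&gt;pxelinux.0&lt;/code&gt; are assumptions for illustration only: the directory would have to be filled with netboot files downloaded beforehand, which is exactly the extra step we are trying to avoid:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# /srv/tftp and pxelinux.0 are placeholders (assumptions), adjust to your setup&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;qemu-system-x86_64 &lt;span class="nt"&gt;-nographic&lt;/span&gt; &lt;span class="nt"&gt;-m&lt;/span&gt; 4G &lt;span class="nt"&gt;-hda&lt;/span&gt; test.img &lt;span class="nt"&gt;-netdev&lt;/span&gt; user,id=net0,tftp=/srv/tftp,bootfile=pxelinux.0 &lt;span class="nt"&gt;-device&lt;/span&gt; e1000,netdev=net0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;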

&lt;p&gt;We may notice, though, that the PXE-boot option in QEMU is implemented via iPXE - &lt;a href="https://www.ipxe.org/"&gt;a popular open source bootloader and network card firmware&lt;/a&gt;. The nice thing about iPXE is that it is much "smarter" than the traditional network card firmware and can communicate over the more common HTTP protocol rather than the very old PXE-boot protocols, which are suitable only for the local network. And that means we can just use iPXE to boot our VM straight from the Internet over HTTP!&lt;/p&gt;

&lt;p&gt;We'll use just that to bootstrap our Debian/Ubuntu installation. But where do we get the installer from? The &lt;a href="https://ftp.debian.org/debian/dists/buster/main/installer-amd64/current/images/netboot/"&gt;same online folder&lt;/a&gt; that hosts the &lt;code&gt;mini.iso&lt;/code&gt; also has a folder named &lt;code&gt;debian-installer&lt;/code&gt;, which has an architecture specific folder inside (currently &lt;code&gt;amd64&lt;/code&gt; only). &lt;a href="http://ftp.debian.org/debian/dists/buster/main/installer-amd64/current/images/netboot/debian-installer/amd64/"&gt;There&lt;/a&gt; we will find a bunch of files, but we need only two: &lt;code&gt;linux&lt;/code&gt; - the installation environment Linux kernel image, and &lt;code&gt;initrd.gz&lt;/code&gt; - the userspace portion of the installation environment (the installer itself). Now let's try to get these into QEMU directly with iPXE.&lt;/p&gt;

&lt;p&gt;Start QEMU (we're using the &lt;code&gt;-nographic&lt;/code&gt; command line option to keep the console output in the terminal instead of the default QEMU-emulated monitor - this makes it possible to copy-paste the long Debian installer HTTP links later):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;qemu-system-x86_64 &lt;span class="nt"&gt;-nographic&lt;/span&gt; &lt;span class="nt"&gt;-m&lt;/span&gt; 4G &lt;span class="nt"&gt;-hda&lt;/span&gt; test.img
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When it gets to the PXE-boot stage we will briefly see:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Press Ctrl-B for the iPXE command line...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We need to quickly press &lt;code&gt;Ctrl+B&lt;/code&gt; here to interrupt the PXE-boot stage and get into the iPXE shell (if you missed the opportunity, just reboot the VM and retry):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;iPXE&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we can direct iPXE to download the Debian installer from the Internet and launch it. But first, we need to configure the network card in iPXE. If you're using the default QEMU user networking mode (like we do here), QEMU will simulate a DHCP server for you and the VM will be &lt;a href="https://en.wikipedia.org/wiki/Network_address_translation"&gt;NAT-ed&lt;/a&gt; to the Internet through the host machine. So all we need to do is configure the network interface via DHCP:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;iPXE&amp;gt; dhcp net0
Configuring (net0 52:54:00:12:34:56)...... ok
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next we will instruct iPXE to download the installer kernel (&lt;code&gt;linux&lt;/code&gt;) and the initrd (&lt;code&gt;initrd.gz&lt;/code&gt;) directly from the &lt;a href="http://ftp.debian.org/debian/dists/buster/main/installer-amd64/current/images/netboot/debian-installer/amd64/"&gt;online netboot folder&lt;/a&gt; via HTTP:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;iPXE&amp;gt; kernel http://ftp.debian.org/debian/dists/buster/main/installer-amd64/curr
ent/images/netboot/debian-installer/amd64/linux console=ttyS0
http://ftp.debian.org/debian/dists/buster/main/installer-amd64/current/images/ne
tboot/debian-installer/amd64/linux... ok
iPXE&amp;gt; initrd http://ftp.debian.org/debian/dists/buster/main/installer-amd64/curr
ent/images/netboot/debian-installer/amd64/initrd.gz
http://ftp.debian.org/debian/dists/buster/main/installer-amd64/current/images/ne
tboot/debian-installer/amd64/initrd.gz... ok
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice that we've added the &lt;code&gt;console=ttyS0&lt;/code&gt; command line option for the kernel. This is because we run QEMU with the &lt;code&gt;-nographic&lt;/code&gt; option and our VM input/output is done via an emulated serial port. So we need to tell the booting kernel to redirect its primary console to this serial port as well.&lt;/p&gt;

&lt;p&gt;Finally, let's boot the downloaded installer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;iPXE&amp;gt; boot
Probing EDD (edd=off to disable)... o
[    0.000000] Linux version 4.19.0-13-amd64 (debian-kernel@lists.debian.org) ()
[    0.000000] Command line: console=ttyS0
[    0.000000] x86/fpu: x87 FPU will use FXSAVE
[    0.000000] BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000bffdffff] usable
[    0.000000] BIOS-e820: [mem 0x00000000bffe0000-0x00000000bfffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000013fffffff] usable
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] SMBIOS 2.8 present.
[    0.000000] DMI: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/014
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When it finally boots, we will see the familiar Debian installation window:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--JkwfsUGg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/vgk2euwyyu0gde0pa3r8.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--JkwfsUGg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/vgk2euwyyu0gde0pa3r8.jpg" alt="Debian install"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At this point we can proceed with the installation as if we had booted from the &lt;code&gt;mini.iso&lt;/code&gt;, but we've downloaded only 34 MB (5 MB for &lt;code&gt;linux&lt;/code&gt; and 29 MB for &lt;code&gt;initrd.gz&lt;/code&gt;) and won't have any leftover ISO files lying around when the installation is complete.&lt;/p&gt;
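
&lt;p&gt;If we repeat this often, the commands typed at the iPXE prompt above could probably be collected into a small iPXE script. The sketch below simply restates those same commands; hosting it somewhere and loading it with the iPXE &lt;code&gt;chain&lt;/code&gt; command after &lt;code&gt;dhcp net0&lt;/code&gt; is an assumption on my side and not something covered in this post:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#!ipxe
# Sketch only: the same steps we typed interactively
dhcp net0
kernel http://ftp.debian.org/debian/dists/buster/main/installer-amd64/current/images/netboot/debian-installer/amd64/linux console=ttyS0
initrd http://ftp.debian.org/debian/dists/buster/main/installer-amd64/current/images/netboot/debian-installer/amd64/initrd.gz
boot
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;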

&lt;p&gt;It is worth noting that the same approach works if we run QEMU in &lt;a href="https://en.wikipedia.org/wiki/Unified_Extensible_Firmware_Interface"&gt;UEFI mode&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  Security considerations
&lt;/h4&gt;

&lt;p&gt;One downside of this approach is that we've downloaded the installer over non-encrypted HTTP: even though the upstream iPXE &lt;a href="https://ipxe.org/crypto"&gt;supports HTTPS&lt;/a&gt;, it is not enabled in the QEMU builds as of now. This means we might not be able to fully trust the running software on our VM, but it is probably OK for simple testing and experiments. If you're concerned about the security of the installation, use the &lt;code&gt;mini.iso&lt;/code&gt; approach instead.&lt;/p&gt;
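
&lt;p&gt;For reference, a minimal sketch of the &lt;code&gt;mini.iso&lt;/code&gt; approach, assuming the image has already been downloaded into the current directory from the HTTPS link mentioned earlier. The &lt;code&gt;-nographic&lt;/code&gt; option is dropped here on the assumption that the ISO boot menu expects a graphical console, and &lt;code&gt;-boot d&lt;/code&gt; asks QEMU to boot from the CD-ROM first:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# assumes mini.iso is in the current directory&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;qemu-system-x86_64 &lt;span class="nt"&gt;-m&lt;/span&gt; 4G &lt;span class="nt"&gt;-hda&lt;/span&gt; test.img &lt;span class="nt"&gt;-cdrom&lt;/span&gt; mini.iso &lt;span class="nt"&gt;-boot&lt;/span&gt; d
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;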

</description>
      <category>linux</category>
      <category>qemu</category>
      <category>ubuntu</category>
      <category>debian</category>
    </item>
    <item>
      <title>Passkb: how to reliably and securely bypass password paste blocking</title>
      <dc:creator>Ignat Korchagin</dc:creator>
      <pubDate>Sun, 13 Dec 2020 17:08:06 +0000</pubDate>
      <link>https://dev.to/ignatk/passkb-how-to-reliably-and-securely-bypass-password-paste-blocking-3oip</link>
      <guid>https://dev.to/ignatk/passkb-how-to-reliably-and-securely-bypass-password-paste-blocking-3oip</guid>
      <description>&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://pqsec.org/2020/12/09/passkb.html"&gt;pqsec.org&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;These days passwords have to be really strong to withstand modern password attacks. Not only do passwords have to be long and complex; with all the data breaches around the world it is also critical not to reuse the same password on several websites. As a result there is no way a human can memorise all the strong passwords for all the online services they need to access.&lt;/p&gt;

&lt;p&gt;This is where password managers come into play: instead of memorising all the passwords we only need to remember one strong password and the password manager will remember the rest for us by storing our service passwords in an encrypted database. However, some websites try to make using a password manager particularly inconvenient by blocking the ability to paste passwords on their password input forms.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why block password paste
&lt;/h2&gt;

&lt;p&gt;Most websites that block password paste think they improve their security by doing so, when in fact they make it worse, especially for their users. I will not list the details here, but I recommend reading this dated but still quite relevant &lt;a href="https://www.troyhunt.com/the-cobra-effect-that-is-disabling/"&gt;post from Troy Hunt&lt;/a&gt;. In a nutshell, disabling paste on password input fields is just "security theatre" and even some &lt;a href="https://www.ncsc.gov.uk/blog-post/let-them-paste-passwords"&gt;government organisations advise against&lt;/a&gt; it.&lt;/p&gt;

&lt;p&gt;Unfortunately, not all website owners recognise the arguments against blocking paste, and they still do it. This is very annoying if you use a password manager, because passwords in a password manager are usually randomly generated and you have to type each character by hand while peeking at the password plaintext. It is easy to miss a character or hit the wrong key, and it is not immediately apparent where the typo is.&lt;/p&gt;

&lt;h2&gt;
  
  
  Unblocking password paste
&lt;/h2&gt;

&lt;p&gt;There are many articles online (like &lt;a href="https://www.cyberciti.biz/linux-news/google-chrome-extension-to-removes-password-paste-blocking-on-website/"&gt;this one&lt;/a&gt;) on how to restore paste functionality on password fields that block it. However, they boil down to two approaches:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;if you're using Firefox, you can set &lt;code&gt;dom.event.clipboardevents.enabled&lt;/code&gt; to &lt;code&gt;false&lt;/code&gt; in &lt;code&gt;about:config&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;on both Chrome and Firefox, you can install &lt;a href="https://github.com/jswanner/DontFuckWithPaste"&gt;this&lt;/a&gt; (or any other similar) browser extension, which should unblock the paste functionality for you&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But these approaches have shortcomings. First, they are browser dependent: what if you use Safari, Opera or MS Edge? Secondly, the idea behind both approaches is to somehow disable/rewrite some JavaScript on pages with password input fields, because paste blocking is usually implemented by this JavaScript. But this is a bit like "cracking a nut with a sledgehammer", because modern Web applications rely on JavaScript for much of their primary functionality and just messing with it can potentially break the whole application.&lt;/p&gt;

&lt;h3&gt;
  
  
  Security considerations for browser extensions
&lt;/h3&gt;

&lt;p&gt;While a functionally broken Web application is a mere inconvenience, there are some security risks involved when using a browser extension to unblock password paste. For example, when you install &lt;a href="https://github.com/jswanner/DontFuckWithPaste"&gt;this popular extension&lt;/a&gt; in Chrome, you get a warning like below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--rHiXGtMN--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/hirch2012h4abzcdqym7.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--rHiXGtMN--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/hirch2012h4abzcdqym7.jpg" alt="chrome permissions"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let's reiterate: &lt;strong&gt;this extension would be able to read and modify all your browsing data on the fly&lt;/strong&gt;. And this is not some kind of a bug, this is by design as the extension needs this functionality to do its job. Moreover, remember that browser extensions process the data after the TLS termination, so even if you browse HTTPS resources, these extensions still see full plaintext data. Finally, browser extensions run in the browser's context - a process with full network access, so these extensions not only see all your browsing data, but can potentially leak it over the network. What could go wrong?&lt;/p&gt;

&lt;p&gt;Just to be clear, I'm not implying that all these extensions are evil by design. Most of them are even open-source and you can check that the author's intention is indeed only to unblock password paste functionality and nothing else. However, it is the environment that is risky: browser extensions (like the Web applications themselves) are mostly written in JavaScript. And in the JavaScript ecosystem developers tend to rely heavily on the &lt;a href="https://docs.npmjs.com/cli/v6/using-npm/registry"&gt;NPM registry&lt;/a&gt; (and similar code repositories) for their code dependencies. However, sometimes these dependencies themselves may become compromised (for example, due to a maintainer's account takeover/hijack), and all dependent projects suddenly start distributing potentially malicious code. And on top of that, as we've learned above, this code runs in a privileged context with full access to your browsing data and the Internet.&lt;/p&gt;

&lt;h2&gt;
  
  
  Rethinking paste unblock
&lt;/h2&gt;

&lt;p&gt;Let's zoom out for a bit and take a look at the current paste unblock methods. From a high-level perspective they follow the same path: we know that password paste blocking is implemented via some JavaScript code in the Web application and the above mentioned methods try to somehow neutralise or break this code.&lt;/p&gt;

&lt;p&gt;What if we take a different approach? Instead of trying to disable the paste blocking code we will just present the password input in the form the Web application expects - by typing it in. However, the typing doesn't have to be performed by the operator - we can have a program, which will instruct the operating system to type the complex password on our behalf. The ability to simulate keystrokes has existed in modern operating systems for a while now to support different accessibility applications, virtual keyboards, voice input etc. So why not use it to our advantage and "ask" the operating system to type a password for us (if it is otherwise cumbersome to type manually)?&lt;/p&gt;

&lt;h2&gt;
  
  
  Introducing &lt;a href="https://github.com/pqsec/passkb"&gt;passkb&lt;/a&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/pqsec/passkb"&gt;Passkb&lt;/a&gt; is a simple command line application, which helps to type complex passwords. The workflow is pretty simple: if you encounter a Web application, which blocks password paste, you can paste your password into &lt;a href="https://github.com/pqsec/passkb"&gt;passkb&lt;/a&gt; and &lt;a href="https://github.com/pqsec/passkb"&gt;passkb&lt;/a&gt; will type it in for you:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s---Fn_IFG9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/uieaulvutiv8bx6gl87b.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s---Fn_IFG9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/uieaulvutiv8bx6gl87b.gif" alt="passkb-demo"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We don't need to run third-party browser extensions or mess with browser settings anymore - this approach works for every browser and for any Web application. Moreover, because &lt;a href="https://github.com/pqsec/passkb"&gt;passkb&lt;/a&gt; emulates typing, Web applications can never block it, as they have to allow typing by design. Another notable security advantage is that &lt;a href="https://github.com/pqsec/passkb"&gt;passkb&lt;/a&gt; runs as a separate process in the operating system, so it can be easily sandboxed and denied network access (after all, we're trusting it with our passwords). The only downside is the slight inconvenience of switching between several windows to "copy-type" a password, and you have to do it in a timely manner: by default you have 5 seconds (configurable) to put the cursor in the right place for the tool to type the password into the intended form. Otherwise, it will type the password wherever the cursor and the focus currently are.&lt;/p&gt;

&lt;h3&gt;
  
  
  Linux specifics
&lt;/h3&gt;

&lt;p&gt;While on both Windows and Mac OS &lt;a href="https://github.com/pqsec/passkb"&gt;passkb&lt;/a&gt; uses standard APIs to generate typing events, Linux may require some additional config to make the tool usable. On Linux the tool uses the special &lt;a href="https://www.kernel.org/doc/html/latest/input/uinput.html"&gt;&lt;code&gt;/dev/uinput&lt;/code&gt;&lt;/a&gt; device, thus it has to be present in the system. Most popular Linux distributions already support this special device file (although you might have to run &lt;code&gt;modprobe uinput&lt;/code&gt;), but if you compile your own kernel, make sure to enable &lt;code&gt;CONFIG_INPUT_UINPUT&lt;/code&gt; in the kernel configuration file.&lt;/p&gt;

&lt;p&gt;Additionally, to generate typing events through this interface the process needs read/write access for &lt;a href="https://www.kernel.org/doc/html/latest/input/uinput.html"&gt;&lt;code&gt;/dev/uinput&lt;/code&gt;&lt;/a&gt;. However, by default, only the &lt;code&gt;root&lt;/code&gt; user is allowed to read and write to the file. Probably, the best way to reconfigure this is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;create a dedicated group&lt;/li&gt;
&lt;li&gt;change the group ownership of the &lt;code&gt;/dev/uinput&lt;/code&gt; to our newly created group&lt;/li&gt;
&lt;li&gt;change the access bits on &lt;code&gt;/dev/uinput&lt;/code&gt; to &lt;code&gt;660&lt;/code&gt; (group can read/write as well)&lt;/li&gt;
&lt;li&gt;add yourself (and any other system users, who would use the tool) to the group&lt;/li&gt;
&lt;/ul&gt;
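
&lt;p&gt;As a rough sketch, the steps above might translate to the following commands. The group name &lt;code&gt;uinput&lt;/code&gt; is an arbitrary choice made for this example, and you would probably need to log out and back in for the new group membership to take effect:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# "uinput" is a hypothetical group name, pick any name you like&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;sudo groupadd uinput
&lt;span class="nv"&gt;$ &lt;/span&gt;sudo chgrp uinput /dev/uinput
&lt;span class="nv"&gt;$ &lt;/span&gt;sudo chmod 660 /dev/uinput
&lt;span class="nv"&gt;$ &lt;/span&gt;sudo usermod -aG uinput "$USER"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;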

&lt;p&gt;And remember: permission and ownership changes to the device filesystem do not persist, so they will reset on reboot. You need some kind of a startup script or a &lt;a href="https://www.freedesktop.org/software/systemd/man/udev.html"&gt;systemd udev rule&lt;/a&gt; to adjust the ownership and the permission bits of &lt;code&gt;/dev/uinput&lt;/code&gt; on each boot.&lt;/p&gt;
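
&lt;p&gt;A udev rule for this could look roughly like the sketch below. It assumes a dedicated group named &lt;code&gt;uinput&lt;/code&gt; already exists; both the group name and the rule file name are hypothetical choices for this example:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# /etc/udev/rules.d/99-uinput.rules (hypothetical file name and group)
KERNEL=="uinput", GROUP="uinput", MODE="0660"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;After adding the file, the rules can be reloaded with &lt;code&gt;udevadm control --reload-rules&lt;/code&gt; followed by &lt;code&gt;udevadm trigger&lt;/code&gt;, or simply by rebooting.&lt;/p&gt;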

&lt;h2&gt;
  
  
  Potential future improvements
&lt;/h2&gt;

&lt;p&gt;It would be nice to improve the user experience of the tool and not confine the user to 5 seconds (or any other timeout) to put the cursor in the correct place. One way to do so is to figure out whether keyboard shortcuts could be registered for the tool to trigger the typing of the provided password. This way the user may have the familiar experience of &lt;code&gt;Ctrl/Cmd+C&lt;/code&gt;/&lt;code&gt;Ctrl/Cmd+V&lt;/code&gt;, but &lt;code&gt;Ctrl/Cmd+V&lt;/code&gt; would be replaced by some other key combo that types the password instead of pasting it. If you have any ideas on how to implement this, &lt;a href="https://github.com/pqsec/passkb/pulls"&gt;pull requests are welcome&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>passwords</category>
      <category>paste</category>
      <category>security</category>
      <category>go</category>
    </item>
    <item>
      <title>Sandboxing in Linux with zero lines of code</title>
      <dc:creator>Ignat Korchagin</dc:creator>
      <pubDate>Wed, 08 Jul 2020 13:02:32 +0000</pubDate>
      <link>https://dev.to/ignatk/sandboxing-in-linux-with-zero-lines-of-code-28e3</link>
      <guid>https://dev.to/ignatk/sandboxing-in-linux-with-zero-lines-of-code-28e3</guid>
      <description>&lt;h1&gt;
  
  
  Sandboxing in Linux with zero lines of code
&lt;/h1&gt;

&lt;p&gt;&lt;em&gt;This is a repost of my post from the &lt;a href="https://blog.cloudflare.com/sandboxing-in-linux-with-zero-lines-of-code/"&gt;Cloudflare Blog&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Modern Linux operating systems provide many tools to run code more securely. There are &lt;a href="https://www.man7.org/linux/man-pages/man7/namespaces.7.html"&gt;namespaces&lt;/a&gt; (the basic building blocks for containers), &lt;a href="https://www.kernel.org/doc/html/latest/admin-guide/LSM/index.html"&gt;Linux Security Modules&lt;/a&gt;, &lt;a href="https://wiki.gentoo.org/wiki/Integrity_Measurement_Architecture"&gt;Integrity Measurement Architecture&lt;/a&gt; etc.&lt;/p&gt;

&lt;p&gt;In this post we will review &lt;a href="https://www.kernel.org/doc/html/latest/userspace-api/seccomp_filter.html"&gt;Linux seccomp&lt;/a&gt; and learn how to sandbox any (even a proprietary) application without writing a single line of code.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--occgH43a--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog-cloudflare-com-assets.storage.googleapis.com/2020/07/linux-sandbox-1.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--occgH43a--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog-cloudflare-com-assets.storage.googleapis.com/2020/07/linux-sandbox-1.jpg" alt="linux-sandbox-1"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;small&gt;&lt;a href="https://www.deviantart.com/qubodup/art/Tux-Flat-SVG-607655623"&gt;Tux by Iwan Gabovitch, GPL&lt;/a&gt;&lt;/small&gt;&lt;br&gt;
&lt;small&gt;&lt;a href="https://pixabay.com/vectors/sandpit-sandbox-container-sand-35536/"&gt;Sandbox, Simplified Pixabay License&lt;/a&gt;&lt;br&gt;
&lt;/small&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Linux system calls
&lt;/h2&gt;

&lt;p&gt;System calls (syscalls) are a well-defined interface between &lt;a href="https://en.wikipedia.org/wiki/User_space"&gt;userspace applications&lt;/a&gt; and the &lt;a href="https://en.wikipedia.org/wiki/Kernel_(operating_system)"&gt;operating system (OS) kernel&lt;/a&gt;. On modern operating systems most applications provide only application-specific logic as code. Applications do not, and most of the time cannot, directly access low-level hardware or networking when they need to store data or send something over the wire. Instead they use system calls to ask the OS kernel to do specific hardware and networking tasks on their behalf:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--7oqCtYsE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog-cloudflare-com-assets.storage.googleapis.com/2020/07/image2-2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--7oqCtYsE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog-cloudflare-com-assets.storage.googleapis.com/2020/07/image2-2.png" alt="image2-2"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Apart from providing a generic high level way for applications to interact with the low level hardware, the system call architecture allows the OS kernel to manage available resources between applications as well as enforce policies, like application permissions, networking access control lists etc.&lt;/p&gt;
&lt;h2&gt;
  
  
  Linux seccomp
&lt;/h2&gt;

&lt;p&gt;Linux seccomp is &lt;a href="https://www.man7.org/linux/man-pages/man2/seccomp.2.html"&gt;yet another syscall&lt;/a&gt; on Linux, but it is a bit special, because it influences how the OS kernel will behave when the application uses other system calls. By default, the OS kernel has almost no insight into userspace application logic, so it provides all the possible services it can. But not all applications require all services. Consider an application which converts image formats: it needs the ability to read and write data from disk, but in its simplest form probably does not need any network access. Using seccomp an application can declare its intentions in advance to the Linux kernel. For this particular case it can notify the kernel that it will be using the &lt;a href="https://www.man7.org/linux/man-pages/man2/read.2.html"&gt;read&lt;/a&gt; and &lt;a href="https://www.man7.org/linux/man-pages/man2/write.2.html"&gt;write&lt;/a&gt; system calls, but never the &lt;a href="https://www.man7.org/linux/man-pages/man2/send.2.html"&gt;send&lt;/a&gt; and &lt;a href="https://www.man7.org/linux/man-pages/man2/recv.2.html"&gt;recv&lt;/a&gt; system calls (because its intent is to work with local files and never with the network). It’s like establishing a contract between the application and the OS kernel:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--kPDliOuF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog-cloudflare-com-assets.storage.googleapis.com/2020/07/image1-4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--kPDliOuF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog-cloudflare-com-assets.storage.googleapis.com/2020/07/image1-4.png" alt="image1-4"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;But what happens if the application later breaks the contract and tries to use one of the system calls it promised not to use? The kernel will “penalise” the application, usually by immediately terminating it. Linux seccomp also &lt;a href="https://www.man7.org/linux/man-pages/man2/seccomp.2.html"&gt;allows less restrictive actions&lt;/a&gt; for the kernel to take:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;instead of terminating the whole application, the kernel can be requested to terminate only the thread that issued the prohibited system call&lt;/li&gt;
&lt;li&gt;the kernel may just send a &lt;code&gt;SIGSYS&lt;/code&gt; &lt;a href="https://man7.org/linux/man-pages/man7/signal.7.html"&gt;signal&lt;/a&gt; to the calling thread&lt;/li&gt;
&lt;li&gt;the seccomp policy can specify an error code, which the kernel will then return to the calling application instead of executing the prohibited system call&lt;/li&gt;
&lt;li&gt;if the violating process is under &lt;a href="https://man7.org/linux/man-pages/man2/ptrace.2.html"&gt;ptrace&lt;/a&gt; (for example executing under a debugger), the kernel can notify the tracer (the debugger) that a prohibited system call is about to happen and let the debugger decide what to do&lt;/li&gt;
&lt;li&gt;the kernel may be instructed to allow and execute the system call, but log the attempt: this is useful when we want to verify that our seccomp policy is not too tight without the risk of terminating the application and potentially creating an outage&lt;/li&gt;
&lt;/ul&gt;
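
&lt;p&gt;Each of the actions above corresponds to a value that a seccomp filter returns to the kernel. As a rough sketch, the mapping could be written down as below. The constant names come from &lt;code&gt;&amp;lt;linux/seccomp.h&amp;gt;&lt;/code&gt;; &lt;code&gt;SECCOMP_RET_KILL_PROCESS&lt;/code&gt; and &lt;code&gt;SECCOMP_RET_LOG&lt;/code&gt; assume kernel headers from Linux 4.14 or newer, and the &lt;code&gt;struct action&lt;/code&gt; table is a hypothetical helper used only for illustration:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;#include &amp;lt;stdio.h&amp;gt;
#include &amp;lt;linux/seccomp.h&amp;gt;

/* illustrative table only: one row per action described in the list above */
struct action {
    const char *name;
    unsigned int value;
    const char *effect;
};

static const struct action actions[] = {
    { "SECCOMP_RET_KILL_PROCESS", SECCOMP_RET_KILL_PROCESS, "terminate the whole process" },
    { "SECCOMP_RET_KILL_THREAD",  SECCOMP_RET_KILL_THREAD,  "terminate only the calling thread" },
    { "SECCOMP_RET_TRAP",         SECCOMP_RET_TRAP,         "send SIGSYS to the calling thread" },
    { "SECCOMP_RET_ERRNO",        SECCOMP_RET_ERRNO,        "fail the syscall with an error code" },
    { "SECCOMP_RET_TRACE",        SECCOMP_RET_TRACE,        "notify the ptrace tracer" },
    { "SECCOMP_RET_LOG",          SECCOMP_RET_LOG,          "allow the syscall, but log it" },
    { "SECCOMP_RET_ALLOW",        SECCOMP_RET_ALLOW,        "allow the syscall" },
};

int main(void)
{
    size_t i;

    /* print every action constant with its numeric value and effect */
    for (i = 0; i &amp;lt; sizeof(actions) / sizeof(actions[0]); i++)
        printf("%-26s 0x%08x  %s\n", actions[i].name, actions[i].value, actions[i].effect);

    return 0;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;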

&lt;p&gt;Although there is a lot of flexibility in defining the potential penalty for the application, from a security perspective it is usually best to terminate the whole application upon a seccomp policy violation. The reason for that will be described later in the examples in the post.&lt;/p&gt;

&lt;p&gt;So why would the application take the risk of being abruptly terminated and declare its intentions beforehand, if it can just be “silent” and the OS kernel will allow it to use any system call by default? Of course, for a normally behaving application it makes no sense, but it turns out this feature is quite effective at protecting against rogue applications and &lt;a href="https://en.wikipedia.org/wiki/Arbitrary_code_execution"&gt;arbitrary code execution&lt;/a&gt; exploits.&lt;/p&gt;

&lt;p&gt;Imagine our image format converter is written in some unsafe language and &lt;a href="https://imagetragick.com/"&gt;an attacker was able to take control of the application by making it process some malformed image&lt;/a&gt;. The attacker might then try to steal some sensitive information from the machine running our converter and send it to themselves via the network. By default, the OS kernel will most likely allow it and a data leak will happen. But if our image converter “confined” (or sandboxed) itself beforehand to only read and write local data, the kernel will terminate the application when the latter tries to leak the data over the network, thus preventing the leak and locking the attacker out of our system!&lt;/p&gt;
&lt;h2&gt;
  
  
  Integrating seccomp into the application
&lt;/h2&gt;

&lt;p&gt;To see how seccomp can be used in practice, let’s consider a toy example program:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;myos.c:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="cp"&gt;#include &amp;lt;stdio.h&amp;gt;
#include &amp;lt;sys/utsname.h&amp;gt;
&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="nc"&gt;utsname&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uname&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;perror&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"uname failed"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"My OS is %s!&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sysname&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is a simplified version of the &lt;a href="https://www.man7.org/linux/man-pages/man1/uname.1.html"&gt;uname command line tool&lt;/a&gt;, which just prints your operating system name. Like its full-featured counterpart, it uses the &lt;a href="https://www.man7.org/linux/man-pages/man2/uname.2.html"&gt;uname system call&lt;/a&gt; to actually get the name of the current operating system from the kernel. Let’s see it in action:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gcc &lt;span class="nt"&gt;-o&lt;/span&gt; myos myos.c
&lt;span class="nv"&gt;$ &lt;/span&gt;./myos
My OS is Linux!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Great! We’re on Linux, so we can experiment further with &lt;a href="https://www.man7.org/linux/man-pages/man2/seccomp.2.html"&gt;seccomp&lt;/a&gt; (it is a Linux-only feature). Notice that we’re properly handling the error code after invoking the &lt;a href="https://www.man7.org/linux/man-pages/man2/uname.2.html"&gt;uname system call&lt;/a&gt;. However, according to the &lt;a href="https://www.man7.org/linux/man-pages/man2/uname.2.html"&gt;man page&lt;/a&gt; it can only fail when the passed-in buffer pointer is invalid. In that case the error number will be set to “EFAULT”, which translates to “bad address”. In our case, the “struct utsname” structure is allocated on the stack, so our pointer will always be valid. In other words, in normal circumstances the &lt;a href="https://www.man7.org/linux/man-pages/man2/uname.2.html"&gt;uname system call&lt;/a&gt; should never fail in this particular program.&lt;/p&gt;

&lt;p&gt;To illustrate seccomp capabilities we will add a “sandbox” function to our program before the main logic:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;myos_raw_seccomp.c:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="cp"&gt;#include &amp;lt;linux/seccomp.h&amp;gt;
#include &amp;lt;linux/filter.h&amp;gt;
#include &amp;lt;linux/audit.h&amp;gt;
#include &amp;lt;sys/ptrace.h&amp;gt;
#include &amp;lt;sys/prctl.h&amp;gt;
&lt;/span&gt;
&lt;span class="cp"&gt;#include &amp;lt;stdlib.h&amp;gt;
#include &amp;lt;stdio.h&amp;gt;
#include &amp;lt;stddef.h&amp;gt;
#include &amp;lt;sys/utsname.h&amp;gt;
#include &amp;lt;errno.h&amp;gt;
#include &amp;lt;unistd.h&amp;gt;
#include &amp;lt;sys/syscall.h&amp;gt;
&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;sandbox&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="nc"&gt;sock_filter&lt;/span&gt; &lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="cm"&gt;/* seccomp(2) says we should always check the arch */&lt;/span&gt;
        &lt;span class="cm"&gt;/* as syscalls may have different numbers on different architectures */&lt;/span&gt;
        &lt;span class="cm"&gt;/* see https://fedora.juszkiewicz.com.pl/syscalls.html */&lt;/span&gt;
        &lt;span class="cm"&gt;/* for simplicity we only allow x86_64 */&lt;/span&gt;
        &lt;span class="n"&gt;BPF_STMT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BPF_LD&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;BPF_W&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;BPF_ABS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;offsetof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="nc"&gt;seccomp_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;arch&lt;/span&gt;&lt;span class="p"&gt;))),&lt;/span&gt;
        &lt;span class="cm"&gt;/* if not x86_64, tell the kernel to kill the process */&lt;/span&gt;
        &lt;span class="n"&gt;BPF_JUMP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BPF_JMP&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;BPF_JEQ&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;BPF_K&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AUDIT_ARCH_X86_64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="cm"&gt;/* get the actual syscall number */&lt;/span&gt;
        &lt;span class="n"&gt;BPF_STMT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BPF_LD&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;BPF_W&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;BPF_ABS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;offsetof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="nc"&gt;seccomp_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;nr&lt;/span&gt;&lt;span class="p"&gt;))),&lt;/span&gt;
        &lt;span class="cm"&gt;/* if "uname", tell the kernel to return EPERM, otherwise just allow */&lt;/span&gt;
        &lt;span class="n"&gt;BPF_JUMP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BPF_JMP&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;BPF_JEQ&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;BPF_K&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SYS_uname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;BPF_STMT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BPF_RET&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;BPF_K&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SECCOMP_RET_ERRNO&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EPERM&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;SECCOMP_RET_DATA&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
        &lt;span class="n"&gt;BPF_STMT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BPF_RET&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;BPF_K&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SECCOMP_RET_ALLOW&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;BPF_STMT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BPF_RET&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;BPF_K&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SECCOMP_RET_KILL&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;

    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="nc"&gt;sock_fprog&lt;/span&gt; &lt;span class="n"&gt;prog&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;short&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])),&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;filter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;

    &lt;span class="cm"&gt;/* see seccomp(2) on why this is needed */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prctl&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;PR_SET_NO_NEW_PRIVS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;perror&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"PR_SET_NO_NEW_PRIVS failed"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* glibc does not have a wrapper for seccomp(2) */&lt;/span&gt;
    &lt;span class="cm"&gt;/* invoke it via the generic syscall wrapper */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;syscall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SYS_seccomp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SECCOMP_SET_MODE_FILTER&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;prog&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;perror&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"seccomp failed"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="nc"&gt;utsname&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;sandbox&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uname&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;perror&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"uname failed"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"My OS is %s!&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sysname&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To sandbox itself the application defines a &lt;a href="https://www.kernel.org/doc/Documentation/networking/filter.txt"&gt;BPF program&lt;/a&gt;, which implements the desired sandboxing policy. Then the application passes this program to the kernel via the &lt;a href="https://www.man7.org/linux/man-pages/man2/seccomp.2.html"&gt;seccomp&lt;/a&gt; system call. The kernel does some validation checks to ensure the BPF program is OK and then runs this program on every system call the application makes. The kernel uses the result of the program’s execution to determine whether the current call complies with the desired policy. In other words, the BPF program is the “contract” between the application and the kernel.&lt;/p&gt;
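
&lt;p&gt;The BPF program does not inspect the application’s memory. On every system call the kernel hands it a small &lt;code&gt;struct seccomp_data&lt;/code&gt; describing the call, and the &lt;code&gt;offsetof()&lt;/code&gt; expressions in the filter select fields of that structure. A minimal sketch, assuming only the structure definition from &lt;code&gt;&amp;lt;linux/seccomp.h&amp;gt;&lt;/code&gt;, that prints where those fields live:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;#include &amp;lt;stdio.h&amp;gt;
#include &amp;lt;stddef.h&amp;gt;
#include &amp;lt;linux/seccomp.h&amp;gt;

int main(void)
{
    /* offsets that the BPF_LD | BPF_W | BPF_ABS instructions load from */
    printf("nr (syscall number)       at offset %zu\n", offsetof(struct seccomp_data, nr));
    printf("arch (AUDIT_ARCH_* value) at offset %zu\n", offsetof(struct seccomp_data, arch));
    printf("instruction_pointer       at offset %zu\n", offsetof(struct seccomp_data, instruction_pointer));
    printf("args (syscall arguments)  at offset %zu\n", offsetof(struct seccomp_data, args));
    return 0;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;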

&lt;p&gt;In our toy example above, the BPF program simply checks which system call is about to be invoked. If the application is trying to use the &lt;a href="https://www.man7.org/linux/man-pages/man2/uname.2.html"&gt;uname system call&lt;/a&gt; we tell the kernel to just return an EPERM (which stands for “operation not permitted”) error code. We also tell the kernel to allow any other system call. Let’s see if it works now:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gcc &lt;span class="nt"&gt;-o&lt;/span&gt; myos myos_raw_seccomp.c
&lt;span class="nv"&gt;$ &lt;/span&gt;./myos
&lt;span class="nb"&gt;uname &lt;/span&gt;failed: Operation not permitted
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;uname&lt;/code&gt; now failed with the EPERM error code, and EPERM is not even described as a potential failure code in the &lt;a href="https://www.man7.org/linux/man-pages/man2/uname.2.html"&gt;uname manpage&lt;/a&gt;! So we know now that this happened because we “told” the kernel to prohibit us from using the uname syscall and to return EPERM instead. We can double check this by replacing EPERM with some other error code, which is totally inappropriate for this context, for example ENETDOWN (“network is down”). Why would we need the network to be up just to get the name of the currently running OS? Yet, recompiling and rerunning the program we get:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gcc &lt;span class="nt"&gt;-o&lt;/span&gt; myos myos_raw_seccomp.c
&lt;span class="nv"&gt;$ &lt;/span&gt;./myos
&lt;span class="nb"&gt;uname &lt;/span&gt;failed: Network is down
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can also verify the other part of our “contract” works as expected. We told the kernel to allow any other system call, remember? In our program, when uname fails, we convert the error code to a human readable message and print it on the screen with the &lt;a href="https://www.man7.org/linux/man-pages/man3/perror.3.html"&gt;perror&lt;/a&gt; function. To print on the screen &lt;a href="https://www.man7.org/linux/man-pages/man3/perror.3.html"&gt;perror&lt;/a&gt; uses the &lt;a href="https://www.man7.org/linux/man-pages/man2/write.2.html"&gt;write system call&lt;/a&gt; under the hood and since we can actually see the printed error message, we know that the kernel allowed our program to make the &lt;a href="https://www.man7.org/linux/man-pages/man2/write.2.html"&gt;write system call&lt;/a&gt; in the first place.&lt;/p&gt;

&lt;h3&gt;
  
  
  seccomp with libseccomp
&lt;/h3&gt;

&lt;p&gt;While it is possible to use seccomp directly, as in the examples above, BPF programs are cumbersome to write by hand and hard to debug, review and update later. That’s why it is usually a good idea to use a more high-level library, which abstracts away most of the low-level details. Luckily &lt;a href="https://github.com/seccomp/libseccomp"&gt;such a library exists&lt;/a&gt;: it is called libseccomp and is even recommended by the &lt;a href="https://www.man7.org/linux/man-pages/man2/seccomp.2.html"&gt;seccomp man page&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Let’s rewrite our program’s &lt;code&gt;sandbox()&lt;/code&gt; function to use this library instead:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;myos_libseccomp.c:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="cp"&gt;#define _GNU_SOURCE
#include &amp;lt;stdio.h&amp;gt;
#include &amp;lt;stdlib.h&amp;gt;
#include &amp;lt;sys/utsname.h&amp;gt;
#include &amp;lt;seccomp.h&amp;gt;
#include &amp;lt;err.h&amp;gt;
&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;sandbox&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="cm"&gt;/* allow all syscalls by default */&lt;/span&gt;
    &lt;span class="n"&gt;scmp_filter_ctx&lt;/span&gt; &lt;span class="n"&gt;seccomp_ctx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;seccomp_init&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SCMP_ACT_ALLOW&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;seccomp_ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"seccomp_init failed"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="cm"&gt;/* kill the process, if it tries to use "uname" syscall */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seccomp_rule_add_exact&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seccomp_ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SCMP_ACT_KILL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seccomp_syscall_resolve_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"uname"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;perror&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"seccomp_rule_add_exact failed"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* apply the composed filter */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seccomp_load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seccomp_ctx&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;perror&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"seccomp_load failed"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* release allocated context */&lt;/span&gt;
    &lt;span class="n"&gt;seccomp_release&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seccomp_ctx&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="nc"&gt;utsname&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;sandbox&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uname&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;perror&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"uname failed"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"My OS is %s!&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sysname&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With libseccomp our &lt;code&gt;sandbox()&lt;/code&gt; function is not only shorter and much more readable: we can also reference syscalls in our rules by name instead of by internal number, and we no longer have to deal with other quirks, like setting the &lt;code&gt;PR_SET_NO_NEW_PRIVS&lt;/code&gt; bit and handling system architectures.&lt;/p&gt;
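
&lt;p&gt;For comparison, the error-returning policy from the raw seccomp example can also be expressed with libseccomp. The following is a minimal sketch and not part of the original program: it relies on libseccomp’s &lt;code&gt;SCMP_ACT_ERRNO()&lt;/code&gt; and &lt;code&gt;SCMP_SYS()&lt;/code&gt; macros, assumes the libseccomp development headers are installed, and needs to be linked with &lt;code&gt;-lseccomp&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;#include &amp;lt;stdio.h&amp;gt;
#include &amp;lt;errno.h&amp;gt;
#include &amp;lt;sys/utsname.h&amp;gt;
#include &amp;lt;seccomp.h&amp;gt;

int main(void)
{
    struct utsname name;

    /* allow all syscalls by default */
    scmp_filter_ctx ctx = seccomp_init(SCMP_ACT_ALLOW);
    if (!ctx) {
        fprintf(stderr, "seccomp_init failed\n");
        return 1;
    }

    /* make "uname" fail with EPERM instead of killing the process */
    if (seccomp_rule_add(ctx, SCMP_ACT_ERRNO(EPERM), SCMP_SYS(uname), 0) ||
        seccomp_load(ctx)) {
        fprintf(stderr, "failed to install the seccomp filter\n");
        seccomp_release(ctx);
        return 1;
    }

    /* the filter stays active after the context is released */
    seccomp_release(ctx);

    if (uname(&amp;amp;name)) {
        perror("uname failed");
        return 1;
    }

    printf("My OS is %s!\n", name.sysname);
    return 0;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The policy is the same as in the raw BPF version, but the rule is a single call that names the syscall and the action.&lt;/p&gt;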

&lt;p&gt;It is worth noting we have modified our seccomp policy a bit. In the raw seccomp example above we instructed the kernel to return an error code when the application tries to execute a prohibited syscall. This is good for demonstration purposes, but in most cases a stricter action is required. Just returning an error code and allowing the application to continue gives the potentially malicious code a chance to bypass the policy. There are many syscalls in Linux and some of them do the same or similar things. For example, we might want to prohibit the application from reading data from disk, so we deny the &lt;a href="https://www.man7.org/linux/man-pages/man2/read.2.html"&gt;read&lt;/a&gt; syscall in our policy and tell the kernel to return an error code instead. However, if the application does get exploited, the exploit code/logic might look like below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="err"&gt;…&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="cm"&gt;/* hm… read failed, but what about pread? */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;pread&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;offset&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="cm"&gt;/* what about readv? */&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="cm"&gt;/* bypassed the prohibited read(2) syscall */&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="err"&gt;…&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wait, what?! There is more than one read system call? Yes, there are &lt;a href="https://www.man7.org/linux/man-pages/man2/read.2.html"&gt;read&lt;/a&gt;, &lt;a href="https://man7.org/linux/man-pages/man2/pread.2.html"&gt;pread&lt;/a&gt;, &lt;a href="https://man7.org/linux/man-pages/man2/readv.2.html"&gt;readv&lt;/a&gt; as well as more obscure ones, like &lt;a href="https://blog.cloudflare.com/io_submit-the-epoll-alternative-youve-never-heard-about/"&gt;io_submit&lt;/a&gt; and &lt;code&gt;io_uring_enter&lt;/code&gt;. Of course, it is our fault for providing an incomplete seccomp policy, which does not block all possible read syscalls. But if we had at least instructed the kernel to terminate the process immediately upon the first violation (the plain &lt;code&gt;read&lt;/code&gt;), the malicious code above would not have had the chance to be clever and try other options.&lt;/p&gt;

&lt;p&gt;Given the above, the libseccomp example now has a stricter policy, which tells the kernel to terminate the process upon a policy violation. Let’s see if it works:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gcc &lt;span class="nt"&gt;-o&lt;/span&gt; myos myos_libseccomp.c &lt;span class="nt"&gt;-lseccomp&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;./myos
Bad system call
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice that we need to link against &lt;a href="https://github.com/seccomp/libseccomp"&gt;libseccomp&lt;/a&gt; when compiling the application. Also, when we run the application, we don’t see the &lt;code&gt;uname failed: Operation not permitted&lt;/code&gt; error output anymore, because we don’t give the application the ability to even print a failure message. Instead, we see a &lt;code&gt;Bad system call&lt;/code&gt; message from the shell, which tells us that the application was terminated with a &lt;code&gt;SIGSYS&lt;/code&gt; &lt;a href="https://man7.org/linux/man-pages/man7/signal.7.html"&gt;signal&lt;/a&gt;. Great!&lt;/p&gt;

&lt;h2&gt;
  
  
  zero code seccomp
&lt;/h2&gt;

&lt;p&gt;The previous examples worked fine, but both of them have one disadvantage: we actually needed to modify the source code to embed our desired seccomp policy into the application. This is because the &lt;a href="https://www.man7.org/linux/man-pages/man2/seccomp.2.html"&gt;seccomp syscall&lt;/a&gt; affects the calling process and its children, but there is no interface to inject the policy from “outside”. It is expected that developers will sandbox their code themselves as part of the application logic, but in practice this rarely happens. When developers are starting a new project, most of the time the focus is on primary functionality and security features are usually either postponed or omitted altogether. Also, most real-world software is written using some high-level programming language and/or a framework, where the developers do not deal with the system calls directly and are probably even unaware which system calls are being used by their code.&lt;/p&gt;

&lt;p&gt;On the other hand, we have system operators, sysadmins, SREs and other folks, who run the above code in production. They are more incentivized to keep production systems secure, and thus would probably want to sandbox the services as much as possible. But most of the time they don’t have access to the source code. So there are mismatched expectations: developers have the ability to sandbox their code, but are usually not incentivized to do so, and operators have the incentive to sandbox the code, but don’t have the ability.&lt;/p&gt;

&lt;p&gt;This is where “zero code seccomp” might help, where an external operator can inject the desired sandbox policy into any process without needing to modify any source code. &lt;a href="https://www.freedesktop.org/wiki/Software/systemd/"&gt;Systemd&lt;/a&gt; is one of the popular implementations of a “zero code seccomp” approach. Systemd-managed services can have a &lt;a href="https://www.freedesktop.org/software/systemd/man/systemd.exec.html#SystemCallFilter="&gt;&lt;code&gt;SystemCallFilter=&lt;/code&gt;&lt;/a&gt; directive defined in their &lt;a href="https://www.freedesktop.org/software/systemd/man/systemd.service.html"&gt;unit files&lt;/a&gt; listing all the system calls the managed service is allowed to make. As an example, let’s go back to our toy application without any sandboxing code embedded:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gcc &lt;span class="nt"&gt;-o&lt;/span&gt; myos myos.c
&lt;span class="nv"&gt;$ &lt;/span&gt;./myos
My OS is Linux!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we can run the same code with systemd, but prohibit the application from using &lt;a href="https://www.man7.org/linux/man-pages/man2/uname.2.html"&gt;uname&lt;/a&gt; without changing or recompiling any code (we’re using &lt;a href="https://www.freedesktop.org/software/systemd/man/systemd-run.html"&gt;systemd-run&lt;/a&gt; to create an ephemeral systemd service unit for us):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;systemd-run &lt;span class="nt"&gt;--user&lt;/span&gt; &lt;span class="nt"&gt;--pty&lt;/span&gt; &lt;span class="nt"&gt;--same-dir&lt;/span&gt; &lt;span class="nt"&gt;--wait&lt;/span&gt; &lt;span class="nt"&gt;--collect&lt;/span&gt; &lt;span class="nt"&gt;--service-type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;--property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"SystemCallFilter=~uname"&lt;/span&gt; ./myos
Running as unit: run-u0.service
Press ^] three &lt;span class="nb"&gt;times &lt;/span&gt;within 1s to disconnect TTY.
Finished with result: signal
Main processes terminated with: &lt;span class="nv"&gt;code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;killed/status&lt;span class="o"&gt;=&lt;/span&gt;SYS
Service runtime: 6ms
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We don’t see the normal &lt;code&gt;My OS is Linux!&lt;/code&gt; output anymore and systemd conveniently tells us that the managed process was terminated with a &lt;code&gt;SIGSYS&lt;/code&gt; signal. We can even go further and use another directive &lt;a href="https://www.freedesktop.org/software/systemd/man/systemd.exec.html#SystemCallErrorNumber="&gt;&lt;code&gt;SystemCallErrorNumber=&lt;/code&gt;&lt;/a&gt; to configure our seccomp policy not to terminate the application, but return an error code instead as in our first seccomp raw example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;systemd-run &lt;span class="nt"&gt;--user&lt;/span&gt; &lt;span class="nt"&gt;--pty&lt;/span&gt; &lt;span class="nt"&gt;--same-dir&lt;/span&gt; &lt;span class="nt"&gt;--wait&lt;/span&gt; &lt;span class="nt"&gt;--collect&lt;/span&gt; &lt;span class="nt"&gt;--service-type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;--property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"SystemCallFilter=~uname"&lt;/span&gt; &lt;span class="nt"&gt;--property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"SystemCallErrorNumber=ENETDOWN"&lt;/span&gt; ./myos
Running as unit: run-u2.service
Press ^] three &lt;span class="nb"&gt;times &lt;/span&gt;within 1s to disconnect TTY.
&lt;span class="nb"&gt;uname &lt;/span&gt;failed: Network is down
Finished with result: exit-code
Main processes terminated with: &lt;span class="nv"&gt;code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;exited/status&lt;span class="o"&gt;=&lt;/span&gt;1
Service runtime: 6ms
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  systemd small print
&lt;/h3&gt;

&lt;p&gt;Great! We can now inject almost any seccomp policy into any process without the need to write any code or recompile the application. However, there is an interesting statement in the &lt;a href="https://www.freedesktop.org/software/systemd/man/systemd.exec.html#SystemCallFilter="&gt;systemd documentation&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;...Note that the &lt;code&gt;execve, exit, exit_group, getrlimit, rt_sigreturn, sigreturn&lt;/code&gt; system calls and the system calls for querying time and sleeping are implicitly whitelisted and do not need to be listed explicitly...&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Some system calls are implicitly allowed and we don’t have to list them. This is mostly related to the way systemd manages processes and injects the seccomp policy. We established earlier that a seccomp policy applies to the current process and its children. So, to inject the policy, systemd &lt;a href="https://www.man7.org/linux/man-pages/man2/fork.2.html"&gt;forks&lt;/a&gt; itself, calls &lt;a href="https://www.man7.org/linux/man-pages/man2/seccomp.2.html"&gt;seccomp&lt;/a&gt; in the forked process and then &lt;a href="https://www.man7.org/linux/man-pages/man2/execve.2.html"&gt;execs&lt;/a&gt; the forked process into the target application. That’s why always allowing the &lt;a href="https://www.man7.org/linux/man-pages/man2/execve.2.html"&gt;execve&lt;/a&gt; system call is necessary in the first place, because otherwise systemd cannot do its job as a service manager.&lt;/p&gt;
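
&lt;p&gt;The fork, seccomp, exec sequence can be sketched in a few lines of C. This is only an illustration of the ordering and not systemd’s actual code: the &lt;code&gt;install_filter()&lt;/code&gt; and &lt;code&gt;run_sandboxed()&lt;/code&gt; names are hypothetical, the filter denies just &lt;code&gt;uname&lt;/code&gt;, and it skips the architecture check a production filter needs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;#include &amp;lt;signal.h&amp;gt;
#include &amp;lt;stddef.h&amp;gt;
#include &amp;lt;linux/filter.h&amp;gt;
#include &amp;lt;linux/seccomp.h&amp;gt;
#include &amp;lt;sys/prctl.h&amp;gt;
#include &amp;lt;sys/syscall.h&amp;gt;
#include &amp;lt;sys/types.h&amp;gt;
#include &amp;lt;sys/wait.h&amp;gt;
#include &amp;lt;unistd.h&amp;gt;

/* Hypothetical helper: deny uname(2), allow everything else.
 * Simplification: no architecture check, which a real filter needs. */
static int install_filter(void)
{
    struct sock_filter insns[] = {
        /* load the syscall number */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        /* uname: fall through to KILL; anything else: skip to ALLOW */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_uname, 0, 1),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = sizeof(insns) / sizeof(insns[0]),
        .filter = insns,
    };

    /* required for unprivileged processes before installing a filter */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &amp;amp;prog);
}

/* Hypothetical launcher: fork, sandbox the child, then exec the target.
 * Returns the raw wait status, or -1 on error. */
int run_sandboxed(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid &amp;lt; 0)
        return -1;
    if (pid == 0) {
        if (install_filter() != 0)
            _exit(126);
        /* the filter is already active here, so it must allow execve(2) */
        execvp(argv[0], argv);
        _exit(127);
    }
    if (waitpid(pid, &amp;amp;status, 0) &amp;lt; 0)
        return -1;
    return status;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A caller can decode the returned status with &lt;code&gt;WIFSIGNALED()&lt;/code&gt; and &lt;code&gt;WTERMSIG()&lt;/code&gt;: a child that calls &lt;code&gt;uname&lt;/code&gt; should be reported as terminated by &lt;code&gt;SIGSYS&lt;/code&gt;. Because the filter is installed before the exec, denying &lt;code&gt;execve&lt;/code&gt; in it would stop the launcher from ever starting the target.&lt;/p&gt;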

&lt;p&gt;But what if we want to explicitly prohibit some of these system calls? If we continue with the &lt;a href="https://www.man7.org/linux/man-pages/man2/execve.2.html"&gt;execve&lt;/a&gt; as an example, that can actually be a dangerous system call most applications would want to prohibit. Seccomp is an effective tool to protect the code from arbitrary code execution exploits, remember? If a malicious actor takes over our code, most likely the first thing they will try is to get a shell (or replace our code with any other application which is easier to control) by directing our code to call &lt;a href="https://www.man7.org/linux/man-pages/man2/execve.2.html"&gt;execve&lt;/a&gt; with the desired binary. So, if our code does not need &lt;a href="https://www.man7.org/linux/man-pages/man2/execve.2.html"&gt;execve&lt;/a&gt; for its main functionality, it would be a good idea to prohibit it. Unfortunately, it is not possible with the systemd &lt;a href="https://www.freedesktop.org/software/systemd/man/systemd.exec.html#SystemCallFilter="&gt;&lt;code&gt;SystemCallFilter=&lt;/code&gt;&lt;/a&gt; approach...&lt;/p&gt;

&lt;h2&gt;
  
  
  Introducing Cloudflare sandbox
&lt;/h2&gt;

&lt;p&gt;We really liked the “zero code seccomp” approach with the systemd &lt;a href="https://www.freedesktop.org/software/systemd/man/systemd.exec.html#SystemCallFilter="&gt;&lt;code&gt;SystemCallFilter=&lt;/code&gt;&lt;/a&gt; directive, but were not satisfied with its limitations. We decided to take it one step further and make it possible to prohibit any system call in any process externally without touching its source code, so we came up with the &lt;a href="https://github.com/cloudflare/sandbox"&gt;Cloudflare sandbox&lt;/a&gt;. It’s a simple standalone toolkit consisting of a shared library and an executable. The shared library is supposed to be used with dynamically linked applications and the executable is for statically linked applications.&lt;/p&gt;

&lt;h3&gt;
  
  
  sandboxing dynamically linked executables
&lt;/h3&gt;

&lt;p&gt;For dynamically linked executables it is possible to inject custom code into the process by utilizing the &lt;a href="https://www.man7.org/linux/man-pages/man8/ld.so.8.html"&gt;&lt;code&gt;LD_PRELOAD&lt;/code&gt;&lt;/a&gt; environment variable. The &lt;code&gt;libsandbox.so&lt;/code&gt; shared library from our toolkit also contains a so-called &lt;a href="https://gcc.gnu.org/onlinedocs/gccint/Initialization.html"&gt;initialization routine&lt;/a&gt;, which should be executed before the main logic. This is how we make the target application sandbox itself:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.man7.org/linux/man-pages/man8/ld.so.8.html"&gt;&lt;code&gt;LD_PRELOAD&lt;/code&gt;&lt;/a&gt; tells the dynamic loader to load our &lt;code&gt;libsandbox.so&lt;/code&gt; as part of the application, when it starts&lt;/li&gt;
&lt;li&gt;the runtime executes the &lt;a href="https://gcc.gnu.org/onlinedocs/gccint/Initialization.html"&gt;initialization routine&lt;/a&gt; from the &lt;code&gt;libsandbox.so&lt;/code&gt; before most of the main logic&lt;/li&gt;
&lt;li&gt;our initialization routine configures the sandbox policy described in special environment variables&lt;/li&gt;
&lt;li&gt;by the time the main application logic begins executing, the target process has the configured seccomp policy enforced&lt;/li&gt;
&lt;/ul&gt;
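
&lt;p&gt;The steps above can be sketched as a tiny preloadable library. This is not the actual &lt;code&gt;libsandbox.so&lt;/code&gt; source, just a minimal illustration under several assumptions: it uses the GCC &lt;code&gt;constructor&lt;/code&gt; attribute as the initialization routine, treats &lt;code&gt;SECCOMP_SYSCALL_DENY&lt;/code&gt; as a colon-separated list, knows only two syscall names through a hypothetical &lt;code&gt;lookup_syscall()&lt;/code&gt; table, and skips the architecture check a real filter needs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;#define _GNU_SOURCE
#include &amp;lt;stddef.h&amp;gt;
#include &amp;lt;stdio.h&amp;gt;
#include &amp;lt;stdlib.h&amp;gt;
#include &amp;lt;string.h&amp;gt;
#include &amp;lt;linux/filter.h&amp;gt;
#include &amp;lt;linux/seccomp.h&amp;gt;
#include &amp;lt;sys/prctl.h&amp;gt;
#include &amp;lt;sys/syscall.h&amp;gt;

/* Hypothetical name table, kept tiny for illustration. */
static long lookup_syscall(const char *name)
{
    if (strcmp(name, "uname") == 0)
        return SYS_uname;
    if (strcmp(name, "execve") == 0)
        return SYS_execve;
    return -1;
}

/* Stack one small filter that kills on syscall sysno and allows the rest.
 * Simplification: no architecture check. */
static int deny_syscall(long sysno)
{
    struct sock_filter insns[] = {
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, (unsigned int)sysno, 0, 1),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = sizeof(insns) / sizeof(insns[0]),
        .filter = insns,
    };

    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &amp;amp;prog);
}

/* Runs when the dynamic loader loads this library, before main(). */
__attribute__((constructor))
static void sandbox_init(void)
{
    const char *env = getenv("SECCOMP_SYSCALL_DENY");
    char *list, *name, *save = NULL;

    if (env == NULL)
        return; /* no policy requested */
    list = strdup(env);
    if (list == NULL)
        abort();
    for (name = strtok_r(list, ":", &amp;amp;save); name != NULL;
         name = strtok_r(NULL, ":", &amp;amp;save)) {
        long sysno = lookup_syscall(name);

        if (sysno &amp;lt; 0 || deny_syscall(sysno) != 0) {
            fprintf(stderr, "cannot deny %s\n", name);
            abort(); /* fail closed rather than run unprotected */
        }
    }
    free(list);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Built as a shared object (for example with &lt;code&gt;gcc -shared -fPIC&lt;/code&gt;) and listed in &lt;code&gt;LD_PRELOAD&lt;/code&gt;, such a library would install its filters before &lt;code&gt;main()&lt;/code&gt; runs. Seccomp filters stack, so installing one small filter per denied syscall keeps the sketch short at the cost of some per-syscall overhead.&lt;/p&gt;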

&lt;p&gt;Let’s see how it works with our &lt;code&gt;myos&lt;/code&gt; toy tool. First, we need to make sure it is actually a dynamically linked application:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;ldd ./myos
    linux-vdso.so.1 &lt;span class="o"&gt;(&lt;/span&gt;0x00007ffd8e1e3000&lt;span class="o"&gt;)&lt;/span&gt;
    libc.so.6 &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; /lib/x86_64-linux-gnu/libc.so.6 &lt;span class="o"&gt;(&lt;/span&gt;0x00007f339ddfb000&lt;span class="o"&gt;)&lt;/span&gt;
    /lib64/ld-linux-x86-64.so.2 &lt;span class="o"&gt;(&lt;/span&gt;0x00007f339dfcf000&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Yes, it is. Now, let’s prohibit it from using the &lt;a href="https://www.man7.org/linux/man-pages/man2/uname.2.html"&gt;uname&lt;/a&gt; system call with our toolkit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ LD_PRELOAD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/usr/lib/x86_64-linux-gnu/libsandbox.so &lt;span class="nv"&gt;SECCOMP_SYSCALL_DENY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;uname&lt;/span&gt; ./myos
adding &lt;span class="nb"&gt;uname &lt;/span&gt;to the process seccomp filter
Bad system call
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Yet again, we’ve managed to inject our desired seccomp policy into the &lt;code&gt;myos&lt;/code&gt; application without modifying or recompiling it. The advantage of this approach is that it doesn’t have the shortcomings of systemd’s &lt;a href="https://www.freedesktop.org/software/systemd/man/systemd.exec.html#SystemCallFilter="&gt;&lt;code&gt;SystemCallFilter=&lt;/code&gt;&lt;/a&gt; and we can block any system call (luckily &lt;a href="https://www.gnu.org/software/bash/"&gt;Bash&lt;/a&gt; is a dynamically linked application as well):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;/bin/bash &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s1"&gt;'echo I will try to execve something...; exec /usr/bin/echo Doing arbitrary code execution!!!'&lt;/span&gt;
I will try to execve something...
Doing arbitrary code execution!!!
&lt;span class="nv"&gt;$ LD_PRELOAD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/usr/lib/x86_64-linux-gnu/libsandbox.so &lt;span class="nv"&gt;SECCOMP_SYSCALL_DENY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;execve /bin/bash &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s1"&gt;'echo I will try to execve something...; exec /usr/bin/echo Doing arbitrary code execution!!!'&lt;/span&gt;
adding execve to the process seccomp filter
I will try to execve something...
Bad system call
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The only problem here is that we may accidentally forget to &lt;code&gt;LD_PRELOAD&lt;/code&gt; our &lt;code&gt;libsandbox.so&lt;/code&gt; library and potentially run unprotected. Also, as described in the &lt;a href="https://www.man7.org/linux/man-pages/man8/ld.so.8.html"&gt;man page&lt;/a&gt;, &lt;code&gt;LD_PRELOAD&lt;/code&gt; has some limitations. We can overcome all these problems by making &lt;code&gt;libsandbox.so&lt;/code&gt; a permanent part of our target application:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;patchelf &lt;span class="nt"&gt;--add-needed&lt;/span&gt; /usr/lib/x86_64-linux-gnu/libsandbox.so ./myos
&lt;span class="nv"&gt;$ &lt;/span&gt;ldd ./myos
    linux-vdso.so.1 &lt;span class="o"&gt;(&lt;/span&gt;0x00007fff835ae000&lt;span class="o"&gt;)&lt;/span&gt;
    /usr/lib/x86_64-linux-gnu/libsandbox.so &lt;span class="o"&gt;(&lt;/span&gt;0x00007fc4f55f2000&lt;span class="o"&gt;)&lt;/span&gt;
    libc.so.6 &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; /lib/x86_64-linux-gnu/libc.so.6 &lt;span class="o"&gt;(&lt;/span&gt;0x00007fc4f5425000&lt;span class="o"&gt;)&lt;/span&gt;
    /lib64/ld-linux-x86-64.so.2 &lt;span class="o"&gt;(&lt;/span&gt;0x00007fc4f5647000&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Again, we didn’t need access to the source code here, but patched the compiled binary instead. Now we can just configure our seccomp policy as before without the need for &lt;code&gt;LD_PRELOAD&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;./myos
My OS is Linux!
&lt;span class="nv"&gt;$ SECCOMP_SYSCALL_DENY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;uname&lt;/span&gt; ./myos
adding &lt;span class="nb"&gt;uname &lt;/span&gt;to the process seccomp filter
Bad system call
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  sandboxing statically linked executables
&lt;/h3&gt;

&lt;p&gt;The above method is quite convenient and easy, but it doesn’t work for statically linked executables:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gcc &lt;span class="nt"&gt;-static&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; myos myos.c
&lt;span class="nv"&gt;$ &lt;/span&gt;ldd ./myos
    not a dynamic executable
&lt;span class="nv"&gt;$ LD_PRELOAD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/usr/lib/x86_64-linux-gnu/libsandbox.so &lt;span class="nv"&gt;SECCOMP_SYSCALL_DENY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;uname&lt;/span&gt; ./myos
My OS is Linux!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is because there is no &lt;a href="https://www.man7.org/linux/man-pages/man8/ld.so.8.html"&gt;dynamic loader&lt;/a&gt; involved in starting a statically linked executable, so &lt;code&gt;LD_PRELOAD&lt;/code&gt; has no effect. For this case our toolkit contains a special application launcher, which will inject the seccomp rules similarly to the way systemd does it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;sandboxify ./myos
My OS is Linux!
&lt;span class="nv"&gt;$ SECCOMP_SYSCALL_DENY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;uname &lt;/span&gt;sandboxify ./myos
adding &lt;span class="nb"&gt;uname &lt;/span&gt;to the process seccomp filter
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note that we don’t see the &lt;code&gt;Bad system call&lt;/code&gt; shell message anymore, because our target executable is being started by the launcher instead of the shell directly. Unlike systemd however, we can use this launcher to block dangerous system calls, like &lt;a href="https://www.man7.org/linux/man-pages/man2/execve.2.html"&gt;execve&lt;/a&gt;, as well:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;sandboxify /bin/bash &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s1"&gt;'echo I will try to execve something...; exec /usr/bin/echo Doing arbitrary code execution!!!'&lt;/span&gt;
I will try to execve something...
Doing arbitrary code execution!!!
&lt;span class="nv"&gt;$ SECCOMP_SYSCALL_DENY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;execve sandboxify /bin/bash &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s1"&gt;'echo I will try to execve something...; exec /usr/bin/echo Doing arbitrary code execution!!!'&lt;/span&gt;
adding execve to the process seccomp filter
I will try to execve something...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  sandboxify vs libsandbox.so
&lt;/h3&gt;

&lt;p&gt;From the examples above you may notice that it is possible to use &lt;code&gt;sandboxify&lt;/code&gt; with dynamically linked executables as well, so why even bother with &lt;code&gt;libsandbox.so&lt;/code&gt;? The difference becomes visible when we switch from the “denylist” policy used in most examples in this post to the preferred “allowlist” policy, where we explicitly allow only the system calls we need and prohibit everything else.&lt;/p&gt;

&lt;p&gt;Let’s convert our toy application back into the dynamically-linked one and try to come up with the minimal list of allowed system calls it needs to function properly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gcc &lt;span class="nt"&gt;-o&lt;/span&gt; myos myos.c
&lt;span class="nv"&gt;$ &lt;/span&gt;ldd ./myos
    linux-vdso.so.1 &lt;span class="o"&gt;(&lt;/span&gt;0x00007ffe027f6000&lt;span class="o"&gt;)&lt;/span&gt;
    libc.so.6 &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; /lib/x86_64-linux-gnu/libc.so.6 &lt;span class="o"&gt;(&lt;/span&gt;0x00007f4f1410a000&lt;span class="o"&gt;)&lt;/span&gt;
    /lib64/ld-linux-x86-64.so.2 &lt;span class="o"&gt;(&lt;/span&gt;0x00007f4f142de000&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;$ LD_PRELOAD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/usr/lib/x86_64-linux-gnu/libsandbox.so &lt;span class="nv"&gt;SECCOMP_SYSCALL_ALLOW&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;exit_group:fstat:uname:write ./myos
adding exit_group to the process seccomp filter
adding fstat to the process seccomp filter
adding &lt;span class="nb"&gt;uname &lt;/span&gt;to the process seccomp filter
adding write to the process seccomp filter
My OS is Linux!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So we need to allow 4 system calls: &lt;code&gt;exit_group:fstat:uname:write&lt;/code&gt;. This is the tightest “sandbox”, which still doesn’t break the application. If we remove any system call from this list, the application will terminate with the &lt;code&gt;Bad system call&lt;/code&gt; message (try it yourself!).&lt;/p&gt;

&lt;p&gt;If we use the same allowlist, but with the &lt;code&gt;sandboxify&lt;/code&gt; launcher, things do not work anymore:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ SECCOMP_SYSCALL_ALLOW&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;exit_group:fstat:uname:write sandboxify ./myos
adding exit_group to the process seccomp filter
adding fstat to the process seccomp filter
adding &lt;span class="nb"&gt;uname &lt;/span&gt;to the process seccomp filter
adding write to the process seccomp filter
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The reason is that &lt;code&gt;sandboxify&lt;/code&gt; and &lt;code&gt;libsandbox.so&lt;/code&gt; inject seccomp rules at different stages of the process lifecycle. Consider the following very high level diagram of a process startup:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--TmQRTocR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog-cloudflare-com-assets.storage.googleapis.com/2020/07/image3-2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--TmQRTocR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog-cloudflare-com-assets.storage.googleapis.com/2020/07/image3-2.png" alt="image3-2"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In a nutshell, every process has two runtime stages: “runtime init” and the “main logic”. The main logic is basically the code, which is located in the program &lt;code&gt;main()&lt;/code&gt; function and other code put there by the application developers. But the process usually needs to do some work before the code from the &lt;code&gt;main()&lt;/code&gt; function is able to execute - we call this work the “runtime init” on the diagram above. Developers do not write this code directly, but most of the time this code is automatically generated by the compiler toolchain, which is used to compile the source code.&lt;/p&gt;

&lt;p&gt;To do its job, the “runtime init” stage uses a lot of different system calls, but most of them are not needed later at the “main logic” stage. If we’re using the “allowlist” approach for our sandboxing, it does not make sense to allow these system calls for the whole duration of the program, if they are only used once on program init. This is where the difference between &lt;code&gt;libsandbox.so&lt;/code&gt; and &lt;code&gt;sandboxify&lt;/code&gt; comes from: &lt;code&gt;libsandbox.so&lt;/code&gt; usually enforces the seccomp rules after the “runtime init” stage has already executed, so we don’t have to allow most system calls from that stage. &lt;code&gt;sandboxify&lt;/code&gt;, on the other hand, enforces the policy before the “runtime init” stage, so we have to allow all the system calls from both stages, which usually results in a bigger allowlist and thus a wider attack surface.&lt;/p&gt;

&lt;p&gt;Going back to our toy &lt;code&gt;myos&lt;/code&gt; example, here is the minimal list of all the system calls we need to allow to make the application work under our sandbox:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ SECCOMP_SYSCALL_ALLOW&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;access:arch_prctl:brk:close:exit_group:fstat:mmap:mprotect:munmap:openat:read:uname:write sandboxify ./myos
adding access to the process seccomp filter
adding arch_prctl to the process seccomp filter
adding brk to the process seccomp filter
adding close to the process seccomp filter
adding exit_group to the process seccomp filter
adding fstat to the process seccomp filter
adding mmap to the process seccomp filter
adding mprotect to the process seccomp filter
adding munmap to the process seccomp filter
adding openat to the process seccomp filter
adding &lt;span class="nb"&gt;read &lt;/span&gt;to the process seccomp filter
adding &lt;span class="nb"&gt;uname &lt;/span&gt;to the process seccomp filter
adding write to the process seccomp filter
My OS is Linux!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is 13 syscalls, versus 4 with the &lt;code&gt;libsandbox.so&lt;/code&gt; approach!&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusions
&lt;/h2&gt;

&lt;p&gt;In this post we discussed how to easily sandbox applications on Linux without the need to write any additional code. We introduced the &lt;a href="https://github.com/cloudflare/sandbox"&gt;Cloudflare sandbox toolkit&lt;/a&gt; and discussed the different approaches we take at sandboxing dynamically linked applications vs statically linked applications.&lt;/p&gt;

&lt;p&gt;Having safer code online helps to build a Better Internet and we would be happy if you find our &lt;a href="https://github.com/cloudflare/sandbox"&gt;sandbox toolkit&lt;/a&gt; useful. Looking forward to the feedback, improvements and other contributions!&lt;/p&gt;

</description>
      <category>linux</category>
      <category>security</category>
      <category>sandbox</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Import Markdown into Medium in 4 clicks</title>
      <dc:creator>Ignat Korchagin</dc:creator>
      <pubDate>Mon, 25 May 2020 19:19:23 +0000</pubDate>
      <link>https://dev.to/ignatk/import-markdown-into-medium-in-4-clicks-3jgb</link>
      <guid>https://dev.to/ignatk/import-markdown-into-medium-in-4-clicks-3jgb</guid>
      <description>&lt;h2&gt;
  
  
  Or my first web app in 10 years: what could go wrong?
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://pqsec.org/2020/05/25/import-markdown-into-medium.html"&gt;pqsec.org&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TL;DR: if you're just looking for the tool itself, it is &lt;a href="https://m2m.pqsec.org/"&gt;here&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I write my personal posts on &lt;a href="https://pqsec.org/"&gt;my own blog&lt;/a&gt;, but to reach a wider audience I followed the advice on the Internet to cross-post my posts on popular publishing platforms. In the end I couldn't select between &lt;a href="https://medium.com/"&gt;Medium&lt;/a&gt; and &lt;a href="https://dev.to/"&gt;dev.to&lt;/a&gt;, so decided to use both.&lt;/p&gt;

&lt;p&gt;My own blog is hosted on &lt;a href="https://pages.github.com/"&gt;GitHub Pages&lt;/a&gt;, so I write my posts in &lt;a href="https://en.wikipedia.org/wiki/Markdown"&gt;Markdown&lt;/a&gt;. Luckily, &lt;a href="https://dev.to/"&gt;dev.to&lt;/a&gt; supports Markdown natively as well, so I can just copy-paste posts there. &lt;a href="https://medium.com/"&gt;Medium&lt;/a&gt;, however, is not very Markdown friendly: they heavily promote the usage of their superior editor, which is great indeed, but completely useless when it comes to cross-posting.&lt;/p&gt;

&lt;h3&gt;
  
  
  Markdown to Medium prior art
&lt;/h3&gt;

&lt;p&gt;There is always the official &lt;a href="https://help.medium.com/hc/en-us/articles/214550207-Import-a-post"&gt;Medium import tool&lt;/a&gt;, but when I tried to use it on one of my posts, it completely omitted all the code blocks. It's probably OK if you have one or two of those and can re-add them manually, but it quickly becomes tiresome for a code-heavy post.&lt;/p&gt;

&lt;p&gt;The second thing I found was this method of &lt;a href="https://medium.com/@andymcfee/how-to-import-markdown-into-medium-c06dc981bd96"&gt;using the official import tool on a GitHub gist&lt;/a&gt;, but this produced the same result for me as above. The post also mentions &lt;a href="https://markdowntomedium.com/"&gt;this automated tool&lt;/a&gt;, which goes one step further: it creates a separate &lt;a href="https://gist.github.com/"&gt;GitHub gist&lt;/a&gt; for every code block in the post and apparently tries to use Medium gist auto expansion, which again did not work for me as all gist links remained unexpanded in the editor. I quickly noticed the downsides of the approach as well: when parsing my post &lt;a href="https://pqsec.org/2020/04/13/fixing-weak-crypto-in-openssl-based-applications.html"&gt;Fixing weak crypto in OpenSSL based applications&lt;/a&gt;, it created more than 20 gists in my GitHub account and that's for a single post! GitHub gists do not have the concept of "folders", so your account might quickly become messy because of all these gists. Finally, I prefer for all post contents to be hosted in one place. Otherwise, if GitHub is having problems for example, parts of your post suddenly become unreadable, which might frustrate the audience. For completeness, I would like to mention &lt;a href="https://markdium.dev/"&gt;Markdium&lt;/a&gt; - I didn't try it myself, but according to the demo it uses the same automated GitHub gist approach, but also provides a nice Markdown editor directly in the browser.&lt;/p&gt;

&lt;p&gt;Next, there is &lt;a href="http://markdown-to-medium.surge.sh/"&gt;this browser based tool&lt;/a&gt;: it claims to generate html, which is compatible with Medium's editor. So the workflow is like this: you copy-paste the Markdown into the tool, the tool renders it and produces html and you copy-paste this html directly into the Medium's editor. The tool worked as described, when I tried it - I didn't find any inconsistencies and all the code blocks were in place. There are also no gists or giving away your GitHub credentials involved. The only downside is that you have to copy-paste twice. The other potential concern is while the tool produces "compatible" html now, there is no guarantee it would stay that way as Medium might change the way it handles the input any time.&lt;/p&gt;

&lt;p&gt;Finally, I found &lt;a href="https://medium.com/@icyphox/mdium-publish-your-markdown-to-medium-from-the-cli-79906ef6b16b"&gt;Mdium&lt;/a&gt; - a small python tool, which can publish a Markdown post via... Medium API! But the surprise came not from the tool itself, rather (quoting from the tool's post):&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;It’s 2019 and their editor doesn’t support Markdown, but their API does?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Wait, what? Medium does support Markdown natively, but API only? Heading to the &lt;a href="https://github.com/Medium/medium-api-docs#creating-a-post"&gt;official Medium API documentation&lt;/a&gt; we see:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;contentFormat: The format of the "content" field. There are two valid values, "html", and "markdown"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I didn't try the Mdium python tool to be honest, just took the previously mentioned &lt;a href="https://pqsec.org/2020/04/13/fixing-weak-crypto-in-openssl-based-applications.html"&gt;Fixing weak crypto in OpenSSL based applications&lt;/a&gt; Markdown source, generated myself &lt;a href="https://help.medium.com/hc/en-us/articles/213480228-Get-an-integration-token-for-your-writing-app"&gt;an API token&lt;/a&gt; and posted it as a draft with &lt;a href="https://curl.haxx.se/"&gt;curl&lt;/a&gt;. And voila! The post was nicely imported with all code blocks intact!&lt;/p&gt;
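
&lt;p&gt;A request of this kind can be sketched in JavaScript roughly as below. This is only a minimal sketch based on the public Medium API documentation, not the exact command used above: the &lt;code&gt;buildDraftRequest&lt;/code&gt; helper and its parameter names are hypothetical, and the author id is assumed to have been looked up beforehand via the API's &lt;code&gt;/v1/me&lt;/code&gt; endpoint.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Hypothetical helper: builds the URL and options for a fetch() call
// that creates a Medium draft from a Markdown source.
function buildDraftRequest(token, authorId, title, markdown) {
  return {
    url: 'https://api.medium.com/v1/users/' + encodeURIComponent(authorId) + '/posts',
    options: {
      method: 'POST',
      headers: {
        // The integration token is sent as a bearer token
        'Authorization': 'Bearer ' + token,
        'Content-Type': 'application/json',
        'Accept': 'application/json',
      },
      body: JSON.stringify({
        title: title,
        // "markdown" is one of the two documented contentFormat values
        contentFormat: 'markdown',
        content: markdown,
        // Create a draft rather than publishing straight away
        publishStatus: 'draft',
      }),
    },
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Passing the result to &lt;code&gt;fetch(request.url, request.options)&lt;/code&gt; from a script running outside the browser would then create the draft.&lt;/p&gt;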

&lt;h3&gt;
  
  
  A browser based import tool
&lt;/h3&gt;

&lt;p&gt;Importing posts with curl is not a sustainable solution and, if there is an API, I was hoping there would be some ready tools providing this feature. But a quick search did not yield any results except the previously mentioned &lt;a href="https://medium.com/@icyphox/mdium-publish-your-markdown-to-medium-from-the-cli-79906ef6b16b"&gt;Mdium&lt;/a&gt;. There is nothing wrong with Mdium, but it just didn't tick all the boxes for me: probably, like many other modern Internet users, I have multiple devices with different operating systems, some of them are mobile tablets without straightforward access to python or terminal. Thanks to modern "cloud technologies" I can use any of those to start a new post or continue writing an existing one directly in the browser without the need to manually synchronise any data or installing any applications. I would like the same level of convenience for publishing posts as well, so it would be nice to have a small browser application to hit the Medium API without the need to install anything. And since I didn't find one, I decided to try to create it myself.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--CtP7Q7Hf--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://pqsec.org/img/m2m/thanos.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--CtP7Q7Hf--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://pqsec.org/img/m2m/thanos.gif" alt="thanos"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Selecting a UI framework
&lt;/h4&gt;

&lt;p&gt;I mostly do system programming and last time I did something around a non-console UI was more than 10 years ago. And that was not without horror stories. At the time &lt;a href="https://en.wikipedia.org/wiki/.NET_Framework"&gt;.NET Framework&lt;/a&gt; was quite popular as the primary tool to write Windows GUI applications. Microsoft also made a push to extend the technology to the Web with &lt;a href="https://dotnet.microsoft.com/apps/aspnet"&gt;ASP.NET&lt;/a&gt; - a server side scripting support for .NET Framework. One could even write a Web UI in ASP.NET with &lt;a href="https://dotnet.microsoft.com/apps/aspnet/web-forms"&gt;Web forms&lt;/a&gt; - basically, a bunch of ready made .NET components, which rendered into HTML elements. Once I used the technology to write a simple website, but the result was interesting: the site looked great and rendered fine in every modern browser at the time, except... Microsoft's own &lt;a href="https://en.wikipedia.org/wiki/Internet_Explorer"&gt;Internet Explorer&lt;/a&gt; 🤦. I didn't touch a Web UI ever since...&lt;/p&gt;

&lt;p&gt;Based on the above experience I was expecting that writing a Web UI would be the biggest challenge (and, oh my I was wrong!). I did hear there are numerous tools and frameworks for writing modern Web applications, which would hopefully take most of the pain away, but which one to choose? I read about &lt;a href="https://material.io/"&gt;Material Design&lt;/a&gt; and web frameworks like &lt;a href="https://reactjs.org/"&gt;React&lt;/a&gt;, &lt;a href="https://angular.io/"&gt;Angular&lt;/a&gt; and &lt;a href="https://vuejs.org/"&gt;Vue&lt;/a&gt;. These seem to be great tools to write complex Web applications, but each of them is not a complete solution. For example, you need to understand how to properly combine &lt;a href="https://material.io/"&gt;Material Design&lt;/a&gt; and &lt;a href="https://vuejs.org/"&gt;Vue&lt;/a&gt; or use some other third-party metaframework, like &lt;a href="https://vuematerial.io/"&gt;vuematerial.io&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;My requirements for the app were quite simple: a small web form and some backing JavaScript, which makes JSON requests to the Medium API. I also wanted to get away from installing &lt;a href="https://nodejs.org/"&gt;Node.js&lt;/a&gt;, which seems to be required to follow any reasonable tutorials for the above technologies. It was just too steep a learning curve for writing a simple web form. Then I looked at &lt;a href="https://getbootstrap.com/"&gt;Bootstrap&lt;/a&gt; and it seemed to be the right tool: a simple "all-in-one" solution, which allows you to write plain familiar HTML directly in the browser and provides a reasonable cross-browser friendly UI. It also doesn't force you into a particular JavaScript framework, so you don't even have to use any for simple things. In the end I was able to compose a small web form and draft simple UI notifications within hours with no prior experience with Bootstrap. My job was almost done...&lt;/p&gt;

&lt;h3&gt;
  
  
  Hello, CORS!
&lt;/h3&gt;

&lt;p&gt;... Or not! During the development of my Web app I even hacked together a small local mock of the Medium API to test the whole flow, and it worked great. I thought all I had to do now was replace the API URL with the real one and the app would be ready. I was soon disappointed to see that nothing happened when I pointed my app at the real Medium API. Browser debugging tools showed my requests were simply being blocked because of a &lt;a href="https://en.wikipedia.org/wiki/Cross-origin_resource_sharing"&gt;Cross-origin resource sharing (CORS)&lt;/a&gt; violation!&lt;/p&gt;

&lt;p&gt;This post will not deep dive into CORS as there are better tutorials, specs and detailed resources out there, but in a nutshell, CORS is a way to block by default most (especially authenticated!) cross-origin requests in modern browsers. During the testing with the mock Medium API it was fine, because requests originated from my &lt;code&gt;localhost&lt;/code&gt; "origin" and were hitting my mock server running on &lt;code&gt;localhost&lt;/code&gt; as well. In order for this to work with the real Medium API, the API needs to reply with special &lt;a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS"&gt;CORS headers&lt;/a&gt; as well as support special "HTTP preflight requests". A proper combination of these gives the browser the "permission" to perform the API request - otherwise the browser will not do it, as happened in my case.&lt;/p&gt;

&lt;p&gt;I reached out to Medium support about CORS on their API in case I was doing something wrong, but they confirmed that CORS is not supported and that there are no plans to support it in the future. I don't quite understand why anyone would provide a public HTTP API, but make it usable from anything except the browser (if you know - please, reach out to &lt;a href="https://twitter.com/ignatkn"&gt;me on Twitter&lt;/a&gt;), but it is what it is. Without proper CORS support from the API publisher it seems the only way to make the API accessible in the browser is to use a third-party CORS proxy, which will inject proper CORS headers.&lt;/p&gt;

&lt;h4&gt;
  
  
  Implementing a CORS proxy
&lt;/h4&gt;

&lt;p&gt;There are a number of ready-made public CORS proxies, but probably one of the most popular ones is &lt;a href="https://github.com/Rob--W/cors-anywhere"&gt;CORS anywhere&lt;/a&gt; running at &lt;a href="https://cors-anywhere.herokuapp.com/"&gt;https://cors-anywhere.herokuapp.com/&lt;/a&gt;. Unfortunately, it doesn't work for our use-case: the proxy injects a special CORS header &lt;code&gt;Access-Control-Allow-Origin&lt;/code&gt; with a &lt;code&gt;*&lt;/code&gt; (wildcard response). This is good enough for most cases as it tells the browser that the server (or proxy in this case) allows cross-origin requests from any origin. Medium API, however, obviously requires authentication and looks for the presence of the &lt;code&gt;Authorization&lt;/code&gt; header in incoming requests with the &lt;a href="https://help.medium.com/hc/en-us/articles/213480228-Get-an-integration-token-for-your-writing-app"&gt;Medium API token&lt;/a&gt;. CORS dictates that browsers will only send a cross-origin &lt;code&gt;Authorization&lt;/code&gt; header if the origin was explicitly allowed (listed) by the server in &lt;code&gt;Access-Control-Allow-Origin&lt;/code&gt;; that is, a wildcard response is not good enough for these "authenticated" requests.&lt;/p&gt;

&lt;p&gt;Of course, one could grab the sources of &lt;a href="https://github.com/Rob--W/cors-anywhere"&gt;CORS anywhere&lt;/a&gt;, make the relevant changes and republish the proxy with proper &lt;code&gt;Authorization&lt;/code&gt; support, but I wanted something simpler than running my app in a public cloud. Then I remembered &lt;a href="https://workers.cloudflare.com/"&gt;Cloudflare Workers&lt;/a&gt;, which are more than ideal for this kind of task. Moreover, the Cloudflare Worker Templates resource already has a &lt;a href="https://developers.cloudflare.com/workers/templates/pages/cors_header_proxy/"&gt;barebone CORS proxy&lt;/a&gt;. The template, however, has similar limitations to &lt;a href="https://github.com/Rob--W/cors-anywhere"&gt;CORS anywhere&lt;/a&gt;, so it requires slight adjustments:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the template sets &lt;code&gt;Access-Control-Allow-Origin&lt;/code&gt; to &lt;code&gt;*&lt;/code&gt; in the preflight response the same way as &lt;a href="https://github.com/Rob--W/cors-anywhere"&gt;CORS anywhere&lt;/a&gt; - we need to replace this with something like &lt;code&gt;request.headers.get('Origin')&lt;/code&gt;, so the requester will always get their origin explicitly "allowed"&lt;/li&gt;
&lt;li&gt;we need to make &lt;code&gt;Access-Control-Allow-Headers&lt;/code&gt; dynamic as well (echoing back the requested &lt;code&gt;Access-Control-Request-Headers&lt;/code&gt;) to "allow" the browser to send the &lt;code&gt;Authorization&lt;/code&gt; header&lt;/li&gt;
&lt;li&gt;finally, we need to add &lt;code&gt;Access-Control-Allow-Credentials: true&lt;/code&gt; header to the preflight response to explicitly allow the browser to make authenticated requests&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In addition to the above I also removed the test page from the template and, unlike &lt;a href="https://github.com/Rob--W/cors-anywhere"&gt;CORS anywhere&lt;/a&gt;, where you have to specify the proxied URL via a query string, my implementation takes it from a custom &lt;code&gt;X-Corsify-Url&lt;/code&gt; request header. The full code is listed below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nx"&gt;handleRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;URL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;apiurl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;X-Corsify-Url&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;origin&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;origin&lt;/span&gt;
  &lt;span class="c1"&gt;// Rewrite request to point to API url. This also makes the request mutable&lt;/span&gt;
  &lt;span class="c1"&gt;// so we can add the correct Origin header to make the API server think&lt;/span&gt;
  &lt;span class="c1"&gt;// that this request isn't cross-site.&lt;/span&gt;
  &lt;span class="nx"&gt;request&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;apiurl&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Origin&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;URL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;apiurl&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;origin&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Host&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;URL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;apiurl&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;hostname&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;X-Corsify-Url&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="c1"&gt;// Recreate the response so we can modify the headers&lt;/span&gt;
  &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="c1"&gt;// Set CORS headers&lt;/span&gt;
  &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Access-Control-Allow-Origin&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;*&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="c1"&gt;// Append to/Add Vary header so browser will cache response correctly&lt;/span&gt;
  &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Vary&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Origin&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nx"&gt;handleOptions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Make sure the necessary headers are present&lt;/span&gt;
  &lt;span class="c1"&gt;// for this to be a valid pre-flight request&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Origin&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;
    &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Access-Control-Request-Method&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;
    &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Access-Control-Request-Headers&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;corsHeaders&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Access-Control-Allow-Origin&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Origin&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Access-Control-Allow-Methods&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;GET, HEAD, POST, OPTIONS&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Access-Control-Allow-Headers&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Access-Control-Request-Headers&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Access-Control-Allow-Credentials&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Vary&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Origin&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;corsHeaders&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Handle standard OPTIONS request.&lt;/span&gt;
    &lt;span class="c1"&gt;// If you want to allow other HTTP Methods, you can do that here.&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;Allow&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;GET, HEAD, POST, OPTIONS&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nx"&gt;addEventListener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;fetch&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;URL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="c1"&gt;// if (url.pathname.startsWith(proxyEndpoint)) {&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;method&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;OPTIONS&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="c1"&gt;// Handle CORS preflight requests&lt;/span&gt;
      &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;respondWith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;handleOptions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;method&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;GET&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt;
      &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;method&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;HEAD&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt;
      &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;method&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="c1"&gt;// Handle requests to the API server&lt;/span&gt;
      &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;respondWith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;handleRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;respondWith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;405&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;statusText&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Method Not Allowed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;
      &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
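
&lt;p&gt;On the browser side, sending a request through such a proxy could look roughly like the sketch below. It is only an illustration, not the code of the actual app: the &lt;code&gt;viaCorsProxy&lt;/code&gt; helper is hypothetical, and only the &lt;code&gt;X-Corsify-Url&lt;/code&gt; header name comes from the Worker above.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Hypothetical helper: redirects a request to the CORS proxy.
// The proxy URL becomes the fetch() target and the real API URL
// travels in the custom X-Corsify-Url request header.
function viaCorsProxy(proxyUrl, apiUrl, options) {
  const original = options || {}
  // Keep the caller's headers (e.g. Authorization) and add the target URL
  const headers = Object.assign({}, original.headers, { 'X-Corsify-Url': apiUrl })
  return {
    url: proxyUrl,
    options: Object.assign({}, original, { headers: headers }),
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For example, &lt;code&gt;viaCorsProxy(proxyUrl, 'https://api.medium.com/v1/me', { headers: { Authorization: 'Bearer ' + token } })&lt;/code&gt; returns an object whose &lt;code&gt;url&lt;/code&gt; and &lt;code&gt;options&lt;/code&gt; can be passed to &lt;code&gt;fetch()&lt;/code&gt;, where &lt;code&gt;proxyUrl&lt;/code&gt; is a placeholder for the address of your own proxy instance. Because of the custom header the browser will always send a preflight request first, which the Worker answers in &lt;code&gt;handleOptions&lt;/code&gt;.&lt;/p&gt;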



&lt;p&gt;After several iterations of testing and slight adjustments to my Medium import web application everything worked 🎉.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--obRl1xea--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://pqsec.org/img/m2m/m2m-demo.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--obRl1xea--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://pqsec.org/img/m2m/m2m-demo.gif" alt="m2m-demo"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  CORS proxy security considerations
&lt;/h3&gt;

&lt;p&gt;There is definitely some risk in passing your requests, especially the ones with an API token, through a proxy. Because CORS proxies operate at layer 7 of the &lt;a href="https://en.wikipedia.org/wiki/OSI_model"&gt;OSI model&lt;/a&gt;, they see request plaintext data, even if the original request was sent encrypted via &lt;a href="https://en.wikipedia.org/wiki/HTTPS"&gt;HTTPS&lt;/a&gt;. Rogue proxies can steal your credentials as well as modify the request/response data in any way they wish and not only inject proper CORS headers. In the case of Medium, anyone with your token can publish posts in your name and do all the other things covered by the API, so make sure you understand the risk of using a CORS proxy. This is, by the way, true for any other third-party Medium integration mentioned in this post as all of them get your API token one way or the other, and some get a GitHub one as well.&lt;/p&gt;

&lt;p&gt;So, can you trust my proxy? I say "definitely yes", but if I were a casual reader of this post, a simple "yes" might not be enough for me. That's why, for the extra cautious, I made the CORS proxy configurable in the UI, so you can point the app at your own instance. You can easily run one for free on Cloudflare Workers, which allows up to 100K requests per day (and I think no one publishes 100K posts a day, so that should be enough) with no need to think about applications, regions or instances. You can also opt for a more traditional approach and run a &lt;a href="https://github.com/Rob--W/cors-anywhere"&gt;CORS anywhere&lt;/a&gt;-like application in some public cloud or even on premises.&lt;/p&gt;

&lt;p&gt;Can we trust Cloudflare to run our proxy? I am too biased to offer an opinion here, so instead I would encourage interested readers to use a CDN finder tool, such as &lt;a href="https://www.whatsmycdn.com/"&gt;this one&lt;/a&gt;, and check which CDN Medium uses for more objective information.&lt;/p&gt;

&lt;p&gt;If you don't want to bother with running proxies, but are still not comfortable with giving away your API token, the alternative approach is to use one-time tokens. There seems to be no limit on how many Medium tokens you can generate, and you can easily revoke any token. So the workflow is as simple as:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;a href="https://help.medium.com/hc/en-us/articles/213480228-Get-an-integration-token-for-your-writing-app"&gt;generate a new integration token&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;publish a post using this token and any CORS proxy&lt;/li&gt;
&lt;li&gt;immediately revoke this token afterwards&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Final words
&lt;/h3&gt;

&lt;p&gt;By creating this simple app I learned that writing a simple Web UI with modern Web frameworks is not that scary anymore. Please note that I'm not a Web developer, so if you see some non-idiomatic JavaScript code, don't shame me, but provide improvement suggestions or, even better, send a &lt;a href="https://github.com/pqsec/m2m/pulls"&gt;pull request on GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I also learned a lot about CORS, but what I failed to understand is why many public APIs do not support it. At first glance it may seem that a non-CORS API is more secure, as it prevents malicious websites from impersonating users of the API. However, when a &lt;strong&gt;public&lt;/strong&gt; API does not support CORS, most developers seem to turn to third-party CORS proxies, like the ones mentioned in this post, thus probably even increasing the risk of a token leak should the proxy go rogue. Please &lt;a href="https://twitter.com/ignatkn"&gt;reach out on Twitter&lt;/a&gt; if you think otherwise.&lt;/p&gt;

&lt;p&gt;If you're reading this post on Medium, it was published from Markdown using this &lt;a href="https://m2m.pqsec.org/"&gt;Markdown to Medium&lt;/a&gt; tool. If you found it useful, send your comments or improvement suggestions &lt;a href="https://twitter.com/ignatkn"&gt;on Twitter&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>cors</category>
      <category>proxy</category>
      <category>markdown</category>
    </item>
    <item>
      <title>Using Go as a scripting language in Linux</title>
      <dc:creator>Ignat Korchagin</dc:creator>
      <pubDate>Sat, 18 Apr 2020 21:40:21 +0000</pubDate>
      <link>https://dev.to/ignatk/using-go-as-a-scripting-language-in-linux-4c8c</link>
      <guid>https://dev.to/ignatk/using-go-as-a-scripting-language-in-linux-4c8c</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a repost of my post from the &lt;a href="https://blog.cloudflare.com/using-go-as-a-scripting-language-in-linux/"&gt;Cloudflare Blog&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;At Cloudflare we like Go. We use it in many &lt;a href="https://blog.cloudflare.com/what-weve-been-doing-with-go/"&gt;in-house software projects&lt;/a&gt; as well as parts of &lt;a href="https://blog.cloudflare.com/meet-gatebot-a-bot-that-allows-us-to-sleep/"&gt;bigger pipeline systems&lt;/a&gt;. But can we take Go to the next level and use it as a scripting language for our favourite operating system, Linux?&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--bq9_Ewgz--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.cloudflare.com/content/images/2018/02/gopher-tux-1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--bq9_Ewgz--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog.cloudflare.com/content/images/2018/02/gopher-tux-1.png" alt="gopher and tux"&gt;&lt;/a&gt;&lt;br&gt;
&lt;small&gt;&lt;a href="https://golang.org/doc/gopher/gophercolor.png"&gt;gopher image&lt;/a&gt; &lt;a href="https://creativecommons.org/licenses/by/3.0/"&gt;CC BY 3.0&lt;/a&gt; &lt;a href="http://reneefrench.blogspot.com/"&gt;Renee French&lt;/a&gt;&lt;/small&gt;&lt;br&gt;
&lt;small&gt;&lt;a href="https://pixabay.com/en/linux-penguin-tux-2025536/"&gt;Tux image&lt;/a&gt; &lt;a href="https://creativecommons.org/publicdomain/zero/1.0/deed.en"&gt;CC0 BY&lt;/a&gt; &lt;a href="https://pixabay.com/en/users/OpenClipart-Vectors-30363/"&gt;OpenClipart-Vectors&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Why consider Go as a scripting language
&lt;/h3&gt;

&lt;p&gt;Short answer: why not? Go is relatively easy to learn, not too verbose and there is a huge ecosystem of libraries which can be reused to avoid writing all the code from scratch. Some other potential advantages it might bring:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Go-based build system for your Go project: the &lt;code&gt;go build&lt;/code&gt; command is mostly suitable for small, self-contained projects. More complex projects usually adopt some build system or set of scripts. Why not have these scripts written in Go as well?&lt;/li&gt;
&lt;li&gt;Easy non-privileged package management out of the box: if you want to use a third-party library in your script, you can simply &lt;code&gt;go get&lt;/code&gt; it. And because the code will be installed in your &lt;code&gt;GOPATH&lt;/code&gt;, getting a third-party library does not require administrative privileges on the system (unlike some other scripting languages). This is especially useful in large corporate environments.&lt;/li&gt;
&lt;li&gt;Quick code prototyping in early project stages: when you're writing the first iteration of the code, it usually takes a lot of edits even to make it compile and you have to waste a lot of keystrokes on the &lt;em&gt;"edit-&amp;gt;build-&amp;gt;check"&lt;/em&gt; cycle. Instead you can skip the "build" part and just immediately execute your source file.&lt;/li&gt;
&lt;li&gt;Strongly-typed scripting language: if you make a small typo somewhere in the middle of the script, most scripting languages will execute everything up to that point and fail on the typo itself. This might leave your system in an inconsistent state. With a strongly-typed, compiled language many typos are caught at compile time, so the buggy script will not run in the first place.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Current state of Go scripting
&lt;/h3&gt;

&lt;p&gt;At first glance Go scripts seem easy to implement thanks to Unix support for &lt;a href="https://en.wikipedia.org/wiki/Shebang_(Unix)"&gt;shebang lines&lt;/a&gt; in scripts. A shebang line is the first line of the script, which starts with &lt;code&gt;#!&lt;/code&gt; and specifies the interpreter to be used to execute the script (for example, &lt;code&gt;#!/bin/bash&lt;/code&gt; or &lt;code&gt;#!/usr/bin/env python&lt;/code&gt;), so the system knows exactly how to execute the script regardless of the programming language used. And Go already supports interpreter-like invocation for &lt;code&gt;.go&lt;/code&gt; files with the &lt;code&gt;go run&lt;/code&gt; command, so it should be just a matter of adding a proper shebang line, something like &lt;code&gt;#!/usr/bin/env go run&lt;/code&gt;, to any &lt;code&gt;.go&lt;/code&gt; file, setting the executable bit and we're good to go.&lt;/p&gt;

&lt;p&gt;However, there are problems around using &lt;code&gt;go run&lt;/code&gt; directly. &lt;a href="https://gist.github.com/posener/73ffd326d88483df6b1cb66e8ed1e0bd"&gt;This great post&lt;/a&gt; describes in detail all the issues around &lt;code&gt;go run&lt;/code&gt; and potential workarounds, but the gist is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;go run&lt;/code&gt; does not properly return the script's exit code back to the operating system. This is important for scripts, because exit codes are one of the most common ways multiple scripts interact with each other and the operating system environment.&lt;/li&gt;
&lt;li&gt;you can't have a shebang line in a valid &lt;code&gt;.go&lt;/code&gt; file, because Go does not know how to process lines starting with &lt;code&gt;#&lt;/code&gt;. Other scripting languages do not have this problem, because for most of them &lt;code&gt;#&lt;/code&gt; is a way to specify comments, so the final interpreter just ignores the shebang line, but Go comments start with &lt;code&gt;//&lt;/code&gt; and &lt;code&gt;go run&lt;/code&gt; on invocation will just produce an error like:
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package main:
helloscript.go:1:1: illegal character U+0023 '#'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
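&lt;p&gt;To illustrate the first point, here is a rough Go sketch of what a wrapper has to do to hand a child's exit code back to its caller. It is not the actual &lt;code&gt;gorun&lt;/code&gt; source: &lt;code&gt;runAndGetExitCode&lt;/code&gt; is a hypothetical helper, and the example assumes a POSIX &lt;code&gt;sh&lt;/code&gt; is available on the system.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// runAndGetExitCode runs a command with the current stdio attached and
// returns the exit status the child reported.
func runAndGetExitCode(name string, args ...string) int {
	cmd := exec.Command(name, args...)
	cmd.Stdin = os.Stdin
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr

	err := cmd.Run()
	if err == nil {
		return 0
	}
	// A child that started but exited with a non-zero status is reported
	// as *exec.ExitError, which carries the real exit code.
	if exitErr, ok := err.(*exec.ExitError); ok {
		return exitErr.ExitCode()
	}
	// The command could not be started at all; 127 is an arbitrary choice
	// here, borrowed from the usual shell convention.
	fmt.Fprintln(os.Stderr, err)
	return 127
}

func main() {
	code := runAndGetExitCode("sh", "-c", "exit 30")
	// A real wrapper would finish with os.Exit(code) so that its own
	// caller sees the same status.
	fmt.Println("child exit code:", code)
}
```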


&lt;p&gt;&lt;a href="https://gist.github.com/posener/73ffd326d88483df6b1cb66e8ed1e0bd"&gt;The post&lt;/a&gt; describes several workarounds for the above issues, including using a custom wrapper program, &lt;a href="https://github.com/erning/gorun"&gt;gorun&lt;/a&gt;, as an interpreter, but none of them provides an ideal solution. You either:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;have to use a non-standard shebang line, which starts with &lt;code&gt;//&lt;/code&gt;. This is technically not even a shebang line, but a consequence of how the &lt;code&gt;bash&lt;/code&gt; shell processes executable text files, so this solution is &lt;code&gt;bash&lt;/code&gt;-specific. Also, because of the specific behaviour of &lt;code&gt;go run&lt;/code&gt;, this line is rather complex and not obvious (see the &lt;a href="https://gist.github.com/posener/73ffd326d88483df6b1cb66e8ed1e0bd"&gt;original post&lt;/a&gt; for examples).&lt;/li&gt;
&lt;li&gt;have to use a custom wrapper program, &lt;a href="https://github.com/erning/gorun"&gt;gorun&lt;/a&gt;, in the shebang line, which works well, but you end up with &lt;code&gt;.go&lt;/code&gt; files that cannot be compiled with the standard &lt;code&gt;go build&lt;/code&gt; command because of the illegal &lt;code&gt;#&lt;/code&gt; character.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  How Linux executes files
&lt;/h3&gt;

&lt;p&gt;OK, it seems the shebang approach does not provide us with an all-round solution. Is there anything else we could use? Let's take a closer look at how the Linux kernel executes binaries in the first place. When you try to execute a binary or script (or, for that matter, any file which has the executable bit set), your shell will in the end just use the Linux &lt;code&gt;execve&lt;/code&gt; &lt;a href="https://en.wikipedia.org/wiki/System_call"&gt;system call&lt;/a&gt;, passing it the filesystem path of the binary in question, the command line parameters and the currently defined environment variables. The kernel is then responsible for correctly parsing the file and loading the code from the file into the calling process. Most of us know that Linux (and many other Unix-like operating systems) uses the &lt;a href="https://en.wikipedia.org/wiki/Executable_and_Linkable_Format"&gt;ELF binary format&lt;/a&gt; for its executables.&lt;/p&gt;

&lt;p&gt;However, one of the core principles of Linux kernel development is to avoid "vendor/format lock-in" for any subsystem which is part of the kernel. Therefore, Linux implements a "pluggable" system, which allows any binary format to be supported by the kernel: all you have to do is write a module which can parse the format of your choosing. And if you take a closer look at the kernel source code, you'll see that Linux supports more binary formats than just ELF out of the box. For example, for the &lt;code&gt;4.14&lt;/code&gt; Linux kernel &lt;a href="https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/tree/fs?h=linux-4.14.y"&gt;we can see&lt;/a&gt; that it supports at least 7 binary formats (in-tree modules for various binary formats usually have the &lt;code&gt;binfmt_&lt;/code&gt; prefix in their names). It is worth noting the &lt;a href="https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/tree/fs/binfmt_script.c?h=linux-4.14.y"&gt;binfmt_script&lt;/a&gt; module, which is responsible for parsing the above-mentioned shebang lines and executing scripts on the target system (not everyone knows that shebang support is actually implemented in the kernel itself and not in the shell or some other daemon/process).&lt;/p&gt;
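&lt;p&gt;As a loose illustration of the idea (this is not kernel code, and &lt;code&gt;detectFormat&lt;/code&gt; is a hypothetical name), the Go sketch below tells two of these formats apart the way a format handler conceptually would: by looking at the first bytes of the file. ELF files start with the byte &lt;code&gt;0x7f&lt;/code&gt; followed by &lt;code&gt;ELF&lt;/code&gt;, and scripts start with &lt;code&gt;#!&lt;/code&gt;.&lt;/p&gt;

```go
package main

import (
	"bytes"
	"fmt"
)

// detectFormat guesses the executable format from the leading bytes of
// a file. Only two formats are recognised in this simplified sketch.
func detectFormat(header []byte) string {
	switch {
	case bytes.HasPrefix(header, []byte("\x7fELF")):
		return "elf"
	case bytes.HasPrefix(header, []byte("#!")):
		return "script"
	default:
		return "unknown"
	}
}

func main() {
	samples := [][]byte{
		[]byte("\x7fELF\x02\x01\x01"), // start of a 64-bit ELF header
		[]byte("#!/bin/bash\necho hi\n"),
		[]byte("package main\n"), // a plain .go file matches neither
	}
	for _, s := range samples {
		fmt.Printf("%q: %s\n", s[:4], detectFormat(s))
	}
}
```

&lt;p&gt;A plain &lt;code&gt;.go&lt;/code&gt; source file falls into the last case: it has no magic bytes that any of the built-in handlers would recognise.&lt;/p&gt;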
&lt;h3&gt;
  
  
  Extending supported binary formats from userspace
&lt;/h3&gt;

&lt;p&gt;But since we concluded that shebang is not the best option for our Go scripting, it seems we need something else. Surprisingly, the Linux kernel already has a &lt;a href="https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/tree/fs/binfmt_misc.c?h=linux-4.14.y"&gt;"something else" binary support module&lt;/a&gt;, which has the appropriate name &lt;code&gt;binfmt_misc&lt;/code&gt;. The module allows an administrator to dynamically add support for various executable formats directly from userspace through a well-defined &lt;code&gt;procfs&lt;/code&gt; interface and is &lt;a href="https://www.kernel.org/doc/html/v4.14/admin-guide/binfmt-misc.html"&gt;well-documented&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Let's follow &lt;a href="https://www.kernel.org/doc/html/v4.14/admin-guide/binfmt-misc.html"&gt;the documentation&lt;/a&gt; and try to set up a binary format description for &lt;code&gt;.go&lt;/code&gt; files. First of all, the guide tells you to mount the special &lt;code&gt;binfmt_misc&lt;/code&gt; filesystem at &lt;code&gt;/proc/sys/fs/binfmt_misc&lt;/code&gt;. If you're using a relatively recent systemd-based Linux distribution, it is highly likely the filesystem is already mounted for you, because systemd by default installs special &lt;a href="https://github.com/systemd/systemd/blob/master/units/proc-sys-fs-binfmt_misc.mount"&gt;mount&lt;/a&gt; and &lt;a href="https://github.com/systemd/systemd/blob/master/units/proc-sys-fs-binfmt_misc.automount"&gt;automount&lt;/a&gt; units for this purpose. To double-check just run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;mount | &lt;span class="nb"&gt;grep &lt;/span&gt;binfmt_misc
systemd-1 on /proc/sys/fs/binfmt_misc &lt;span class="nb"&gt;type &lt;/span&gt;autofs &lt;span class="o"&gt;(&lt;/span&gt;rw,relatime,fd&lt;span class="o"&gt;=&lt;/span&gt;27,pgrp&lt;span class="o"&gt;=&lt;/span&gt;1,timeout&lt;span class="o"&gt;=&lt;/span&gt;0,minproto&lt;span class="o"&gt;=&lt;/span&gt;5,maxproto&lt;span class="o"&gt;=&lt;/span&gt;5,direct&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Another way is to check if you have any files in &lt;code&gt;/proc/sys/fs/binfmt_misc&lt;/code&gt;: a properly mounted &lt;code&gt;binfmt_misc&lt;/code&gt; filesystem will create at least two special files named &lt;code&gt;register&lt;/code&gt; and &lt;code&gt;status&lt;/code&gt; in that directory.&lt;/p&gt;
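&lt;p&gt;The same check can be scripted. The Go sketch below is only an illustration (&lt;code&gt;binfmtMiscMounted&lt;/code&gt; is a hypothetical helper): it reports whether both special files are present in a given directory.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// binfmtMiscMounted reports whether dir contains the two special files
// that a mounted binfmt_misc filesystem always provides.
func binfmtMiscMounted(dir string) bool {
	for _, name := range []string{"register", "status"} {
		if _, err := os.Stat(filepath.Join(dir, name)); err != nil {
			return false
		}
	}
	return true
}

func main() {
	const dir = "/proc/sys/fs/binfmt_misc"
	fmt.Printf("binfmt_misc mounted at %s: %v\n", dir, binfmtMiscMounted(dir))
}
```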

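&lt;p&gt;Next, since we do want our &lt;code&gt;.go&lt;/code&gt; scripts to be able to properly pass the exit code to the operating system, we need the custom &lt;a href="https://github.com/erning/gorun"&gt;gorun&lt;/a&gt; wrapper as our "interpreter" (on newer Go toolchains, where &lt;code&gt;go get&lt;/code&gt; no longer installs binaries, &lt;code&gt;go install github.com/erning/gorun@latest&lt;/code&gt; should do the same job):&lt;br&gt;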
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;go get github.com/erning/gorun
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo mv&lt;/span&gt; ~/go/bin/gorun /usr/local/bin/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Technically we don't need to move &lt;code&gt;gorun&lt;/code&gt; to &lt;code&gt;/usr/local/bin&lt;/code&gt; or any other system path, as &lt;code&gt;binfmt_misc&lt;/code&gt; requires the full path to the interpreter anyway, but the system may run this executable with arbitrary privileges, so it is a good idea to limit access to the file from a security perspective.&lt;/p&gt;

&lt;p&gt;At this point let's create a simple toy Go script &lt;code&gt;helloscript.go&lt;/code&gt; and verify we can successfully "interpret" it. The script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"fmt"&lt;/span&gt;
    &lt;span class="s"&gt;"os"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="s"&gt;"world"&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Args&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Args&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Hello, %v!\n"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;"fail"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;30&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Checking if parameter passing and error handling work as intended:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gorun helloscript.go
Hello, world!
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="nv"&gt;$?&lt;/span&gt;
0
&lt;span class="nv"&gt;$ &lt;/span&gt;gorun helloscript.go gopher
Hello, gopher!
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="nv"&gt;$?&lt;/span&gt;
0
&lt;span class="nv"&gt;$ &lt;/span&gt;gorun helloscript.go fail
Hello, fail!
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="nv"&gt;$?&lt;/span&gt;
30
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we need to tell the &lt;code&gt;binfmt_misc&lt;/code&gt; module how to execute our &lt;code&gt;.go&lt;/code&gt; files with &lt;code&gt;gorun&lt;/code&gt;. Following &lt;a href="https://www.kernel.org/doc/html/v4.14/admin-guide/binfmt-misc.html"&gt;the documentation&lt;/a&gt; we need this configuration string: &lt;code&gt;:golang:E::go::/usr/local/bin/gorun:OC&lt;/code&gt;, which basically tells the system: "if you encounter an executable file with the &lt;code&gt;.go&lt;/code&gt; extension, please execute it with the &lt;code&gt;/usr/local/bin/gorun&lt;/code&gt; interpreter". The &lt;code&gt;OC&lt;/code&gt; flags at the end of the string make sure that the script will be executed according to the owner information and permission bits set on the script itself, and not the ones set on the interpreter binary. This makes Go script execution behave the same as the rest of the executables and scripts in Linux.&lt;/p&gt;
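&lt;p&gt;The configuration string is easier to read once it is split into its fields. According to the kernel documentation the layout is &lt;code&gt;:name:type:offset:magic:mask:interpreter:flags&lt;/code&gt;; for the extension-matching type &lt;code&gt;E&lt;/code&gt; the offset and mask fields stay empty and the magic field holds the extension. The Go sketch below (with a hypothetical &lt;code&gt;extensionRule&lt;/code&gt; helper) simply assembles such a string.&lt;/p&gt;

```go
package main

import "fmt"

// extensionRule builds a binfmt_misc registration string that matches
// files by extension (type E). The offset and mask fields are left
// empty, because they only apply to magic-byte matching (type M).
func extensionRule(name, ext, interpreter, flags string) string {
	// Layout: :name:type:offset:magic:mask:interpreter:flags
	return fmt.Sprintf(":%s:E::%s::%s:%s", name, ext, interpreter, flags)
}

func main() {
	fmt.Println(extensionRule("golang", "go", "/usr/local/bin/gorun", "OC"))
	// prints ":golang:E::go::/usr/local/bin/gorun:OC"
}
```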

&lt;p&gt;Let's register our new Go script binary format:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;':golang:E::go::/usr/local/bin/gorun:OC'&lt;/span&gt; | &lt;span class="nb"&gt;sudo tee&lt;/span&gt; /proc/sys/fs/binfmt_misc/register
:golang:E::go::/usr/local/bin/gorun:OC
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the system successfully registered the format, a new file &lt;code&gt;golang&lt;/code&gt; should appear under the &lt;code&gt;/proc/sys/fs/binfmt_misc&lt;/code&gt; directory. Finally, we can natively execute our &lt;code&gt;.go&lt;/code&gt; files:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;chmod &lt;/span&gt;u+x helloscript.go
&lt;span class="nv"&gt;$ &lt;/span&gt;./helloscript.go
Hello, world!
&lt;span class="nv"&gt;$ &lt;/span&gt;./helloscript.go gopher
Hello, gopher!
&lt;span class="nv"&gt;$ &lt;/span&gt;./helloscript.go fail
Hello, fail!
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="nv"&gt;$?&lt;/span&gt;
30
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it! Now we can edit &lt;code&gt;helloscript.go&lt;/code&gt; to our liking and the changes will be visible the next time the file is executed. Moreover, unlike the previous shebang approach, we can compile this file any time into a real executable with &lt;code&gt;go build&lt;/code&gt;.&lt;/p&gt;

</description>
      <category>go</category>
      <category>linux</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Fixing weak crypto in OpenSSL based applications</title>
      <dc:creator>Ignat Korchagin</dc:creator>
      <pubDate>Sat, 18 Apr 2020 21:37:58 +0000</pubDate>
      <link>https://dev.to/ignatk/fixing-weak-crypto-in-openssl-based-applications-4moi</link>
      <guid>https://dev.to/ignatk/fixing-weak-crypto-in-openssl-based-applications-4moi</guid>
      <description>&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://pqsec.org/2020/04/13/fixing-weak-crypto-in-openssl-based-applications.html"&gt;pqsec.org&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Why crypto becomes weak
&lt;/h3&gt;

&lt;p&gt;No one designs weak cryptographic algorithms on purpose. Well, almost no one: sometimes state intelligence agencies &lt;a href="https://en.wikipedia.org/wiki/Dual_EC_DRBG#Weakness:_a_potential_backdoor"&gt;try to backdoor crypto for their own purposes&lt;/a&gt;, but hopefully this is an exception and in general people have the best intentions in mind.&lt;/p&gt;

&lt;p&gt;So, why does some crypto suddenly become weak? All practical cryptographic algorithms are designed around some hard computational problems. That is, it is practically hard (but not impossible!) to efficiently execute the algorithm without knowing some secret information (the key). The basic assumptions of strong crypto are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;an efficient algorithm that performs a specific information transformation (for example, decryption or signing) without possessing the key does not exist&lt;/li&gt;
&lt;li&gt;all existing algorithms that can do the above require substantial resources (usually compute time and/or memory), which makes them impractical: with current technology it would either take hundreds or thousands of years to crack a single byte, or the whole world just does not have enough memory to accommodate the algorithm's state&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A specific cryptographic algorithm becomes weak when one or both of these assumptions no longer hold. The first assumption might be broken when some researcher invents and publishes an algorithm which makes a hard computational problem not hard anymore: the published approach might significantly reduce the compute/memory requirements to crack the protected information. For example, see why &lt;a href="https://www.rc4nomore.com/"&gt;the RC4 cipher is not used in TLS anymore&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The second assumption is broken naturally over time, mostly because of rapid technological advancements. Not only do computers grow more powerful every day, making more compute and memory resources available, but completely new technologies also emerge, which make it possible to &lt;a href="https://en.wikipedia.org/wiki/Quantum_computing#Cryptography"&gt;fully break some of the modern and most secure asymmetric cryptosystems&lt;/a&gt; in an instant.&lt;/p&gt;

&lt;h3&gt;
  
  
  Hypothetical case study
&lt;/h3&gt;

&lt;p&gt;Imagine you are a security engineer at a SaaS company, which provides cloud document storage as one of its offerings. Your cloud runs a third-party proprietary software stack from some vendor. All documents in the system are indexed by their IDs, which are generated when the document is first uploaded. Your third-party software vendor decided that the simplest way to generate this ID for a document is to just compute its &lt;strong&gt;SHA-1 hash&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;One day you come to the office and see chaos: the world is not the same anymore, because &lt;a href="https://shattered.io/"&gt;SHA-1 was officially declared broken in practice&lt;/a&gt; (and this part is real!). Your company reached out to the vendor to provide a fix, but, as often happens with vendors, they either said it would take them months or years to provide a fix or that the attack "is not applicable to the software security model". Either way your company disagrees and your job is to provide a hotfix, while the business is looking into alternatives.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tool analysis
&lt;/h3&gt;

&lt;p&gt;We previously agreed that the third-party vendor software is proprietary, but for the purposes of this exercise (and so you can compile and run this at home) here is the source version of the hypothetical tool:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;customhash.c:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="cp"&gt;#include &amp;lt;stdio.h&amp;gt;
#include &amp;lt;errno.h&amp;gt;
&lt;/span&gt;
&lt;span class="cp"&gt;#include &amp;lt;openssl/evp.h&amp;gt;
#include &amp;lt;openssl/sha.h&amp;gt;
&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;hash&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;FILE&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;md&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;SHA_DIGEST_LENGTH&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;md_size&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;pos&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;bytes_read&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;pos&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;bytes_read&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fread&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;pos&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bytes_read&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;pos&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;pos&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;bytes_read&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;bytes_read&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fread&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pos&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;pos&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;feof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;errno&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;EIO&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;errno&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;EVP_Digest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pos&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;md&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;md_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;EVP_sha1&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;errno&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;EFAULT&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;errno&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;md_size&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"%02x"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;md&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="n"&gt;puts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;argc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;FILE&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stdin&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;argc&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fopen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="s"&gt;"rb"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;perror&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;errno&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hash&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;perror&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;argc&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;fclose&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So, in a nutshell, the tool just reads the contents of a file into a buffer and computes its SHA-1. Let's verify it works by comparing its output to a well-known SHA-1 implementation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gcc &lt;span class="nt"&gt;-o&lt;/span&gt; customhash customhash.c &lt;span class="nt"&gt;-lcrypto&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo &lt;/span&gt;abc | ./customhash
03cfd743661f07975fa2f1220c5194cbaff48451
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo &lt;/span&gt;abc | &lt;span class="nb"&gt;sha1sum
&lt;/span&gt;03cfd743661f07975fa2f1220c5194cbaff48451  -
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It works. But remember that in our scenario the tool is proprietary: we would receive only the binary, not the source. We can still check whether the binary was linked statically or dynamically and, in the latter case, which libraries it uses:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;ldd ./customhash
    linux-vdso.so.1 &lt;span class="o"&gt;(&lt;/span&gt;0x00007ffc26bea000&lt;span class="o"&gt;)&lt;/span&gt;
    libcrypto.so.1.1 &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; /lib/x86_64-linux-gnu/libcrypto.so.1.1 &lt;span class="o"&gt;(&lt;/span&gt;0x00007f6350798000&lt;span class="o"&gt;)&lt;/span&gt;
    libc.so.6 &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; /lib/x86_64-linux-gnu/libc.so.6 &lt;span class="o"&gt;(&lt;/span&gt;0x00007f63505d7000&lt;span class="o"&gt;)&lt;/span&gt;
    libdl.so.2 &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; /lib/x86_64-linux-gnu/libdl.so.2 &lt;span class="o"&gt;(&lt;/span&gt;0x00007f63505d2000&lt;span class="o"&gt;)&lt;/span&gt;
    libpthread.so.0 &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; /lib/x86_64-linux-gnu/libpthread.so.0 &lt;span class="o"&gt;(&lt;/span&gt;0x00007f63505b1000&lt;span class="o"&gt;)&lt;/span&gt;
    /lib64/ld-linux-x86-64.so.2 &lt;span class="o"&gt;(&lt;/span&gt;0x00007f6350a94000&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We're in luck: it was linked dynamically and it uses &lt;a href="https://www.openssl.org/"&gt;OpenSSL&lt;/a&gt;. We focus on OpenSSL in this post because it is the de facto cryptographic library of choice in many applications, even proprietary ones, thanks to its maturity and permissive license.&lt;/p&gt;

&lt;h3&gt;
  
  
  Hooking crypto with LD_PRELOAD
&lt;/h3&gt;

&lt;p&gt;&lt;a href="http://man7.org/linux/man-pages/man8/ld.so.8.html"&gt;&lt;code&gt;LD_PRELOAD&lt;/code&gt;&lt;/a&gt; is a powerful instrument for modifying the behaviour of a dynamically linked application: it is possible to override almost any library function by defining an environment variable and writing some code. In our case we want to replace the SHA-1 computation in our toy proprietary tool with the more secure SHA-256. But first we need to know which function to "hook" (that is, replace):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;nm &lt;span class="nt"&gt;-D&lt;/span&gt; ./customhash | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="s1"&gt;'U '&lt;/span&gt;
                 U __errno_location
                 U EVP_Digest
                 U EVP_sha1
                 U fclose
                 U feof
                 U fopen
                 U fread
                 U __libc_start_main
                 U perror
                 U &lt;span class="nb"&gt;printf
                 &lt;/span&gt;U puts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The above command lists all the symbols our &lt;code&gt;customhash&lt;/code&gt; tool imports from dynamic libraries ("U" stands for "undefined": the symbol is referenced by the binary but has to be resolved from a shared library at load time). Most functions are from &lt;code&gt;libc&lt;/code&gt;, but &lt;code&gt;EVP_Digest&lt;/code&gt; and &lt;code&gt;EVP_sha1&lt;/code&gt; come from OpenSSL (if we google those, we get directed to the OpenSSL online man page). At this point we need to write a small dynamic library which exports the same functions with the same signatures, but computes SHA-256 instead. In fact we need to replace only &lt;code&gt;EVP_Digest&lt;/code&gt;, as &lt;code&gt;EVP_sha1&lt;/code&gt; just returns a pointer to OpenSSL's internal SHA-1 algorithm descriptor. One potential implementation might look like the one below:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;cryptofix.c:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="cp"&gt;#define _GNU_SOURCE &lt;/span&gt;&lt;span class="cm"&gt;/* for RTLD_NEXT */&lt;/span&gt;&lt;span class="cp"&gt;
#include &amp;lt;dlfcn.h&amp;gt;
#include &amp;lt;stdio.h&amp;gt;
#include &amp;lt;stdlib.h&amp;gt;
#include &amp;lt;string.h&amp;gt;
&lt;/span&gt;
&lt;span class="cp"&gt;#include &amp;lt;openssl/evp.h&amp;gt;
#include &amp;lt;openssl/sha.h&amp;gt;
&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;EVP_Digest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;md&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;EVP_MD&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ENGINE&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;impl&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;sha256_md&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;SHA256_DIGEST_LENGTH&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;sha256_md_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;real_fn&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;md&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;EVP_MD&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ENGINE&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;impl&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;real_fn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;real_fn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dlsym&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;RTLD_NEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"EVP_Digest"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;real_fn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;fputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"cannot find EVP_Digest"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;EVP_sha1&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="cm"&gt;/* real_fn returns 1 on success and 0 on failure */&lt;/span&gt;
        &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;real_fn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sha256_md&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;sha256_md_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;EVP_sha256&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;impl&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;fputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"replacing SHA1 with SHA256&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="cm"&gt;/* copy only the first 20 bytes so a SHA-1 sized caller buffer never overflows */&lt;/span&gt;
        &lt;span class="n"&gt;memcpy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;md&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sha256_md&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SHA_DIGEST_LENGTH&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="cm"&gt;/* the size output is optional in the OpenSSL API */&lt;/span&gt;
            &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SHA_DIGEST_LENGTH&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;real_fn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;md&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;impl&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There are a couple of things to note about the above implementation. First of all, we need a pointer to the real OpenSSL &lt;code&gt;EVP_Digest&lt;/code&gt; function, so we can forward calls to it. We obtain the address using the &lt;a href="http://man7.org/linux/man-pages/man3/dlsym.3.html"&gt;&lt;code&gt;RTLD_NEXT&lt;/code&gt;&lt;/a&gt; trick from &lt;code&gt;libdl&lt;/code&gt;. This is needed because &lt;code&gt;EVP_Digest&lt;/code&gt; is a wrapper function which can compute any hash algorithm supported by OpenSSL. We can't just replace it with a SHA-256 implementation, because the calling application might rely on several hash algorithms at once and use the same function for all of them. So in our implementation we intercept only the calls which request a SHA-1 computation and pass the rest through as is.&lt;/p&gt;

&lt;p&gt;Secondly, we don't want to write a SHA-256 implementation ourselves. We already know that the application is using OpenSSL, so when our code runs, the OpenSSL library is already loaded in our process address space. Moreover, we already have the address of OpenSSL's &lt;code&gt;EVP_Digest&lt;/code&gt; obtained above, so we just call OpenSSL to compute SHA-256 for us.&lt;/p&gt;

&lt;p&gt;Finally, the output of SHA-1 is just 20 bytes, but SHA-256 produces 32 bytes. OpenSSL writes the result to a caller-allocated buffer, and we can't assume the calling application allocated enough memory to store the full SHA-256 result, because it is expecting a SHA-1 hash. To be safe and not introduce a buffer overflow, we strip the extra 12 bytes from the computed SHA-256 before returning the result to the caller. Some security researchers may argue that truncating hash results decreases security, and they will be correct. However, for this use case, it is still more secure to use a secure hash algorithm with a truncated result than an insecure hash algorithm.&lt;/p&gt;
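
&lt;p&gt;To make the truncation concrete, here is a minimal shell sketch (it is not part of the hook itself). It assumes the GNU coreutils &lt;code&gt;sha256sum&lt;/code&gt; and &lt;code&gt;cut&lt;/code&gt; are available; the variable names &lt;code&gt;full&lt;/code&gt; and &lt;code&gt;truncated&lt;/code&gt; are purely illustrative. A 32-byte SHA-256 digest is 64 hex characters, so keeping the first 20 bytes means keeping the first 40 hex characters:&lt;/p&gt;

```shell
# Full SHA-256 digest of the 4-byte input "abc\n": 64 hex characters (32 bytes).
full=$(printf 'abc\n' | sha256sum | cut -d' ' -f1)

# Keep only the first 40 hex characters, i.e. the first 20 bytes,
# which is the size of a SHA-1 digest.
truncated=$(printf '%s' "$full" | cut -c1-40)

echo "$full"
echo "$truncated"
```

&lt;p&gt;The second line printed is what a SHA-1 sized buffer would hold when it is filled with the leading 20 bytes of the SHA-256 digest.&lt;/p&gt;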

&lt;p&gt;Let's check if everything works correctly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gcc &lt;span class="nt"&gt;-shared&lt;/span&gt; &lt;span class="nt"&gt;-fPIC&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; cryptofix.so cryptofix.c
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo &lt;/span&gt;abc | &lt;span class="nv"&gt;LD_PRELOAD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;./cryptofix.so ./customhash
replacing SHA1 with SHA256
edeaaff3f1774ad2888673770c6d64097e391bc3
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo &lt;/span&gt;abc | &lt;span class="nb"&gt;sha256sum
&lt;/span&gt;edeaaff3f1774ad2888673770c6d64097e391bc362d7d6fb34982ddf0efd18cb  -
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Hurray! We successfully replaced weak SHA-1 with stronger SHA-256 without touching any code in the original application.&lt;/p&gt;

&lt;h3&gt;
  
  
  The vendor strikes back
&lt;/h3&gt;

&lt;p&gt;While the vendor may find reasons to refuse fixing our insecure algorithm, they are obliged to fix bugs. If we examine our toy &lt;code&gt;customhash.c&lt;/code&gt; tool, we may notice it has a bug: it can't compute hashes of inputs of 4096 bytes or more because of the fixed-size buffer in the &lt;code&gt;hash&lt;/code&gt; function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;printf&lt;/span&gt; &lt;span class="s1"&gt;'a%.0s'&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;1..4095&lt;span class="o"&gt;}&lt;/span&gt; | ./customhash
10236568a284fb3733bd87c15280af95bd528839
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;printf&lt;/span&gt; &lt;span class="s1"&gt;'a%.0s'&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;1..4096&lt;span class="o"&gt;}&lt;/span&gt; | ./customhash
Input/output error
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
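
&lt;p&gt;The &lt;code&gt;printf 'a%.0s' {1..N}&lt;/code&gt; idiom used above may look cryptic. As a rough sketch of how it works: &lt;code&gt;printf&lt;/code&gt; reuses its format string for every argument, and &lt;code&gt;%.0s&lt;/code&gt; consumes one argument while printing zero characters of it, so each argument contributes exactly one &lt;code&gt;a&lt;/code&gt;. The snippet below uses &lt;code&gt;seq&lt;/code&gt; instead of bash brace expansion so that it also runs in a plain POSIX shell; that substitution and the variable names &lt;code&gt;three&lt;/code&gt; and &lt;code&gt;count&lt;/code&gt; are assumptions made for illustration:&lt;/p&gt;

```shell
# One "a" per argument: the format string is reused for 1, 2 and 3.
three=$(printf 'a%.0s' 1 2 3)
echo "$three"

# Count the bytes produced for 4096 arguments, the input size that
# triggers the error in the tool.
count=$(printf 'a%.0s' $(seq 1 4096) | wc -c)
echo "$count"
```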



&lt;p&gt;So the vendor fixes it and delivers the updated tool (because their code is nicely decoupled, they left the &lt;code&gt;main&lt;/code&gt; function as is and just rewrote the &lt;code&gt;hash&lt;/code&gt; function implementation with the same prototype):&lt;/p&gt;

&lt;p&gt;&lt;em&gt;customhashv2.c:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;hash&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;FILE&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;md&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;SHA_DIGEST_LENGTH&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;md_size&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;bytes_read&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;EVP_MD_CTX&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;EVP_MD_CTX_new&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;errno&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ENOMEM&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;errno&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;EVP_DigestInit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;EVP_sha1&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;EVP_MD_CTX_free&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;errno&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;EFAULT&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;errno&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;bytes_read&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fread&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bytes_read&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;EVP_DigestUpdate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bytes_read&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;EVP_MD_CTX_free&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="n"&gt;errno&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;EFAULT&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;errno&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;bytes_read&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fread&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;feof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;EVP_MD_CTX_free&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;errno&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;EIO&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;errno&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;EVP_DigestFinal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;md&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;md_size&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;EVP_MD_CTX_free&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;errno&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;EFAULT&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;errno&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;md_size&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"%02x"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;md&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="n"&gt;puts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

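    &lt;span class="n"&gt;EVP_MD_CTX_free&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;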
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's check if it works:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gcc &lt;span class="nt"&gt;-o&lt;/span&gt; customhashv2 customhashv2.c &lt;span class="nt"&gt;-lcrypto&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo &lt;/span&gt;abc | ./customhashv2
03cfd743661f07975fa2f1220c5194cbaff48451
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo &lt;/span&gt;abc | &lt;span class="nb"&gt;sha1sum
&lt;/span&gt;03cfd743661f07975fa2f1220c5194cbaff48451  -
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;printf&lt;/span&gt; &lt;span class="s1"&gt;'a%.0s'&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;1..4096&lt;span class="o"&gt;}&lt;/span&gt; | ./customhashv2
8c51fb6a0b587ec95ca74acfa43df7539b486297
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;printf&lt;/span&gt; &lt;span class="s1"&gt;'a%.0s'&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;1..4096&lt;span class="o"&gt;}&lt;/span&gt; | &lt;span class="nb"&gt;sha1sum
&lt;/span&gt;8c51fb6a0b587ec95ca74acfa43df7539b486297  -
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Good! The bug is fixed, but does our hack still work?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo &lt;/span&gt;abc | &lt;span class="nv"&gt;LD_PRELOAD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;./cryptofix.so ./customhashv2
03cfd743661f07975fa2f1220c5194cbaff48451
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We don't see our "replacing SHA1 with SHA256" message anymore and the new tool clearly computes SHA-1. This is because the updated tool uses different functions from OpenSSL to do its job and we did not hook those:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;nm &lt;span class="nt"&gt;-D&lt;/span&gt; ./customhashv2 | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="s1"&gt;'U '&lt;/span&gt;
                 U __errno_location
                 U EVP_DigestFinal
                 U EVP_DigestInit
                 U EVP_DigestUpdate
                 U EVP_MD_CTX_free
                 U EVP_MD_CTX_new
                 U EVP_sha1
                 U fclose
                 U feof
                 U fopen
                 U fread
                 U __libc_start_main
                 U perror
                 U &lt;span class="nb"&gt;printf
                 &lt;/span&gt;U puts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



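&lt;p&gt;To support arbitrary-length files, the tool now uses the incremental digest interface (&lt;code&gt;EVP_DigestInit&lt;/code&gt;, &lt;code&gt;EVP_DigestUpdate&lt;/code&gt;, &lt;code&gt;EVP_DigestFinal&lt;/code&gt;), which processes the data chunk by chunk. So we need to update our hooking library:&lt;/p&gt;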

&lt;p&gt;&lt;em&gt;cryptofixv2.c:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="cp"&gt;#define _GNU_SOURCE &lt;/span&gt;&lt;span class="cm"&gt;/* for RTLD_NEXT */&lt;/span&gt;&lt;span class="cp"&gt;
#include &amp;lt;dlfcn.h&amp;gt;
#include &amp;lt;stdio.h&amp;gt;
#include &amp;lt;stdlib.h&amp;gt;
#include &amp;lt;string.h&amp;gt;
&lt;/span&gt;
&lt;span class="cp"&gt;#include &amp;lt;openssl/evp.h&amp;gt;
#include &amp;lt;openssl/sha.h&amp;gt;
&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;EVP_DigestInit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EVP_MD_CTX&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;EVP_MD&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;real_fn&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;EVP_MD_CTX&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;EVP_MD&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;real_fn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;real_fn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dlsym&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;RTLD_NEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"EVP_DigestInit"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;real_fn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;fputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"cannot find EVP_DigestInit"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;EVP_sha1&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;real_fn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;EVP_sha256&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;real_fn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;EVP_DigestFinal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EVP_MD_CTX&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;md&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;real_fn&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;EVP_MD_CTX&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;md&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;real_fn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;real_fn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dlsym&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;RTLD_NEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"EVP_DigestFinal"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;real_fn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;fputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"cannot find EVP_DigestFinal"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EVP_MD_CTX_md&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;EVP_sha256&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;sha256_md&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;SHA256_DIGEST_LENGTH&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
        &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;sha256_md_size&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;real_fn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sha256_md&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;sha256_md_size&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;fputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"replacing SHA1 with SHA256&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;memcpy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;md&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sha256_md&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SHA_DIGEST_LENGTH&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SHA_DIGEST_LENGTH&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;real_fn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;md&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we hook two functions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;as before, in &lt;code&gt;EVP_DigestInit&lt;/code&gt; we detect when the caller requests SHA-1 calculation and instead request SHA-256 calculation from OpenSSL&lt;/li&gt;
&lt;li&gt;in &lt;code&gt;EVP_DigestFinal&lt;/code&gt; we truncate the results of any SHA-256 calculation to 20 bytes and return the results to the caller&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For simplicity, this implementation assumes that the calling application never requests SHA-256 hash calculations on its own. If it does, the hooking library becomes more complex: we have to track (for example, in a set) the OpenSSL context objects we "patched" in &lt;code&gt;EVP_DigestInit&lt;/code&gt;, so that &lt;code&gt;EVP_DigestFinal&lt;/code&gt; truncates only the results that were originally meant to be SHA-1.&lt;/p&gt;
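
&lt;p&gt;One possible way to do that tracking is sketched below. It is only an illustration under stated assumptions: the helper names (&lt;code&gt;patched_add&lt;/code&gt;, &lt;code&gt;patched_contains&lt;/code&gt;, &lt;code&gt;patched_remove&lt;/code&gt;) and the fixed table size are hypothetical and not part of OpenSSL, contexts are stored as opaque pointers, and no locking is done, so it is not thread-safe as written.&lt;/p&gt;

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers, not part of OpenSSL: a tiny fixed-size set of
 * context pointers that our EVP_DigestInit hook switched to SHA-256. */
#define MAX_PATCHED 64

static const void *patched[MAX_PATCHED];

/* Remember ctx. Returns 0 on success, -1 if the table is full. */
static int patched_add(const void *ctx)
{
    size_t i, free_slot = MAX_PATCHED;

    assert(ctx != NULL);
    for (i = 0; i < MAX_PATCHED; i++)
    {
        if (patched[i] == ctx)
            return 0; /* already tracked */
        if (!patched[i] && free_slot == MAX_PATCHED)
            free_slot = i; /* first empty slot seen so far */
    }

    if (free_slot == MAX_PATCHED)
        return -1;

    patched[free_slot] = ctx;
    return 0;
}

/* Returns 1 if ctx was patched by us, 0 otherwise. */
static int patched_contains(const void *ctx)
{
    size_t i;

    if (!ctx)
        return 0;

    for (i = 0; i < MAX_PATCHED; i++)
        if (patched[i] == ctx)
            return 1;
    return 0;
}

/* Forget ctx. Safe to call for pointers that were never tracked. */
static void patched_remove(const void *ctx)
{
    size_t i;

    if (!ctx)
        return;

    for (i = 0; i < MAX_PATCHED; i++)
        if (patched[i] == ctx)
            patched[i] = NULL;
}
```

&lt;p&gt;With something like this in place, the &lt;code&gt;EVP_DigestInit&lt;/code&gt; hook would call &lt;code&gt;patched_add(ctx)&lt;/code&gt; when it swaps SHA-1 for SHA-256 (and &lt;code&gt;patched_remove(ctx)&lt;/code&gt; otherwise), and the &lt;code&gt;EVP_DigestFinal&lt;/code&gt; hook would truncate only when &lt;code&gt;patched_contains(ctx)&lt;/code&gt; returns 1. A real library would presumably also need to forget contexts when they are freed, because the allocator may reuse their addresses.&lt;/p&gt;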

&lt;p&gt;Checking that the new hooking library works:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gcc &lt;span class="nt"&gt;-shared&lt;/span&gt; &lt;span class="nt"&gt;-fPIC&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; cryptofixv2.so cryptofixv2.c
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo &lt;/span&gt;abc | &lt;span class="nv"&gt;LD_PRELOAD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;./cryptofixv2.so ./customhashv2
replacing SHA1 with SHA256
edeaaff3f1774ad2888673770c6d64097e391bc3
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo &lt;/span&gt;abc | &lt;span class="nb"&gt;sha256sum
&lt;/span&gt;edeaaff3f1774ad2888673770c6d64097e391bc362d7d6fb34982ddf0efd18cb  -
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;OK, we're good! Did we cover all possible cases? Here is another potential update from the vendor:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;customhashv3.c:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;hash&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;FILE&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;md&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;SHA_DIGEST_LENGTH&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;md_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;md&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;bytes_read&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;BIO&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;filebio&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;sha1bio&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;filebio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;BIO_new_fp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BIO_NOCLOSE&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;filebio&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;errno&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ENOMEM&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;errno&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;sha1bio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;BIO_new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BIO_f_md&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;sha1bio&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;BIO_free&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;filebio&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;errno&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ENOMEM&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;errno&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;BIO_set_md&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sha1bio&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;EVP_sha1&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
    &lt;span class="n"&gt;BIO_push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sha1bio&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;filebio&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="n"&gt;bytes_read&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;BIO_read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sha1bio&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bytes_read&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;bytes_read&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;BIO_read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sha1bio&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bytes_read&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;BIO_free_all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sha1bio&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;errno&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;EIO&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;errno&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BIO_gets&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sha1bio&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;md&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;md&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;BIO_free_all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sha1bio&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;errno&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;EFAULT&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;errno&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;BIO_free_all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sha1bio&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;md_size&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"%02x"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;md&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="n"&gt;puts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It works just like the previous one:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gcc &lt;span class="nt"&gt;-o&lt;/span&gt; customhashv3 customhashv3.c &lt;span class="nt"&gt;-lcrypto&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo &lt;/span&gt;abc | ./customhashv3
03cfd743661f07975fa2f1220c5194cbaff48451
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But we can already guess that our hooking library will no longer work just by looking at the imported symbols:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;nm &lt;span class="nt"&gt;-D&lt;/span&gt; ./customhashv3 | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="s1"&gt;'U '&lt;/span&gt;
                 U BIO_ctrl
                 U BIO_f_md
                 U BIO_free
                 U BIO_free_all
                 U BIO_gets
                 U BIO_new
                 U BIO_new_fp
                 U BIO_push
                 U BIO_read
                 U __errno_location
                 U EVP_sha1
                 U fclose
                 U fopen
                 U __libc_start_main
                 U perror
                 U &lt;span class="nb"&gt;printf
                 &lt;/span&gt;U puts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The calling application uses yet another set of function calls to compute the SHA-1 digest, so we have to come up with a new fix:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;cryptofixv3.c:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="cp"&gt;#define _GNU_SOURCE &lt;/span&gt;&lt;span class="cm"&gt;/* for RTLD_NEXT */&lt;/span&gt;&lt;span class="cp"&gt;
#include &amp;lt;dlfcn.h&amp;gt;
#include &amp;lt;stdio.h&amp;gt;
#include &amp;lt;stdlib.h&amp;gt;
#include &amp;lt;string.h&amp;gt;
&lt;/span&gt;
&lt;span class="cp"&gt;#include &amp;lt;openssl/evp.h&amp;gt;
#include &amp;lt;openssl/sha.h&amp;gt;
&lt;/span&gt;
&lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="nf"&gt;BIO_ctrl&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BIO&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;bp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;larg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;parg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;real_fn&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;BIO&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;bp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;larg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;parg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;real_fn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;real_fn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dlsym&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;RTLD_NEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"BIO_ctrl"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;real_fn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;fputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"cannot find BIO_ctrl"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;BIO_C_SET_MD&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;parg&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;EVP_sha1&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;real_fn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;larg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;EVP_sha256&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;real_fn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;larg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;parg&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;BIO_gets&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BIO&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;bp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;EVP_MD&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;md&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;real_fn&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;BIO&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;bp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;real_fn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;real_fn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dlsym&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;RTLD_NEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"BIO_gets"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;real_fn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;fputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"cannot find BIO_gets"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BIO_method_type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;BIO_TYPE_MD&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;BIO_get_md&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;md&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;md&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;EVP_sha256&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;sha256_md&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;SHA256_DIGEST_LENGTH&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
            &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;SHA_DIGEST_LENGTH&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;real_fn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sha256_md&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sha256_md&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
            &lt;span class="n"&gt;fputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"replacing SHA1 with SHA256&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="n"&gt;memcpy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sha256_md&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;real_fn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It is up to the reader to verify that the above code works, but it is worth noting that it suffers from the same limitations and assumptions as &lt;code&gt;v2&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;At this point it is clear that OpenSSL has a rather diverse API and the same thing can be implemented in many different ways. This makes OpenSSL algorithm hooking hard as it is almost impossible to account for all cases and combinations.&lt;/p&gt;

&lt;h3&gt;
  
  
  OpenSSL engines to the rescue
&lt;/h3&gt;

&lt;p&gt;While we can't provide reliable algorithm replacement for any cryptographic library, if the application uses OpenSSL, we can do better than above with &lt;a href="https://github.com/openssl/openssl/blob/master/README.ENGINE"&gt;OpenSSL engines&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;OpenSSL engines are third-party extensions anyone can write to provide a custom implementation of any cryptographic algorithm. They are used primarily for two cases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;integrating different hardware cryptographic devices into OpenSSL and OpenSSL-based applications&lt;/li&gt;
&lt;li&gt;introducing new cryptographic algorithms into OpenSSL and making them available via the generic OpenSSL &lt;code&gt;EVP_x&lt;/code&gt; API&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But we will abuse the framework a bit: we will write an "alternative" implementation of the SHA-1 algorithm, which will actually perform SHA-256 computations (the code below is based on &lt;a href="https://www.openssl.org/blog/blog/2015/11/23/engine-building-lesson-2-an-example-md5-engine/"&gt;the example from the OpenSSL blog&lt;/a&gt;):&lt;/p&gt;

&lt;p&gt;&lt;em&gt;sha1-sha256.c:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="cp"&gt;#include &amp;lt;string.h&amp;gt;
&lt;/span&gt;
&lt;span class="cp"&gt;#include &amp;lt;openssl/engine.h&amp;gt;
#include &amp;lt;openssl/evp.h&amp;gt;
#include &amp;lt;openssl/sha.h&amp;gt;
&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;engine_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"sha1-sha256"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;engine_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
    &lt;span class="s"&gt;"An engine, which converts SHA1 to SHA256 for better security"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;digest_init&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EVP_MD_CTX&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;SHA256_Init&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EVP_MD_CTX_md_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;digest_update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EVP_MD_CTX&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;SHA256_Update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EVP_MD_CTX_md_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;digest_final&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EVP_MD_CTX&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;md&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;sha256_md&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;SHA256_DIGEST_LENGTH&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SHA256_Final&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sha256_md&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;EVP_MD_CTX_md_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="n"&gt;fputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"replacing SHA1 with SHA256&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="n"&gt;memcpy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;md&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sha256_md&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SHA_DIGEST_LENGTH&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="n"&gt;EVP_MD&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;digest_meth&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;digest_nids&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;NID_sha1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;digests&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ENGINE&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;EVP_MD&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;digest&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;nids&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                   &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;nid&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;digest&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;nids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;digest_nids&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;digest_nids&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;digest_nids&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;switch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nid&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;NID_sha1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;digest_meth&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="n"&gt;digest_meth&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;EVP_MD_meth_new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NID_sha1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;NID_sha1WithRSAEncryption&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;digest_meth&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;EVP_MD_meth_set_result_size&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;digest_meth&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SHA_DIGEST_LENGTH&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt;
          &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;EVP_MD_meth_set_flags&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;digest_meth&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;EVP_MD_FLAG_DIGALGID_ABSENT&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt;
          &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;EVP_MD_meth_set_init&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;digest_meth&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;digest_init&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt;
          &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;EVP_MD_meth_set_update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;digest_meth&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;digest_update&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt;
          &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;EVP_MD_meth_set_final&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;digest_meth&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;digest_final&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt;
          &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;EVP_MD_meth_set_cleanup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;digest_meth&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt;
          &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;EVP_MD_meth_set_ctrl&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;digest_meth&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt;
          &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;EVP_MD_meth_set_input_blocksize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;digest_meth&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SHA_CBLOCK&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt;
          &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;EVP_MD_meth_set_app_datasize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
              &lt;span class="n"&gt;digest_meth&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EVP_MD&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SHA256_CTX&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt;
          &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;EVP_MD_meth_set_copy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;digest_meth&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

        &lt;span class="k"&gt;goto&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;digest&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;digest_meth&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;default:&lt;/span&gt;
    &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;digest&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nl"&gt;err:&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;digest_meth&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;EVP_MD_meth_free&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;digest_meth&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;digest_meth&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;engine_init&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ENGINE&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;engine_finish&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ENGINE&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;digest_meth&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;EVP_MD_meth_free&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;digest_meth&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;digest_meth&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;bind&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ENGINE&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;ENGINE_set_id&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;engine_id&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;goto&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;ENGINE_set_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;engine_name&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;goto&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;ENGINE_set_init_function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;engine_init&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;goto&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;ENGINE_set_finish_function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;engine_finish&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;goto&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;ENGINE_set_digests&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;digests&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;goto&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nl"&gt;err:&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;IMPLEMENT_DYNAMIC_BIND_FN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bind&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;IMPLEMENT_DYNAMIC_CHECK_FN&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The engine above registers itself with OpenSSL as a SHA-1 implementation, but internally reuses OpenSSL's own SHA-256 routines to compute the digest. It also truncates the output to 20 bytes so as not to confuse applications expecting a SHA-1 result. Let's test it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gcc &lt;span class="nt"&gt;-shared&lt;/span&gt; &lt;span class="nt"&gt;-fPIC&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; cryptofix_engine.so sha1-sha256.c
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo &lt;/span&gt;abc | openssl sha1
&lt;span class="o"&gt;(&lt;/span&gt;stdin&lt;span class="o"&gt;)=&lt;/span&gt; 03cfd743661f07975fa2f1220c5194cbaff48451
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo &lt;/span&gt;abc | openssl sha1 &lt;span class="nt"&gt;-engine&lt;/span&gt; ./cryptofix_engine.so
engine &lt;span class="s2"&gt;"sha1-sha256"&lt;/span&gt; set.
replacing SHA1 with SHA256
&lt;span class="o"&gt;(&lt;/span&gt;stdin&lt;span class="o"&gt;)=&lt;/span&gt; edeaaff3f1774ad2888673770c6d64097e391bc3
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo &lt;/span&gt;abc | &lt;span class="nb"&gt;sha256sum
&lt;/span&gt;edeaaff3f1774ad2888673770c6d64097e391bc362d7d6fb34982ddf0efd18cb  -
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It seems to work as expected. Let's try it with our proprietary tool:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo &lt;/span&gt;abc | &lt;span class="nv"&gt;LD_PRELOAD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;./cryptofix_engine.so ./customhashv3
03cfd743661f07975fa2f1220c5194cbaff48451
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Hmm... Nothing changed: we don't get our debug message and the result is still a SHA-1 hash. The reason is that, to make the engine available, the application also needs to call &lt;a href="https://www.openssl.org/docs/man1.1.1/man3/ENGINE_ctrl_cmd_string.html"&gt;some OpenSSL API&lt;/a&gt; to load and configure it, and not all OpenSSL-based applications are engine-aware. Obviously, the command-line &lt;code&gt;openssl&lt;/code&gt; utility we used above is: the engine configuration API is invoked when we specify the &lt;code&gt;-engine&lt;/code&gt; parameter. Others, like NGINX and OpenVPN, have directives in their configuration files where the user can specify the desired OpenSSL engine. But most applications are not engine-aware: developers just use OpenSSL as a crypto library and don't expect users to replace the crypto algorithms.&lt;/p&gt;

&lt;h3&gt;
  
  
  Injecting code on process startup
&lt;/h3&gt;

&lt;p&gt;As we established above, our custom tool is not OpenSSL engine-aware, so we somehow need to make it call the &lt;a href="https://www.openssl.org/docs/man1.1.1/man3/ENGINE_ctrl_cmd_string.html"&gt;OpenSSL engine configuration API&lt;/a&gt; before it computes its first SHA-1 hash. We could probably hook some other function, even one from &lt;code&gt;libc&lt;/code&gt;, and hope it is called before the OpenSSL ones, but we would then be subject to the problem described above: a vendor update could break our hotfix.&lt;/p&gt;

&lt;p&gt;A better way is to just implement the desired engine configuration in a function and mark it as an &lt;a href="https://gcc.gnu.org/onlinedocs/gccint/Initialization.html"&gt;"initialisation routine"&lt;/a&gt;:&lt;/p&gt;
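
&lt;p&gt;To see the mechanism in isolation before the full listing, here is a minimal sketch, assuming GCC or Clang (the &lt;code&gt;constructor&lt;/code&gt; attribute is a compiler extension, not standard C). The names &lt;code&gt;init_done&lt;/code&gt;, &lt;code&gt;early_init&lt;/code&gt; and &lt;code&gt;constructor_has_run&lt;/code&gt; are made up for illustration and are not part of the hotfix:&lt;/p&gt;

```c
/* Set by the constructor below; stays 0 if the constructor never runs. */
static int init_done = 0;

/* GCC and Clang run functions marked with the constructor attribute
 * when the object is loaded, before main() starts. */
static __attribute__((constructor)) void early_init(void) {
  init_done = 1;
}

/* Hypothetical helper: reports whether early_init() has already run. */
int constructor_has_run(void) {
  return init_done;
}
```

&lt;p&gt;When this is compiled into an executable or a preloaded shared object, &lt;code&gt;constructor_has_run()&lt;/code&gt; should already return 1 by the time &lt;code&gt;main()&lt;/code&gt; starts, with no cooperation from the application. That is the property the engine loader relies on.&lt;/p&gt;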

&lt;p&gt;&lt;em&gt;autoload.c:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="cp"&gt;#define _GNU_SOURCE &lt;/span&gt;&lt;span class="cm"&gt;/* for dladdr and Dl_info */&lt;/span&gt;&lt;span class="cp"&gt;
#include &amp;lt;dlfcn.h&amp;gt;
#include &amp;lt;stdio.h&amp;gt;
#include &amp;lt;stdlib.h&amp;gt;
&lt;/span&gt;
&lt;span class="cp"&gt;#include &amp;lt;openssl/engine.h&amp;gt;
&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;fatal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;fputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="nf"&gt;__attribute__&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;engine_preload&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// OpenSSL dynamic engine needs a filesystem path to the engine&lt;/span&gt;
  &lt;span class="c1"&gt;// so we determine our own filesystem path first&lt;/span&gt;
  &lt;span class="n"&gt;Dl_info&lt;/span&gt; &lt;span class="n"&gt;dinfo&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dladdr&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;engine_preload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;dinfo&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;fatal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"failed to query engine module info"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;NULL&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;dinfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dli_fname&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;fatal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"failed to determine engine filesystem path"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="n"&gt;ENGINE_load_dynamic&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="n"&gt;ENGINE&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ENGINE_by_id&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"dynamic"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;NULL&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;fatal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"failed to load OpenSSL dynamic engine"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ENGINE_ctrl_cmd_string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"SO_PATH"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dinfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dli_fname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;fatal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"failed to set SO_PATH parameter for dynamic engine"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ENGINE_ctrl_cmd_string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"ID"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"sha1-sha256"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;fatal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"failed to set ID parameter for dynamic engine"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ENGINE_ctrl_cmd_string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"LOAD"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;fatal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"failed to LOAD sha1-sha256 engine"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ENGINE_set_default&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ENGINE_METHOD_ALL&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;fatal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"failed to set algorithms from sha1-sha256 engine as default"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The OpenSSL engine configuration API needs a filesystem path to the desired engine. We assume that the above code will be part of our &lt;code&gt;cryptofix_engine.so&lt;/code&gt; library, so we just get the filesystem path of the currently executing module and pass it to the OpenSSL engine configuration API. But the magic here is in the function declaration: notice the &lt;code&gt;__attribute__((constructor))&lt;/code&gt; in the prototype. It marks this code as an "initialisation routine", so it is executed automatically when the library is loaded, which on process startup means even before the &lt;code&gt;main&lt;/code&gt; function. The beauty of this approach is that we don't rely on hooking any function in the target application. In fact, this code will always be executed regardless of the application logic, as long as the application loads our shared library.&lt;/p&gt;

&lt;p&gt;Let's recompile our &lt;code&gt;cryptofix_engine.so&lt;/code&gt; including this function and test it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gcc &lt;span class="nt"&gt;-shared&lt;/span&gt; &lt;span class="nt"&gt;-fPIC&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; cryptofix_engine.so autoload.c sha1-sha256.c
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo &lt;/span&gt;abc | &lt;span class="nv"&gt;LD_PRELOAD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;./cryptofix_engine.so ./customhashv3
replacing SHA1 with SHA256
edeaaff3f1774ad2888673770c6d64097e391bc3
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo &lt;/span&gt;abc | &lt;span class="nb"&gt;sha256sum
&lt;/span&gt;edeaaff3f1774ad2888673770c6d64097e391bc362d7d6fb34982ddf0efd18cb  -
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It worked! But because we replaced the algorithm via an OpenSSL engine it also works for every previous version of the tool and most likely for any future one:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo &lt;/span&gt;abc | &lt;span class="nv"&gt;LD_PRELOAD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;./cryptofix_engine.so ./customhashv2
replacing SHA1 with SHA256
edeaaff3f1774ad2888673770c6d64097e391bc3
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo &lt;/span&gt;abc | &lt;span class="nv"&gt;LD_PRELOAD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;./cryptofix_engine.so ./customhash
replacing SHA1 with SHA256
edeaaff3f1774ad2888673770c6d64097e391bc3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So our hotfix is much more reliable now and future-proof.&lt;/p&gt;

&lt;h3&gt;
  
  
  Getting rid of LD_PRELOAD
&lt;/h3&gt;

&lt;p&gt;So far we have a reliable hotfix for our weak proprietary hashing tool. However, to ensure our code is always loaded, we have to specify the &lt;code&gt;LD_PRELOAD&lt;/code&gt; environment variable every time the tool is executed. This is not only error-prone (we might just forget to define the variable when invoking the tool), but also &lt;a href="http://man7.org/linux/man-pages/man8/ld.so.8.html"&gt;does not work in all cases&lt;/a&gt; (for example, the environment variable is ignored when invoking executables with the &lt;a href="https://en.wikipedia.org/wiki/Setuid"&gt;&lt;code&gt;setuid&lt;/code&gt;/&lt;code&gt;setgid&lt;/code&gt;&lt;/a&gt; bit set).&lt;/p&gt;

&lt;p&gt;We can permanently patch the custom tool without recompiling it and add our &lt;code&gt;cryptofix_engine.so&lt;/code&gt; shared library as a runtime dependency:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;patchelf &lt;span class="nt"&gt;--add-needed&lt;/span&gt; ./cryptofix_engine.so ./customhashv3
&lt;span class="nv"&gt;$ &lt;/span&gt;ldd ./customhashv3
    linux-vdso.so.1 &lt;span class="o"&gt;(&lt;/span&gt;0x00007ffd40977000&lt;span class="o"&gt;)&lt;/span&gt;
    ./cryptofix_engine.so &lt;span class="o"&gt;(&lt;/span&gt;0x00007faf1d1ce000&lt;span class="o"&gt;)&lt;/span&gt;
    libcrypto.so.1.1 &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; /lib/x86_64-linux-gnu/libcrypto.so.1.1 &lt;span class="o"&gt;(&lt;/span&gt;0x00007faf1ced9000&lt;span class="o"&gt;)&lt;/span&gt;
    libc.so.6 &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; /lib/x86_64-linux-gnu/libc.so.6 &lt;span class="o"&gt;(&lt;/span&gt;0x00007faf1cd18000&lt;span class="o"&gt;)&lt;/span&gt;
    libdl.so.2 &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; /lib/x86_64-linux-gnu/libdl.so.2 &lt;span class="o"&gt;(&lt;/span&gt;0x00007faf1cd13000&lt;span class="o"&gt;)&lt;/span&gt;
    libpthread.so.0 &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; /lib/x86_64-linux-gnu/libpthread.so.0 &lt;span class="o"&gt;(&lt;/span&gt;0x00007faf1ccf2000&lt;span class="o"&gt;)&lt;/span&gt;
    /lib64/ld-linux-x86-64.so.2 &lt;span class="o"&gt;(&lt;/span&gt;0x00007faf1d1db000&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From now on our &lt;code&gt;cryptofix_engine.so&lt;/code&gt; is a runtime dependency of the &lt;code&gt;customhashv3&lt;/code&gt; tool and will always be loaded when the binary is executed, even without any &lt;code&gt;LD_PRELOAD&lt;/code&gt; definitions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo &lt;/span&gt;abc | ./customhashv3
replacing SHA1 with SHA256
edeaaff3f1774ad2888673770c6d64097e391bc3
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo &lt;/span&gt;abc | &lt;span class="nb"&gt;sha256sum
&lt;/span&gt;edeaaff3f1774ad2888673770c6d64097e391bc362d7d6fb34982ddf0efd18cb  -
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
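
&lt;p&gt;Comparing the two outputs above by eye works, but the check is easy to script. The following is a minimal sketch, assuming the coreutils &lt;code&gt;sha256sum&lt;/code&gt; and &lt;code&gt;cut&lt;/code&gt; tools are available; the &lt;code&gt;expected&lt;/code&gt; value is simply copied from the &lt;code&gt;customhashv3&lt;/code&gt; output above, and the variable names are only illustrative:&lt;/p&gt;

```shell
# Digest printed by the patched tool, copied from the output above.
expected="edeaaff3f1774ad2888673770c6d64097e391bc3"

# sha256sum prints "digest  -"; keep only the first 40 hex characters,
# which is the length of a SHA1 digest.
truncated=$(echo abc | sha256sum | cut -c1-40)

if [ "$truncated" = "$expected" ]; then
  echo "match"
else
  echo "mismatch"
fi
```

&lt;p&gt;If the patched binary is at hand, the hard-coded &lt;code&gt;expected&lt;/code&gt; value could be replaced with the second line of its actual output.&lt;/p&gt;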



&lt;h3&gt;
  
  
  Conclusions
&lt;/h3&gt;

&lt;p&gt;This post, although based on an imaginary scenario, reflects some real-world use cases and experiences. It also covers some powerful runtime code patching approaches, which are useful even without the need to replace weak crypto in proprietary code and can be adopted separately or together. All code from the post is &lt;a href="https://github.com/pqsec/cryptofix"&gt;published here&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>security</category>
      <category>crypto</category>
      <category>opensource</category>
      <category>linux</category>
    </item>
    <item>
      <title>Speeding up Linux disk encryption</title>
      <dc:creator>Ignat Korchagin</dc:creator>
      <pubDate>Sat, 18 Apr 2020 21:31:09 +0000</pubDate>
      <link>https://dev.to/ignatk/speeding-up-linux-disk-encryption-3n46</link>
      <guid>https://dev.to/ignatk/speeding-up-linux-disk-encryption-3n46</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a repost of my post from the &lt;a href="https://blog.cloudflare.com/speeding-up-linux-disk-encryption/"&gt;Cloudflare Blog&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Data encryption at rest is a must-have for any modern Internet company. Many companies, however, don't encrypt their disks, because they fear the potential performance penalty caused by encryption overhead.&lt;/p&gt;

&lt;p&gt;Encrypting data at rest is vital for Cloudflare with &lt;a href="https://www.cloudflare.com/network/"&gt;more than 200 data centres across the world&lt;/a&gt;. In this post, we will investigate the performance of disk encryption on Linux and explain how we made it at least two times faster for ourselves and our customers!&lt;/p&gt;

&lt;h3&gt;
  
  
  Encrypting data at rest
&lt;/h3&gt;

&lt;p&gt;When it comes to encrypting data at rest there are several ways it can be implemented on a modern operating system (OS). Available techniques are tightly coupled with a &lt;a href="https://en.wikibooks.org/wiki/The_Linux_Kernel/Storage"&gt;typical OS storage stack&lt;/a&gt;. A simplified version of the storage stack and encryption solutions can be found on the diagram below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--pcrFZDIJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog-cloudflare-com-assets.storage.googleapis.com/2020/03/storage-stack.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--pcrFZDIJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog-cloudflare-com-assets.storage.googleapis.com/2020/03/storage-stack.png" alt="storage-stack"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At the top of the stack are applications, which read and write data in files (or streams). The file system in the OS kernel keeps track of which blocks of the underlying block device belong to which files and translates these file reads and writes into block reads and writes. However, the hardware specifics of the underlying storage device are abstracted away from the file system. Finally, the block subsystem actually passes the block reads and writes to the underlying hardware using appropriate device drivers.&lt;/p&gt;

&lt;p&gt;The concept of the storage stack is actually similar to the &lt;a href="https://www.cloudflare.com/learning/ddos/glossary/open-systems-interconnection-model-osi/"&gt;well-known network OSI model&lt;/a&gt;, where each layer has a higher-level view of the information and the implementation details of the lower layers are abstracted away from the upper layers. And, similar to the OSI model, one can apply encryption at different layers (think about &lt;a href="https://www.cloudflare.com/learning/ssl/transport-layer-security-tls/"&gt;TLS&lt;/a&gt; vs &lt;a href="https://en.wikipedia.org/wiki/IPsec"&gt;IPsec&lt;/a&gt; or &lt;a href="https://www.cloudflare.com/learning/access-management/what-is-a-vpn/"&gt;a VPN&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;For data at rest we can apply encryption either at the block layer (either in hardware or in software) or at the file level (either directly in applications or in the file system).&lt;/p&gt;

&lt;h4&gt;
  
  
  Block vs file encryption
&lt;/h4&gt;

&lt;p&gt;Generally, the higher in the stack we apply encryption, the more flexibility we have. With application level encryption the application maintainers can apply any encryption code they please to any particular data they need. The downside of this approach is they actually have to implement it themselves and encryption in general is not very developer-friendly: one has to know the ins and outs of a specific cryptographic algorithm, properly generate keys, nonces, IVs etc. Additionally, application level encryption does not leverage OS-level caching and &lt;a href="https://en.wikipedia.org/wiki/Page_cache"&gt;Linux page cache&lt;/a&gt; in particular: each time the application needs to use the data, it has to either decrypt it again, wasting CPU cycles, or implement its own decrypted “cache”, which introduces more complexity to the code.&lt;/p&gt;

&lt;p&gt;File system level encryption makes data encryption transparent to applications, because the file system itself encrypts the data before passing it to the block subsystem, so files are encrypted regardless of whether the application has crypto support. Also, file systems can be configured to encrypt only a particular directory or have different keys for different files. This flexibility, however, comes at the cost of a more complex configuration. File system encryption is also considered less secure than block device encryption, as only the contents of the files are encrypted. Files also have associated metadata, like file size, the number of files, the directory tree layout etc., which are still visible to a potential adversary.&lt;/p&gt;

&lt;p&gt;Encryption down at the block layer (often referred to as &lt;a href="https://en.wikipedia.org/wiki/Disk_encryption"&gt;disk encryption&lt;/a&gt; or full disk encryption) also makes data encryption transparent to applications and even whole file systems. Unlike file system level encryption it encrypts all data on the disk including file metadata and even free space. It is less flexible though - one can only encrypt the whole disk with a single key, so there is no per-directory, per-file or per-user configuration. From the crypto perspective, not all cryptographic algorithms can be used as the block layer doesn't have a high-level overview of the data anymore, so it needs to process each block independently. Most &lt;a href="https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#Common_modes"&gt;common algorithms require some sort of block chaining&lt;/a&gt; to be secure, so are not applicable to disk encryption. Instead, &lt;a href="https://en.wikipedia.org/wiki/Disk_encryption_theory#Block_cipher-based_modes"&gt;special modes were developed&lt;/a&gt; just for this specific use-case.&lt;/p&gt;

&lt;p&gt;So which layer to choose? As always, it depends... Application and file system level encryption are usually the preferred choice for client systems because of the flexibility. For example, each user on a multi-user desktop may want to encrypt their home directory with a key they own and leave some shared directories unencrypted. On the contrary, on server systems, managed by SaaS/PaaS/IaaS companies (including Cloudflare) the preferred choice is configuration simplicity and security - with full disk encryption enabled any data from any application is automatically encrypted with no exceptions or overrides. We believe that all data needs to be protected without sorting it into "important" vs "not important" buckets, so the selective flexibility the upper layers provide is not needed.&lt;/p&gt;

&lt;h4&gt;
  
  
  Hardware vs software disk encryption
&lt;/h4&gt;

&lt;p&gt;When encrypting data at the block layer it is possible to do it directly in the storage hardware, if the hardware &lt;a href="https://en.wikipedia.org/wiki/Hardware-based_full_disk_encryption"&gt;supports it&lt;/a&gt;. Doing so usually gives better read/write performance and consumes fewer resources from the host. However, since most hardware firmware is proprietary, it does not receive as much attention and review from the security community. In the past this led to &lt;a href="https://www.us-cert.gov/ncas/current-activity/2018/11/06/Self-Encrypting-Solid-State-Drive-Vulnerabilities"&gt;flaws in some implementations of hardware disk encryption&lt;/a&gt;, which rendered the whole security model useless. Microsoft, for example, has since &lt;a href="https://support.microsoft.com/en-us/help/4516071/windows-10-update-kb4516071"&gt;started to prefer software-based disk encryption&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;We didn't want to expose our data and our customers' data to the risk of using potentially insecure solutions and we &lt;a href="https://blog.cloudflare.com/helping-to-build-cloudflare-part-4/"&gt;strongly believe in open-source&lt;/a&gt;. That's why we rely only on software disk encryption in the Linux kernel, which is open and has been audited by many security professionals across the world.&lt;/p&gt;

&lt;h3&gt;
  
  
  Linux disk encryption performance
&lt;/h3&gt;

&lt;p&gt;We aim not only to save bandwidth costs for our customers, but to deliver content to Internet users as fast as possible.&lt;/p&gt;

&lt;p&gt;At one point we noticed that our disks were not as fast as we would like them to be. Some profiling as well as a quick A/B test pointed to Linux disk encryption. Because not encrypting the data (even if it is supposed to be a public Internet cache) is not a sustainable option, we decided to take a closer look into Linux disk encryption performance.&lt;/p&gt;

&lt;h4&gt;
  
  
  Device mapper and dm-crypt
&lt;/h4&gt;

&lt;p&gt;Linux implements transparent disk encryption via the &lt;a href="https://en.wikipedia.org/wiki/Dm-crypt"&gt;dm-crypt module&lt;/a&gt;, and &lt;code&gt;dm-crypt&lt;/code&gt; itself is part of the &lt;a href="https://en.wikipedia.org/wiki/Device_mapper"&gt;device mapper&lt;/a&gt; kernel framework. In a nutshell, the device mapper allows us to pre/post-process IO requests as they travel between the file system and the underlying block device.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;dm-crypt&lt;/code&gt; in particular encrypts "write" IO requests before sending them further down the stack to the actual block device and decrypts "read" IO requests before sending them up to the file system driver. Simple and easy! Or is it?&lt;/p&gt;

&lt;h4&gt;
  
  
  Benchmarking setup
&lt;/h4&gt;

&lt;p&gt;For the record, the numbers in this post were obtained by running specified commands on an idle &lt;a href="https://blog.cloudflare.com/a-tour-inside-cloudflares-g9-servers/"&gt;Cloudflare G9 server&lt;/a&gt; out of production. However, the setup should be easily reproducible on any modern x86 laptop.&lt;/p&gt;

&lt;p&gt;Generally, benchmarking anything around a storage stack is hard because of the noise introduced by the storage hardware itself. Not all disks are created equal, so for the purpose of this post we will use the fastest disks available out there - that is no disks.&lt;/p&gt;

&lt;p&gt;Instead, Linux has an option to emulate a disk directly in &lt;a href="https://en.wikipedia.org/wiki/Random-access_memory"&gt;RAM&lt;/a&gt;. Since RAM is much faster than any persistent storage, it should introduce little bias in our results.&lt;/p&gt;

&lt;p&gt;The following command creates a 4GB ramdisk:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;modprobe brd &lt;span class="nv"&gt;rd_nr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 &lt;span class="nv"&gt;rd_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;4194304
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;ls&lt;/span&gt; /dev/ram0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
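
&lt;p&gt;The &lt;code&gt;rd_size&lt;/code&gt; value above may look arbitrary: the &lt;code&gt;brd&lt;/code&gt; module takes the ramdisk size in KiB. A small sketch of the arithmetic follows; the variable names are only illustrative. The same size expressed in 512-byte sectors is the number that appears in the &lt;code&gt;dmsetup table&lt;/code&gt; output later in this post:&lt;/p&gt;

```shell
# Desired ramdisk size in GiB.
size_gib=4

# brd expects rd_size in KiB: GiB to MiB to KiB.
rd_size_kib=$((size_gib * 1024 * 1024))
echo "rd_size=$rd_size_kib"

# The same size in 512-byte sectors, the unit used in device mapper tables.
sectors=$((rd_size_kib * 1024 / 512))
echo "sectors=$sectors"
```

&lt;p&gt;This prints &lt;code&gt;rd_size=4194304&lt;/code&gt; and &lt;code&gt;sectors=8388608&lt;/code&gt;.&lt;/p&gt;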



&lt;p&gt;Now we can set up a &lt;code&gt;dm-crypt&lt;/code&gt; instance on top of it thus enabling encryption for the disk. First, we need to generate the disk encryption key, "format" the disk and specify a password to unlock the newly generated key.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;fallocate &lt;span class="nt"&gt;-l&lt;/span&gt; 2M crypthdr.img
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;cryptsetup luksFormat /dev/ram0 &lt;span class="nt"&gt;--header&lt;/span&gt; crypthdr.img

WARNING!
&lt;span class="o"&gt;========&lt;/span&gt;
This will overwrite data on crypthdr.img irrevocably.

Are you sure? &lt;span class="o"&gt;(&lt;/span&gt;Type uppercase &lt;span class="nb"&gt;yes&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;: YES
Enter passphrase:
Verify passphrase:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Those who are familiar with &lt;code&gt;LUKS/dm-crypt&lt;/code&gt; might have noticed we used a &lt;a href="http://man7.org/linux/man-pages/man8/cryptsetup.8.html"&gt;LUKS detached header&lt;/a&gt; here. Normally, LUKS stores the password-encrypted disk encryption key on the same disk as the data, but since we want to compare read/write performance between encrypted and unencrypted devices, we might accidentally overwrite the encrypted key during our benchmarking later. Keeping the encrypted key in a separate file avoids this problem for the purposes of this post.&lt;/p&gt;

&lt;p&gt;Now, we can actually "unlock" the encrypted device for our testing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;cryptsetup open &lt;span class="nt"&gt;--header&lt;/span&gt; crypthdr.img /dev/ram0 encrypted-ram0
Enter passphrase &lt;span class="k"&gt;for&lt;/span&gt; /dev/ram0:
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;ls&lt;/span&gt; /dev/mapper/encrypted-ram0
/dev/mapper/encrypted-ram0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At this point we can now compare the performance of encrypted vs unencrypted ramdisk: if we read/write data to &lt;code&gt;/dev/ram0&lt;/code&gt;, it will be stored in &lt;a href="https://en.wikipedia.org/wiki/Plaintext"&gt;plaintext&lt;/a&gt;. Likewise, if we read/write data to &lt;code&gt;/dev/mapper/encrypted-ram0&lt;/code&gt;, it will be decrypted/encrypted on the way by &lt;code&gt;dm-crypt&lt;/code&gt; and stored in &lt;a href="https://en.wikipedia.org/wiki/Ciphertext"&gt;ciphertext&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;It's worth noting that we're not creating any file system on top of our block devices to avoid biasing results with a file system overhead.&lt;/p&gt;

&lt;h4&gt;
  
  
  Measuring throughput
&lt;/h4&gt;

&lt;p&gt;When it comes to storage testing/benchmarking &lt;a href="https://fio.readthedocs.io/en/latest/fio_doc.html"&gt;Flexible I/O tester&lt;/a&gt; is the usual go-to solution. Let's simulate simple sequential read/write load with 4K block size on the ramdisk without encryption:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;fio &lt;span class="nt"&gt;--filename&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/dev/ram0 &lt;span class="nt"&gt;--readwrite&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;readwrite &lt;span class="nt"&gt;--bs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;4k &lt;span class="nt"&gt;--direct&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 &lt;span class="nt"&gt;--loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1000000 &lt;span class="nt"&gt;--name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;plain
plain: &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;g&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0&lt;span class="o"&gt;)&lt;/span&gt;: &lt;span class="nv"&gt;rw&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;rw, &lt;span class="nv"&gt;bs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;4K-4K/4K-4K/4K-4K, &lt;span class="nv"&gt;ioengine&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;psync, &lt;span class="nv"&gt;iodepth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1
fio-2.16
Starting 1 process
...
Run status group 0 &lt;span class="o"&gt;(&lt;/span&gt;all &lt;span class="nb"&gt;jobs&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;:
   READ: &lt;span class="nv"&gt;io&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;21013MB, &lt;span class="nv"&gt;aggrb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1126.5MB/s, &lt;span class="nv"&gt;minb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1126.5MB/s, &lt;span class="nv"&gt;maxb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1126.5MB/s, &lt;span class="nv"&gt;mint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;18655msec, &lt;span class="nv"&gt;maxt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;18655msec
  WRITE: &lt;span class="nv"&gt;io&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;21023MB, &lt;span class="nv"&gt;aggrb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1126.1MB/s, &lt;span class="nv"&gt;minb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1126.1MB/s, &lt;span class="nv"&gt;maxb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1126.1MB/s, &lt;span class="nv"&gt;mint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;18655msec, &lt;span class="nv"&gt;maxt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;18655msec

Disk stats &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;read&lt;/span&gt;/write&lt;span class="o"&gt;)&lt;/span&gt;:
  ram0: &lt;span class="nv"&gt;ios&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0/0, &lt;span class="nv"&gt;merge&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0/0, &lt;span class="nv"&gt;ticks&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0/0, &lt;span class="nv"&gt;in_queue&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0, &lt;span class="nv"&gt;util&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.00%
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The above command will run for a long time, so we just stop it after a while. As we can see from the stats, we're able to read and write with roughly the same throughput of around &lt;code&gt;1126 MB/s&lt;/code&gt;. Let's repeat the test with the encrypted ramdisk:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;fio &lt;span class="nt"&gt;--filename&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/dev/mapper/encrypted-ram0 &lt;span class="nt"&gt;--readwrite&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;readwrite &lt;span class="nt"&gt;--bs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;4k &lt;span class="nt"&gt;--direct&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 &lt;span class="nt"&gt;--loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1000000 &lt;span class="nt"&gt;--name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;crypt
crypt: &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;g&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0&lt;span class="o"&gt;)&lt;/span&gt;: &lt;span class="nv"&gt;rw&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;rw, &lt;span class="nv"&gt;bs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;4K-4K/4K-4K/4K-4K, &lt;span class="nv"&gt;ioengine&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;psync, &lt;span class="nv"&gt;iodepth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1
fio-2.16
Starting 1 process
...
Run status group 0 &lt;span class="o"&gt;(&lt;/span&gt;all &lt;span class="nb"&gt;jobs&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;:
   READ: &lt;span class="nv"&gt;io&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1693.7MB, &lt;span class="nv"&gt;aggrb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;150874KB/s, &lt;span class="nv"&gt;minb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;150874KB/s, &lt;span class="nv"&gt;maxb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;150874KB/s, &lt;span class="nv"&gt;mint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;11491msec, &lt;span class="nv"&gt;maxt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;11491msec
  WRITE: &lt;span class="nv"&gt;io&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1696.4MB, &lt;span class="nv"&gt;aggrb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;151170KB/s, &lt;span class="nv"&gt;minb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;151170KB/s, &lt;span class="nv"&gt;maxb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;151170KB/s, &lt;span class="nv"&gt;mint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;11491msec, &lt;span class="nv"&gt;maxt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;11491msec
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Whoa, that's a drop! We only get &lt;code&gt;~147 MB/s&lt;/code&gt; now, which is more than 7 times slower! And this is on a totally idle machine!&lt;/p&gt;
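
&lt;p&gt;The two runs report throughput in different units (MB/s vs KB/s), so here is a rough sketch of how the comparison can be derived from the READ lines. It assumes this &lt;code&gt;fio&lt;/code&gt; version counts 1 MB as 1024 KB, which is consistent with the &lt;code&gt;~147 MB/s&lt;/code&gt; figure; the variable names are only illustrative:&lt;/p&gt;

```shell
plain_mb=1126.5   # READ aggrb on /dev/ram0, in MB/s
crypt_kb=150874   # READ aggrb on /dev/mapper/encrypted-ram0, in KB/s

# Convert the encrypted throughput to MB/s (assumption: 1 MB = 1024 KB).
crypt_mb=$(awk -v kb="$crypt_kb" 'BEGIN { printf "%.0f", kb / 1024 }')

# Ratio of plain to encrypted throughput.
slowdown=$(awk -v p="$plain_mb" -v kb="$crypt_kb" 'BEGIN { printf "%.1f", p / (kb / 1024) }')

echo "encrypted: ~$crypt_mb MB/s, slowdown: ${slowdown}x"
```

&lt;p&gt;With the numbers above this gives roughly 147 MB/s and a slowdown of about 7.6x.&lt;/p&gt;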

&lt;h4&gt;
  
  
  Maybe, crypto is just slow
&lt;/h4&gt;

&lt;p&gt;The first thing we did was make sure we were using the fastest crypto. &lt;code&gt;cryptsetup&lt;/code&gt; allows us to benchmark all the available crypto implementations on the system to select the best one:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;cryptsetup benchmark
&lt;span class="c"&gt;# Tests are approximate using memory only (no storage IO).&lt;/span&gt;
PBKDF2-sha1      1340890 iterations per second &lt;span class="k"&gt;for &lt;/span&gt;256-bit key
PBKDF2-sha256    1539759 iterations per second &lt;span class="k"&gt;for &lt;/span&gt;256-bit key
PBKDF2-sha512    1205259 iterations per second &lt;span class="k"&gt;for &lt;/span&gt;256-bit key
PBKDF2-ripemd160  967321 iterations per second &lt;span class="k"&gt;for &lt;/span&gt;256-bit key
PBKDF2-whirlpool  720175 iterations per second &lt;span class="k"&gt;for &lt;/span&gt;256-bit key
&lt;span class="c"&gt;#  Algorithm | Key |  Encryption |  Decryption&lt;/span&gt;
     aes-cbc   128b   969.7 MiB/s  3110.0 MiB/s
 serpent-cbc   128b           N/A           N/A
 twofish-cbc   128b           N/A           N/A
     aes-cbc   256b   756.1 MiB/s  2474.7 MiB/s
 serpent-cbc   256b           N/A           N/A
 twofish-cbc   256b           N/A           N/A
     aes-xts   256b  1823.1 MiB/s  1900.3 MiB/s
 serpent-xts   256b           N/A           N/A
 twofish-xts   256b           N/A           N/A
     aes-xts   512b  1724.4 MiB/s  1765.8 MiB/s
 serpent-xts   512b           N/A           N/A
 twofish-xts   512b           N/A           N/A
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It seems &lt;code&gt;aes-xts&lt;/code&gt; with a 256-bit data encryption key is the fastest here. But which one are we actually using for our encrypted ramdisk?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;dmsetup table /dev/mapper/encrypted-ram0
0 8388608 crypt aes-xts-plain64 0000000000000000000000000000000000000000000000000000000000000000 0 1:0 0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We do use &lt;code&gt;aes-xts&lt;/code&gt; with a 256-bit data encryption key (count all the zeroes, each hexadecimal digit being 4 bits; the key is conveniently masked by the &lt;code&gt;dmsetup&lt;/code&gt; tool, and if you want to see the actual bytes, add the &lt;code&gt;--showkeys&lt;/code&gt; option to the above command). The numbers do not add up, however: &lt;code&gt;cryptsetup benchmark&lt;/code&gt; warns us above not to rely on its results, as "Tests are approximate using memory only (no storage IO)", but that is exactly how we've set up our experiment using the ramdisk. Even in a somewhat pessimistic case (assuming we read all the data and then encrypt/decrypt it sequentially with no parallelism), a &lt;a href="https://en.wikipedia.org/wiki/Back-of-the-envelope_calculation"&gt;back-of-the-envelope calculation&lt;/a&gt; says we should be getting around &lt;code&gt;(1126 * 1823) / (1126 + 1823) =~696 MB/s&lt;/code&gt;, which is still quite far from the actual &lt;code&gt;147 * 2 = 294 MB/s&lt;/code&gt; (total for reads and writes).&lt;/p&gt;
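
&lt;p&gt;The estimate treats the ramdisk and the cipher as two stages that run strictly one after the other, so the time spent per megabyte adds up and the combined rate is &lt;code&gt;a * b / (a + b)&lt;/code&gt;. A minimal shell sketch of that arithmetic follows. The function name &lt;code&gt;combined_throughput&lt;/code&gt; is made up for this illustration, and the two inputs are simply the rates used in the estimate.&lt;/p&gt;

```shell
# Hypothetical helper: combined rate of two strictly sequential stages (MB/s).
# Per-megabyte times add up: 1/total = 1/a + 1/b, hence total = a*b / (a+b).
combined_throughput() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.0f\n", (a * b) / (a + b) }'
}

# 1126 and 1823 are the two rates (MB/s) from the back-of-the-envelope estimate.
combined_throughput 1126 1823
```

&lt;p&gt;With these inputs the sketch prints &lt;code&gt;696&lt;/code&gt;. Any overlap between the two stages would only push the expected figure higher, which makes the measured result look even worse.&lt;/p&gt;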

&lt;h4&gt;
  
  
  dm-crypt performance flags
&lt;/h4&gt;

&lt;p&gt;While reading the &lt;a href="http://man7.org/linux/man-pages/man8/cryptsetup.8.html"&gt;cryptsetup man page&lt;/a&gt; we noticed that it has two options prefixed with &lt;code&gt;--perf-&lt;/code&gt;, which are probably related to performance tuning. The first one is &lt;code&gt;--perf-same_cpu_crypt&lt;/code&gt; with a rather cryptic description:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Perform encryption using the same cpu that IO was submitted on.  The default is to use an unbound workqueue so that encryption work is automatically balanced between available CPUs.  This option is only relevant for open action.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So we enable the option:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;cryptsetup close encrypted-ram0
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;cryptsetup open &lt;span class="nt"&gt;--header&lt;/span&gt; crypthdr.img &lt;span class="nt"&gt;--perf-same_cpu_crypt&lt;/span&gt; /dev/ram0 encrypted-ram0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note: according to the &lt;a href="http://man7.org/linux/man-pages/man8/cryptsetup.8.html"&gt;latest man page&lt;/a&gt; there is also a &lt;code&gt;cryptsetup refresh&lt;/code&gt; command, which can be used to enable these options live without having to "close" and "re-open" the encrypted device. Our &lt;code&gt;cryptsetup&lt;/code&gt; however didn't support it yet.&lt;/p&gt;

&lt;p&gt;Verifying that the option has really been enabled:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;dmsetup table encrypted-ram0
0 8388608 crypt aes-xts-plain64 0000000000000000000000000000000000000000000000000000000000000000 0 1:0 0 1 same_cpu_crypt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Yes, we can now see &lt;code&gt;same_cpu_crypt&lt;/code&gt; in the output, which is what we wanted. Let's rerun the benchmark:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;fio &lt;span class="nt"&gt;--filename&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/dev/mapper/encrypted-ram0 &lt;span class="nt"&gt;--readwrite&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;readwrite &lt;span class="nt"&gt;--bs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;4k &lt;span class="nt"&gt;--direct&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 &lt;span class="nt"&gt;--loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1000000 &lt;span class="nt"&gt;--name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;crypt
crypt: &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;g&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0&lt;span class="o"&gt;)&lt;/span&gt;: &lt;span class="nv"&gt;rw&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;rw, &lt;span class="nv"&gt;bs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;4K-4K/4K-4K/4K-4K, &lt;span class="nv"&gt;ioengine&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;psync, &lt;span class="nv"&gt;iodepth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1
fio-2.16
Starting 1 process
...
Run status group 0 &lt;span class="o"&gt;(&lt;/span&gt;all &lt;span class="nb"&gt;jobs&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;:
   READ: &lt;span class="nv"&gt;io&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1596.6MB, &lt;span class="nv"&gt;aggrb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;139811KB/s, &lt;span class="nv"&gt;minb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;139811KB/s, &lt;span class="nv"&gt;maxb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;139811KB/s, &lt;span class="nv"&gt;mint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;11693msec, &lt;span class="nv"&gt;maxt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;11693msec
  WRITE: &lt;span class="nv"&gt;io&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1600.9MB, &lt;span class="nv"&gt;aggrb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;140192KB/s, &lt;span class="nv"&gt;minb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;140192KB/s, &lt;span class="nv"&gt;maxb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;140192KB/s, &lt;span class="nv"&gt;mint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;11693msec, &lt;span class="nv"&gt;maxt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;11693msec
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Hmm, now it is &lt;code&gt;~136 MB/s&lt;/code&gt;, which is slightly worse than before, so no good. What about the second option, &lt;code&gt;--perf-submit_from_crypt_cpus&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Disable offloading writes to a separate thread after encryption.  There are some situations where offloading write bios from the encryption threads to a single thread degrades performance significantly.  The default is to offload write bios to the same thread.  This option is only relevant for open action.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Maybe we are in one of those "some situations" here, so let's try it out:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;cryptsetup close encrypted-ram0
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;cryptsetup open &lt;span class="nt"&gt;--header&lt;/span&gt; crypthdr.img &lt;span class="nt"&gt;--perf-submit_from_crypt_cpus&lt;/span&gt; /dev/ram0 encrypted-ram0
Enter passphrase &lt;span class="k"&gt;for&lt;/span&gt; /dev/ram0:
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;dmsetup table encrypted-ram0
0 8388608 crypt aes-xts-plain64 0000000000000000000000000000000000000000000000000000000000000000 0 1:0 0 1 submit_from_crypt_cpus
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
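
&lt;p&gt;The &lt;code&gt;dmsetup table&lt;/code&gt; line is purely positional, which makes it easy to misread. Below is a small sketch that decodes it, assuming the crypt target layout of start sector, length, target type, cipher, key, IV offset, backing device, device offset, and then an optional parameter count followed by the optional parameters. The function name &lt;code&gt;describe_crypt_table&lt;/code&gt; is hypothetical, and the key size is only meaningful when the key is printed as hex digits, as it is in the output above.&lt;/p&gt;

```shell
# Hypothetical helper: decode "dmsetup table" lines for crypt targets (stdin).
# Assumed field order: start length "crypt" cipher key iv_offset device offset
# [number_of_optional_params optional_params...]
describe_crypt_table() {
    while read -r start length target cipher key iv_offset device offset nflags flags; do
        [ "$target" = "crypt" ] || continue
        # The key is hex-encoded, so every character stands for 4 bits.
        echo "cipher=$cipher key_bits=$(( ${#key} * 4 )) device=$device flags=${flags:-none}"
    done
}

# Possible usage: sudo dmsetup table encrypted-ram0 | describe_crypt_table
```

&lt;p&gt;Reading the fields by name rather than by eye makes it harder to miss a flag that silently failed to apply after reopening the device.&lt;/p&gt;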



&lt;p&gt;And now the benchmark:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;fio &lt;span class="nt"&gt;--filename&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/dev/mapper/encrypted-ram0 &lt;span class="nt"&gt;--readwrite&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;readwrite &lt;span class="nt"&gt;--bs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;4k &lt;span class="nt"&gt;--direct&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 &lt;span class="nt"&gt;--loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1000000 &lt;span class="nt"&gt;--name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;crypt
crypt: &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;g&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0&lt;span class="o"&gt;)&lt;/span&gt;: &lt;span class="nv"&gt;rw&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;rw, &lt;span class="nv"&gt;bs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;4K-4K/4K-4K/4K-4K, &lt;span class="nv"&gt;ioengine&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;psync, &lt;span class="nv"&gt;iodepth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1
fio-2.16
Starting 1 process
...
Run status group 0 &lt;span class="o"&gt;(&lt;/span&gt;all &lt;span class="nb"&gt;jobs&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;:
   READ: &lt;span class="nv"&gt;io&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2066.6MB, &lt;span class="nv"&gt;aggrb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;169835KB/s, &lt;span class="nv"&gt;minb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;169835KB/s, &lt;span class="nv"&gt;maxb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;169835KB/s, &lt;span class="nv"&gt;mint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;12457msec, &lt;span class="nv"&gt;maxt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;12457msec
  WRITE: &lt;span class="nv"&gt;io&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2067.7MB, &lt;span class="nv"&gt;aggrb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;169965KB/s, &lt;span class="nv"&gt;minb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;169965KB/s, &lt;span class="nv"&gt;maxb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;169965KB/s, &lt;span class="nv"&gt;mint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;12457msec, &lt;span class="nv"&gt;maxt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;12457msec
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;~166 MB/s&lt;/code&gt;, which is a bit better, but still not good...&lt;/p&gt;
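
&lt;p&gt;The rounded figures quoted after each run come from the &lt;code&gt;aggrb&lt;/code&gt; field of the &lt;code&gt;fio&lt;/code&gt; summary, which this version reports in KB/s. A minimal sketch of the conversion follows, assuming 1 MB = 1024 KB (which matches the &lt;code&gt;io&lt;/code&gt; and &lt;code&gt;mint&lt;/code&gt; values in the output). The function name &lt;code&gt;fio_aggrb_mb&lt;/code&gt; is made up for this example.&lt;/p&gt;

```shell
# Hypothetical helper: read fio summary lines on stdin and print each
# aggregate bandwidth (the aggrb= field, in KB/s) converted to MB/s.
fio_aggrb_mb() {
    sed -n 's/.*aggrb=\([0-9][0-9]*\)KB\/s.*/\1/p' | awk '{ printf "%.1f\n", $1 / 1024 }'
}

# READ line from the last run above, trimmed to the relevant fields.
printf '   READ: io=2066.6MB, aggrb=169835KB/s, mint=12457msec\n' | fio_aggrb_mb
```

&lt;p&gt;For the READ line of the last run this gives &lt;code&gt;165.9&lt;/code&gt;, in line with the &lt;code&gt;~166 MB/s&lt;/code&gt; quoted above.&lt;/p&gt;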

&lt;h4&gt;
  
  
  Asking the community
&lt;/h4&gt;

&lt;p&gt;Being desperate, we decided to seek support from the Internet and &lt;a href="https://www.spinics.net/lists/dm-crypt/msg07516.html"&gt;posted our findings to the &lt;code&gt;dm-crypt&lt;/code&gt; mailing list&lt;/a&gt;, but the response we got was not very encouraging:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If the numbers disturb you, then this is from lack of understanding on your side. You are probably unaware that encryption is a heavy-weight operation...&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;We decided to do some scientific research on this topic by typing "is encryption expensive" into Google Search, and one of the top results, which actually contains meaningful measurements, is... &lt;a href="https://blog.cloudflare.com/how-expensive-is-crypto-anyway/"&gt;our own post about the cost of encryption&lt;/a&gt;, but in the context of &lt;a href="https://www.cloudflare.com/learning/ssl/transport-layer-security-tls/"&gt;TLS&lt;/a&gt;! This is a fascinating read on its own, but the gist is: modern crypto on modern hardware is very cheap even at Cloudflare scale (doing millions of encrypted HTTP requests per second). In fact, it is so cheap that Cloudflare was the first provider to offer &lt;a href="https://blog.cloudflare.com/introducing-universal-ssl/"&gt;free SSL/TLS for everyone&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  Digging into the source code
&lt;/h4&gt;

&lt;p&gt;When trying out the custom &lt;code&gt;dm-crypt&lt;/code&gt; options described above, we were curious why they exist in the first place and what that "offloading" is all about. Originally we expected &lt;code&gt;dm-crypt&lt;/code&gt; to be a simple "proxy", which just encrypts/decrypts data as it flows through the stack. It turns out &lt;code&gt;dm-crypt&lt;/code&gt; does more than just encrypt memory buffers; a (simplified) diagram of the IO traversal path is presented below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--MSCZiHWX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog-cloudflare-com-assets.storage.googleapis.com/2020/03/dm-crypt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--MSCZiHWX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog-cloudflare-com-assets.storage.googleapis.com/2020/03/dm-crypt.png" alt="dm-crypt"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When the file system issues a write request, &lt;code&gt;dm-crypt&lt;/code&gt; does not process it immediately - instead it puts it into a &lt;a href="https://www.kernel.org/doc/html/v4.19/core-api/workqueue.html"&gt;workqueue&lt;/a&gt; &lt;a href="https://github.com/torvalds/linux/blob/0d81a3f29c0afb18ba2b1275dcccf21e0dd4da38/drivers/md/dm-crypt.c#L3124"&gt;named "kcryptd"&lt;/a&gt;. In a nutshell, a kernel workqueue just schedules some work (encryption in this case) to be performed at some later time, when it is more convenient. When "the time" comes, &lt;code&gt;dm-crypt&lt;/code&gt; &lt;a href="https://github.com/torvalds/linux/blob/0d81a3f29c0afb18ba2b1275dcccf21e0dd4da38/drivers/md/dm-crypt.c#L1940"&gt;sends the request&lt;/a&gt; to the &lt;a href="https://www.kernel.org/doc/html/v4.19/crypto/index.html"&gt;Linux Crypto API&lt;/a&gt; for actual encryption. However, the modern Linux Crypto API &lt;a href="https://www.kernel.org/doc/html/v4.19/crypto/api-skcipher.html#symmetric-key-cipher-api"&gt;is asynchronous&lt;/a&gt; as well, so depending on which particular implementation your system uses, the request will most likely not be processed immediately, but queued again for a "later time". When the Linux Crypto API finally &lt;a href="https://github.com/torvalds/linux/blob/0d81a3f29c0afb18ba2b1275dcccf21e0dd4da38/drivers/md/dm-crypt.c#L1980"&gt;does the encryption&lt;/a&gt;, &lt;code&gt;dm-crypt&lt;/code&gt; may try to &lt;a href="https://github.com/torvalds/linux/blob/0d81a3f29c0afb18ba2b1275dcccf21e0dd4da38/drivers/md/dm-crypt.c#L1909-L1910"&gt;sort pending write requests by putting each request&lt;/a&gt; into a &lt;a href="https://en.wikipedia.org/wiki/Red%E2%80%93black_tree"&gt;red-black tree&lt;/a&gt;.
Then a &lt;a href="https://github.com/torvalds/linux/blob/0d81a3f29c0afb18ba2b1275dcccf21e0dd4da38/drivers/md/dm-crypt.c#L1819"&gt;separate kernel thread&lt;/a&gt;, again at "some time later", actually takes all IO requests in the tree and &lt;a href="https://github.com/torvalds/linux/blob/0d81a3f29c0afb18ba2b1275dcccf21e0dd4da38/drivers/md/dm-crypt.c#L1864"&gt;sends them down the stack&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Now for read requests: this time we need to get the encrypted data from the hardware first, but &lt;code&gt;dm-crypt&lt;/code&gt; does not just ask the driver for the data; instead it queues the request into a different &lt;a href="https://www.kernel.org/doc/html/v4.19/core-api/workqueue.html"&gt;workqueue&lt;/a&gt; &lt;a href="https://github.com/torvalds/linux/blob/0d81a3f29c0afb18ba2b1275dcccf21e0dd4da38/drivers/md/dm-crypt.c#L3122"&gt;named "kcryptd_io"&lt;/a&gt;. At some point later, when we actually have the encrypted data, we &lt;a href="https://github.com/torvalds/linux/blob/0d81a3f29c0afb18ba2b1275dcccf21e0dd4da38/drivers/md/dm-crypt.c#L1742"&gt;schedule it for decryption&lt;/a&gt; using the now familiar "kcryptd" workqueue. "kcryptd" &lt;a href="https://github.com/torvalds/linux/blob/0d81a3f29c0afb18ba2b1275dcccf21e0dd4da38/drivers/md/dm-crypt.c#L1970"&gt;will send the request&lt;/a&gt; to the Linux Crypto API, which may decrypt the data asynchronously as well.&lt;/p&gt;

&lt;p&gt;To be fair the request does not always traverse all these queues, but the important part here is that write requests may be queued up to 4 times in &lt;code&gt;dm-crypt&lt;/code&gt; and read requests up to 3 times. At this point we were wondering if all this extra queueing can cause any performance issues. For example, there is a &lt;a href="https://www.usenix.org/conference/srecon19asia/presentation/plenz"&gt;nice presentation from Google&lt;/a&gt; about the relationship between queueing and tail latency. One key takeaway from the presentation is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A significant amount of tail latency is due to queueing effects&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So, why are all these queues there and can we remove them?&lt;/p&gt;

&lt;h4&gt;
  
  
  Git archeology
&lt;/h4&gt;

&lt;p&gt;No-one writes more complex code just for fun, especially for the OS kernel. So all these queues must have been put there for a reason. Luckily, the Linux kernel source is managed by &lt;a href="https://en.wikipedia.org/wiki/Git"&gt;git&lt;/a&gt;, so we can try to retrace the changes and the decisions around them.&lt;/p&gt;

&lt;p&gt;The "kcryptd" workqueue was in the source &lt;a href="https://github.com/torvalds/linux/blob/1da177e4c3f41524e886b7f1b8a0c1fc7321cac2/drivers/md/dm-crypt.c"&gt;since the beginning of the available history&lt;/a&gt; with the following comment:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Needed because it would be very unwise to do decryption in an interrupt context, so bios returning from read requests get queued here.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So it was for reads only, but even then - why do we care if it is interrupt context or not, if Linux Crypto API will likely use a dedicated thread/queue for encryption anyway? Well, back in 2005 Crypto API &lt;a href="https://github.com/torvalds/linux/blob/1da177e4c3f41524e886b7f1b8a0c1fc7321cac2/Documentation/crypto/api-intro.txt"&gt;was not asynchronous&lt;/a&gt;, so this made perfect sense.&lt;/p&gt;

&lt;p&gt;In 2006 &lt;code&gt;dm-crypt&lt;/code&gt; &lt;a href="https://github.com/torvalds/linux/commit/23541d2d288cdb54f417ba1001dacc7f3ea10a97"&gt;started to use&lt;/a&gt; the "kcryptd" workqueue not only for encryption, but also for submitting IO requests:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;This patch is designed to help dm-crypt comply with the new constraints imposed by the following patch in -mm: md-dm-reduce-stack-usage-with-stacked-block-devices.patch&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It seems the goal here was not to add more concurrency, but rather to reduce kernel stack usage, which makes sense again as the kernel has a common stack across all the code, so it is quite a limited resource. It is worth noting, however, that the &lt;a href="https://github.com/torvalds/linux/commit/6538b8ea886e472f4431db8ca1d60478f838d14b"&gt;Linux kernel stack was expanded&lt;/a&gt; in 2014 for x86 platforms, so this might not be a problem anymore.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://github.com/torvalds/linux/commit/cabf08e4d3d1181d7c408edae97fb4d1c31518af"&gt;first version of the "kcryptd_io" workqueue was added&lt;/a&gt; in 2007 with the intent to avoid:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;starvation caused by many requests waiting for memory allocation...&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The request processing was bottlenecking on a single workqueue here, so the solution was to add another one. Makes sense.&lt;/p&gt;

&lt;p&gt;We are definitely not the first ones experiencing performance degradation because of extensive queueing: in 2011 a change was introduced to &lt;a href="https://github.com/torvalds/linux/commit/20c82538e4f5ede51bc2b4795bc6e5cae772796d"&gt;conditionally revert some of the queueing for read requests&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If there is enough memory, code can directly submit bio instead queuing this operation in a separate thread.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Unfortunately, at that time Linux kernel commit messages were not as verbose as today, so there is no performance data available.&lt;/p&gt;

&lt;p&gt;In 2015 &lt;a href="https://github.com/torvalds/linux/commit/dc2676210c425ee8e5cb1bec5bc84d004ddf4179"&gt;dm-crypt started to sort writes&lt;/a&gt; in a separate "dmcrypt_write" thread before sending them down the stack:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;On a multiprocessor machine, encryption requests finish in a different order than they were submitted.  Consequently, write requests would be submitted in a different order and it could cause severe performance degradation.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It does make sense, as sequential disk access used to be much faster than random access and &lt;code&gt;dm-crypt&lt;/code&gt; was breaking the pattern. But this mostly applies to &lt;a href="https://en.wikipedia.org/wiki/Hard_disk_drive"&gt;spinning disks&lt;/a&gt;, which were still dominant in 2015. It may not be as important with modern fast &lt;a href="https://en.wikipedia.org/wiki/Solid-state_drive"&gt;SSDs (including NVMe SSDs)&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Another part of the commit message is worth mentioning:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;...in particular it enables IO schedulers like CFQ to sort more effectively...&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It mentions the performance benefits for the &lt;a href="https://www.kernel.org/doc/Documentation/block/cfq-iosched.txt"&gt;CFQ IO scheduler&lt;/a&gt;, but Linux schedulers have improved since then to the point that the &lt;a href="https://github.com/torvalds/linux/commit/f382fb0bcef4c37dc049e9f6963e3baf204d815c"&gt;CFQ scheduler was removed&lt;/a&gt; from the kernel in 2018.&lt;/p&gt;

&lt;p&gt;The same patchset &lt;a href="https://github.com/torvalds/linux/commit/b3c5fd3052492f1b8d060799d4f18be5a5438"&gt;replaces the sorting list with a red-black tree&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;In theory the sorting should be performed by the underlying disk scheduler, however, in practice the disk scheduler only accepts and sorts a finite number of requests.  To allow the sorting of all requests, dm-crypt needs to implement its own sorting.&lt;/p&gt;

&lt;p&gt;The overhead associated with rbtree-based sorting is considered negligible so it is not used conditionally.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;All that makes sense, but it would be nice to have some backing data.&lt;/p&gt;

&lt;p&gt;Interestingly, in the same patchset we see &lt;a href="https://github.com/torvalds/linux/commit/0f5d8e6ee758f7023e4353cca75d785b2d4f6abe"&gt;the introduction of our familiar "submit_from_crypt_cpus" option&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;There are some situations where offloading write bios from the encryption threads to a single thread degrades performance significantly&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Overall, we can see that every change was reasonable and needed. However, things have changed since then:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;hardware became faster and smarter&lt;/li&gt;
&lt;li&gt;Linux resource allocation was revisited&lt;/li&gt;
&lt;li&gt;coupled Linux subsystems were rearchitected&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And many of the design choices above may not be applicable to modern Linux.&lt;/p&gt;

&lt;h3&gt;
  
  
  The "clean-up"
&lt;/h3&gt;

&lt;p&gt;Based on the research above we decided to try to remove all the extra queueing and asynchronous behaviour and revert &lt;code&gt;dm-crypt&lt;/code&gt; to its original purpose: simply encrypt/decrypt IO requests as they pass through. But for the sake of stability and further benchmarking we ended up not removing the actual code, but rather adding yet another &lt;code&gt;dm-crypt&lt;/code&gt; option, which bypasses all the queues/threads, if enabled. The flag allows us to switch between the current and new behaviour at runtime under full production load, so we can easily revert our changes should we see any side-effects. The resulting patch can be found on the &lt;a href="https://github.com/cloudflare/linux/blob/master/patches/0023-Add-DM_CRYPT_FORCE_INLINE-flag-to-dm-crypt-target.patch"&gt;Cloudflare GitHub Linux repository&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  Synchronous Linux Crypto API
&lt;/h4&gt;

&lt;p&gt;From the diagram above we remember that not all queueing is implemented in &lt;code&gt;dm-crypt&lt;/code&gt;. The modern Linux Crypto API may also be asynchronous, and for the sake of this experiment we want to eliminate queues there as well. What does "may be" mean, though? The OS may contain different implementations of the same algorithm (for example, &lt;a href="https://en.wikipedia.org/wiki/AES_instruction_set"&gt;hardware-accelerated AES-NI on x86 platforms&lt;/a&gt; and generic C-code AES implementations). By default the system chooses the "best" one based on &lt;a href="https://www.kernel.org/doc/html/v4.19/crypto/architecture.html#crypto-api-cipher-references-and-priority"&gt;the configured algorithm priority&lt;/a&gt;. &lt;code&gt;dm-crypt&lt;/code&gt; allows overriding this behaviour and &lt;a href="https://gitlab.com/cryptsetup/cryptsetup/-/wikis/DMCrypt#mapping-table-for-crypt-target"&gt;requesting a particular cipher implementation&lt;/a&gt; using the &lt;code&gt;capi:&lt;/code&gt; prefix. However, there is one problem. Let us actually check the available AES-XTS (this is our disk encryption cipher, remember?) implementations on our system:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-A&lt;/span&gt; 11 &lt;span class="s1"&gt;'xts(aes)'&lt;/span&gt; /proc/crypto
name         : xts&lt;span class="o"&gt;(&lt;/span&gt;aes&lt;span class="o"&gt;)&lt;/span&gt;
driver       : xts&lt;span class="o"&gt;(&lt;/span&gt;ecb&lt;span class="o"&gt;(&lt;/span&gt;aes-generic&lt;span class="o"&gt;))&lt;/span&gt;
module       : kernel
priority     : 100
refcnt       : 7
selftest     : passed
internal     : no
&lt;span class="nb"&gt;type&lt;/span&gt;         : skcipher
async        : no
blocksize    : 16
min keysize  : 32
max keysize  : 64
&lt;span class="nt"&gt;--&lt;/span&gt;
name         : __xts&lt;span class="o"&gt;(&lt;/span&gt;aes&lt;span class="o"&gt;)&lt;/span&gt;
driver       : cryptd&lt;span class="o"&gt;(&lt;/span&gt;__xts-aes-aesni&lt;span class="o"&gt;)&lt;/span&gt;
module       : cryptd
priority     : 451
refcnt       : 1
selftest     : passed
internal     : &lt;span class="nb"&gt;yes
type&lt;/span&gt;         : skcipher
async        : &lt;span class="nb"&gt;yes
&lt;/span&gt;blocksize    : 16
min keysize  : 32
max keysize  : 64
&lt;span class="nt"&gt;--&lt;/span&gt;
name         : xts&lt;span class="o"&gt;(&lt;/span&gt;aes&lt;span class="o"&gt;)&lt;/span&gt;
driver       : xts-aes-aesni
module       : aesni_intel
priority     : 401
refcnt       : 1
selftest     : passed
internal     : no
&lt;span class="nb"&gt;type&lt;/span&gt;         : skcipher
async        : &lt;span class="nb"&gt;yes
&lt;/span&gt;blocksize    : 16
min keysize  : 32
max keysize  : 64
&lt;span class="nt"&gt;--&lt;/span&gt;
name         : __xts&lt;span class="o"&gt;(&lt;/span&gt;aes&lt;span class="o"&gt;)&lt;/span&gt;
driver       : __xts-aes-aesni
module       : aesni_intel
priority     : 401
refcnt       : 7
selftest     : passed
internal     : &lt;span class="nb"&gt;yes
type&lt;/span&gt;         : skcipher
async        : no
blocksize    : 16
min keysize  : 32
max keysize  : 64
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
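
&lt;p&gt;A long listing like this is easier to compare once it is reduced to the fields that matter when choosing an implementation: the driver name, its priority, and the &lt;code&gt;internal&lt;/code&gt; and &lt;code&gt;async&lt;/code&gt; flags. The following is only a sketch. The function name &lt;code&gt;xts_summary&lt;/code&gt; is hypothetical, and it assumes that each record lists &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;driver&lt;/code&gt;, &lt;code&gt;priority&lt;/code&gt; and &lt;code&gt;internal&lt;/code&gt; before &lt;code&gt;async&lt;/code&gt;, as in the output above.&lt;/p&gt;

```shell
# Hypothetical helper: summarise /proc/crypto-style records read from stdin,
# one output line per implementation whose name contains "xts(aes)".
xts_summary() {
    awk -F ' *: *' '
        $1 == "name"     { name = $2 }
        $1 == "driver"   { driver = $2 }
        $1 == "priority" { prio = $2 }
        $1 == "internal" { internal = $2 }
        $1 == "async" {
            # "async" comes last among the fields we need, so print here.
            if (index(name, "xts(aes)")) {
                printf "%s priority=%s internal=%s async=%s\n", driver, prio, internal, $2
            }
        }
    '
}

# Possible usage: cat /proc/crypto | xts_summary
```

&lt;p&gt;Each implementation then fits on a single line, so the synchronous candidates (&lt;code&gt;async=no&lt;/code&gt;) and the internal-only ones are easy to spot.&lt;/p&gt;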



&lt;p&gt;We want to explicitly select a synchronous cipher from the above list to avoid queueing effects in threads, but the only two supported are &lt;code&gt;xts(ecb(aes-generic))&lt;/code&gt; (the generic C implementation) and &lt;code&gt;__xts-aes-aesni&lt;/code&gt; (the &lt;a href="https://en.wikipedia.org/wiki/AES_instruction_set"&gt;x86 hardware-accelerated implementation&lt;/a&gt;). We definitely want the latter as it is much faster (we're aiming for performance here), but it is suspiciously marked as internal (see &lt;code&gt;internal: yes&lt;/code&gt;). If we &lt;a href="https://github.com/torvalds/linux/blob/fb33c6510d5595144d585aa194d377cf74d31911/include/linux/crypto.h#L91"&gt;check the source code&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Mark a cipher as a service implementation only usable by another cipher and never by a normal user of the kernel crypto API&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So this cipher is meant to be used only by other wrapper code in the Crypto API and not outside it. In practice this means that the caller of the Crypto API needs to explicitly specify this flag when requesting a particular cipher implementation, but &lt;code&gt;dm-crypt&lt;/code&gt; does not do it, because by design it is not part of the Linux Crypto API, but rather an "external" user. We already patch the &lt;code&gt;dm-crypt&lt;/code&gt; module, so we might as well just add the relevant flag. However, there is another problem with &lt;a href="https://en.wikipedia.org/wiki/AES_instruction_set"&gt;AES-NI&lt;/a&gt; in particular: &lt;a href="https://en.wikipedia.org/wiki/X87"&gt;x86 FPU&lt;/a&gt;. "Floating point" you say? Why do we need floating point math to do symmetric encryption, which should only be about bit shifts and XOR operations? We don't need the math, but AES-NI instructions use some of the CPU registers, which are dedicated to the FPU. Unfortunately the Linux kernel &lt;a href="https://github.com/torvalds/linux/blob/fb33c6510d5595144d585aa194d377cf74d31911/arch/x86/kernel/fpu/core.c#L77"&gt;does not always preserve these registers in interrupt context&lt;/a&gt; for performance reasons (saving/restoring FPU is expensive). But &lt;code&gt;dm-crypt&lt;/code&gt; may execute code in interrupt context, so we risk corrupting some other process's data, and we are back to the "it would be very unwise to do decryption in an interrupt context" statement in the original code.&lt;/p&gt;

&lt;p&gt;Our solution to address the above was to create another somewhat &lt;a href="https://github.com/cloudflare/linux/blob/master/patches/0024-Add-xtsproxy-Crypto-API-module.patch"&gt;"smart" Crypto API module&lt;/a&gt;. This module is synchronous and does not roll its own crypto, but is just a "router" of encryption requests:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;if we can use the FPU (and thus AES-NI) in the current execution context, we just forward the encryption request to the faster, "internal" &lt;code&gt;__xts-aes-aesni&lt;/code&gt; implementation (and we can use it here, because now we are part of the Crypto API)&lt;/li&gt;
&lt;li&gt;otherwise, we just forward the encryption request to the slower, generic C-based &lt;code&gt;xts(ecb(aes-generic))&lt;/code&gt; implementation&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Using the whole lot
&lt;/h4&gt;

&lt;p&gt;Let's walk through the process of using it all together. The first step is to &lt;a href="https://github.com/cloudflare/linux/blob/master/patches/"&gt;grab the patches&lt;/a&gt; and recompile the kernel (or just compile &lt;code&gt;dm-crypt&lt;/code&gt; and our &lt;code&gt;xtsproxy&lt;/code&gt; modules).&lt;/p&gt;

&lt;p&gt;Next, let's restart our IO workload in a separate terminal, so we can make sure we can reconfigure the kernel at runtime under load:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;fio &lt;span class="nt"&gt;--filename&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/dev/mapper/encrypted-ram0 &lt;span class="nt"&gt;--readwrite&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;readwrite &lt;span class="nt"&gt;--bs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;4k &lt;span class="nt"&gt;--direct&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 &lt;span class="nt"&gt;--loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1000000 &lt;span class="nt"&gt;--name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;crypt
crypt: &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;g&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0&lt;span class="o"&gt;)&lt;/span&gt;: &lt;span class="nv"&gt;rw&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;rw, &lt;span class="nv"&gt;bs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;4K-4K/4K-4K/4K-4K, &lt;span class="nv"&gt;ioengine&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;psync, &lt;span class="nv"&gt;iodepth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1
fio-2.16
Starting 1 process
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the main terminal, make sure our new Crypto API module is loaded and available:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;modprobe xtsproxy
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-A&lt;/span&gt; 11 &lt;span class="s1"&gt;'xtsproxy'&lt;/span&gt; /proc/crypto
driver       : xts-aes-xtsproxy
module       : xtsproxy
priority     : 0
refcnt       : 0
selftest     : passed
internal     : no
&lt;span class="nb"&gt;type&lt;/span&gt;         : skcipher
async        : no
blocksize    : 16
min keysize  : 32
max keysize  : 64
ivsize       : 16
chunksize    : 16
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
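
&lt;p&gt;If you would rather script this check than read the &lt;code&gt;grep&lt;/code&gt; output by eye, something along these lines may help. It is only a sketch: &lt;code&gt;selftest_status&lt;/code&gt; is a hypothetical helper name (not part of any tool), and it assumes the &lt;code&gt;/proc/crypto&lt;/code&gt; layout shown above, where the &lt;code&gt;driver&lt;/code&gt; line precedes the &lt;code&gt;selftest&lt;/code&gt; line within each entry:&lt;/p&gt;

```shell
# Hypothetical helper: reads /proc/crypto-style text on stdin and prints the
# selftest status of the driver named in the first argument.
selftest_status() {
  awk -v drv="$1" '
    $1 == "driver"   { current = $3 }
    $1 == "selftest" { if (current == drv) print $3 }
  '
}

# Made-up two-line excerpt in the same layout, for illustration only:
printf 'driver       : xts-aes-xtsproxy\nselftest     : passed\n' | selftest_status xts-aes-xtsproxy

# On a real system:
#   cat /proc/crypto | selftest_status xts-aes-xtsproxy
```

&lt;p&gt;An empty result would suggest the module is not loaded or registered under a different driver name.&lt;/p&gt;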



&lt;p&gt;Reconfigure the encrypted disk to use our newly loaded module and enable our patched &lt;code&gt;dm-crypt&lt;/code&gt; flag (we have to use the low-level &lt;code&gt;dmsetup&lt;/code&gt; tool, as &lt;code&gt;cryptsetup&lt;/code&gt; is not aware of our modifications):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;dmsetup table encrypted-ram0 &lt;span class="nt"&gt;--showkeys&lt;/span&gt; | &lt;span class="nb"&gt;sed&lt;/span&gt; &lt;span class="s1"&gt;'s/aes-xts-plain64/capi:xts-aes-xtsproxy-plain64/'&lt;/span&gt; | &lt;span class="nb"&gt;sed&lt;/span&gt; &lt;span class="s1"&gt;'s/$/ 1 force_inline/'&lt;/span&gt; | &lt;span class="nb"&gt;sudo &lt;/span&gt;dmsetup reload encrypted-ram0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
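
&lt;p&gt;To see what the two &lt;code&gt;sed&lt;/code&gt; expressions do before touching a live device, they can be tried on a sample table line. This is a sketch under assumptions: &lt;code&gt;rewrite_table&lt;/code&gt; is a hypothetical helper name, and the sample line (sizes, key, device numbers) is made up and much shorter than a real XTS key. It also assumes the existing table ends without optional parameters; if yours already has some, the appended count would need adjusting rather than blindly adding &lt;code&gt;1 force_inline&lt;/code&gt;:&lt;/p&gt;

```shell
# Hypothetical helper: applies the same two sed edits as the pipeline above.
rewrite_table() {
  sed 's/aes-xts-plain64/capi:xts-aes-xtsproxy-plain64/' | sed 's/$/ 1 force_inline/'
}

# Made-up table line (fake key and device numbers), for illustration only.
sample='0 8388608 crypt aes-xts-plain64 00112233445566778899aabbccddeeff 0 1:0 0'
printf '%s\n' "$sample" | rewrite_table
```

&lt;p&gt;The first expression swaps the cipher specification for the Crypto API driver name, and the second appends the optional-parameter count followed by the flag itself.&lt;/p&gt;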



&lt;p&gt;We just "loaded" the new configuration, but for it to take effect, we need to suspend/resume the encrypted device:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;dmsetup &lt;span class="nb"&gt;suspend &lt;/span&gt;encrypted-ram0 &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;sudo &lt;/span&gt;dmsetup resume encrypted-ram0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And now observe the result. We may go back to the other terminal running the &lt;code&gt;fio&lt;/code&gt; job and look at the output, but to make things nicer, here's a snapshot of the observed read/write throughput in &lt;a href="https://grafana.com/"&gt;Grafana&lt;/a&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--GVYoGeWZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog-cloudflare-com-assets.storage.googleapis.com/2020/03/read-throughput-annotated.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--GVYoGeWZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog-cloudflare-com-assets.storage.googleapis.com/2020/03/read-throughput-annotated.png" alt="read-throughput-annotated"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--eTGvtvs5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog-cloudflare-com-assets.storage.googleapis.com/2020/03/write-throughput-annotated.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--eTGvtvs5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog-cloudflare-com-assets.storage.googleapis.com/2020/03/write-throughput-annotated.png" alt="write-throughput-annotated"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Wow, we have more than doubled the throughput! With a total throughput of &lt;code&gt;~640 MB/s&lt;/code&gt;, we're now much closer to the expected &lt;code&gt;~696 MB/s&lt;/code&gt; from above. What about the IO latency? (This is the &lt;code&gt;await&lt;/code&gt; statistic from the &lt;a href="http://man7.org/linux/man-pages/man1/iostat.1.html"&gt;iostat reporting tool&lt;/a&gt;.)&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--vFhhavJ_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog-cloudflare-com-assets.storage.googleapis.com/2020/03/await-annotated.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--vFhhavJ_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog-cloudflare-com-assets.storage.googleapis.com/2020/03/await-annotated.png" alt="await-annotated"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The latency has been cut in half as well!&lt;/p&gt;

&lt;h4&gt;
  
  
  To production
&lt;/h4&gt;

&lt;p&gt;So far we have been using a synthetic setup with some parts of the full production stack missing, such as file systems, real hardware and, most importantly, a production workload. To ensure we’re not optimising imaginary things, here is a snapshot of the production impact these changes bring to the caching part of our stack:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--hgfSCYGJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog-cloudflare-com-assets.storage.googleapis.com/2020/03/prod.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--hgfSCYGJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog-cloudflare-com-assets.storage.googleapis.com/2020/03/prod.png" alt="prod"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This graph represents a three-way comparison of the worst-case response times (99th percentile) for a &lt;a href="https://blog.cloudflare.com/how-we-scaled-nginx-and-saved-the-world-54-years-every-day/"&gt;cache hit in one of our servers&lt;/a&gt;. The green line is from a server with unencrypted disks, which we will use as the baseline. The red line is from a server with encrypted disks using the default Linux disk encryption implementation, and the blue line is from a server with encrypted disks and our optimisations enabled. As we can see, the default Linux disk encryption implementation has a significant impact on our cache latency in worst-case scenarios, whereas the patched implementation is indistinguishable from not using encryption at all. In other words, the improved encryption implementation does not have any visible impact on our cache response speed, so we basically get it for free! That’s a win!&lt;/p&gt;

&lt;h3&gt;
  
  
  We're just getting started
&lt;/h3&gt;

&lt;p&gt;This post shows how an architecture review can double the performance of a system. We also &lt;a href="https://blog.cloudflare.com/how-expensive-is-crypto-anyway/"&gt;reconfirmed that modern cryptography is not expensive&lt;/a&gt; and there is usually no excuse not to protect your data.&lt;/p&gt;

&lt;p&gt;We are going to submit this work for inclusion in the main kernel source tree, but most likely not in its current form. Although the results look encouraging, we have to remember that Linux is a highly portable operating system: it runs on powerful servers as well as small resource-constrained IoT devices, and on &lt;a href="https://blog.cloudflare.com/arm-takes-wing/"&gt;many other CPU architectures&lt;/a&gt;. The current version of the patches just optimises disk encryption for a particular workload on a particular architecture, but Linux needs a solution which runs smoothly everywhere.&lt;/p&gt;

&lt;p&gt;That said, if you think your case is similar and you want to take advantage of the performance improvements now, you may &lt;a href="https://github.com/cloudflare/linux/blob/master/patches/"&gt;grab the patches&lt;/a&gt; and hopefully provide feedback. The runtime flag makes it easy to toggle the functionality on the fly, so a simple A/B test can show whether it benefits a particular case or setup. These patches have been running across our &lt;a href="https://www.cloudflare.com/network/"&gt;wide network of more than 200 data centres&lt;/a&gt; on five generations of hardware, so they can reasonably be considered stable. Enjoy both performance and security from Cloudflare for all!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--9VZr49T---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog-cloudflare-com-assets.storage.googleapis.com/2020/03/perf-sec.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--9VZr49T---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://blog-cloudflare-com-assets.storage.googleapis.com/2020/03/perf-sec.png" alt="perf-sec"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Update (October 11, 2020)
&lt;/h3&gt;

&lt;p&gt;The main patch from this blog (in a slightly updated form) has been &lt;a href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/md/dm-crypt.c?id=39d42fa96ba1b7d2544db3f8ed5da8fb0d5cb877"&gt;merged&lt;/a&gt; into the mainline Linux kernel and is available from version 5.9 onwards. The main difference is that the mainline version exposes two flags instead of one, which provide the ability to bypass dm-crypt workqueues for reads and writes independently. For details, see &lt;a href="https://www.kernel.org/doc/html/latest/admin-guide/device-mapper/dm-crypt.html"&gt;the official dm-crypt documentation&lt;/a&gt;.&lt;/p&gt;
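
&lt;p&gt;On a 5.9 or newer kernel, enabling both mainline flags might look like the sketch below. &lt;code&gt;no_read_workqueue&lt;/code&gt; and &lt;code&gt;no_write_workqueue&lt;/code&gt; are the flag names given in the dm-crypt documentation linked above; &lt;code&gt;mainline_flags&lt;/code&gt; is a hypothetical helper name, the sample table line is made up, and the sketch assumes the existing table has no optional parameters yet:&lt;/p&gt;

```shell
# Hypothetical helper: appends the two mainline dm-crypt flags (with their
# optional-parameter count) to a table line read from stdin.
mainline_flags() {
  sed 's/$/ 2 no_read_workqueue no_write_workqueue/'
}

# Made-up table line (fake key and device numbers), for illustration only.
sample='0 8388608 crypt aes-xts-plain64 00112233445566778899aabbccddeeff 0 1:0 0'
printf '%s\n' "$sample" | mainline_flags

# On a real device (needs root), the pipeline would be along the lines of:
#   sudo dmsetup table encrypted-ram0 --showkeys | mainline_flags | sudo dmsetup reload encrypted-ram0
# followed by a suspend and resume of the device to activate the new table.
```

&lt;p&gt;Because the two flags are independent, either one can be appended on its own (with a count of &lt;code&gt;1&lt;/code&gt;) to compare the effect on reads and writes separately.&lt;/p&gt;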

</description>
      <category>linux</category>
      <category>security</category>
      <category>performance</category>
      <category>kernel</category>
    </item>
  </channel>
</rss>
