<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: absterdabster</title>
    <description>The latest articles on DEV Community by absterdabster (@absterdabster).</description>
    <link>https://dev.to/absterdabster</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F391504%2Fc392fa30-db31-4cc6-be1f-fb859ae1d157.jpg</url>
      <title>DEV Community: absterdabster</title>
      <link>https://dev.to/absterdabster</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/absterdabster"/>
    <language>en</language>
    <item>
      <title>How to vectorize your code for faster performance 🚀</title>
      <dc:creator>absterdabster</dc:creator>
      <pubDate>Wed, 23 Jul 2025 05:31:51 +0000</pubDate>
      <link>https://dev.to/absterdabster/how-to-vectorize-your-code-for-faster-performance-1flp</link>
      <guid>https://dev.to/absterdabster/how-to-vectorize-your-code-for-faster-performance-1flp</guid>
      <description>&lt;p&gt;Hi! Let's say you have a time-sensitive application: either you have a lot of data that you need to process quickly, or you are trying to write code that runs very fast.&lt;/p&gt;

&lt;p&gt;It may be possible to make your code very performant. 👀&lt;/p&gt;

&lt;p&gt;How so? &lt;/p&gt;

&lt;p&gt;With the help of vectorization! &lt;/p&gt;

&lt;p&gt;There's a chance you are running a very big loop and running the same set of instructions on all your data.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc6go08xk6e4sbf3ltx85.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc6go08xk6e4sbf3ltx85.jpg" alt="Hamster Wheel" width="500" height="511"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;What if we can shrink this loop a lot? We can process chunks of this loop in one step. &lt;/p&gt;

&lt;p&gt;In fact, if you've ever used Python, fast processing libraries like &lt;code&gt;numpy&lt;/code&gt; tend to use vectorized instructions as well for handling large amounts of data faster.&lt;/p&gt;

&lt;p&gt;Before I show you how to steal the moon.... ahem... I mean vectorize your code, please drop any questions you have in the comments below! &lt;/p&gt;

&lt;h2&gt;
  
  
  Vectorized Instructions (SIMD)
&lt;/h2&gt;

&lt;p&gt;SIMD stands for Single Instruction Multiple Data. Okay let's explore some instructions.&lt;/p&gt;

&lt;p&gt;Before we look at instructions, I must say every computer is different. CPUs come in different architectures. &lt;/p&gt;

&lt;p&gt;So some may support vectorized instructions, but some may not. &lt;/p&gt;

&lt;p&gt;Lucky for us, most CPUs these days are x86, x86-64, or ARM, and practically every modern desktop, server, and phone chip in these families supports SIMD instructions. (Apple's M1 chips are ARM-based, so they do too.)&lt;/p&gt;

&lt;h3&gt;
  
  
  How do SIMD instructions work?
&lt;/h3&gt;

&lt;p&gt;Good question. (If you're lazy and want to use SIMD instructions without knowing much about how they work, jump to the &lt;code&gt;I'm Lazy&lt;/code&gt; section lol).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5l60ucboudcieut5af84.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5l60ucboudcieut5af84.jpg" alt="Lazy" width="225" height="225"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you ever took a computer architecture course, you may have heard of these things called registers.&lt;/p&gt;

&lt;p&gt;Registers are like memory holders for tiny pieces of data. Generally, a lot of the usual ones your compiler uses are 64 bit or 32 bit registers.&lt;/p&gt;

&lt;p&gt;This is for several reasons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;x86-64 is the 64 bit extension of x86: its general-purpose registers are 64 bits wide, while classic x86 uses 32 bit registers&lt;/li&gt;
&lt;li&gt;Memory addresses on modern computers are 64 bit values, so a pointer fits in one register&lt;/li&gt;
&lt;li&gt;The largest built-in scalar types most code uses are 64 bits (&lt;code&gt;uint64_t&lt;/code&gt;, &lt;code&gt;double&lt;/code&gt;, and &lt;code&gt;long&lt;/code&gt; on most 64 bit Unix systems).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;However, computer architectures have been supporting larger and larger registers for things like vectorization, generally 128 bit, 256 bit, and even 512 bit registers.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;x86/x86-64&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;These are the registers generally used for x86 architectures:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;mm0-mm7&lt;/code&gt;: 64 bit registers for SIMD&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;xmm0-xmm15&lt;/code&gt;: 128 bit registers for SIMD&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ymm0-ymm15&lt;/code&gt;: 256 bit registers for SIMD&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;zmm0-zmm31&lt;/code&gt;: 512 bit registers for SIMD (AVX-512 provides 32 of them in 64 bit mode)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The registers and operations available to you on x86 largely depend on what your CPU supports. Here are the SIMD extensions introduced over the years:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;MMX&lt;/code&gt;: 64 bit registers for SIMD and instructions, oldest (1997)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;SSE&lt;/code&gt;: 128 bit registers and introduced 70 new instructions (1999)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;SSE2&lt;/code&gt;: introduced 144 new instructions to 128 bit registers (2000)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;SSE3&lt;/code&gt;: introduced 13 new instructions (horizontal add/subtract) (2004)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;SSSE3&lt;/code&gt;: introduced 16 new instructions, each with an MMX and an SSE form, which Intel counts as 32 (2006)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;AVX&lt;/code&gt;: introduced 256 bit register vectors (2011)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;AVX-512&lt;/code&gt;: introduced 512 bit register vectors (2016)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;AMX&lt;/code&gt;: introduced eight 8192 bit (1 KiB) tile registers (&lt;code&gt;tmm0...tmm7&lt;/code&gt;) for matrix operations (2023)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;How do you use them??? No need to fear, superman is here. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi37v8gwwd0923z71w4sj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi37v8gwwd0923z71w4sj.png" alt="Superman" width="401" height="498"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;There are special functions for the architecture you can use in C/C++ called intrinsics. There are intrinsics that you can use to vectorize add, vectorize multiply, etc.&lt;/p&gt;

&lt;p&gt;To use the intrinsics on x86 CPUs, all you have to do to vectorize your code is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Include one of the following header files based on the intrinsics you want to use, for example: 

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;&amp;lt;mmintrin.h&amp;gt;&lt;/code&gt;: (MMX)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;&amp;lt;xmmintrin.h&amp;gt;&lt;/code&gt;: (SSE)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;&amp;lt;emmintrin.h&amp;gt;&lt;/code&gt;: (SSE2)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;&amp;lt;pmmintrin.h&amp;gt;&lt;/code&gt;: (SSE3)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;&amp;lt;immintrin.h&amp;gt;&lt;/code&gt;: (AVX/AVX-512)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Call the intrinsics you need. Intel's Intrinsics Guide lists them &lt;a href="https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html" rel="noopener noreferrer"&gt;here&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Finally compile with one of the flags on gcc, for example:

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;-mmmx&lt;/code&gt;: (MMX)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-msse&lt;/code&gt;: (SSE)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-msse3&lt;/code&gt;: (SSE3)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-mavx&lt;/code&gt;: (AVX)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-mavx512f&lt;/code&gt;: (AVX-512)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
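
&lt;p&gt;As a rough sketch of those three steps put together, here is how adding two int arrays with SSE2 intrinsics might look. The helper name &lt;code&gt;add_i32&lt;/code&gt; is made up for this post, and the &lt;code&gt;#if&lt;/code&gt; guard with a scalar fallback is only there so the file still builds on machines without SSE2.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#include &amp;lt;cstddef&amp;gt;
#include &amp;lt;cstdint&amp;gt;
#if defined(__SSE2__)
#include &amp;lt;emmintrin.h&amp;gt;  // SSE2 integer intrinsics
#endif

// Hypothetical helper: out[i] = a[i] + b[i] for n 32 bit ints.
void add_i32(const std::int32_t* a, const std::int32_t* b,
             std::int32_t* out, std::size_t n) {
    std::size_t i = 0;
#if defined(__SSE2__)
    // Four 32 bit ints fit in one 128 bit xmm register.
    for (; i + 4 &amp;lt;= n; i += 4) {
        __m128i va = _mm_loadu_si128(reinterpret_cast&amp;lt;const __m128i*&amp;gt;(a + i));
        __m128i vb = _mm_loadu_si128(reinterpret_cast&amp;lt;const __m128i*&amp;gt;(b + i));
        _mm_storeu_si128(reinterpret_cast&amp;lt;__m128i*&amp;gt;(out + i),
                         _mm_add_epi32(va, vb));
    }
#endif
    // Scalar tail (or the whole loop when SSE2 is not available).
    for (; i &amp;lt; n; ++i) {
        out[i] = a[i] + b[i];
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;SSE2 is part of the x86-64 baseline, so this particular sketch should build with plain &lt;code&gt;g++&lt;/code&gt; on a 64 bit x86 machine. Wider variants (AVX and friends) would need the matching flag from the list above.&lt;/p&gt;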

&lt;h3&gt;
  
  
  ARM
&lt;/h3&gt;

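&lt;p&gt;ARM is the other popular CPU architecture. It powers most mobile devices as well as Apple's M-series Macs, and it also supports SIMD. &lt;/p&gt;

&lt;p&gt;Its SIMD extension is called NEON. On 32 bit ARM it uses the registers below (64 bit ARM instead exposes 32 registers of 128 bits each, &lt;code&gt;V0-V31&lt;/code&gt;), but the idea is similar:&lt;/p&gt;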

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;D0-D31&lt;/code&gt;: 64 bit registers for SIMD&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Q0-Q15&lt;/code&gt;: 128 bit registers for SIMD&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To use these ARM vectorized instructions, you would have to do the following in C/C++:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;#include &amp;lt;arm_neon.h&amp;gt;&lt;/code&gt; at the top of your file&lt;/li&gt;
&lt;li&gt;Use the intrinsic functions (like &lt;code&gt;vaddq_u32&lt;/code&gt;) from &lt;a href="https://developer.arm.com/architectures/instruction-sets/intrinsics/" rel="noopener noreferrer"&gt;here&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Compile your program with the gcc flag &lt;code&gt;-mfpu=neon&lt;/code&gt; on 32 bit ARM (on 64 bit ARM, NEON is on by default and the flag is not needed)
&lt;/li&gt;
&lt;/ol&gt;
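
&lt;p&gt;Here is a small sketch of the same idea with NEON, using &lt;code&gt;vaddq_u32&lt;/code&gt; from the steps above. The function name &lt;code&gt;add_u32&lt;/code&gt; is hypothetical, and the guard plus scalar fallback are my own addition so the file also compiles on non-ARM machines.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#include &amp;lt;cstddef&amp;gt;
#include &amp;lt;cstdint&amp;gt;
#if defined(__ARM_NEON)
#include &amp;lt;arm_neon.h&amp;gt;
#endif

// Hypothetical helper: out[i] = a[i] + b[i] for n unsigned 32 bit ints.
void add_u32(const std::uint32_t* a, const std::uint32_t* b,
             std::uint32_t* out, std::size_t n) {
    std::size_t i = 0;
#if defined(__ARM_NEON)
    // Each 128 bit NEON register holds four 32 bit lanes.
    for (; i + 4 &amp;lt;= n; i += 4) {
        uint32x4_t va = vld1q_u32(a + i);       // load 4 values
        uint32x4_t vb = vld1q_u32(b + i);
        vst1q_u32(out + i, vaddq_u32(va, vb));  // add and store 4 values
    }
#endif
    // Scalar tail (or the whole loop when NEON is not available).
    for (; i &amp;lt; n; ++i) {
        out[i] = a[i] + b[i];
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;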

&lt;h3&gt;
  
  
  I'm Lazy
&lt;/h3&gt;

&lt;p&gt;Okay lazy boy. Or girl lol. &lt;/p&gt;

&lt;p&gt;If you don't want to think about x86 or ARM, compilers are made powerful just for you. &lt;/p&gt;

&lt;p&gt;You can let your compiler automagically figure out your architecture and compile your code with SIMD instructions.&lt;/p&gt;

&lt;p&gt;Let's keep it short, but all you have to do is compile like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;g++ -o test test.cpp -O3 -ftree-vectorize -march=native
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;-O3&lt;/code&gt;: aggressive optimization. It already turns on &lt;code&gt;-ftree-vectorize&lt;/code&gt;, so passing that flag as well is redundant. (GCC 12 and newer also enable the vectorizer at &lt;code&gt;-O2&lt;/code&gt;, with a more conservative cost model.) &lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-ftree-vectorize&lt;/code&gt;: turns the vectorizer on explicitly, which helps if you forget &lt;code&gt;-O3&lt;/code&gt; or use a lower level such as &lt;code&gt;-O1&lt;/code&gt;. It does nothing when optimization is off (&lt;code&gt;-O0&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-march=native&lt;/code&gt;: if you use &lt;code&gt;-ftree-vectorize&lt;/code&gt; or &lt;code&gt;-O3&lt;/code&gt; without this flag, the compiler will use a default set of vectorized instructions (up to SSE2 for &lt;code&gt;x86-64&lt;/code&gt;). Including this flag lets the compiler use the best SIMD instructions your own CPU supports. Keep in mind the resulting binary may not run on older CPUs that lack them.&lt;/li&gt;
&lt;/ul&gt;
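
&lt;p&gt;To make that a bit more concrete, here is the kind of loop the auto-vectorizer tends to handle well: a simple counted loop over arrays with no dependency between iterations. The function name &lt;code&gt;scale_add&lt;/code&gt; is hypothetical, and &lt;code&gt;__restrict&lt;/code&gt; is a compiler extension (GCC, Clang, MSVC) that I'm assuming here to promise the arrays don't overlap.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#include &amp;lt;cstddef&amp;gt;

// Hypothetical example: out[i] = k * x[i] + y[i].
// No intrinsics here; the compiler is free to pick SIMD instructions itself.
void scale_add(const float* __restrict x, const float* __restrict y,
               float* __restrict out, float k, std::size_t n) {
    for (std::size_t i = 0; i &amp;lt; n; ++i) {
        out[i] = k * x[i] + y[i];
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If you want to check what GCC did, adding &lt;code&gt;-fopt-info-vec&lt;/code&gt; to the compile command should print a note for each loop it managed to vectorize.&lt;/p&gt;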

&lt;h2&gt;
  
  
  Comparing speeds
&lt;/h2&gt;

&lt;p&gt;Let's look at this simple sample program lol, adding two arrays of 30 ints together element by element.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
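&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#include &amp;lt;cstddef&amp;gt;
#include &amp;lt;cstdint&amp;gt;
#include &amp;lt;iostream&amp;gt;

// Read the CPU's time stamp counter (x86/x86-64 only).
uint64_t rdtsc(){
        uint32_t lo, hi;
        // rdtsc puts the low 32 bits in eax and the high 32 bits in edx.
        // (The "=A" constraint only captures a single register on x86-64.)
        __asm__ volatile(
                "rdtsc"
                : "=a" (lo), "=d" (hi)
        );
        return (static_cast&amp;lt;uint64_t&amp;gt;(hi) &amp;lt;&amp;lt; 32) | lo;
}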

int main(int argc, char** argv){
        int a[30] = {1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5};
        int b[30] = {5,4,3,2,1,2,3,4,5,4,3,2,1,2,3,5,4,3,2,1,2,3,4,5,4,3,2,1,2,3};
        int c[30];

        volatile uint64_t start = rdtsc();
        for(size_t i = 0; i &amp;lt; 30; i++){
                c[i] = a[i] + b[i];
        }
        volatile uint64_t end = rdtsc();

        std::cout &amp;lt;&amp;lt; end-start &amp;lt;&amp;lt; " cycles" &amp;lt;&amp;lt; std::endl;
        for(auto v: c){
                std::cout &amp;lt;&amp;lt;  v &amp;lt;&amp;lt; " ";
        }
        std::cout &amp;lt;&amp;lt; std::endl;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, I use rdtsc() to time my program. For reference, I am timing and running my program from an x86-64 architecture CPU.&lt;/p&gt;

&lt;p&gt;If you are unfamiliar with RDTSC, GO CHECK OUT MY TIMING YOUR PROGRAM &lt;a href="https://dev.to/absterdabster/measuring-your-program-speed-correctly-4a52"&gt;BLOG&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If I compile and run this program with no optimization:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;g++ -o test test.cpp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I get the following results:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;463 cycles
6 6 6 6 6 3 5 7 9 9 4 4 4 6 8 6 6 6 6 6 3 5 7 9 9 4 4 4 6 8
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For reference, the loop takes 463 CPU cycles. My CPU runs at 2.5GHz. This means it took roughly &lt;code&gt;182&lt;/code&gt; nanoseconds to run!&lt;/p&gt;

&lt;p&gt;Let's take a look at the assembly, the low-level instructions, for the juicy part of the code, which we can get with the &lt;code&gt;-S&lt;/code&gt; flag (capital S) when compiling.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        call    _Z5rdtscv
        movq    %rax, -432(%rbp)
        movq    $0, -416(%rbp)
.L5:
        cmpq    $29, -416(%rbp)
        ja      .L4
        movq    -416(%rbp), %rax
        movl    -384(%rbp,%rax,4), %edx
        movq    -416(%rbp), %rax
        movl    -256(%rbp,%rax,4), %eax
        addl    %eax, %edx
        movq    -416(%rbp), %rax
        movl    %edx, -128(%rbp,%rax,4)
        addq    $1, -416(%rbp)
        jmp     .L5
.L4:
        call    _Z5rdtscv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Great, we can see that compiling without optimization uses no vector registers or SIMD instructions. We just have the classic 64 bit registers (&lt;code&gt;rax&lt;/code&gt;, &lt;code&gt;rbp&lt;/code&gt;) and 32 bit registers (&lt;code&gt;edx&lt;/code&gt;, &lt;code&gt;eax&lt;/code&gt;, etc.), and the loop adds one int per iteration.&lt;/p&gt;

&lt;p&gt;Now let's see if we can vectorize these. Okay, let's do something simple with just &lt;code&gt;O3&lt;/code&gt; optimization.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;g++ -O3 -o test test.cpp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;(Add the &lt;code&gt;-S&lt;/code&gt; flag for assembly)&lt;/p&gt;

&lt;p&gt;As a fun challenge, try to guess how long the optimized version took to run.&lt;/p&gt;

&lt;p&gt;Before I reveal the answer, let's see how this program got optimized in the assembly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#APP
# 7 "test.cpp" 1
        rdtsc
# 0 "" 2
#NO_APP
        movq    %rax, 16(%rsp)
        movq    16(%rsp), %rax
        movdqa  %xmm1, %xmm5
        movdqa  160(%rsp), %xmm4
        paddd   %xmm0, %xmm5
        paddd   192(%rsp), %xmm3
        paddd   208(%rsp), %xmm2
        movq    %rax, (%rsp)
        paddd   %xmm0, %xmm4
        movl    272(%rsp), %eax
        paddd   240(%rsp), %xmm0
        movaps  %xmm5, 304(%rsp)
        addl    144(%rsp), %eax
        movaps  %xmm4, 288(%rsp)
        paddd   256(%rsp), %xmm1
        movl    %eax, 400(%rsp)
        movl    148(%rsp), %eax
        addl    276(%rsp), %eax
        movaps  %xmm3, 320(%rsp)
        movaps  %xmm2, 336(%rsp)
        movaps  %xmm4, 352(%rsp)
        movaps  %xmm0, 368(%rsp)
        movaps  %xmm1, 384(%rsp)
        movl    %eax, 404(%rsp)
        movq    $0, 24(%rsp)
#APP
# 7 "test.cpp" 1
        rdtsc
# 0 "" 2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Ahhh, now we have &lt;code&gt;xmm&lt;/code&gt; registers! We also see SIMD x86 instructions like &lt;code&gt;movaps&lt;/code&gt; and &lt;code&gt;paddd&lt;/code&gt;. &lt;/p&gt;

&lt;p&gt;If you remember from earlier, &lt;code&gt;xmm&lt;/code&gt; registers are 128 bit registers. Integer operations on them come from &lt;code&gt;SSE2&lt;/code&gt;, which every x86-64 CPU supports, so the compiler can use them by default.&lt;/p&gt;

&lt;p&gt;Now to show you the speed difference this has:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;217 cycles
6 6 6 6 6 3 5 7 9 9 4 4 4 6 8 6 6 6 6 6 3 5 7 9 9 4 4 4 6 8
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



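&lt;p&gt;Whoa! We roughly 2xed our speed! The loop now adds 4 ints per instruction in 128 bit registers instead of one int at a time. (To be fair, &lt;code&gt;-O3&lt;/code&gt; applies other optimizations too, so not all of the gain comes from SIMD.)&lt;/p&gt;

&lt;p&gt;We were just at around 182 nanos, and now it's closer to 85 nanos!&lt;/p&gt;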

&lt;p&gt;Remember, we're only adding 30 numbers here, but if we were doing a billion numbers, these nanoseconds WILL add up.&lt;/p&gt;

&lt;p&gt;Okay, now let's see what happens when we introduce the microarchitecture flag into our compilation. Any better?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;g++ -o test test.cpp -O3 -march=native
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And here is the new assembly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#APP
# 7 "test.cpp" 1
        rdtsc
# 0 "" 2
#NO_APP
        movq    %rax, 16(%rsp)
        movq    16(%rsp), %rax
        vpaddd  %ymm3, %ymm0, %ymm0
        vpaddd  160(%rsp), %ymm2, %ymm2
        vpaddd  192(%rsp), %ymm1, %ymm1
        vmovdqa %ymm0, 352(%rsp)
        movq    %rax, (%rsp)
        movl    256(%rsp), %eax
        vmovdqa %ymm2, 288(%rsp)
        addl    128(%rsp), %eax
        movl    %eax, 384(%rsp)
        movl    260(%rsp), %eax
        vmovdqa %ymm1, 320(%rsp)
        addl    132(%rsp), %eax
        movl    %eax, 388(%rsp)
        movl    264(%rsp), %eax
        movq    $0, 24(%rsp)
        addl    136(%rsp), %eax
        movl    %eax, 392(%rsp)
        movl    268(%rsp), %eax
        addl    140(%rsp), %eax
        movl    %eax, 396(%rsp)
        movl    272(%rsp), %eax
        addl    144(%rsp), %eax
        movl    %eax, 400(%rsp)
        movl    148(%rsp), %eax
        addl    276(%rsp), %eax
        movl    %eax, 404(%rsp)
#APP
# 7 "test.cpp" 1
        rdtsc
# 0 "" 2
#NO_APP
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;WHOAAAAA! Do you see what I see?!&lt;/p&gt;

&lt;p&gt;We just got &lt;code&gt;ymm&lt;/code&gt; registers! These are the 256 bit registers that my CPU supports (integer instructions on them, like &lt;code&gt;vpaddd&lt;/code&gt;, come from AVX2).&lt;/p&gt;

&lt;p&gt;Let's see what that means about our speed. Any guesses?&lt;/p&gt;

&lt;p&gt;3....&lt;br&gt;
2....&lt;br&gt;
1....&lt;br&gt;
Boom:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;162 cycles
6 6 6 6 6 3 5 7 9 9 4 4 4 6 8 6 6 6 6 6 3 5 7 9 9 4 4 4 6 8
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This took &lt;code&gt;65&lt;/code&gt; nanoseconds to run. It seems like our benefits are starting to decay with only 30 numbers.&lt;/p&gt;

&lt;p&gt;We went from &lt;code&gt;182&lt;/code&gt; nanoseconds to &lt;code&gt;65&lt;/code&gt; nanoseconds. We basically &lt;code&gt;3x&lt;/code&gt;ed our speed!&lt;/p&gt;

&lt;p&gt;Note: I could've alternatively chosen to use intrinsics as well. But I trust you will figure it out with the resources I've given you and the internet &amp;lt;3&lt;/p&gt;

&lt;h2&gt;
  
  
  Examples in the world
&lt;/h2&gt;

&lt;p&gt;Okay, I shall attempt to find 1 example of SIMD instructions in the wild wild west.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frymj35gs6tizwejc1vz0.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frymj35gs6tizwejc1vz0.jpg" alt="Wild West" width="538" height="464"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;OpenCV is a library used for image and vision processing! &lt;/p&gt;

&lt;p&gt;Vectorized instructions can be useful for processing rows of pixels the same way.&lt;/p&gt;

&lt;p&gt;Here is an example of SSE x86 intrinsics being used in OpenCV to merge two maps/images together.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;void convertMaps_nninterpolate32f1c16s_SSE41(const float* src1f, const float* src2f, short* dst1, int width)
{
    int x = 0;
    for (; x &amp;lt;= width - 16; x += 16)
    {
        __m128i v_dst0 = _mm_packs_epi32(_mm_cvtps_epi32(_mm_loadu_ps(src1f + x)),
            _mm_cvtps_epi32(_mm_loadu_ps(src1f + x + 4)));
        __m128i v_dst1 = _mm_packs_epi32(_mm_cvtps_epi32(_mm_loadu_ps(src1f + x + 8)),
            _mm_cvtps_epi32(_mm_loadu_ps(src1f + x + 12)));

        __m128i v_dst2 = _mm_packs_epi32(_mm_cvtps_epi32(_mm_loadu_ps(src2f + x)),
            _mm_cvtps_epi32(_mm_loadu_ps(src2f + x + 4)));
        __m128i v_dst3 = _mm_packs_epi32(_mm_cvtps_epi32(_mm_loadu_ps(src2f + x + 8)),
            _mm_cvtps_epi32(_mm_loadu_ps(src2f + x + 12)));

        _mm_interleave_epi16(v_dst0, v_dst1, v_dst2, v_dst3);

        _mm_storeu_si128((__m128i *)(dst1 + x * 2), v_dst0);
        _mm_storeu_si128((__m128i *)(dst1 + x * 2 + 8), v_dst1);
        _mm_storeu_si128((__m128i *)(dst1 + x * 2 + 16), v_dst2);
        _mm_storeu_si128((__m128i *)(dst1 + x * 2 + 24), v_dst3);
    }

    for (; x &amp;lt; width; x++)
    {
        dst1[x * 2] = saturate_cast&amp;lt;short&amp;gt;(src1f[x]);
        dst1[x * 2 + 1] = saturate_cast&amp;lt;short&amp;gt;(src2f[x]);
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Okay let's keep this breakdown brief and simple.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;__m128i v_dst0 = _mm_packs_epi32(_mm_cvtps_epi32(_mm_loadu_ps(src1f + x)), _mm_cvtps_epi32(_mm_loadu_ps(src1f + x + 4)));&lt;/code&gt;&lt;br&gt;
These instructions are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;_mm_loadu_ps&lt;/code&gt;: load 4 floats into an &lt;code&gt;xmm&lt;/code&gt; 128 bit register&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;_mm_cvtps_epi32&lt;/code&gt;: convert the 4 floats in the &lt;code&gt;xmm&lt;/code&gt; into four 32 bit ints &lt;/li&gt;
&lt;li&gt;
&lt;code&gt;_mm_packs_epi32&lt;/code&gt;: convert the 2 x 4 ints into 8 shorts (2 bytes each) in another &lt;code&gt;xmm&lt;/code&gt;, saturating values that don't fit
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;_mm_interleave_epi16(v_dst0, v_dst1, v_dst2, v_dst3);&lt;/code&gt;&lt;br&gt;
This one appears to be an OpenCV helper rather than a raw Intel intrinsic. It interleaves the 1st and 3rd xmm vectors together. Then it interleaves the 2nd and 4th vectors together. This is done so that src1 and src2 points are interleaved together.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;_mm_storeu_si128((__m128i *)(dst1 + x * 2), v_dst0);&lt;/code&gt;&lt;br&gt;
Uses vectorized instructions to copy from &lt;code&gt;xmm&lt;/code&gt; registers to &lt;code&gt;dst1&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The last loop is scalar and handles the leftover elements when &lt;code&gt;width&lt;/code&gt; is not a multiple of 16.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Cool, that's enough examples in the wild.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Conclusion
&lt;/h2&gt;

&lt;p&gt;Let's keep this simple. We saw that with a small example you can speed up a program up to 3x with vectorized instructions. &lt;/p&gt;

&lt;p&gt;However, with larger programs, you could see even more performance benefits from SIMD (Single Instruction Multiple Data) instructions.&lt;/p&gt;

&lt;p&gt;So next time you consider speeding up your program, think about how you can use your CPU's features to your advantage.&lt;/p&gt;

&lt;p&gt;All in all:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;SIMD support depends on CPU architecture&lt;/li&gt;
&lt;li&gt;SIMD instructions can be used with intrinsics&lt;/li&gt;
&lt;li&gt;Alternatively, the compiler can handle SIMD optimization for you&lt;/li&gt;
&lt;li&gt;SIMD could speed up your program 2-10x, depending on CPU support/amount of data.&lt;/li&gt;
&lt;li&gt;SIMD is used in the wild, like in OpenCV and even numpy libraries.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I hope you are ready to be faster.&lt;/p&gt;

&lt;p&gt;That's all I have this time.&lt;/p&gt;

&lt;p&gt;Peace&lt;br&gt;
-absterdabster&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6aji6zg217behfkcjris.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6aji6zg217behfkcjris.jpg" alt="Peace" width="225" height="225"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>cpp</category>
      <category>systems</category>
      <category>compiling</category>
      <category>assembly</category>
    </item>
    <item>
      <title>Creating a list that contains different types in C++ 😎</title>
      <dc:creator>absterdabster</dc:creator>
      <pubDate>Fri, 30 May 2025 02:06:48 +0000</pubDate>
      <link>https://dev.to/absterdabster/creating-a-list-that-contains-different-types-in-c-17nb</link>
      <guid>https://dev.to/absterdabster/creating-a-list-that-contains-different-types-in-c-17nb</guid>
      <description>&lt;p&gt;Hi! Let's say you are trying to create a list of people and information about them (name, age, gender, address, etc.). &lt;/p&gt;

&lt;p&gt;Notice all of these are different types! In Python, lists can handle different types, so you could store all of these.&lt;/p&gt;

&lt;p&gt;In C++, you cannot store all of these in a vector or array because these containers require every element to have the same type. However, you can make a class or struct to represent a person like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#include &amp;lt;cstdint&amp;gt;
#include &amp;lt;string&amp;gt;

struct Person{
std::string name;
std::string address;
uint8_t age;
char gender;
};
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Great you could store all this info together in your program. &lt;/p&gt;

&lt;p&gt;However, what if later I told you we are adding &lt;code&gt;email&lt;/code&gt; to person?&lt;/p&gt;

&lt;p&gt;You would have to stop your program, store the program data somewhere, update the &lt;code&gt;Person&lt;/code&gt; struct to have &lt;code&gt;email&lt;/code&gt; and reload the program data into the struct properly. Ughhh, so much extra work!&lt;/p&gt;

&lt;p&gt;What if I told you, I wanted to iterate through all the fields of &lt;code&gt;Person&lt;/code&gt; and print them out. You would have to manually type something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
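&lt;pre class="highlight plaintext"&gt;&lt;code&gt;void print(const Person&amp;amp; p){
std::cout &amp;lt;&amp;lt; p.name &amp;lt;&amp;lt; std::endl;
// age is a uint8_t, which streams as a character unless we convert it
std::cout &amp;lt;&amp;lt; static_cast&amp;lt;int&amp;gt;(p.age) &amp;lt;&amp;lt; std::endl;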
...
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Ughhh, so much extra typing :(&lt;/p&gt;

&lt;p&gt;If only, there was some solution.......&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj8nnhxjd6uk9vo5dc8hm.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj8nnhxjd6uk9vo5dc8hm.jpg" alt="The solution" width="300" height="227"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;There is. Today we are going to try to make a list that can maintain different types of values.&lt;/p&gt;

&lt;p&gt;This is inspired by &lt;code&gt;std::tuple&lt;/code&gt;, which can already hold different types of data. However, you can't add or remove elements at runtime, and you can't walk through it with an ordinary loop.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;std::tuple&amp;lt;int, char, double&amp;gt; t = std::make_tuple(1, 'a', 4.2);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
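
&lt;p&gt;For reference, here is a small sketch of what &lt;code&gt;std::tuple&lt;/code&gt; does give you out of the box. &lt;code&gt;std::get&lt;/code&gt; reads one element by a compile-time index, and &lt;code&gt;std::apply&lt;/code&gt; (C++17) can hand every element to a lambda at once. The helper name &lt;code&gt;join_tuple&lt;/code&gt; is hypothetical, not something from the standard library.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#include &amp;lt;sstream&amp;gt;
#include &amp;lt;string&amp;gt;
#include &amp;lt;tuple&amp;gt;

// Hypothetical helper: writes every element of a tuple into one string,
// each followed by a space. Needs C++17 for std::apply and the fold.
template&amp;lt;typename... Ts&amp;gt;
std::string join_tuple(const std::tuple&amp;lt;Ts...&amp;gt;&amp;amp; t){
    std::ostringstream os;
    // std::apply unpacks the tuple into a pack; the fold visits each element.
    std::apply([&amp;amp;os](const auto&amp;amp;... elems){
        ((os &amp;lt;&amp;lt; elems &amp;lt;&amp;lt; ' '), ...);
    }, t);
    return os.str();
}

// Reads a single element by its compile-time index.
inline char second_element(const std::tuple&amp;lt;int, char, double&amp;gt;&amp;amp; t){
    return std::get&amp;lt;1&amp;gt;(t);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Notice that everything here is decided at compile time: the index passed to &lt;code&gt;std::get&lt;/code&gt; has to be a constant, and the set of types is fixed once the tuple is declared.&lt;/p&gt;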



&lt;p&gt;With this new container, we'll try to support all of this. We'll also try to give operations fast O(1) constant-time access where we can, with the help of a C++20 compiler!&lt;/p&gt;

&lt;p&gt;In order to build this, we'll talk a bit about C++ variadic templates.&lt;/p&gt;

&lt;h3&gt;
  
  
  Basics of &lt;code&gt;templates&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Okay cool. Let me just show you a template. If you already know how templates work, skip past this intro.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;template&amp;lt;typename T&amp;gt;
void print(T arg){
     std::cout &amp;lt;&amp;lt; arg &amp;lt;&amp;lt; std::endl;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this example, we made a template function. Here, we say that the argument has type T. T could be any type. &lt;/p&gt;

&lt;p&gt;T could be an int. T could be char. But everything inside the function has to work for the given types.&lt;/p&gt;

&lt;p&gt;The compiler looks for the usages of &lt;code&gt;print&lt;/code&gt; and creates versions of functions for types that were used.&lt;/p&gt;

&lt;p&gt;For instance, if you had in your program:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;print('a');
....
print(2.5);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The compiler will expand print by creating functions that essentially look like this behind the scenes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;void print(double arg){
    std::cout &amp;lt;&amp;lt; arg &amp;lt;&amp;lt; std::endl;
}

void print(char arg){
    std::cout &amp;lt;&amp;lt; arg &amp;lt;&amp;lt; std::endl;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In fact we can even templatize &lt;code&gt;structs&lt;/code&gt; and &lt;code&gt;class&lt;/code&gt; declarations, and those will expand out to the specialized versions of the declarations after the compiler sees all use cases.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;template&amp;lt;typename T&amp;gt;
struct test{
    T val;
};
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;could expand to something like this if you use &lt;code&gt;test&amp;lt;double&amp;gt;&lt;/code&gt; or &lt;code&gt;test&amp;lt;char&amp;gt;&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;struct test&amp;lt;double&amp;gt;{
   double val;
};

struct test&amp;lt;char&amp;gt;{
   char val;
};
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will be useful when constructing our tuple as we may want to declare different types that our tuple can store at compile time so that the compiler knows how much memory to use. (In the &lt;code&gt;test&lt;/code&gt; struct example, &lt;code&gt;test&amp;lt;double&amp;gt;&lt;/code&gt; is a different memory size than &lt;code&gt;test&amp;lt;char&amp;gt;&lt;/code&gt;, 8 bytes vs 1 byte)&lt;/p&gt;
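
&lt;p&gt;If you want to see that size difference for yourself, a quick sketch like this lets the compiler report it. The constant names are made up for illustration, and the exact sizes are an assumption about mainstream platforms rather than a guarantee from the language.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#include &amp;lt;cstddef&amp;gt;

template&amp;lt;typename T&amp;gt;
struct test{
    T val;
};

// Each instantiation is its own type with its own size.
constexpr std::size_t size_of_test_double = sizeof(test&amp;lt;double&amp;gt;);
constexpr std::size_t size_of_test_char = sizeof(test&amp;lt;char&amp;gt;);

// Checked at compile time: the wrapper adds no extra bytes here.
static_assert(size_of_test_double == sizeof(double), "typically 8 bytes");
static_assert(size_of_test_char == sizeof(char), "1 byte");
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;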

&lt;p&gt;Now let's talk about custom template specialization. We can declare a template, but create our own specialization of that template for certain types if we want it to behave differently.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;template&amp;lt;typename T&amp;gt;
void print(T arg){
     std::cout &amp;lt;&amp;lt; arg &amp;lt;&amp;lt; std::endl;
}

template&amp;lt;&amp;gt;
void print&amp;lt;int&amp;gt;(int arg){
     std::cout &amp;lt;&amp;lt; arg &amp;lt;&amp;lt; " is an int. " &amp;lt;&amp;lt; std::endl;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The compiler won't create a new &lt;code&gt;print&amp;lt;int&amp;gt;&lt;/code&gt; implementation. Instead, it'll use our custom specialized one.&lt;/p&gt;

&lt;p&gt;Okay, that's enough of basic templates. &lt;/p&gt;

&lt;p&gt;Let's talk about advanced templates with variadic templates.&lt;/p&gt;

&lt;h3&gt;
  
  
  Variadic templates
&lt;/h3&gt;

&lt;p&gt;If you already know variadic templates, jump ahead and we'll start making our list! But for you noobs like me, stick around :).&lt;/p&gt;

&lt;p&gt;We'll be very brief about it. There's a lot to explore in this world.&lt;/p&gt;

&lt;p&gt;Variadic templates/packs have been around since C++11.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;template&amp;lt;typename... Pack&amp;gt;
void print(Pack... args){
     (std::cout &amp;lt;&amp;lt; ... &amp;lt;&amp;lt; args) &amp;lt;&amp;lt; std::endl;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here we show that print is using a pack in its template for its arguments.  &lt;code&gt;Pack... args&lt;/code&gt; shows that the print function can use any number of arguments. So:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;print('a', 2, "abc");
print(2, 1, '3', 3.5, true);
print(); // empty line
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All of these are valid! The example also had this line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;     (std::cout &amp;lt;&amp;lt; ... &amp;lt;&amp;lt; args) &amp;lt;&amp;lt; std::endl;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is an example of a (binary left) fold expression, a C++17 feature. Basically, it is syntax that expands out to its full form:&lt;br&gt;
&lt;code&gt;std::cout &amp;lt;&amp;lt; arg1 &amp;lt;&amp;lt; arg2 ... etc &amp;lt;&amp;lt; std::endl;&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;If you would like to learn more about fold expressions please read &lt;a href="https://en.cppreference.com/w/cpp/language/fold" rel="noopener noreferrer"&gt;here.&lt;/a&gt;&lt;/p&gt;
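
&lt;p&gt;To make the expansion concrete, here is a minimal sketch of two folds. The helper names &lt;code&gt;sum&lt;/code&gt; and &lt;code&gt;concat&lt;/code&gt; are made up for illustration, and &lt;code&gt;concat&lt;/code&gt; writes to a &lt;code&gt;std::ostringstream&lt;/code&gt; instead of &lt;code&gt;std::cout&lt;/code&gt; only so the result is easy to inspect. It assumes a C++17 compiler.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#include &amp;lt;sstream&amp;gt;
#include &amp;lt;string&amp;gt;

// Unary right fold: (args + ...) expands to arg1 + (arg2 + (... + argN)).
// With operator+ an empty pack does not compile, so pass at least one argument.
template&amp;lt;typename... Pack&amp;gt;
auto sum(Pack... args){
    return (args + ...);
}

// Binary left fold with an initial value: ((out &amp;lt;&amp;lt; arg1) &amp;lt;&amp;lt; arg2) ...
template&amp;lt;typename... Pack&amp;gt;
std::string concat(Pack... args){
    std::ostringstream out;
    (out &amp;lt;&amp;lt; ... &amp;lt;&amp;lt; args);
    return out.str();
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With these, &lt;code&gt;sum(1, 2, 3)&lt;/code&gt; gives 6 and &lt;code&gt;concat('a', 2, "abc")&lt;/code&gt; gives the string &lt;code&gt;a2abc&lt;/code&gt;.&lt;/p&gt;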

&lt;p&gt;Okay, this was a very very basic intro, but I think we can start building our list data-structure and learn the rest on the way.&lt;/p&gt;
&lt;h2&gt;
  
  
  Building a tuple list
&lt;/h2&gt;

&lt;p&gt;Let's make a tuple list!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;template&amp;lt;typename...&amp;gt;
struct TupleList;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At the very basic level, this is all the TupleList is. &lt;/p&gt;

&lt;p&gt;Here you may notice something weird. We use &lt;code&gt;typename...&lt;/code&gt; as opposed to &lt;code&gt;typename... Args&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This is because we are only declaring it here and we don't need to name the pack in the declaration, which is valid. &lt;/p&gt;

&lt;p&gt;We'll be specializing TupleList and you will see us introduce named packs like &lt;code&gt;typename... Args&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Okay, let's specialize the TupleList so we can actually understand how it works lol. The declaration wasn't very informative.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// declaration
template&amp;lt;typename...&amp;gt;
struct TupleList;

// base case, empty tuple, empty pack
template&amp;lt;&amp;gt;
struct TupleList&amp;lt;&amp;gt;{};

// recursive inheritance case
template&amp;lt;typename T, typename... Rem&amp;gt;
struct TupleList&amp;lt;T, Rem...&amp;gt;: TupleList&amp;lt;Rem...&amp;gt;{
    T val;
};
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here is an implementation of the &lt;code&gt;TupleList&lt;/code&gt;. There may be other ways to implement it, but for now we will talk about this method.&lt;/p&gt;

&lt;p&gt;After declaring the &lt;code&gt;TupleList&lt;/code&gt;, we created a specialization for the empty &lt;code&gt;TupleList&lt;/code&gt;. This is our base case, the smallest/simplest unit of TupleList.&lt;/p&gt;

&lt;p&gt;All other &lt;code&gt;TupleList&lt;/code&gt; types branch off the empty &lt;code&gt;TupleList&amp;lt;&amp;gt;&lt;/code&gt;. &lt;/p&gt;

&lt;p&gt;Let's talk about the non empty &lt;code&gt;TupleList&lt;/code&gt;, the recursive case. All non-empty &lt;code&gt;TupleList&lt;/code&gt; store a &lt;code&gt;T val&lt;/code&gt;. &lt;/p&gt;

&lt;p&gt;This represents our &lt;code&gt;TupleList&lt;/code&gt; element that we store.&lt;/p&gt;

&lt;p&gt;You may also notice that the recursive case is a child of another &lt;code&gt;TupleList&lt;/code&gt;, &lt;code&gt;TupleList&amp;lt;Rem...&amp;gt;&lt;/code&gt;. &lt;/p&gt;

&lt;p&gt;Yes, a templated struct can inherit from another specialization of the same template, as long as the template arguments are different (and there is no infinite recursion/inheritance). &lt;/p&gt;

&lt;p&gt;The inheritance must come to an end, which in our case is the base case. &lt;/p&gt;

&lt;p&gt;Every non-empty TupleList inherits from its parent, which has one less template argument. So here is an example visualization of how this may look:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;TupleList&amp;lt;int, char, double&amp;gt; -&amp;gt; TupleList&amp;lt;char, double&amp;gt; -&amp;gt; TupleList&amp;lt;double&amp;gt; -&amp;gt; TupleList&amp;lt;&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The arrows show its next immediate parent. This means that &lt;code&gt;TupleList&amp;lt;int, char, double&amp;gt;&lt;/code&gt; is also a type &lt;code&gt;TupleList&amp;lt;&amp;gt;&lt;/code&gt; at the very base layer. &lt;/p&gt;

&lt;p&gt;But each parent has a different &lt;code&gt;T val&lt;/code&gt; because its first template parameter is different.&lt;/p&gt;
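
&lt;p&gt;A quick way to convince yourself of this chain is to ask the compiler. The sketch below repeats the declarations from above and adds a few &lt;code&gt;static_assert&lt;/code&gt;s. The alias name &lt;code&gt;Full&lt;/code&gt; is my own, and the &lt;code&gt;_v&lt;/code&gt; type traits assume C++17.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#include &amp;lt;type_traits&amp;gt;

template&amp;lt;typename...&amp;gt;
struct TupleList;

template&amp;lt;&amp;gt;
struct TupleList&amp;lt;&amp;gt;{};

template&amp;lt;typename T, typename... Rem&amp;gt;
struct TupleList&amp;lt;T, Rem...&amp;gt;: TupleList&amp;lt;Rem...&amp;gt;{
    T val;
};

using Full = TupleList&amp;lt;int, char, double&amp;gt;;

// Every link of the chain is a base class of the full type.
static_assert(std::is_base_of_v&amp;lt;TupleList&amp;lt;char, double&amp;gt;, Full&amp;gt;);
static_assert(std::is_base_of_v&amp;lt;TupleList&amp;lt;double&amp;gt;, Full&amp;gt;);
static_assert(std::is_base_of_v&amp;lt;TupleList&amp;lt;&amp;gt;, Full&amp;gt;);

// Each level declares its own val; the derived val hides the inherited ones.
static_assert(std::is_same_v&amp;lt;decltype(Full::val), int&amp;gt;);
static_assert(std::is_same_v&amp;lt;decltype(TupleList&amp;lt;char, double&amp;gt;::val), char&amp;gt;);
static_assert(std::is_same_v&amp;lt;decltype(TupleList&amp;lt;double&amp;gt;::val), double&amp;gt;);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If any of these assumptions were wrong, the file would simply fail to compile.&lt;/p&gt;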

&lt;p&gt;Okay cool, we have a &lt;code&gt;TupleList&lt;/code&gt;. How do we create one in our code?&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating a &lt;code&gt;TupleList&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;To create a &lt;code&gt;TupleList&lt;/code&gt; instance, we will need some constructors.&lt;/p&gt;

&lt;p&gt;We'll have to add a constructor to our non empty &lt;code&gt;TupleList&lt;/code&gt; so that we can pass values into our new container.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;template&amp;lt;typename T, typename... Rem&amp;gt;
struct TupleList&amp;lt;T, Rem...&amp;gt;: TupleList&amp;lt;Rem...&amp;gt;{
    T val;
    TupleList(T val, Rem... rem): TupleList&amp;lt;Rem...&amp;gt;(rem...), val(val){}
};
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This constructor takes in a value, and a pack of remaining values. &lt;/p&gt;

&lt;p&gt;The remaining values are passed into its parent's constructor. &lt;/p&gt;

&lt;p&gt;And its parent will pass the rest to its own parent... and so on.&lt;/p&gt;

&lt;p&gt;We strip away one value for each parent until it becomes an empty pack at which point the &lt;code&gt;TupleList&lt;/code&gt; parent will be the base case.&lt;/p&gt;

&lt;p&gt;Isn't this cool? &lt;/p&gt;

&lt;p&gt;It's like a type linked list. &lt;/p&gt;

&lt;p&gt;One &lt;code&gt;TupleList&lt;/code&gt; stores one type's value and links to the next via inheritance, which stores the next value.&lt;/p&gt;

&lt;p&gt;The constructor traverses this linked list of parents that stores various types.&lt;/p&gt;

&lt;p&gt;So now if I want to construct a &lt;code&gt;TupleList&lt;/code&gt;, it would look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;TupleList&amp;lt;int, double, char&amp;gt; tuple(1, 2.5, 'c');
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Yayyyy! Now we can store multiple types of values contiguously in memory like a list.&lt;/p&gt;

&lt;p&gt;Now how do I get values out?&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;get&lt;/code&gt; values from &lt;code&gt;TupleList&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;The ideal interface for getting values from a list would be with indexes. We'd like to randomly access different indexes.&lt;/p&gt;

&lt;p&gt;It gets weird though because we want a get function, but depending on the index, get is going to have different return types....&lt;/p&gt;

&lt;p&gt;Oh no, .... how can we fix this????&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpx1i3825mto71tedyeoo.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpx1i3825mto71tedyeoo.gif" alt="Fixing it" width="220" height="165"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Well good thing we have templates. &lt;/p&gt;

&lt;p&gt;We'll be able to generate many types of &lt;code&gt;get&lt;/code&gt; functions with minimal implementations.&lt;/p&gt;

&lt;p&gt;First we need to figure out how to translate indexes to a specific parent of &lt;code&gt;TupleList&lt;/code&gt; so that we can extract its value.&lt;/p&gt;

&lt;p&gt;I find this part very interesting...&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;template&amp;lt;size_t idx, typename TupleList&amp;gt;
struct GetIndex;

// base case: index 0 names the first type in the pack
template&amp;lt;typename T, typename... Rem&amp;gt;
struct GetIndex&amp;lt;0, TupleList&amp;lt;T, Rem...&amp;gt;&amp;gt;{
     using type = T;
};

// recursive case: drop the first type and decrement the index
template&amp;lt;size_t idx, typename T, typename... Rem&amp;gt;
struct GetIndex&amp;lt;idx, TupleList&amp;lt;T, Rem...&amp;gt;&amp;gt;{
     using type = typename GetIndex&amp;lt;idx-1, TupleList&amp;lt;Rem...&amp;gt;&amp;gt;::type;
};
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We created a struct called &lt;code&gt;GetIndex&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;GetIndex&lt;/code&gt; is a templated empty struct. It is defined by a pair, index and a  &lt;code&gt;TupleList&lt;/code&gt;. &lt;/p&gt;

&lt;p&gt;Let's start with the base case here. When index is 0, we have reached our element in this TupleList. &lt;/p&gt;

&lt;p&gt;But to get to index 0, we have to keep decrementing index as we strip away an element in the &lt;code&gt;TupleList&lt;/code&gt; type. &lt;/p&gt;

&lt;p&gt;As we saw earlier, to get to the next element, we have to go to our parent &lt;code&gt;TupleList&lt;/code&gt;. By stripping an element and decrementing an index, we go one step into our parent.&lt;/p&gt;

&lt;p&gt;Hence each of these &lt;code&gt;GetIndex&lt;/code&gt; empty structs are nodes in a type linked list that takes you from one element to another with an index for each parent.&lt;/p&gt;
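
&lt;p&gt;Here is a small self-check of that index walk, written as a sketch. It restates &lt;code&gt;TupleList&lt;/code&gt; and &lt;code&gt;GetIndex&lt;/code&gt; so it compiles on its own, names the second template parameter &lt;code&gt;Tuple&lt;/code&gt; and the alias &lt;code&gt;Example&lt;/code&gt; (both my choices), and assumes C++17.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#include &amp;lt;cstddef&amp;gt;
#include &amp;lt;type_traits&amp;gt;

template&amp;lt;typename...&amp;gt;
struct TupleList;

template&amp;lt;&amp;gt;
struct TupleList&amp;lt;&amp;gt;{};

template&amp;lt;typename T, typename... Rem&amp;gt;
struct TupleList&amp;lt;T, Rem...&amp;gt;: TupleList&amp;lt;Rem...&amp;gt;{
    T val;
};

template&amp;lt;std::size_t idx, typename Tuple&amp;gt;
struct GetIndex;

// Index 0: the first type of the pack is the answer.
template&amp;lt;typename T, typename... Rem&amp;gt;
struct GetIndex&amp;lt;0, TupleList&amp;lt;T, Rem...&amp;gt;&amp;gt;{
    using type = T;
};

// Otherwise: strip one type, decrement the index, ask again.
template&amp;lt;std::size_t idx, typename T, typename... Rem&amp;gt;
struct GetIndex&amp;lt;idx, TupleList&amp;lt;T, Rem...&amp;gt;&amp;gt;{
    using type = typename GetIndex&amp;lt;idx-1, TupleList&amp;lt;Rem...&amp;gt;&amp;gt;::type;
};

using Example = TupleList&amp;lt;int, double, char&amp;gt;;
static_assert(std::is_same_v&amp;lt;GetIndex&amp;lt;0, Example&amp;gt;::type, int&amp;gt;);
static_assert(std::is_same_v&amp;lt;GetIndex&amp;lt;1, Example&amp;gt;::type, double&amp;gt;);
static_assert(std::is_same_v&amp;lt;GetIndex&amp;lt;2, Example&amp;gt;::type, char&amp;gt;);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;An out-of-range index such as &lt;code&gt;GetIndex&amp;lt;3, Example&amp;gt;&lt;/code&gt; ends up at the undefined primary template, so it is rejected at compile time.&lt;/p&gt;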

&lt;p&gt;Great. We have indexes, now how do we implement &lt;code&gt;get&lt;/code&gt; using &lt;code&gt;GetIndex&lt;/code&gt;?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;template&amp;lt;size_t idx, typename TupleList&amp;gt;
struct GetIndex;

template&amp;lt;typename T, typename... Rem&amp;gt;
struct GetIndex&amp;lt;0, TupleList&amp;lt;T, Rem...&amp;gt;&amp;gt;{
     using type = T;
     static type get(TupleList&amp;lt;T, Rem...&amp;gt;&amp;amp; tuple){
          return tuple.val;
     }
};

template&amp;lt;size_t idx, typename T, typename... Rem&amp;gt;
struct GetIndex&amp;lt;idx, TupleList&amp;lt;T, Rem...&amp;gt;&amp;gt;{
     using type = typename GetIndex&amp;lt;idx-1, TupleList&amp;lt;Rem...&amp;gt;&amp;gt;::type;
     static type get(TupleList&amp;lt;T, Rem...&amp;gt;&amp;amp; tuple){
          // the derived-to-base conversion moves one step up the parent chain
          return GetIndex&amp;lt;idx-1, TupleList&amp;lt;Rem...&amp;gt;&amp;gt;::get(tuple);
     }
};

template&amp;lt;size_t idx, typename T, typename... Rem&amp;gt;
typename GetIndex&amp;lt;idx, TupleList&amp;lt;T, Rem...&amp;gt;&amp;gt;::type get(TupleList&amp;lt;T, Rem...&amp;gt;&amp;amp; tuple){
     return GetIndex&amp;lt;idx, TupleList&amp;lt;T, Rem...&amp;gt;&amp;gt;::get(tuple);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So we just implemented the &lt;code&gt;get&lt;/code&gt; function. &lt;/p&gt;

&lt;p&gt;The main one that gets called is the non-member &lt;code&gt;get&lt;/code&gt; function, the one outside the &lt;code&gt;GetIndex&lt;/code&gt; struct. &lt;/p&gt;

&lt;p&gt;This main &lt;code&gt;get&lt;/code&gt; function calls the &lt;code&gt;get&lt;/code&gt; functions of the &lt;code&gt;GetIndex&lt;/code&gt; structs to recursively go through the linkedlist and grab the value. &lt;/p&gt;

&lt;p&gt;Because the index is a template argument, the whole chain of calls is resolved at compile time. Each &lt;code&gt;get&lt;/code&gt; function has only 1 line, which either calls the next &lt;code&gt;get&lt;/code&gt; function or returns the value, so an optimizing compiler will typically inline the chain into a direct O(1) member access.&lt;/p&gt;

&lt;p&gt;But here is the interesting part. The main &lt;code&gt;get&lt;/code&gt; function is a templated function. Its return type depends on &lt;code&gt;GetIndex&amp;lt;idx, TupleList&amp;lt;T, Rem...&amp;gt;&amp;gt;::type&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The return type goes through the templates and runs through the GetIndex &lt;code&gt;type&lt;/code&gt; alias recursive chain (of decrementing index and looking at the parent's &lt;code&gt;type&lt;/code&gt;) until it hits the definition for &lt;code&gt;type&lt;/code&gt; in the base case (index is 0).&lt;/p&gt;

&lt;p&gt;The compiler will create definitions by recursing through &lt;code&gt;GetIndex&lt;/code&gt; to determine the function's return type.&lt;/p&gt;

&lt;p&gt;Now the only part that sucks about this is that the compiler requires you to know the index you want to access at compile time.&lt;/p&gt;

&lt;p&gt;Templates are evaluated and specialized by the compiler when you compile your program.&lt;/p&gt;

&lt;p&gt;So when you call this &lt;code&gt;get&lt;/code&gt; function like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;TupleList&amp;lt;int, double, char&amp;gt; list(1, 2.5, 'c');
double val = get&amp;lt;1&amp;gt;(list);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The compiler determines at compile time that this get function returns type &lt;code&gt;double&lt;/code&gt; and also wants index 1. &lt;/p&gt;

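&lt;p&gt;(Non-type template arguments must be constant expressions, such as literals or &lt;code&gt;constexpr&lt;/code&gt; values.) So you can't do &lt;code&gt;get&amp;lt;variable&amp;gt;&lt;/code&gt; with a variable whose value is only known at runtime.&lt;/p&gt;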

&lt;p&gt;Great, if I have a for loop then, how do I access each index one by one?&lt;/p&gt;

&lt;p&gt;You have two options:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; implement a &lt;code&gt;loop&lt;/code&gt; function &lt;/li&gt;
&lt;li&gt; have a &lt;code&gt;get&lt;/code&gt; function that we can give an index at runtime&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Okay, let's try both:&lt;/p&gt;

&lt;h3&gt;
  
  
  a runtime &lt;code&gt;get&lt;/code&gt; function
&lt;/h3&gt;

&lt;p&gt;Remember the idea here is we want to use our index at runtime to get the value out of our tuple.&lt;/p&gt;

&lt;p&gt;It gets tricky because we no longer get to use our templated index where 0 is our base case. We'll have to check for 0 at runtime.&lt;/p&gt;

&lt;p&gt;This makes our return type a bit tricky. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3g318lezoxvf7mdywhk4.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3g318lezoxvf7mdywhk4.jpg" alt="Tricks are for kids" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We'll now have to use type erasure because our compiler won't know the return type at compile time anymore without compile time indexes. &lt;/p&gt;

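&lt;p&gt;What is type erasure? It's where the concrete type of a value is hidden behind a single wrapper type, and you only recover the real type when you query for it at runtime.&lt;/p&gt;

&lt;p&gt;Since C++17, the standard library has &lt;code&gt;std::any&lt;/code&gt; and &lt;code&gt;std::variant&lt;/code&gt; for this.&lt;br&gt;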
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#include &amp;lt;any&amp;gt;
#include &amp;lt;cstddef&amp;gt;
#include &amp;lt;stdexcept&amp;gt;

template&amp;lt;typename...&amp;gt;
struct TupleList;

template&amp;lt;&amp;gt;
struct TupleList&amp;lt;&amp;gt; {
    std::any get(size_t) const {
        throw std::out_of_range("Index out of bounds");
    }
};

template&amp;lt;typename T, typename... Rem&amp;gt;
struct TupleList&amp;lt;T, Rem...&amp;gt; : TupleList&amp;lt;Rem...&amp;gt; {
    T val;
    TupleList(T val, Rem... rem): TupleList&amp;lt;Rem...&amp;gt;(rem...), val(val) {}

    std::any get(size_t index) const {
        if (index == 0)
            return val;
        else
            return TupleList&amp;lt;Rem...&amp;gt;::get(index - 1);
    }
};
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Great! With this, you can call &lt;code&gt;get&lt;/code&gt; like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;TupleList&amp;lt;int, double, char&amp;gt; t(1,2.5,'c');
int idx = 2;
std::any res = t.get(idx);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The difference now is that we can pass in variable indexes at runtime. &lt;/p&gt;

&lt;p&gt;We could also optionally wrap this in a non-member function, so that it is called like before with &lt;code&gt;get(tuple, idx)&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Another main thing to note, is that &lt;code&gt;std::any&lt;/code&gt; is returned. Now we have to check the type before extracting the value out.&lt;/p&gt;

&lt;p&gt;This is an annoying part with a runtime &lt;code&gt;get&lt;/code&gt; function.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;if (res.type() == typeid(char)) std::cout &amp;lt;&amp;lt; std::any_cast&amp;lt;char&amp;gt;(res) &amp;lt;&amp;lt; std::endl;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
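
&lt;p&gt;One way to make that check less noisy is the pointer overload of &lt;code&gt;std::any_cast&lt;/code&gt;, which returns &lt;code&gt;nullptr&lt;/code&gt; on a type mismatch instead of throwing. Below is a minimal sketch; the helper name &lt;code&gt;try_get&lt;/code&gt; is hypothetical and not part of the &lt;code&gt;TupleList&lt;/code&gt; above.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#include &amp;lt;any&amp;gt;
#include &amp;lt;optional&amp;gt;

// Hypothetical helper: returns the stored value if the std::any holds a T,
// otherwise std::nullopt.
template&amp;lt;typename T&amp;gt;
std::optional&amp;lt;T&amp;gt; try_get(const std::any&amp;amp; value){
    // any_cast on a pointer yields nullptr when the held type is not T.
    if (const T* ptr = std::any_cast&amp;lt;T&amp;gt;(&amp;amp;value)){
        return *ptr;
    }
    return std::nullopt;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;So for a &lt;code&gt;std::any&lt;/code&gt; holding &lt;code&gt;'c'&lt;/code&gt;, &lt;code&gt;try_get&amp;lt;char&amp;gt;(res)&lt;/code&gt; holds a value while &lt;code&gt;try_get&amp;lt;int&amp;gt;(res)&lt;/code&gt; is empty.&lt;/p&gt;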



&lt;p&gt;Another annoying part about this function is that the compiler can't resolve the lookup at compile time. &lt;/p&gt;

&lt;p&gt;So accessing indexes is worst case O(n) now because each time we enter the next parent, we have to check if index is 0 at runtime.&lt;/p&gt;

&lt;p&gt;It shouldn't be a problem however if your &lt;code&gt;TupleList&lt;/code&gt;s tend to be small. But if possible, prefer the smarter compile time &lt;code&gt;get&lt;/code&gt; function.&lt;/p&gt;

&lt;p&gt;Okay let's explore the &lt;code&gt;loop&lt;/code&gt; option.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can we loop without runtime indices?
&lt;/h3&gt;

&lt;p&gt;The answer is yes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;template&amp;lt;typename...&amp;gt;
struct TupleList;

template&amp;lt;&amp;gt;
struct TupleList&amp;lt;&amp;gt; {
     static void loop(size_t, TupleList&amp;lt;&amp;gt;&amp;amp;, auto&amp;amp;&amp;amp;){}
};

template&amp;lt;typename T, typename... Rem&amp;gt;
struct TupleList&amp;lt;T, Rem...&amp;gt; : TupleList&amp;lt;Rem...&amp;gt; {
    T val;
    TupleList(T val, Rem... rem): TupleList&amp;lt;Rem...&amp;gt;(rem...), val(val) {}

    static void loop(size_t idx, TupleList&amp;lt;T, Rem...&amp;gt;&amp;amp; tuple, auto&amp;amp;&amp;amp; func ){
         func(idx, tuple.val);
         TupleList&amp;lt;Rem...&amp;gt;::loop(idx+1, tuple, func);
    }
};

template&amp;lt;typename T, typename... Rem&amp;gt;
void loop(TupleList&amp;lt;T, Rem...&amp;gt;&amp;amp; tuple, auto&amp;amp;&amp;amp; func){
    TupleList&amp;lt;T, Rem...&amp;gt;::loop(0, tuple, func);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Cool. We have a loop function that takes in a reference to a function.&lt;/p&gt;

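&lt;p&gt;Honestly, we could also take a &lt;code&gt;std::function&lt;/code&gt; instead of &lt;code&gt;auto&amp;amp;&amp;amp;&lt;/code&gt;, but only if a single signature fits every element type (for example, one that takes &lt;code&gt;std::any&lt;/code&gt;). A generic callable is more flexible here.&lt;/p&gt;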

&lt;p&gt;But we can see that we iterate into the static function of the parent which lets us extract the next value as we increment indices. This makes this function O(n). &lt;/p&gt;

&lt;p&gt;We pass both the index and the value to our function for processing. &lt;/p&gt;

&lt;p&gt;The only problem is now your &lt;code&gt;func&lt;/code&gt; is expected to handle multiple types of values when looping through.&lt;/p&gt;

&lt;p&gt;No need to fear, we have templates. Here are some examples:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;TupleList&amp;lt;int, double, char&amp;gt; tuple(1,2.5,'c');
// lambda 'auto' arg becomes a templated arg
auto print = [](size_t idx, auto arg){
     std::cout &amp;lt;&amp;lt; arg &amp;lt;&amp;lt; std::endl;
};
loop(tuple, print); 

// needs &amp;lt;type_traits&amp;gt; for std::is_same_v
auto int_nonint_print = [](size_t idx, auto arg){
     if constexpr (std::is_same_v&amp;lt;decltype(arg), int&amp;gt;){
          std::cout &amp;lt;&amp;lt; arg &amp;lt;&amp;lt; " is an int" &amp;lt;&amp;lt; std::endl;
     }else{
          std::cout &amp;lt;&amp;lt; arg &amp;lt;&amp;lt; " is not an int" &amp;lt;&amp;lt;std::endl;
     }
};
loop(tuple, int_nonint_print);

// template lambda (C++20)
auto print2 = []&amp;lt;typename T&amp;gt;(size_t idx, T arg){
     std::cout &amp;lt;&amp;lt; arg &amp;lt;&amp;lt; std::endl;
};
loop(tuple, print2);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here I show 3 examples of using loop:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;An &lt;code&gt;auto&lt;/code&gt; lambda becomes a functor whose call operator has a templated argument, so it acts like example 3 (C++14).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;int_nonint_print&lt;/code&gt; shows we can use &lt;code&gt;if constexpr&lt;/code&gt; with type traits to create different functions for ints and non-ints, so we can have type-based logic inside one lambda (C++17).&lt;/li&gt;
&lt;li&gt;An explicit template lambda is used (C++20). The compiler will specialize and generate versions of the function based on the types in the TupleList.&lt;/li&gt;
&lt;/ol&gt;
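
&lt;p&gt;Since &lt;code&gt;loop&lt;/code&gt; hands each &lt;code&gt;val&lt;/code&gt; to the callback as an lvalue, a callback that takes &lt;code&gt;auto&amp;amp;&lt;/code&gt; can also modify elements in place. Here is a sketch of that idea. Two things are my own assumptions: the helper &lt;code&gt;double_numbers&lt;/code&gt; is hypothetical, and I spell the callback parameter as an explicit &lt;code&gt;typename F&lt;/code&gt; instead of &lt;code&gt;auto&amp;amp;&amp;amp;&lt;/code&gt; so the sketch also builds as C++17.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#include &amp;lt;cstddef&amp;gt;
#include &amp;lt;sstream&amp;gt;
#include &amp;lt;string&amp;gt;
#include &amp;lt;type_traits&amp;gt;

template&amp;lt;typename...&amp;gt;
struct TupleList;

template&amp;lt;&amp;gt;
struct TupleList&amp;lt;&amp;gt;{
    // End of the chain: nothing left to visit.
    template&amp;lt;typename F&amp;gt;
    static void loop(std::size_t, TupleList&amp;lt;&amp;gt;&amp;amp;, F&amp;amp;&amp;amp;){}
};

template&amp;lt;typename T, typename... Rem&amp;gt;
struct TupleList&amp;lt;T, Rem...&amp;gt;: TupleList&amp;lt;Rem...&amp;gt;{
    T val;
    TupleList(T val, Rem... rem): TupleList&amp;lt;Rem...&amp;gt;(rem...), val(val){}

    template&amp;lt;typename F&amp;gt;
    static void loop(std::size_t idx, TupleList&amp;lt;T, Rem...&amp;gt;&amp;amp; tuple, F&amp;amp;&amp;amp; func){
        func(idx, tuple.val);                         // visit this element
        TupleList&amp;lt;Rem...&amp;gt;::loop(idx+1, tuple, func);  // then the parent
    }
};

// Hypothetical helper: doubles every non-char arithmetic element in place
// and returns an "index:value" summary of the tuple afterwards.
template&amp;lt;typename... Ts&amp;gt;
std::string double_numbers(TupleList&amp;lt;Ts...&amp;gt;&amp;amp; tuple){
    std::ostringstream out;
    auto visit = [&amp;amp;out](std::size_t idx, auto&amp;amp; arg){
        using Elem = std::decay_t&amp;lt;decltype(arg)&amp;gt;;
        if constexpr (std::is_arithmetic_v&amp;lt;Elem&amp;gt; &amp;amp;&amp;amp; !std::is_same_v&amp;lt;Elem, char&amp;gt;){
            arg *= 2;
        }
        out &amp;lt;&amp;lt; idx &amp;lt;&amp;lt; ":" &amp;lt;&amp;lt; arg &amp;lt;&amp;lt; " ";
    };
    TupleList&amp;lt;Ts...&amp;gt;::loop(0, tuple, visit);
    return out.str();
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For a &lt;code&gt;TupleList&amp;lt;int, double, char&amp;gt;&lt;/code&gt; built from &lt;code&gt;(1, 2.5, 'c')&lt;/code&gt;, the int and the double are doubled and the char is left alone.&lt;/p&gt;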

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Ok we've talked about a lot for this blog. This topic is not over yet.&lt;/p&gt;

&lt;p&gt;I want to build out the rest of &lt;code&gt;TupleList&lt;/code&gt; including adding/removing elements.&lt;/p&gt;

&lt;p&gt;But right now we have a TupleArray. And we can loop through it and access different elements at runtime or compile time.&lt;/p&gt;

&lt;p&gt;Let's talk about the differences between the &lt;code&gt;TupleList&lt;/code&gt; and &lt;code&gt;std::vector&lt;/code&gt; as a baseline.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;TupleList&lt;/code&gt; is created on the stack (but it could be heap as well) whereas &lt;code&gt;std::vector&lt;/code&gt; allocates its elements on the heap&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;TupleList&lt;/code&gt; can store different types and &lt;code&gt;std::vector&lt;/code&gt; cannot (unless it uses &lt;code&gt;std::any&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;TupleList&lt;/code&gt; can have constant compile time access or O(n) runtime access whereas &lt;code&gt;std::vector&lt;/code&gt; has constant random access O(1).&lt;/li&gt;
&lt;li&gt;You can loop through both.&lt;/li&gt;
&lt;li&gt;As of right now, you can only change the size of &lt;code&gt;std::vector&lt;/code&gt; until part 2 :)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Okay overall, I hope you liked this mini lesson and cool usage of variadic templates.&lt;/p&gt;

&lt;p&gt;And we'll explore more of this in another part.&lt;/p&gt;

&lt;p&gt;As usual, drop your questions below :)&lt;/p&gt;

&lt;p&gt;And here's example code, with an additional hash example if you want to mess around with it: &lt;a href="https://pastebin.com/rERy1avn" rel="noopener noreferrer"&gt;https://pastebin.com/rERy1avn&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Until next time&lt;br&gt;
-absterdabster&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqq39ysdvjac7m2ka3tw1.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqq39ysdvjac7m2ka3tw1.jpg" alt="Peace" width="225" height="225"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>cpp</category>
      <category>datastructures</category>
      <category>compiling</category>
      <category>programming</category>
    </item>
    <item>
      <title>Motivation behind C++ Concepts</title>
      <dc:creator>absterdabster</dc:creator>
      <pubDate>Tue, 08 Apr 2025 03:21:55 +0000</pubDate>
      <link>https://dev.to/absterdabster/motivation-behind-c-concepts-402i</link>
      <guid>https://dev.to/absterdabster/motivation-behind-c-concepts-402i</guid>
      <description>&lt;p&gt;C++ 20 introduced &lt;strong&gt;&lt;em&gt;concepts&lt;/em&gt;&lt;/strong&gt;. What are they? Why should I care about them? How do I use them?&lt;/p&gt;

&lt;p&gt;Concepts are a powerful tool to help you write generic code with restrictions evaluated at compile time. &lt;/p&gt;

&lt;p&gt;What does that mean?&lt;/p&gt;

&lt;p&gt;Let's say I make a library and I want to create a function that allows my users to pass in a singular integer, float, or string into it. &lt;/p&gt;

&lt;p&gt;However, I don't want to let them pass in a boolean. &lt;/p&gt;

&lt;p&gt;Is there a way to accomplish this by writing &lt;strong&gt;&lt;em&gt;only one function&lt;/em&gt;&lt;/strong&gt;?&lt;/p&gt;

&lt;p&gt;YES!&lt;/p&gt;

&lt;p&gt;We can do this with C++ templates and concepts. (Thanks C++ 20)&lt;/p&gt;

&lt;p&gt;Ok let's start at the very beginning... templates&lt;/p&gt;

&lt;h2&gt;
  
  
  templates
&lt;/h2&gt;

&lt;p&gt;Okay, let's say I'm making a library that accepts an integer, float, or string. &lt;/p&gt;

&lt;p&gt;One way to do this would be like this with function overloading:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#include &amp;lt;iostream&amp;gt;
#include &amp;lt;string&amp;gt;

void function(int v){
        std::cout &amp;lt;&amp;lt; "function: " &amp;lt;&amp;lt; v &amp;lt;&amp;lt; std::endl;
}

void function(double v){
        std::cout &amp;lt;&amp;lt; "function: " &amp;lt;&amp;lt; v &amp;lt;&amp;lt; std::endl;
}

void function(std::string v){
        std::cout &amp;lt;&amp;lt; "function: " &amp;lt;&amp;lt; v &amp;lt;&amp;lt; std::endl;
}

int main(int argc, char** argv){
        function("hi");
        function(2);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Yes all the functions do the same thing and this compiles. &lt;/p&gt;

&lt;p&gt;But why do I have to write it 3 times???&lt;/p&gt;

&lt;p&gt;You don't!!! &lt;/p&gt;

&lt;p&gt;Let's use templates to simplify this to one.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#include &amp;lt;iostream&amp;gt;

template &amp;lt;typename T&amp;gt;
void function(T v){
        std::cout &amp;lt;&amp;lt; "function: " &amp;lt;&amp;lt; v &amp;lt;&amp;lt; std::endl;
}

int main(int argc, char** argv){
        function("hi");
        function(2);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Nice so we have a program that works and compiles. &lt;/p&gt;

&lt;p&gt;The compiler sees that we call function for "hi" and &lt;code&gt;2&lt;/code&gt;. &lt;/p&gt;

&lt;p&gt;So when compiling, it automatically creates two variations of the function. One accepts the integer and another accepts the string literal (for which &lt;code&gt;T&lt;/code&gt; is deduced as &lt;code&gt;const char*&lt;/code&gt;). &lt;/p&gt;

&lt;p&gt;If I introduced a third type, let's say a double, it would then compile a double version of function.&lt;/p&gt;

&lt;p&gt;Super cool! &lt;/p&gt;

&lt;p&gt;I can write multiple versions of my function with less code using templates.&lt;/p&gt;

&lt;p&gt;I do want to note there is one other way you could do this. Since C++17, the standard library has types that defer the type decision to runtime. (&lt;code&gt;std::variant&lt;/code&gt; and &lt;code&gt;std::any&lt;/code&gt;)&lt;/p&gt;

&lt;p&gt;These types can hold a value of one of &lt;em&gt;multiple&lt;/em&gt; types, one at a time. Which type is currently held is determined at runtime as opposed to compile time.&lt;/p&gt;

&lt;p&gt;Long story short, you can do this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#include &amp;lt;iostream&amp;gt;
#include &amp;lt;string&amp;gt;
#include &amp;lt;variant&amp;gt;

void function(std::variant&amp;lt;int, std::string&amp;gt; v){
        if(std::holds_alternative&amp;lt;int&amp;gt;(v)){
                std::cout &amp;lt;&amp;lt; "function: " &amp;lt;&amp;lt; std::get&amp;lt;int&amp;gt;(v) &amp;lt;&amp;lt; std::endl;
        }else{
                std::cout &amp;lt;&amp;lt; "function: " &amp;lt;&amp;lt; std::get&amp;lt;std::string&amp;gt;(v) &amp;lt;&amp;lt; std::endl;
        }
}

int main(int argc, char** argv){
        function("hi");
        function(2);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In my opinion, this looks ugly/verbose. Also there is an extra if statement in there. &lt;/p&gt;

&lt;p&gt;As a result, I prefer using a template for this case. &lt;/p&gt;
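
&lt;p&gt;If you do stay with a variant, &lt;code&gt;std::visit&lt;/code&gt; can replace the hand-written if chain. Here is a minimal sketch, assuming C++17. The function name &lt;code&gt;describe&lt;/code&gt; is hypothetical, and it returns the text instead of printing it just to keep the result easy to check.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#include &amp;lt;sstream&amp;gt;
#include &amp;lt;string&amp;gt;
#include &amp;lt;variant&amp;gt;

// Hypothetical counterpart of the function above that builds the text.
std::string describe(const std::variant&amp;lt;int, std::string&amp;gt;&amp;amp; v){
    std::ostringstream out;
    // std::visit calls the generic lambda with whichever alternative is held.
    std::visit([&amp;amp;out](const auto&amp;amp; held){ out &amp;lt;&amp;lt; "function: " &amp;lt;&amp;lt; held; }, v);
    return out.str();
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The generic lambda is instantiated once per alternative, so there is still a runtime dispatch, it is just no longer written by hand.&lt;/p&gt;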

&lt;p&gt;The good thing about a variant is it can restrict the types allowed into the function unlike a general template.&lt;/p&gt;

&lt;p&gt;For instance, it only lets an integer or a string into &lt;code&gt;function&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Okay, great! Is there a way we can do this with templates?&lt;/p&gt;

&lt;p&gt;The answer is yes! &lt;/p&gt;

&lt;p&gt;We can look at ways to do this shortly.&lt;/p&gt;

&lt;p&gt;Just before we get there, I want to show you another limitation of templates.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;code&gt;std::enable_if&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;C++ has &lt;code&gt;std::enable_if&lt;/code&gt; to activate certain functions if a condition is true.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;std::enable_if&lt;/code&gt; is often used with templates. &lt;/p&gt;

&lt;p&gt;&lt;code&gt;std::enable_if&lt;/code&gt; is itself a templated type. The first template argument is the condition, and the second is the type to use if the condition is true.&lt;/p&gt;

&lt;p&gt;It relies on a C++ rule called &lt;em&gt;SFINAE&lt;/em&gt; (Substitution Failure Is Not An Error).&lt;/p&gt;

&lt;p&gt;Let's see an example of this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#include &amp;lt;iostream&amp;gt;
#include &amp;lt;type_traits&amp;gt;
#include &amp;lt;string&amp;gt;

template &amp;lt;typename T&amp;gt;
typename std::enable_if&amp;lt;std::is_same&amp;lt;T, int&amp;gt;::value ||
std::is_constructible&amp;lt;std::string, T&amp;gt;::value, void&amp;gt;::type
function(T v){
        std::cout &amp;lt;&amp;lt; "function: " &amp;lt;&amp;lt; v &amp;lt;&amp;lt; std::endl;
}

template &amp;lt;typename T&amp;gt;
typename std::enable_if&amp;lt;!(std::is_same&amp;lt;T, int&amp;gt;::value ||
std::is_constructible&amp;lt;std::string, T&amp;gt;::value), void&amp;gt;::type
function(T v){
        std::cout &amp;lt;&amp;lt; "diff function: " &amp;lt;&amp;lt; v &amp;lt;&amp;lt; std::endl;
}

int main(int argc, char** argv){
        function("hi");
        function(2);
        function(true);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output here is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;function: hi
function: 2
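diff function: 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;(The &lt;code&gt;true&lt;/code&gt; shows up as &lt;code&gt;1&lt;/code&gt; because &lt;code&gt;std::cout&lt;/code&gt; prints bools as integers unless you use &lt;code&gt;std::boolalpha&lt;/code&gt;.)&lt;/p&gt;

&lt;p&gt;So both of the &lt;code&gt;std::enable_if&lt;/code&gt;s produce a &lt;code&gt;void&lt;/code&gt; return type when their condition is true for the templated type.&lt;/p&gt;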

&lt;p&gt;We use &lt;code&gt;type_traits&lt;/code&gt; in C++ to build our condition. (&lt;code&gt;std::is_same&lt;/code&gt; and &lt;code&gt;std::is_constructible&lt;/code&gt;)&lt;/p&gt;

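&lt;p&gt;If &lt;code&gt;T&lt;/code&gt; is an &lt;code&gt;int&lt;/code&gt; or can be used to construct a &lt;code&gt;std::string&lt;/code&gt;, the first function is given a &lt;code&gt;void&lt;/code&gt; return type.&lt;/p&gt;

&lt;p&gt;For that same &lt;code&gt;T&lt;/code&gt;, the second function's condition is false, so its &lt;code&gt;std::enable_if&lt;/code&gt; has no &lt;code&gt;::type&lt;/code&gt; member and the return type is &lt;em&gt;invalid&lt;/em&gt;. Instead of raising an error, the compiler quietly throws that function away.&lt;/p&gt;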

&lt;p&gt;Hence, with &lt;code&gt;std::enable_if&lt;/code&gt; we can have multiple functions such that if type substitution fails with one function, there may be another function that can be used.&lt;/p&gt;
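
&lt;p&gt;To make the "invalid return type" idea concrete, here is a minimal sketch of how a type like &lt;code&gt;std::enable_if&lt;/code&gt; could be written. The names &lt;code&gt;my_enable_if&lt;/code&gt; and &lt;code&gt;kind_of&lt;/code&gt; are made up for illustration, and a real standard library may implement it differently.&lt;/p&gt;

```cpp
#include <string>
#include <type_traits>

// Primary template: used when the condition is false. There is no `type` member.
template <bool Condition, typename T = void>
struct my_enable_if {};

// Partial specialization: used when the condition is true. It exposes `type`.
template <typename T>
struct my_enable_if<true, T> {
    using type = T;
};

// Only viable for integral T, because otherwise `::type` does not exist.
template <typename T>
typename my_enable_if<std::is_integral<T>::value, std::string>::type
kind_of(T) {
    return "integral";
}

// Only viable for every other T.
template <typename T>
typename my_enable_if<!std::is_integral<T>::value, std::string>::type
kind_of(T) {
    return "something else";
}
```

&lt;p&gt;Calling &lt;code&gt;kind_of(7)&lt;/code&gt; should pick the first overload and &lt;code&gt;kind_of(2.5)&lt;/code&gt; the second, since for each call exactly one of the two return types exists.&lt;/p&gt;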

&lt;p&gt;Not only that, &lt;code&gt;std::enable_if&lt;/code&gt; lets you get the effect of partially specialized functions, like overloads for &lt;code&gt;T&lt;/code&gt; and &lt;code&gt;std::vector&amp;lt;T&amp;gt;&lt;/code&gt; coexisting.&lt;/p&gt;
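
&lt;p&gt;As a rough sketch of that idea, the two overloads below split "any vector" from "everything else". The helper trait &lt;code&gt;is_vector&lt;/code&gt; and the function &lt;code&gt;describe&lt;/code&gt; are hypothetical names I picked for this example. They are not part of the standard library.&lt;/p&gt;

```cpp
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical helper trait: true only for std::vector<...>.
template <typename T>
struct is_vector : std::false_type {};

template <typename T, typename Alloc>
struct is_vector<std::vector<T, Alloc>> : std::true_type {};

// Enabled for anything that is NOT a vector.
template <typename T>
typename std::enable_if<!is_vector<T>::value, std::string>::type
describe(const T&) {
    return "single value";
}

// Enabled only for vectors, so it never clashes with the overload above.
template <typename T>
typename std::enable_if<is_vector<T>::value, std::string>::type
describe(const T& v) {
    return "vector of " + std::to_string(v.size()) + " items";
}
```

&lt;p&gt;Because the two conditions are exact opposites, every call matches exactly one overload and the compiler never has to guess.&lt;/p&gt;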

&lt;p&gt;BOOM. SFINAE solved.&lt;/p&gt;

&lt;p&gt;Okay, cool, we can restrict types with &lt;code&gt;std::enable_if&lt;/code&gt;. &lt;/p&gt;

&lt;p&gt;But even this looks ugly. Our &lt;code&gt;enable_if&lt;/code&gt; conditions get long.&lt;/p&gt;

&lt;p&gt;Is there a better way we could fix this?&lt;/p&gt;

&lt;h2&gt;
  
  
  Concepts
&lt;/h2&gt;

&lt;p&gt;In C++ 20, concepts were introduced. &lt;/p&gt;

&lt;p&gt;Concepts are so cool.&lt;/p&gt;

&lt;p&gt;Let's solve the problem we had earlier with concepts. Things get more concise.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#include &amp;lt;iostream&amp;gt;
#include &amp;lt;string&amp;gt;
#include &amp;lt;type_traits&amp;gt;
#include &amp;lt;concepts&amp;gt;

template &amp;lt;typename T&amp;gt;
concept ValidType = std::same_as&amp;lt;T, int&amp;gt; || std::is_constructible_v&amp;lt;std::string, T&amp;gt;;

template &amp;lt;ValidType T&amp;gt;
void function(T v){
        std::cout &amp;lt;&amp;lt; "function: " &amp;lt;&amp;lt; v &amp;lt;&amp;lt; std::endl;
}

template &amp;lt;typename T&amp;gt;
void function(T v){
        std::cout &amp;lt;&amp;lt; "diff function: " &amp;lt;&amp;lt; v &amp;lt;&amp;lt; std::endl;
}

int main(int argc, char** argv){
        function("hi");
        function(2);
        function(true);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is kind of cool and way cleaner. But wait, shouldn't &lt;code&gt;2&lt;/code&gt; work for both functions?&lt;/p&gt;

&lt;p&gt;It does, but concepts won't throw an ambiguity error here.&lt;/p&gt;

&lt;p&gt;Instead, the compiler picks the most constrained (most restrictive) overload, which is the first function.&lt;/p&gt;

&lt;p&gt;You can chain multiple concepts together with &lt;code&gt;Concept1 || Concept2&lt;/code&gt; or &lt;code&gt;Concept1 &amp;amp;&amp;amp; Concept2&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;You can create concepts with other concepts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;template&amp;lt;typename T&amp;gt;
concept IsInt = std::same_as&amp;lt;T, int&amp;gt;;

template&amp;lt;typename T&amp;gt;
concept IsString = std::is_constructible_v&amp;lt;std::string, T&amp;gt;;

template &amp;lt;typename T&amp;gt;
concept ValidType = IsInt&amp;lt;T&amp;gt; || IsString&amp;lt;T&amp;gt;;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And all of this is done at compile time!&lt;/p&gt;

&lt;p&gt;Concepts are the modern replacement for most C++ SFINAE tricks.&lt;/p&gt;

&lt;p&gt;Concepts go hand in hand with the &lt;code&gt;requires&lt;/code&gt; keyword. &lt;/p&gt;

&lt;p&gt;I don't want to make this blog tooooo long, but if you would like me to explain &lt;code&gt;requires&lt;/code&gt; in more detail, drop a comment! Or read more &lt;a href="https://en.cppreference.com/w/cpp/language/requires" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Conclusion
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Concepts are cool and clean and powerful&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;std::enable_if&lt;/code&gt; is noice but it can get very verbose&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;std::variant&lt;/code&gt; can get messy with type-based if statements, and the checks happen at runtime&lt;/li&gt;
&lt;li&gt;Generic templates are great! But on their own, function templates cannot be partially specialized or restrict their types.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Try using concepts. They are easy and make you feel great for using modern C++.&lt;/p&gt;

&lt;p&gt;Peace&lt;br&gt;
-absterdabster&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe20ccqhxtxq8oqpb48i2.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe20ccqhxtxq8oqpb48i2.jpg" alt="Image description" width="225" height="225"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>cpp</category>
      <category>programming</category>
    </item>
    <item>
      <title>Measuring your program speed correctly</title>
      <dc:creator>absterdabster</dc:creator>
      <pubDate>Thu, 27 Mar 2025 03:46:48 +0000</pubDate>
      <link>https://dev.to/absterdabster/measuring-your-program-speed-correctly-4a52</link>
      <guid>https://dev.to/absterdabster/measuring-your-program-speed-correctly-4a52</guid>
      <description>&lt;p&gt;Hallo curious friend! Have you ever run a program and wondered how long it took? &lt;/p&gt;

&lt;p&gt;Let's say you had two programs and you were trying to figure out which one was faster. Maybe you used a tool and got a measurement for both. How sure are you that it took that long?&lt;/p&gt;

&lt;p&gt;Let's explore measuring our program speed and try to find out how to be as accurate as possible. MAYBE EVEN DOWN TO THE NANOSECOND! &lt;/p&gt;

&lt;p&gt;As Barney Stinson would say, CHALLENGE ACCEPTED! &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftj243epp89xdczlvfbi0.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftj243epp89xdczlvfbi0.gif" alt="challng accptd" width="400" height="225"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Okay there are several ways we could try to do this. Let's look at them one by one.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why would I care about nanosecond precision?
&lt;/h2&gt;

&lt;p&gt;Many CPUs these days run at around 3 GHz.&lt;/p&gt;

&lt;p&gt;What does that mean?&lt;/p&gt;

&lt;p&gt;That means it runs 3 * 1e9 cpu clock cycles per second. &lt;/p&gt;

&lt;p&gt;3 BILLION CYCLES PER SECOND.&lt;/p&gt;

&lt;p&gt;This also means 3 cpu clock cycles per nanosecond... &lt;br&gt;
(a nanosecond is 1e-9 seconds, super small....). &lt;/p&gt;

&lt;p&gt;If I can run 1 addition operation in 1 cpu clock cycle, in 1 second, I can add 3 BILLION things together!&lt;/p&gt;

&lt;p&gt;Wow computers are so powerful. But this also means to be precise about our performance, we should probably try to measure as close to nanoseconds as possible. &lt;/p&gt;
&lt;h2&gt;
  
  
  The POSIX &lt;code&gt;time&lt;/code&gt; command
&lt;/h2&gt;

&lt;p&gt;If you use the &lt;code&gt;time&lt;/code&gt; command, which is available on POSIX systems like Linux, you will get 3 types of times for your program.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;real&lt;/code&gt;: The real time is end to end time of your program from invocation to the end of the process.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;user&lt;/code&gt;: This is the time the cpu spent in user space (the logic of your program).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;sys&lt;/code&gt;: This is the time the CPU spent in kernel space on behalf of your program, for example in system calls. (System calls are ways your program can make use of your operating system's resources. Interrupts, by contrast, are the way your operating system prevents you from hogging the CPU for yourself.)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Great let's run a simple &lt;code&gt;C&lt;/code&gt; or &lt;code&gt;C++&lt;/code&gt; program and see how things get measured.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;int main(int argc, char** argv){
        int sum{0};
        for(int i = 0; i &amp;lt; 20; i++){
                sum += i;
        }
        return sum;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's make this fun. Give a guess how long you think this will run. If you're brave enough, lock in your answer in the comments lol. We can see who gets the closest. &lt;/p&gt;

&lt;p&gt;Okay, I'm going to run it now...&lt;/p&gt;

&lt;p&gt;&lt;code&gt;time ./test&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Here is what we get:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;real    0m0.014s
user    0m0.001s
sys     0m0.000s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Okay it seems like our process took 0.014s overall, but the logic of our program took 0.001s (1 millisecond) of it. &lt;/p&gt;

&lt;p&gt;Honestly, at this point I have no clue if this is right or not.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flp7gd9imdbjvejd10w6p.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flp7gd9imdbjvejd10w6p.jpg" alt="idk help" width="600" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;But many people might stop here and believe this value.&lt;/p&gt;

&lt;p&gt;Let's see what is happening behind the scenes. It seems like &lt;code&gt;time&lt;/code&gt; is a built in shell command, so let's look at the bash shell codebase.&lt;/p&gt;

&lt;p&gt;After some digging, we find the &lt;code&gt;time_command&lt;/code&gt; C function &lt;a href="https://github.com/bminor/bash/blob/6794b5478f660256a1023712b5fc169196ed0a22/execute_cmd.c#L1335" rel="noopener noreferrer"&gt;here&lt;/a&gt;. Here is an important part of it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#if defined (HAVE_GETRUSAGE) &amp;amp;&amp;amp; defined (HAVE_GETTIMEOFDAY)
  struct timeval real, user, sys;
  struct timeval before, after;
#  if defined (HAVE_STRUCT_TIMEZONE)
  struct timezone dtz;              /* posix doesn't define this */
#  endif
  struct rusage selfb, selfa, kidsb, kidsa; /* a = after, b = before */
#else
#  if defined (HAVE_TIMES)
  clock_t tbefore, tafter, real, user, sys;
  struct tms before, after;
#  endif
#endif
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's dissect this. There are 2 important terms to highlight here: &lt;code&gt;timeval&lt;/code&gt; and &lt;code&gt;rusage&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;timeval&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;timeval&lt;/code&gt; holds two variables.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;struct timeval {
    time_t      tv_sec;     /* seconds */
    suseconds_t tv_usec;    /* microseconds */
};
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It is often used with the &lt;code&gt;gettimeofday&lt;/code&gt; system call. This system call is defined &lt;a href="https://man7.org/linux/man-pages/man2/settimeofday.2.html" rel="noopener noreferrer"&gt;here&lt;/a&gt;. It gives you the seconds and microseconds since the epoch (Jan 1, 1970).&lt;/p&gt;

&lt;p&gt;A system call is a call your program makes to the system to perform privileged tasks like asking for the time from hardware.&lt;/p&gt;

&lt;p&gt;Specifically, on Linux &lt;code&gt;gettimeofday&lt;/code&gt; is served through the vsyscall/vDSO mechanism and you can see how it works &lt;a href="https://0xax.gitbooks.io/linux-insides/content/Timers/linux-timers-7.html" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;But long story short, it asks the system's real-time clock (the kernel's wall clock, &lt;code&gt;CLOCK_REALTIME&lt;/code&gt;) for the time. That clock is actually tracked in nanoseconds.&lt;/p&gt;

&lt;p&gt;But we actually lose this precision and it gets converted to microseconds/seconds when we get it! AHHHHHHH! This sucks...&lt;/p&gt;

&lt;p&gt;The idea is we use &lt;code&gt;gettimeofday&lt;/code&gt; before the program starts and after it ends to measure the end to end program time. &lt;/p&gt;
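
&lt;p&gt;That before/after idea can be sketched in a few lines. This is my own minimal version, not the code bash uses, and the helper names &lt;code&gt;wall_clock_us&lt;/code&gt; and &lt;code&gt;elapsed_us&lt;/code&gt; are made up. It assumes a POSIX system that provides &lt;code&gt;gettimeofday&lt;/code&gt;.&lt;/p&gt;

```cpp
#include <sys/time.h>  // gettimeofday, struct timeval (POSIX)

// Current wall-clock time as microseconds since the epoch.
long long wall_clock_us() {
    struct timeval tv;
    gettimeofday(&tv, nullptr);
    return tv.tv_sec * 1000000LL + tv.tv_usec;
}

// End-to-end time of `work` in microseconds: read the clock before and after.
template <typename Work>
long long elapsed_us(Work work) {
    const long long before = wall_clock_us();
    work();
    const long long after = wall_clock_us();
    return after - before;
}
```

&lt;p&gt;Keep in mind the wall clock can be adjusted while you measure (by NTP, for example), so a result from &lt;code&gt;elapsed_us&lt;/code&gt; is only as trustworthy as the clock underneath it.&lt;/p&gt;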

&lt;h3&gt;
  
  
  &lt;code&gt;rusage&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;It's actually short for resource usage. The Linux kernel maintains statistics about your program like the amount of time spent in user space and the time spent in the kernel on your program's behalf.&lt;/p&gt;

&lt;p&gt;It also contains other information relating to memory and input/output devices. To give you an idea, it looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;struct rusage {
               struct timeval ru_utime; /* user CPU time used */
               struct timeval ru_stime; /* system CPU time used */
               long   ru_maxrss;        /* maximum resident set size */
               long   ru_ixrss;         /* integral shared memory size */
               long   ru_idrss;         /* integral unshared data size */
               long   ru_isrss;         /* integral unshared stack size */
               long   ru_minflt;        /* page reclaims (soft page faults) */
               long   ru_majflt;        /* page faults (hard page faults) */
               long   ru_nswap;         /* swaps */
               long   ru_inblock;       /* block input operations */
               long   ru_oublock;       /* block output operations */
               long   ru_msgsnd;        /* IPC messages sent */
               long   ru_msgrcv;        /* IPC messages received */
               long   ru_nsignals;      /* signals received */
               long   ru_nvcsw;         /* voluntary context switches */
               long   ru_nivcsw;        /* involuntary context switches */
           };
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Yep, it's a lot of statistics. &lt;/p&gt;

&lt;p&gt;To collect this data, your Linux libc library supports the &lt;code&gt;getrusage&lt;/code&gt; syscall (&lt;a href="https://man7.org/linux/man-pages/man2/getrusage.2.html" rel="noopener noreferrer"&gt;more info&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;You can track the entire process (combines all threads), the calling thread, or even child processes!&lt;/p&gt;

&lt;p&gt;As you can see, this once again utilizes the &lt;code&gt;timeval&lt;/code&gt; struct, which has microsecond precision.&lt;/p&gt;

&lt;p&gt;While the microsecond precision sucks a little, it is pretty cool that the kernel tracks stuff for us (especially the user/system time).&lt;/p&gt;

&lt;p&gt;Overall, here is what I've concluded from the &lt;code&gt;time&lt;/code&gt; command.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;It has microsecond precision&lt;/li&gt;
&lt;li&gt;It maintains user/system time for a process&lt;/li&gt;
&lt;li&gt;It provides data from process startup to process destruction&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Ok let's test it out. I was a bit lazy so I asked ChatGPT to help me implement the &lt;code&gt;getrusage&lt;/code&gt; calls for this one. It looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#include &amp;lt;iostream&amp;gt;
#include &amp;lt;sys/resource.h&amp;gt;
int main(int argc, char** argv){
        struct rusage usage;
        struct timeval start_user, start_system, end_user, end_system;
        long long start_user_us, start_system_us, end_user_us, end_system_us;

        // Get starting resource usage
        getrusage(RUSAGE_SELF, &amp;amp;usage);

        start_user = usage.ru_utime;
        start_system = usage.ru_stime;

        // Convert timeval to microseconds for easier calculation
        start_user_us = (start_user.tv_sec * 1000000LL) + start_user.tv_usec;
        start_system_us = (start_system.tv_sec * 1000000LL) + start_system.tv_usec;

        int sum{0};
        for(int i = 0; i &amp;lt; 20; i++){
                sum += i;
        }

        getrusage(RUSAGE_SELF, &amp;amp;usage);

        end_user = usage.ru_utime;
        end_system = usage.ru_stime;

        // Convert to microseconds
        end_user_us = (end_user.tv_sec * 1000000LL) + end_user.tv_usec;
        end_system_us = (end_system.tv_sec * 1000000LL) + end_system.tv_usec;

        // Calculate elapsed time
        long long elapsed_user_us = end_user_us - start_user_us;
        long long elapsed_system_us = end_system_us - start_system_us;
        long long elapsed_total_us = elapsed_user_us + elapsed_system_us;

        // Print results
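        std::cout &amp;lt;&amp;lt; "User CPU time: " &amp;lt;&amp;lt; elapsed_user_us / 1000000.0 &amp;lt;&amp;lt; " seconds" &amp;lt;&amp;lt; std::endl;
        std::cout &amp;lt;&amp;lt; "System CPU time: " &amp;lt;&amp;lt; elapsed_system_us / 1000000.0 &amp;lt;&amp;lt; " seconds" &amp;lt;&amp;lt; std::endl;
        std::cout &amp;lt;&amp;lt; "Total CPU time: " &amp;lt;&amp;lt; elapsed_total_us / 1000000.0 &amp;lt;&amp;lt; " seconds" &amp;lt;&amp;lt; std::endl;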
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And so how long did it take??? Here is the output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User CPU time: 1e-06 seconds
System CPU time: 0 seconds
Total CPU time: 1e-06 seconds
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;1 MICROSECOND! whoa did we get faster??? Or are we wrong?&lt;/p&gt;

&lt;p&gt;Remember the &lt;code&gt;time&lt;/code&gt; command said we spent 1 millisecond in user space. And the whole program took 14 milliseconds from process creation to destruction.&lt;/p&gt;

&lt;p&gt;So who is more right??? One measurement is off from the other by 1000x!!!&lt;/p&gt;

&lt;p&gt;Well, the &lt;code&gt;time&lt;/code&gt; command takes into account more than just the logic in the program. Process creation/destruction can be expensive. &lt;/p&gt;

&lt;p&gt;At the same time, the &lt;code&gt;time&lt;/code&gt; command tends to lose a lot of precision when outputting to users.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;code&gt;clock_t&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;This one is a cool one. Why?&lt;/p&gt;

&lt;p&gt;It measures the CPU time your process has used, counted in clock ticks, rather than reading the wall clock.&lt;/p&gt;

&lt;p&gt;So what exactly is &lt;code&gt;clock_t&lt;/code&gt;? &lt;/p&gt;

&lt;p&gt;It can be an integer or a floating-point type depending on your libc implementation, as long as it is capable of keeping track of clock ticks.&lt;/p&gt;

&lt;p&gt;Clock ticks are an implementation-defined unit of time. They are typically much more granular than milliseconds/seconds.&lt;/p&gt;

&lt;p&gt;In fact, there is a &lt;code&gt;CLOCKS_PER_SEC&lt;/code&gt; macro in C that tells you how many clock ticks make up one second, so you can convert ticks to seconds. This is often set to &lt;code&gt;1,000,000&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This would mean you would get microsecond precision. If it was a larger value, you could get even more precision.&lt;/p&gt;
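
&lt;p&gt;Here is a small sketch of that conversion. The helper names &lt;code&gt;ticks_to_seconds&lt;/code&gt; and &lt;code&gt;ticks_to_microseconds&lt;/code&gt; are mine, the rest is plain &lt;code&gt;ctime&lt;/code&gt;.&lt;/p&gt;

```cpp
#include <ctime>  // std::clock_t, CLOCKS_PER_SEC

// Convert a tick count from clock() into seconds.
double ticks_to_seconds(std::clock_t ticks) {
    return static_cast<double>(ticks) / CLOCKS_PER_SEC;
}

// Convert a tick count into microseconds, whatever CLOCKS_PER_SEC happens to be.
double ticks_to_microseconds(std::clock_t ticks) {
    return ticks_to_seconds(ticks) * 1e6;
}
```

&lt;p&gt;Dividing by &lt;code&gt;CLOCKS_PER_SEC&lt;/code&gt; instead of assuming one tick is one microsecond keeps the math right on systems where the macro has a different value.&lt;/p&gt;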

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Note that this is not the same as clock cycles&lt;/em&gt;&lt;/strong&gt;. Clock cycles are the CPU's internal processor frequency.&lt;/p&gt;

&lt;p&gt;To use &lt;code&gt;clock_t&lt;/code&gt;, you can get the current clock value from the &lt;code&gt;clock()&lt;/code&gt; function.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#include &amp;lt;time.h&amp;gt;
#include &amp;lt;iostream&amp;gt;
int main(int argc, char** argv){
        clock_t start = clock();
        int sum{0};
        for(int i = 0; i &amp;lt; 20; i++){
                sum += i;
        }
        clock_t end = clock();
        std::cout &amp;lt;&amp;lt; (end - start) &amp;lt;&amp;lt; " us"&amp;lt;&amp;lt; std::endl;
        std::cout &amp;lt;&amp;lt; "clocks per sec: " &amp;lt;&amp;lt; CLOCKS_PER_SEC &amp;lt;&amp;lt; std::endl;
        return sum;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If we run this program and observe the results... here it is:&lt;br&gt;
And tadaaa....&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1 us
clocks per sec: 1000000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This for loop takes 1 microsecond according to our measurement. Similar to &lt;code&gt;getrusage&lt;/code&gt;! &lt;/p&gt;

&lt;p&gt;Is this the smallest granularity we can go??? &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fob6g21vxipv8gt27qjml.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fob6g21vxipv8gt27qjml.jpg" alt="tiny violin" width="251" height="201"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Well, I feel like we can do better because even if a cycle were to take 1 nanosecond, that means our code would've taken 1000 cycles.&lt;/p&gt;

&lt;p&gt;1000 cycles to loop through 20 items is INSANE!! &lt;/p&gt;

&lt;p&gt;But with only microsecond precision, we are definitely overshooting with a huge error from the actual program runtime.&lt;/p&gt;

&lt;p&gt;Okay, let's try something else because ticks aren't frequent enough.&lt;/p&gt;

&lt;p&gt;In C++, there is a &lt;code&gt;chrono&lt;/code&gt; library and it seems like it can support nanosecond granularity for its system clock.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;code&gt;chrono&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;chrono&lt;/code&gt; has been around since C++11, which means it's been there for a while now lol. It is a great time library. It can even do timezone conversions in C++20!! (ikr! why'd it take so long...)&lt;/p&gt;

&lt;p&gt;Using &lt;code&gt;chrono&lt;/code&gt;, I can use the high resolution clock to supposedly get nanosecond level precision. Here we go:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#include &amp;lt;iostream&amp;gt;
#include &amp;lt;chrono&amp;gt;
int main(int argc, char** argv){
        auto start = std::chrono::high_resolution_clock::now();
        int sum{0};
        for(int i = 0; i &amp;lt; 20; i++){
                sum += i;
        }
        auto end = std::chrono::high_resolution_clock::now();
        std::chrono::duration&amp;lt;long long, std::nano&amp;gt; duration = end - start;
        std::cout &amp;lt;&amp;lt; "Time taken: " &amp;lt;&amp;lt; duration.count() &amp;lt;&amp;lt; " nanoseconds" &amp;lt;&amp;lt; std::endl;
        return sum;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The code is short and simple. So what does it say about our loop?&lt;br&gt;
Our program outputted:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Time taken: 254 nanoseconds
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Much faster than 1 microsecond. In fact 4x better. And 4000x faster than &lt;code&gt;time&lt;/code&gt;. CRAZY!&lt;/p&gt;

&lt;p&gt;So now is it really 254 nanoseconds?? If we assumed a cycle took a nanosecond, this says that we went through 254 cycles for a 20 item for loop???&lt;/p&gt;

&lt;p&gt;Let's see what this clock is really doing and maybe we can find out why...&lt;/p&gt;

&lt;p&gt;After looking at the GNU/linux C++ source code (libstdc++), I found the implementation of now(). &lt;/p&gt;

&lt;p&gt;It seems like &lt;code&gt;high_resolution_clock&lt;/code&gt; is an alias to either a system clock or steady clock on different versions, but here is the &lt;a href="https://github.com/gcc-mirror/gcc/blob/master/libstdc++-v3/src/c++11/chrono.cc" rel="noopener noreferrer"&gt;system clock implementation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The system clock means a real time clock that users use to perceive time. It can be affected by the calendar and can jump forwards/backwards in time.&lt;/p&gt;

&lt;p&gt;The steady clock is a monotonic clock. It never jumps around and only moves forward at a steady rate.&lt;/p&gt;
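
&lt;p&gt;Since the steady clock cannot jump, it is usually the safer pick for measuring intervals. Here is a minimal sketch of a timing helper built on it. The name &lt;code&gt;time_ns&lt;/code&gt; is made up for this example.&lt;/p&gt;

```cpp
#include <chrono>

// Time a callable with the monotonic clock and return the elapsed nanoseconds.
template <typename Work>
long long time_ns(Work work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count();
}
```

&lt;p&gt;You would call it with a lambda wrapping the code you care about. The result still includes the cost of the two &lt;code&gt;now()&lt;/code&gt; calls themselves.&lt;/p&gt;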

&lt;p&gt;It seems like my system clock gives nanosecond precision. &lt;/p&gt;

&lt;p&gt;When we look at the source code, we see the following system call:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#ifdef _GLIBCXX_USE_CLOCK_GETTIME_SYSCALL
      syscall(SYS_clock_gettime, CLOCK_REALTIME, &amp;amp;tp);
#else
      clock_gettime(CLOCK_REALTIME, &amp;amp;tp);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This means the kernel has to do some work and get us the time. &lt;code&gt;tp&lt;/code&gt; is a &lt;code&gt;timespec&lt;/code&gt; struct and it has nanosecond precision (defined in &lt;code&gt;&amp;lt;time.h&amp;gt;&lt;/code&gt;).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;struct timespec {
    time_t tv_sec;  // seconds
    long   tv_nsec; // nanoseconds
};
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The kernel goes to the system's real time clock with &lt;code&gt;CLOCK_REALTIME&lt;/code&gt; and fills in this struct. This process of switching from user space code to kernel space and executing kernel instructions takes time!&lt;/p&gt;

&lt;p&gt;And in fact, it is affecting us by a lot of nanoseconds. &lt;/p&gt;

&lt;p&gt;Not only that, we don't really know the frequency/precision of our system's real time clock with respect to nanoseconds. Does it update every couple nanoseconds? Or every nanosecond?&lt;/p&gt;

&lt;p&gt;Lots of questions regarding the real time clock. &lt;/p&gt;

&lt;p&gt;There are a lot of factors relating to the hardware and protocol that can impact the frequency, so it is hard to tell.&lt;/p&gt;

&lt;p&gt;Clearly, the &lt;code&gt;clock_gettime()&lt;/code&gt; call used by &lt;code&gt;chrono&lt;/code&gt;'s &lt;code&gt;high_resolution_clock&lt;/code&gt; and &lt;code&gt;system_clock&lt;/code&gt; isn't precise enough for this tiny program.&lt;/p&gt;
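
&lt;p&gt;One way to get a feel for how much of that number is the clock itself is to call &lt;code&gt;now()&lt;/code&gt; twice with nothing in between. The sketch below does that many times and keeps the smallest gap. The function name &lt;code&gt;min_clock_overhead_ns&lt;/code&gt; is hypothetical, and defaulting to &lt;code&gt;steady_clock&lt;/code&gt; is my own choice.&lt;/p&gt;

```cpp
#include <algorithm>
#include <chrono>
#include <limits>

// Smallest gap (in ns) seen between two back-to-back now() calls.
// With samples <= 0 the loop never runs and the maximum long long comes back.
template <typename Clock = std::chrono::steady_clock>
long long min_clock_overhead_ns(int samples) {
    long long best = std::numeric_limits<long long>::max();
    for (int i = 0; i < samples; ++i) {
        const auto a = Clock::now();
        const auto b = Clock::now();
        const long long gap =
            std::chrono::duration_cast<std::chrono::nanoseconds>(b - a).count();
        best = std::min(best, gap);
    }
    return best;
}
```

&lt;p&gt;Whatever this returns on your machine is roughly the floor for anything you time with that clock. To test the clock from the example above, pass &lt;code&gt;std::chrono::high_resolution_clock&lt;/code&gt; as the template argument.&lt;/p&gt;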

&lt;p&gt;SO CAN WE GET MORE PRECISE???????&lt;br&gt;
CAN WE????????&lt;/p&gt;

&lt;p&gt;I can't hear you!!!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnhlhboflui7x8upsw896.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnhlhboflui7x8upsw896.jpg" alt="spongebob louder" width="625" height="417"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Ohhhhhhhhhh......&lt;/p&gt;

&lt;p&gt;Just kidding. This isn't Spongebob. I'll stop with the cliff hangers.&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;code&gt;TSC&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;Introducing the &lt;code&gt;TSC&lt;/code&gt;, the Time Stamp Counter. &lt;/p&gt;

&lt;p&gt;Silly enough, the answer has been in front of you all along since the beginning of this blog.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flpshdsishv3owqxznzwn.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flpshdsishv3owqxznzwn.gif" alt="hahaha" width="398" height="262"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We can use the clock cycles to measure how long our program took. TSC is a monotonic counter that increments for each clock cycle (on modern CPUs it typically ticks at a fixed reference frequency, even when the core speeds up or slows down).&lt;/p&gt;

&lt;p&gt;How do we get access to the TSC?&lt;/p&gt;

&lt;p&gt;Specifically, TSC is a 64-bit register on x86 processors. So... I'm sorry if you are on a different processor (like ARM). You probably have something else.&lt;/p&gt;

&lt;p&gt;But many of the world's standard machines run x86, so this is very relevant.&lt;/p&gt;

&lt;p&gt;Okay, how do we get the 64 bits of data from the x86 register? &lt;/p&gt;

&lt;p&gt;Luckily for us there is an assembly instruction that copies the &lt;code&gt;TSC&lt;/code&gt; value to two 32-bit registers that we can access (&lt;code&gt;EDX&lt;/code&gt; gets the high half and &lt;code&gt;EAX&lt;/code&gt; the low half).&lt;/p&gt;

&lt;p&gt;The instruction is &lt;code&gt;rdtsc&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Also lucky for us, we can embed assembly into our C++. If we do this and then move the register value into our variable, we can use the value.&lt;/p&gt;

&lt;p&gt;Here is what that looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#include &amp;lt;iostream&amp;gt;
int main(int argc, char** argv){
        uint64_t start{0};
        uint64_t end{0};
        __asm__ volatile(
                "rdtsc"
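                :"=A" (start) // caveat: on x86-64 "=A" picks RAX or RDX, not the EDX:EAX pair, so only half of the TSC lands here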
        );
        int sum{0};
        for(int i = 0; i &amp;lt; 20; i++){
                sum += i;
        }
        __asm__ volatile(
                "rdtsc"
                :"=A" (end)
        );
        std::cout &amp;lt;&amp;lt; "Result: " &amp;lt;&amp;lt; (end-start) &amp;lt;&amp;lt; " cycles" &amp;lt;&amp;lt; std::endl;
        return sum;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
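
&lt;p&gt;One caveat with the inline assembly: on 64-bit x86 the &lt;code&gt;"=A"&lt;/code&gt; constraint does not bind the &lt;code&gt;EDX:EAX&lt;/code&gt; pair, so the variable only receives part of the counter. That is fine for short intervals but not something to rely on. A sketch of an alternative is to use the &lt;code&gt;__rdtsc()&lt;/code&gt; intrinsic that GCC and Clang ship in &lt;code&gt;x86intrin.h&lt;/code&gt;. The helper names &lt;code&gt;read_ticks&lt;/code&gt; and &lt;code&gt;ticks_for&lt;/code&gt; are made up, and the &lt;code&gt;steady_clock&lt;/code&gt; fallback for non-x86 machines is my own assumption so the snippet still builds there.&lt;/p&gt;

```cpp
#include <chrono>
#include <cstdint>
#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>  // __rdtsc() on GCC and Clang
#endif

// Read a 64-bit tick counter: the TSC on x86, a steady_clock count elsewhere.
inline std::uint64_t read_ticks() {
#if defined(__x86_64__) || defined(__i386__)
    return __rdtsc();  // the intrinsic returns the full 64-bit counter
#else
    return static_cast<std::uint64_t>(
        std::chrono::steady_clock::now().time_since_epoch().count());
#endif
}

// Ticks spent inside `work`.
template <typename Work>
std::uint64_t ticks_for(Work work) {
    const std::uint64_t start = read_ticks();
    work();
    const std::uint64_t end = read_ticks();
    return end - start;
}
```

&lt;p&gt;On x86 the result is in TSC ticks. On the fallback path it is in whatever unit &lt;code&gt;steady_clock&lt;/code&gt; counts in, so don't compare numbers across the two.&lt;/p&gt;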



&lt;p&gt;And now guess what you think the output is in cycles... How many cycles does this simple for loop of additions take?&lt;/p&gt;

&lt;p&gt;And the answer issss&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Result: 300 cycles
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Alright alright, this isn't as useful unless we know the frequency of our processor cycles.&lt;/p&gt;

&lt;p&gt;If I run &lt;code&gt;lscpu&lt;/code&gt; in my terminal, it tells me a lot of cool things about my cpu/processor...&lt;/p&gt;

&lt;p&gt;In fact it shows me this useful fact:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CPU MHz:                              2495.998
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;2496 Megahertz! That's roughly 2.5 GHz!!! &lt;/p&gt;

&lt;p&gt;If I do &lt;code&gt;1 / (2.5 * 10^9)&lt;/code&gt; to get the number of seconds per cycle, I get roughly 0.4 nanoseconds per cycle!&lt;/p&gt;

&lt;p&gt;If I multiply 0.4 nanoseconds per cycle with 300 cycles, I get 120 nanoseconds!!! &lt;/p&gt;
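
&lt;p&gt;If you'd rather not do that arithmetic by hand, here is the same conversion as a tiny function. The name &lt;code&gt;cycles_to_ns&lt;/code&gt; is made up, and it assumes the counter ticks at the frequency you pass in (the MHz figure from &lt;code&gt;lscpu&lt;/code&gt; in this case).&lt;/p&gt;

```cpp
// Convert a cycle count to nanoseconds, given the clock frequency in MHz.
// 1 MHz is 1e6 cycles per second, so one cycle lasts (1000 / MHz) nanoseconds.
constexpr double cycles_to_ns(unsigned long long cycles, double cpu_mhz) {
    return static_cast<double>(cycles) * 1000.0 / cpu_mhz;
}
```

&lt;p&gt;With the rounded 2500 MHz this gives exactly 120 ns for 300 cycles. With the unrounded 2495.998 MHz it comes out a hair higher, at roughly 120.2 ns.&lt;/p&gt;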

&lt;p&gt;We just halved our time from the previous method of using &lt;code&gt;clock_gettime()&lt;/code&gt; with the system clock! &lt;/p&gt;

&lt;p&gt;We were a lot more precise as we weren't accounting for any system calls and kernel instructions for the most part.&lt;/p&gt;

&lt;p&gt;Wow! We have nanosecond level precision.&lt;/p&gt;

&lt;p&gt;That's not all. What if we compiled our program with &lt;code&gt;-O3&lt;/code&gt; optimization? Could we get a lower number?&lt;/p&gt;

&lt;p&gt;In fact we can....... and get the following results......&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Result: 35 cycles
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;BOOOOOOOOM!!! Faster than the FLASH.... (jk, i've never seen the Flash)&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzroui6kp4dhhf8jn4vme.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzroui6kp4dhhf8jn4vme.jpeg" alt="Flashmeme" width="750" height="554"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Okay calm down buddy. Turns out our program was so simple the compiler precomputed the sum (I think).&lt;/p&gt;

&lt;p&gt;Let's see the assembly just to make sure.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#APP
# 8 "test.cpp" 1
        rdtsc
# 0 "" 2
#NO_APP
        movq    %rax, %rbx
#APP
# 16 "test.cpp" 1
        rdtsc
# 0 "" 2
#NO_APP
        movl    $8, %edx
        movq    %rax, %rbp
... more assembly ...
_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_@PLT
        addq    $8, %rsp
        .cfi_def_cfa_offset 24
        movl    $190, %eax
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Okay, we see the two &lt;code&gt;rdtsc&lt;/code&gt; instructions. BUT THERE IS ONLY 1 INSTRUCTION BETWEEN THEM!&lt;/p&gt;

&lt;p&gt;And the compiler precomputed the sum, stores 190 into &lt;code&gt;%eax&lt;/code&gt;, and returns that. &lt;/p&gt;
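&lt;p&gt;A quick sanity check on that constant, using Python purely as a calculator (this is not part of the original program): adding up 0 through 19 really does give 190.&lt;/p&gt;

```python
# Same loop as the C++ version: add i for i = 0..19.
total = 0
for i in range(20):
    total += i

print(total)         # 190
print(19 * 20 // 2)  # 190, the closed form n * (n + 1) / 2 with n = 19
```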

&lt;p&gt;LOL so that's why our -O3 optimization looks faster. Things got reordered and precomputed. &lt;/p&gt;

&lt;p&gt;So our algorithm/for loop in fact no longer exists.&lt;/p&gt;

&lt;p&gt;Beautiful. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpm5sgv38h9wwqfw8avts.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpm5sgv38h9wwqfw8avts.jpg" alt="Beautiful" width="736" height="413"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As my boss says at work...&lt;br&gt;
"The best kind of code is &lt;strong&gt;&lt;em&gt;no code&lt;/em&gt;&lt;/strong&gt;"&lt;/p&gt;

&lt;h2&gt;
  
  
  The Conclusion
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;time&lt;/code&gt; is gross and too slow&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;getrusage&lt;/code&gt; is somewhat better, though still not very precise, and it can show you user vs system time&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;chrono&lt;/code&gt;'s system_clock is better but has some imprecision. It has nanoseconds!&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;rdtsc&lt;/code&gt; is the GOAT! And using it is simple. Clock cycles are the finest level of precision, even better than nanoseconds.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We started at 1 millisecond and went to 100 nanos. That is a factor of around 10,000x for our precision! (10^-3 to 10^-7)&lt;/p&gt;

&lt;p&gt;I hope you learned how to be insanely precise...&lt;/p&gt;

&lt;p&gt;Now I must leave... See you next time &lt;br&gt;
(or many clock cycles from now)&lt;/p&gt;

&lt;p&gt;Peace &amp;lt;3&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;absterdabster&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyt6r8v99s6tmo95q71nb.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyt6r8v99s6tmo95q71nb.jpg" alt="Absterdabster" width="225" height="225"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>systems</category>
      <category>asm</category>
      <category>programming</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Summarizing "What Every Computer Scientist Should Know About Floating Point Arithmetic"</title>
      <dc:creator>absterdabster</dc:creator>
      <pubDate>Mon, 27 Jan 2025 03:14:58 +0000</pubDate>
      <link>https://dev.to/absterdabster/summarizing-what-every-computer-scientist-should-know-about-floating-point-arithmetic-h0b</link>
      <guid>https://dev.to/absterdabster/summarizing-what-every-computer-scientist-should-know-about-floating-point-arithmetic-h0b</guid>
      <description>&lt;p&gt;Hi again! Have you ever used a floating point number in your code? They usually appear as &lt;code&gt;float&lt;/code&gt; or &lt;code&gt;double&lt;/code&gt;, but essentially they are a data type for representing real numbers (like &lt;code&gt;0.1&lt;/code&gt; or &lt;code&gt;3.14159265359&lt;/code&gt; or &lt;code&gt;123456789 * 10^(-23)&lt;/code&gt;). While they can represent decimals, they can also do whole numbers like &lt;code&gt;1&lt;/code&gt; or &lt;code&gt;12345678&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Regardless of which one you used, there is a chance your code might be in trouble. When you use a number (like &lt;code&gt;0.1&lt;/code&gt;), your computer might not actually be using that number but instead something really close.&lt;/p&gt;

&lt;p&gt;Now multiply your wrong number a few times, add it with a few other wrong numbers, and soon your math is chaos! Your computer isn't actually listening to you.&lt;/p&gt;
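&lt;p&gt;Here is a tiny Python sketch of that drift (Python's &lt;code&gt;float&lt;/code&gt; is a 64-bit double on typical builds, which is an assumption about your interpreter). It peeks at the value actually stored for &lt;code&gt;0.1&lt;/code&gt; and then adds it up ten times.&lt;/p&gt;

```python
from decimal import Decimal

# Decimal(float) reveals the exact binary value that the float really stores.
stored = Decimal(0.1)
print(stored)  # a long number slightly above 0.1

# Add the slightly-wrong 0.1 ten times and the tiny errors pile up.
total = 0.0
for _ in range(10):
    total += 0.1

print(total == 1.0)      # False
print(0.1 + 0.2 == 0.3)  # False
```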

&lt;p&gt;&lt;strong&gt;&lt;em&gt;How do we reduce errors with floating point operations?&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Today I'll be summarizing &lt;a href="https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html#:~:text=Floating%2Dpoint%20representations%20are%20not,%C3%97%20101%20is%20not." rel="noopener noreferrer"&gt;"What Every Computer Scientist Should Know About Floating Point Arithmetic"&lt;/a&gt; by David Goldberg in 1991. Give it a read if you dare.... hehe&lt;/p&gt;

&lt;p&gt;Ok that paper was very long.... so I'mma just cover some of the basic points of what I read. Drop a comment if you want me to try and explain a section from the paper that I didn't cover (I'll try lol).&lt;/p&gt;

&lt;p&gt;Get ready to get smarter :)&lt;/p&gt;

&lt;h2&gt;
  
  
  Representing Floating Points
&lt;/h2&gt;

&lt;p&gt;Floating points are represented with a numerical base 

&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;β\beta&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 (like decimal or binary or hexadecimal), an exponent range 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;emine_{min}&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;e&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;min&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 and 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;emaxe_{max}&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;e&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;ma&lt;/span&gt;&lt;span class="mord mathnormal mtight"&gt;x&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
, and a precision &lt;code&gt;p&lt;/code&gt; (the number of digits). They are represented in scientific notation with a single non-zero digit before the decimal point. &lt;/p&gt;

&lt;p&gt;For example, it would look something like this.&lt;br&gt;

&lt;/p&gt;
&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;d0.d1d2...dp−1∗βe
d_0.d_1d_2...d_{p-1} * \beta^e
&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;d&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;0&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mord"&gt;.&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;d&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;d&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mord"&gt;...&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord 
mathnormal"&gt;d&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;p&lt;/span&gt;&lt;span class="mbin mtight"&gt;−&lt;/span&gt;&lt;span class="mord mtight"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;e&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;


&lt;p&gt;The digits of the significand 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;did_i&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;d&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;i&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 are all in the range 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;0≤di&amp;lt;β0 \leq d_i \lt \beta&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;0&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;≤&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;d&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;i&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;&amp;lt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 where there are &lt;code&gt;p&lt;/code&gt; digits (
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;0≤i&amp;lt;p0 \leq i \lt p&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;0&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;≤&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;i&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;&amp;lt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;p&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
).&lt;/p&gt;

&lt;p&gt;So if 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;β=2\beta = 2&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 and &lt;code&gt;p = 3&lt;/code&gt; and I wanted to represent &lt;code&gt;0.77&lt;/code&gt; it would be something like this.&lt;br&gt;

&lt;/p&gt;
&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;1.10∗2−1
1.10 * 2^{-1}
&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1.10&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;2&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;−&lt;/span&gt;&lt;span class="mord mtight"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;br&gt;
In binary, this is equal to &lt;code&gt;0.110&lt;/code&gt;, which is 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;1/2+1/41/2 + 1/4&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1/2&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;+&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1/4&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
. This is the closest we can get to &lt;code&gt;0.77&lt;/code&gt; with a precision of &lt;code&gt;p = 3&lt;/code&gt; digits in binary. As you can see, floating point representations lie: &lt;code&gt;0.75&lt;/code&gt; is not &lt;code&gt;0.77&lt;/code&gt;, butttt it is &lt;em&gt;close enough&lt;/em&gt;.

&lt;p&gt;If 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;β=10\beta = 10&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;10&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 and &lt;code&gt;p = 3&lt;/code&gt; then &lt;code&gt;2734&lt;/code&gt; would be represented as&lt;br&gt;

&lt;/p&gt;
&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;2.73∗1012.73 * 10^1 &lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;2.73&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;0&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;br&gt;
If we keep saying floating points are "&lt;em&gt;close enough&lt;/em&gt;" for our numbers and we start doing operations on them, eventually our representations will be far from the actual number.
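&lt;p&gt;The two examples above can be reproduced with a rough Python sketch. The function name &lt;code&gt;round_to_precision&lt;/code&gt; is made up for illustration, it only handles positive numbers, and it ignores the exponent range limits. It picks the exponent, rounds to &lt;code&gt;p&lt;/code&gt; digits in the chosen base, and hands back the digits, the exponent, and the value that actually gets stored.&lt;/p&gt;

```python
import math
from fractions import Fraction


def round_to_precision(x, beta, p):
    """Nearest p-digit base-beta float to a positive x.

    Returns (digits, e, value) meaning d0.d1...d(p-1) * beta**e == value.
    """
    x = Fraction(x)
    # Exponent that leaves one non-zero digit before the point.
    # Assumption: x is not within float error of an exact power of beta.
    e = math.floor(math.log(float(x), beta))
    # Shift so that all p digits sit left of the point, then round.
    n = round(x / Fraction(beta) ** (e - (p - 1)))
    if n == beta ** p:  # rounding carried into an extra digit
        n //= beta
        e += 1
    value = n * Fraction(beta) ** (e - (p - 1))
    digits = []
    for _ in range(p):
        n, d = divmod(n, beta)
        digits.insert(0, d)
    return digits, e, value


print(round_to_precision("0.77", 2, 3))  # digits [1, 1, 0], e = -1, value 3/4
print(round_to_precision(2734, 10, 3))   # digits [2, 7, 3], e = 3, value 2730
```

&lt;p&gt;Exact fractions are used on purpose here, so the sketch itself does not suffer from the rounding it is trying to show.&lt;/p&gt;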

&lt;p&gt;Okay, let's measure how far we are from the actual number. AKA, what is the error?&lt;/p&gt;
&lt;h2&gt;
  
  
  Understanding the error
&lt;/h2&gt;

&lt;p&gt;There are two common ways to measure floating point error, or &lt;em&gt;rounding error&lt;/em&gt;: &lt;strong&gt;&lt;em&gt;ulps&lt;/em&gt;&lt;/strong&gt; (units in the last place) and &lt;strong&gt;&lt;em&gt;relative error&lt;/em&gt;&lt;/strong&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  ulps
&lt;/h3&gt;

&lt;p&gt;Units in the last place measure how far the represented number is from the actual number, counted in units of its last digit. To be exact, it can be calculated with this formula, where &lt;code&gt;z&lt;/code&gt; is the actual number we are comparing to.&lt;/p&gt;


&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;ulps=∣d.dd...d−z/βe∣βp−1\text{ulps} = |d.dd...d - z/{\beta^e}|\beta^{p-1}&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord text"&gt;&lt;span class="mord"&gt;ulps&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;∣&lt;/span&gt;&lt;span class="mord mathnormal"&gt;d&lt;/span&gt;&lt;span class="mord"&gt;.&lt;/span&gt;&lt;span class="mord mathnormal"&gt;dd&lt;/span&gt;&lt;span class="mord"&gt;...&lt;/span&gt;&lt;span class="mord mathnormal"&gt;d&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;−&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;z&lt;/span&gt;&lt;span class="mord"&gt;/&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;e&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mord"&gt;∣&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mathnormal 
mtight"&gt;p&lt;/span&gt;&lt;span class="mbin mtight"&gt;−&lt;/span&gt;&lt;span class="mord mtight"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;


&lt;p&gt;If this looks confusing, let's just do an example and it'll be a lot easier. Let's say we have a number &lt;code&gt;314.09&lt;/code&gt; and our &lt;code&gt;z = 314.1592653589&lt;/code&gt;.&lt;/p&gt;


&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;314.09=3.1409∗102314.09 = 3.1409 * 10^2&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;314.09&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;3.1409&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;0&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;


&lt;p&gt;From this we know, 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;β=10\beta = 10&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;10&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
, &lt;code&gt;p = 5&lt;/code&gt;, and &lt;code&gt;e = 2&lt;/code&gt;.&lt;/p&gt;


&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;ulps=∣3.1409−314.1592653589/102∣104\text{ulps} = |3.1409 - 314.1592653589/{10^2}|{10^{4}}&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord text"&gt;&lt;span class="mord"&gt;ulps&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;∣3.1409&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;−&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;314.1592653589/&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;0&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mord"&gt;∣&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;0&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;4&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;br&gt;

&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;=∣3.1409−3.14159653989∣104= |3.1409 - 3.14159653989|{10^{4}}&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;∣3.1409&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;−&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;3.14159653989∣&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;0&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;4&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;br&gt;

&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;=∣−0.00069653989∣104= |-0.00069653989|{10^{4}}&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;∣&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;−&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;0.00069653989∣&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;0&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;4&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;br&gt;

&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;=6.9653989= 6.9653989&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;6.9653989&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;


&lt;p&gt;This has roughly &lt;code&gt;ulps = 6.965&lt;/code&gt;.&lt;/p&gt;
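&lt;p&gt;If you'd rather let the computer do the arithmetic, here is a minimal Python sketch of the ulps formula. The function name &lt;code&gt;ulps_error&lt;/code&gt; is made up for illustration, and it uses the &lt;code&gt;decimal&lt;/code&gt; module so that the check itself isn't thrown off by binary rounding.&lt;/p&gt;

```python
from decimal import Decimal


def ulps_error(significand, z, beta, e, p):
    """|d.dd...d - z / beta**e| * beta**(p - 1), using decimal arithmetic."""
    d = Decimal(significand)
    z = Decimal(z)
    return abs(d - z / Decimal(beta) ** e) * Decimal(beta) ** (p - 1)


# 314.09 = 3.1409 * 10^2 compared against z = 314.1592653589
err = ulps_error("3.1409", "314.1592653589", beta=10, e=2, p=5)
print(err)
```

&lt;p&gt;Passing the numbers in as strings matters: writing &lt;code&gt;Decimal(3.1409)&lt;/code&gt; would first turn the literal into a binary float and drag its rounding error along.&lt;/p&gt;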

&lt;h3&gt;
  
  
  Relative error
&lt;/h3&gt;

&lt;p&gt;Relative error puts the absolute error in proportion to the magnitude of the real number. &lt;/p&gt;

&lt;p&gt;Super simple. First take the absolute error (the difference between the actual and the representation).&lt;br&gt;

&lt;/p&gt;
&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;absolute error=∣(d.dd..d∗βe)−z∣\text{absolute error} = |(d.dd..d * \beta^e) - z| &lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord text"&gt;&lt;span class="mord"&gt;absolute error&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;∣&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord mathnormal"&gt;d&lt;/span&gt;&lt;span class="mord"&gt;.&lt;/span&gt;&lt;span class="mord mathnormal"&gt;dd&lt;/span&gt;&lt;span class="mord"&gt;..&lt;/span&gt;&lt;span class="mord mathnormal"&gt;d&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;e&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;−&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;z&lt;/span&gt;&lt;span class="mord"&gt;∣&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;


&lt;p&gt;Now divide that by the real number to put it into proportion.&lt;br&gt;

&lt;/p&gt;
&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;relative error=absolute error/z\text{relative error} = \text{absolute error}/z &lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord text"&gt;&lt;span class="mord"&gt;relative error&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord text"&gt;&lt;span class="mord"&gt;absolute error&lt;/span&gt;&lt;/span&gt;&lt;span class="mord"&gt;/&lt;/span&gt;&lt;span class="mord mathnormal"&gt;z&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;br&gt;

&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;relative error=∣(d.dd..d∗βe)−z∣/z\text{relative error} = |(d.dd..d * \beta^e) - z|/z &lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord text"&gt;&lt;span class="mord"&gt;relative error&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;∣&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord mathnormal"&gt;d&lt;/span&gt;&lt;span class="mord"&gt;.&lt;/span&gt;&lt;span class="mord mathnormal"&gt;dd&lt;/span&gt;&lt;span class="mord"&gt;..&lt;/span&gt;&lt;span class="mord mathnormal"&gt;d&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;e&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;−&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;z&lt;/span&gt;&lt;span class="mord"&gt;∣/&lt;/span&gt;&lt;span class="mord mathnormal"&gt;z&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;


&lt;p&gt;The idea is that an error matters less the larger the number is. If I wanted to buy a million dollar home (if I'm ever rich enough), I wouldn't really mind a difference of $10. &lt;/p&gt;
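
&lt;p&gt;Here is a minimal Python sketch of the two definitions above. The function names and the house price are just made up for illustration, and I'm using &lt;code&gt;Fraction&lt;/code&gt; so the arithmetic itself doesn't add any rounding error.&lt;/p&gt;

```python
from fractions import Fraction


def absolute_error(represented, actual):
    """Distance between the stored value and the real value."""
    return abs(represented - actual)


def relative_error(represented, actual):
    """Absolute error scaled by the size of the real value."""
    return absolute_error(represented, actual) / abs(actual)


# Hypothetical numbers: a million dollar home priced 10 dollars off.
home = Fraction(1_000_000)
print(absolute_error(home + 10, home))         # 10
print(float(relative_error(home + 10, home)))  # 1e-05
```

&lt;p&gt;The same 10 dollar absolute error on a 20 dollar purchase would be a relative error of 0.5, which is why the relative error is the more useful measure here.&lt;/p&gt;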

&lt;h2&gt;
  
  
  Converting 0.5 ulps to relative error
&lt;/h2&gt;

&lt;p&gt;Let's say I have a number represented as 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;d.dd...dd∗βed.dd...dd * \beta^e&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;d&lt;/span&gt;&lt;span class="mord"&gt;.&lt;/span&gt;&lt;span class="mord mathnormal"&gt;dd&lt;/span&gt;&lt;span class="mord"&gt;...&lt;/span&gt;&lt;span class="mord mathnormal"&gt;dd&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;e&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
. If this number had an error of 0.5 ulps, the absolute error would be bounded by 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;0.00...00β′0.00...00\beta'&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;0.00...00&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;′&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 where 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;β′=β/2\beta' = \beta/2&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;′&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mord"&gt;/2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
. &lt;/p&gt;
&lt;h3&gt;
  
  
  Convincing you about the 0.5 ulps absolute error
&lt;/h3&gt;

&lt;p&gt;Let me convince you. Let's say we're in base 10 (
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;β=10\beta=10 &lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;10&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
) and we had &lt;code&gt;p=3&lt;/code&gt; with the number &lt;code&gt;9.97&lt;/code&gt;. If the actual number was rounded correctly to get this representation, it must have been &lt;code&gt;&amp;gt;= 9.965&lt;/code&gt; and &lt;code&gt;&amp;lt; 9.975&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;So the actual number can be at most &lt;code&gt;0.005&lt;/code&gt; away from &lt;code&gt;9.97&lt;/code&gt;. This error is also 0.5 ulps. It is also the same as 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;0.00...00β′0.00...00\beta'&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;0.00...00&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;′&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 because 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;β′=β/2=5\beta' = \beta/2 = 5&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;′&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mord"&gt;/2&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;5&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
. &lt;/p&gt;

&lt;p&gt;This might seem obvious in base 10, but let's try something in base 2 (
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;β=2\beta=2&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
). Let's say we had &lt;code&gt;1.11&lt;/code&gt; (&lt;code&gt;1.75&lt;/code&gt; in decimal) when 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;β=2\beta=2&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 and &lt;code&gt;p = 3&lt;/code&gt;. However, the real number before rounding could have been anywhere between &lt;code&gt;1.101&lt;/code&gt; and &lt;code&gt;1.111&lt;/code&gt;. The distance from &lt;code&gt;1.11&lt;/code&gt; to either end of that range is &lt;code&gt;0.001&lt;/code&gt; in binary &lt;code&gt;ex. (1.111 - 1.11)&lt;/code&gt;. That is 0.1 ulps written in binary, which is 0.5 ulps in decimal, which is the same as 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;0.00...00β′0.00...00\beta'&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;0.00...00&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;′&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 because 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;β′=β/2=1\beta' = \beta/2 = 1&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;′&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mord"&gt;/2&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
.&lt;/p&gt;

&lt;p&gt;Therefore, if a number was rounded correctly to its nearest representation, it would have an error of at most 0.5 ulps.&lt;/p&gt;

&lt;p&gt;In other words, 0.5 ulps is always equal to 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;((β/2)β−p)∗βe((\beta/2)\beta^{-p}) * \beta^e &lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mopen"&gt;((&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mord"&gt;/2&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;−&lt;/span&gt;&lt;span class="mord mathnormal mtight"&gt;p&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;e&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
. This would be the absolute error of 0.5 ulps.&lt;/p&gt;
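
&lt;p&gt;To double check the formula against the two examples above, here is a small Python sketch. The helper name &lt;code&gt;half_ulp&lt;/code&gt; is my own, and I'm assuming &lt;code&gt;e = 0&lt;/code&gt; for both examples since &lt;code&gt;9.97&lt;/code&gt; and binary &lt;code&gt;1.11&lt;/code&gt; are already in the form &lt;code&gt;d.dd&lt;/code&gt;.&lt;/p&gt;

```python
from fractions import Fraction


def half_ulp(beta, p, e):
    """Absolute error of 0.5 ulps: ((beta / 2) * beta**-p) * beta**e."""
    return Fraction(beta, 2) * Fraction(beta) ** (-p) * Fraction(beta) ** e


# Base 10, p = 3, the number 9.97 = 9.97 * 10**0, so e = 0.
print(half_ulp(10, 3, 0))  # 1/200, which is 0.005

# Base 2, p = 3, the number 1.11 (binary) = 1.11 * 2**0, so e = 0.
print(half_ulp(2, 3, 0))   # 1/8, which is 0.001 in binary
```

&lt;p&gt;Both results match the half-widths we found by hand: &lt;code&gt;0.005&lt;/code&gt; in base 10 and &lt;code&gt;0.001&lt;/code&gt; in base 2.&lt;/p&gt;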
&lt;h3&gt;
  
  
  Onto the relative error
&lt;/h3&gt;

&lt;p&gt;Now, what is 0.5 ulps in relative error? Remember that relative error was absolute error divided by the actual number. &lt;/p&gt;

&lt;p&gt;We just said that the absolute error when we have 0.5 ulps is 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;((β/2)β−p)∗βe((\beta/2)\beta^{-p}) * \beta^e &lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mopen"&gt;((&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mord"&gt;/2&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;−&lt;/span&gt;&lt;span class="mord mathnormal mtight"&gt;p&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;e&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
. But the actual number could be anything. &lt;/p&gt;

&lt;p&gt;Specifically, it could be anywhere in the range between 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;1∗βe1 * \beta^e&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;e&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 and 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;β∗βe\beta * \beta^e &lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;e&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
. So, if 0.5 ulps was &lt;code&gt;0.001&lt;/code&gt; in binary (
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;1∗2−3∗201*2^{-3} *2^0&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;2&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;−&lt;/span&gt;&lt;span class="mord mtight"&gt;3&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;2&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;0&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
), then &lt;code&gt;p=3&lt;/code&gt; and &lt;code&gt;e=0&lt;/code&gt;. In that case the real number must have been between 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;1∗201 * 2^0 &lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;2&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;0&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 and 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;2∗202*2^0 &lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;2&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;2&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;0&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 which would be &lt;code&gt;1&lt;/code&gt; and &lt;code&gt;2&lt;/code&gt; in decimal or &lt;code&gt;1&lt;/code&gt; and &lt;code&gt;10&lt;/code&gt; in binary.&lt;/p&gt;

&lt;p&gt;From our previous example, we can see that is true. &lt;code&gt;1.11&lt;/code&gt; (&lt;code&gt;1.75&lt;/code&gt; decimal) was in fact between &lt;code&gt;1&lt;/code&gt; and &lt;code&gt;10&lt;/code&gt; in binary. &lt;/p&gt;

&lt;p&gt;Cool, so we set some bounds on what the actual number could have been, namely: 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;1∗βe1 * \beta^e&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;e&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 and 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;β∗βe\beta * \beta^e &lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;e&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
.&lt;/p&gt;

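&lt;p&gt;This means we can set bounds for the relative error for 0.5 ulps. So let's divide the absolute error by the bounds of the real number. Dividing by the smallest possible number gives the largest relative error, and dividing by the largest possible number gives the smallest.&lt;/p&gt;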
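
&lt;p&gt;The division described above can be sketched in a few lines of Python. This is only an illustration under my own naming (&lt;code&gt;relative_error_bounds&lt;/code&gt; is a made-up helper), using exact fractions so we can compare the results with the algebra.&lt;/p&gt;

```python
from fractions import Fraction


def relative_error_bounds(beta, p, e):
    """Return (lower, upper) relative error for an absolute error of 0.5 ulps."""
    half_ulp = Fraction(beta, 2) * Fraction(beta) ** (-p) * Fraction(beta) ** e
    smallest = Fraction(beta) ** e          # 1 * beta**e
    largest = beta * Fraction(beta) ** e    # beta * beta**e
    # Dividing by the largest number gives the lower bound, and vice versa.
    return half_ulp / largest, half_ulp / smallest


lower, upper = relative_error_bounds(10, 3, 0)
print(lower, upper)  # 1/2000 1/200
```

&lt;p&gt;Changing &lt;code&gt;e&lt;/code&gt; does not change the result, because the exponent term appears in both the absolute error and the number we divide by, so it cancels out.&lt;/p&gt;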

&lt;p&gt;Upper bound of relative error: &lt;br&gt;

&lt;/p&gt;
&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;((β/2)β−p)∗βe(1∗βe)\frac{((\beta/2)\beta^{-p}) * \beta^e}{(1*\beta^e)} &lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mopen nulldelimiter"&gt;&lt;/span&gt;&lt;span class="mfrac"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;e&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="frac-line"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mopen"&gt;((&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mord"&gt;/2&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;−&lt;/span&gt;&lt;span class="mord 
mathnormal mtight"&gt;p&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;e&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose nulldelimiter"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;br&gt;

&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;=(β2)β−p= (\frac{\beta}{2})\beta^{-p} &lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mopen nulldelimiter"&gt;&lt;/span&gt;&lt;span class="mfrac"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="frac-line"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose nulldelimiter"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;−&lt;/span&gt;&lt;span class="mord mathnormal mtight"&gt;p&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;


&lt;p&gt;Lower bound of relative error:&lt;br&gt;

&lt;/p&gt;
&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;((β/2)β−p)∗βe(β∗βe)\frac{((\beta/2)\beta^{-p}) * \beta^e}{(\beta*\beta^e)} &lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mopen nulldelimiter"&gt;&lt;/span&gt;&lt;span class="mfrac"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;e&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="frac-line"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mopen"&gt;((&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mord"&gt;/2&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;−&lt;/span&gt;&lt;span 
class="mord mathnormal mtight"&gt;p&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;e&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose nulldelimiter"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;br&gt;

&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;=(12)β−p= (\frac{1}{2})\beta^{-p} &lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mopen nulldelimiter"&gt;&lt;/span&gt;&lt;span class="mfrac"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="frac-line"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose nulldelimiter"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;−&lt;/span&gt;&lt;span class="mord mathnormal mtight"&gt;p&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;


&lt;p&gt;Therefore,&lt;br&gt;

&lt;/p&gt;
&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;(12)β−p&amp;lt;0.5ulps≤(β2)β−p(\frac{1}{2})\beta^{-p} \lt 0.5 ulps \le (\frac{\beta}{2})\beta^{-p} &lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mopen nulldelimiter"&gt;&lt;/span&gt;&lt;span class="mfrac"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="frac-line"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose nulldelimiter"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;−&lt;/span&gt;&lt;span class="mord mathnormal mtight"&gt;p&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;&amp;lt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span 
class="mord"&gt;0.5&lt;/span&gt;&lt;span class="mord mathnormal"&gt;u&lt;/span&gt;&lt;span class="mord mathnormal"&gt;lp&lt;/span&gt;&lt;span class="mord mathnormal"&gt;s&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;≤&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mopen nulldelimiter"&gt;&lt;/span&gt;&lt;span class="mfrac"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="frac-line"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose nulldelimiter"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;−&lt;/span&gt;&lt;span class="mord mathnormal mtight"&gt;p&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;


&lt;h2&gt;
  
  
  Machine epsilon
&lt;/h2&gt;

&lt;p&gt;The upper bound on the relative error corresponding to 0.5 ulps is called &lt;strong&gt;&lt;em&gt;machine epsilon&lt;/em&gt;&lt;/strong&gt;. This is the largest relative error that rounding to the nearest representable number can produce for a given base and precision.&lt;/p&gt;


&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;ϵ=(β2)β−p\epsilon = (\frac{\beta}{2})\beta^{-p} &lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;ϵ&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mopen nulldelimiter"&gt;&lt;/span&gt;&lt;span class="mfrac"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="frac-line"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose nulldelimiter"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;−&lt;/span&gt;&lt;span class="mord mathnormal 
mtight"&gt;p&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;


&lt;p&gt;A larger precision &lt;code&gt;p&lt;/code&gt;, as expected, implies a smaller machine epsilon and therefore a smaller relative error. We also notice that the relative error corresponding to 0.5 ulps is bounded above by machine epsilon and below by 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;(12)β−p(\frac{1}{2})\beta^{-p}&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mopen nulldelimiter"&gt;&lt;/span&gt;&lt;span class="mfrac"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="frac-line"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose nulldelimiter"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;−&lt;/span&gt;&lt;span class="mord mathnormal mtight"&gt;p&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
. These two bounds differ by a factor of 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;β\beta&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 which we call &lt;strong&gt;&lt;em&gt;wobble&lt;/em&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Yeahhh... &lt;a href="https://youtu.be/qd6UI6wEIsU?si=IjEh1G1skV3GNuiT" rel="noopener noreferrer"&gt;wobble baby wobble baby&lt;/a&gt;...&lt;/p&gt;

&lt;p&gt;BTW, machine epsilon is such a cool name. I'm not judging if you name your dog or child machine epsilon.&lt;/p&gt;
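&lt;p&gt;To put numbers on the two bounds above, here is a minimal Python sketch. The helper name &lt;code&gt;rounding_error_bounds&lt;/code&gt; is made up for this illustration, and the last check assumes your Python floats are IEEE 754 doubles (base 2, 53 bits of precision), which is the usual case.&lt;/p&gt;

```python
from fractions import Fraction
import sys


def rounding_error_bounds(beta, p):
    """Return (lower, upper) bounds on the relative error of 0.5 ulps.

    Hypothetical helper for illustration: upper is machine epsilon as
    defined in this article, (beta / 2) * beta^-p.
    """
    lower = Fraction(1, 2) * Fraction(beta) ** -p
    upper = Fraction(beta, 2) * Fraction(beta) ** -p
    return lower, upper


# Decimal with three significant digits
lower, eps = rounding_error_bounds(10, 3)
print(float(lower), float(eps), eps / lower)  # 0.0005 0.005 10

# Assumption: floats are IEEE 754 doubles, so beta = 2 and p = 53
_, eps_double = rounding_error_bounds(2, 53)
print(eps_double == Fraction(sys.float_info.epsilon) / 2)  # True
```

&lt;p&gt;Notice the factor of 2 in the last line: &lt;code&gt;sys.float_info.epsilon&lt;/code&gt; is the gap between 1.0 and the next float, which is twice the machine epsilon as defined in this article. Both conventions are in common use, so it's worth checking which one a library means.&lt;/p&gt;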
&lt;h3&gt;
  
  
  Relative errors with machine epsilon
&lt;/h3&gt;

&lt;p&gt;Remember that machine epsilon is the upper bound on the relative error of rounding, that is, of an error of 0.5 ulps. So if the actual relative error is lower, we can express it as a multiple &lt;code&gt;k&lt;/code&gt; of machine epsilon like this:&lt;br&gt;

&lt;/p&gt;
&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;rel. error=k∗ϵ\text{rel. error} = k * \epsilon &lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord text"&gt;&lt;span class="mord"&gt;rel. error&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;k&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;ϵ&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;


&lt;p&gt;Let's do an example. If I had the number &lt;code&gt;3.14159&lt;/code&gt; to represent with 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;β=10\beta=10&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;10&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 and &lt;code&gt;p = 3&lt;/code&gt;, I would have to round to &lt;code&gt;3.14&lt;/code&gt;. This has an absolute error of &lt;code&gt;0.00159&lt;/code&gt;, or 0.159 ulps. For the relative error, I compute 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;0.00159/3.141590.00159/3.14159&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;0.00159/3.14159&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 which gives a relative error of roughly &lt;code&gt;0.0005&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Now, to find the ratio, we must find the machine epsilon:&lt;br&gt;

&lt;/p&gt;
&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;ϵ=(β2)β−p\epsilon = (\frac{\beta}{2})\beta^{-p} &lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;ϵ&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mopen nulldelimiter"&gt;&lt;/span&gt;&lt;span class="mfrac"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="frac-line"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose nulldelimiter"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;−&lt;/span&gt;&lt;span class="mord mathnormal 
mtight"&gt;p&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;br&gt;

&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;=5∗10−3=0.005= 5 * 10^{-3} = 0.005&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;5&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;0&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;−&lt;/span&gt;&lt;span class="mord mtight"&gt;3&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;0.005&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;


&lt;p&gt;So... the ratio is:&lt;br&gt;

&lt;/p&gt;
&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;k=rel. error/ϵ=0.0005/0.005=0.1k = \text{rel. error}/\epsilon = 0.0005/0.005 = 0.1 &lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;k&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord text"&gt;&lt;span class="mord"&gt;rel. error&lt;/span&gt;&lt;/span&gt;&lt;span class="mord"&gt;/&lt;/span&gt;&lt;span class="mord mathnormal"&gt;ϵ&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;0.0005/0.005&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;0.1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;


&lt;p&gt;So we say that the relative error is 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;0.1ϵ0.1\epsilon&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;0.1&lt;/span&gt;&lt;span class="mord mathnormal"&gt;ϵ&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
.&lt;/p&gt;
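&lt;p&gt;If you want to check the arithmetic of this example yourself, here is a small sketch using exact fractions. The function &lt;code&gt;error_report&lt;/code&gt; and its &lt;code&gt;exponent&lt;/code&gt; parameter are names I made up for this illustration; one ulp is taken to be &lt;code&gt;beta^(exponent - (p - 1))&lt;/code&gt;.&lt;/p&gt;

```python
from fractions import Fraction


def error_report(exact, approx, beta, p, exponent):
    """Return (error in ulps, relative error, k) for approx standing in for exact.

    Hypothetical helper: exponent is the e in d.dd... * beta^e for approx.
    """
    exact, approx = Fraction(exact), Fraction(approx)
    ulp = Fraction(beta) ** (exponent - (p - 1))    # one unit in the last place
    abs_error = abs(exact - approx)
    rel_error = abs_error / abs(exact)
    eps = Fraction(beta, 2) * Fraction(beta) ** -p  # machine epsilon
    return abs_error / ulp, rel_error, rel_error / eps


ulps, rel, k = error_report("3.14159", "3.14", beta=10, p=3, exponent=0)
print(float(ulps), round(float(rel), 6), round(float(k), 3))  # 0.159 0.000506 0.101
```

&lt;p&gt;Without the intermediate rounding, &lt;code&gt;k&lt;/code&gt; comes out as about 0.101, which agrees with the hand-rounded 0.1 from above.&lt;/p&gt;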
&lt;h2&gt;
  
  
  The &lt;strong&gt;&lt;em&gt;Wobble&lt;/em&gt;&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Get ready for things to wobble. First let me show you how ulps and relative error react to each other.&lt;/p&gt;

&lt;p&gt;Using &lt;code&gt;1.0&lt;/code&gt; to represent &lt;code&gt;1.04&lt;/code&gt; in decimal with &lt;code&gt;p = 2&lt;/code&gt; has an error of 0.4 ulps and a relative error of 0.038. The machine epsilon is &lt;code&gt;0.05&lt;/code&gt;, which makes the relative error 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;0.76ϵ0.76\epsilon&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;0.76&lt;/span&gt;&lt;span class="mord mathnormal"&gt;ϵ&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
.&lt;/p&gt;

&lt;p&gt;Great! Hopefully this made sense so far.&lt;/p&gt;

&lt;p&gt;Now, let's multiply our number by, say, &lt;code&gt;8&lt;/code&gt;. The actual number would be &lt;code&gt;8.32&lt;/code&gt; while the calculated number would be &lt;code&gt;8.0&lt;/code&gt;. This is an error of 3.2 ulps, which is 8 times larger than before! However, our relative error is still 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;0.32/8.32=0.0380.32/8.32 = 0.038&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;0.32/8.32&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;0.038&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 which is the same as 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;0.76ϵ0.76\epsilon&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;0.76&lt;/span&gt;&lt;span class="mord mathnormal"&gt;ϵ&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
.&lt;/p&gt;

&lt;p&gt;Whoa! Our ulps increased, but our relative error was the same? &lt;/p&gt;

&lt;p&gt;Yep. It turns out that whenever you have a fixed relative error, your ulps can wobble by a factor of 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;β\beta&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
. &lt;/p&gt;

&lt;p&gt;On the other hand, whenever we have a fixed error in ulps (like we showed earlier with 0.5 ulps), the relative error has bounds that show it can also wobble by a factor of 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;β\beta&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
.&lt;/p&gt;
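&lt;p&gt;The two cases above can be replayed in a few lines of Python. This is only a sketch: &lt;code&gt;ulps_and_k&lt;/code&gt; is a made-up helper, and &lt;code&gt;p = 2&lt;/code&gt; is inferred from the machine epsilon of 0.05 used in the example.&lt;/p&gt;

```python
from fractions import Fraction


def ulps_and_k(exact, approx, beta, p, exponent):
    """Return (error in ulps, k), where k = relative error / machine epsilon.

    Hypothetical helper: exponent is the e in d.d... * beta^e for approx.
    """
    exact, approx = Fraction(exact), Fraction(approx)
    ulp = Fraction(beta) ** (exponent - (p - 1))
    eps = Fraction(beta, 2) * Fraction(beta) ** -p
    abs_error = abs(exact - approx)
    return abs_error / ulp, (abs_error / abs(exact)) / eps


# Assumption: beta = 10 and p = 2, matching the machine epsilon of 0.05
before = ulps_and_k("1.04", "1.0", beta=10, p=2, exponent=0)
after = ulps_and_k("8.32", "8.0", beta=10, p=2, exponent=0)

print(float(before[0]), float(after[0]))  # 0.4 3.2
print(before[1] == after[1])              # True
```

&lt;p&gt;The error in ulps grows by 8 while &lt;code&gt;k&lt;/code&gt; stays exactly the same. Computed exactly, &lt;code&gt;k&lt;/code&gt; is 10/13, or about 0.77; the 0.76 above comes from rounding the relative error to 0.038 first.&lt;/p&gt;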

&lt;p&gt;So, the smaller the 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;β\beta&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
, the smaller the wobble and the tighter the error bounds! Using binary, the smallest possible base, keeps the wobble down to a factor of 2.&lt;/p&gt;
&lt;h3&gt;
  
  
  Contaminated digits
&lt;/h3&gt;

&lt;p&gt;We now know that the error in ulps and the relative error's ratio &lt;code&gt;k&lt;/code&gt; can differ from each other by a factor of 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;β\beta&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
, the wobble. As a result, we can estimate the number of contaminated digits (the number of incorrect digits from the correct representation of the number).&lt;/p&gt;


&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;contaminated digits≈log⁡βn\text{contaminated digits} \approx \log_{\beta}{n} &lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord text"&gt;&lt;span class="mord"&gt;contaminated digits&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;≈&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mop"&gt;&lt;span class="mop"&gt;lo&lt;span&gt;g&lt;/span&gt;&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;β&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;n&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;


&lt;p&gt;&lt;code&gt;n&lt;/code&gt; is the number of ulps. &lt;code&gt;n&lt;/code&gt; can also mean &lt;code&gt;k&lt;/code&gt;, the ratio between the relative error and 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;ϵ\epsilon&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;ϵ&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
. Either one works for this rough estimate, because the two only differ by the wobble factor.&lt;/p&gt;

&lt;p&gt;So if I had the decimal number &lt;code&gt;3.10&lt;/code&gt; with &lt;code&gt;p=3&lt;/code&gt; standing in for &lt;code&gt;3.1415&lt;/code&gt;, it would have an error of 4.15 ulps. The contaminated digits would be roughly 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;log⁡104.15\log_{10}{4.15}&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mop"&gt;&lt;span class="mop"&gt;lo&lt;span&gt;g&lt;/span&gt;&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;10&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;4.15&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 which is roughly &lt;code&gt;0.61804809&lt;/code&gt; digits. &lt;/p&gt;

&lt;p&gt;LOL we can't have partial digits! We'll see that when pigs fly.&lt;/p&gt;

&lt;p&gt;Looking at the number itself, we can see that it is wrong in 1 digit, the last one, which is pretty close to what we got from our calculation.&lt;/p&gt;
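&lt;p&gt;Here is the same estimate as a tiny Python sketch. The function name &lt;code&gt;contaminated_digits&lt;/code&gt; is made up for this illustration, and rounding up to a whole digit is my own choice to get around the "partial digits" problem.&lt;/p&gt;

```python
import math


def contaminated_digits(n, beta):
    """Estimate contaminated digits as log_beta(n), with n the error in ulps."""
    return math.log(n, beta)


estimate = contaminated_digits(4.15, 10)
print(round(estimate, 4))              # 0.618
print(max(0, math.ceil(estimate)))     # 1
```

&lt;p&gt;Rounding the estimate up gives 1 whole digit, which matches what we see in &lt;code&gt;3.10&lt;/code&gt; versus &lt;code&gt;3.1415&lt;/code&gt;.&lt;/p&gt;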
&lt;h2&gt;
  
  
  Guard digits
&lt;/h2&gt;

&lt;p&gt;Let's subtract 2 values when 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;β=10\beta = 10&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;10&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 and &lt;code&gt;p=3&lt;/code&gt;.&lt;/p&gt;


&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;x=1.01∗100x = 1.01 * 10^{0}&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;x&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1.01&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;0&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;0&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;br&gt;

&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;y=9.93∗10−1y = 9.93 * 10^{-1}&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;y&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;9.93&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;0&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;−&lt;/span&gt;&lt;span class="mord mtight"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;br&gt;

&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;x−y=1.01−0.99=0.02x - y = 1.01 - 0.99 = 0.02&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;x&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;−&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;y&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1.01&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;−&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;0.99&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;0.02&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;


&lt;p&gt;It becomes &lt;code&gt;0.99&lt;/code&gt; and not &lt;code&gt;0.993&lt;/code&gt; because we had to lose some data with &lt;code&gt;p=3&lt;/code&gt; so that they could be subtracted from each other at the same 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;βe\beta^{e}&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;e&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
.&lt;/p&gt;

&lt;p&gt;As you know, the actual answer is &lt;code&gt;0.017&lt;/code&gt;, but the computed answer ended up being &lt;code&gt;0.02&lt;/code&gt;. So 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;2.00∗10−22.00 * 10^{-2}&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;2.00&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;0&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;−&lt;/span&gt;&lt;span class="mord mtight"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 and 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;1.70∗10−21.70 * 10^{-2}&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1.70&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;0&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;−&lt;/span&gt;&lt;span class="mord mtight"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 differ by 30 ulps!&lt;/p&gt;
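&lt;p&gt;Here is a minimal Python sketch of the subtraction above, so you can play with the numbers yourself. The helper names &lt;code&gt;subtract_no_guard&lt;/code&gt; and &lt;code&gt;error_in_ulps&lt;/code&gt; are made up for this illustration, and the sketch assumes &lt;code&gt;x&lt;/code&gt; is the larger operand. It uses the standard &lt;code&gt;decimal&lt;/code&gt; module only to get exact base-10 digits; it is not how real hardware does it.&lt;/p&gt;

```python
from decimal import Decimal, ROUND_DOWN

def subtract_no_guard(x, y, p):
    """Compute x - y, keeping only p digits of y once it is aligned to x."""
    # x.adjusted() is the exponent of x's leading digit, so this is the
    # weight of x's last (p-th) digit. Assumes x is the larger operand.
    last_digit = Decimal(1).scaleb(x.adjusted() - (p - 1))
    # Aligning y to x's exponent drops every digit of y below that weight.
    y_aligned = y.quantize(last_digit, rounding=ROUND_DOWN)
    return x - y_aligned

def error_in_ulps(computed, exact, p):
    """Distance between computed and exact, in units of computed's last place."""
    ulp = Decimal(1).scaleb(computed.adjusted() - (p - 1))
    return abs(computed - exact) / ulp

x = Decimal("1.01")
y = Decimal("0.993")
computed = subtract_no_guard(x, y, p=3)
exact = x - y
ulps = int(error_in_ulps(computed, exact, p=3))
print(computed, exact, ulps)  # prints: 0.02 0.017 30
```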

&lt;p&gt;The relative error from this kind of subtraction can be as large as 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;β−1\beta - 1&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;−&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
. Let me show you why. &lt;/p&gt;

&lt;p&gt;If 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;x=1.00...00,y=ρ.ρρ...ρρ∗β−1x=1.00...00, y=\rho.\rho\rho...\rho\rho * \beta^{-1}&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;x&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1.00...00&lt;/span&gt;&lt;span class="mpunct"&gt;,&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;y&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;ρ&lt;/span&gt;&lt;span class="mord"&gt;.&lt;/span&gt;&lt;span class="mord mathnormal"&gt;ρρ&lt;/span&gt;&lt;span class="mord"&gt;...&lt;/span&gt;&lt;span class="mord mathnormal"&gt;ρρ&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;−&lt;/span&gt;&lt;span class="mord mtight"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 where 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;ρ=β−1\rho=\beta-1&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;ρ&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;−&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
. (ex. 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;β=10,ρ=9\beta=10, \rho=9&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;10&lt;/span&gt;&lt;span class="mpunct"&gt;,&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;ρ&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;9&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
). (&lt;code&gt;x&lt;/code&gt; and &lt;code&gt;y&lt;/code&gt; have &lt;code&gt;p&lt;/code&gt; digits.)&lt;/p&gt;

&lt;p&gt;If I subtracted them, I should get the actual answer of 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;1∗β−p1*\beta^{-p}&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;−&lt;/span&gt;&lt;span class="mord mathnormal mtight"&gt;p&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
, but because we shift &lt;code&gt;y&lt;/code&gt; to the right and lose a digit, we end up getting 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;1∗β−p+11*\beta^{-p+1}&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;−&lt;/span&gt;&lt;span class="mord mathnormal mtight"&gt;p&lt;/span&gt;&lt;span class="mbin mtight"&gt;+&lt;/span&gt;&lt;span class="mord mtight"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
. &lt;br&gt;

&lt;/p&gt;
&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;absolute err.=∣β−p−β−p+1∣=∣β−p(1−β)∣\text{absolute err.} = |\beta^{-p} - \beta{-p+1}| = |\beta^{-p}(1 - \beta)|&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord text"&gt;&lt;span class="mord"&gt;absolute err.&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;∣&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;−&lt;/span&gt;&lt;span class="mord mathnormal mtight"&gt;p&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;−&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;−&lt;/span&gt;&lt;span class="mord mathnormal"&gt;p&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;+&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;span class="mord"&gt;∣&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;∣&lt;/span&gt;&lt;span 
class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;−&lt;/span&gt;&lt;span class="mord mathnormal mtight"&gt;p&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;−&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mord"&gt;∣&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;br&gt;

&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;relative err.=abs. error/z\text{relative err.} = \text{abs. error}/z &lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord text"&gt;&lt;span class="mord"&gt;relative err.&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord text"&gt;&lt;span class="mord"&gt;abs. error&lt;/span&gt;&lt;/span&gt;&lt;span class="mord"&gt;/&lt;/span&gt;&lt;span class="mord mathnormal"&gt;z&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;br&gt;

&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;=∣β−p(1−β)∣β−p=β−1= \frac{|\beta^{-p}(1 - \beta)|}{\beta^{-p}} = \beta-1&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mopen nulldelimiter"&gt;&lt;/span&gt;&lt;span class="mfrac"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;−&lt;/span&gt;&lt;span class="mord mathnormal mtight"&gt;p&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="frac-line"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;∣&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;−&lt;/span&gt;&lt;span class="mord mathnormal 
mtight"&gt;p&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;−&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mord"&gt;∣&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose nulldelimiter"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;−&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;


&lt;p&gt;If our 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;β=2\beta=2&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 that would make our relative error 1. In terms of 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;ϵ\epsilon&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;ϵ&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
, it would mean 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;1=k∗ϵ1 = k*\epsilon&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;k&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;ϵ&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 and so 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;k=1/ϵk = 1/\epsilon&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;k&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1/&lt;/span&gt;&lt;span class="mord mathnormal"&gt;ϵ&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
. The number of contaminated digits would be 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;log21/ϵ=log22p=plog_2{1/\epsilon}=log_2{2^{p}}=p &lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;l&lt;/span&gt;&lt;span class="mord mathnormal"&gt;o&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;g&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;1/&lt;/span&gt;&lt;span class="mord mathnormal"&gt;ϵ&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;l&lt;/span&gt;&lt;span class="mord mathnormal"&gt;o&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;g&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;&lt;span 
class="mord"&gt;2&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;p&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;p&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
. Since the number only has &lt;code&gt;p&lt;/code&gt; digits, that means every single digit is contaminated.&lt;/p&gt;
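&lt;p&gt;If you want to check the worst case for other bases and precisions, here is a rough Python sketch. The function name &lt;code&gt;worst_case_relative_error&lt;/code&gt; is just something I made up for illustration. It uses exact fractions, so the only error in it is the digit we deliberately drop while aligning &lt;code&gt;y&lt;/code&gt;.&lt;/p&gt;

```python
from fractions import Fraction

def worst_case_relative_error(beta, p):
    """Relative error of x - y with no guard digit, for the worst-case x and y."""
    x = Fraction(1)                          # 1.00...0 with p digits
    y = 1 - Fraction(1, beta**p)             # rho.rho...rho * beta^-1 with p digits
    last_digit = Fraction(1, beta**(p - 1))  # weight of x's last digit
    # Shifting y right to line up with x drops the digit that falls off the end.
    y_aligned = (y // last_digit) * last_digit
    computed = x - y_aligned                 # beta^(-p+1)
    exact = x - y                            # beta^(-p)
    return abs(computed - exact) / exact

print(worst_case_relative_error(10, 3))   # prints: 9
print(worst_case_relative_error(2, 24))   # prints: 1
```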

&lt;p&gt;Was there a way we could have avoided some of this error? Okay, fine, there is... otherwise I wouldn't have written this section...&lt;/p&gt;

&lt;p&gt;Just add a temporary extra digit. And let's call it a ... wait for it ... a &lt;strong&gt;&lt;em&gt;guard digit&lt;/em&gt;&lt;/strong&gt;. Surprise, surprise.&lt;/p&gt;


&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;x=1.01∗100x = 1.01 * 10^{0}&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;x&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1.01&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;0&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;0&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;br&gt;

&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;y=9.93∗10−1y = 9.93 * 10^{-1}&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;y&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;9.93&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;∗&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;0&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;−&lt;/span&gt;&lt;span class="mord mtight"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;br&gt;

&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;x−y=1.010−0.993=0.017x - y = 1.010 - 0.993 = 0.017&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;x&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;−&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;y&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1.010&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;−&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;0.993&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;0.017&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;


&lt;p&gt;Now we have no error at all. That is much better than before.&lt;/p&gt;
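&lt;p&gt;Here is a small Python sketch that runs the same subtraction with and without the extra digit. The helper &lt;code&gt;subtract_with_guard&lt;/code&gt; and its &lt;code&gt;guard_digits&lt;/code&gt; parameter are hypothetical names for this illustration, and it assumes &lt;code&gt;x&lt;/code&gt; is the larger operand. The extra digit only exists during the subtraction; the result is rounded back to &lt;code&gt;p&lt;/code&gt; digits at the end.&lt;/p&gt;

```python
from decimal import Context, Decimal, ROUND_DOWN

def subtract_with_guard(x, y, p, guard_digits=1):
    """Compute x - y, keeping p + guard_digits digits of y while aligning."""
    # Weight of the last digit kept during alignment (assumes x is larger).
    last_kept = Decimal(1).scaleb(x.adjusted() - (p - 1) - guard_digits)
    y_aligned = y.quantize(last_kept, rounding=ROUND_DOWN)
    # Subtract with the extra digits, then round the result back to p digits.
    return Context(prec=p).plus(x - y_aligned)

x = Decimal("1.01")
y = Decimal("0.993")
without_guard = subtract_with_guard(x, y, p=3, guard_digits=0)
with_guard = subtract_with_guard(x, y, p=3, guard_digits=1)
print(without_guard, with_guard)  # prints: 0.02 0.017
```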

&lt;p&gt;Turns out the guard digit bounds the relative error to 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;2ϵ2\epsilon&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;2&lt;/span&gt;&lt;span class="mord mathnormal"&gt;ϵ&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
. I'm lazy, but if someone asks in the comments, I'll figure out why and try to explain it (it's in the linked blog, though).&lt;/p&gt;

&lt;h2&gt;
  
  
  Benign and Catastrophic Cancellation
&lt;/h2&gt;

&lt;p&gt;When we subtract two numbers that are really close, many of the leading digits cancel out and become &lt;code&gt;0&lt;/code&gt;. We call this cancellation. Cancellation can be either catastrophic or benign.&lt;/p&gt;

&lt;p&gt;When we subtract, the operands often already carry errors in their rightmost (least significant) digits, left over from rounding or from earlier operations. The most accurate digits are at the front (the most significant digits). When those leading digits cancel out, the result is made up of the less accurate trailing digits, so its relative error is much larger. (This happens, for example, when you calculate the discriminant 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;b2−4acb^2 - 4ac&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;b&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;−&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;4&lt;/span&gt;&lt;span class="mord mathnormal"&gt;a&lt;/span&gt;&lt;span class="mord mathnormal"&gt;c&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
). &lt;/p&gt;

&lt;p&gt;The catastrophic cancellation just exposes the rounding errors from prior operations.&lt;/p&gt;

&lt;p&gt;Benign cancellation happens when you subtract numbers that have no rounding errors. &lt;/p&gt;
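&lt;p&gt;To see catastrophic cancellation in action, here is a small Python sketch that computes the discriminant with only 3 significant digits. The values of &lt;code&gt;a&lt;/code&gt;, &lt;code&gt;b&lt;/code&gt;, and &lt;code&gt;c&lt;/code&gt; are picked purely for illustration. Notice that the final subtraction itself is exact; the damage was already done when the two products were rounded.&lt;/p&gt;

```python
from decimal import Context, Decimal

ctx = Context(prec=3)  # every operation through ctx rounds to 3 significant digits

# Illustrative values, chosen so that b*b and 4*a*c are very close.
a, b, c = Decimal("1.22"), Decimal("3.34"), Decimal("2.28")

b_squared = ctx.multiply(b, b)                          # 11.1556 rounds to 11.2
four_ac = ctx.multiply(ctx.multiply(Decimal(4), a), c)  # 11.1264 rounds to 11.1
computed = ctx.subtract(b_squared, four_ac)             # leading digits cancel

# Same formula with Python's default 28-digit precision, which is exact here.
exact = b * b - 4 * a * c

print(computed, exact)  # prints: 0.1 0.0292
```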

&lt;h2&gt;
  
  
  IEEE Standard
&lt;/h2&gt;

&lt;p&gt;The IEEE standard is a set of rules that many systems follow so that floating-point arithmetic behaves consistently. There are two IEEE standards in use: IEEE 754 and IEEE 854. Both support a smaller and a larger floating-point format, called single precision and double precision.&lt;/p&gt;

&lt;h3&gt;
  
  
  IEEE 754
&lt;/h3&gt;

&lt;p&gt;The standard allows 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;β=2\beta=2&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
. It defines single precision (&lt;code&gt;p=24&lt;/code&gt;) and double precision (&lt;code&gt;p=53&lt;/code&gt;). It also specifies how the bits should be laid out.&lt;/p&gt;

&lt;p&gt;In fact, here is a cool table that shows how IEEE 754 sets all its floating point parameters.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjhy2qsk2sip7hn34y7co.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjhy2qsk2sip7hn34y7co.png" alt="IEEE 754 Precision Parameters" width="800" height="284"&gt;&lt;/a&gt;&lt;/p&gt;
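&lt;p&gt;You can poke at these precisions from Python. This is only a sketch, and it assumes your Python's &lt;code&gt;float&lt;/code&gt; is an IEEE 754 double (true on essentially every mainstream platform). The helper &lt;code&gt;to_single&lt;/code&gt; is a made-up name; it rounds a value to single precision by packing it into 4 bytes with the standard &lt;code&gt;struct&lt;/code&gt; module.&lt;/p&gt;

```python
import struct
import sys

def to_single(value):
    """Round a Python float (double precision) to single precision and back."""
    return struct.unpack("f", struct.pack("f", value))[0]

# Double precision: p = 53 significand bits.
double_p = sys.float_info.mant_dig

# Single precision: p = 24, so 1 + 2^-23 is representable but 1 + 2^-24 is not.
single_keeps = to_single(1 + 2.0**-23) != 1.0
single_drops = to_single(1 + 2.0**-24) == 1.0

print(double_p, single_keeps, single_drops)
```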

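&lt;p&gt;Two common ways to represent signed numbers are sign/magnitude and two's complement. The sign of the significand uses sign/magnitude: one bit holds the sign and the remaining bits hold the magnitude. The exponent uses neither approach. Instead, it is stored in a biased representation: a fixed bias (127 for single precision, 1023 for double precision) is added to the true exponent so that the stored field is always a non-negative integer.&lt;/p&gt;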

&lt;p&gt;In fact, here is exactly how the bits are laid out to represent different kinds of values. To represent 0, you have to use 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;emin−1e_{\text{min}}-1&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;e&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord text mtight"&gt;&lt;span class="mord mtight"&gt;min&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;−&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 in the exponent field with a fraction of all 0s. Infinity is 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;emax+1e_{\text{max}}+1&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;e&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord text mtight"&gt;&lt;span class="mord mtight"&gt;max&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;+&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 with the fractional section zeroed out. NaN is another type of value (for example, the result of dividing 0 by 0, or of subtracting infinity from infinity). NaN is represented the same way as infinity, but with a nonzero fractional section.&lt;/p&gt;
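&lt;p&gt;Here is a small sketch (mine, not from the original paper) that pulls apart the bits of a single precision value so you can see the special values for yourself. The helper name &lt;code&gt;float32_fields&lt;/code&gt; is made up for this example. It assumes the usual single precision layout of 1 sign bit, 8 exponent bits and 23 fraction bits, where a stored exponent of 0 stands for e_min - 1 and a stored exponent of 255 stands for e_max + 1.&lt;/p&gt;

```python
import math
import struct


def float32_fields(x):
    """Split a number into the (sign, exponent, fraction) fields of its float32 encoding."""
    # Pack as a big-endian float32, then reinterpret the same 4 bytes as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # top bit
    exponent = (bits >> 23) % 256    # next 8 bits (stored, biased exponent)
    fraction = bits % 2**23          # low 23 bits
    return sign, exponent, fraction


for value in (0.0, 1.0, math.inf, math.nan):
    print(value, float32_fields(value))
```

&lt;p&gt;Zero and infinity both come back with a fraction of 0, while NaN shares infinity's exponent field but has a nonzero fraction.&lt;/p&gt;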

&lt;p&gt;The fractional section holds the digits of the significand that come after its first digit.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7osrbezaq1ga1tyerzds.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7osrbezaq1ga1tyerzds.png" alt="Special Values" width="672" height="423"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  IEEE 854
&lt;/h3&gt;

&lt;p&gt;On the other hand, this standard allows 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;β=2\beta=2&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 or 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;β=10\beta=10&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;β&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;10&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
. However, there are no rules about how the bits should be laid out for single and double precision.&lt;/p&gt;

&lt;p&gt;It allows base 10 because it is also the standard way humans count. Base 2 is also included because of the low wobble. &lt;/p&gt;

&lt;h2&gt;
  
  
  The Conclusion
&lt;/h2&gt;

&lt;p&gt;Okay, I kinda rushed the last section. But overall, I wanted to say floating-point numbers can carry a lot of imprecision. If you can avoid them, use integers instead. &lt;/p&gt;

&lt;p&gt;In the case that you do use them, try to limit your wobbles and avoid catastrophic cancellations (there are ways you could do it sometimes by rearranging formulas). &lt;/p&gt;

&lt;p&gt;Try to read the original blog yourself. This blog you are reading right now is a summary of a fraction of the original blog by David Goldberg.&lt;/p&gt;

&lt;p&gt;But as always, drop your questions and comments. And I'm out for now...&lt;/p&gt;

&lt;p&gt;Peace&lt;br&gt;
-absterdabster&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9g8se6t78w9nehd7o8dy.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9g8se6t78w9nehd7o8dy.jpg" alt="The end" width="225" height="225"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>programming</category>
      <category>architecture</category>
      <category>systems</category>
      <category>math</category>
    </item>
    <item>
      <title>Trying to predict the performance of file reads/writes</title>
      <dc:creator>absterdabster</dc:creator>
      <pubDate>Sun, 05 Jan 2025 01:38:31 +0000</pubDate>
      <link>https://dev.to/absterdabster/trying-to-predict-the-performance-of-file-readswrites-39a5</link>
      <guid>https://dev.to/absterdabster/trying-to-predict-the-performance-of-file-readswrites-39a5</guid>
      <description>&lt;p&gt;Hi! Let's say you want to read or write to a text file. Maybe you are trying to persist application data, read file input or write output to a file. Will it be fast or slow? &lt;/p&gt;

&lt;p&gt;Could we estimate how long it could take?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fibwpmquol7he94h50j7w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fibwpmquol7he94h50j7w.png" alt="me when i'm angry" width="305" height="165"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you don't want to read, just jump to the conclusion at the bottom.&lt;/p&gt;

&lt;p&gt;Long story short, I write this article in frustration because after trial and error, I realized performance can vary a lot from system to system. A couple microseconds on one system could mean a couple milliseconds on another system! That's 1,000x slower, and as we'll see, it can get even worse than that! smh smh...&lt;/p&gt;

&lt;p&gt;If things start to get slow, you could use background threads or start reading/writing in batches of data. But we'll get into that later. &lt;/p&gt;

&lt;p&gt;Let's figure out if we even have to worry about things getting too slow.&lt;br&gt;
(Note: I'm a newbie at this stuff, so please correct me if you need to)&lt;/p&gt;

&lt;p&gt;Okay. Let's start with how reading from a file works.&lt;/p&gt;
&lt;h2&gt;
  
  
  How does reading/writing from a file work?
&lt;/h2&gt;

&lt;p&gt;The high level idea begins with your programming language. Pick your favorite programming language (that has file io). There is probably a read/write method/function in there.&lt;/p&gt;

&lt;p&gt;But everything boils down to system calls. System calls are the interface your programs use to ask the operating system to interact with hardware on their behalf, with the OS providing the guidance and safety. (So you don't corrupt your systems accidentally lol) &lt;/p&gt;

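&lt;p&gt;For reading, it's &lt;code&gt;ssize_t read(int fd, void *buf, size_t count)&lt;/code&gt;. For writing, it's &lt;code&gt;ssize_t write(int fd, const void *buf, size_t count)&lt;/code&gt;. &lt;/p&gt;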
&lt;h3&gt;
  
  
  Python
&lt;/h3&gt;

&lt;p&gt;Let's look at an example of file reading in Python:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;with open('filename.txt', 'r') as file:
        # Read the first char
        first_char = file.read(1)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Python is an interpreted language, meaning an interpreter is required to execute the Python logic. I dug a little into CPython, the reference Python interpreter codebase. (Turns out CPython compiles Python into bytecode, which is then interpreted by the Python Virtual Machine (PVM).) C extensions are compiled to machine code ahead of time and loaded at runtime.&lt;/p&gt;

&lt;p&gt;I found that under the hood of the file io logic, we had the sneaky system call used by both Windows and Linux:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#ifdef MS_WINDOWS
        _doserrno = 0;
        n = read(fd, buf, (int)count);
        // read() on a non-blocking empty pipe fails with EINVAL, which is
        // mapped from the Windows error code ERROR_NO_DATA.
        if (n &amp;lt; 0 &amp;amp;&amp;amp; errno == EINVAL) {
            if (_doserrno == ERROR_NO_DATA) {
                errno = EAGAIN;
            }
        }
#else
        n = read(fd, buf, count);
#endif
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Java
&lt;/h3&gt;

&lt;p&gt;In Java, there are a lot of ways to read files. For example, you could use a &lt;code&gt;FileInputStream&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        try (FileInputStream fileInputStream = new FileInputStream(filePath)) {
            int byteData;
            while ((byteData = fileInputStream.read()) != -1) {
                System.out.print((char) byteData);  
            }

        } catch (IOException e) {
            e.printStackTrace();
        }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now as you may know, Java is a compiled language. The Java compiler (&lt;code&gt;javac&lt;/code&gt;) turns your source into bytecode stored in class files. When ready to execute, the Java Virtual Machine interprets the bytecode or compiles it to machine code on the fly. Like Python's C extensions, native code called through the Java Native Interface (JNI) is compiled to machine code ahead of time and loaded at runtime.&lt;/p&gt;

&lt;p&gt;If you dig deep into the Java Development Kit codebase, you can see the JNI implementations of FileInputStream which has the read syscall hidden in its read logic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ssize_t
handleRead(FD fd, void *buf, jint len)
{
    ssize_t result;
    RESTARTABLE(read(fd, buf, len), result);
    return result;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  C++
&lt;/h3&gt;

&lt;p&gt;In C/C++, you can directly use the &lt;code&gt;read&lt;/code&gt; syscall. But in the case you don't, standard library constructs like &lt;code&gt;std::ifstream&lt;/code&gt; also use &lt;code&gt;read&lt;/code&gt; under the hood.&lt;/p&gt;

&lt;p&gt;I wasn't able to find &lt;code&gt;read&lt;/code&gt; in the implementation for &lt;code&gt;std::ifstream&lt;/code&gt;, but I suspect you will have to look inside the &lt;code&gt;bits&lt;/code&gt; directory of the gcc implementation. (Let me know if you find it! Do it as homework hehe.)&lt;/p&gt;

&lt;p&gt;So why am I showing you all this? I suggest you try finding some of these implementations in the interpreters/compilers yourself lol. &lt;/p&gt;

&lt;p&gt;If you do, you will probably notice that the &lt;code&gt;read&lt;/code&gt; and &lt;code&gt;write&lt;/code&gt; syscalls are hidden under a lot of other clutter and logic.&lt;/p&gt;

&lt;p&gt;In this blog, I'll discuss the performance of &lt;code&gt;read&lt;/code&gt; and &lt;code&gt;write&lt;/code&gt; syscalls rather than the programming language higher level functions. We can avoid the overhead of the language if there is any. &lt;/p&gt;
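&lt;p&gt;If you want to poke at the syscalls without writing C, here is a minimal sketch using Python's &lt;code&gt;os&lt;/code&gt; module, whose &lt;code&gt;os.open&lt;/code&gt;, &lt;code&gt;os.read&lt;/code&gt; and &lt;code&gt;os.write&lt;/code&gt; functions are thin wrappers around the underlying calls. The temporary file and the sample text are just placeholders I picked for the example.&lt;/p&gt;

```python
import os
import tempfile

# Create a scratch file so the example does not depend on any existing file.
fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"hello file")      # wraps write(fd, buf, count)
    os.lseek(fd, 0, os.SEEK_SET)     # move back to the start of the file
    first_byte = os.read(fd, 1)      # wraps read(fd, buf, count) with count = 1
finally:
    os.close(fd)
    os.remove(path)

print(first_byte)
```

&lt;p&gt;Note that these calls skip Python's own file buffering, so every call here goes straight to the OS.&lt;/p&gt;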

&lt;h2&gt;
  
  
  Other ways to write
&lt;/h2&gt;

&lt;p&gt;Okayyy so I lied. &lt;code&gt;write&lt;/code&gt; isn't the only way to write to a file. Turns out you can also use &lt;code&gt;fprintf&lt;/code&gt;, &lt;code&gt;fflush&lt;/code&gt;, and &lt;code&gt;fsync&lt;/code&gt;. (I've seen a SQL implementation use this.)&lt;/p&gt;

&lt;p&gt;So what's the difference?&lt;/p&gt;

&lt;p&gt;&lt;code&gt;fprintf&lt;/code&gt;, &lt;code&gt;fflush&lt;/code&gt;, and &lt;code&gt;fsync&lt;/code&gt; split writing into 3 steps respectively:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Write your data into a buffer/cache inside your program&lt;/li&gt;
&lt;li&gt;Flush the buffer to your OS's cache&lt;/li&gt;
&lt;li&gt;Transfer from the OS's cache to your disk driver to write to the disk (This could involve writing the entire disk cache.) &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;code&gt;fsync&lt;/code&gt; blocks until your disk signals it is done transferring/writing.&lt;/p&gt;
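&lt;p&gt;The three steps above can be sketched in Python, which has the same layers: a buffered file object, &lt;code&gt;flush()&lt;/code&gt;, and &lt;code&gt;os.fsync()&lt;/code&gt;. This is only an illustration, and the function name &lt;code&gt;durable_write&lt;/code&gt; is something I made up for it.&lt;/p&gt;

```python
import os
import tempfile


def durable_write(path, text):
    """Write text to path using the three steps: buffer, flush, fsync."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)          # step 1: data goes into the program's own buffer
        f.flush()              # step 2: buffer is handed to the OS cache
        os.fsync(f.fileno())   # step 3: block until the OS says the device has it
    return os.path.getsize(path)


scratch_dir = tempfile.mkdtemp()
scratch_file = os.path.join(scratch_dir, "example.txt")
size_on_disk = durable_write(scratch_file, "according to all known laws of aviation")
print(size_on_disk)
```

&lt;p&gt;If you drop step 3, the function usually returns much sooner, but the data may still be sitting in the OS cache.&lt;/p&gt;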

&lt;p&gt;This could be useful if you have a lot of modifications you want to make, but you don't want to save them to disk yet. (Maybe you want to make your batch modifications into a giant transaction.)&lt;/p&gt;

&lt;p&gt;The issue is now you have to save the entire driver cache which could be like 64MB or 128MB! Here is a nice blog with more &lt;a href="https://ayende.com/blog/164673/the-difference-between-fsync-write-through-through-the-os-eyes" rel="noopener noreferrer"&gt;info&lt;/a&gt;.&lt;/p&gt;

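&lt;p&gt;However, if we use &lt;code&gt;write&lt;/code&gt;, we can limit our writes to just the data we are sending. This would make the &lt;code&gt;write&lt;/code&gt; faster than our 3 step process to &lt;code&gt;fsync&lt;/code&gt;. Keep in mind that &lt;code&gt;write&lt;/code&gt; on its own only hands the data to the OS cache, so it returning does not guarantee the data has reached the disk.&lt;/p&gt;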

&lt;p&gt;If you use the 3 step process, just keep in mind how much data you are writing, aka your disk driver's cache size.&lt;/p&gt;

&lt;p&gt;You can find your disk's cache size by looking at your disk specification.&lt;/p&gt;

&lt;h2&gt;
  
  
  What kind of disk do I have?
&lt;/h2&gt;

&lt;p&gt;So if you don't know what disk you have (like me), let's figure this out.&lt;/p&gt;

&lt;p&gt;If you type &lt;code&gt;lsblk&lt;/code&gt; in your Linux terminal, you might see something similar to this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
sda           8:0    0   1.8T  0 disk
├─sda1        8:1    0  1000M  0 part  /boot/efi
├─sda2        8:2    0   600M  0 part  /boot
└─sda3        8:3    0   1.8T  0 part  /
zram0       252:0    0     8G  0 disk  [SWAP]
nvme1n1     259:0    0   1.8T  0 disk
├─nvme1n1p1 259:1    0  1000M  0 part
│ └─md125     9:125  0 999.9M  0 raid1
├─nvme1n1p2 259:2    0   600M  0 part
│ └─md127     9:127  0   599M  0 raid1
└─nvme1n1p3 259:3    0   1.8T  0 part
  └─md126     9:126  0   1.8T  0 raid1
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;sda&lt;/code&gt; is a disk device. There may be a &lt;code&gt;sdb&lt;/code&gt; or &lt;code&gt;sdc&lt;/code&gt; and so on. If the type of any of these disks says &lt;code&gt;raid&lt;/code&gt;, these disks are probably part of some kind of hardware or software RAID configuration.&lt;/p&gt;

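&lt;p&gt;Disks that are part of a RAID 1 (mirrored) configuration, like the &lt;code&gt;raid1&lt;/code&gt; entries above, are basically copying each other. If you write to disk, it'll be written to all of them. It's a way to keep redundant copies of your files. (Other RAID levels split the data across disks differently.)&lt;/p&gt;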

&lt;p&gt;But remember that if you have a RAID configuration, each disk may have different specifications. Your writes are going to be as slow as the slowest disk because the RAID controller has to write to all of them.&lt;/p&gt;

&lt;p&gt;Overhead from a RAID controller is usually not a bottleneck, but performance can slightly differ between hardware/software controllers because of using separate hardware vs the busy CPU respectively.&lt;/p&gt;

&lt;p&gt;Each disk may have a different mountpoint. If that is the case, you only care about the disk(s) that have the file you intend to read/write in their mountpoint. You can see this in the &lt;code&gt;MOUNTPOINTS&lt;/code&gt; column.&lt;/p&gt;

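&lt;p&gt;Ok final thing to note from the command. The &lt;code&gt;RO&lt;/code&gt; column says whether a device is read-only, not whether it is rotational. To check for a rotational hard drive, ask for the &lt;code&gt;ROTA&lt;/code&gt; column with &lt;code&gt;lsblk -o NAME,ROTA&lt;/code&gt; (1 means rotational). A rotational hard drive is mechanical, and as a result, HDDs tend to be slower than SSDs, which use flash memory. The difference in read/write speed can be orders of magnitude, as we'll see later.&lt;/p&gt;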

&lt;p&gt;Okay... I'll stop stalling. Let's see what disk you have. Just modify the command to &lt;code&gt;lsblk -io NAME,MODEL&lt;/code&gt;. &lt;/p&gt;

&lt;p&gt;Here is what I get:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;NAME        MODEL
sda         PERC H730P Adp
|-sda1
|-sda2
`-sda3
zram0
nvme1n1     Samsung SSD 970 EVO 2TB
|-nvme1n1p1
| `-md125
|-nvme1n1p2
| `-md127
`-nvme1n1p3
  `-md126
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you have to look up that model and find your disk's specifications.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding your disk specs
&lt;/h2&gt;

&lt;p&gt;If I look up PERC H730P Adp, it turns out this is one of Dell's RAID controllers. Here is a snapshot of some of the specs:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2c3ghjjpu97s6t2ekgwc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2c3ghjjpu97s6t2ekgwc.png" alt="PERC H730P Adp Specs" width="800" height="185"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This RAID controller has a huge disk cache of 2GB! And it has a data transfer rate of 12 Gbps. As you can see, it is pretty fast. &lt;/p&gt;

&lt;p&gt;If I wanted to load the Bee Movie Script (80,000 characters, ~80KB), the transfer would take about 50 microseconds: 640,000 bits at 12 Gbps is roughly 53 microseconds.&lt;/p&gt;

&lt;p&gt;Note: RAID controllers can sometimes ignore fsync operations. The controller may report a write as complete once the data is in its own cache, and only lazily store it to the disk devices later.&lt;/p&gt;

&lt;p&gt;Great, now what about the other disk?&lt;/p&gt;

&lt;h3&gt;
  
  
  Digging deeper
&lt;/h3&gt;

&lt;p&gt;Let's search up the Samsung SSD 970 EVO 2TB.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhcmzs3f1il3v7ua52hao.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhcmzs3f1il3v7ua52hao.png" alt="Samsung SSD 970 EVO 2TB Specs" width="800" height="297"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here is what we care about: sequential and random access operations/data transfers. Usually they come in units of either IOPS (input/output operations per second) or bits/bytes per second.&lt;/p&gt;

&lt;p&gt;Sequential, as the name implies, is for sequentially writing, like the Bee Movie Script. If I wanted to modify different parts of a file, this would be random access writing. Generally, sequential is faster since the data is physically located close together.&lt;/p&gt;

&lt;p&gt;Here the sequential write speed is 2500 MB/s, but the random write speed is 480,000 IOPS for a queue depth of 32 (32 writes at the same time). This seems kind of dumb, why are they in two different units?&lt;/p&gt;

&lt;p&gt;Also, why are reads faster than writes? How fast is 2500 MB/s???&lt;/p&gt;

&lt;p&gt;No need to fear, I'm here to save you.&lt;/p&gt;

&lt;h4&gt;
  
  
  What are QDs?
&lt;/h4&gt;

&lt;p&gt;QDs are queue depths. Basically when your disk says QD32 or QD1, it refers to having 32 write or read requests or 1 write or read request waiting. This is important because disks could sometimes handle multiple requests at a time. This is why QD32 can be a lot faster than QD1.&lt;/p&gt;

&lt;p&gt;If we are writing our Bee Movie Script all at once, we'd be QD1. However, if we use &lt;code&gt;fsync&lt;/code&gt; or &lt;code&gt;write&lt;/code&gt; multiple times, then we would build a queue of requests.&lt;/p&gt;

&lt;p&gt;A nice way to estimate QD1 from QD32 is by taking 10%-20% of its IOPS. If you know a better way, let me know in the comments!&lt;/p&gt;

&lt;h4&gt;
  
  
  How fast is 2500 MB/s?
&lt;/h4&gt;

&lt;p&gt;You have a Bee Movie Script of 80,000 characters. That is 80KB. 80KB / (2500MB/s) is 32 microseconds. &lt;/p&gt;

&lt;p&gt;Easy peasy lemon squeezy.&lt;/p&gt;

&lt;h4&gt;
  
  
  Why are reads faster than writes?
&lt;/h4&gt;

&lt;p&gt;Let's explore how writing/reading disks work at a high level to understand this. &lt;/p&gt;

&lt;p&gt;Disks organize their storage in regions called sectors. Sectors in HDDs originally were 512 bytes. Now, sectors tend to be 4096 bytes as hardware has advanced. &lt;/p&gt;

&lt;p&gt;If I ever want to read or write, the minimum I can theoretically read or write at a time from the disk is one sector of data. If I want to read 1 byte of data, I have to read the entire sector to find that 1 byte. If I am writing 1 byte, I have to read the entire sector, apply the change, and then write it back in (A 2 step process!)&lt;/p&gt;

&lt;p&gt;Okay, I lied a little again. You can't always write a single sector. Our OSes have file systems. File systems operate with blocks rather than sectors. Multiple sectors make up a block. If I want to modify 1 byte, I'd have to actually modify the entire block. &lt;/p&gt;

&lt;p&gt;Blocks commonly range from 1KB - 8KB, but they must be at least as large as a disk sector. &lt;/p&gt;

&lt;p&gt;PS: Blocks are different from OS pages. Pages in OS are like blocks but for accessing physical RAM.&lt;/p&gt;

&lt;h4&gt;
  
  
  IOPS vs transfer speed (bytes per second)
&lt;/h4&gt;

&lt;p&gt;Great we went over blocks and sectors!&lt;/p&gt;

&lt;p&gt;You probably noticed that the random access specs operate in IOPS. If I want to compare it to sequential reads/writes, I'll have to convert it into bytes per second.&lt;/p&gt;

&lt;p&gt;I mentioned that disks operate in sectors. Each input/output operation occurs over a sector. We see that a sector size for the Samsung SSD 970 EVO 2TB is 4KB.&lt;/p&gt;

&lt;p&gt;So if random writes are 480,000 IOPS, this is 480,000 sectors per second. This is roughly 2,000 MB/s. &lt;/p&gt;

&lt;p&gt;Boom! Random writes are slower than sequential writes. (2000 MB/s &amp;lt; 2500 MB/s). &lt;/p&gt;

&lt;p&gt;Randomly writing the Bee Movie Script is roughly 40 microseconds.&lt;/p&gt;
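&lt;p&gt;To double check the arithmetic, here is a small sketch of the conversions used above. The function names are my own, and it assumes each I/O operation moves exactly 4096 bytes and that MB means 1,000,000 bytes, which is how the numbers in this post are treated.&lt;/p&gt;

```python
SCRIPT_BYTES = 80_000  # size of the Bee Movie Script used in this post


def iops_to_bytes_per_second(iops, bytes_per_op=4096):
    """Convert an IOPS figure into a throughput, assuming a fixed size per operation."""
    return iops * bytes_per_op


def transfer_seconds(num_bytes, bytes_per_second):
    """Estimate transfer time with latency = bytes / rate."""
    return num_bytes / bytes_per_second


sequential_rate = 2500 * 10**6                      # 2500 MB/s from the spec sheet
random_rate = iops_to_bytes_per_second(480_000)     # 480,000 IOPS at QD32

sequential_us = transfer_seconds(SCRIPT_BYTES, sequential_rate) * 1e6
random_us = transfer_seconds(SCRIPT_BYTES, random_rate) * 1e6

print(f"random write rate: {random_rate / 10**6:.0f} MB/s")
print(f"sequential: {sequential_us:.1f} us, random: {random_us:.1f} us")
```

&lt;p&gt;With 4096 byte operations the random rate comes out a little under 2,000 MB/s, which is why the estimate lands near 40 microseconds.&lt;/p&gt;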

&lt;p&gt;Great! We looked at an SSD. Now, so that you can feel my pain, let's look at an HDD.&lt;/p&gt;

&lt;h2&gt;
  
  
  Comparing an HDD
&lt;/h2&gt;

&lt;p&gt;Let's pretend we have a RAID setup with that Samsung SSD and an HDD, for example the ST9250610NS. Here are the specs:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn6dc5u0jz1fy33nghluc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn6dc5u0jz1fy33nghluc.png" alt="ST9250610NS HDD Specs" width="800" height="154"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It looks a bit different, but remember that HDDs are mechanical. Parts have to physically move and that takes time. We see that a write/read has an average time of 8.5, 9.5 milliseconds respectively. &lt;/p&gt;

&lt;p&gt;This average time is for a single sector. A single sector in this disk is 512 bytes according to the specs.&lt;/p&gt;

&lt;p&gt;It also mentions a transfer rate of 115 MB/s. Let's test that. If we have 512 bytes/9.5 ms, we get ~50KB/second. &lt;/p&gt;

&lt;p&gt;HUHHH??!!?!?!? That doesn't match 115 MB/s! &lt;/p&gt;

&lt;p&gt;This average read/write time includes the seek time and rotational latency. This means it includes both the transfer time and the time it takes for the mechanical parts to move into place to complete the read/write. (I suspect that sequential writes may be faster, since seek times would be small.)&lt;/p&gt;

&lt;p&gt;Okay, let's do this again. &lt;/p&gt;

&lt;p&gt;If I want to write the Bee Movie Script, 80,000 chars/bytes, it would take about 1.6 seconds if we operated at 50KB/second. &lt;/p&gt;
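&lt;p&gt;Here is the same estimate as a sketch, under the pessimistic assumption that every 512 byte sector pays the full 9.5 ms average time (so no benefit from sequential access). The function name is made up for this example. Without rounding the rate down to 50KB/second, it comes out to roughly 1.5 seconds, the same ballpark as the 1.6 seconds above.&lt;/p&gt;

```python
import math


def hdd_worst_case_seconds(num_bytes, sector_bytes=512, seconds_per_sector=0.0095):
    """Estimate time if every sector costs one full average access (seek + rotation + transfer)."""
    sectors = math.ceil(num_bytes / sector_bytes)   # partial sectors still cost a full access
    return sectors, sectors * seconds_per_sector


sectors_needed, hdd_seconds = hdd_worst_case_seconds(80_000)
print(f"{sectors_needed} sectors, about {hdd_seconds:.2f} seconds")
```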

&lt;p&gt;LOOK AT THAT! We went from 30-40 microseconds to 1.6 seconds from SSD to HDD! That's roughly a 40,000-50,000x latency increase. FEEL THE PAINNNN AHHHHHHHH!&lt;/p&gt;

&lt;p&gt;Remember, since we are pretending this is a RAID device, the SSD might complete a write pretty fast, but we would have to wait for the HDD to finish before the RAID device can signal completion.&lt;/p&gt;

&lt;p&gt;OH! By the way, this hard drive has a 64MB cache. If you used &lt;code&gt;fsync&lt;/code&gt;, your large write may take a long time.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Conclusion
&lt;/h2&gt;

&lt;p&gt;I hope you felt my pain. jkjk. &lt;/p&gt;

&lt;p&gt;But save yourself this pain and predict your read/write latencies. &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Find out how many bytes you want to read/write&lt;/li&gt;
&lt;li&gt;Find out if you are using &lt;code&gt;write&lt;/code&gt; or &lt;code&gt;fsync&lt;/code&gt; or &lt;code&gt;read&lt;/code&gt; or if there is any overhead&lt;/li&gt;
&lt;li&gt;Find out if they are sequential/random&lt;/li&gt;
&lt;li&gt;Find out if you have a RAID setup and which disk your file's mountpoint lives on &lt;/li&gt;
&lt;li&gt;Find out what kind of disk you have and its specs (IOPS/transfer rates)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In the end, the estimation formula is essentially &lt;code&gt;bytes / rate = latency&lt;/code&gt;. &lt;/p&gt;

&lt;p&gt;For fun, you could try estimating your own read/write speeds and see if your read/write reflects that.&lt;/p&gt;
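&lt;p&gt;If you want to try that, here is a rough measuring sketch. The function name &lt;code&gt;time_write&lt;/code&gt; is my own, and it writes to a temporary file, which on some systems lives on a RAM backed filesystem rather than your real disk, so treat the number as a ballpark and point it at the disk you care about if you need a better answer.&lt;/p&gt;

```python
import os
import tempfile
import time


def time_write(num_bytes, sync=True):
    """Return the seconds taken to write num_bytes to a scratch file with os.write."""
    payload = b"b" * num_bytes
    fd, path = tempfile.mkstemp()
    try:
        start = time.perf_counter()
        written = 0
        while written != num_bytes:          # os.write may write fewer bytes than asked
            written += os.write(fd, payload[written:])
        if sync:
            os.fsync(fd)                     # wait for the device, not just the OS cache
        return time.perf_counter() - start
    finally:
        os.close(fd)
        os.remove(path)


elapsed = time_write(80_000)
print(f"wrote 80,000 bytes in {elapsed * 1e6:.0f} microseconds")
```

&lt;p&gt;Run it a few times, with and without &lt;code&gt;sync&lt;/code&gt;, and compare against your &lt;code&gt;bytes / rate&lt;/code&gt; estimate.&lt;/p&gt;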

&lt;h2&gt;
  
  
  Caveats
&lt;/h2&gt;

&lt;p&gt;Using a networked file system has its own fun. Maybe I'll come back to this topic another time. There might be more involved than just network latencies. If you know, drop a comment lol.&lt;/p&gt;

&lt;p&gt;Okay, I'm done now. Peace!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnivr2sadum1d0ij5mpwp.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnivr2sadum1d0ij5mpwp.jpg" alt="Bye Bye!" width="225" height="225"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>programming</category>
      <category>database</category>
      <category>systems</category>
      <category>architecture</category>
    </item>
    <item>
      <title>ML Model for Font Color on Website</title>
      <dc:creator>absterdabster</dc:creator>
      <pubDate>Thu, 21 May 2020 02:39:02 +0000</pubDate>
      <link>https://dev.to/absterdabster/graduation-2020-3gn0</link>
      <guid>https://dev.to/absterdabster/graduation-2020-3gn0</guid>
      <description>&lt;h2&gt;
  
  
  My Final Project
&lt;/h2&gt;

&lt;p&gt;I created an HTML/CSS page that allows for user input of a background color. Based on the background color that you chose, I dynamically render the page. Based on the user input, I also run a trained ML algorithm (deep neural net) using binary classification to determine the best font color that would stand out from the background.&lt;/p&gt;

&lt;h2&gt;
  
  
  Link to Code
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/Abinavraj5427/fontColorANN" rel="noopener noreferrer"&gt;https://github.com/Abinavraj5427/fontColorANN&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How I built it (what's the stack? did I run into issues or discover something new along the way?)
&lt;/h2&gt;

&lt;p&gt;I built my neural network first using Python and modules like NumPy. I wanted to go through the math myself, so I avoided ML libraries like TensorFlow. After training my model, I decided I should create an interface for users. In order to build this interface, I created a simple HTML page with dynamic CSS through JavaScript. I update in real time using AJAX requests via jQuery. Then I pass the data to the server backend, PHP, which runs the Python script for the model via the command line. Then the server responds to the user with their text color.&lt;/p&gt;

&lt;p&gt;I never knew that PHP can run python scripts which I found was really cool. This was also one of my first times using WAMP Server to test my full-stack website. &lt;/p&gt;

&lt;h2&gt;
  
  
  Additional Thoughts / Feelings / Stories
&lt;/h2&gt;

&lt;p&gt;I think this is a great example of how machine learning shouldn't be used. ML is not lightweight, so it cannot be used everywhere. If anything, a mathematical equation for RGB can determine the color, but I did this merely out of excitement and practice. Hope you enjoy it!&lt;/p&gt;
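&lt;p&gt;For the curious, here is one way such an equation could look. This is not what my project does, just a common approach I'm sketching: weight the RGB channels with the standard luma coefficients (0.299, 0.587, 0.114) and pick black text on bright backgrounds and white text on dark ones. The function name and the halfway threshold are my own choices.&lt;/p&gt;

```python
def pick_font_color(red, green, blue):
    """Return 'black' or 'white' for a background given as 0-255 RGB channels."""
    # Weighted sum approximates how bright the color looks to a person.
    brightness = 0.299 * red + 0.587 * green + 0.114 * blue
    return "black" if brightness > 127.5 else "white"


for background in [(255, 255, 255), (0, 0, 0), (255, 255, 0), (0, 0, 255)]:
    print(background, pick_font_color(*background))
```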

&lt;p&gt;Happy Graduation Class of 2020!&lt;/p&gt;

</description>
      <category>octograd2020</category>
    </item>
  </channel>
</rss>
