<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Amir Mullagaliev</title>
    <description>The latest articles on DEV Community by Amir Mullagaliev (@amullagaliev).</description>
    <link>https://dev.to/amullagaliev</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2036287%2Fc8bb6516-158c-43dd-b37b-c3e3b33e6d26.jpeg</url>
      <title>DEV Community: Amir Mullagaliev</title>
      <link>https://dev.to/amullagaliev</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/amullagaliev"/>
    <language>en</language>
    <item>
      <title>SPO600: Project Stage III - Enhancing the Clone-Pruning Analysis Pass</title>
      <dc:creator>Amir Mullagaliev</dc:creator>
      <pubDate>Sat, 19 Apr 2025 17:49:10 +0000</pubDate>
      <link>https://dev.to/amullagaliev/spo600-project-stage-iii-enhancing-the-clone-pruning-analysis-pass-4kmo</link>
      <guid>https://dev.to/amullagaliev/spo600-project-stage-iii-enhancing-the-clone-pruning-analysis-pass-4kmo</guid>
      <description>&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Introduction&lt;/li&gt;
&lt;li&gt;Stage III Requirements&lt;/li&gt;
&lt;li&gt;Implementation Approach&lt;/li&gt;
&lt;li&gt;
Implementation Details

&lt;ul&gt;
&lt;li&gt;Data Structure Changes&lt;/li&gt;
&lt;li&gt;Function Tracking Logic&lt;/li&gt;
&lt;li&gt;Analysis Algorithm&lt;/li&gt;
&lt;li&gt;Complete Implementation&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

Testing and Results

&lt;ul&gt;
&lt;li&gt;Test Case Design&lt;/li&gt;
&lt;li&gt;x86_64 Results&lt;/li&gt;
&lt;li&gt;aarch64 Results&lt;/li&gt;
&lt;li&gt;Comparison Between Architectures&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Capabilities and Limitations&lt;/li&gt;

&lt;li&gt;How to Reproduce My Results&lt;/li&gt;

&lt;li&gt;Final Reflections&lt;/li&gt;

&lt;li&gt;Conclusion&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Welcome to the third and final stage of my GCC Clone-Pruning Analysis Pass project! In this post, I'll describe how I extended my Stage II implementation to handle multiple cloned functions in a single program and tested it across both x86_64 and aarch64 architectures.&lt;/p&gt;

&lt;p&gt;If you haven't read my previous post about Stage II, I'd recommend checking it out first, since this post builds directly on that work. In Stage II, I built a basic GCC pass that analyzes a single cloned function and determines whether its variants are similar enough to be pruned.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stage III Requirements
&lt;/h2&gt;

&lt;p&gt;For Stage III, we needed to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Extend our code to handle multiple cloned functions in a single program&lt;/li&gt;
&lt;li&gt;Create test cases with at least two cloned functions per program&lt;/li&gt;
&lt;li&gt;Verify functionality on both x86_64 and aarch64 architectures&lt;/li&gt;
&lt;li&gt;Test scenarios with mixed PRUNE and NOPRUNE recommendations&lt;/li&gt;
&lt;li&gt;Clean up any remaining issues from Stage II&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The biggest challenge was removing the assumption from Stage II that "there is only one cloned function in a program" and ensuring our implementation works correctly across architectures.&lt;/p&gt;

&lt;h2&gt;
  
  
  Implementation Approach
&lt;/h2&gt;

&lt;p&gt;After reviewing my Stage II code, I identified several limitations that needed to be addressed:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The static variables used to track function information limited us to a single function with two variants&lt;/li&gt;
&lt;li&gt;The simple comparison based on block and statement counts could be improved&lt;/li&gt;
&lt;li&gt;The state management needed enhancement to handle multiple functions&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;My approach for Stage III focused on:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Replacing the static variables with a more sophisticated data structure to track multiple functions&lt;/li&gt;
&lt;li&gt;Enhancing the comparison algorithm so that each variant is compared against its function's default variant&lt;/li&gt;
&lt;li&gt;Making sure the implementation works consistently across architectures&lt;/li&gt;
&lt;li&gt;Creating comprehensive test cases for both architectures&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Implementation Details
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Data Structure Changes
&lt;/h3&gt;

&lt;p&gt;The core of my improvement was replacing the simple static variables from Stage II:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;previous_function_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;previous_block_total&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;previous_statement_total&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The replacement is a map, keyed by base function name, that stores every variant encountered for each function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="nc"&gt;FunctionVariant&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;full_name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;    &lt;span class="c1"&gt;// Complete function name with variant&lt;/span&gt;
    &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;block_count&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;       &lt;span class="c1"&gt;// Number of basic blocks&lt;/span&gt;
    &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;statement_count&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;   &lt;span class="c1"&gt;// Number of GIMPLE statements&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;FunctionVariant&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;function_variants&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This data structure lets us track multiple variants of multiple functions at once and keeps their block and statement counts organized by base function name.&lt;/p&gt;
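
&lt;p&gt;To make the layout concrete, here is a minimal standalone sketch of how such a map fills up. It reuses the FunctionVariant struct from above; the VariantMap alias, record_variant, build_example_table, and all function names and counts are made up for illustration and are not part of the pass.&lt;/p&gt;

<![CDATA[
```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Same fields as the FunctionVariant struct shown in the post.
struct FunctionVariant {
    std::string full_name;        // Complete function name with variant suffix
    std::size_t block_count;      // Number of basic blocks
    std::size_t statement_count;  // Number of GIMPLE statements
};

// Hypothetical alias for the map type used by the pass.
using VariantMap = std::map<std::string, std::vector<FunctionVariant>>;

// Hypothetical helper, not part of the pass: file one variant under its base name.
// operator[] creates an empty vector the first time a base name is seen.
void record_variant(VariantMap &table, const std::string &base_name,
                    const FunctionVariant &variant) {
    table[base_name].push_back(variant);
}

// Builds a small table from invented names and counts, for illustration only.
VariantMap build_example_table() {
    VariantMap table;
    record_variant(table, "scale_samples", {"scale_samples", 5, 20});
    record_variant(table, "scale_samples", {"scale_samples.variant_a", 5, 20});
    record_variant(table, "sum_samples", {"sum_samples", 3, 9});
    return table;
}
```
]]>

&lt;p&gt;Calling build_example_table() yields two keys: scale_samples with two variants and sum_samples with one.&lt;/p&gt;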

&lt;h3&gt;
  
  
  Function Tracking Logic
&lt;/h3&gt;

&lt;p&gt;First, I needed to improve my base function name extraction to handle different variant suffixes, including resolver functions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="nf"&gt;get_base_function_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;function&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;fun&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="nc"&gt;cgraph_node&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cgraph_node&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fun&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;decl&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;fname&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="nb"&gt;nullptr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
                                          &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;function_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fun&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

    &lt;span class="c1"&gt;// Handle resolver functions&lt;/span&gt;
    &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;pos&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fname&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".resolver"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pos&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;npos&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;fname&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;substr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pos&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// Handle regular variants&lt;/span&gt;
    &lt;span class="n"&gt;pos&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fname&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sc"&gt;'.'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pos&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;npos&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;fname&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;substr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pos&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;fname&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
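
&lt;p&gt;The suffix handling can be tried outside GCC with a string-only counterpart. This is a sketch that assumes variant names have the form base.suffix; base_name_of is a hypothetical helper, and the .variant_a suffix mentioned after the code is invented for the example.&lt;/p&gt;

<![CDATA[
```cpp
#include <string>

// Hypothetical string-only counterpart of get_base_function_name.
// It applies the same two checks as the post, but to a plain string
// instead of a GCC function object.
std::string base_name_of(const std::string &full_name) {
    // Resolver functions: strip everything from ".resolver" onward.
    std::string::size_type pos = full_name.find(".resolver");
    if (pos != std::string::npos) {
        return full_name.substr(0, pos);
    }

    // Regular variants: strip everything from the first '.' onward.
    pos = full_name.find('.');
    if (pos != std::string::npos) {
        return full_name.substr(0, pos);
    }

    // No suffix at all: the name is already the base name.
    return full_name;
}
```
]]>

&lt;p&gt;With this helper, scale_samples, scale_samples.default, scale_samples.resolver, and scale_samples.variant_a all map to scale_samples.&lt;/p&gt;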



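&lt;p&gt;I also added a function that picks the default variant among all variants of a function. It prefers a name containing ".default", then a name with no suffix at all, and otherwise falls back to the first recorded variant:&lt;br&gt;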
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="n"&gt;FunctionVariant&lt;/span&gt; &lt;span class="nf"&gt;find_default_variant&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;FunctionVariant&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;variants&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// First choice: a variant whose name contains ".default"&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;variants&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;variants&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;full_name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".default"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;npos&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;variants&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// Second choice: a variant with no suffix at all&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;variants&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;variants&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;full_name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sc"&gt;'.'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;npos&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;variants&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

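    &lt;span class="c1"&gt;// Fallback: first recorded variant (callers must pass a non-empty vector)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;variants&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;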
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
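
&lt;p&gt;The sketch below restates the same three-step choice as an index lookup, which makes it easy to check how the result depends on the names present. default_variant_index is a hypothetical name, and the sketch assumes a non-empty vector, which holds in the pass because analysis only runs once at least two variants are stored.&lt;/p&gt;

<![CDATA[
```cpp
#include <cstddef>
#include <string>
#include <vector>

// Same fields as the FunctionVariant struct shown in the post.
struct FunctionVariant {
    std::string full_name;
    std::size_t block_count;
    std::size_t statement_count;
};

// Hypothetical index-returning restatement of find_default_variant.
// Precondition (assumed): variants is not empty.
std::size_t default_variant_index(const std::vector<FunctionVariant> &variants) {
    // 1. Prefer a name that contains ".default".
    for (std::size_t i = 0; i < variants.size(); ++i) {
        if (variants[i].full_name.find(".default") != std::string::npos) {
            return i;
        }
    }
    // 2. Otherwise take a name with no '.' suffix.
    for (std::size_t i = 0; i < variants.size(); ++i) {
        if (variants[i].full_name.find('.') == std::string::npos) {
            return i;
        }
    }
    // 3. Otherwise fall back to the first recorded variant.
    return 0;
}
```
]]>

&lt;p&gt;For example, given f.variant_a followed by f.default it returns 1, and given f.variant_a followed by f it also returns 1. When every name carries a non-default suffix, it returns 0.&lt;/p&gt;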



&lt;h3&gt;
  
  
  Analysis Algorithm
&lt;/h3&gt;

&lt;p&gt;Here's the core logic for processing a function in the execute method:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;function&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;fun&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;FILE&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dump_file&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="nb"&gt;nullptr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="n"&gt;dump_file&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="nc"&gt;cgraph_node&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cgraph_node&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fun&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;decl&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;full_fname&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="nb"&gt;nullptr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                           &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
                           &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;function_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fun&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

    &lt;span class="n"&gt;print_frame_header&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;full_fname&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;is_resolver_function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;full_fname&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;print_frame_footer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"ANALYSIS FINISHED (resolver function)"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;bb_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;gimple_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;basic_block&lt;/span&gt; &lt;span class="n"&gt;bb&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;FOR_EACH_BB_FN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bb&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fun&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;bb_count&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gimple_stmt_iterator&lt;/span&gt; &lt;span class="n"&gt;gsi&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gsi_start_bb&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bb&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
             &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;gsi_end_p&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gsi&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
             &lt;span class="n"&gt;gsi_next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;gsi&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;gimple_count&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;base_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_base_function_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fun&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;FunctionVariant&lt;/span&gt; &lt;span class="n"&gt;current_variant&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;full_fname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bb_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;gimple_count&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="n"&gt;function_variants&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;base_name&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;push_back&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;current_variant&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;function_variants&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;base_name&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;analyze_function_variants&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;base_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;function_variants&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;base_name&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;print_frame_footer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"First variant of this function - storing for comparison"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
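
&lt;p&gt;The store-then-compare flow in execute can be simulated without GCC. The sketch below is a simplification under stated assumptions: observe_variant is a hypothetical stand-in for one execute call, the base name is passed in directly, and the baseline is simply the first variant recorded rather than the result of find_default_variant. All names and counts are invented.&lt;/p&gt;

<![CDATA[
```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Same fields as the FunctionVariant struct shown in the post.
struct FunctionVariant {
    std::string full_name;
    std::size_t block_count;
    std::size_t statement_count;
};

using VariantMap = std::map<std::string, std::vector<FunctionVariant>>;

// Hypothetical stand-in for one execute() call on one function variant.
// Returns "STORED" for the first variant of a base name; later variants get
// "PRUNE" when both counts match the baseline and "NOPRUNE" otherwise.
std::string observe_variant(VariantMap &table, const std::string &base_name,
                            const FunctionVariant &current) {
    std::vector<FunctionVariant> &seen = table[base_name];
    seen.push_back(current);
    if (seen.size() == 1) {
        return "STORED";
    }
    // Simplification: the first recorded variant serves as the baseline.
    const FunctionVariant &baseline = seen.front();
    const bool same_shape = current.block_count == baseline.block_count &&
                            current.statement_count == baseline.statement_count;
    return same_shape ? "PRUNE" : "NOPRUNE";
}
```
]]>

&lt;p&gt;Feeding it two functions in interleaved order shows that each base name is tracked independently: a second scale_samples variant with matching counts yields PRUNE, while a second sum_samples variant with different counts yields NOPRUNE.&lt;/p&gt;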



&lt;p&gt;And here's the function that analyzes variants to determine if they should be pruned:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;analyze_function_variants&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;FILE&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;base_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                              &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;FunctionVariant&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;variants&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;variants&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;FunctionVariant&lt;/span&gt; &lt;span class="n"&gt;default_variant&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;find_default_variant&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;variants&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;variants&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;FunctionVariant&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;variant&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;variants&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;variant&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;full_name&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;default_variant&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;full_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;continue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;should_prune&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;variant&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;block_count&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;default_variant&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;block_count&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;
                            &lt;span class="n"&gt;variant&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;statement_count&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;default_variant&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;statement_count&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;should_prune&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"PRUNE: %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;base_name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;c_str&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"NOPRUNE: %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;base_name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;c_str&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"CLONE FOUND: %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;base_name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;c_str&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"CURRENT: %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;variant&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;full_name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;c_str&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

        &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="nf"&gt;border&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sc"&gt;'*'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"%s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;border&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;c_str&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"*  End of Diagnostic for Clone Pair&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"%s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;border&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;c_str&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key improvements are:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;We track all variants of all functions&lt;/li&gt;
&lt;li&gt;We process functions as they're encountered&lt;/li&gt;
&lt;li&gt;When we have multiple variants of a function, we analyze them immediately&lt;/li&gt;
&lt;li&gt;We identify the default variant to use as a baseline for comparison&lt;/li&gt;
&lt;/ol&gt;
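
&lt;p&gt;A rough standalone sketch of this bookkeeping, not the actual pass, might look like the following. The &lt;code&gt;Variant&lt;/code&gt; struct, the &lt;code&gt;bb_count&lt;/code&gt; field, the &lt;code&gt;base_of&lt;/code&gt; helper, the "name.suffix" clone naming, and the sample counts are all illustrative assumptions. For brevity it groups every variant first and then compares, whereas the pass analyzes variants as they are encountered:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;#include &amp;lt;cstdio&amp;gt;
#include &amp;lt;map&amp;gt;
#include &amp;lt;string&amp;gt;
#include &amp;lt;vector&amp;gt;

// Hypothetical record for one clone; the real pass fills its counts from GIMPLE.
struct Variant {
    std::string full_name;   // e.g. "vector_multiply.avx2" (assumed naming)
    int bb_count;            // basic blocks (assumed field name)
    int statement_count;     // statements
};

// Everything before the first '.' is treated as the base name.
static std::string base_of(const std::string &amp;amp;full_name) {
    return full_name.substr(0, full_name.find('.'));
}

// PRUNE when a clone has the same shape as the default variant.
static bool should_prune(const Variant &amp;amp;def, const Variant &amp;amp;v) {
    return v.bb_count == def.bb_count &amp;amp;&amp;amp;
           v.statement_count == def.statement_count;
}

int main() {
    // Made-up counts, purely for illustration.
    const std::vector&amp;lt;Variant&amp;gt; seen = {
        {"vector_multiply.default", 5, 20},
        {"vector_multiply.avx2", 9, 41},
        {"count_bits.default", 3, 8},
        {"count_bits.popcnt", 3, 8},
    };

    // Group every variant under its base name.
    std::map&amp;lt;std::string, std::vector&amp;lt;Variant&amp;gt;&amp;gt; by_base;
    for (const Variant &amp;amp;v : seen)
        by_base[base_of(v.full_name)].push_back(v);

    for (const auto &amp;amp;entry : by_base) {
        const std::string default_name = entry.first + ".default";
        // Find the baseline, then compare every other variant against it.
        for (const Variant &amp;amp;def : entry.second) {
            if (def.full_name != default_name) continue;
            for (const Variant &amp;amp;v : entry.second) {
                if (v.full_name == default_name) continue;
                std::printf("%s: %s\n",
                            should_prune(def, v) ? "PRUNE" : "NOPRUNE",
                            entry.first.c_str());
            }
        }
    }
    return 0;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Keying the map on the base name is what lets a single run report a separate decision for each function.&lt;/p&gt;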

&lt;h3&gt;
  
  
  Complete Implementation
&lt;/h3&gt;

&lt;p&gt;For the complete implementation, check my GitHub repository:&lt;br&gt;
&lt;a href="https://github.com/mulla028/SPO600-Project/blob/main/stage-3/tree-amullagaliev.cc" rel="noopener noreferrer"&gt;tree-amullagaliev.cc&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Testing and Results
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Test Case Design
&lt;/h3&gt;

&lt;p&gt;To thoroughly test the implementation, I created a complex test file with six different functions, each with different optimization characteristics:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;simple_calculation&lt;/code&gt;: Basic scalar operations&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;vector_multiply&lt;/code&gt;: Vector operations &lt;/li&gt;
&lt;li&gt;
&lt;code&gt;matrix_transpose&lt;/code&gt;: Matrix operations&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;count_bits&lt;/code&gt;: Bit counting&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;count_char&lt;/code&gt;: String processing&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;classify_number&lt;/code&gt;: A branch-heavy function&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For x86_64, I used these target attributes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;__attribute__&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;target_clones&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"default"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"arch=x86-64-v3"&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;simple_calculation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// function body&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;__attribute__&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;target_clones&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"default"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"avx2"&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;vector_multiply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;float&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;float&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;float&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// function body&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;__attribute__&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;target_clones&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"default"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"avx2"&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;matrix_transpose&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;float&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;dst&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;float&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;src&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// function body&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;__attribute__&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;target_clones&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"default"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"popcnt"&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;count_bits&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;uint64_t&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// function body&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;__attribute__&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;target_clones&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"default"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"sse4.1"&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;count_char&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// function body&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;__attribute__&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;target_clones&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"default"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"arch=x86-64-v3"&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;classify_number&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// function body&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For aarch64, I used different attributes appropriate for that architecture:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;__attribute__&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;target_clones&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"default"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"simd"&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;simple_calculation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// same function body&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;__attribute__&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;target_clones&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"default"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"sve"&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;vector_multiply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;float&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;float&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;float&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// same function body&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// etc.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The full test files can be found here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/mulla028/SPO600-Project/blob/main/stage-3/x86-test/complex-clones-test.c" rel="noopener noreferrer"&gt;complex-clones-test.c for x86_64&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/mulla028/SPO600-Project/blob/main/stage-3/aarch64-test/complex-clones-test.c" rel="noopener noreferrer"&gt;complex-clones-test.c for aarch64&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  x86_64 Results
&lt;/h3&gt;

&lt;p&gt;On x86_64, the results were:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;====== Summary of PRUNE/NOPRUNE decisions ======
NOPRUNE: vector_multiply
PRUNE: classify_number
PRUNE: count_bits
PRUNE: count_char
PRUNE: matrix_transpose
PRUNE: simple_calculation
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Five of the six functions were marked for pruning; only &lt;code&gt;vector_multiply&lt;/code&gt; showed structural differences in its AVX2 clone. I was surprised to see that &lt;code&gt;matrix_transpose&lt;/code&gt; was marked for pruning despite also targeting AVX2! Because the pass only compares basic block and statement counts, PRUNE here means the AVX2 clone had the same shape as the default at the point where the pass runs, which suggests that not all loops are restructured equally when AVX2 is enabled.&lt;/p&gt;

&lt;h3&gt;
  
  
  aarch64 Results
&lt;/h3&gt;

&lt;p&gt;On aarch64, the results were different:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;====== Summary of PRUNE/NOPRUNE decisions ======
NOPRUNE: matrix_transpose
NOPRUNE: vector_multiply
PRUNE: classify_number
PRUNE: count_bits
PRUNE: count_char
PRUNE: simple_calculation
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, both &lt;code&gt;vector_multiply&lt;/code&gt; AND &lt;code&gt;matrix_transpose&lt;/code&gt; were marked as NOPRUNE. This was a fascinating finding!&lt;/p&gt;

&lt;h3&gt;
  
  
  Comparison Between Architectures
&lt;/h3&gt;

&lt;p&gt;The most interesting observation was how the same function (&lt;code&gt;matrix_transpose&lt;/code&gt;) was treated differently on each architecture:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;On x86_64 with AVX2: PRUNE recommendation&lt;/li&gt;
&lt;li&gt;On aarch64 with SVE: NOPRUNE recommendation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This suggests that targeting SVE on aarch64 changed the function's structure more than targeting AVX2 did on x86_64, at least as measured by basic block and statement counts. I found this really intriguing since both are vector instruction sets, yet they affect code structure differently. This highlights the importance of architecture-specific optimization.&lt;/p&gt;

&lt;h2&gt;
  
  
  Capabilities and Limitations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Capabilities
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Multiple Function Support&lt;/strong&gt;: Successfully handles any number of cloned functions in a program&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-Architecture Compatibility&lt;/strong&gt;: Works on both x86_64 and aarch64&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mixed Decision Support&lt;/strong&gt;: Can recommend PRUNE for some functions and NOPRUNE for others&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Detailed Output&lt;/strong&gt;: Provides clear diagnostic information for each clone pair&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Limitations
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Simple Comparison Metric&lt;/strong&gt;: Still relies on basic block and statement counts, which may not capture all structural differences&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Architecture-Specific Test Cases&lt;/strong&gt;: Requires different test files for each architecture due to different valid target attributes&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  How to Reproduce My Results
&lt;/h2&gt;

&lt;p&gt;To replicate my work:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;First, create or modify the pass file:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cd ~/git/gcc/gcc
vi tree-amullagaliev.cc  # Copy the implementation code from GitHub
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Rebuild GCC:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cd ~/gcc-build-001/
make -j$(nproc)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Create test files:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mkdir ~/stage3
cd ~/stage3
# Copy complex-clones-test.c and Makefile from GitHub
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Test the implementation:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;make
make run-test
make show-results
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here's the Makefile:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight make"&gt;&lt;code&gt;&lt;span class="nv"&gt;CC&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; ~/gcc-build-001/gcc/xgcc
&lt;span class="nv"&gt;CFLAGS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nt"&gt;-B&lt;/span&gt; ~/gcc-build-001/gcc/ &lt;span class="nt"&gt;-g&lt;/span&gt; &lt;span class="nt"&gt;-O3&lt;/span&gt; &lt;span class="nt"&gt;-ftree-vectorize&lt;/span&gt; &lt;span class="nt"&gt;-fdump-tree-amullagaliev&lt;/span&gt;

&lt;span class="nl"&gt;all&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;complex-test&lt;/span&gt;

&lt;span class="nl"&gt;complex-test&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;complex-clones-test.c&lt;/span&gt;
    &lt;span class="p"&gt;$(&lt;/span&gt;CC&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;$(&lt;/span&gt;CFLAGS&lt;span class="p"&gt;)&lt;/span&gt; complex-clones-test.c &lt;span class="nt"&gt;-o&lt;/span&gt; complex-test

&lt;span class="nl"&gt;run-test&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;complex-test&lt;/span&gt;
    ./complex-test
    &lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Test completed. Check the dump files for analysis results."&lt;/span&gt;

&lt;span class="nl"&gt;show-results&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"====== Analysis Results ======"&lt;/span&gt;
    &lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-A&lt;/span&gt; 3 &lt;span class="nt"&gt;-B&lt;/span&gt; 1 &lt;span class="s2"&gt;"PRUNE&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s2"&gt;|NOPRUNE"&lt;/span&gt; complex-test-complex-clones-test.c.&lt;span class="k"&gt;*&lt;/span&gt;.amullagaliev &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"No results found"&lt;/span&gt;
    &lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;""&lt;/span&gt;
    &lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"====== Summary of PRUNE/NOPRUNE decisions ======"&lt;/span&gt;
    &lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="s2"&gt;"PRUNE:"&lt;/span&gt; complex-test-complex-clones-test.c.&lt;span class="k"&gt;*&lt;/span&gt;.amullagaliev | &lt;span class="nb"&gt;sort&lt;/span&gt; | &lt;span class="nb"&gt;uniq&lt;/span&gt;

&lt;span class="nl"&gt;clean&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; complex-test &lt;span class="k"&gt;*&lt;/span&gt;.o &lt;span class="k"&gt;*&lt;/span&gt;.amullagaliev&lt;span class="k"&gt;*&lt;/span&gt;

&lt;span class="nl"&gt;.PHONY&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;all run-test show-results clean&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The full source code and test files are available in my GitHub repository for easy access and replication:&lt;br&gt;
&lt;a href="https://github.com/mulla028/SPO600-Project/tree/main/stage-3" rel="noopener noreferrer"&gt;SPO600 Project Stage III&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Final Reflections
&lt;/h2&gt;

&lt;p&gt;Wow, working on this project has been an incredible learning experience! I've gained a much deeper understanding of:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;GCC internals and how passes are implemented&lt;/li&gt;
&lt;li&gt;Function Multi-Versioning and how it works across architectures&lt;/li&gt;
&lt;li&gt;Cross-architecture development challenges&lt;/li&gt;
&lt;li&gt;GIMPLE representation and analysis techniques&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The most challenging part of Stage III was understanding how to properly track and analyze multiple function variants. I initially tried to use a finalize method to process all variants at the end, but this approach had issues. The solution of processing variants as they're encountered worked much better.&lt;/p&gt;

&lt;p&gt;I was particularly surprised by the difference in optimization behavior between x86_64 and aarch64, especially for matrix operations. This highlighted how architecture-specific optimizations can lead to very different code structures, even for the same source code.&lt;/p&gt;

&lt;p&gt;The most frustrating part was dealing with architecture-specific target attributes. I spent a lot of time figuring out which attributes were valid on aarch64 vs. x86_64. It took some trial and error until I found attribute combinations that worked on each platform.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;I've successfully extended my GCC clone-pruning analysis pass to handle multiple functions and tested it on both x86_64 and aarch64. The implementation now meets all the requirements for Stage III.&lt;/p&gt;

&lt;p&gt;This whole project, from Stage I through III, has been a fascinating journey into compiler development. I've gained skills that I never expected to acquire and deepened my understanding of how compilers work at a fundamental level.&lt;/p&gt;

&lt;p&gt;I want to express my sincere thanks to Professor Chris Tyler for his incredible guidance throughout this project. His clear explanations of GCC internals and compiler theory made the complex world of compilers much more approachable. Without his lectures and support, navigating the intricate structures of GCC would have been much more challenging. The skills I've gained in this course will be valuable throughout my career in software development.&lt;/p&gt;

&lt;p&gt;This project marks the end of my SPO600 journey, but the knowledge and experience I've gained will stay with me for years to come.&lt;/p&gt;

</description>
      <category>gcc</category>
    </item>
    <item>
      <title>SPO600 Lab 5: Adventures in Assembly Language</title>
      <dc:creator>Amir Mullagaliev</dc:creator>
      <pubDate>Fri, 18 Apr 2025 23:52:40 +0000</pubDate>
      <link>https://dev.to/amullagaliev/spo600-lab-5-adventures-in-assembly-language-350k</link>
      <guid>https://dev.to/amullagaliev/spo600-lab-5-adventures-in-assembly-language-350k</guid>
      <description>&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Introduction&lt;/li&gt;
&lt;li&gt;Lab Requirements&lt;/li&gt;
&lt;li&gt;Implementing the Loop in AArch64&lt;/li&gt;
&lt;li&gt;Implementing the Loop in x86_64&lt;/li&gt;
&lt;li&gt;Comparing Assembly Languages&lt;/li&gt;
&lt;li&gt;Debugging Headaches&lt;/li&gt;
&lt;li&gt;Code Breakdown&lt;/li&gt;
&lt;li&gt;Lessons Learned&lt;/li&gt;
&lt;li&gt;Full Source Code&lt;/li&gt;
&lt;li&gt;Conclusion&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;I've completed Lab 5 for the &lt;strong&gt;SPO600&lt;/strong&gt; course, and let me tell you - working with assembly language is like trying to communicate with aliens using only hand gestures.&lt;/p&gt;

&lt;p&gt;This lab focused on experimenting with assembler on both &lt;strong&gt;x86_64&lt;/strong&gt; and &lt;strong&gt;AArch64&lt;/strong&gt; platforms. I had to write programs that looped through numbers, converted them to characters, and printed them to the screen. Sounds simple, right? &lt;strong&gt;WRONG&lt;/strong&gt;. &lt;strong&gt;Nothing is simple in assembly!&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Lab Requirements
&lt;/h2&gt;

&lt;p&gt;The lab required me to implement the following in both &lt;strong&gt;AArch64&lt;/strong&gt; and &lt;strong&gt;x86_64&lt;/strong&gt; assembly:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A basic loop that prints &lt;code&gt;Loop&lt;/code&gt; 6 times&lt;/li&gt;
&lt;li&gt;Modify it to print &lt;code&gt;Loop: #&lt;/code&gt; where # is the loop index (0-5)&lt;/li&gt;
&lt;li&gt;Extend it to print 2-digit numbers (00-32)&lt;/li&gt;
&lt;li&gt;Suppress leading zeros&lt;/li&gt;
&lt;li&gt;Change to hexadecimal output (0-20)&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Implementing the Loop in AArch64
&lt;/h2&gt;

&lt;p&gt;My very first roadblock with &lt;strong&gt;AArch64&lt;/strong&gt; was figuring out how to actually modify a buffer in memory. With higher-level languages, you'd just do something like &lt;code&gt;message[6] = digit + '0';&lt;/code&gt; but in assembly... nope! You need to load addresses, use registers, and do all kinds of register juggling.&lt;/p&gt;

&lt;p&gt;For example, to print &lt;code&gt;Loop: #&lt;/code&gt; with the index, I had to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
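&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mov     x20, x19              // x20 = copy of the loop index
add     x20, x20, 48          // convert to ASCII by adding '0' (48)

ldr     x1, =message          // x1 = address of the message buffer

strb    w20, [x1, digit_pos]  // store the low byte at the digit's offset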
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The hardest part was definitely the 2-digit conversion. I spent way too long figuring out how to divide numbers in AArch64. Turns out you need &lt;code&gt;udiv&lt;/code&gt; for division and &lt;code&gt;msub&lt;/code&gt; to calculate the remainder:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
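&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mov     x20, x19            // x20 = copy of the loop index
mov     x21, 10             // x21 = divisor
udiv    x22, x20, x21       // x22 = x20 / x21 = quotient (tens digit)

msub    x23, x22, x21, x20  // x23 = x20 - (x22 * x21) = remainder (ones digit)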
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
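
&lt;p&gt;For comparison, the same quotient-then-remainder arithmetic can be sketched in C++. This is only an illustration of the math, not a translation of the lab code; the &lt;code&gt;two_digits&lt;/code&gt; helper and its buffer layout are made up for the example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;#include &amp;lt;cstdio&amp;gt;

// Hypothetical helper mirroring the udiv/msub pair above.
static void two_digits(unsigned n, char out[3]) {
    unsigned q = n / 10;       // what udiv computes: the quotient
    unsigned r = n - q * 10;   // what msub computes: n - (q * 10)
    out[0] = static_cast&amp;lt;char&amp;gt;(q + '0');   // tens digit as ASCII
    out[1] = static_cast&amp;lt;char&amp;gt;(r + '0');   // ones digit as ASCII
    out[2] = '\0';
}

int main() {
    char buf[3];
    for (unsigned i = 0; i &amp;lt; 33; ++i) {   // 00 through 32
        two_digits(i, buf);
        std::printf("Loop: %s\n", buf);
    }
    return 0;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;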



&lt;h2&gt;
  
  
  Implementing the Loop in x86_64
&lt;/h2&gt;

&lt;p&gt;Working with &lt;strong&gt;x86_64&lt;/strong&gt; after &lt;strong&gt;AArch64&lt;/strong&gt; was like switching from "Japanese" to "German" - still foreign, but somehow differently confusing! &lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;x86_64&lt;/strong&gt; division was a total pain. You have to clear specific registers, put values in specific places, and the division gives both quotient AND remainder:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
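&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mov     %r15,%rax     # dividend (loop index) goes in rax
mov     $0,%rdx       # clear rdx: div divides the 128-bit value rdx:rax
mov     $10,%rcx      # divisor
div     %rcx          # rax = quotient, rdx = remainder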
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And don't even get me started on the syntax differences! In &lt;strong&gt;AArch64&lt;/strong&gt;, the destination register comes first:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mov x0, 1  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But in &lt;strong&gt;x86_64&lt;/strong&gt;, with the AT&amp;amp;T syntax that GNU assembler uses by default, the destination comes last:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mov $1,%rax  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I kept mixing them up, and my programs wouldn't assemble.&lt;/p&gt;

&lt;h2&gt;
  
  
  Comparing Assembly Languages
&lt;/h2&gt;

&lt;p&gt;Now that I've worked with three assembly languages (&lt;code&gt;6502&lt;/code&gt;, &lt;code&gt;x86_64&lt;/code&gt;, and &lt;code&gt;AArch64&lt;/code&gt;), here's my totally subjective ranking:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;AArch64&lt;/strong&gt;: Cleanest syntax and most consistent. The register naming makes sense (&lt;code&gt;x0&lt;/code&gt;, &lt;code&gt;x1&lt;/code&gt;, etc.), and the instruction names are mostly intuitive. The best part is that division works on any registers: &lt;code&gt;udiv&lt;/code&gt; produces the quotient and a follow-up &lt;code&gt;msub&lt;/code&gt; computes the remainder.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;6502&lt;/strong&gt;: Simple and limited, which is actually nice for beginners.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;x86_64&lt;/strong&gt;: Most powerful but also most confusing. The register naming is historical (&lt;code&gt;%rax&lt;/code&gt;, &lt;code&gt;%rbx&lt;/code&gt;, &lt;code&gt;%r15&lt;/code&gt;) with no obvious pattern. Sub-register names are cryptic (&lt;code&gt;%al&lt;/code&gt; vs &lt;code&gt;%ax&lt;/code&gt; vs &lt;code&gt;%eax&lt;/code&gt; vs &lt;code&gt;%rax&lt;/code&gt; all refer to parts of the same register). Division is a nightmare requiring specific register setup.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Debugging Headaches
&lt;/h2&gt;

&lt;p&gt;Here's what my debugging process looked like:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Write code&lt;/li&gt;
&lt;li&gt;Assemble&lt;/li&gt;
&lt;li&gt;Get cryptic error message&lt;/li&gt;
&lt;li&gt;Stare at code for 10 minutes&lt;/li&gt;
&lt;li&gt;Realize I used &lt;code&gt;//&lt;/code&gt; for comments instead of &lt;code&gt;#&lt;/code&gt; in x86_64 GNU assembler code (AArch64 is the one where &lt;code&gt;//&lt;/code&gt; starts a comment)&lt;/li&gt;
&lt;li&gt;Fix and repeat&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The worst part was when the program assembled but didn't work right. With no debugger (or at least none that I knew how to use properly), I was basically adding write statements to see what was happening inside - like &lt;code&gt;printf&lt;/code&gt; debugging.&lt;/p&gt;

&lt;h2&gt;
  
  
  Code Breakdown
&lt;/h2&gt;

&lt;p&gt;Let's look at a small piece of the hexadecimal conversion in &lt;strong&gt;AArch64&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
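&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cmp     x22, 10          // is the digit value 10 or more?
b.ge    high_alpha       // yes: it needs a letter A-F

add     x22, x22, 48     // 0-9: add ASCII '0' (48)
b       high_done

high_alpha:
add     x22, x22, 55     // 10-15: add 55 ('A' - 10)

high_done: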
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This code checks if a hex digit is 0-9 or A-F and converts it. For 0-9, we add 48 (&lt;strong&gt;ASCII&lt;/strong&gt; for '0'). For 10-15, we add 55 to get 'A'-'F'. &lt;/p&gt;
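
&lt;p&gt;The same branch can be sketched in C++ to check the arithmetic. This is an illustration only; the &lt;code&gt;hex_digit&lt;/code&gt; helper is a made-up name, and it assumes its argument is already in the range 0-15:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;#include &amp;lt;cstdio&amp;gt;

// Hypothetical helper: v is assumed to be a single hex digit value (0-15).
static char hex_digit(unsigned v) {
    // 0-9 map onto '0'-'9' (add 48); 10-15 map onto 'A'-'F' (add 55).
    return static_cast&amp;lt;char&amp;gt;(v &amp;lt; 10 ? v + 48 : v + 55);
}

int main() {
    for (unsigned i = 0; i &amp;lt; 33; ++i) {   // 0x00 through 0x20
        // High digit is i / 16, low digit is i % 16.
        std::printf("Loop: %c%c\n", hex_digit(i / 16), hex_digit(i % 16));
    }
    return 0;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;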

&lt;h2&gt;
  
  
  Lessons Learned
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Assembly is PRECISE&lt;/strong&gt;: A single wrong register or memory address and everything breaks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Different architectures = different paradigms&lt;/strong&gt;: &lt;strong&gt;x86_64&lt;/strong&gt; and &lt;strong&gt;AArch64&lt;/strong&gt; handle things like division completely differently.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Comments are ESSENTIAL&lt;/strong&gt;: Without comments, I'd have no idea what my own code was doing 5 minutes after writing it.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Register allocation matters&lt;/strong&gt;: In higher-level languages, variables just exist. In assembly, you need to carefully plan which registers to use for what.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Full Source Code
&lt;/h2&gt;

&lt;p&gt;Here are the links to the full source code:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/mulla028/SPO600-Project/blob/main/lab-5/aarch64/loop1.s" rel="noopener noreferrer"&gt;AArch64 loop1.s&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/mulla028/SPO600-Project/blob/main/lab-5/aarch64/loop2.s" rel="noopener noreferrer"&gt;AArch64 loop2.s&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/mulla028/SPO600-Project/blob/main/lab-5/aarch64/loop3.s" rel="noopener noreferrer"&gt;AArch64 loop3.s&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/mulla028/SPO600-Project/blob/main/lab-5/aarch64/loop4.s" rel="noopener noreferrer"&gt;AArch64 loop4.s&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/mulla028/SPO600-Project/blob/main/lab-5/aarch64/loop5.s" rel="noopener noreferrer"&gt;AArch64 loop5.s&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/mulla028/SPO600-Project/blob/main/lab-5/x86/loop1.s" rel="noopener noreferrer"&gt;x86_64 loop1_x86.s&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/mulla028/SPO600-Project/blob/main/lab-5/x86/loop2.s" rel="noopener noreferrer"&gt;x86_64 loop2_x86.s&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/mulla028/SPO600-Project/blob/main/lab-5/x86/loop3.s" rel="noopener noreferrer"&gt;x86_64 loop3_x86.s&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/mulla028/SPO600-Project/blob/main/lab-5/x86/loop4.s" rel="noopener noreferrer"&gt;x86_64 loop4_x86.s&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/mulla028/SPO600-Project/blob/main/lab-5/x86/loop5.s" rel="noopener noreferrer"&gt;x86_64 loop5_x86.s&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I'll just paste the AArch64 &lt;code&gt;loop5.s&lt;/code&gt; code here as an example (I'm probably proudest of this one since it handles hex conversion :D):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;.data
message:
    .ascii "Loop: ##\n"
message_len = . - message 
hex1_pos = 6              
hex2_pos = 7              
space = 32                

.text
.globl _start
min = 0                   
max = 33                  
_start:
    mov     x19, min      

loop:

    mov     x20, x19         
    mov     x21, 16          
    udiv    x22, x20, x21    

    msub    x23, x22, x21, x20  // x23 = x20 - (x22 * x21) = remainder

    cmp     x22, 10          
    b.ge    high_alpha       

    add     x22, x22, 48     
    b       high_done

high_alpha:
    add     x22, x22, 55     

high_done:
    # Convert low nibble to ASCII
    cmp     x23, 10          
    b.ge    low_alpha        

    add     x23, x23, 48     
    b       low_done

low_alpha:
    add     x23, x23, 55     

low_done:
    ldr     x1, =message

    cmp     x22, 48          
    b.ne    print_both       

    mov     x24, space       
    strb    w24, [x1, hex1_pos]  
    b       print_low        

print_both:
    strb    w22, [x1, hex1_pos]

print_low:
    strb    w23, [x1, hex2_pos]

    mov     x0, 1            // 1 is stdout
    mov     x2, message_len  // message length
    mov     x8, 64           // 64 is write
    svc     0                

    add     x19, x19, 1      
    cmp     x19, max         
    b.ne    loop             

    mov     x0, 0            // set exit status to 0
    mov     x8, 93           // exit is syscall #93
    svc     0                
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In conclusion, would I write assembly code in my free time? Probably not. But I have a much better understanding of what's happening &lt;em&gt;under the hood&lt;/em&gt; of my programs now.&lt;/p&gt;

</description>
      <category>assembly</category>
    </item>
    <item>
      <title>SPO600 - Lab 3: Building a Number Guessing Game in 6502 Assembly</title>
      <dc:creator>Amir Mullagaliev</dc:creator>
      <pubDate>Fri, 18 Apr 2025 02:48:01 +0000</pubDate>
      <link>https://dev.to/amullagaliev/spo600-lab-3-building-a-number-guessing-game-in-6502-assembly-3ca2</link>
      <guid>https://dev.to/amullagaliev/spo600-lab-3-building-a-number-guessing-game-in-6502-assembly-3ca2</guid>
      <description>&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Introduction&lt;/li&gt;
&lt;li&gt;Game Overview&lt;/li&gt;
&lt;li&gt;
Code Breakdown

&lt;ul&gt;
&lt;li&gt;Initialization and Random Number Generation&lt;/li&gt;
&lt;li&gt;Text Output: Game Prompts&lt;/li&gt;
&lt;li&gt;Keyboard Input Handling&lt;/li&gt;
&lt;li&gt;Graphics Feedback&lt;/li&gt;
&lt;li&gt;Screenshots&lt;/li&gt;
&lt;li&gt;Attempt Tracking and Display&lt;/li&gt;
&lt;li&gt;Restart Mechanism&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

Reflection

&lt;ul&gt;
&lt;li&gt;Challenges&lt;/li&gt;
&lt;li&gt;Limitations&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Full Code&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;This blog post is about my journey through Lab 3 of the SPO600 course, where I developed a &lt;strong&gt;Number Guessing Game&lt;/strong&gt; using 6502 Assembly. The goal was to create a program that meets specific criteria: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Outputting to both text and graphics screens&lt;/li&gt;
&lt;li&gt;Accepting keyboard input&lt;/li&gt;
&lt;li&gt;Using arithmetic operations &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let's dive into how I tackled this challenge!&lt;/p&gt;




&lt;h2&gt;
  
  
  Game Overview
&lt;/h2&gt;

&lt;p&gt;The game is simple but engaging:  &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The program generates a random number between 1 and 99.
&lt;/li&gt;
&lt;li&gt;The player guesses the number via keyboard input.
&lt;/li&gt;
&lt;li&gt;After each guess, the game prints a hint ("Too High" or "Too Low") and changes the colour of the graphics screen (red for high, blue for low, green for a win).
&lt;/li&gt;
&lt;li&gt;The player's number of attempts is tracked and displayed as &lt;code&gt;A&lt;/code&gt;.
&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Code Breakdown
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Initialization and Random Number Generation
&lt;/h3&gt;

&lt;p&gt;The game starts by initializing the text screen (&lt;code&gt;SCINIT&lt;/code&gt;) and testing the graphics screen with a flash effect (&lt;code&gt;TEST_GRAPHICS&lt;/code&gt;). A random number is generated using the pseudo-random byte at memory location &lt;code&gt;$FE&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;LDA $FE      ; Load random byte
AND #$7F     ; Clear bit 7, leaving a value in 0-127
CMP #100     ; Check if &amp;gt;=100
BCC SAVE_TARGET ; If &amp;lt;100, use it
LSR          ; Divide by 2 if out of range
SAVE_TARGET:
STA TARGET   ; Store as target number
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



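&lt;p&gt;The mask and the comparison keep the number below 100: a masked value of 100–127 is halved into 50–63. Note that this snippet on its own can still produce 0, so the range it guarantees is 0–99 rather than 1–99.&lt;/p&gt;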
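&lt;p&gt;To see which targets the snippet can produce, here is a small TypeScript model of it (a sketch, not the lab code). The function name &lt;code&gt;targetFromByte&lt;/code&gt; is hypothetical, and the mask is written as a modulo, which has the same effect as &lt;code&gt;AND #$7F&lt;/code&gt; for a byte in 0-255. The full program may handle a target of 0 elsewhere; this models only the excerpt.&lt;/p&gt;

```typescript
// Models the excerpt for one byte read from $FE (assumed to be 0-255).
function targetFromByte(randomByte: number): number {
  const masked = randomByte % 128; // keeps the low 7 bits, like AND #$7F
  // CMP #100 / BCC: values below 100 are used as-is; otherwise LSR halves them.
  return masked >= 100 ? Math.floor(masked / 2) : masked;
}

// Run every possible byte through the model and report the extremes.
const results = Array.from({ length: 256 }, (_, byte) => targetFromByte(byte));
console.log(Math.min(...results), Math.max(...results)); // prints "0 99"
```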

&lt;h3&gt;
  
  
  Text Output: Game Prompts
&lt;/h3&gt;

&lt;p&gt;The text screen displays instructions and feedback using the &lt;strong&gt;CHROUT ROM routine&lt;/strong&gt;. For example, printing "GUESS" and handling newlines:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;LDA #$47     ; 'G'
JSR CHROUT
LDA #$55     ; 'U'
JSR CHROUT
; ... repeats for remaining letters
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Feedback like "HI" (too high) or "LO" (too low) is printed after each guess.&lt;/p&gt;

&lt;h3&gt;
  
  
  Keyboard Input Handling
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;INPUT_GUESS&lt;/code&gt; subroutine reads digits from the keyboard using &lt;code&gt;CHRIN&lt;/code&gt; and converts ASCII characters to numeric values. Two-digit inputs are handled by multiplying the first digit by 10 to move it into the tens place, then adding the second:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;; Convert first digit to tens place
LDA GUESS
ASL          ; ×2
STA TEMP2
ASL          ; ×4
ASL          ; ×8
CLC
ADC TEMP2    ; ×10 (8+2)
STA GUESS

; Add second digit
PLA
SEC
SBC #$30     ; ASCII to numeric
CLC          ; SBC leaves carry set, so clear it before adding
ADC GUESS
STA GUESS
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The 6502 has no multiply instruction, so the ×10 is built from shifts (&lt;strong&gt;ASL&lt;/strong&gt;) and an add (&lt;strong&gt;ADC&lt;/strong&gt;): 10x = 8x + 2x. The second digit is then added to get the final guess.&lt;/p&gt;
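&lt;p&gt;The arithmetic can be checked with a short TypeScript sketch (not the lab code). It models &lt;code&gt;ASL&lt;/code&gt; as doubling an 8-bit value and assumes the carry flag is clear before each addition; the names &lt;code&gt;asl&lt;/code&gt; and &lt;code&gt;twoDigitGuess&lt;/code&gt; are made up for the example.&lt;/p&gt;

```typescript
// ASL on an 8-bit accumulator doubles the value and drops any overflow bit.
const asl = (value: number): number => (value * 2) % 256;

// Hypothetical helper: combine two ASCII digit codes into one number.
function twoDigitGuess(firstChar: number, secondChar: number): number {
  const first = firstChar - 0x30; // ASCII to numeric
  const times2 = asl(first); // x2, the value kept in TEMP2
  const times8 = asl(asl(times2)); // x4, then x8
  const tens = times8 + times2; // x10 = x8 + x2
  return tens + (secondChar - 0x30); // add the ones digit
}

console.log(twoDigitGuess(0x34, 0x32)); // '4' then '2': prints 42
```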

&lt;h3&gt;
  
  
  Graphics Feedback
&lt;/h3&gt;

&lt;p&gt;The graphics screen (memory starting at $0200) is filled with a color based on the guess using &lt;code&gt;FILL_SCREEN&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FILL_SCREEN:
  LDX #0
  LDY #0
FILL_LOOP:
  STA $0200,X ; Fill pages $0200–$05FF
  STA $0300,X
  ; ... continues for all pages
  INX
  BNE FILL_LOOP
  RTS
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Colors are set using values like &lt;code&gt;$05&lt;/code&gt; (&lt;strong&gt;green&lt;/strong&gt;) or &lt;code&gt;$02&lt;/code&gt; (&lt;strong&gt;red&lt;/strong&gt;), updating the entire screen instantly.&lt;/p&gt;

&lt;h3&gt;
  
  
  Screenshots
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2ncu1rger2bgc5cha3y6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2ncu1rger2bgc5cha3y6.png" alt="Image description" width="798" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The user's guess is too low&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjzxz346yrh8uncfa9fgp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjzxz346yrh8uncfa9fgp.png" alt="Image description" width="800" height="539"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The user's guess is too high&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqdgfh1ph1txwu1lj5cje.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqdgfh1ph1txwu1lj5cje.png" alt="Image description" width="800" height="994"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The user guesses the correct number (&lt;strong&gt;WINS&lt;/strong&gt;) and is asked whether they want to play another game &lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Attempt Tracking and Display
&lt;/h3&gt;

&lt;p&gt;The ATTEMPTS(&lt;strong&gt;A&lt;/strong&gt;) counter is incremented after each guess. For values ≥10, the number is split into tens and ones digits using a division loop:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;DIVIDE_LOOP:
  CMP #10
  BCC DIGIT_READY ; Exit if &amp;lt;10
  SBC #10         ; Subtract 10
  INX             ; Count tens
  JMP DIVIDE_LOOP
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The 6502 has no division instruction, so the loop divides by repeatedly subtracting 10. When it exits, X holds the tens digit and A holds the ones digit.&lt;/p&gt;
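&lt;p&gt;The same loop, sketched in TypeScript for illustration (not the lab code). The function name &lt;code&gt;splitDigits&lt;/code&gt; is hypothetical, and the sketch assumes the tens counter starts at zero, as X would need to in the assembly.&lt;/p&gt;

```typescript
// Hypothetical helper: split a count into tens and ones by repeated subtraction.
function splitDigits(count: number): { tens: number; ones: number } {
  let tens = 0; // plays the role of X
  let ones = count; // plays the role of A
  while (ones >= 10) { // CMP #10 / BCC DIGIT_READY
    ones -= 10; // SBC #10
    tens += 1; // INX
  }
  return { tens, ones };
}

const parts = splitDigits(27);
console.log(`${parts.tens}${parts.ones}`); // prints "27"
```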

&lt;h3&gt;
  
  
  Restart Mechanism
&lt;/h3&gt;

&lt;p&gt;After winning, the player can press 'Y' to restart. This resets the screen and jumps back to the start:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;RESTART:
  JSR SCINIT      ; Reinitialize screen
  JMP $0600       ; Restart program
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Reflection
&lt;/h2&gt;

&lt;p&gt;Writing this game in 6502 Assembly was both frustrating and rewarding. Here are the major challenges I faced and the limitations the program has:&lt;/p&gt;

&lt;h3&gt;
  
  
  Challenges
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Random Number Range&lt;/strong&gt;: Ensuring the random number stayed within 1–99 required masking (AND #$7F) and conditional checks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Two-Digit Input&lt;/strong&gt;: Converting ASCII input to a numeric value involved bit shifts and addition, because the 6502 has no multiply instruction.&lt;/p&gt;

&lt;h3&gt;
  
  
  Limitations
&lt;/h3&gt;

&lt;ul&gt;
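&lt;li&gt;The target numbers aren't perfectly uniform: on top of relying on the pseudo-random byte at $FE, masked values of 100–127 are halved into 50–63, so those targets come up more often.&lt;/li&gt;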
&lt;li&gt;Input requires pressing Enter after each digit, which might feel unintuitive.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Full Code
&lt;/h2&gt;

&lt;p&gt;Available &lt;a href="https://github.com/mulla028/SPO600-Project/blob/main/lab-3/lab3.asm" rel="noopener noreferrer"&gt;here&lt;/a&gt; (paste into the 6502 Emulator to run).&lt;/p&gt;

</description>
    </item>
    <item>
      <title>OSD700 - RAG Integration: Stage 3</title>
      <dc:creator>Amir Mullagaliev</dc:creator>
      <pubDate>Thu, 17 Apr 2025 02:04:14 +0000</pubDate>
      <link>https://dev.to/amullagaliev/osd-700-rag-integration-stage-3-4c3g</link>
      <guid>https://dev.to/amullagaliev/osd-700-rag-integration-stage-3-4c3g</guid>
      <description>&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Introduction&lt;/li&gt;
&lt;li&gt;Tensorflow.js&lt;/li&gt;
&lt;li&gt;Settings UI&lt;/li&gt;
&lt;li&gt;Conclusion&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Now that stages 1 and 2 have successfully landed in &lt;a href="https://github.com/tarasglek/chatcraft.org" rel="noopener noreferrer"&gt;Chatcraft.org&lt;/a&gt;, it's time to work on stage 3.&lt;br&gt;
Today, I am going to describe the embeddings generation implementation that I am currently working on. &lt;/p&gt;

&lt;p&gt;First, we gotta stick to the proposed plan, which you can find here:&lt;/p&gt;


&lt;div class="ltag_github-liquid-tag"&gt;
  &lt;h1&gt;
    &lt;a href="https://github.com/tarasglek/chatcraft.org/issues/868" rel="noopener noreferrer"&gt;
      &lt;img class="github-logo" alt="GitHub logo" src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg"&gt;
      &lt;span class="issue-title"&gt;
        RAG on DuckDB Implementation Based on Prototype
      &lt;/span&gt;
      &lt;span class="issue-number"&gt;#868&lt;/span&gt;
    &lt;/a&gt;
  &lt;/h1&gt;
  &lt;div class="github-thread"&gt;
    &lt;div class="timeline-comment-header"&gt;
      &lt;a href="https://github.com/mulla028" rel="noopener noreferrer"&gt;
        &lt;img class="github-liquid-tag-img" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Favatars.githubusercontent.com%2Fu%2F138519917%3Fv%3D4" alt="mulla028 avatar"&gt;
      &lt;/a&gt;
      &lt;div class="timeline-comment-header-text"&gt;
        &lt;strong&gt;
          &lt;a href="https://github.com/mulla028" rel="noopener noreferrer"&gt;mulla028&lt;/a&gt;
        &lt;/strong&gt; posted on &lt;a href="https://github.com/tarasglek/chatcraft.org/issues/868" rel="noopener noreferrer"&gt;&lt;time&gt;Mar 29, 2025&lt;/time&gt;&lt;/a&gt;
      &lt;/div&gt;
    &lt;/div&gt;
    &lt;div class="ltag-github-body"&gt;
      &lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Description&lt;/h2&gt;
&lt;span class="octicon octicon-link"&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;Recently, we have implemented a &lt;a href="https://github.com/mulla028/duckdb-rag-prototype" rel="noopener noreferrer"&gt;prototype&lt;/a&gt; of &lt;code&gt;RAG on DuckDB&lt;/code&gt;, and it proves that implementation is doable for the &lt;code&gt;ChatCraft&lt;/code&gt; it's time to start working on it!&lt;/p&gt;
&lt;p&gt;The implementation will take several steps, lets call them &lt;strong&gt;stages&lt;/strong&gt;. Since we already have the set up of DuckDB using &lt;code&gt;duckdb-wasm&lt;/code&gt;, the file loader, and format to text extractors, we are skipping some of the steps(&lt;strong&gt;stages&lt;/strong&gt;). Therefore here are the steps we need to take in order successfully implement it:&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Proposed Implementation Stages&lt;/h2&gt;
&lt;span class="octicon octicon-link"&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Stage 1: Create Two New Tables in IndexedDB&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;Embeddings Table, with foreign key to a file&lt;/li&gt;
&lt;li&gt;Chunks Table, with foreign key to a file&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stage 2: Implement Chunking Logic&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;Proper Chunking with overlap (cf. &lt;a href="https://platform.openai.com/docs/assistants/tools/file-search#customizing-file-search-settings" rel="nofollow noopener noreferrer"&gt;https://platform.openai.com/docs/assistants/tools/file-search#customizing-file-search-settings&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Proper Chunking Storage in IndexedDB&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stage 3: Implement Embeddings Generation&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;Allow using a cloud-based model or local (transformers.js or tensorflow.js)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stage 4: Vector Search&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;Use DuckDB's extension Called &lt;strong&gt;VSS&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Load Embeddings, Chunks, etc. into DuckDB&lt;/li&gt;
&lt;li&gt;Apply &lt;strong&gt;HNSW&lt;/strong&gt; Indexing to Increase Speed of the Search ( &lt;code&gt;HNSW&lt;/code&gt; Indexing Provided by VSS extension)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stage 5: LLM Integration&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;Modify Prompt Construction to Include Retrieved Context&lt;/li&gt;
&lt;li&gt;Implement Source Attribution in Responses&lt;/li&gt;
&lt;li&gt;Adjust Token Management to Account For Context&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stage 6: Query Processing&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;Implement Embedding Generation for &lt;strong&gt;User Queries&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Use the Same Embedding Model as Documents for Consistency(&lt;code&gt;text-embedding-3-small&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;a class="mentioned-user" href="https://dev.to/humphd"&gt;@humphd&lt;/a&gt;, @tarasglek please take a look at the proposed implementation stages, and approve them. Let me know if I am missing something :)&lt;/p&gt;

    &lt;/div&gt;
    &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/tarasglek/chatcraft.org/issues/868" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;


&lt;blockquote&gt;
&lt;p&gt;Stage 3 has just one point: "&lt;em&gt;Allow using a cloud-based model or local (transformers.js or tensorflow.js)&lt;/em&gt;"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;First, I had to figure out what &lt;code&gt;tensorflow.js&lt;/code&gt; is.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tensorflow.js
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;TensorFlow is a software library for machine learning and artificial intelligence.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Alright, it sounds cool, but how should I use it? I did some research and found out that &lt;code&gt;tensorflow.js&lt;/code&gt; has a model called &lt;code&gt;Universal Sentence Encoder&lt;/code&gt; with an &lt;code&gt;embed&lt;/code&gt; method, which generates embeddings for the text passed to it.&lt;/p&gt;

&lt;p&gt;Here's the &lt;a href="https://github.com/tensorflow/tfjs-models/tree/master/universal-sentence-encoder" rel="noopener noreferrer"&gt;link&lt;/a&gt; to the &lt;strong&gt;Universal Sentence Encoder's source code&lt;/strong&gt; and the &lt;a href="https://www.npmjs.com/package/@tensorflow-models/universal-sentence-encoder" rel="noopener noreferrer"&gt;npm usage documentation&lt;/a&gt;. These resources helped me implement it. &lt;/p&gt;

&lt;p&gt;The advantage of &lt;code&gt;tensorflow.js&lt;/code&gt; over the &lt;code&gt;openai&lt;/code&gt; model that I have also implemented is that it runs locally in the browser: it doesn't require an API key, and once the model files have been downloaded it doesn't need to call a remote embeddings API, which makes it a reliable option for ChatCraft users.&lt;/p&gt;

&lt;p&gt;I would love to share the &lt;code&gt;tensorflow.js&lt;/code&gt; implementation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;EmbeddingsProvider&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;./EmbeddingProvider&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="cm"&gt;/**
 * TensorFlow.js-based embedding provider
 * Uses Universal Sentence Encoder for local embedding generation
 */&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TensorflowEmbeddingsProvider&lt;/span&gt; &lt;span class="k"&gt;implements&lt;/span&gt; &lt;span class="nx"&gt;EmbeddingsProvider&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;tensorflow-use&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;TensorFlow Universal Sentence Encoder&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Local embedding model using TensorFlow.js (512 dimensions)&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="nx"&gt;dimensions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="nx"&gt;maxBatchSize&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="nx"&gt;defaultBatchSize&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="nx"&gt;minBatchSize&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="nx"&gt;CONFIG&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;dimensions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;maxBatchSize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;defaultBatchSize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;minBatchSize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nx"&gt;isLoading&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nx"&gt;loadPromise&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;void&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="nf"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
  &lt;span class="kd"&gt;get&lt;/span&gt; &lt;span class="nc"&gt;CONFIG&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="k"&gt;void&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Method not implemented.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="cm"&gt;/**
   * Load the model if it hasn't been loaded yet
   */&lt;/span&gt;
  &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;loadModelIfNeeded&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;void&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;loadPromise&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isLoading&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

      &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;loadPromise&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Loading Universal Sentence Encoder model...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

          &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;import&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@tensorflow/tfjs&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

          &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;use&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;import&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@tensorflow-models/universal-sentence-encoder&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

          &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;use&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
          &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Universal Sentence Encoder loaded successfully&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Failed to load Universal Sentence Encoder:&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
          &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;loadPromise&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
          &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;finally&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isLoading&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;})();&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;loadPromise&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="cm"&gt;/**
   * Generate an embedding vector for a single text
   */&lt;/span&gt;
  &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;generateEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generateBatchEmbeddings&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="cm"&gt;/**
   * Generate embedding vectors for multiple texts in batch
   */&lt;/span&gt;
  &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;generateBatchEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;texts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[]):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;[][]&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loadModelIfNeeded&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

      &lt;span class="nx"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;texts&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;arrays&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;arrays&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Error generating TensorFlow embeddings:&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`TensorFlow embedding error: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Unknown error&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;finally&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dispose&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Here's the open PR:&lt;/p&gt;


&lt;div class="ltag_github-liquid-tag"&gt;
  &lt;h1&gt;
    &lt;a href="https://github.com/tarasglek/chatcraft.org/pull/873" rel="noopener noreferrer"&gt;
      &lt;img class="github-logo" alt="GitHub logo" src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg"&gt;
      &lt;span class="issue-title"&gt;
        [RAG] Stage - 3: Embeddings Generation
      &lt;/span&gt;
      &lt;span class="issue-number"&gt;#873&lt;/span&gt;
    &lt;/a&gt;
  &lt;/h1&gt;
  &lt;div class="github-thread"&gt;
    &lt;div class="timeline-comment-header"&gt;
      &lt;a href="https://github.com/mulla028" rel="noopener noreferrer"&gt;
        &lt;img class="github-liquid-tag-img" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Favatars.githubusercontent.com%2Fu%2F138519917%3Fv%3D4" alt="mulla028 avatar"&gt;
      &lt;/a&gt;
      &lt;div class="timeline-comment-header-text"&gt;
        &lt;strong&gt;
          &lt;a href="https://github.com/mulla028" rel="noopener noreferrer"&gt;mulla028&lt;/a&gt;
        &lt;/strong&gt; posted on &lt;a href="https://github.com/tarasglek/chatcraft.org/pull/873" rel="noopener noreferrer"&gt;&lt;time&gt;Apr 09, 2025&lt;/time&gt;&lt;/a&gt;
      &lt;/div&gt;
    &lt;/div&gt;
    &lt;div class="ltag-github-body"&gt;
      &lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Description&lt;/h2&gt;
&lt;span class="octicon octicon-link"&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;This is &lt;strong&gt;stage 3&lt;/strong&gt; of #868. We are adding the ability to generate and store vector embeddings for document chunks. It introduces a modular embedding provider architecture and supports:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;OpenAI's &lt;strong&gt;text-embedding-3-small&lt;/strong&gt; API&lt;/li&gt;
&lt;li&gt;Local &lt;strong&gt;tensorflow.js&lt;/strong&gt; alternative&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;a href="https://github.com/tarasglek/chatcraft.org/blob/mulla028/868-3/src/lib/ChatCraftFile.ts" rel="noopener noreferrer"&gt;ChatCraftFile&lt;/a&gt; has been extended with methods to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;generate embeddings&lt;/li&gt;
&lt;li&gt;store embeddings&lt;/li&gt;
&lt;li&gt;manage embeddings&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Embedding generation is integrated with the &lt;a href="https://github.com/tarasglek/chatcraft.org/blob/mulla028/868-3/src/hooks/use-file-import.tsx" rel="noopener noreferrer"&gt;use-file-import&lt;/a&gt; hook to &lt;strong&gt;automatically create embeddings after chunking&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;New settings added&lt;/strong&gt; to control:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the embedding provider&lt;/li&gt;
&lt;li&gt;batch size&lt;/li&gt;
&lt;li&gt;automatic generation preference&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;p&gt;[!IMPORTANT]
A UI for the embedding preferences is required, so this PR must land together with the updated Settings UI.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Test It&lt;/h2&gt;
&lt;span class="octicon octicon-link"&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;blockquote&gt;
&lt;p&gt;[!TIP]
To test it, follow the steps below!&lt;/p&gt;
&lt;/blockquote&gt;
&lt;ol&gt;
&lt;li&gt;Open the &lt;code&gt;Cloudflare&lt;/code&gt; deployment below&lt;/li&gt;
&lt;li&gt;Upload a file of size &lt;strong&gt;&amp;gt;=300KB&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Go to &lt;strong&gt;DevTools&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Application&lt;/code&gt; → &lt;code&gt;Storage&lt;/code&gt; → &lt;code&gt;IndexedDB&lt;/code&gt; → &lt;code&gt;ChatCraftDatabase&lt;/code&gt; → &lt;code&gt;files&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Voilà! You can see the generated embeddings! (I HOPE :D)&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Screenshot&lt;/h2&gt;
&lt;span class="octicon octicon-link"&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;&lt;a rel="noopener noreferrer" href="https://github.com/user-attachments/assets/36fd071c-6a07-48f7-9a54-6918ab954045"&gt;&lt;img width="1470" alt="image" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgithub.com%2Fuser-attachments%2Fassets%2F36fd071c-6a07-48f7-9a54-6918ab954045"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h4 class="heading-element"&gt;Question&lt;/h4&gt;
&lt;span class="octicon octicon-link"&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;Maybe we could reduce the minimum file size that triggers chunking from 300KB → 100KB, and with it the minimum number of characters per chunk from 1000 → 300.&lt;/p&gt;

    &lt;/div&gt;
    &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/tarasglek/chatcraft.org/pull/873" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;
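
&lt;p&gt;The PR description above lists batch size as one of the new settings. As a rough illustration of what batching could look like (a minimal sketch, not the code from the PR; &lt;code&gt;splitIntoBatches&lt;/code&gt; is a hypothetical helper name), chunk texts can be grouped so that each embedding request carries at most &lt;code&gt;batchSize&lt;/code&gt; texts:&lt;/p&gt;

```typescript
// Hypothetical helper, not taken from the ChatCraft codebase.
// Groups texts into consecutive batches of at most `batchSize` items.
function splitIntoBatches(texts: string[], batchSize: number): string[][] {
  // Reject zero, negative, fractional, and NaN batch sizes.
  if (!Number.isInteger(batchSize) || Math.sign(batchSize) !== 1) {
    throw new Error("batchSize must be a positive integer");
  }

  const batchCount = Math.ceil(texts.length / batchSize);

  // Build each batch by slicing the matching window out of the input.
  return Array.from({ length: batchCount }, function (_unused, index) {
    const start = index * batchSize;
    return texts.slice(start, start + batchSize);
  });
}

const batches = splitIntoBatches(["a", "b", "c", "d", "e"], 2);
console.log(batches); // [["a", "b"], ["c", "d"], ["e"]]
```

&lt;p&gt;Each batch could then be handed to a provider method such as the &lt;code&gt;generateBatchEmbeddings&lt;/code&gt; shown earlier, one call per batch.&lt;/p&gt;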



&lt;h2&gt;
  
  
  Settings UI
&lt;/h2&gt;

&lt;p&gt;Since this is an experimental feature that might be unstable or break, it has to be turned off by default so that regular users aren't distracted by it. That calls for a switch that turns the feature on without changing the code every time. Here's the result:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foklju9wjv3v2ak65qzvg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foklju9wjv3v2ak65qzvg.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This PR is still in progress, but embeddings generation works well. Since it is the last week of the term, everyone is busy and doesn't have time to review my code, which is understandable. This is the end of the semester, but I will continue working on the RAG feature and running this blog. I really enjoy doing it!&lt;/p&gt;

</description>
    </item>
    <item>
      <title>OSD700 - RAG Integration: Stage 1 &amp; 2</title>
      <dc:creator>Amir Mullagaliev</dc:creator>
      <pubDate>Thu, 17 Apr 2025 00:56:53 +0000</pubDate>
      <link>https://dev.to/amullagaliev/osd700-rag-integration-stage-1-2-25lo</link>
      <guid>https://dev.to/amullagaliev/osd700-rag-integration-stage-1-2-25lo</guid>
      <description>&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Preface&lt;/li&gt;
&lt;li&gt;Introduction&lt;/li&gt;
&lt;li&gt;Stage 1&lt;/li&gt;
&lt;li&gt;PR Expansion&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Preface
&lt;/h2&gt;

&lt;p&gt;In the previous post, I shared that I had successfully implemented a &lt;a href=""&gt;RAG prototype&lt;/a&gt; locally and described the steps I took. We then decided to try to make this feature work in &lt;a href=""&gt;ChatCraft.org&lt;/a&gt;, which means the implementation is going to differ a little, since unlike my prototype it will be running in a browser environment. That requires a clear plan that will help land the feature step by step.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;After the prototype presentation, I filed an issue with the proposed plan. Over the next couple of hours we discussed it, and eventually my professor adjusted and approved it.&lt;/p&gt;

&lt;p&gt;Here's what the final proposal issue looks like:&lt;/p&gt;


&lt;div class="ltag_github-liquid-tag"&gt;
  &lt;h1&gt;
    &lt;a href="https://github.com/tarasglek/chatcraft.org/issues/868" rel="noopener noreferrer"&gt;
      &lt;img class="github-logo" alt="GitHub logo" src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg"&gt;
      &lt;span class="issue-title"&gt;
        RAG on DuckDB Implementation Based on Prototype
      &lt;/span&gt;
      &lt;span class="issue-number"&gt;#868&lt;/span&gt;
    &lt;/a&gt;
  &lt;/h1&gt;
  &lt;div class="github-thread"&gt;
    &lt;div class="timeline-comment-header"&gt;
      &lt;a href="https://github.com/mulla028" rel="noopener noreferrer"&gt;
        &lt;img class="github-liquid-tag-img" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Favatars.githubusercontent.com%2Fu%2F138519917%3Fv%3D4" alt="mulla028 avatar"&gt;
      &lt;/a&gt;
      &lt;div class="timeline-comment-header-text"&gt;
        &lt;strong&gt;
          &lt;a href="https://github.com/mulla028" rel="noopener noreferrer"&gt;mulla028&lt;/a&gt;
        &lt;/strong&gt; posted on &lt;a href="https://github.com/tarasglek/chatcraft.org/issues/868" rel="noopener noreferrer"&gt;&lt;time&gt;Mar 29, 2025&lt;/time&gt;&lt;/a&gt;
      &lt;/div&gt;
    &lt;/div&gt;
    &lt;div class="ltag-github-body"&gt;
      &lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Description&lt;/h2&gt;
&lt;span class="octicon octicon-link"&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;Recently, we implemented a &lt;a href="https://github.com/mulla028/duckdb-rag-prototype" rel="noopener noreferrer"&gt;prototype&lt;/a&gt; of &lt;code&gt;RAG on DuckDB&lt;/code&gt;. It proves that the implementation is doable for &lt;code&gt;ChatCraft&lt;/code&gt;, so it's time to start working on it!&lt;/p&gt;
&lt;p&gt;The implementation will take several steps; let's call them &lt;strong&gt;stages&lt;/strong&gt;. Since we already have DuckDB set up using &lt;code&gt;duckdb-wasm&lt;/code&gt;, the file loader, and the format-to-text extractors, we are skipping some of the steps (&lt;strong&gt;stages&lt;/strong&gt;). Here are the steps we need to take in order to implement it successfully:&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Proposed Implementation Stages&lt;/h2&gt;
&lt;span class="octicon octicon-link"&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Stage 1: Create Two New Tables in IndexedDB&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;Embeddings Table, with foreign key to a file&lt;/li&gt;
&lt;li&gt;Chunks Table, with foreign key to a file&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stage 2: Implement Chunking Logic&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;Proper Chunking with overlap (cf. &lt;a href="https://platform.openai.com/docs/assistants/tools/file-search#customizing-file-search-settings" rel="nofollow noopener noreferrer"&gt;https://platform.openai.com/docs/assistants/tools/file-search#customizing-file-search-settings&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Proper Chunking Storage in IndexedDB&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stage 3: Implement Embeddings Generation&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;Allow using a cloud-based model or local (transformers.js or tensorflow.js)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stage 4: Vector Search&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;Use DuckDB's extension Called &lt;strong&gt;VSS&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Load Embeddings, Chunks, etc. into DuckDB&lt;/li&gt;
&lt;li&gt;Apply &lt;strong&gt;HNSW&lt;/strong&gt; Indexing to Increase Speed of the Search ( &lt;code&gt;HNSW&lt;/code&gt; Indexing Provided by VSS extension)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stage 5: LLM Integration&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;Modify Prompt Construction to Include Retrieved Context&lt;/li&gt;
&lt;li&gt;Implement Source Attribution in Responses&lt;/li&gt;
&lt;li&gt;Adjust Token Management to Account For Context&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stage 6: Query Processing&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;Implement Embedding Generation for &lt;strong&gt;User Queries&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Use the Same Embedding Model as Documents for Consistency (&lt;code&gt;text-embedding-3-small&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;a class="mentioned-user" href="https://dev.to/humphd"&gt;@humphd&lt;/a&gt;, @tarasglek please take a look at the proposed implementation stages, and approve them. Let me know if I am missing something :)&lt;/p&gt;

    &lt;/div&gt;
    &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/tarasglek/chatcraft.org/issues/868" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;As you can see, the plan takes 6 stages. There could have been more, but we already have these features:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Automatic Text Extraction of Uploaded File to IndexedDB&lt;/li&gt;
&lt;li&gt;Chunking Logic (Thanks to one of the contributors)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Stage 1
&lt;/h2&gt;

&lt;p&gt;In the first stage, I had to add chunks and embeddings columns to the files table in IndexedDB. During the implementation, we decided to keep the embeddings inside the chunks, so each chunk carries its own vector embedding. It didn't require much time, just a couple of lines of code...&lt;/p&gt;

&lt;h2&gt;
  
  
  PR Expansion
&lt;/h2&gt;

&lt;p&gt;I realized that the PR was too small to land on its own, so I had to expand it a little and implement the chunking logic for each file. That meant implementing Stages 1 and 2 in a single PR.&lt;/p&gt;

&lt;p&gt;After a couple of hours, I pushed a bunch of commits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Created &lt;code&gt;FileChunk[]&lt;/code&gt; type&lt;/li&gt;
&lt;li&gt;Implemented chunking logic&lt;/li&gt;
&lt;li&gt;Added a condition so that files larger than 3MB are automatically chunked during import&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The rest was cleanup. However, one of the contributors pointed me to a chunking function he had already implemented. It helped me a lot, since my own chunking logic had some problems and needed to be redone...&lt;/p&gt;
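
&lt;p&gt;To make "chunking with overlap" from Stage 2 concrete, here is a minimal sketch. It is not the contributor's function or the code that was merged: the &lt;code&gt;TextChunk&lt;/code&gt; shape and &lt;code&gt;chunkWithOverlap&lt;/code&gt; are hypothetical names, and sizes are measured in characters as a simplifying assumption. The optional &lt;code&gt;embedding&lt;/code&gt; field reflects the decision to keep each embedding inside its chunk.&lt;/p&gt;

```typescript
// Hypothetical shape; the real FileChunk type in ChatCraft may differ.
type TextChunk = {
  index: number;
  text: string;
  embedding?: number[]; // filled in later, once embeddings are generated
};

// Splits text into windows of `chunkSize` characters, where each window
// repeats the last `overlap` characters of the previous one.
function chunkWithOverlap(text: string, chunkSize: number, overlap: number): TextChunk[] {
  if (!Number.isInteger(chunkSize) || !Number.isInteger(overlap)) {
    throw new Error("chunkSize and overlap must be integers");
  }
  // Each new chunk starts `step` characters after the previous one.
  const step = chunkSize - overlap;
  if (Math.sign(overlap) === -1 || Math.sign(step) !== 1) {
    throw new Error("overlap must be non-negative and smaller than chunkSize");
  }
  if (text.length === 0) {
    return [];
  }

  // Smallest number of windows whose union covers the whole text.
  const count = Math.max(1, Math.ceil((text.length - overlap) / step));

  return Array.from({ length: count }, function (_unused, index) {
    const start = index * step;
    return { index: index, text: text.slice(start, start + chunkSize) };
  });
}

const chunks = chunkWithOverlap("abcdefghij", 4, 1);
console.log(chunks.map(function (chunk) { return chunk.text; }));
// ["abcd", "defg", "ghij"]
```

&lt;p&gt;The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.&lt;/p&gt;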

&lt;p&gt;Eventually, the PR was approved and merged. You can take a look right here:&lt;br&gt;
&lt;/p&gt;
&lt;div class="ltag_github-liquid-tag"&gt;
  &lt;h1&gt;
    &lt;a href="https://github.com/tarasglek/chatcraft.org/pull/870" rel="noopener noreferrer"&gt;
      &lt;img class="github-logo" alt="GitHub logo" src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg"&gt;
      &lt;span class="issue-title"&gt;
        [RAG] Stages 1 &amp;amp; 2: New Columns and Chunking
      &lt;/span&gt;
      &lt;span class="issue-number"&gt;#870&lt;/span&gt;
    &lt;/a&gt;
  &lt;/h1&gt;
  &lt;div class="github-thread"&gt;
    &lt;div class="timeline-comment-header"&gt;
      &lt;a href="https://github.com/mulla028" rel="noopener noreferrer"&gt;
        &lt;img class="github-liquid-tag-img" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Favatars.githubusercontent.com%2Fu%2F138519917%3Fv%3D4" alt="mulla028 avatar"&gt;
      &lt;/a&gt;
      &lt;div class="timeline-comment-header-text"&gt;
        &lt;strong&gt;
          &lt;a href="https://github.com/mulla028" rel="noopener noreferrer"&gt;mulla028&lt;/a&gt;
        &lt;/strong&gt; posted on &lt;a href="https://github.com/tarasglek/chatcraft.org/pull/870" rel="noopener noreferrer"&gt;&lt;time&gt;Apr 01, 2025&lt;/time&gt;&lt;/a&gt;
      &lt;/div&gt;
    &lt;/div&gt;
    &lt;div class="ltag-github-body"&gt;
      &lt;p&gt;Stage 1 for #868&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Description&lt;/h2&gt;
&lt;span class="octicon octicon-link"&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;This is &lt;strong&gt;stage 1&lt;/strong&gt; of the &lt;strong&gt;RAG&lt;/strong&gt; implementation. Since we've decided to use &lt;strong&gt;vector search&lt;/strong&gt;, ChatCraft requires two new tables, as stated in the &lt;code&gt;Proposed Implementation&lt;/code&gt;. However, &lt;a class="mentioned-user" href="https://dev.to/humphd"&gt;@humphd&lt;/a&gt; suggested adding two new columns to the &lt;code&gt;ChatCraftFileTable&lt;/code&gt; instead: &lt;strong&gt;chunks&lt;/strong&gt; and &lt;strong&gt;embeddings&lt;/strong&gt;. Both columns are optional. &lt;strong&gt;chunks&lt;/strong&gt; will contain the chunked text (planned for stage 2, which is next), and &lt;strong&gt;embeddings&lt;/strong&gt; will contain the &lt;code&gt;vector embeddings&lt;/code&gt; that a model generates from those chunks (planned for &lt;strong&gt;stage 3&lt;/strong&gt;).&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;Small Concern&lt;/h3&gt;
&lt;span class="octicon octicon-link"&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;I totally understand that these columns are optional, but do we need to add them to the data schema as new fields in IndexedDB, like this? I don't see any other optional columns there, so I decided not to include them in the PR.&lt;/p&gt;
&lt;div class="highlight highlight-source-ts js-code-highlight"&gt;
&lt;pre&gt;&lt;span class="pl-smi"&gt;this&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;version&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-c1"&gt;13&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;stores&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-c1"&gt;files&lt;/span&gt;: &lt;span class="pl-s"&gt;"id, name, type, size, text, created, chunks, embeddings"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;UPD:&lt;/strong&gt; Decided to implement chunking here as well&lt;/p&gt;

    &lt;/div&gt;
    &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/tarasglek/chatcraft.org/pull/870" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;That means stages 1 and 2 are done, and we can move on to stage 3, embeddings generation. It will be interesting :)&lt;/p&gt;

</description>
    </item>
    <item>
      <title>OSD700 - RAG on DuckDB</title>
      <dc:creator>Amir Mullagaliev</dc:creator>
      <pubDate>Wed, 16 Apr 2025 23:15:13 +0000</pubDate>
      <link>https://dev.to/amullagaliev/osd700-rag-on-duckdb-b2a</link>
      <guid>https://dev.to/amullagaliev/osd700-rag-on-duckdb-b2a</guid>
      <description>&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Preface&lt;/li&gt;
&lt;li&gt;Introduction&lt;/li&gt;
&lt;li&gt;How it works?&lt;/li&gt;
&lt;li&gt;Final Decision&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Preface
&lt;/h2&gt;

&lt;p&gt;In the last post dedicated to &lt;strong&gt;OSD700&lt;/strong&gt;, I wrote that I had chosen a direction of development for the rest of the term. If you don't remember it, go read that post. If you are busy, here is the short version: I decided to implement a local RAG prototype and, based on the results, decide whether to integrate it into &lt;a href="https://github.com/tarasglek/chatcraft.org" rel="noopener noreferrer"&gt;ChatCraft.org&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;A month later, I am back with a series of posts about the results and, as a spoiler, I can tell you they are promising!&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Going back to the previous post, I am going to attach the prototype issue here:&lt;/p&gt;


&lt;div class="ltag_github-liquid-tag"&gt;
  &lt;h1&gt;
    &lt;a href="https://github.com/tarasglek/chatcraft.org/issues/803" rel="noopener noreferrer"&gt;
      &lt;img class="github-logo" alt="GitHub logo" src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg"&gt;
      &lt;span class="issue-title"&gt;
        Prototype RAG on DuckDB and File Attachments
      &lt;/span&gt;
      &lt;span class="issue-number"&gt;#803&lt;/span&gt;
    &lt;/a&gt;
  &lt;/h1&gt;
  &lt;div class="github-thread"&gt;
    &lt;div class="timeline-comment-header"&gt;
      &lt;a href="https://github.com/humphd" rel="noopener noreferrer"&gt;
        &lt;img class="github-liquid-tag-img" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Favatars.githubusercontent.com%2Fu%2F427398%3Fv%3D4" alt="humphd avatar"&gt;
      &lt;/a&gt;
      &lt;div class="timeline-comment-header-text"&gt;
        &lt;strong&gt;
          &lt;a href="https://github.com/humphd" rel="noopener noreferrer"&gt;humphd&lt;/a&gt;
        &lt;/strong&gt; posted on &lt;a href="https://github.com/tarasglek/chatcraft.org/issues/803" rel="noopener noreferrer"&gt;&lt;time&gt;Jan 27, 2025&lt;/time&gt;&lt;/a&gt;
      &lt;/div&gt;
    &lt;/div&gt;
    &lt;div class="ltag-github-body"&gt;
      &lt;p&gt;ChatCraft has been expanded to include File Attachments and DuckDB, which supports querying files.  The two features have been connected, so you can attach files, run SQL queries on them, get back results, download them, etc.&lt;/p&gt;
&lt;p&gt;Now that we have this foundation, I think we have most of what we need for building a &lt;a href="https://en.wikipedia.org/wiki/Retrieval-augmented_generation" rel="nofollow noopener noreferrer"&gt;RAG&lt;/a&gt; solution, when file attachments are too large to put into the chat context.&lt;/p&gt;
&lt;p&gt;I think the process would work like this:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;user attaches some files with text we can extract (PDF, source code, Word Doc, etc)&lt;/li&gt;
&lt;li&gt;somehow (UI? automatically based on file size) we decide when to use these file attachments for RAG vs. embedding directly in the chat messages&lt;/li&gt;
&lt;li&gt;we take the set of RAG-attachment-files and "index" them in DuckDB.  Maybe we use &lt;a href="https://duckdb.org/2021/01/25/full-text-search.html" rel="nofollow noopener noreferrer"&gt;full-text search&lt;/a&gt; or maybe we use vector search (see &lt;a href="https://motherduck.com/blog/search-using-duckdb-part-1/" rel="nofollow noopener noreferrer"&gt;part 1&lt;/a&gt;, &lt;a href="https://motherduck.com/blog/search-using-duckdb-part-2/" rel="nofollow noopener noreferrer"&gt;part 2&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;when the user asks a question, we use their prompt to create a query, get back results from the indexed docs, and include relevant text context along with the original prompt&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The initial version of this can be crude, without proper UI, optimal indexing, etc.  We need to play a bit to get this right.&lt;/p&gt;
&lt;p&gt;Likely, the best way to begin this work is to prototype it outside of ChatCraft using DuckDB and text files locally.&lt;/p&gt;

    &lt;/div&gt;
    &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/tarasglek/chatcraft.org/issues/803" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;Hopefully, it helped refresh your memory. &lt;/p&gt;

&lt;p&gt;The research took about 2 weeks, and it really helped me understand what is happening and how, at least locally.&lt;/p&gt;

&lt;p&gt;After those two weeks, I had to try to implement the feature locally and present it in class. The first attempt failed, but I didn't give up and made it work.&lt;/p&gt;

&lt;p&gt;Here's the repository where you can find my prototype solution:&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/mulla028" rel="noopener noreferrer"&gt;
        mulla028
      &lt;/a&gt; / &lt;a href="https://github.com/mulla028/duckdb-rag-prototype" rel="noopener noreferrer"&gt;
        duckdb-rag-prototype
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      CLI RAG prototype on DuckDB implemented for ChatCraft
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;duckdb-rag-prototype&lt;/h2&gt;
&lt;/div&gt;
&lt;p&gt;CLI RAG prototype on DuckDB implemented for ChatCraft using &lt;strong&gt;vector search&lt;/strong&gt;&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Getting Started&lt;/h2&gt;
&lt;/div&gt;
&lt;ol&gt;
&lt;li&gt;Clone the Repository&lt;/li&gt;
&lt;li&gt;Install DuckDB on your local machine (Optional, used for testing &lt;code&gt;duckdb -ui&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Install the dependencies
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;npm i&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Create a &lt;strong&gt;.env&lt;/strong&gt; file and add your &lt;code&gt;OPENAI_API_KEY&lt;/code&gt; to it
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;How to use it?&lt;/h2&gt;
&lt;/div&gt;
&lt;blockquote&gt;
&lt;p&gt;Once you have cloned the repo you will get the populated data inside of the targeted folder called &lt;code&gt;documents&lt;/code&gt;. Obviously, you may add any text file to process.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;ol&gt;
&lt;li&gt;First of all you will need to process all the files using the command:&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;npm run rag -- process&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;This command will process all the files, segmenting them into &lt;strong&gt;chunks&lt;/strong&gt; by &lt;strong&gt;sentences (default)&lt;/strong&gt; or &lt;strong&gt;paragraphs&lt;/strong&gt;. It will then generate vector embeddings of &lt;strong&gt;1536 dimensions&lt;/strong&gt; using the &lt;code&gt;text-embedding-3-small&lt;/code&gt; model.&lt;/p&gt;
&lt;p&gt;To use the &lt;strong&gt;paragraphs option&lt;/strong&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;npm run rag -- process -c &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;paragraphs&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
npm run rag -- process --chunking &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;paragraphs&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;…&lt;/p&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/mulla028/duckdb-rag-prototype" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;If you want to try it, I have written a &lt;code&gt;README.md&lt;/code&gt; that explains how to use this prototype.&lt;/p&gt;

&lt;h2&gt;
  
  
  How it works?
&lt;/h2&gt;

&lt;p&gt;Essentially, a process that sounds really complex consists of several simple stages and one complex one. Here they are:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Receive text from the file. (We learnt this during the first semester.)&lt;/li&gt;
&lt;li&gt;Chunk the text and store it in DuckDB. (Write logic that chunks text, e.g., by paragraph or by sentence.)&lt;/li&gt;
&lt;li&gt;Generate the vector embeddings and store them in DuckDB. (Using the &lt;code&gt;text-embedding-3-small&lt;/code&gt; &lt;strong&gt;OpenAI model&lt;/strong&gt;.)&lt;/li&gt;
&lt;li&gt;Vector search (the hard one)

&lt;ul&gt;
&lt;li&gt;Import the &lt;strong&gt;DuckDB VSS extension&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Apply &lt;strong&gt;HNSW&lt;/strong&gt; indexing to speed up the search (HNSW indexing is provided by the VSS extension)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Generate embeddings for the user query and generate the answer&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Final Decision
&lt;/h2&gt;

&lt;p&gt;Countless hours of &lt;strong&gt;RAG&lt;/strong&gt; research and prototyping eventually paid off: my professor liked the way I understood and implemented the problem, and we decided to try integrating it into &lt;a href="https://github.com/tarasglek/chatcraft.org" rel="noopener noreferrer"&gt;ChatCraft.org&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;First, however, I needed to write an implementation proposal issue that clearly identifies all the steps required to implement the feature successfully.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>SPO600 Project - Stage 2: Function Clone Detection and Analysis (Part 2)</title>
      <dc:creator>Amir Mullagaliev</dc:creator>
      <pubDate>Mon, 07 Apr 2025 12:10:16 +0000</pubDate>
      <link>https://dev.to/amullagaliev/spo600-project-stage-2-function-clone-detection-and-analysis-part-2-16k9</link>
      <guid>https://dev.to/amullagaliev/spo600-project-stage-2-function-clone-detection-and-analysis-part-2-16k9</guid>
      <description>&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Reproducing My Setup&lt;/li&gt;
&lt;li&gt;
Detailed Test Results

&lt;ul&gt;
&lt;li&gt;x86_64 Results&lt;/li&gt;
&lt;li&gt;aarch64 Challenges&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

Capabilities and Limitations

&lt;ul&gt;
&lt;li&gt;What My Implementation Can Do&lt;/li&gt;
&lt;li&gt;Technical Limitations&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Knowledge Gaps and Personal Reflections&lt;/li&gt;

&lt;li&gt;Technical Improvements for Stage III&lt;/li&gt;

&lt;li&gt;Conclusion&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Reproducing My Setup
&lt;/h2&gt;

&lt;p&gt;To replicate my work, follow these steps:&lt;/p&gt;

&lt;ul&gt;
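&lt;li&gt;&lt;p&gt;First, create or modify the pass file located at &lt;code&gt;gcc/gcc/tree-amullagaliev.cc&lt;/code&gt; with the implementation shown in &lt;a href="https://dev.to/amullagaliev/spo600-project-stage-2-function-clone-detection-and-analysis-part-1-2ob7"&gt;Part 1&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;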
&lt;li&gt;&lt;p&gt;Navigate to your GCC build directory:&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   &lt;span class="nb"&gt;cd&lt;/span&gt; ~/gcc-build-001/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Rebuild GCC with your modified pass:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   &lt;span class="nb"&gt;time &lt;/span&gt;make &lt;span class="nt"&gt;-j&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;nproc&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Test your pass using the provided test cases:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   &lt;span class="nb"&gt;cd&lt;/span&gt; /path/to/test/directory
   &lt;span class="nb"&gt;tar&lt;/span&gt; &lt;span class="nt"&gt;-xzf&lt;/span&gt; /public/spo600-test-clone.tgz
   &lt;span class="nb"&gt;cd &lt;/span&gt;spo600/examples/test-clone
   make
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Detailed Test Results
&lt;/h2&gt;

&lt;h3&gt;
  
  
  x86_64 Results
&lt;/h3&gt;

&lt;p&gt;On the &lt;strong&gt;x86_64&lt;/strong&gt; platform, my pass produced the expected output for both the prune and no-prune cases:&lt;/p&gt;

&lt;p&gt;For the prune test case:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;;; Function scale_samples (scale_samples.default, funcdef_no=23, decl_uid=3954, cgraph_uid=24, symbol_order=23)

************************************************************
*  ANALYZATION - Examining Function: scale_samples                *
************************************************************
************************************************************
*  ANALYSIS FINISHED!
************************************************************

;; Function scale_samples.popcnt (scale_samples.popcnt, funcdef_no=25, decl_uid=3985, cgraph_uid=30, symbol_order=28)

************************************************************
*  ANALYZATION - Examining Function: scale_samples.popcnt         *
************************************************************
PRUNE: scale_samples
CLONE FOUND: scale_samples
CURRENT: scale_samples.popcnt
************************************************************
*  End of Diagnostic
************************************************************
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For the no-prune test case:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;;; Function scale_samples.arch_x86_64_v3 (scale_samples.arch_x86_64_v3, funcdef_no=25, decl_uid=3985, cgraph_uid=30, symbol_order=28)

************************************************************
*  ANALYZATION - Examining Function: scale_samples.arch_x86_64_v3 *
************************************************************
NOPRUNE: scale_samples
CLONE FOUND: scale_samples
CURRENT: scale_samples.arch_x86_64_v3
************************************************************
*  End of Diagnostic
************************************************************
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  aarch64 Challenges
&lt;/h3&gt;

&lt;p&gt;On the &lt;strong&gt;aarch64&lt;/strong&gt; platform, I encountered this error:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gcc -D 'CLONE_ATTRIBUTE=__attribute__((target_clones("default","rng") ))'\
    -march=armv8-a -g -O3 -fno-lto  -ftree-vectorize  -fdump-tree-all -fdump-ipa-all -fdump-rtl-all \
    clone-test-core.c vol_createsample.o -o clone-test-aarch64-prune
clone-test-core.c:28:6: error: pragma or attribute 'target("rng")' is not valid
   28 | void scale_samples(int16_t *in, int16_t *out, int cnt, int volume) {
      |      ^~~~~~~~~~~~~
make: *** [Makefile:35: clone-test-aarch64-prune] Error 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This error happens because my compiler build did not accept &lt;code&gt;rng&lt;/code&gt; as a valid target attribute for the aarch64 architecture. Unlike x86_64, which has attributes like "popcnt" and "arch=x86-64-v3", aarch64 has a different set of supported CPU features.&lt;/p&gt;

&lt;p&gt;This highlights an important cross-platform point: Function Multi-Versioning attributes are architecture-specific, and code that works on one architecture may need changes to work on another.&lt;/p&gt;

&lt;h2&gt;
  
  
  Capabilities and Limitations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What My Implementation Can Do
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Base Function Identification&lt;/strong&gt;: Correctly identifies the base name of cloned functions by stripping variant suffixes.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Resolver Function Detection&lt;/strong&gt;: Recognizes and skips resolver functions, which handle runtime selection between clones.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Basic Structure Comparison&lt;/strong&gt;: Compares the number of basic blocks and GIMPLE statements to determine whether the functions are potentially equivalent.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Output Requirements&lt;/strong&gt;: Produces the required &lt;code&gt;PRUNE&lt;/code&gt; or &lt;code&gt;NOPRUNE&lt;/code&gt; messages in the correct format.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;State Management&lt;/strong&gt;: Manages state between function calls to compare different clones, using &lt;code&gt;std::string&lt;/code&gt; for storage rather than C-style character arrays, which makes the code more robust.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Technical Limitations
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Superficial Comparison&lt;/strong&gt;: The comparison is based only on block and statement counts, not on actual code semantics. Two functions with the same number of blocks and statements but different logic would incorrectly be considered identical.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;No SSA Variable Normalization&lt;/strong&gt;: The implementation doesn't normalize variable names before comparison, which a more robust solution would do.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Architecture Dependency&lt;/strong&gt;: As shown by the aarch64 error, the current implementation doesn't fully handle cross-architecture differences in FMV attributes.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Single Clone Assumption&lt;/strong&gt;: The code assumes only one cloned function with two variants, as per the project specs, but this does not scale to real-world codebases.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Limited Structural Analysis&lt;/strong&gt;: My pass doesn't analyze the control flow structure within functions, which would be necessary for a truly robust clone detection algorithm.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
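
&lt;p&gt;The first limitation can be made concrete with a minimal sketch. It assumes a hypothetical &lt;code&gt;VariantCounts&lt;/code&gt; record and a hypothetical &lt;code&gt;would_prune&lt;/code&gt; helper (neither exists in the pass itself) that keep only the two totals the pass compares:&lt;/p&gt;

```cpp
#include "cstddef"

// Hypothetical record of what a count-only comparison keeps per variant.
struct VariantCounts {
    std::size_t blocks;      // number of basic blocks
    std::size_t statements;  // number of GIMPLE statements
};

// Count-only rule: identical totals are treated as "prune".
// Nothing about the statements themselves is inspected.
bool would_prune(VariantCounts a, VariantCounts b) {
    if (a.blocks != b.blocks) return false;
    return a.statements == b.statements;
}
```

&lt;p&gt;Two variants with five blocks and twenty statements each would be reported as prunable here even if one multiplies where the other adds, which is the false positive described in the first limitation.&lt;/p&gt;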

&lt;h2&gt;
  
  
  Knowledge Gaps and Personal Reflections
&lt;/h2&gt;

&lt;p&gt;This project revealed several knowledge gaps I need to address:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Architecture-Specific Compiler Features&lt;/strong&gt;: I need a deeper understanding of how architecture-specific features are implemented across different platforms. My aarch64 issues highlighted this gap.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;GCC GIMPLE Internals&lt;/strong&gt;: While I understand the basics of the GIMPLE representation, I need a more thorough understanding of how to analyze and compare GIMPLE statements.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cross-Architecture Testing&lt;/strong&gt;: I need better strategies for developing and testing features that must work across different architectures.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I found the most challenging aspect to be understanding how to effectively compare function structures beyond simple metrics. Specifically, I struggled with how to normalize SSA variables and other identifiers to determine when two functions are semantically equivalent despite superficial differences.&lt;/p&gt;

&lt;p&gt;The most interesting part was seeing how GCC implements Function Multi-Versioning: the resolver functions and the naming conventions used for variants gave me insight into how runtime feature detection works. Professor Chris Tyler's lectures were incredibly helpful in understanding these concepts and the inner workings of GCC. I couldn't have figured this out without his explanations!&lt;/p&gt;

&lt;h2&gt;
  
  
  Technical Improvements for Stage III
&lt;/h2&gt;

&lt;p&gt;For Stage III, I plan to address these technical issues:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Deep Structural Comparison&lt;/strong&gt;: Implement a GIMPLE statement-by-statement comparison that normalizes SSA variables, labels, and basic block numbers.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Architecture-Agnostic Approach&lt;/strong&gt;: Modify the implementation to handle architecture-specific differences in a more robust way.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Hash-Based Signature&lt;/strong&gt;: Generate a normalized hash or signature for each function's structure to make comparisons more efficient and accurate.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Better State Management&lt;/strong&gt;: Improve the current state management to handle more complex scenarios with multiple clones.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Control Flow Analysis&lt;/strong&gt;: Add analysis of the control flow structure, which would capture the logical equivalence of functions beyond just statement counts.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Stage 2 has been challenging yet enlightening. Working directly with GCC has given me a deeper understanding of compiler optimization techniques, particularly Function Multi-Versioning.&lt;/p&gt;

&lt;p&gt;The cross-architecture challenges I encountered were unexpected but provided valuable lessons about how compiler features can differ between platforms.&lt;/p&gt;

&lt;p&gt;While my current implementation fulfills the basic requirements of the project, the limitations I identified provide clear direction for improvements in Stage III. I'm particularly interested in developing a more robust comparison algorithm that can accurately determine when two functions are semantically equivalent despite superficial differences.&lt;/p&gt;

&lt;p&gt;Professor Chris Tyler's guidance and lectures have been instrumental in this journey, providing the foundation of knowledge needed to tackle these complex compiler topics. Without his clear explanations, navigating GCC's internal structures would have been much more difficult.&lt;/p&gt;

&lt;p&gt;The journey continues in Stage III, where I'll improve these techniques and address the limitations identified here.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Note: All code for this project is available in my &lt;a href="https://github.com/mulla028/SPO600-Project" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;. Feel free to clone it and follow the steps above to reproduce my results.&lt;/em&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>SPO600: Project Stage 2 - Function Clone Detection and Analysis (Part 1)</title>
      <dc:creator>Amir Mullagaliev</dc:creator>
      <pubDate>Mon, 07 Apr 2025 10:59:51 +0000</pubDate>
      <link>https://dev.to/amullagaliev/spo600-project-stage-2-function-clone-detection-and-analysis-part-1-2ob7</link>
      <guid>https://dev.to/amullagaliev/spo600-project-stage-2-function-clone-detection-and-analysis-part-1-2ob7</guid>
      <description>&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Introduction&lt;/li&gt;
&lt;li&gt;Project Requirements&lt;/li&gt;
&lt;li&gt;What is Function Multi-Versioning (FMV)?&lt;/li&gt;
&lt;li&gt;Understanding the Challenge&lt;/li&gt;
&lt;li&gt;My Implementation Approach&lt;/li&gt;
&lt;li&gt;Key Functions and Data Structures&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Welcome back to my &lt;strong&gt;SPO600 blog&lt;/strong&gt;! If you didn't read my &lt;a href="https://dev.to/amullagaliev/spo600-project-stage-1-basic-gcc-pass-51i8"&gt;previous post&lt;/a&gt;, in which I described creating a basic &lt;code&gt;GCC pass&lt;/code&gt; that counts basic blocks and GIMPLE statements, you should definitely check it out first to understand the foundation of what we're building on. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;In Stage 2, we tackle a more complex challenge: building a &lt;strong&gt;Clone-Pruning Analysis Pass for GCC&lt;/strong&gt;. This pass analyzes functions cloned during compilation and determines if they are substantially similar enough to be pruned. It's a deep dive into &lt;strong&gt;GCC optimization&lt;/strong&gt; processes and how we can &lt;strong&gt;extend compiler capabilities&lt;/strong&gt;!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Learning about GCC was really interesting, especially through &lt;a href="https://github.com/ctyler" rel="noopener noreferrer"&gt;Professor&lt;/a&gt; Chris Tyler's lectures. His clear explanations helped me a lot in navigating the complexity of compiler development. Without his videos, I would probably still be trying to understand how GCC works!&lt;/p&gt;

&lt;h2&gt;
  
  
  Project Requirements
&lt;/h2&gt;

&lt;p&gt;For Stage 2, we need to create a pass that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Identifies functions that have been cloned (these will have names like &lt;code&gt;function.variant&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Examines clones to determine if they are &lt;strong&gt;substantially the same&lt;/strong&gt; or different&lt;/li&gt;
&lt;li&gt;Outputs a diagnostic message indicating whether the functions should be pruned or not&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To simplify the project, we are allowed to make these assumptions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;There is only one cloned function in the program&lt;/li&gt;
&lt;li&gt;There are only two versions (clones) of that function (ignoring the resolver)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What is Function Multi-Versioning (FMV)?
&lt;/h2&gt;

&lt;p&gt;Before diving into implementation, it's worth understanding what &lt;strong&gt;function cloning&lt;/strong&gt; or &lt;strong&gt;multi-versioning&lt;/strong&gt; is &lt;strong&gt;in GCC&lt;/strong&gt;. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Function Multi-Versioning&lt;/strong&gt; &lt;em&gt;is a technique where a compiler creates multiple versions of the same function, each optimized for different processor capabilities. For example, one version might use AVX instructions for newer processors, while another uses more basic instructions for compatibility.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;NOTE!&lt;/strong&gt; When the program runs, a resolver function chooses the appropriate version based on the actual CPU capabilities of the machine. This allows a single binary to run efficiently on different processor generations without needing separate builds.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding the Challenge
&lt;/h2&gt;

&lt;p&gt;The challenge here is to determine when two function variants are &lt;code&gt;substantially the same&lt;/code&gt;. According to the project specs, functions are &lt;strong&gt;substantially the same&lt;/strong&gt; if they are identical except for identifiers like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Temporary variable names&lt;/li&gt;
&lt;li&gt;Static single assignment (SSA) variable names&lt;/li&gt;
&lt;li&gt;Labels&lt;/li&gt;
&lt;li&gt;Basic block numbers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If two cloned functions are substantially the same, there's no reason to keep both versions in the final binary: we can &lt;code&gt;prune&lt;/code&gt; the redundant one.&lt;/p&gt;
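
&lt;p&gt;One rough way to picture this kind of comparison is to blank out the identifier noise before comparing statement text. The sketch below is an illustration only, not code from the pass: &lt;code&gt;collapse_ssa_versions&lt;/code&gt; and &lt;code&gt;is_digit&lt;/code&gt; are hypothetical helpers that work on plain strings shaped like GIMPLE dump lines and replace every &lt;code&gt;_N&lt;/code&gt; version suffix with &lt;code&gt;_#&lt;/code&gt;.&lt;/p&gt;

```cpp
#include "cctype"
#include "string"

// True when c is a decimal digit (the cast keeps std::isdigit well-defined).
static bool is_digit(char c) {
    return std::isdigit((unsigned char)c) != 0;
}

// Replace the numeric version in SSA-style names ("_4", "a_2") with "_#",
// so that only the shape of the statement remains.
// Assumption: any underscore directly followed by digits is a version suffix,
// which also rewrites ordinary identifiers such as "x86_64".
std::string collapse_ssa_versions(std::string stmt) {
    std::string out;
    std::size_t i = 0;
    while (i != stmt.size()) {
        // stmt[stmt.size()] reads as '\0' since C++11, so peeking at i + 1 is safe.
        bool version_suffix = false;
        if (stmt[i] == '_') version_suffix = is_digit(stmt[i + 1]);

        if (version_suffix) {
            out += "_#";
            ++i;                            // skip the underscore
            while (is_digit(stmt[i])) ++i;  // skip the digits; stops at '\0'
        } else {
            out += stmt[i];
            ++i;
        }
    }
    return out;
}
```

&lt;p&gt;With this, &lt;code&gt;_4 = a_2(D) * 2;&lt;/code&gt; and &lt;code&gt;_7 = a_5(D) * 2;&lt;/code&gt; collapse to the same text, while a statement that uses &lt;code&gt;+&lt;/code&gt; instead of &lt;code&gt;*&lt;/code&gt; does not. The approach is deliberately coarse: it also erases the difference between distinct SSA names, so a more careful version would renumber names in order of first appearance instead.&lt;/p&gt;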

&lt;h2&gt;
  
  
  My Implementation Approach
&lt;/h2&gt;

&lt;p&gt;After studying the GCC codebase and our previous work from Stage 1, I decided to implement a relatively simple but effective approach:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Function Recognition&lt;/strong&gt;: Identify the base function name by stripping away variant suffixes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Basic Comparison Metrics&lt;/strong&gt;: Compare the structure of the functions using:

&lt;ul&gt;
&lt;li&gt;Number of basic blocks&lt;/li&gt;
&lt;li&gt;Number of GIMPLE statements&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;While this isn't a full structural comparison, it provides a solid first-pass heuristic. Functions with different block counts or statement counts are definitely not substantially the same, while those with matching counts are likely similar (though this is not guaranteed).&lt;/p&gt;

&lt;p&gt;Here's the core logic from my implementation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Clone comparison logic using previous_function_name as flag.&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;previous_function_name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;empty&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// No clone stored; store current information.&lt;/span&gt;
    &lt;span class="n"&gt;previous_function_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;base_name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;previous_block_total&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bb_count&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;previous_statement_total&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gimple_count&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="c1"&gt;// For standalone function, print footer.&lt;/span&gt;
    &lt;span class="n"&gt;print_frame_footer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"ANALYSIS FINISHED!"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// A clone has already been stored; compare the stored info with the current function.&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;previous_function_name&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;base_name&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;
        &lt;span class="n"&gt;previous_block_total&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;bb_count&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;
        &lt;span class="n"&gt;previous_statement_total&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;gimple_count&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"PRUNE: %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;previous_function_name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;c_str&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"CLONE FOUND: %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;previous_function_name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;c_str&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"CURRENT: %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;full_fname&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;c_str&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"NOPRUNE: %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;previous_function_name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;c_str&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"CLONE FOUND: %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;previous_function_name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;c_str&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"CURRENT: %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;full_fname&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;c_str&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;print_frame_footer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"End of Diagnostic"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="c1"&gt;// Clear previous_function_name to allow storing next clone.&lt;/span&gt;
    &lt;span class="n"&gt;previous_function_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;previous_block_total&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;previous_statement_total&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;My overall approach was:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Store information about the first clone encountered&lt;/li&gt;
&lt;li&gt;When encountering a second clone with the same base name, compare it to the stored information&lt;/li&gt;
&lt;li&gt;Output a &lt;code&gt;PRUNE&lt;/code&gt; or &lt;code&gt;NOPRUNE&lt;/code&gt; decision based on the comparison&lt;/li&gt;
&lt;li&gt;Reset the stored state for potential future clone pairs&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Key Functions and Data Structures
&lt;/h2&gt;

&lt;p&gt;The main components of my implementation include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Static Storage Variables&lt;/strong&gt;: To maintain state between function calls
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;  &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;previous_function_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;previous_block_total&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;previous_statement_total&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Base Name Extraction&lt;/strong&gt;: Strips variant suffixes to find the base function name
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;  &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="nf"&gt;get_base_function_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;function&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;fun&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="nc"&gt;cgraph_node&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cgraph_node&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fun&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;decl&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;fname&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="nb"&gt;nullptr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
                                            &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;function_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fun&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
      &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;pos&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fname&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".resolver"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pos&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;npos&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
          &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;fname&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;substr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pos&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="n"&gt;pos&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fname&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sc"&gt;'.'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pos&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;npos&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
          &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;fname&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;substr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pos&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;fname&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Resolver Detection&lt;/strong&gt;: Special handling for resolver functions
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;  &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;is_resolver&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;full_fname&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".resolver"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;npos&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;is_resolver&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="n"&gt;print_frame_footer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"ANALYSIS FINISHED (resolver function)"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Block and Statement Counting&lt;/strong&gt;: Basic metrics for function comparison
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;  &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;bb_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;gimple_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="n"&gt;basic_block&lt;/span&gt; &lt;span class="n"&gt;bb&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="n"&gt;FOR_EACH_BB_FN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bb&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fun&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="n"&gt;bb_count&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gimple_stmt_iterator&lt;/span&gt; &lt;span class="n"&gt;gsi&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gsi_start_bb&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bb&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
           &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;gsi_end_p&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gsi&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
           &lt;span class="n"&gt;gsi_next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;gsi&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="n"&gt;gimple_count&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;(In Part 2, I'll cover the testing process, results, limitations, and future improvements for the project.)&lt;/p&gt;

</description>
    </item>
    <item>
      <title>OSD700: Stage 4</title>
      <dc:creator>Amir Mullagaliev</dc:creator>
      <pubDate>Thu, 13 Mar 2025 14:25:25 +0000</pubDate>
      <link>https://dev.to/amullagaliev/osd700-stage-4-1e2g</link>
      <guid>https://dev.to/amullagaliev/osd700-stage-4-1e2g</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;In last week's lecture, the professor helped me pick the work to do for the rest of this term. My main goal was to minimize front-end development and gain experience in back-end or middle-end work. That doesn't mean I won't touch UI/UX at all; it means I will focus on developing a killer feature.&lt;/p&gt;

&lt;p&gt;Therefore, we came up with an idea to develop a RAG on DuckDB. &lt;/p&gt;

&lt;h2&gt;
  
  
  What's RAG?
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;"Retrieval-Augmented Generation (RAG) is a hybrid AI framework that enhances language model outputs by combining the model's inherent knowledge with information retrieved from external sources. When a query is received, RAG first searches through connected databases, documents, or knowledge bases to find relevant information, then feeds this retrieved context alongside the original query into the language model. This approach addresses several limitations of standalone language models by providing access to up-to-date information beyond the model's training cutoff, reducing hallucinations by grounding responses in verified sources, enabling attribution to specific documents, and allowing for domain specialization without extensive model fine-tuning. RAG has become fundamental in enterprise AI applications, search engines, and customer support systems where factual accuracy and current information are essential." - &lt;em&gt;Claude AI&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The RAG will help ChatCraft search through text files and give an answer based on the user's prompt. There is a filed issue that describes everything:&lt;/p&gt;


&lt;div class="ltag_github-liquid-tag"&gt;
  &lt;h1&gt;
    &lt;a href="https://github.com/tarasglek/chatcraft.org/issues/803" rel="noopener noreferrer"&gt;
      &lt;img class="github-logo" alt="GitHub logo" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fassets.dev.to%2Fassets%2Fgithub-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg"&gt;
      &lt;span class="issue-title"&gt;
        Prototype RAG on DuckDB and File Attachments
      &lt;/span&gt;
      &lt;span class="issue-number"&gt;#803&lt;/span&gt;
    &lt;/a&gt;
  &lt;/h1&gt;
  &lt;div class="github-thread"&gt;
    &lt;div class="timeline-comment-header"&gt;
      &lt;a href="https://github.com/humphd" rel="noopener noreferrer"&gt;
        &lt;img class="github-liquid-tag-img" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Favatars.githubusercontent.com%2Fu%2F427398%3Fv%3D4" alt="humphd avatar"&gt;
      &lt;/a&gt;
      &lt;div class="timeline-comment-header-text"&gt;
        &lt;strong&gt;
          &lt;a href="https://github.com/humphd" rel="noopener noreferrer"&gt;humphd&lt;/a&gt;
        &lt;/strong&gt; posted on &lt;a href="https://github.com/tarasglek/chatcraft.org/issues/803" rel="noopener noreferrer"&gt;&lt;time&gt;Jan 27, 2025&lt;/time&gt;&lt;/a&gt;
      &lt;/div&gt;
    &lt;/div&gt;
    &lt;div class="ltag-github-body"&gt;
      &lt;p&gt;ChatCraft has been expanded to include File Attachments and DuckDB, which supports querying files.  The two features have been connected, so you can attach files, run SQL queries on them, get back results, download them, etc.&lt;/p&gt;
&lt;p&gt;Now that we have this foundation, I think we have most of what we need for building a &lt;a href="https://en.wikipedia.org/wiki/Retrieval-augmented_generation" rel="nofollow noopener noreferrer"&gt;RAG&lt;/a&gt; solution, when file attachments are too large to put into the chat context.&lt;/p&gt;
&lt;p&gt;I think the process would work like this:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;user attaches some files with text we can extract (PDF, source code, Word Doc, etc)&lt;/li&gt;
&lt;li&gt;somehow (UI? automatically based on file size) we decide when use these file attachments for RAG vs. embedding directly in the chat messages&lt;/li&gt;
&lt;li&gt;we take the set of RAG-attachment-files and "index" them in DuckDB.  Maybe we use &lt;a href="https://duckdb.org/2021/01/25/full-text-search.html" rel="nofollow noopener noreferrer"&gt;full-text search&lt;/a&gt; or maybe we use vector search (see &lt;a href="https://motherduck.com/blog/search-using-duckdb-part-1/" rel="nofollow noopener noreferrer"&gt;part 1&lt;/a&gt;, &lt;a href="https://motherduck.com/blog/search-using-duckdb-part-2/" rel="nofollow noopener noreferrer"&gt;part 2&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;when the user asks a question, we use their prompt to create a query, get back results from the indexed docs, and include relevant text context along with the original prompt&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The initial version of this can be crude, without proper UI, optimal indexing, etc.  We need to play a bit to get this right.&lt;/p&gt;
&lt;p&gt;Likely, the best way to begin this work is to prototype it outside of ChatCraft using DuckDB and text files locally.&lt;/p&gt;

    &lt;/div&gt;
    &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/tarasglek/chatcraft.org/issues/803" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;


&lt;h2&gt;
  
  
  What Have I Done?
&lt;/h2&gt;

&lt;p&gt;I started working toward a prototype implementation. It took me a while to research how everything works, and I have taken the first small steps, but I consider my local prototype a very raw version.&lt;/p&gt;

&lt;p&gt;Using langchain, openai and duckdb, I am working on a local version of this new feature before I start the web implementation and, eventually, integrate it into ChatCraft! It will take some time, but I am really motivated to finish and present it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This is a small blog post since I spent the week on research and a small part of the implementation. However, next week, I will write a huge blog post on how to implement a RAG prototype on DuckDB locally. See y'all!&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>programming</category>
      <category>webdev</category>
    </item>
    <item>
      <title>SPO600: Project Stage 1 - Basic GCC Pass</title>
      <dc:creator>Amir Mullagaliev</dc:creator>
      <pubDate>Mon, 10 Mar 2025 04:03:04 +0000</pubDate>
      <link>https://dev.to/amullagaliev/spo600-project-stage-1-basic-gcc-pass-51i8</link>
      <guid>https://dev.to/amullagaliev/spo600-project-stage-1-basic-gcc-pass-51i8</guid>
      <description>&lt;h1&gt;
  
  
  Table of Contents
&lt;/h1&gt;

&lt;ol&gt;
&lt;li&gt;Introduction&lt;/li&gt;
&lt;li&gt;
Steps To Create a Pass

&lt;ul&gt;
&lt;li&gt;What is a GCC Pass?&lt;/li&gt;
&lt;li&gt;Step 1 - Write a Pass&lt;/li&gt;
&lt;li&gt;Step 2 - Registering the Pass&lt;/li&gt;
&lt;li&gt;Step 3 - Add Object File&lt;/li&gt;
&lt;li&gt;Step 4 - Modify Header File&lt;/li&gt;
&lt;li&gt;Step 5 - Re-create the Makefile&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
Results

&lt;ul&gt;
&lt;li&gt;Dump File Outputs&lt;/li&gt;
&lt;li&gt;Code Limitations&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Conclusion&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;This blog post is dedicated to &lt;code&gt;SPO600's Project - Stage 1&lt;/code&gt;, where I work specifically with &lt;strong&gt;GCC&lt;/strong&gt;. If you haven't read the previous post, where I described the steps to build the &lt;strong&gt;GCC compiler&lt;/strong&gt; on &lt;strong&gt;Aarch64&lt;/strong&gt; and &lt;strong&gt;x86_64&lt;/strong&gt; servers and then compared them, I &lt;strong&gt;highly recommend doing so!&lt;/strong&gt; It will help you understand what I do in the current post, and why and how I do it.&lt;/p&gt;

&lt;p&gt;Stage one helps students prepare their environment and GCC for the second stage, where the major "heavy lifting" will happen. During this stage, I am creating a &lt;strong&gt;basic GCC pass&lt;/strong&gt; for the current development version of the GCC compiler which:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Iterates through the code being compiled.&lt;/li&gt;
&lt;li&gt;Prints the name of every function being compiled.&lt;/li&gt;
&lt;li&gt;Prints a count of the number of basic blocks in each function.&lt;/li&gt;
&lt;li&gt;Prints a count of the number of gimple statements in each function. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Modern GCC has poor documentation on how to create a pass, so I relied on the video lectures and documentation created by our &lt;a href="https://github.com/ctyler" rel="noopener noreferrer"&gt;professor&lt;/a&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="http://spo600.cdot.systems/doku.php?id=spo600:creating_a_gcc_pass" rel="noopener noreferrer"&gt;Creating a GCC Pass&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://spo600.cdot.systems/doku.php?id=spo600:start#week_6_-_class_ii" rel="noopener noreferrer"&gt;Lecture #1&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://spo600.cdot.systems/doku.php?id=spo600:start#week_7_-_class_i" rel="noopener noreferrer"&gt;Lecture #2&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Steps To Create a Pass
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;Honestly, these steps require patience and attention rather than extraordinary knowledge. By following these steps and the provided resources, anyone can reproduce what has been done in this stage. That makes sense, since we were told that this stage is about preparing the environment.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  What is a GCC Pass?
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;"A GCC pass is a modular component within the GNU Compiler Collection that performs a specific transformation or analysis task during the compilation process. Each pass operates on an intermediate representation of the code (such as GIMPLE or RTL), executing in a predetermined order within the compilation pipeline to transform source code into machine code. Passes can analyze code, optimize it, clean up after other passes, or implement target-specific transformations, with the pass manager coordinating their execution. Compiler developers can create custom passes to extend GCC's functionality, as in your project where you're implementing a pass to count basic blocks and GIMPLE statements within functions." - &lt;em&gt;Claude AI (Sonnet 3.7)&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Back to square one!&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1 - Write a Pass
&lt;/h3&gt;

&lt;p&gt;Once I had built GCC, I had to look over the existing &lt;code&gt;GCC passes&lt;/code&gt; to understand what a pass looks like and pick one of them as a template. To do so, I went to the GCC source tree, where those passes live.&lt;/p&gt;

&lt;p&gt;Since I cloned GCC from the git repository, my source code is located at &lt;code&gt;~/git/gcc&lt;/code&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Move to the &lt;code&gt;gcc&lt;/code&gt; &lt;strong&gt;sub-directory&lt;/strong&gt; with &lt;code&gt;cd gcc&lt;/code&gt;. You will end up in &lt;code&gt;~/git/gcc/gcc&lt;/code&gt;, where the actual compiler implementation is located.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Look for files matching &lt;code&gt;tree-*.cc&lt;/code&gt; or &lt;code&gt;tree-*.c&lt;/code&gt; to find passes that work on the tree/GIMPLE representation:&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ll tree&lt;span class="k"&gt;*&lt;/span&gt;.cc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You will find this kind of list of the passes implemented by GCC developers:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F58x0p59i1lfiaxfn1wsx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F58x0p59i1lfiaxfn1wsx.png" alt="Image description" width="800" height="323"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Pick one of these templates as the starting point. &lt;/p&gt;

&lt;p&gt;The professor's first example used &lt;code&gt;gcc/gcc/tree-nrv.cc&lt;/code&gt; as its template. &lt;/p&gt;

&lt;p&gt;Throughout this stage, I reproduced the professor's code to make sure that I was following along.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The test pass can be found in my git repository. Simply copy it if you want to keep things simple and follow along with this tutorial:&lt;/strong&gt; &lt;a href="https://github.com/mulla028/SPO600-Project/blob/main/stage-1/tree-amullagaliev-0.1.cc" rel="noopener noreferrer"&gt;click here to see the source code&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This test pass simply outputs all of the compiled functions in the dump file.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My final source code implementation for the pass may be found here&lt;/strong&gt;: &lt;a href="https://github.com/mulla028/SPO600-Project/blob/main/stage-1/tree-amullagaliev.cc" rel="noopener noreferrer"&gt;final pass' source code&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This implementation prints the name of each function along with its basic block and gimple statement counts, and then shows the totals after every function. &lt;em&gt;NOTE: Outputs will be presented as the final step&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2 - Registering the Pass
&lt;/h3&gt;

&lt;p&gt;It is very important to register the pass in &lt;code&gt;passes.def&lt;/code&gt; in order for GCC to recognize the custom pass. Otherwise, it won't work. &lt;/p&gt;

&lt;p&gt;This file is located at &lt;code&gt;~/git/gcc/gcc/passes.def&lt;/code&gt;. It lists the many passes that run during compilation, so the order is important. I decided to do the same as the professor and put my pass right after:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;NEXT_PASS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pass_nrv&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;My modified file looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="n"&gt;NEXT_PASS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pass_nrv&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;NEXT_PASS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tree_amullagaliev&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3 - Add Object File
&lt;/h3&gt;

&lt;p&gt;I added the object file for my pass to the &lt;strong&gt;OBJS&lt;/strong&gt; section of &lt;code&gt;Makefile.in&lt;/code&gt;. My source file is &lt;code&gt;tree-amullagaliev.cc&lt;/code&gt;, so I added &lt;code&gt;tree-amullagaliev.o&lt;/code&gt; to the &lt;strong&gt;OBJS&lt;/strong&gt; list. &lt;/p&gt;

&lt;p&gt;It should look something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="cm"&gt;/* existing content before the modification*/&lt;/span&gt;
&lt;span class="n"&gt;tree&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;dfa&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt; \
&lt;span class="n"&gt;tree&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;amullagaliev&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt; \
&lt;span class="n"&gt;tree&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;diagnostic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt; \
&lt;span class="cm"&gt;/* existing content continues */&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4 - Modify Header File
&lt;/h3&gt;

&lt;p&gt;To make my pass recognizable in the previously modified &lt;code&gt;passes.def&lt;/code&gt;, I had to declare it in the &lt;code&gt;tree-pass.h&lt;/code&gt; header file, which can be found at &lt;code&gt;~/git/gcc/gcc/tree-pass.h&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Add this declaration in order to allow GCC to recognize the function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="k"&gt;extern&lt;/span&gt; &lt;span class="n"&gt;gimple_opt_pass&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nf"&gt;make_tree_amullagaliev&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gcc&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctxt&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
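&lt;p&gt;Steps 2 to 4 touch three separate files, and a slip in any one of them is easy to miss. As a minimal sketch (the &lt;code&gt;check_pass_wiring&lt;/code&gt; helper is hypothetical and not part of GCC, and the demo runs against stand-in files in a temporary directory rather than a real GCC tree), a quick check could look like this:&lt;/p&gt;

```shell
#!/bin/sh
# Sketch: confirm that a custom pass is referenced in the three files edited in Steps 2-4.
# check_pass_wiring is a hypothetical helper, not something GCC provides.
check_pass_wiring() {
    src_dir=$1      # directory holding passes.def, Makefile.in and tree-pass.h
    pass_name=$2    # name used inside NEXT_PASS, e.g. tree_amullagaliev
    obj_name=$3     # object file listed in OBJS, e.g. tree-amullagaliev.o
    rc=0
    grep -qF "NEXT_PASS ($pass_name)" "$src_dir/passes.def" || { echo "missing in passes.def"; rc=1; }
    grep -qF "$obj_name" "$src_dir/Makefile.in" || { echo "missing in Makefile.in"; rc=1; }
    grep -qF "make_$pass_name" "$src_dir/tree-pass.h" || { echo "missing in tree-pass.h"; rc=1; }
    return $rc
}

# Demo: stand-in files that contain only the lines added in this post.
demo_dir=$(mktemp -d)
printf '%s\n' 'NEXT_PASS (tree_amullagaliev);' > "$demo_dir/passes.def"
printf '%s\n' 'tree-dfa.o \' 'tree-amullagaliev.o \' > "$demo_dir/Makefile.in"
printf '%s\n' 'extern gimple_opt_pass *make_tree_amullagaliev (gcc::context *ctxt);' > "$demo_dir/tree-pass.h"

if check_pass_wiring "$demo_dir" tree_amullagaliev tree-amullagaliev.o; then
    wiring_result="all three files reference the pass"
else
    wiring_result="wiring incomplete"
fi
echo "$wiring_result"
rm -rf "$demo_dir"
```

&lt;p&gt;On a real checkout, the first argument would be the source directory used in this post, &lt;code&gt;~/git/gcc/gcc&lt;/code&gt;.&lt;/p&gt;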



&lt;h3&gt;
  
  
  Step 5 - Re-create the Makefile
&lt;/h3&gt;

&lt;p&gt;As one of the final steps, I had to re-create the &lt;code&gt;Makefile&lt;/code&gt; inside my build tree, which is &lt;code&gt;~/gcc-build-001&lt;/code&gt;. This is needed for the changes in &lt;code&gt;Makefile.in&lt;/code&gt; to be picked up: the build system wouldn't automatically detect a change made &lt;strong&gt;only&lt;/strong&gt; to &lt;code&gt;Makefile.in&lt;/code&gt;. &lt;/p&gt;

&lt;p&gt;The easiest way is to delete the &lt;code&gt;Makefile&lt;/code&gt; in the gcc sub-directory of the build tree, which can be found at &lt;code&gt;~/gcc-build-001/gcc/Makefile&lt;/code&gt;. &lt;em&gt;NOTE: This method lets me avoid rebuilding everything!&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;IMPORTANT!&lt;/strong&gt; Don't get me wrong: deleting the &lt;code&gt;Makefile&lt;/code&gt; inside &lt;code&gt;~/gcc-build-001/gcc&lt;/code&gt; is required only when we change this single file: &lt;code&gt;~/git/gcc/gcc/Makefile.in&lt;/code&gt;. Future modifications of the pass don't require this step!&lt;/p&gt;

&lt;p&gt;Here's how it looks in bash:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; ~/gcc-build-001/gcc
&lt;span class="nb"&gt;rm &lt;/span&gt;Makefile
&lt;span class="nb"&gt;cd&lt;/span&gt; ..
&lt;span class="nb"&gt;time &lt;/span&gt;make &lt;span class="nt"&gt;-j&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;nproc&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt; |&amp;amp; &lt;span class="nb"&gt;tee &lt;/span&gt;build-xxx.log
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
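&lt;p&gt;The deletion can also be made conditional, so the &lt;code&gt;Makefile&lt;/code&gt; is removed only when &lt;code&gt;Makefile.in&lt;/code&gt; has actually changed. This is a minimal sketch that assumes the delete-and-rebuild approach described above. The &lt;code&gt;needs_regen&lt;/code&gt; helper is hypothetical, the demo uses stand-in files in a temporary directory, and the &lt;code&gt;-nt&lt;/code&gt; test is a common shell extension rather than strict POSIX:&lt;/p&gt;

```shell
#!/bin/sh
# Sketch: delete the generated gcc/Makefile only when Makefile.in is newer than it.
# needs_regen is a hypothetical helper; the paths below are stand-ins created for the demo.
needs_regen() {
    makefile_in=$1   # e.g. ~/git/gcc/gcc/Makefile.in
    makefile=$2      # e.g. ~/gcc-build-001/gcc/Makefile
    # True when the generated Makefile is missing or older than its template.
    [ ! -e "$makefile" ] || [ "$makefile_in" -nt "$makefile" ]
}

demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/src" "$demo_dir/build/gcc"
touch -t 202501010000 "$demo_dir/build/gcc/Makefile"   # generated earlier
touch -t 202501020000 "$demo_dir/src/Makefile.in"      # edited afterwards

if needs_regen "$demo_dir/src/Makefile.in" "$demo_dir/build/gcc/Makefile"; then
    rm "$demo_dir/build/gcc/Makefile"
    regen_action="removed stale Makefile"
else
    regen_action="Makefile is up to date"
fi
echo "$regen_action"
rm -rf "$demo_dir"
```

&lt;p&gt;After the removal, running &lt;code&gt;make&lt;/code&gt; from the top of the build tree proceeds exactly as in the commands above.&lt;/p&gt;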



&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;p&gt;Once I had rebuilt &lt;code&gt;GCC&lt;/code&gt; with the brand new pass, I was ready to test it!&lt;/p&gt;

&lt;p&gt;First of all, I had to write some test code, which can be found &lt;a href="https://github.com/mulla028/SPO600-Project/blob/main/stage-1/test-code/hello.c" rel="noopener noreferrer"&gt;here&lt;/a&gt;. I am leaving it for you so you can reuse it; it also looks like the professor's code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;stdio.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;p1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;p2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;p1&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;p2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"%d&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The next thing that I had to do was to create a &lt;code&gt;Makefile&lt;/code&gt; inside the test directory, which was also &lt;a href="https://github.com/mulla028/SPO600-Project/blob/main/stage-1/test-code/Makefile" rel="noopener noreferrer"&gt;uploaded&lt;/a&gt; to the git repository:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight make"&gt;&lt;code&gt;&lt;span class="nv"&gt;BINARIES&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;hello
&lt;span class="nv"&gt;CCFLAGS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nt"&gt;-g&lt;/span&gt; &lt;span class="nt"&gt;-O0&lt;/span&gt; &lt;span class="nt"&gt;-fno-builtin&lt;/span&gt; &lt;span class="nt"&gt;-fdump-tree-amullagaliev&lt;/span&gt;

&lt;span class="nl"&gt;all&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;${BINARIES}&lt;/span&gt;

&lt;span class="nl"&gt;hello&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;hello.c&lt;/span&gt;
    gcc &lt;span class="p"&gt;${&lt;/span&gt;CCFLAGS&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; hello hello.c

&lt;span class="nl"&gt;clean&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="p"&gt;${&lt;/span&gt;BINARIES&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt;.o &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Notice&lt;/strong&gt; how I used the flags to request the dump file for the &lt;strong&gt;pass&lt;/strong&gt; I have just implemented: &lt;code&gt;-fdump-tree-amullagaliev&lt;/code&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Dump File Outputs
&lt;/h3&gt;

&lt;p&gt;Upon the completion of all the steps above, I had to compile the sample code using &lt;code&gt;make&lt;/code&gt; command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;make hello
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two files will appear: &lt;code&gt;hello&lt;/code&gt; and &lt;code&gt;hello.c.265t.amullagaliev&lt;/code&gt;. &lt;/p&gt;

&lt;p&gt;Here are the dump file results for the two passes, the test pass and the final pass:&lt;/p&gt;
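&lt;p&gt;The number in the dump file name (&lt;code&gt;265t&lt;/code&gt; here) comes from my build and may differ on another GCC version, so a script should probably not hard-code it. As a minimal sketch, the hypothetical &lt;code&gt;list_dump_functions&lt;/code&gt; helper below finds the dump with a glob and lists the functions it covers. The demo runs on a stand-in dump that contains only two header lines in the format GCC writes at the top of each function's section:&lt;/p&gt;

```shell
#!/bin/sh
# Sketch: locate the pass dump without hard-coding its number, then list the functions in it.
# list_dump_functions is a hypothetical helper; the dump file below is a stand-in for the demo.
list_dump_functions() {
    # Each function's section of a dump starts with ";; Function NAME (...)".
    sed -n 's/^;; Function \([^ ]*\) .*/\1/p' "$1"
}

demo_dir=$(mktemp -d)
printf '%s\n' \
    ';; Function foo (foo, funcdef_no=0, decl_uid=3929, cgraph_uid=1, symbol_order=0)' \
    '' \
    ';; Function main (main, funcdef_no=1, decl_uid=3931, cgraph_uid=2, symbol_order=1)' \
    > "$demo_dir/hello.c.265t.amullagaliev"

# Match the numeric part of the file name with a glob instead of spelling it out.
for dump in "$demo_dir"/hello.c.*t.amullagaliev; do
    dump_functions=$(list_dump_functions "$dump")
done
echo "$dump_functions"
rm -rf "$demo_dir"
```

&lt;p&gt;In the test directory from this post, the glob would simply be &lt;code&gt;hello.c.*t.amullagaliev&lt;/code&gt;.&lt;/p&gt;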

&lt;p&gt;&lt;strong&gt;Test-Pass Output:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="p"&gt;;;&lt;/span&gt; &lt;span class="n"&gt;Function&lt;/span&gt; &lt;span class="nf"&gt;foo&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;funcdef_no&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;decl_uid&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3929&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cgraph_uid&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;symbol_order&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="n"&gt;FUnction&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt;
&lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="n"&gt;FUnction&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt;
&lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="n"&gt;FUnction&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt;


&lt;span class="cp"&gt;#### End amullagaliev diagnostics, start regular dump of current gimple ####
&lt;/span&gt;

&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;foo&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;p1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;p2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;D&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;3937&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;_3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;bb&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;
  &lt;span class="n"&gt;_3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p1_1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;D&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;p2_2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;D&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;bb&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;
&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;L0&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;:&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;_3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="p"&gt;}&lt;/span&gt;



&lt;span class="p"&gt;;;&lt;/span&gt; &lt;span class="n"&gt;Function&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;funcdef_no&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;decl_uid&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3931&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cgraph_uid&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;symbol_order&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="n"&gt;FUnction&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt;
&lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="n"&gt;FUnction&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt;
&lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="n"&gt;FUnction&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt;


&lt;span class="cp"&gt;#### End amullagaliev diagnostics, start regular dump of current gimple ####
&lt;/span&gt;

&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;D&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;3939&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;_7&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;bb&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;
  &lt;span class="n"&gt;a_1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="n"&gt;b_2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="n"&gt;c_5&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;foo&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a_1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b_2&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="n"&gt;printf&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"%d&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c_5&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="n"&gt;_7&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;bb&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;
&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;L0&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;:&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;_7&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;As you can see, only function names were produced.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Final Pass Output&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="p"&gt;;;&lt;/span&gt; &lt;span class="n"&gt;Function&lt;/span&gt; &lt;span class="nf"&gt;foo&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;funcdef_no&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;decl_uid&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2337&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cgraph_uid&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;symbol_order&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="o"&gt;=====&lt;/span&gt; &lt;span class="n"&gt;Basic&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;=====&lt;/span&gt;
&lt;span class="o"&gt;-----&lt;/span&gt; &lt;span class="n"&gt;Statement&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-----&lt;/span&gt;
&lt;span class="n"&gt;_3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p1_1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;D&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;p2_2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;D&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;=====&lt;/span&gt; &lt;span class="n"&gt;Basic&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;=====&lt;/span&gt;
&lt;span class="o"&gt;-----&lt;/span&gt; &lt;span class="n"&gt;Statement&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;-----&lt;/span&gt;
&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;L0&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;:&lt;/span&gt;
&lt;span class="o"&gt;-----&lt;/span&gt; &lt;span class="n"&gt;Statement&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="o"&gt;-----&lt;/span&gt;
&lt;span class="cp"&gt;# VUSE &amp;lt;.MEM_4(D)&amp;gt;
&lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;_3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="o"&gt;------------------------------------&lt;/span&gt;
&lt;span class="n"&gt;Total&lt;/span&gt; &lt;span class="n"&gt;Basic&lt;/span&gt; &lt;span class="n"&gt;Blocks&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;Total&lt;/span&gt; &lt;span class="n"&gt;Gimple&lt;/span&gt; &lt;span class="n"&gt;Statements&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="o"&gt;------------------------------------&lt;/span&gt;

&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;foo&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;p1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;p2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;D&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2345&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;_3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;bb&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;
  &lt;span class="n"&gt;_3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p1_1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;D&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;p2_2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;D&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;bb&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;
&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;L0&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;:&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;_3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="p"&gt;}&lt;/span&gt;



&lt;span class="p"&gt;;;&lt;/span&gt; &lt;span class="n"&gt;Function&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;funcdef_no&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;decl_uid&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2339&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cgraph_uid&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;symbol_order&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="o"&gt;=====&lt;/span&gt; &lt;span class="n"&gt;Basic&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;=====&lt;/span&gt;
&lt;span class="o"&gt;-----&lt;/span&gt; &lt;span class="n"&gt;Statement&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-----&lt;/span&gt;
&lt;span class="n"&gt;a_1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="o"&gt;-----&lt;/span&gt; &lt;span class="n"&gt;Statement&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;-----&lt;/span&gt;
&lt;span class="n"&gt;b_2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="o"&gt;-----&lt;/span&gt; &lt;span class="n"&gt;Statement&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="o"&gt;-----&lt;/span&gt;
&lt;span class="cp"&gt;# .MEM_4 = VDEF &amp;lt;.MEM_3(D)&amp;gt;
&lt;/span&gt;&lt;span class="n"&gt;c_5&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;foo&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a_1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b_2&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;-----&lt;/span&gt; &lt;span class="n"&gt;Statement&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="o"&gt;-----&lt;/span&gt;
&lt;span class="cp"&gt;# .MEM_6 = VDEF &amp;lt;.MEM_4&amp;gt;
&lt;/span&gt;&lt;span class="n"&gt;printf&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"%d&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c_5&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;-----&lt;/span&gt; &lt;span class="n"&gt;Statement&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="o"&gt;-----&lt;/span&gt;
&lt;span class="n"&gt;_7&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="o"&gt;=====&lt;/span&gt; &lt;span class="n"&gt;Basic&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;=====&lt;/span&gt;
&lt;span class="o"&gt;-----&lt;/span&gt; &lt;span class="n"&gt;Statement&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt; &lt;span class="o"&gt;-----&lt;/span&gt;
&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;L0&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;:&lt;/span&gt;
&lt;span class="o"&gt;-----&lt;/span&gt; &lt;span class="n"&gt;Statement&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt; &lt;span class="o"&gt;-----&lt;/span&gt;
&lt;span class="cp"&gt;# VUSE &amp;lt;.MEM_6&amp;gt;
&lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;_7&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="o"&gt;------------------------------------&lt;/span&gt;
&lt;span class="n"&gt;Total&lt;/span&gt; &lt;span class="n"&gt;Basic&lt;/span&gt; &lt;span class="n"&gt;Blocks&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;Total&lt;/span&gt; &lt;span class="n"&gt;Gimple&lt;/span&gt; &lt;span class="n"&gt;Statements&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;
&lt;span class="o"&gt;------------------------------------&lt;/span&gt;

&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;D&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2347&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;_7&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;bb&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;
  &lt;span class="n"&gt;a_1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="n"&gt;b_2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="n"&gt;c_5&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;foo&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a_1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b_2&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="n"&gt;printf&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"%d&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c_5&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="n"&gt;_7&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;bb&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;
&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;L0&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;:&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;_7&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;This output meets all the requirements set by the professor, which are listed in the introduction.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Both dump files are also available on GitHub: &lt;a href="https://github.com/mulla028/SPO600-Project/tree/main/stage-1/test-code/dump-files" rel="noopener noreferrer"&gt;click here&lt;/a&gt;. &lt;/p&gt;

&lt;h3&gt;
  
  
  Code Limitations
&lt;/h3&gt;

&lt;p&gt;I could write another whole blog post about the limitations of this code; however, this pass only serves as a counter and a function name printer :) &lt;/p&gt;

&lt;p&gt;As for the capabilities, they are exactly what is listed in the introduction.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;I hope someone found this post helpful, and was able to reproduce everything written here!&lt;/p&gt;

&lt;p&gt;Honestly, I faced some minor challenges. The first one was attention: I had to keep track of many files and always make sure that I had added the pass registration inside the gcc source file. The second one was the time it took to rebuild with the &lt;code&gt;Makefile&lt;/code&gt;. For some reason, it took 15 minutes on &lt;code&gt;x86&lt;/code&gt; and just 2 minutes on &lt;code&gt;Aarch64&lt;/code&gt;. I think this happened because the servers were under heavy load from other students; everyone was trying to build at the same time. I wouldn't be surprised if someone was building &lt;code&gt;gcc&lt;/code&gt; from scratch :D &lt;/p&gt;

&lt;p&gt;In my opinion, this is one of the most interesting courses: I am doing something new and mind-blowing. I really appreciate the professor's efforts and explanations. Even though we don't have much documentation, he still manages to deliver the content by writing his own tutorials and explaining everything clearly in his videos. &lt;/p&gt;

&lt;p&gt;See you in the next blog posts. I still have a lot of things to do: &lt;code&gt;Lab03&lt;/code&gt;, &lt;code&gt;Lab05&lt;/code&gt; and the rest of the project stages!&lt;/p&gt;

</description>
      <category>gcc</category>
    </item>
    <item>
      <title>SPO600: Lab 4 - Building GCC</title>
      <dc:creator>Amir Mullagaliev</dc:creator>
      <pubDate>Fri, 07 Mar 2025 20:29:27 +0000</pubDate>
      <link>https://dev.to/amullagaliev/spo600-lab-4-building-gcc-1lm8</link>
      <guid>https://dev.to/amullagaliev/spo600-lab-4-building-gcc-1lm8</guid>
      <description>&lt;h1&gt;
  
  
  Table of Contents
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;
Introduction

&lt;ul&gt;
&lt;li&gt;What is GCC?&lt;/li&gt;
&lt;li&gt;Where to Get Source Code?&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

Aarch64 vs x86

&lt;ul&gt;
&lt;li&gt;CPU Specifications&lt;/li&gt;
&lt;li&gt;Cache Memory&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

Build Preparation

&lt;ul&gt;
&lt;li&gt;Screen&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

Time to Build!

&lt;ul&gt;
&lt;li&gt;x86&lt;/li&gt;
&lt;li&gt;AArch64&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

Install the Build

&lt;ul&gt;
&lt;li&gt;Result&lt;/li&gt;
&lt;li&gt;Building C Programs&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

Experiments

&lt;ul&gt;
&lt;li&gt;Changed Timestamp Experiment&lt;/li&gt;
&lt;li&gt;Result&lt;/li&gt;
&lt;li&gt;Rebuild Software Without Making Any Changes&lt;/li&gt;
&lt;li&gt;Result&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Conclusion&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;For those who have never read my blog, I am writing about the Labs and Project for Seneca Polytechnic's Software Optimization and Portability course, or &lt;code&gt;SPO600&lt;/code&gt;. &lt;/p&gt;

&lt;p&gt;Today, I am going to talk about Lab 4, in which I build and install the GCC compiler using its Makefile. &lt;/p&gt;

&lt;p&gt;First of all, I would like to share all of the resources I used in order to finish this lab:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="http://spo600.cdot.systems/doku.php?id=spo600:building_gcc" rel="noopener noreferrer"&gt;GCC Build Guide&lt;/a&gt; - Provided by professor&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://gcc.gnu.org/git.html" rel="noopener noreferrer"&gt;GCC source code&lt;/a&gt; - Found on this page&lt;/li&gt;
&lt;li&gt;
&lt;a href="http://spo600.cdot.systems/doku.php?id=spo600:screen_tutorial" rel="noopener noreferrer"&gt;Screen&lt;/a&gt; - Provided by professor&lt;/li&gt;
&lt;li&gt;
&lt;a href="http://spo600.cdot.systems/doku.php?id=spo600:make_and_makefiles" rel="noopener noreferrer"&gt;Makefile&lt;/a&gt; - Provided by professor&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What is GCC?
&lt;/h3&gt;

&lt;p&gt;Before I start, you have to understand what &lt;strong&gt;GCC&lt;/strong&gt; is. &lt;strong&gt;The GNU Compiler Collection (GCC)&lt;/strong&gt; is a collection of compilers from the &lt;a href="https://en.wikipedia.org/wiki/GNU_Project" rel="noopener noreferrer"&gt;GNU Project&lt;/a&gt; that supports various programming languages, hardware architectures and operating systems. &lt;/p&gt;

&lt;h3&gt;
  
  
  Where to Get Source Code?
&lt;/h3&gt;

&lt;p&gt;The very first step of building &lt;strong&gt;GCC&lt;/strong&gt; is to get the source code. I used the code provided by GNU &lt;a href="https://gcc.gnu.org/git.html" rel="noopener noreferrer"&gt;here&lt;/a&gt; and cloned it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone git://gcc.gnu.org/git/gcc.git
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Since we are using two servers with two different architectures in this course, &lt;code&gt;aarch64&lt;/code&gt; and &lt;code&gt;x86&lt;/code&gt;, I have to execute all the steps explained in this blog on both of them. &lt;/p&gt;

&lt;p&gt;Before building, let's look at the differences in the specifications that matter most for this lab and make a prediction based on them. Mostly, we care about the speed of the build.&lt;/p&gt;

&lt;h2&gt;
  
  
  Aarch64 vs x86
&lt;/h2&gt;

&lt;h4&gt;
  
  
  CPU Specifications
&lt;/h4&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;x86_64&lt;/th&gt;
&lt;th&gt;aarch64&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;CPU Cores&lt;/td&gt;
&lt;td&gt;10 cores, 20 threads (2 threads per core)&lt;/td&gt;
&lt;td&gt;16 cores, 16 threads (1 thread per core)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Clock Speed&lt;/td&gt;
&lt;td&gt;3.70GHz base, up to 4.70GHz&lt;/td&gt;
&lt;td&gt;Not specified&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vendor&lt;/td&gt;
&lt;td&gt;Intel&lt;/td&gt;
&lt;td&gt;ARM&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  Cache Memory
&lt;/h4&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Cache Level&lt;/th&gt;
&lt;th&gt;x86_64&lt;/th&gt;
&lt;th&gt;aarch64&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;L1 Data&lt;/td&gt;
&lt;td&gt;320 KiB total (32 KiB per core)&lt;/td&gt;
&lt;td&gt;512 KiB total (32 KiB per core)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;L1 Instruction&lt;/td&gt;
&lt;td&gt;320 KiB total (32 KiB per core)&lt;/td&gt;
&lt;td&gt;768 KiB total (48 KiB per core)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;L2&lt;/td&gt;
&lt;td&gt;10 MiB total (1 MiB per core)&lt;/td&gt;
&lt;td&gt;8 MiB total (1 MiB per 2 cores)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;L3&lt;/td&gt;
&lt;td&gt;19.3 MiB (shared)&lt;/td&gt;
&lt;td&gt;8 MiB (shared)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;Based on the architectural comparison, some factors will likely influence the build performance.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Thread Advantage&lt;/strong&gt;: The x86_64 system has 20 threads vs. 16 threads on the ARM system. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cache Considerations&lt;/strong&gt;: The ARM system has larger L1 caches; however, the x86_64 system has a significantly larger L3 cache: &lt;strong&gt;19.3MiB vs 8MiB&lt;/strong&gt;. Building GCC involves processing large amounts of code, so the larger L3 cache on x86_64 may provide a meaningful advantage.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;As of now, I know that x86_64 has extensive specialized instruction sets that may accelerate certain compilation tasks, while the ARM system uses a more streamlined approach.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;My prediction is that the x86_64 system will complete the GCC build faster. &lt;/p&gt;

&lt;h2&gt;
  
  
  Build Preparation
&lt;/h2&gt;

&lt;p&gt;First of all, I created two new directories:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; git gcc-build-001
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, I went to the &lt;code&gt;git&lt;/code&gt; directory and cloned &lt;code&gt;gcc&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;git
git clone git://gcc.gnu.org/git/gcc.git
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



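&lt;p&gt;As a next step, I had to configure the GCC source code for a custom build. Since we already have GCC located at /usr/local, we want to install another version at ~/gcc-test-001. The configure script is run from inside &lt;code&gt;~/gcc-build-001&lt;/code&gt;, which is why the generated Makefile ends up in that directory:&lt;br&gt;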
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;~/git/gcc/configure &lt;span class="nt"&gt;--prefix&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$HOME&lt;/span&gt;/gcc-test-001
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
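
&lt;p&gt;The preparation steps above can be sketched as a single shell function. This is only a rough sketch under my own assumptions: the function name &lt;code&gt;configure_gcc&lt;/code&gt; and its three arguments are hypothetical, and it assumes the configure script is meant to run from inside the build directory so that the generated Makefile is written there:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Hypothetical helper (not part of GCC): configure an out-of-tree build.
#   $1 = source directory (absolute path)
#   $2 = build directory
#   $3 = install prefix
configure_gcc() {
  src_dir=$1
  build_dir=$2
  prefix=$3

  # Create the build directory if it does not exist yet
  mkdir -p "$build_dir" || return 1

  # Run configure from inside the build directory, in a subshell so the
  # caller's working directory stays untouched; the Makefile is written there
  (
    cd "$build_dir" || exit 1
    "$src_dir/configure" --prefix="$prefix"
  )
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With the paths from this post, it would be called as &lt;code&gt;configure_gcc "$HOME/git/gcc" "$HOME/gcc-build-001" "$HOME/gcc-test-001"&lt;/code&gt;. Keeping the source tree, the build directory and the install prefix separate means a failed build can be thrown away without touching the cloned sources.&lt;/p&gt;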



&lt;h3&gt;
  
  
  Screen
&lt;/h3&gt;

&lt;p&gt;Anyone who wants to build GCC should know that it takes at least 20 minutes in the best case, and may take up to several hours. If your connection to the server drops while the build is running, the build is terminated along with your session.&lt;/p&gt;

&lt;p&gt;To prevent this, you can use the &lt;a href="http://spo600.cdot.systems/doku.php?id=spo600:screen_tutorial" rel="noopener noreferrer"&gt;screen&lt;/a&gt; tool, a terminal multiplexer that is separate from bash. It lets you detach from a session and do whatever you want while the work continues inside the detached session. Surprisingly, it is a pretty simple tool to use. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Start a session (or reattach to an existing one):
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;screen &lt;span class="nt"&gt;-RaD&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Run any long-running command you want:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;make
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Detach from the session while it finishes the task using this key combination:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Ctrl+A, then D
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Reconnect to the session:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;screen &lt;span class="nt"&gt;-RaD&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Time to Build!
&lt;/h2&gt;

&lt;p&gt;We are ready to start building GCC, but first we have to open a screen session:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;screen &lt;span class="nt"&gt;-RaD&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Our Makefile is already located at &lt;code&gt;~/gcc-build-001&lt;/code&gt;, so we have to be inside this directory to start the build:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;gcc-build-001
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once we are inside the appropriate directory, it is time to perform the build by typing &lt;code&gt;make&lt;/code&gt;. However, we want to measure the time it takes and save &lt;code&gt;stdout&lt;/code&gt; and &lt;code&gt;stderr&lt;/code&gt; in the &lt;code&gt;build.log&lt;/code&gt; file. Therefore, we need a slightly more complex command than just &lt;code&gt;make&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;time &lt;/span&gt;make &lt;span class="nt"&gt;-j&lt;/span&gt; 24 |&amp;amp; &lt;span class="nb"&gt;tee &lt;/span&gt;build.log
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I'll be back once it builds on both architectures with the results!&lt;/p&gt;

&lt;h3&gt;
  
  
  x86
&lt;/h3&gt;

&lt;p&gt;x86_64 won the competition and finished first with a time of &lt;code&gt;47m37.647s&lt;/code&gt;, as I predicted! &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1dxivclwqbjoe4wrp81e.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1dxivclwqbjoe4wrp81e.jpg" alt="Image description" width="800" height="330"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  AArch64
&lt;/h3&gt;

&lt;p&gt;It felt like it took a year to finish. With a time of &lt;code&gt;124m55s&lt;/code&gt;, the aarch64 system loses...as...expected...&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjdyqmxzz7bpx4246o97l.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjdyqmxzz7bpx4246o97l.jpg" alt="Image description" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Install the Build
&lt;/h2&gt;

&lt;p&gt;After you've performed a build, it is time to install it by simply typing &lt;code&gt;make install&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;IMPORTANT!&lt;/strong&gt; &lt;/p&gt;

&lt;p&gt;To be able to prove later that our GCC was installed successfully, here is the version of gcc that was already on the machine, installed earlier by the admin or by the system:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1lppemns8h6g3qweiq5b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1lppemns8h6g3qweiq5b.png" alt="Image description" width="800" height="115"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now I am ready to install it!&lt;/p&gt;

&lt;p&gt;As you remember, during the build preparation stage, we configured the GCC source code for the custom build by typing &lt;code&gt;~/git/gcc/configure --prefix=$HOME/gcc-test-001&lt;/code&gt;. Now that the installation has finished, we can see that &lt;code&gt;~/gcc-test-001&lt;/code&gt; has appeared!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5qujer1b9w0ww95n1ba1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5qujer1b9w0ww95n1ba1.png" alt="Image description" width="678" height="62"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That is where the installed GCC now lives! To use it, we have to make a small adjustment to the shell's &lt;code&gt;PATH&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$HOME&lt;/span&gt;/gcc-test-001/bin:&lt;span class="nv"&gt;$PATH&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;... so that the &lt;code&gt;bin&lt;/code&gt; directory inside the installation directory comes first in the search order!&lt;/p&gt;
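&lt;p&gt;The effect of putting a directory first can be checked with a small sketch. It is not part of the lab: the tool name &lt;code&gt;mytool-demo&lt;/code&gt; and the temporary directories are made up for illustration, and the two stand-in programs are just symlinks to &lt;code&gt;/bin/sh&lt;/code&gt;. Assuming a POSIX-like shell, whichever directory comes first in &lt;code&gt;PATH&lt;/code&gt; should win the lookup:&lt;/p&gt;

```shell
# Hypothetical tool name and throwaway directories, for illustration only.
demo=$(mktemp -d)
mkdir -p "$demo/system/bin" "$demo/custom/bin"

# Two stand-in programs with the same name (symlinks to /bin/sh).
ln -s /bin/sh "$demo/system/bin/mytool-demo"
ln -s /bin/sh "$demo/custom/bin/mytool-demo"

# Only the "system" copy is on PATH at first.
PATH="$demo/system/bin:$PATH"
first=$(command -v mytool-demo)

# Prepending the custom directory makes its copy win the lookup.
PATH="$demo/custom/bin:$PATH"
second=$(command -v mytool-demo)

echo "before: $first"
echo "after:  $second"
```

&lt;p&gt;The second lookup resolves to the copy under &lt;code&gt;custom/bin&lt;/code&gt;, which is the same reason &lt;code&gt;gcc&lt;/code&gt; resolves to the build in &lt;code&gt;~/gcc-test-001/bin&lt;/code&gt; once that directory is prepended. Keep in mind that a plain assignment like this only lasts for the current shell session.&lt;/p&gt;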

&lt;h3&gt;
  
  
  Result
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;NOTE!&lt;/strong&gt; As you can see, we have installed an experimental or development version of GCC, which differs from the initial one!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftbb4p3byfo3es2xw7s0e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftbb4p3byfo3es2xw7s0e.png" alt="Image description" width="800" height="119"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Building C Programs
&lt;/h4&gt;

&lt;p&gt;This &lt;strong&gt;development version&lt;/strong&gt; of GCC is capable of compiling simple C programs:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3r2yg9z12zzmpbp6yxbb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3r2yg9z12zzmpbp6yxbb.png" alt="Image description" width="800" height="417"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I could have stopped here. However, we have to run a couple more experiments!&lt;/p&gt;

&lt;h2&gt;
  
  
  Experiments
&lt;/h2&gt;

&lt;p&gt;This lab includes two experiments that answer two questions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;How long does it take to rebuild GCC if we change the timestamp of one file?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;How long does it take to rebuild GCC without making any changes, just by invoking the &lt;code&gt;make&lt;/code&gt; command? &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Changed Timestamp Experiment
&lt;/h3&gt;

&lt;p&gt;First of all, we have to figure out how to change a timestamp...&lt;/p&gt;

&lt;p&gt;I've done some research, and it turns out that it is as simple as using the &lt;code&gt;touch&lt;/code&gt; command... Honestly, I have used this command a lot, but only to create new files; I had never applied it to existing ones. It proves that every new day, we learn new things, no matter how much we already know...&lt;/p&gt;

&lt;p&gt;Secondly, we have to find the file called &lt;code&gt;passes.cc&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Steps I took to complete this experiment:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Find the file called &lt;code&gt;passes.cc&lt;/code&gt;:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;find ~/git/gcc &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;"passes.cc"&lt;/span&gt;

# Result:
# /home/amullagaliev/git/gcc/gcc/passes.cc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Update timestamp:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;touch&lt;/span&gt; /home/amullagaliev/git/gcc/gcc/passes.cc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Go to &lt;code&gt;~/gcc-build-001&lt;/code&gt;:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; ~/gcc-build-001
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Rebuild the software by re-issuing the &lt;code&gt;make&lt;/code&gt; command.&lt;/li&gt;
&lt;/ul&gt;
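&lt;p&gt;Why does touching a file trigger a rebuild? &lt;code&gt;make&lt;/code&gt; compares modification times: a target is rebuilt when one of its prerequisites is newer than the target. The sketch below only imitates that check with the shell's &lt;code&gt;-nt&lt;/code&gt; ("newer than") test. The file names &lt;code&gt;demo.cc&lt;/code&gt; and &lt;code&gt;demo.o&lt;/code&gt; are hypothetical, and nothing is actually compiled:&lt;/p&gt;

```shell
# Hypothetical file names; this only imitates the timestamp check make performs.
workdir=$(mktemp -d)
src="$workdir/demo.cc"
obj="$workdir/demo.o"

# The source was last modified before the object file: nothing to do.
touch -t 202001010000 "$src"
touch -t 202001010001 "$obj"
if [ "$src" -nt "$obj" ]; then before="rebuild"; else before="up to date"; fi

# touch changes only the modification time, not the contents.
touch "$src"
if [ "$src" -nt "$obj" ]; then after="rebuild"; else after="up to date"; fi

echo "before touch: $before"
echo "after touch:  $after"
```

&lt;p&gt;Before the final &lt;code&gt;touch&lt;/code&gt; the pair counts as up to date; afterwards the source is newer than the object, so a tool following this rule would rebuild it even though the contents never changed.&lt;/p&gt;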

&lt;h4&gt;
  
  
  Result
&lt;/h4&gt;

&lt;p&gt;Fortunately, it took only &lt;code&gt;59s&lt;/code&gt; to rebuild. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi58zfm525cje84xbbeqr.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi58zfm525cje84xbbeqr.jpg" alt="Image description" width="800" height="431"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Rebuild Software Without Making Any Changes
&lt;/h3&gt;

&lt;p&gt;This is the easiest part of the lab; I just have to re-issue the &lt;code&gt;make&lt;/code&gt; command without changing anything first! To measure the time, I used the same command as before: &lt;code&gt;time make -j 24 |&amp;amp; tee build.log&lt;/code&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  Result
&lt;/h4&gt;

&lt;p&gt;It was the fastest build of the day, taking only &lt;code&gt;15s&lt;/code&gt;. I am so happy to see this number :D&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flqz9uidz390df9va1c5m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flqz9uidz390df9va1c5m.png" alt="Image description" width="800" height="432"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;These experiments went as I expected.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This lab took so much time due to the build time of GCC. However, I really enjoyed it. I am satisfied with the results, even though I had to start from scratch and rebuild twice. &lt;/p&gt;

&lt;p&gt;Unfortunately, I wasn't able to go through all the steps on &lt;code&gt;aarch64&lt;/code&gt;. The only thing I performed there was the build, since the server doesn't have enough space... It has only &lt;code&gt;12MB&lt;/code&gt; left, which isn't enough to install GCC.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhn5154poxcofbs4bczya.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhn5154poxcofbs4bczya.png" alt="Image description" width="800" height="328"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9j50vj96fq4ywajlqhbe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9j50vj96fq4ywajlqhbe.png" alt="Image description" width="800" height="89"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Anyway, I may come back with the experimental results later, once we get some more space on the &lt;code&gt;aarch64&lt;/code&gt; server. Thanks a lot to those who read this; I put a lot of effort into my blog posts!&lt;/p&gt;

</description>
      <category>gcc</category>
    </item>
    <item>
      <title>OSD700: Sprint 4 - Planning</title>
      <dc:creator>Amir Mullagaliev</dc:creator>
      <pubDate>Thu, 06 Mar 2025 13:09:42 +0000</pubDate>
      <link>https://dev.to/amullagaliev/osd700-sprint-4-planning-4ieb</link>
      <guid>https://dev.to/amullagaliev/osd700-sprint-4-planning-4ieb</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;This is the beginning of the second half of the term, which means that we have already made a significant number of contributions. However, it is not the end. At this point, I am preparing a clear plan with a set of goals that I want to achieve by the end of the term.&lt;/p&gt;

&lt;p&gt;During the last lecture, the professor talked about the problems we are experiencing now. One of the problems is that we are working hard on the project, but we jump between the tasks without finishing them 100%. It was a pretty good point since I felt that changing the focus and goals of my work wasn't resulting in high-quality results and PRs.&lt;/p&gt;

&lt;p&gt;Therefore, I decided to fix existing bugs caused by my previous PRs. Going forward, my primary purpose for the upcoming lecture is to set meaningful goals that I would be proud of after achieving them during the last half of the term.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bug I Finally Fixed
&lt;/h2&gt;

&lt;p&gt;In the middle of the first half of the term, I worked on the &lt;code&gt;File Attachment UI&lt;/code&gt;. Here's my PR:&lt;/p&gt;


&lt;div class="ltag_github-liquid-tag"&gt;
  &lt;h1&gt;
    &lt;a href="https://github.com/tarasglek/chatcraft.org/pull/804" rel="noopener noreferrer"&gt;
      &lt;img class="github-logo" alt="GitHub logo" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fassets.dev.to%2Fassets%2Fgithub-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg"&gt;
      &lt;span class="issue-title"&gt;
        [UI] File Attachment Added 
      &lt;/span&gt;
      &lt;span class="issue-number"&gt;#804&lt;/span&gt;
    &lt;/a&gt;
  &lt;/h1&gt;
  &lt;div class="github-thread"&gt;
    &lt;div class="timeline-comment-header"&gt;
      &lt;a href="https://github.com/mulla028" rel="noopener noreferrer"&gt;
        &lt;img class="github-liquid-tag-img" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Favatars.githubusercontent.com%2Fu%2F138519917%3Fv%3D4" alt="mulla028 avatar"&gt;
      &lt;/a&gt;
      &lt;div class="timeline-comment-header-text"&gt;
        &lt;strong&gt;
          &lt;a href="https://github.com/mulla028" rel="noopener noreferrer"&gt;mulla028&lt;/a&gt;
        &lt;/strong&gt; posted on &lt;a href="https://github.com/tarasglek/chatcraft.org/pull/804" rel="noopener noreferrer"&gt;&lt;time&gt;Jan 27, 2025&lt;/time&gt;&lt;/a&gt;
      &lt;/div&gt;
    &lt;/div&gt;
    &lt;div class="ltag-github-body"&gt;
      &lt;p&gt;Closes #794&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Description (UPDATED)&lt;/h2&gt;
&lt;span class="octicon octicon-link"&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;This PR improves UX in terms of file management. Previously user was only able to see the file in the PromptForm, but now he is able to use &lt;code&gt;paperclip&lt;/code&gt; icon to see all files attached in modal window, and manage them:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Attach files... (Same as in OptionsButton)&lt;/li&gt;
&lt;li&gt;Delete (using &lt;code&gt;removeFile&lt;/code&gt; from &lt;code&gt;src/lib/fs.ts&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Download (using &lt;code&gt;downloadFile&lt;/code&gt; from &lt;code&gt;src/lib/fs.ts&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If user never attached the file(s) &lt;code&gt;paperclip&lt;/code&gt; will trigger file attachment process without opening modal window.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Preview (UPDATED)&lt;/h2&gt;
&lt;span class="octicon octicon-link"&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;&lt;a rel="noopener noreferrer" href="https://github.com/user-attachments/assets/d19f6cf1-1890-43d0-85a7-269e076f1f89"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgithub.com%2Fuser-attachments%2Fassets%2Fd19f6cf1-1890-43d0-85a7-269e076f1f89" alt="ScreenRecording2025-01-27at7 36 33PM-ezgif com-video-to-gif-converter"&gt;&lt;/a&gt;&lt;/p&gt;

    &lt;/div&gt;
    &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/tarasglek/chatcraft.org/pull/804" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;During the implementation of the new UI feature, I faced a bug that prevented the user from attaching new files after deleting all of the existing ones. I was banging my head against the wall since I wasn't able to find the solution.&lt;/p&gt;

&lt;p&gt;However, after the lecture that I was talking about in the introduction, &lt;a href="https://github.com/aldrin312" rel="noopener noreferrer"&gt;Aldrin&lt;/a&gt; opened two issues addressing this problem. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/tarasglek/chatcraft.org/issues/837" rel="noopener noreferrer"&gt;Issue 1&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foqcaj1z3jm0iksogguvs.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foqcaj1z3jm0iksogguvs.gif" alt="Image description" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/tarasglek/chatcraft.org/issues/838" rel="noopener noreferrer"&gt;Issue 2&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmglc8vvxjm9c2c8o8we1.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmglc8vvxjm9c2c8o8we1.gif" alt="Image description" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Solution
&lt;/h3&gt;

&lt;p&gt;To fix it, I had to dive into my own code to understand why this was happening. The cause of these problems was pretty simple: I used two &lt;code&gt;&amp;lt;Input&amp;gt;&lt;/code&gt; elements, and one of them wasn't always rendered due to the &lt;code&gt;!isAttached&lt;/code&gt; condition:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Before:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{!isAttached &amp;amp;&amp;amp; (
  &amp;lt;Input
    multiple
    type="file"
    ref={fileInputRef}
    hidden
    onChange={handleFileChange}
    accept={acceptableFileFormats}
  /&amp;gt;
)}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;After:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;Input
  multiple
  type="file"
  ref={fileInputRef}
  hidden
  onChange={handleFileChange}
  accept={acceptableFileFormats}
/&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;With this change, a single input element is always rendered, with no duplicate. Moreover, I had to refresh the &lt;code&gt;fileInputRef&lt;/code&gt; after every deletion to ensure that it stays available for future use.&lt;/p&gt;

&lt;p&gt;Here's the &lt;strong&gt;PR&lt;/strong&gt; I opened:&lt;/p&gt;


&lt;div class="ltag_github-liquid-tag"&gt;
  &lt;h1&gt;
    &lt;a href="https://github.com/tarasglek/chatcraft.org/pull/849" rel="noopener noreferrer"&gt;
      &lt;img class="github-logo" alt="GitHub logo" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fassets.dev.to%2Fassets%2Fgithub-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg"&gt;
      &lt;span class="issue-title"&gt;
        [FIX] File Input bug fixed
      &lt;/span&gt;
      &lt;span class="issue-number"&gt;#849&lt;/span&gt;
    &lt;/a&gt;
  &lt;/h1&gt;
  &lt;div class="github-thread"&gt;
    &lt;div class="timeline-comment-header"&gt;
      &lt;a href="https://github.com/mulla028" rel="noopener noreferrer"&gt;
        &lt;img class="github-liquid-tag-img" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Favatars.githubusercontent.com%2Fu%2F138519917%3Fv%3D4" alt="mulla028 avatar"&gt;
      &lt;/a&gt;
      &lt;div class="timeline-comment-header-text"&gt;
        &lt;strong&gt;
          &lt;a href="https://github.com/mulla028" rel="noopener noreferrer"&gt;mulla028&lt;/a&gt;
        &lt;/strong&gt; posted on &lt;a href="https://github.com/tarasglek/chatcraft.org/pull/849" rel="noopener noreferrer"&gt;&lt;time&gt;Mar 05, 2025&lt;/time&gt;&lt;/a&gt;
      &lt;/div&gt;
    &lt;/div&gt;
    &lt;div class="ltag-github-body"&gt;
      &lt;p&gt;Closes #837 &amp;amp; #838&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Description&lt;/h2&gt;
&lt;span class="octicon octicon-link"&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;During the implementation of #804 I couldn't fix the problem after the deletion of all files, I couldn't interact with paperclip button unless refreshed the page. This time I realized that problem was hidden inside of the &lt;code&gt;&amp;lt;Input&amp;gt;&lt;/code&gt; elements that were duplicated, and also we didn't need a condition to render it, we had to render it in any case. Good lesson for me tho :)&lt;/p&gt;

    &lt;/div&gt;
    &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/tarasglek/chatcraft.org/pull/849" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;Result:&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3idy6svl9cd23fd60t3z.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3idy6svl9cd23fd60t3z.gif" alt="Image description" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As you may have noticed, when the number of files drops to zero, the modal window now closes, and the user has to attach a new file before the file attachments modal window can be opened again.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;It is so hard to pick a focus for my work, and I would like to figure it out during today's lecture. Next week, you will learn about the new focus area I've picked. Thank you for reading, and see you next week!&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
