<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Smit Gabani</title>
    <description>The latest articles on DEV Community by Smit Gabani (@smitgabani).</description>
    <link>https://dev.to/smitgabani</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F925410%2F47594afa-f20e-4923-bc4f-798a9a994f86.png</url>
      <title>DEV Community: Smit Gabani</title>
      <link>https://dev.to/smitgabani</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/smitgabani"/>
    <language>en</language>
    <item>
      <title>SPO Project : Auto-vectorization with iFunc - Stage 1</title>
      <dc:creator>Smit Gabani</dc:creator>
      <pubDate>Tue, 13 Dec 2022 21:35:14 +0000</pubDate>
      <link>https://dev.to/smitgabani/spo-project-auto-vectorization-with-ifunc-stage-1-3084</link>
      <guid>https://dev.to/smitgabani/spo-project-auto-vectorization-with-ifunc-stage-1-3084</guid>
      <description>&lt;h2&gt;
  
  
  "ifunc"
&lt;/h2&gt;

&lt;p&gt;ifunc was introduced so that developers can create programs that use Advanced SIMD, SVE, or SVE2: they provide multiple implementations of a function and select among them at runtime using a resolver function.&lt;/p&gt;

&lt;p&gt;The resolver function chooses among the implemented functions based on the capabilities of the system it is running on, i.e. it determines the appropriate function at runtime.&lt;/p&gt;
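
&lt;p&gt;As a rough illustration of the resolver idea, here is a minimal Python sketch. It is not the C ifunc mechanism itself: the names pick_variant, impl_asimd, impl_sve and impl_sve2, and the feature strings, are assumptions made up for this example. A real resolver would be written in C and would query the hardware capabilities.&lt;/p&gt;

```python
# Hypothetical sketch of resolver-style selection; all names are illustrative.

def impl_asimd(values):
    # Stand-in for the Advanced SIMD build of a function.
    return ("asimd", sum(values))

def impl_sve(values):
    # Stand-in for the SVE build of the same function.
    return ("sve", sum(values))

def impl_sve2(values):
    # Stand-in for the SVE2 build of the same function.
    return ("sve2", sum(values))

def pick_variant(features):
    """Return the most capable implementation for a set of CPU feature names."""
    if "sve2" in features:
        return impl_sve2
    if "sve" in features:
        return impl_sve
    # Advanced SIMD is treated as the fallback in this sketch.
    return impl_asimd

if __name__ == "__main__":
    chosen = pick_variant({"asimd", "sve"})
    print(chosen([1, 2, 3]))
```

&lt;p&gt;The caller only ever calls the function returned by pick_variant, which mirrors how a program calls one function name while the resolver decides which build sits behind it.&lt;/p&gt;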

&lt;h2&gt;
  
  
  Auto-vectorization:
&lt;/h2&gt;

&lt;p&gt;... where a computer program is converted from a scalar implementation, which processes a single pair of operands at a time, to a vector implementation, which processes one operation on multiple pairs of operands at once.&lt;/p&gt;

&lt;p&gt;On AArch64, auto-vectorization can target three major SIMD implementations: Advanced SIMD, SVE and SVE2. With GCC's ifunc support, a program can choose among builds for these implementations at runtime.&lt;/p&gt;

&lt;p&gt;Learn about ifunc and auto-vectorization to implement the project.&lt;/p&gt;

&lt;p&gt;The purpose of the created tool is:&lt;/p&gt;

&lt;p&gt;Given a function, create and implement an ifunc resolver. Then automatically compile the function with auto-vectorization for multiple SIMD systems, so that one of the builds is chosen at runtime, and generate a single output file.&lt;/p&gt;

&lt;p&gt;Call the program in place of compiling, in between the assembly and linking phases.&lt;/p&gt;

&lt;h2&gt;
  
  
  The language I will be using: Python
&lt;/h2&gt;

&lt;p&gt;I am comfortable writing and reading Python code, and I know Python has many libraries that I can use. The reason I am not using C or C++ is that I am not sure I could complete the project if I ran into trouble while working with C++. But at last I decided to work with C.&lt;/p&gt;

&lt;p&gt;I plan for my code to produce 3 builds successfully and automatically select the build that is appropriate for the hardware, then execute that build and get the output. Different output can be used.&lt;/p&gt;

&lt;p&gt;The program will accept 2 files: one will be the main file and the other will be the argument. For testing, the argument is a function.c file containing a single function, which will get vectorized.&lt;/p&gt;

&lt;p&gt;The tool will only work on aarch64 systems, so the testing will be done on the israel machine. I will focus on three SIMD implementations: ASIMD, SVE and SVE2.&lt;/p&gt;

&lt;p&gt;Steps:&lt;br&gt;
Get argument using sys i.e. function.c file.&lt;/p&gt;

&lt;p&gt;Linker function&lt;br&gt;
Functions for different modules&lt;br&gt;
will figure out in the future.&lt;/p&gt;

&lt;h2&gt;
  
  
  Challenges:
&lt;/h2&gt;

&lt;p&gt;Rename the functions and add those to ifunc c file and header file&lt;br&gt;
Calling the linking stage multiple times during building for different modules.&lt;/p&gt;

</description>
      <category>gratitude</category>
    </item>
    <item>
      <title>SPO Project : Auto-vectorization with iFunc - Stage 3.2- after due</title>
      <dc:creator>Smit Gabani</dc:creator>
      <pubDate>Tue, 13 Dec 2022 21:34:47 +0000</pubDate>
      <link>https://dev.to/smitgabani/spo-project-auto-vectorization-with-ifunc-stage-2-1jbc</link>
      <guid>https://dev.to/smitgabani/spo-project-auto-vectorization-with-ifunc-stage-2-1jbc</guid>
      <description>&lt;h2&gt;
  
  
  Testing:
&lt;/h2&gt;

&lt;p&gt;Root dir:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--RXzcbUqU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/n1ibi2i863grbounvr15.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--RXzcbUqU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/n1ibi2i863grbounvr15.png" alt="Image description" width="396" height="272"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Running&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python3 tool.py --inputfile function.c
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Missing tool.py running screenshot.&lt;/p&gt;

&lt;p&gt;A last-minute error occurred due to multiple file inputs; I will write about it in a separate blog post if it is past 11:59.&lt;/p&gt;

&lt;h3&gt;
  
  
  ls:
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--IqPagIi---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/w3oyech1mumiz5d0yukc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--IqPagIi---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/w3oyech1mumiz5d0yukc.png" alt="Image description" width="880" height="71"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  running the executable which has adjust chanels with
&lt;/h3&gt;

&lt;p&gt;arguments tests/input/bree.jpg 1.0 3.0 5.0 and tests/output/breelaa.jpg&lt;br&gt;
We can run this on the israel machine, which is an Armv8 system.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;./main tests/input/bree.jpg 1.0 3.0 5.0 tests/output/breelaa.jpg
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--mFcAaVxq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/rbxj9k650c8s4ldi58uq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--mFcAaVxq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/rbxj9k650c8s4ldi58uq.png" alt="Image description" width="880" height="94"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Emulating an SVE2 system using qemu-aarch64:&lt;br&gt;
Running with the same arguments:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; qemu-aarch64 ./main tests/input/bree.jpg 1.0 3.0 5.0 tests/output/breel_qemu.jpg
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--w2g3NN3Z--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/dupxdcbofm4400kjqi9i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--w2g3NN3Z--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/dupxdcbofm4400kjqi9i.png" alt="Image description" width="880" height="98"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The outputs are the same, but the built implementations are different.&lt;br&gt;
That does not mean that our function_sve2.o and function_asimd.o files are the same. I was working on comparing the .o files for the different architectures using filecmp, but got no results. I have still written some code for it.&lt;/p&gt;
&lt;h2&gt;
  
  
  How to use the tool:
&lt;/h2&gt;

&lt;p&gt;Run the tool, which accepts the function.c file as an argument and produces the main executable along with the functions it may use.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python3 tool.py --inputfile function.c
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There are 2 test files (images) provided by the professor, so we can use either of them.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;./main tests/input/bree.jpg 1.0 3.0 5.0 tests/output/breelaa.jpg
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The main executable will take the bree.jpg file and modify it based on the chosen function. So the first time we run the exe without qemu-aarch64 the asimd_adjust_chanels() function will be chosen using the iFunc resolver. &lt;br&gt;
The output file should be stored in test/output dir. ## error&lt;/p&gt;

&lt;p&gt;I forgot to share the screenshot of the modified tree after running the command.&lt;/p&gt;

&lt;p&gt;The output image will be different from the original image.&lt;/p&gt;

&lt;h2&gt;
  
  
  References:
&lt;/h2&gt;

&lt;p&gt;Help was provided by Naziur Khan.&lt;/p&gt;

&lt;p&gt;Passing multiple arguments:&lt;br&gt;
&lt;a href="https://stackoverflow.com/questions/15753701/how-can-i-pass-a-list-as-a-command-line-argument-with-argparse"&gt;https://stackoverflow.com/questions/15753701/how-can-i-pass-a-list-as-a-command-line-argument-with-argparse&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Using the subprocess module: used to run external OS commands and manipulate them.&lt;br&gt;
subprocess.Popen() returns a Popen object; its wait() method returns the exit code.&lt;br&gt;
&lt;a href="https://docs.python.org/3/library/subprocess.html"&gt;https://docs.python.org/3/library/subprocess.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Extract a function written in C from a file:&lt;br&gt;
&lt;a href="https://stackoverflow.com/questions/55078713/extract-function-code-from-c-sourcecode-file-with-python"&gt;https://stackoverflow.com/questions/55078713/extract-function-code-from-c-sourcecode-file-with-python&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Filecmp:&lt;br&gt;
&lt;a href="https://www.geeksforgeeks.org/python-filecmp-cmpfiles-method/"&gt;https://www.geeksforgeeks.org/python-filecmp-cmpfiles-method/&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>SPO 600 Project : Auto-vectorization with iFunc - Stage 3 - Final</title>
      <dc:creator>Smit Gabani</dc:creator>
      <pubDate>Sat, 10 Dec 2022 01:27:36 +0000</pubDate>
      <link>https://dev.to/smitgabani/spo-600-project-auto-vectorization-with-ifunc-stage-3-1n98</link>
      <guid>https://dev.to/smitgabani/spo-600-project-auto-vectorization-with-ifunc-stage-3-1n98</guid>
      <description>&lt;p&gt;I may create a new blog on this instead of editing this if the time passes 11:59.&lt;/p&gt;

&lt;p&gt;My Github repo:&lt;br&gt;
&lt;a href="https://github.com/smitgabani/spo_project.git"&gt;https://github.com/smitgabani/spo_project.git&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Explain the code:
&lt;/h2&gt;

&lt;p&gt;The Python libraries used are:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;subprocess - used to run external OS commands and manipulate them
shutil - for copy operations on a file
argparse - for parsing arguments
sys - for exit codes
filecmp - for comparing output files (.o)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The file starts executing from &lt;strong&gt;main&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;First, the arguments are read using argparse.&lt;br&gt;
The accepted arguments are:&lt;br&gt;
&lt;code&gt;--inputfile&lt;/code&gt;: a single file or multiple files separated by ','&lt;br&gt;
&lt;code&gt;--makeheaderfile&lt;/code&gt;: alternative to the makeheaders.c file&lt;br&gt;
&lt;code&gt;--arch&lt;/code&gt;: different architectures to compile for, separated by ','&lt;/p&gt;
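
&lt;p&gt;The argument handling described above could be sketched roughly as follows. This is not the code from my repo: the function name parse_tool_args and the default values for --makeheaderfile and --arch are assumptions made for this example.&lt;/p&gt;

```python
import argparse

def parse_tool_args(argv):
    """Parse the three options described above; the defaults are assumptions."""
    parser = argparse.ArgumentParser(description="ifunc build tool (sketch)")
    parser.add_argument("--inputfile", required=True,
                        help="one or more .c files separated by ','")
    parser.add_argument("--makeheaderfile", default="makeheaders.c",
                        help="alternative to the makeheaders.c file")
    parser.add_argument("--arch", default="asimd,sve,sve2",
                        help="architectures to compile for, separated by ','")
    args = parser.parse_args(argv)
    # Split the comma-separated values into lists, dropping empty entries.
    args.inputfile = [name for name in args.inputfile.split(",") if name]
    args.arch = [name for name in args.arch.split(",") if name]
    return args

if __name__ == "__main__":
    parsed = parse_tool_args(["--inputfile", "function.c,function2.c"])
    print(parsed.inputfile, parsed.arch)
```

&lt;p&gt;Splitting on ',' after parsing keeps the command line the same as in the examples, e.g. --inputfile function.c,function2.c.&lt;/p&gt;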

&lt;p&gt;Then get all the .c files supplied in the inputfile argument.&lt;br&gt;
Then loop through each file, doing the following:&lt;br&gt;
Get the function name.&lt;br&gt;
Create function files and prototypes for the different architectures.&lt;br&gt;
Process the argument files and rewrite the architecture files.&lt;br&gt;
Write the function_arch.c files.&lt;br&gt;
Compile the files.&lt;br&gt;
Check if the files are the same; if they are ...&lt;br&gt;
Make the ifunc header.&lt;br&gt;
Make the ifunc code files.&lt;br&gt;
Auto-vectorization.&lt;/p&gt;
&lt;h3&gt;
  
  
  I needed the bonus marks, so I contacted the professor to ask what functionality I could add and ....
&lt;/h3&gt;
&lt;h3&gt;
  
  
  Additional functions implemented:
&lt;/h3&gt;

&lt;p&gt;aarch64 and x86_64 systems - could not be done but .... &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Accept multiple function files.
You can pass multiple files, but each file should contain only one function.
Separate the file names with (",").
python3 tool.py --inputfile function.c,function2.c,function3.c&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;To test this I have created a function2.c file with the same function in function.c file but with different name i.e. function2&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Notifying the user if autovectorization could not be applied (could have used exit code but have used sys.exit("what happened"))&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Error handling has been done for the following.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;If compilation of makeheaders could not be done.&lt;/li&gt;
&lt;li&gt;If compilation for the different architectures could not be done,
i.e. gcc -g -O3 -c -march=armv8-a.. filename -o outputfilename
could not be completed.&lt;/li&gt;
&lt;li&gt;If autovectorization could be completed and function_main is not generated.&lt;/li&gt;
&lt;li&gt;Header file for the functions could not be created (stdout to output file)&lt;/li&gt;
&lt;/ol&gt;
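
&lt;p&gt;A minimal sketch of the compile step with this kind of error handling is shown below. The -march values are the ones used in this project; the names MARCH_FLAGS, build_compile_command and compile_variant are hypothetical, and for simplicity the sketch compiles the input file directly instead of the generated function_arch.c files.&lt;/p&gt;

```python
import subprocess
import sys

# -march values used in this project; the dictionary keys are assumed names.
MARCH_FLAGS = {
    "asimd": "armv8-a",
    "sve": "armv8-a+sve",
    "sve2": "armv8-a+sve2",
}

def build_compile_command(source, arch):
    """Return the gcc argument list for one architecture variant."""
    output = source.rsplit(".", 1)[0] + "_" + arch + ".o"
    return ["gcc", "-g", "-O3", "-c", "-march=" + MARCH_FLAGS[arch],
            source, "-o", output]

def compile_variant(source, arch):
    """Run gcc and stop with a message if the compilation fails."""
    result = subprocess.run(build_compile_command(source, arch))
    if result.returncode != 0:
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit("compilation failed for " + arch + ": " + source)

if __name__ == "__main__":
    # Only print the command here, so the sketch also runs without gcc.
    print(" ".join(build_compile_command("function.c", "sve2")))
```

&lt;p&gt;Keeping the command construction separate from the subprocess call makes it possible to check the flags without actually running the compiler.&lt;/p&gt;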

&lt;p&gt;The professor said it would be a plus if I could check the output of the compilation for all architectures and compare whether they are the same. If they are the same ... &lt;br&gt;
This is done using filecmp.&lt;/p&gt;
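
&lt;p&gt;One possible way to do this comparison with filecmp is sketched below. The helper name identical_pairs is an assumption, and the demo uses throwaway files that only stand in for the real .o outputs.&lt;/p&gt;

```python
import filecmp
import itertools
import os
import tempfile

def identical_pairs(paths):
    """Return the pairs of files whose contents are byte-for-byte equal."""
    same = []
    for first, second in itertools.combinations(paths, 2):
        # shallow=False compares file contents, not only the os.stat() data.
        if filecmp.cmp(first, second, shallow=False):
            same.append((first, second))
    return same

if __name__ == "__main__":
    # Demo with throwaway files standing in for the .o outputs.
    folder = tempfile.mkdtemp()
    demo = []
    for name, data in [("function_asimd.o", b"A"),
                       ("function_sve.o", b"B"),
                       ("function_sve2.o", b"B")]:
        path = os.path.join(folder, name)
        with open(path, "wb") as handle:
            handle.write(data)
        demo.append(path)
    print(identical_pairs(demo))
```

&lt;p&gt;A byte-for-byte comparison can report a difference even when the generated instructions match, for example because of debug information or the different symbol names, so this may explain why the comparison gave no useful results.&lt;/p&gt;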
&lt;h2&gt;
  
  
  What the tool does:
&lt;/h2&gt;
&lt;h3&gt;
  
  
  In short:
&lt;/h3&gt;

&lt;p&gt;Generates the following files&lt;br&gt;
 1. makeheaders_exe: created by compiling makeheaders.c; used to create the .h file for the function.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Generated using makeheaders_exe to find the function names and prototype. Works for more than one function in a single file.
Now I have used function.c file but if you pass --inputfile function.c,function2.c double the files will be generated.&lt;/li&gt;
&lt;li&gt;function_asimd.c  - function for Advanced SIMD build, function_sve.c - function for SVE build, function_sve2.c - function for SVE2 build&lt;/li&gt;
&lt;li&gt;function_asimd.o  - build for Advanced SIMD, function_sve.o - build for SVE, function_sve2.o - build for SVE2. produced after compiling respective .c files.&lt;/li&gt;
&lt;li&gt;ifunc.c (generated resolver function for ASIMD, SVE,SVE2 build for the target functions)&lt;/li&gt;
&lt;li&gt;ifunc.h (generated header of target functions for ASIMD, SVE,SVE2 variants)
argument_main_exe file will be produced for each argument supplied.&lt;/li&gt;
&lt;li&gt;function_main_exe (final compiled build file with auto vectorization capabilities)&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;
  
  
  Procedure and explaining the code:
&lt;/h3&gt;

&lt;p&gt;When executed with ideal arguments, the tool parses the argument(s), each of which should be a .c file containing a single function. It then compiles the makeheaders.c file and uses the resulting executable (makeheaders_exe) to generate a header file, i.e. for function.c. From the generated header file it gets the prototype(s) of the function in that file. I wanted to create a tool that could support multiple functions in the same file. It then creates separate files for all architectures, which are the same but have different names. Some of this is explained above.&lt;/p&gt;
&lt;h2&gt;
  
  
  Testing:
&lt;/h2&gt;

&lt;p&gt;Root dir:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--RXzcbUqU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/n1ibi2i863grbounvr15.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--RXzcbUqU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/n1ibi2i863grbounvr15.png" alt="Image description" width="396" height="272"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Running&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python3 tool.py --inputfile function.c
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A last-minute error occurred due to multiple file inputs; I will write about it in a separate blog post if it is past 11:59.&lt;/p&gt;

&lt;p&gt;after compiling:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--G2dZhC8M--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/86z098q5xnkgx5vqg7cf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--G2dZhC8M--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/86z098q5xnkgx5vqg7cf.png" alt="Image description" width="852" height="76"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  ls:
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--IqPagIi---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/w3oyech1mumiz5d0yukc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--IqPagIi---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/w3oyech1mumiz5d0yukc.png" alt="Image description" width="880" height="71"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Running the executable which has adjust chanels with arguments tests/input/bree.jpg 1.0 3.0 5.0 and tests/output/breelaa.jpg
&lt;/h3&gt;

&lt;p&gt;We can run this on the israel machine, which is an Armv8 system.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;./main tests/input/bree.jpg 1.0 3.0 5.0 tests/output/breelaa.jpg
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--mFcAaVxq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/rbxj9k650c8s4ldi58uq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--mFcAaVxq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/rbxj9k650c8s4ldi58uq.png" alt="Image description" width="880" height="94"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Emulating an SVE2 system using qemu-aarch64:&lt;br&gt;
Running with the same arguments:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; qemu-aarch64 ./main tests/input/bree.jpg 1.0 3.0 5.0 tests/output/breel_qemu.jpg
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--w2g3NN3Z--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/dupxdcbofm4400kjqi9i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--w2g3NN3Z--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/dupxdcbofm4400kjqi9i.png" alt="Image description" width="880" height="98"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The outputs are the same, but the built implementations are different.&lt;br&gt;
That does not mean that our function_sve2.o and function_asimd.o files are the same. I was working on comparing the .o files for the different architectures using filecmp, but got no results. I have still written some code for it.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I learned:
&lt;/h2&gt;

&lt;p&gt;About different arch, how to modify code according to each one of them.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I enjoyed:
&lt;/h2&gt;

&lt;p&gt;I wrote some code that solved the limitation section and some additional functionality.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I disliked:
&lt;/h2&gt;

&lt;p&gt;Some part of ifunc I could not understand (why that code) and have a few unsolved doubts.&lt;/p&gt;

&lt;p&gt;Will the tool be used by me in the future:&lt;br&gt;
Maybe. I may write C code for different architectures in the future; if so, I could use the &lt;/p&gt;

&lt;p&gt;The code I picked of used up:&lt;br&gt;
The code that utilizes or makes ifunc.c or ifunc.h&lt;/p&gt;

&lt;h1&gt;
  
  
  help and #uses in the code blocks
&lt;/h1&gt;

&lt;p&gt;Consent: Yes&lt;/p&gt;

&lt;h2&gt;
  
  
  References:
&lt;/h2&gt;

&lt;p&gt;Help was provided by Naziur Khan.&lt;/p&gt;

&lt;p&gt;Passing multiple arguments:&lt;br&gt;
&lt;a href="https://stackoverflow.com/questions/15753701/how-can-i-pass-a-list-as-a-command-line-argument-with-argparse"&gt;https://stackoverflow.com/questions/15753701/how-can-i-pass-a-list-as-a-command-line-argument-with-argparse&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Using the subprocess module: used to run external OS commands and manipulate them.&lt;br&gt;
subprocess.Popen() returns a Popen object; its wait() method returns the exit code.&lt;br&gt;
&lt;a href="https://docs.python.org/3/library/subprocess.html"&gt;https://docs.python.org/3/library/subprocess.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Extract a function written in C from a file:&lt;br&gt;
&lt;a href="https://stackoverflow.com/questions/55078713/extract-function-code-from-c-sourcecode-file-with-python"&gt;https://stackoverflow.com/questions/55078713/extract-function-code-from-c-sourcecode-file-with-python&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Filecmp:&lt;br&gt;
&lt;a href="https://www.geeksforgeeks.org/python-filecmp-cmpfiles-method/"&gt;https://www.geeksforgeeks.org/python-filecmp-cmpfiles-method/&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>SPO 600 Project : Auto-vectorization with iFunc - Stage 2</title>
      <dc:creator>Smit Gabani</dc:creator>
      <pubDate>Fri, 09 Dec 2022 02:01:45 +0000</pubDate>
      <link>https://dev.to/smitgabani/spo-600-project-auto-vectorization-with-ifunc-stage-2-4gil</link>
      <guid>https://dev.to/smitgabani/spo-600-project-auto-vectorization-with-ifunc-stage-2-4gil</guid>
      <description>&lt;p&gt;How to enable SVE and SVE2 on aarch64 systems:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gcc -g -O3 -c -march=armv8-a+sve ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gcc -g -O3 -c -march=armv8-a+sve2 ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;How to enable Advanced SIMD on aarch64 systems:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gcc -g -O3 -march=armv8-a ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I had difficulty deciding which language to choose. Initially I went with Python, as I thought it would be easy. Then, as the complexity of the project increased, I decided to go with C, as I thought it would be the right choice since we get to use gcc. Then I chose Python again after Chris recommended it.&lt;/p&gt;

&lt;p&gt;My code can be found at:&lt;br&gt;
&lt;a href="https://github.com/smitgabani/spo600_project"&gt;https://github.com/smitgabani/spo600_project&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Clone the professor's test code repo from:&lt;br&gt;
&lt;a href="https://github.com/ctyler/spo600-fall2022-project-test-code.git"&gt;https://github.com/ctyler/spo600-fall2022-project-test-code.git&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I have used the makeheaders utility, which some of my friends also used.&lt;br&gt;
&lt;a href="https://github.com/bjconlan/makeheaders.git"&gt;https://github.com/bjconlan/makeheaders.git &lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Step 1&lt;br&gt;
Start implementing 3 functions for different modules:&lt;/p&gt;

&lt;p&gt;I started by going through some code that Chris provided and separated the code. &lt;br&gt;
Then I parsed the string in order to get the function prototype and name. This took a lot of time.&lt;/p&gt;

&lt;p&gt;Now I will use the header utility as mentioned above.&lt;br&gt;
Next produce a code file called header.c and place the extracted prototype on top of that file. &lt;/p&gt;

&lt;p&gt;Then get the name of the function in the test code and add a suffix to the name, &lt;br&gt;
e.g.&lt;br&gt;
if the function's name is &lt;code&gt;foo&lt;/code&gt;, make three files: &lt;code&gt;foo_sve&lt;/code&gt;, &lt;code&gt;foo_sve2&lt;/code&gt; and &lt;code&gt;foo_adSMID&lt;/code&gt;. &lt;/p&gt;
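
&lt;p&gt;The renaming step could look roughly like the sketch below. The helper names variant_names and rename_function and the default suffix list are assumptions made for this example. The whole-word regular expression is a simplification: it would also rename matches inside comments or strings.&lt;/p&gt;

```python
import re

def variant_names(function_name, suffixes=("asimd", "sve", "sve2")):
    """Return one renamed identifier per variant, e.g. foo_sve."""
    return [function_name + "_" + suffix for suffix in suffixes]

def rename_function(source, function_name, new_name):
    """Replace whole-word occurrences of function_name in C source text."""
    pattern = r"\b" + re.escape(function_name) + r"\b"
    return re.sub(pattern, new_name, source)

if __name__ == "__main__":
    code = "int foo(int x) { return x + 1; }"
    for name in variant_names("foo"):
        print(rename_function(code, "foo", name))
```

&lt;p&gt;The word boundaries keep longer identifiers that merely start with the function name from being renamed by accident.&lt;/p&gt;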

&lt;p&gt;Step 2&lt;br&gt;
Make a resolver function that will have the logic to choose what implementation will be run&lt;/p&gt;

&lt;p&gt;Step 3&lt;/p&gt;

&lt;p&gt;Testing on VS Code:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Esfhf7kT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/yad3t9ur03c4j129obtp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Esfhf7kT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/yad3t9ur03c4j129obtp.png" alt="Image description" width="880" height="274"&gt;&lt;/a&gt;&lt;br&gt;
The errors are due to the architecture.&lt;/p&gt;

&lt;p&gt;To check the output we need to run the process on the israel machine, as that machine supports the architecture.&lt;/p&gt;

&lt;p&gt;Testing on Israel SPO600 server:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--kJUF5vWj--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/o8o45lhave8ozosiimiu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--kJUF5vWj--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/o8o45lhave8ozosiimiu.png" alt="Image description" width="880" height="302"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Emulating different systems using different architectures.&lt;/p&gt;

&lt;p&gt;So far the tool has errors.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>SPO 600 Project : Auto-vectorization with iFunc - Stage 1</title>
      <dc:creator>Smit Gabani</dc:creator>
      <pubDate>Fri, 09 Dec 2022 01:04:24 +0000</pubDate>
      <link>https://dev.to/smitgabani/spo-600-project-auto-vectorization-with-ifunc-stage-1-33oc</link>
      <guid>https://dev.to/smitgabani/spo-600-project-auto-vectorization-with-ifunc-stage-1-33oc</guid>
      <description>&lt;h2&gt;
  
  
  Objective:
&lt;/h2&gt;

&lt;p&gt;The GCC compiler introduced the ‘ifunc’ extension to allow developers to create programs that can use advanced SIMD, SVE2, etc.&lt;br&gt;
Our project is based on the use of ifunc, Advanced SIMD and SVE2.&lt;/p&gt;

&lt;p&gt;The goal for this project is to produce a &lt;strong&gt;proof-of-concept tool&lt;/strong&gt; that will take code that meets specific conditions and automatically build it with &lt;code&gt;ifunc&lt;/code&gt; capability to select between &lt;strong&gt;multiple, autovectorized versions of a function&lt;/strong&gt;, to take advantage of the best &lt;strong&gt;SIMD implementation&lt;/strong&gt; &lt;strong&gt;available on the CPU&lt;/strong&gt; on which the code is running.&lt;/p&gt;

&lt;p&gt;Different architectures provide different SIMD implementations, such as Advanced SIMD, SVE and SVE2, which let a single instruction operate on multiple data units. &lt;/p&gt;

&lt;p&gt;The tool designed by us will build 3 implementations for a &lt;code&gt;function.c&lt;/code&gt; file, which will have only one function.&lt;/p&gt;
&lt;h3&gt;
  
  
  The language I will be using: C
&lt;/h3&gt;

&lt;p&gt;I am comfortable writing and reading Python code, and I know Python has many libraries that I can use. The reason I am not using C or C++ is that I am not sure I could complete the project if I ran into trouble while working with C++. But at last I decided to work with C.&lt;/p&gt;
&lt;h3&gt;
  
  
  Overall Operation - Approach
&lt;/h3&gt;

&lt;p&gt;I am somewhat confused about how to approach the problem.&lt;/p&gt;

&lt;p&gt;Let's divide the working of the program into steps.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Get input&lt;/li&gt;
&lt;li&gt;Parameters&lt;/li&gt;
&lt;li&gt;Linker function&lt;/li&gt;
&lt;li&gt;Functions for different modules&lt;/li&gt;
&lt;li&gt;will figure out in the future.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The initial process will produce output similar to that of the gcc compiler.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gcc -c func.c -o func.o
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Challenges
&lt;/h3&gt;

&lt;p&gt;Calling the linking stage multiple times during building for different modules.&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>react</category>
      <category>learning</category>
    </item>
    <item>
      <title>SPO600 Lab 5 - Algorithm Selection Part 2: In depth</title>
      <dc:creator>Smit Gabani</dc:creator>
      <pubDate>Fri, 04 Nov 2022 02:25:05 +0000</pubDate>
      <link>https://dev.to/smitgabani/spo600-lab-5-algorithm-selection-part-2-in-depth-532f</link>
      <guid>https://dev.to/smitgabani/spo600-lab-5-algorithm-selection-part-2-in-depth-532f</guid>
      <description>&lt;p&gt;We want to measure the performance of each algorithm specifically and nothing else should be in the way in order to give a correct measurement of the time elapsed performing the algorithm.&lt;/p&gt;

&lt;p&gt;In order to do that we need to make some modifications to all the files.&lt;/p&gt;

&lt;p&gt;I have the code here for vol0.c&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// vol0.c - naive scaling algorithm
// Chris Tyler 2017.11.29-2021.11.16 - Licensed under GPLv3.
// For the Seneca College SPO600 Course

#include &amp;lt;stdlib.h&amp;gt;
#include &amp;lt;stdio.h&amp;gt;
#include &amp;lt;stdint.h&amp;gt;
#include &amp;lt;stdbool.h&amp;gt;
#include "vol.h"
#include &amp;lt;time.h&amp;gt;

int16_t scale_sample(int16_t sample, int volume) {

        return (int16_t) ((float) (volume/100.0) * (float) sample);
}

int main() {
        int             x;
        int             ttl=0;

// ---- Create in[] and out[] arrays
        int16_t*        in;
        int16_t*        out;
        in=(int16_t*) calloc(SAMPLES, sizeof(int16_t));
        out=(int16_t*) calloc(SAMPLES, sizeof(int16_t));

        clock_t start_t, end_t;
// ---- Create dummy samples in in[]
        vol_createsample(in, SAMPLES);

// ---- Start timing after sample creation so only the scaling loop is measured
        start_t = clock();

// ---- This is the part we're interested in!
// ---- Scale the samples from in[], placing results in out[]
//
        for (x = 0; x &amp;lt; SAMPLES; x++) {
                out[x]=scale_sample(in[x], VOLUME);
        }
        end_t = clock();
        printf("Time elapsed: %f\n", ((double)(end_t - start_t))/CLOCKS_PER_SEC);

// ---- This part sums the samples. (Why is this needed?)
        for (x = 0; x &amp;lt; SAMPLES; x++) {
                ttl=(ttl+out[x])%1000;
        }

// ---- Print the sum of the samples. (Why is this needed?)
        printf("Result: %d\n", ttl);

        return 0;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Declare start_t and end_t, both of type clock_t.&lt;/p&gt;

&lt;p&gt;We wrap the scaling loop in &lt;code&gt;start_t = clock();&lt;/code&gt; and &lt;code&gt;end_t = clock();&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Then we print the elapsed time in seconds:&lt;br&gt;
&lt;code&gt;printf("Time elapsed: %f\n", ((double)(end_t - start_t))/CLOCKS_PER_SEC);&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Do this for all the vol*.c files.&lt;/p&gt;

&lt;p&gt;When you run ./vol0 you will get output like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F81g3thmm4l6r5ojn2rsm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F81g3thmm4l6r5ojn2rsm.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Run ./vol0 in a loop to collect several measurements:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpcwl1651bwszhd6wrcpe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpcwl1651bwszhd6wrcpe.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Observations:
&lt;/h2&gt;

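&lt;p&gt;AArch64 system (time elapsed in seconds, 15 runs per program):&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;Run&lt;/th&gt;&lt;th&gt;vol0&lt;/th&gt;&lt;th&gt;vol1&lt;/th&gt;&lt;th&gt;vol2&lt;/th&gt;&lt;th&gt;vol4&lt;/th&gt;&lt;th&gt;vol5&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;4.980222&lt;/td&gt;&lt;td&gt;4.313869&lt;/td&gt;&lt;td&gt;10.567804&lt;/td&gt;&lt;td&gt;3.481179&lt;/td&gt;&lt;td&gt;4.04965&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;2&lt;/td&gt;&lt;td&gt;4.968114&lt;/td&gt;&lt;td&gt;4.323427&lt;/td&gt;&lt;td&gt;10.560218&lt;/td&gt;&lt;td&gt;3.456832&lt;/td&gt;&lt;td&gt;4.045124&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;3&lt;/td&gt;&lt;td&gt;4.959307&lt;/td&gt;&lt;td&gt;4.30239&lt;/td&gt;&lt;td&gt;10.59247&lt;/td&gt;&lt;td&gt;3.460554&lt;/td&gt;&lt;td&gt;4.045296&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;4&lt;/td&gt;&lt;td&gt;4.973315&lt;/td&gt;&lt;td&gt;4.325246&lt;/td&gt;&lt;td&gt;10.590718&lt;/td&gt;&lt;td&gt;3.469896&lt;/td&gt;&lt;td&gt;4.028074&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;5&lt;/td&gt;&lt;td&gt;4.961954&lt;/td&gt;&lt;td&gt;4.31275&lt;/td&gt;&lt;td&gt;10.585899&lt;/td&gt;&lt;td&gt;3.493608&lt;/td&gt;&lt;td&gt;4.039965&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;6&lt;/td&gt;&lt;td&gt;4.976137&lt;/td&gt;&lt;td&gt;4.320063&lt;/td&gt;&lt;td&gt;10.532851&lt;/td&gt;&lt;td&gt;3.438831&lt;/td&gt;&lt;td&gt;4.040304&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;7&lt;/td&gt;&lt;td&gt;4.959057&lt;/td&gt;&lt;td&gt;4.349743&lt;/td&gt;&lt;td&gt;10.642468&lt;/td&gt;&lt;td&gt;3.482111&lt;/td&gt;&lt;td&gt;4.055758&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;8&lt;/td&gt;&lt;td&gt;4.960718&lt;/td&gt;&lt;td&gt;4.317437&lt;/td&gt;&lt;td&gt;10.534451&lt;/td&gt;&lt;td&gt;3.469382&lt;/td&gt;&lt;td&gt;4.150955&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;9&lt;/td&gt;&lt;td&gt;4.95485&lt;/td&gt;&lt;td&gt;4.336698&lt;/td&gt;&lt;td&gt;10.548297&lt;/td&gt;&lt;td&gt;3.451517&lt;/td&gt;&lt;td&gt;4.085376&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;10&lt;/td&gt;&lt;td&gt;4.960642&lt;/td&gt;&lt;td&gt;4.329521&lt;/td&gt;&lt;td&gt;10.552774&lt;/td&gt;&lt;td&gt;3.455712&lt;/td&gt;&lt;td&gt;4.022335&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;11&lt;/td&gt;&lt;td&gt;4.952321&lt;/td&gt;&lt;td&gt;4.332141&lt;/td&gt;&lt;td&gt;10.550225&lt;/td&gt;&lt;td&gt;3.384459&lt;/td&gt;&lt;td&gt;4.041784&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;12&lt;/td&gt;&lt;td&gt;4.990002&lt;/td&gt;&lt;td&gt;4.334904&lt;/td&gt;&lt;td&gt;10.572559&lt;/td&gt;&lt;td&gt;3.423148&lt;/td&gt;&lt;td&gt;4.058293&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;13&lt;/td&gt;&lt;td&gt;4.942643&lt;/td&gt;&lt;td&gt;4.326342&lt;/td&gt;&lt;td&gt;10.545258&lt;/td&gt;&lt;td&gt;3.420771&lt;/td&gt;&lt;td&gt;4.096189&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;14&lt;/td&gt;&lt;td&gt;4.957297&lt;/td&gt;&lt;td&gt;4.317604&lt;/td&gt;&lt;td&gt;10.548684&lt;/td&gt;&lt;td&gt;3.391878&lt;/td&gt;&lt;td&gt;4.020974&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;15&lt;/td&gt;&lt;td&gt;4.967439&lt;/td&gt;&lt;td&gt;4.317819&lt;/td&gt;&lt;td&gt;10.558838&lt;/td&gt;&lt;td&gt;3.433111&lt;/td&gt;&lt;td&gt;4.062904&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;average&lt;/td&gt;&lt;td&gt;4.964267867&lt;/td&gt;&lt;td&gt;4.323996933&lt;/td&gt;&lt;td&gt;10.5655676&lt;/td&gt;&lt;td&gt;3.4475326&lt;/td&gt;&lt;td&gt;4.056198733&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;median&lt;/td&gt;&lt;td&gt;4.960718&lt;/td&gt;&lt;td&gt;4.323427&lt;/td&gt;&lt;td&gt;10.558838&lt;/td&gt;&lt;td&gt;3.455712&lt;/td&gt;&lt;td&gt;4.045296&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;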
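
&lt;p&gt;As a cross-check, the average and median rows can be recomputed from the individual runs. The sketch below is illustrative only: it uses Python's standard &lt;code&gt;statistics&lt;/code&gt; module on the vol0 column copied from the measurements above, and the variable names are made up for this example.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
```python
import statistics

# vol0 timings in seconds, copied from the AArch64 measurements (runs 1 to 15)
vol0_times = [
    4.980222, 4.968114, 4.959307, 4.973315, 4.961954,
    4.976137, 4.959057, 4.960718, 4.95485, 4.960642,
    4.952321, 4.990002, 4.942643, 4.957297, 4.967439,
]

# The average is pulled around by outliers; the median is the middle run when sorted
vol0_average = statistics.fmean(vol0_times)
vol0_median = statistics.median(vol0_times)

print(f"average: {vol0_average:.6f}")
print(f"median:  {vol0_median:.6f}")
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The median is less sensitive to a single unusually slow run than the average, which is why it is useful to report both.&lt;/p&gt;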

</description>
    </item>
    <item>
      <title>SPO600 Lab 5 - Algorithm Selection Part 1: Introduction</title>
      <dc:creator>Smit Gabani</dc:creator>
      <pubDate>Thu, 03 Nov 2022 08:23:36 +0000</pubDate>
      <link>https://dev.to/smitgabani/spo600-algorithm-selection-lab-part-1-introduction-2oo</link>
      <guid>https://dev.to/smitgabani/spo600-algorithm-selection-lab-part-1-introduction-2oo</guid>
      <description>&lt;p&gt;Hi, I am Smit Gabani, and in this post I will introduce Lab 5.&lt;br&gt;
In this lab we compare the relative performance of several volume-scaling algorithms, with each comparison run on the same machine, on both AArch64 and x86_64 systems.&lt;/p&gt;

&lt;p&gt;Initial tasks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Get example programs using
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cp /public/spo600-volume-examples.tgz ~
tar -xvzf spo600-volume-examples.tgz
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Change to the directory where the Makefile is stored and use make to compile the C programs.&lt;/li&gt;
&lt;li&gt;SAMPLES is defined in the header file vol.h; the volume scale is also set there, to 50%.&lt;/li&gt;
&lt;li&gt;The memory used by these algorithms can be measured with the free -m command in a terminal.&lt;/li&gt;
&lt;li&gt;A simple measurement of time can be taken with the time command.&lt;/li&gt;
&lt;li&gt;Find a way to measure performance, using #include &lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What is inside the files:
&lt;/h2&gt;

&lt;p&gt;vol.h&lt;br&gt;
SAMPLES and VOLUME&lt;br&gt;
A large number of samples in vol.h seems reasonable, because it makes the differences in performance between the algorithms much easier to analyze.&lt;/p&gt;

&lt;p&gt;vol0.c&lt;br&gt;
In vol0.c, audio samples are multiplied by the volume scaling factor, casting between signed 16-bit integers and floating-point values for every sample. These repeated conversions make this approach expensive.&lt;/p&gt;

&lt;p&gt;vol1.c&lt;br&gt;
vol1.c uses a fixed-point calculation, which avoids the cost of repeatedly casting between integer and floating point.&lt;/p&gt;
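
&lt;p&gt;To make the difference between these first two approaches concrete, here is a small Python sketch. It is not the code from vol0.c or vol1.c: the function names and the choice of 8 fractional bits are assumptions made for illustration.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
```python
VOLUME = 50  # percent, the volume scale used in this lab

def scale_float(sample, volume=VOLUME):
    # Floating-point approach: convert, multiply, truncate toward zero
    return int((volume / 100.0) * sample)

def scale_fixed(sample, volume=VOLUME):
    # Fixed-point approach: turn the volume into an integer factor,
    # scaled by 256 (8 fractional bits, an assumed precision)
    factor = int((volume / 100.0) * 256)
    # Integer multiply, then floor-divide by 256, which gives the same
    # result as an arithmetic right shift by 8 bits
    return (sample * factor) // 256

for sample in (1000, -1000, 3, -3):
    print(sample, scale_float(sample), scale_fixed(sample))
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For a sample of -3 the two functions disagree (-1 versus -2), because truncating and flooring round negative numbers in different directions. Small differences like this are one way two scaling variants can end up with slightly different results.&lt;/p&gt;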

&lt;p&gt;vol2.c&lt;br&gt;
Unlike vol0.c and vol1.c, vol2.c pre-calculates the result for every possible 16-bit input value (65536 in total) and then looks up the answer for each sample.&lt;/p&gt;
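
&lt;p&gt;The lookup-table idea can be sketched as follows. This is an illustration rather than the code from vol2.c: the names and the offset-based indexing are assumptions.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
```python
VOLUME = 50  # percent

# Precompute the scaled value for every possible signed 16-bit sample.
# Index 0 holds the result for -32768, index 65535 the result for 32767.
table = [int((VOLUME / 100.0) * s) for s in range(-32768, 32768)]

def scale_lookup(sample):
    # Shift the sample into the 0..65535 index range and read the answer
    return table[sample + 32768]

samples = [0, 1000, -1000, 32767, -32768]
print([scale_lookup(s) for s in samples])
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The table costs memory and an up-front pass over every possible input, so it only pays off when a lookup is cheaper than the arithmetic it replaces.&lt;/p&gt;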

&lt;p&gt;vol3.c&lt;br&gt;
vol3.c returns an identical sample value; its purpose seems to be to serve as a baseline for comparing the other volume-scaling algorithms.&lt;/p&gt;

&lt;p&gt;vol4.c&lt;br&gt;
vol4.c uses SIMD (Single Instruction, Multiple Data) instructions accessed through inline assembly, so it can only be built on AArch64.&lt;/p&gt;

&lt;p&gt;vol5.c&lt;br&gt;
vol5.c, like vol4.c, also uses SIMD instructions, but through intrinsics built into the compiler. vol5.c is also specific to AArch64 because it relies on instructions unique to that architecture.&lt;/p&gt;

&lt;p&gt;vol_createsample.c&lt;br&gt;
vol_createsample.c contains the function vol_createsample(int16_t* sample, int32_t sample_count), which is used to create dummy samples for the algorithms to run on.&lt;/p&gt;

&lt;h2&gt;
  
  
  My prediction:
&lt;/h2&gt;

&lt;p&gt;Fastest: vol4, vol5&lt;br&gt;
Average: vol2&lt;br&gt;
Slowest: vol0, vol1&lt;/p&gt;

&lt;p&gt;Running the programs with the &lt;code&gt;time&lt;/code&gt; command (SAMPLES 16 VOLUME 50):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffvjs5z0k9ejyefju8ujy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffvjs5z0k9ejyefju8ujy.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Running the programs with the &lt;code&gt;time&lt;/code&gt; command (SAMPLES 500 VOLUME 50):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb8vz87cwm712golfe8kg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb8vz87cwm712golfe8kg.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Not all algorithms produce the same result.&lt;/strong&gt;&lt;br&gt;
This becomes more significant as the sample size increases.&lt;/p&gt;

&lt;p&gt;Memory utilization:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxbg5r37ftmyzsves42sk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxbg5r37ftmyzsves42sk.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Observation:&lt;br&gt;
vol4 and vol5 use much more memory than vol0, vol1, vol2 and vol3.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>SPO600 Lab 4 - Part 2: x86_64</title>
      <dc:creator>Smit Gabani</dc:creator>
      <pubDate>Thu, 03 Nov 2022 05:02:10 +0000</pubDate>
      <link>https://dev.to/smitgabani/spo600-lab-4-part-1-x8664-204n</link>
      <guid>https://dev.to/smitgabani/spo600-lab-4-part-1-x8664-204n</guid>
      <description>&lt;ul&gt;
&lt;li&gt;Connect to the portugal server (the x86_64 system).
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ssh username@portugal.cdot.systems   # replace username with your own username
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Copy and unpack the tarball, which can be a new experience for a Windows user like me.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cp /public/filename ~
tar -xvzf filename
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
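&lt;li&gt;Modify the loop to count up to 30 (which took hours for me).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;On x86_64 the division needs more setup than on AArch64, because &lt;code&gt;div&lt;/code&gt; takes its dividend from fixed registers (&lt;code&gt;%rdx:%rax&lt;/code&gt;). This makes the code longer and harder to get working.&lt;/p&gt;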

&lt;p&gt;My x86_64 code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;.text
.globl  _start

min = 0                         
max = 30                       

_start:

        mov     $min,%r15           
        mov     $10, %r14
    mov     $10, %r13
        mov     %r15,%r13       
        mov     %r15,%r14      

        mov     $10,%r8
        movq    $len,%rdx                      


loop:

        mov     $0, %r11
            mov     %r9, %r12

            div     %r8    

        add     $'0',%12

        movb    %r14b,msg+6     


inner: 

        add     $'0', %11
        movb    %r13b,msg+7 



             movq    $msg,%rsi                      
            movq    $1,%rdi                        
        movq    $1,%rax                        
        syscall


        inc     %r15                
        cmp     $max,%r15           
        jne     loop                



        movq    $0,%rdi                        
        movq    $60,%rax                       
        syscall

.section .data

msg:    .ascii      "Loop: ##\n"
        len = . - msg
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



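&lt;p&gt;The code did not work for me on the portugal server. Two problems are visible in the listing: &lt;code&gt;%12&lt;/code&gt; and &lt;code&gt;%11&lt;/code&gt; are not valid register names (the registers are &lt;code&gt;%r12&lt;/code&gt; and &lt;code&gt;%r11&lt;/code&gt;), and &lt;code&gt;div&lt;/code&gt; takes its dividend from &lt;code&gt;%rdx:%rax&lt;/code&gt;, which the loop never loads with the counter.&lt;/p&gt;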

</description>
    </item>
    <item>
      <title>SPO600 Lab 4 - Part 1: AArch64</title>
      <dc:creator>Smit Gabani</dc:creator>
      <pubDate>Thu, 03 Nov 2022 04:50:55 +0000</pubDate>
      <link>https://dev.to/smitgabani/spo600-lab-4-part-1-aarch64-c7o</link>
      <guid>https://dev.to/smitgabani/spo600-lab-4-part-1-aarch64-c7o</guid>
      <description>&lt;p&gt;In this lab we experiment with the x86_64 and AArch64 platforms, using the loop code demonstrated by Chris Tyler.&lt;/p&gt;

&lt;p&gt;The following tasks from lab 4 are completed and explained in this blog.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Connect to the israel server.&lt;/li&gt;
&lt;li&gt;Copy and unpack the tarball, which can be a new experience for a Windows user like me.&lt;/li&gt;
&lt;li&gt;Explore the files inside the spo600 folder and use the Makefiles and the make command to build the code.&lt;/li&gt;
&lt;li&gt;Use objdump -d to inspect what is inside the binary file.&lt;/li&gt;
&lt;li&gt;Modify the Hello World! program in AArch64 and x86_64 to make it loop as demonstrated in the lab class.&lt;/li&gt;
&lt;li&gt;Modify the loop to count up to 30 (which took hours for me).&lt;/li&gt;

&lt;h2&gt;
  
  
  Let's start with the lab.
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Connect to the israel server.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ssh username@israel.cdot.systems   # replace username with your own username
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Copy and unpack the tarball, which can be a new experience for a Windows user like me.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cp /public/filename ~
tar -xvzf filename
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Explore the files inside the spo600 folder and use the Makefiles and the make command to build the code.
Use make &lt;code&gt;clear&lt;/code&gt; first, then use &lt;code&gt;make filename&lt;/code&gt; to build a particular file, or plain make to build all files in the directory.&lt;/li&gt;
&lt;li&gt;Use objdump -d to inspect what is inside the binary file.
Move to the c directory and then use objdump -d to disassemble the binaries built from the C code.&lt;/li&gt;
&lt;li&gt;Modify the Hello World! program in AArch64 and x86_64 to make it loop as demonstrated in the lab class.&lt;/li&gt;
&lt;li&gt;Modify the loop to count up to 30 (which took hours for me).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The first version of the loop prints the loop counter on every iteration, from 0 to 10, appended to a statically coded string “Loop”. This task alone took us about half of the next class. To extend it, we had to write code that makes the program print two digits instead of one on every iteration, so the output looks like: Loop 00, Loop 01, Loop 02, Loop 03, … Loop 30. This also took about half of the class.&lt;/p&gt;
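
&lt;p&gt;The two-digit conversion boils down to a quotient and a remainder. The following Python sketch shows the same arithmetic the assembly has to perform; the function name is made up for this example and the message format follows the string used in the assembly code.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
```python
def loop_lines(limit=31):
    lines = []
    for counter in range(limit):
        # The quotient gives the tens digit, the remainder the ones digit
        tens, ones = divmod(counter, 10)
        # Adding the character code of "0" converts a digit 0-9 to ASCII
        text = "Loop: " + chr(ord("0") + tens) + chr(ord("0") + ones)
        lines.append(text)
    return lines

print("\n".join(loop_lines()))
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In AArch64 assembly the quotient comes from udiv and the remainder from msub, because udiv does not calculate a remainder.&lt;/p&gt;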

&lt;p&gt;My AArch64 final code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;.text
.globl _start
min = 0                          
max = 31 // the loop stops when the counter reaches max, so 30 is the last value printed

_start:

        mov     x15, 10 // divisor used to split the counter into digits
        mov     x19, min // value of the counter
        adr     x17, msg+6 // address of the tens digit in msg
        adr     x20, msg+7 // address of the ones digit in msg

loop:
        udiv x12, x19, x15 // x12 = x19 / x15 (quotient only; udiv does not calculate the remainder)
        add x14, x12, '0' // convert the tens digit to ASCII
        strb w14,[x17] // store it in msg

        msub x10,x15,x12,x19 // x10 = x19 - (x15 * x12), which is the remainder
        add x10, x10, '0' // convert the ones digit to ASCII
        strb w10,[x20] // store it in msg

        //Printing msg

        mov     x0, 1           
        adr     x1, msg         
        mov     x2, len         

        mov     x8, 64          
        svc     0               


        add x19, x19, 1 
        cmp x19, max
        b.ne loop


        //return

        mov     x0, 0           
        mov     x8, 93          
        svc     0               

.data
msg:    .ascii      "Loop: ##\n" // two placeholders: tens digit at msg+6, ones digit at msg+7
len=    . - msg

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here is ./loop running on AArch64 on the israel server.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuux9q96p7u78txm6y17t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuux9q96p7u78txm6y17t.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;While debugging, syntax errors are easy to correct but logical errors are much harder to correct in assembly. You have the debugger available to you and it will show you where your program catches a seg-fault and crashes but to understand why it’s causing the seg-fault may not be clear, making it extremely difficult to fix.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>SPO600 Lab 3 - Math and Strings - Guessing Numbers.</title>
      <dc:creator>Smit Gabani</dc:creator>
      <pubDate>Wed, 28 Sep 2022 04:54:22 +0000</pubDate>
      <link>https://dev.to/smitgabani/spo600-lab-3-math-and-strings-guessing-numbers-4gpk</link>
      <guid>https://dev.to/smitgabani/spo600-lab-3-math-and-strings-guessing-numbers-4gpk</guid>
      <description>&lt;p&gt;In my previous blog I started developing a number guessing game. The game has some milestones:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Accept number from the player.&lt;/li&gt;
&lt;li&gt;Compare two numbers.&lt;/li&gt;
&lt;li&gt;Generate a random number and store it.&lt;/li&gt;
&lt;li&gt;Compare the number accepted by the user with the random number.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In my previous blog we completed the first two parts. Now let's start with the next part. &lt;/p&gt;

&lt;h2&gt;
  
  
  Generate a random number and store it.
&lt;/h2&gt;

&lt;p&gt;The program generates and stores the random number (the answer) once, before the 'start:' label where it reads the player's input. It keeps asking for input until the player enters the correct answer, and the answer must not change during a game (round).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;; variables
define        answer    $16
;-------------------------------------------------------
lda $fe        ; generate random number
and #$99       ; keep only bits 7, 4, 3 and 0, so each digit is 0, 1, 8 or 9 (valid BCD in 0-99)
sta answer    ; store the answer

start:        jsr PRINT

dcb $0d,$0d,"E","n","t","e","r",32,"a",32,"n","u","m","b","e","r"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After generating the random number, we compare the player's input with it. &lt;/p&gt;
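
&lt;p&gt;GETNUM returns the guess as packed BCD (one decimal digit in each 4-bit half of the byte), which is why a plain byte comparison against the answer is enough. The Python sketch below illustrates that encoding; the helper names are hypothetical and are not part of the 6502 program.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
```python
def pack_bcd(tens, ones):
    # Tens digit in the high nibble, ones digit in the low nibble
    return tens * 16 + ones

def unpack_bcd(value):
    # divmod by 16 splits the byte back into its two nibbles
    return divmod(value, 16)

guess = pack_bcd(4, 2)     # the decimal number 42
answer = pack_bcd(8, 9)    # the decimal number 89

print(hex(guess), hex(answer))
print(unpack_bcd(answer))

# Packed BCD keeps decimal ordering, so comparing the raw bytes is enough
codes = [pack_bcd(*divmod(n, 10)) for n in range(100)]
print(codes == sorted(codes))
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because the encoded values rise in the same order as the decimal numbers they represent, "too low" and "too high" can be decided directly from the comparison flags.&lt;/p&gt;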

&lt;p&gt;Here is the final code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;; A number guessing game

; ROM routine entry points
define        SCINIT        $ff81 ; initialize/clear screen
define        CHRIN        $ffcf ; input character from keyboard
define        CHROUT        $ffd2 ; output character to screen
define        SCREEN        $ffed ; get screen size
define        PLOT        $fff0 ; get/set cursor coordinates

; zeropage variables
define        PRINT_PTR    $10
define        PRINT_PTR_H    $11
define        value_h        $15
define        answer    $16

; absolute variables
define        GETNUM_1    $0080
define        GETNUM_2    $0081

; constants

; --------------------------------------------------------

        jsr SCINIT
        jsr CHRIN

        jsr PRINT

dcb "A",32,"n","u","m","b","e","r",32
dcb "g","u","e","s","s","i","n","g",32,"g","a","m","e",00

        lda $fe        ; generate random number
        and #$99       ; keep only bits 7, 4, 3 and 0, so each digit is 0, 1, 8 or 9 (valid BCD in 0-99)
        sta answer     ; store the answer

start:        jsr PRINT

dcb $0d,$0d,"E","n","t","e","r",32,"a",32,"n","u","m","b","e","r"
dcb "(","0","-","9","9",")",":"
dcb 32,32,32,32,32,32,32,32,00

        lda #$00
        sta value_h

        jsr GETNUM

        sed        ; set decimal
        cmp answer ; compare user input to the answer
        cld        ; clear decimal flag (switches into binary math mode)

        bcc toolow    ; if input value is less than the answer, go to toolow
        beq identical    ; if input value is same as answer, go to identical

        ; otherwise (if input value is greater than the answer), go to toohigh
toohigh:    pha        ; push the accumulator
        jsr PRINT

dcb "T","o","o",32,"h","i","g","h",32
dcb 32,32,32,32,32,32,32
dcb 32,32,32,32,32,32,32
dcb 32,32,32,32,32,32,32
dcb 00

        jmp start    ; jump (not jsr) so no return address piles up on the stack

toolow:        pha        ; push the accumulator
        jsr PRINT

dcb "T","o","o",32,"l","o","w",32
dcb 32,32,32,32,32,32,32
dcb 32,32,32,32,32,32,32
dcb 32,32,32,32,32,32,32
dcb 00
        jmp start    ; jump (not jsr) so no return address piles up on the stack

identical:    pha        ; push the accumulator
        jsr PRINT

dcb "C","o","r","r","e","c","t",".",32
dcb "T","h","e",32,"a","n","s","w","e","r",32,"w","a","s",":"
dcb 32,32,32,32,32
dcb 00

        lda value_h
        beq low_digits
        lda #$31
        jsr CHROUT
        jmp draw_100s

low_digits:    lda answer
        and #$f0
        beq ones_digit

draw_100s:    lda answer
        lsr
        lsr
        lsr
        lsr
        ora #$30
        jsr CHROUT

ones_digit:    lda answer
        and #$0f
        ora #$30
        jsr CHROUT

        brk        ; if correct, break the program

; --------------------------------------------------------
; Print a message
; 
; Prints the message in memory immediately after the 
; JSR PRINT. The message must be null-terminated and
; 255 characters maximum in length.

PRINT:        pla    ; pull the accumulator
        clc    ; clear carry flag (C) - required before using ADC
        adc #$01    ; add with carry
        sta PRINT_PTR
        pla
        sta PRINT_PTR_H

        tya    ; transfer Y to A
        pha    ; push the accumulator

        ldy #$00
print_next:    lda (PRINT_PTR),y
        beq print_done

        jsr CHROUT
        iny
        jmp print_next

print_done:    tya
        clc
        adc PRINT_PTR
        sta PRINT_PTR

        lda PRINT_PTR_H
        adc #$00
        sta PRINT_PTR_H

        pla
        tay

        lda PRINT_PTR_H
        pha
        lda PRINT_PTR
        pha

        rts

; ---------------------------------------------------
; GETNUM - get a 2-digit decimal number
;
; Returns A containing 2-digit BCD value

GETNUM:        txa        ; transfer X to accumulator
        pha        ; push the accumulator
        tya        ; transfer Y to A
        pha

        ldx #$00    ; count of digits received
        stx GETNUM_1    ; store the X register
        stx GETNUM_2


getnum_cursor:    lda #$a0    ; black square
        jsr CHROUT
        lda #$83    ; left cursor
        jsr CHROUT

getnum_key:    jsr CHRIN
        cmp #$00
        beq getnum_key

        cmp #$08    ; BACKSPACE
        beq getnum_bs

        cmp #$0d    ; ENTER
        beq getnum_enter

        cmp #$30    ; "0"
        bmi getnum_key

        cmp #$3a    ; "9" + 1
        bmi getnum_digit

        jmp getnum_key

getnum_enter:    cpx #$00
        beq getnum_key

        lda #$20
        jsr CHROUT
        lda #$0d
        jsr CHROUT

        lda GETNUM_1

        cpx #$01
        beq getnum_done

        asl
        asl
        asl
        asl
        ora GETNUM_2

getnum_done:    sta GETNUM_1
        pla
        tay
        pla
        tax
        lda GETNUM_1

        rts

getnum_digit:    cpx #$02
        bpl getnum_key
        pha
        jsr CHROUT
        pla
        and #$0f
        sta GETNUM_1,x
        inx
        jmp getnum_cursor

getnum_bs:    cpx #$00
        beq getnum_key
        lda #$20
        jsr CHROUT
        lda #$83
        jsr CHROUT
        jsr CHROUT
        lda #$20
        jsr CHROUT
        lda #$83
        jsr CHROUT
        dex
        lda #$00
        sta GETNUM_1,x
        jmp getnum_cursor
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--jF9musKQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/p01cfmxe6k0isywrkujh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--jF9musKQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/p01cfmxe6k0isywrkujh.png" alt="Image description" width="699" height="658"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This program is based on Chris Tyler's example code from the course wiki page.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>SPO600 Lab 3 - Math and Strings - Guessing Numbers.</title>
      <dc:creator>Smit Gabani</dc:creator>
      <pubDate>Tue, 27 Sep 2022 11:01:08 +0000</pubDate>
      <link>https://dev.to/smitgabani/spo600-lab-3-math-and-strings-guessing-numbers-5po</link>
      <guid>https://dev.to/smitgabani/spo600-lab-3-math-and-strings-guessing-numbers-5po</guid>
      <description>&lt;p&gt;Assembly language is a low-level programming language that maps closely to a processor's machine instructions, so it communicates with the hardware more directly than high-level languages do. A high-level language requires a compiler or interpreter to translate it into machine code.&lt;/p&gt;

&lt;p&gt;I will be using &lt;a href="https://wiki.cdot.senecacollege.ca/wiki/6502_Emulator"&gt;6502 Emulator&lt;/a&gt; available at &lt;a href="http://6502.cdot.systems/"&gt;http://6502.cdot.systems/&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In lab 3 the task was to write either a game or a program that calculates or converts a value using 6502 Emulator.&lt;/p&gt;

&lt;p&gt;I have decided to code a number guessing game where the player has to guess a random number. The goal is to guess the number with the fewest clues.&lt;/p&gt;

&lt;p&gt;The major tasks are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Accept number from the player.&lt;/li&gt;
&lt;li&gt;Compare two numbers.&lt;/li&gt;
&lt;li&gt;Generate a random number and store it.&lt;/li&gt;
&lt;li&gt;Compare the number accepted by the user with the random number.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;; Comparing two numbers

; ROM routine entry points
define        SCINIT        $ff81 ; initialize/clear screen
define        CHRIN        $ffcf ; input character from keyboard
define        CHROUT        $ffd2 ; output character to screen
define        SCREEN        $ffed ; get screen size
define        PLOT        $fff0 ; get/set cursor coordinates

; zeropage variables
define        PRINT_PTR    $10
define        PRINT_PTR_H    $11
define        value        $14
define        value_h        $15

; absolute variables
define        GETNUM_1    $0080
define        GETNUM_2    $0081

; constants

; --------------------------------------------------------

        jsr SCINIT
        jsr CHRIN

        jsr PRINT

dcb "C","o","m","p","a","r","i","n","g",32,"t","w","o",32,
dcb "n","u","m","b","e","r","s",00

start:        jsr PRINT

dcb $0d,$0d,"E","n","t","e","r",32,"a",32,"n","u","m","b","e","r"
dcb "(","0","-","9","9",")",":"
dcb 32,32,32,32,32,32,32,32,00

        lda #$00
        sta value_h

        jsr GETNUM
        sta value

        jsr PRINT

dcb "E","n","t","e","r",32,"a","n","o","t","h","e","r"
dcb 32,"n","u","m","b","e","r",32,"(","0","-","9","9",")",":",32,00

        jsr GETNUM

        sed        ; set decimal
        cmp value
        cld        ; clear decimal flag (switches into binary math mode)

        bcc toohigh    ; if second value is less than the first value, go to toohigh
            beq identical    ; if second value is same as first one, go to identical
        inc value_h    ; increment memory
toolow:           pha        ; push the accumulator
            jsr PRINT

dcb "N","u","m","b","e","r",32,"o","n","e",32,
dcb "i","s",32,"s","m","a","l","l","e","r",32
dcb 32,32,32,32,32,32,32
dcb 32,32,32,32,32,32,32
dcb 32,32,32,32,32,32,32
dcb 00
            jmp start    ; jump (not jsr) so no return address piles up on the stack

identical:        pha        ; push the accumulator
            jsr PRINT

dcb "C","o","r","r","e","c","t",32
dcb 32,32,32,32,32,32,32
dcb 32,32,32,32,32,32,32
dcb 32,32,32,32,32,32,32
dcb 00
            jmp start    ; jump (not jsr) so no return address piles up on the stack

toohigh:        pha        ; push the accumulator
            jsr PRINT

dcb "N","u","m","b","e","r",32,"o","n","e",32,
dcb "i","s",32,"g","r","e","a","t","e","r",32
dcb 32,32,32,32,32,32,32
dcb 32,32,32,32,32,32,32
dcb 32,32,32,32,32,32,32
dcb 00
            jmp start    ; jump (not jsr) so no return address piles up on the stack

; --------------------------------------------------------
; Print a message
; 
; Prints the message in memory immediately after the 
; JSR PRINT. The message must be null-terminated and
; 255 characters maximum in length.

PRINT:        pla    ; pull the accumulator
        clc    ; clear carry flag (C) - required before using ADC
        adc #$01    ; add with carry
        ;sec ; set carry flag (C) - required before using SBC
        ;sbc #$01
        sta PRINT_PTR
        pla
        sta PRINT_PTR_H

        tya    ; transfer Y to A
        pha    ; push the accumulator

        ldy #$00
print_next:    lda (PRINT_PTR),y
        beq print_done

        jsr CHROUT
        iny
        jmp print_next

print_done:    tya
        clc
        adc PRINT_PTR
        sta PRINT_PTR

        lda PRINT_PTR_H
        adc #$00
        sta PRINT_PTR_H

        pla
        tay

        lda PRINT_PTR_H
        pha
        lda PRINT_PTR
        pha

        rts

; ---------------------------------------------------
; GETNUM - get a 2-digit decimal number
;
; Returns A containing 2-digit BCD value

GETNUM:        txa        ; transfer X to accumulator
        pha        ; push the accumulator
        tya        ; transfer Y to A
        pha

        ldx #$00    ; count of digits received
        stx GETNUM_1    ; store the X register
        stx GETNUM_2


getnum_cursor:    lda #$a0    ; black square
        jsr CHROUT
        lda #$83    ; left cursor
        jsr CHROUT

getnum_key:    jsr CHRIN
        cmp #$00
        beq getnum_key

        cmp #$08    ; BACKSPACE
        beq getnum_bs

        cmp #$0d    ; ENTER
        beq getnum_enter

        cmp #$30    ; "0"
        bmi getnum_key

        cmp #$3a    ; "9" + 1
        bmi getnum_digit

        jmp getnum_key

getnum_enter:    cpx #$00
        beq getnum_key

        lda #$20
        jsr CHROUT
        lda #$0d
        jsr CHROUT

        lda GETNUM_1

        cpx #$01
        beq getnum_done

        asl
        asl
        asl
        asl
        ora GETNUM_2

getnum_done:    sta GETNUM_1
        pla
        tay
        pla
        tax
        lda GETNUM_1

        rts

getnum_digit:    cpx #$02
        bpl getnum_key
        pha
        jsr CHROUT
        pla
        and #$0f
        sta GETNUM_1,x
        inx
        jmp getnum_cursor

getnum_bs:    cpx #$00
        beq getnum_key
        lda #$20
        jsr CHROUT
        lda #$83
        jsr CHROUT
        jsr CHROUT
        lda #$20
        jsr CHROUT
        lda #$83
        jsr CHROUT
        dex
        lda #$00
        sta GETNUM_1,x
        jmp getnum_cursor
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here is the result for the program.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--pIYlaTnw--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/nzbksxrpdqhske3i89nd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--pIYlaTnw--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/nzbksxrpdqhske3i89nd.png" alt="Image description" width="731" height="635"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this post, two of the four tasks have been completed.&lt;br&gt;
I will write one more post to complete the guessing game.&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>SPO600 Lab 2: Working with 6502 Emulator and bitmapped display.</title>
      <dc:creator>Smit Gabani</dc:creator>
      <pubDate>Sat, 17 Sep 2022 23:01:23 +0000</pubDate>
      <link>https://dev.to/smitgabani/working-with-6502-emulator-and-bitmapped-display-37fl</link>
      <guid>https://dev.to/smitgabani/working-with-6502-emulator-and-bitmapped-display-37fl</guid>
      <description>&lt;p&gt;In this post, I'm going to do some exercises using &lt;a href="http://6502.cdot.systems/"&gt;6502 Emulator&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I have already filled the bitmapped display with yellow. In this post, I'm going to modify that code to get different results.&lt;/p&gt;

&lt;p&gt;The source code we will work on:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    lda #$00    ; set a pointer in memory location $40 to point to $0200
    sta $40     ; ... low byte ($00) goes in address $40
    lda #$02    
    sta $41     ; ... high byte ($02) goes into address $41

    lda #$07    ; colour number

    ldy #$00    ; set index to 0

loop:   sta ($40),y ; set pixel colour at the address (pointer)+Y

    iny     ; increment index
    bne loop    ; continue until done the page (256 pixels)

    inc $41     ; increment the page
    ldx $41     ; get the current page number
    cpx #$06    ; compare with 6
    bne loop    ; continue until done all pages
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the result the code generates:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--qL5PvbsX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/78r738nbz318oslq2oir.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--qL5PvbsX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/78r738nbz318oslq2oir.png" alt="Image description" width="880" height="674"&gt;&lt;/a&gt;&lt;br&gt;
Assemble &amp;gt; Run &amp;gt; No errors &amp;gt; Yellow output.&lt;/p&gt;
&lt;h2&gt;
  
  
  Now let's modify the code to make the following changes.
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Change the code to fill the display with light blue instead of yellow. (Tip: you can find the colour codes on the 6502 Emulator page).&lt;/li&gt;
&lt;li&gt;Change the code to fill the display with a different colour on each page (each "page" will be one-quarter of the bitmapped display).&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;
  
  
  Making the changes.
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Changing the color from yellow to light blue.
&lt;/h3&gt;

&lt;p&gt;To change the color from yellow to light blue, we can change the line&lt;br&gt;
&lt;code&gt;lda #$07&lt;/code&gt; to &lt;code&gt;lda #$0e&lt;/code&gt;.&lt;br&gt;
The color numbers are:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$0: Black
$1: White
$2: Red
$3: Cyan
$4: Purple
$5: Green
$6: Blue
$7: Yellow
$8: Orange
$9: Brown
$a: Light red
$b: Dark grey
$c: Grey
$d: Light green
$e: Light blue
$f: Light grey
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The desired result is obtained. &lt;/p&gt;

&lt;p&gt;I ticked the Monitor checkbox to see how the light blue color value ($0e) is stored in memory.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--U55bvF8l--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/52ztdieo5wvte50puove.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--U55bvF8l--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/52ztdieo5wvte50puove.png" alt="Image description" width="880" height="822"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Change the code to fill the display with a different colour on each page.
&lt;/h3&gt;

&lt;p&gt;Now, I'm going to change the code to fill the display with a different color on each page (each page: one-quarter of the bitmapped display).&lt;/p&gt;
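
&lt;p&gt;To see where each page starts, here is a minimal sketch. It assumes the emulator's 32 x 32 pixel display is mapped to $0200-$05ff, so each 256-byte page covers 8 rows of 32 pixels. It only marks the first pixel of each page in white; it is an illustration, not part of the lab solution.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        lda #$01    ; colour number for white
        sta $0200   ; page $02 starts at row 0
        sta $0300   ; page $03 starts at row 8  ($0200 + 8 * 32)
        sta $0400   ; page $04 starts at row 16
        sta $0500   ; page $05 starts at row 24
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;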

&lt;p&gt;This line is responsible for changing the color on each page. It increments the color number stored at $10, so the colors are sequential rather than random.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;inc $10 ; increment the color number
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the final code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        lda #$00    ; set a pointer at $40 to point to $0200
        sta $40
        lda #$02
        sta $41

        lda #$07    ; colour number
        sta $10 ; store colour number to memory location $10

        ldy #$00    ; set index to 0

loop:    sta ($40),y    ; set pixel at the address (pointer)+Y

        iny ; increment index
        bne loop    ; continue until done the page

        inc $41 ; increment the page

        inc $10 ; increment the color number
        lda $10 ; load the new colour number into A

        ldx $41 ; get the current page number
        cpx #$06 ; compare with 6
        bne loop ; continue until done all pages

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The final result:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--468Ek7bP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6lairn28cd9s1kvb1yq0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--468Ek7bP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6lairn28cd9s1kvb1yq0.png" alt="Image description" width="880" height="766"&gt;&lt;/a&gt;&lt;/p&gt;
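
&lt;p&gt;One detail worth noting: &lt;code&gt;inc $10&lt;/code&gt; keeps counting past $0f if the starting colour number is high enough. As far as I know, the emulator only uses the low four bits of each pixel byte, so the display still shows a valid colour, but that is an assumption about the emulator rather than something shown above. A minimal sketch of how the two colour lines could be written to keep the stored number itself within $00-$0f:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        inc $10     ; increment the colour number
        lda $10     ; load it into A
        and #$0f    ; keep only the low four bits, so 16 wraps back to 0
        sta $10     ; store the wrapped colour number back
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;After these four lines, A still holds the colour for the next page, so the rest of the loop does not need to change.&lt;/p&gt;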

&lt;p&gt;You can also make the first color random by reading it from the pseudo-random number generator (PRNG) at $fe (lda $fe).&lt;br&gt;
Each following page then uses the next color number, so only the starting color is random:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        lda #$00    ; set a pointer at $40 to point to $0200
        sta $40
        lda #$02
        sta $41

        lda $fe ; random color
        sta $10 ; store colour number to memory location $10

        ldy #$00    ; set index to 0

loop:    sta ($40),y    ; set pixel at the address (pointer)+Y

        iny ; increment index
        bne loop    ; continue until done the page

        inc $41 ; increment the page

        inc $10 ; increment the color number
        lda $10 ; load the new colour number into A

        ldx $41 ; get the current page number
        cpx #$06 ; compare with 6
        bne loop ; continue until done all pages
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--zFAsM3db--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/adx9obszjpy5t7nxpb5y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--zFAsM3db--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/adx9obszjpy5t7nxpb5y.png" alt="Image description" width="880" height="828"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--qteqL5nC--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/dj8875acdq84cngyjo1l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--qteqL5nC--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/dj8875acdq84cngyjo1l.png" alt="Image description" width="880" height="828"&gt;&lt;/a&gt;&lt;/p&gt;
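
&lt;p&gt;If every page should get its own random colour instead of the next colour in sequence, one option is to read $fe again after each page. This is only a sketch under that assumption, not the lab's required solution, and two pages can still end up with the same colour by chance.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        lda #$00    ; set a pointer at $40 to point to $0200
        sta $40
        lda #$02
        sta $41

        lda $fe     ; random colour for the first page

        ldy #$00    ; set index to 0

loop:   sta ($40),y ; set pixel at the address (pointer)+Y

        iny         ; increment index
        bne loop    ; continue until done the page

        inc $41     ; increment the page

        lda $fe     ; fresh random colour for the next page

        ldx $41     ; get the current page number
        cpx #$06    ; compare with 6
        bne loop    ; continue until done all pages
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Since the colour is reloaded from $fe each time, this version does not need the copy in $10.&lt;/p&gt;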

&lt;p&gt;The more I work with the emulator, the more I understand it.&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
