Writing low-level functions in assembly might seem daunting, but it’s an excellent way to deepen your understanding of how things work under the hood. In this blog, we’ll recreate two popular C standard library functions, strlen and strcmp, in assembly language and learn how to call them from a C program.
This guide is beginner-friendly, so don’t worry if you’re new to assembly programming. Let’s dive in! 🚀
Table of Contents
- Introduction to Assembly and C Integration
- What Are strlen and strcmp?
- Setting Up Your Environment
- Writing strlen in Assembly
- Writing strcmp in Assembly
- Integrating Assembly with C
- Compiling and Running the Program
- Expected Output
- Conclusion
- Connect on Twitter
1. Introduction to Assembly and C Integration
Assembly language operates at a very low level, close to machine code. When combined with a high-level language like C, you get the best of both worlds:
- High-level readability.
- Low-level control and optimization.
In this guide, we’ll write two functions in assembly, my_strlen and my_strcmp, and call them from C to demonstrate this integration.
2. What Are strlen and strcmp?
- strlen: This function calculates the length of a null-terminated string (a string that ends with \0). The terminator itself is not counted.
- strcmp: This function compares two strings character by character. It returns:
  - 0 if the strings are identical.
  - A negative value if the first string sorts before the second (its byte is smaller at the first position where they differ).
  - A positive value if the first string sorts after the second.
We’ll replicate their behavior in assembly.
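To make the return-value rules above concrete, here is a minimal C sketch built on the library's own strcmp. The helper names sign_of and compare_sign are hypothetical and exist only for this illustration. The C standard only promises the sign of strcmp's result, so the sketch collapses it to -1, 0, or 1.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: collapse a strcmp-style result to -1, 0, or 1. */
int sign_of(int value) {
    return (value > 0) - (value < 0);
}

/* Hypothetical helper: sign of the C library's strcmp(a, b). */
int compare_sign(const char *a, const char *b) {
    assert(a != NULL && b != NULL); /* precondition: valid string pointers */
    return sign_of(strcmp(a, b));
}
```

For example, compare_sign("Hello", "Hello") gives 0, while compare_sign("Hello", "World") gives -1 because 'H' comes before 'W'.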
3. Setting Up Your Environment
Tools Required:
- NASM (Netwide Assembler): To assemble the assembly code.
- GCC (GNU Compiler Collection): To compile and link the C and assembly code.
Installing the Tools (Linux):
Run the following commands:
sudo apt update
sudo apt install nasm gcc
4. Writing strlen in Assembly
Logic of strlen:
- Start with a counter initialized to 0.
- Read each character of the string until the null terminator (\0) is found.
- Return the counter as the length.
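Before moving to assembly, the three steps above can be sketched in C. The name ref_strlen is hypothetical and this is only a reference for the logic, not how any particular C library implements strlen.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C reference for the steps above; not the library's code. */
size_t ref_strlen(const char *str) {
    assert(str != NULL);         /* precondition: a valid string pointer */
    size_t count = 0;            /* step 1: counter starts at 0 */
    while (str[count] != '\0') { /* step 2: read until the null terminator */
        count++;
    }
    return count;                /* step 3: the counter is the length */
}
```

Indexing with str[count] mirrors the [rdi + rax] addressing style: one base pointer that never moves and one counter that doubles as the result.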
Assembly Code for my_strlen:
section .text
global my_strlen
my_strlen:
    xor rax, rax            ; Set RAX (length) to 0
.next_char:
    cmp byte [rdi + rax], 0 ; Compare current byte with 0
    je .done                ; If 0, jump to done
    inc rax                 ; Increment RAX
    jmp .next_char          ; Repeat
.done:
    ret                     ; Return length in RAX
Registers Used:
- RDI: Points to the input string.
- RAX: Stores the string length (output).
5. Writing strcmp in Assembly
Logic of strcmp:
- Compare the characters of two strings.
- Stop when the characters differ or the null terminator (\0) is reached.
- Return:
  - 0 if strings are equal.
  - The difference between the two bytes (ASCII values) at the first position where they differ.
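The comparison logic above can be sketched in C as a reference to check the assembly against. The name ref_strcmp is hypothetical. The sketch treats each character as an unsigned char, which is how the C standard defines the comparison, and returns the byte difference described above. The library's strcmp is only required to return a value with the right sign.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C reference for the logic above; not the library's code. */
int ref_strcmp(const char *s1, const char *s2) {
    assert(s1 != NULL && s2 != NULL); /* precondition: valid pointers */
    const unsigned char *a = (const unsigned char *)s1;
    const unsigned char *b = (const unsigned char *)s2;
    /* Advance while the bytes match and the terminator is not reached. */
    while (*a == *b && *a != '\0') {
        a++;
        b++;
    }
    /* Equal strings stop on two terminators, so the difference is 0. */
    return (int)*a - (int)*b;
}
```

With this sketch, ref_strcmp("Hello", "World") is 'H' (72) minus 'W' (87), which is -15.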
Assembly Code for my_strcmp:
section .text
global my_strcmp
my_strcmp:
.next_char:
    movzx eax, byte [rdi]   ; Load byte from first string, zero-extended
    movzx ecx, byte [rsi]   ; Load byte from second string, zero-extended
    cmp eax, ecx            ; Compare the two bytes
    jne .diff               ; If not equal, jump to diff
    test eax, eax           ; Check if we’ve hit \0
    je .done                ; If \0, strings are equal (EAX is already 0)
    inc rdi                 ; Advance pointers
    inc rsi
    jmp .next_char          ; Repeat
.diff:
    sub eax, ecx            ; Return difference of the two bytes
.done:
    ret                     ; Return result in EAX
Registers Used:
- RDI and RSI: Pointers to the input strings.
- EAX (the low 32 bits of RAX): Stores the result (output).
- ECX: Scratch register holding the current byte of the second string.
6. Integrating Assembly with C
Let’s write a C program that calls these assembly functions.
C Code:
#include <stdio.h>
#include <stddef.h>

// Declare the assembly functions
extern size_t my_strlen(const char *str);
extern int my_strcmp(const char *s1, const char *s2);

int main(void) {
    // Test my_strlen
    const char *msg = "Hello, Assembly!";
    size_t len = my_strlen(msg);
    printf("Length of '%s': %zu\n", msg, len);

    // Test my_strcmp
    const char *str1 = "Hello";
    const char *str2 = "Hello";
    const char *str3 = "World";
    int result1 = my_strcmp(str1, str2);
    int result2 = my_strcmp(str1, str3);
    printf("Comparing '%s' and '%s': %d\n", str1, str2, result1);
    printf("Comparing '%s' and '%s': %d\n", str1, str3, result2);
    return 0;
}
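Printing a few results is a good start, but it can also help to compare an implementation against the C library across several inputs. The sketch below is one possible way to do that. All names in it (strlen_fn, strcmp_fn, result_sign, count_strlen_mismatches, count_strcmp_mismatches) are hypothetical, and the sample strings are arbitrary. It takes the function under test as a pointer, so the same checks could be applied to my_strlen and my_strcmp.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical function-pointer types for the functions under test. */
typedef size_t (*strlen_fn)(const char *);
typedef int (*strcmp_fn)(const char *, const char *);

/* Collapse a comparison result to -1, 0, or 1 so only the sign is checked. */
static int result_sign(int v) {
    return (v > 0) - (v < 0);
}

/* Count sample strings where candidate disagrees with the library strlen. */
int count_strlen_mismatches(strlen_fn candidate) {
    static const char *samples[] = {"", "a", "Hello", "Hello, Assembly!"};
    int mismatches = 0;
    assert(candidate != NULL);
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        if (candidate(samples[i]) != strlen(samples[i])) {
            mismatches++;
        }
    }
    return mismatches;
}

/* Count sample pairs where candidate's sign differs from the library strcmp. */
int count_strcmp_mismatches(strcmp_fn candidate) {
    static const char *samples[] = {"", "Hello", "Hellp", "Hell", "World"};
    const size_t n = sizeof samples / sizeof samples[0];
    int mismatches = 0;
    assert(candidate != NULL);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            int got = result_sign(candidate(samples[i], samples[j]));
            int want = result_sign(strcmp(samples[i], samples[j]));
            if (got != want) {
                mismatches++;
            }
        }
    }
    return mismatches;
}
```

In main.c, after the extern declarations, calling count_strlen_mismatches(my_strlen) and count_strcmp_mismatches(my_strcmp) should both return 0 if the assembly agrees with the library on these samples. Only the sign of the comparison is checked, because the library's exact return value is not specified.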
7. Compiling and Running the Program
- Save the assembly code to functions.asm.
- Save the C code to main.c.
- Compile and link them:
nasm -f elf64 functions.asm -o functions.o
gcc main.c functions.o -o main
./main
8. Expected Output
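Length of 'Hello, Assembly!': 16
Comparing 'Hello' and 'Hello': 0
Comparing 'Hello' and 'World': -15
The length is 16 because the null terminator is not counted. The -15 is 'H' (72) minus 'W' (87), the first pair of bytes that differ. The library's strcmp only guarantees the sign of its result, so it may print a different negative number for the same input.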
9. Conclusion
By writing strlen and strcmp in assembly, you gain a better understanding of:
- Low-level memory operations.
- How strings are processed at the machine level.
- Integrating assembly with C to leverage the strengths of both languages.
What other C standard library functions would you like to see recreated in assembly? Let me know in the comments below!
10. Connect on Twitter
Enjoyed this guide? Share your thoughts or ask questions on Twitter! Let’s connect and explore more low-level programming together. 🚀