VanPonasenkov

C - Unsafe!

Introduction

You've probably heard many times by now that C is not a memory-safe language, and that is absolutely correct. Buffer overflows and memory leaks are all over the place, and SEGFAULT has practically become a taboo word. In this small article we're going to look at some unsafe C code and then rewrite it in Rust.

The problem

Consider this code:

// lifetimetest.c

#include <stdio.h>
#include <string.h>

char*
longest(char* a, char* b)
{
    if(strlen(a) > strlen(b)){
        return a;
    }
    return b;
}

int
main(void)
{
    char a[] = "Hello";
    char b[] = "KosherFoods";
    char* res;

    res = longest(a, b);
    printf("%s", res);
}


Here we define the function longest, which takes two character pointers (essentially strings) and returns the longer string.

If you compile and run this program:
$ gcc lifetimetest.c -o lft.out && ./lft.out
you should get this output:
KosherFoods
which is what we expected.

Now let's rewrite this program slightly:

#include <stdio.h>
#include <string.h>

char*
longest(char* a, char* b)
{
    if(strlen(a) > strlen(b)){
        return a;
    }
    return b;
}

int
main(void)
{
    char a[] = "Hello";
    char* res;
    {
        char b[] = "KosherFoods";
        res = longest(a, b);
    }

    printf("%s", res);
}

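Here we moved b into an inner scope.
If you compile and run this program:
$ gcc lifetimetest.c -o lft.out && ./lft.out
you will most likely get this output:
KosherFoods
Nothing looks peculiar so far, although the program is already incorrect: the lifetime of b ended at the closing brace of the inner scope, so reading through res is undefined behaviour that merely happens to work here.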

Let's rewrite the program once again!

#include <stdio.h>
#include <string.h>

char*
longest(char* a, char* b)
{
    if(strlen(a) > strlen(b)){
        return a;
    }
    return b;
}

int
main(void)
{
    char a[] = "Hello";
    char* res;
    {
        char b[] = "KosherFoods";
        res = longest(a, b);
    }
    char ohnoo[] = "Plan 9 from User Space";
    printf("%s", res);
}

Here we declared a new string variable after the inner scope.
If you compile and run this program:
$ gcc lifetimetest.c -o lft.out && ./lft.out
you may get this output (the exact result depends on your compiler, optimization level and platform):
Plan 9 from User Space
But... how could this happen?

Explanation

In C, strings are essentially null-terminated character arrays, and what we hand to a function when we "pass a string" is just a pointer to the start of the array.

So longest doesn't return a copy of the string but a pointer to the start of one of the strings it was given. With this in mind, let's continue. When we declare the variable b in the inner scope, it is stored in stack memory that belongs to that scope. Once the scope closes, the lifetime of b ends and the compiler is free to reuse that memory for another variable. So res still points to, let's say, 0x000004 (where b once was), but 0x000004 is now the start of the ohnoo string. Reading through res at that point is undefined behaviour, so the C standard guarantees nothing about what gets printed.

Rewrite in Rust

Let's discuss a couple of things before moving on to the implementation. Rust has the concept of a "lifetime": the region of code for which a reference is valid. The compiler checks lifetimes to prevent dangling references. With that in mind, let's continue.

First let's implement the longest function:

fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() > b.len() {
        a
    } else {
        b
    }
}

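Here 'a is a lifetime parameter. It ties the three references together: both arguments must be valid for at least 'a, and the returned reference is guaranteed to be valid only for 'a, that is, only while both arguments are still alive. With this in mind, let's continue.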
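To make the signature more concrete, here is a minimal sketch showing that longest hands back a borrow of one of its arguments rather than a copy, which is why the result must not outlive them. The main function and its values are illustrative assumptions, not part of the original program.

```rust
// Same signature as above: the result borrows from `a` or from `b`.
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() > b.len() {
        a
    } else {
        b
    }
}

fn main() {
    // Illustrative values (assumption): the same strings as the C example.
    let a = String::from("Hello");
    let b = String::from("KosherFoods");

    let res = longest(a.as_str(), b.as_str());

    // `res` is not a copy: it points at the very same bytes that `b` owns.
    assert_eq!(res.as_ptr(), b.as_ptr());
    println!("{}", res);
}
```

Because res shares memory with b here, the borrow checker has to keep b alive for as long as res is used.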

Consider this code:

fn main() {
    let a = String::from("APCHIHBALONGERSTRING");
    let result: &str;
    {
        let b = String::from("Banana");
        result = longest(a.as_str(), b.as_str());
        println!("Longest: {}", result);
    }
}

If we type:
$ cargo run
we should get the following output:

Finished dev [unoptimized + debuginfo] target(s) in 0.01s
Running `target/debug/lifetime_test`
Longest: APCHIHBALONGERSTRING

As you can see, the Rust compiler didn't complain.
Now let's try using the result, which may borrow from the shorter-lived b, in the outer scope:

fn main() {
    let a = String::from("APCHIHBALONGERSTRING");
    let result: &str;
    {
        let b = String::from("Banana");
        result = longest(a.as_str(), b.as_str());
    }
    println!("Longest: {}", result);
}

If we type:
$ cargo run
we should get the following output:

Compiling lifetime_test v0.1.0 (/home/ernest/projects/lifetime_test)
error[E0597]: `b` does not live long enough
  --> src/main.rs:14:38
   |
14 |         result = longest(a.as_str(), b.as_str());
   |                                      ^^^^^^^^^^ borrowed value does not live long enough
15 |     }
   |     - `b` dropped here while still borrowed
16 |     println!("Longest: {}", result);
   |                             ------ borrow later used here

For more information about this error, try `rustc --explain E0597`.
error: could not compile `lifetime_test` due to previous error

This happens because result may borrow from b, but b is dropped when the inner scope ends, which would leave result dangling when it is used in the outer scope (the longer lifetime).
As a result, safe Rust rejects at compile time the undefined behaviour we got in C; you would have to use the unsafe keyword explicitly to reproduce it.
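One way to make the rejected program compile, sketched here under the assumption that copying the text is acceptable, is to return an owned String instead of a reference. The name longest_owned is hypothetical and not part of the original code.

```rust
// Hypothetical variant of `longest` (an assumption, not the original code):
// it returns an owned `String`, so the result borrows from neither argument.
fn longest_owned(a: &str, b: &str) -> String {
    if a.len() > b.len() {
        a.to_owned()
    } else {
        b.to_owned()
    }
}

fn main() {
    let a = String::from("Hello");
    let result: String;
    {
        let b = String::from("KosherFoods");
        result = longest_owned(&a, &b);
    } // `b` is dropped here, but `result` holds its own copy of the text.
    println!("Longest: {}", result);
}
```

The cost is one heap allocation and a copy per call. When that matters, the alternative is to keep the borrowing version and declare b in the same scope as result.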

Conclusion

In this article we showed an example of why C is dangerous and tricky to use, wrote the equivalent code in Rust, and showed how Rust deals with such problems.

Note

Whilst I like Rust and think it is great for low-level things like audio libraries, video game engines, web servers, etc., I still don't think it's a good idea to rewrite even parts of existing operating systems in Rust.


Top comments (4)

Paul J. Lucas

Your C code should be using const for all char*, e.g., char const*.

That aside, the faults of C have been known for quite a long time.

i still don't think it's a good idea to rewrite even parts of the operating systems in Rust.

Because?

VanPonasenkov

Because?

First, Rust's assembly output is a bit more complicated than that of C (or C++, even with things like classes and generics). That is not a problem when dealing with a game engine or a web server, but it is pretty important when writing an operating system (or when writing for an embedded system). With that said, I'm not saying Rust is slow; it's surprisingly fast. What I'm saying is that Rust's assembly is harder to debug than that of C or C++.

Second, and this part is a bit biased but hear me out, I don't think we NEED TO rewrite even parts of the current operating systems in Rust (for example drivers). The reason for that is very simple: C provides the bare minimum for writing that sort of stuff, and I don't think we need anything more.

In spite of this, I still like Rust. I'm looking forward to new operating systems being written in Rust, but I oppose the idea of rewriting the current ones.

Paul J. Lucas

Rust's assembly output is a bit more complicated than that of C....

That could simply be due to an immature compiler. gcc way back in the 3.0 days was pretty poor too.

Rewriting anything just for the sake of rewriting it is generally a bad idea. There's simply no reason to replace debugged, stable code.

VanPonasenkov

agreed