Deepu K Sasidharan for Okta

Posted on Mar 21, 2022 • Originally published at developer.okta.com

Why Safe Programming Matters and Why a Language Like Rust Matters

#rust #c #security #cpp

As programmers, how many of you have a good understanding of programming safety or secure programming? It's not the same as application security or cyber security. I have to confess, I didn't know much about either in the early years of my career, especially since I didn't come from a computer science background. But looking back, I think programming safety is something every programmer should be aware of and should be taught at a junior level.

What is safe programming, or to be more precise, what does being safe mean for a programming language? Or rather, what does unsafe mean? Let's set the context first.

If you would rather watch than read, I gave a talk on the same topic at FOSDEM'22; the video is on the OktaDev YouTube channel.

Programming safety

Programming safety = Memory safety + Type safety + Thread safety

When we talk about "safety" in programming, we mean some combination of three distinct things: memory safety, type safety, and thread safety. There are four if you count null safety as distinct from memory safety, but we'll group those two together today.

Memory safety

In a memory-safe language, when you access a variable or an item in an array, you can be sure that you are indeed accessing what you meant to or are allowed to access. In other words, you will not be reading or writing into the memory of another variable or pointer by mistake, regardless of what you do in your program.

So, why is this a big deal? Don't all major programming languages ensure this?

Yes, to varying extents. But some languages are unsafe by default, for example, C and C++. In C or C++, you can access the memory of another variable by mistake, or you can free a pointer twice, which is called a double-free error. Sometimes a program continues to use a pointer after it has been freed, which is called a use-after-free (UAF) error or a dangling pointer error. Such behaviors are categorized as undefined: they are unpredictable and can cause security vulnerabilities rather than just crashing the program. In these scenarios, a crashing program is a good thing, as it won't cause a security vulnerability.

"I call it my billion-dollar mistake. It was the invention of the null reference in 1965."

– Tony Hoare

Then there is also null safety, which is related to memory safety. I come from a Java/JavaScript background, and we are used to the concept of null. Null is infamous for being the worst invention in programming. Garbage-collected languages need a concept of nothing so that a reference can be cleared and its memory reclaimed when it is no longer used. But the concept also leads to issues and pain, like null pointer exceptions. Technically this relates to memory safety, but most memory-safe languages still let you use null as a value, which leads to null pointer errors.

Type safety

In a type-safe language, when you access a variable, you access it as the correct type of data according to how it is stored. This gives us the confidence to work on data without manually checking for the data type during runtime. Memory safety is required for a language to be type-safe.

Thread safety

In a thread-safe language, you can access or modify the same memory from multiple threads simultaneously without worrying about data races. This is generally achieved using message passing techniques, mutual exclusion locks (mutexes), and thread synchronization. Thread safety is required for optimal memory and type safety, because a data race can corrupt memory, so memory-safe and type-safe languages generally tend to provide thread safety mechanisms as well.

Why does it matter?

Ok! Why does this matter, and why should we care? Let's take a look at some stats to get an idea first.

Memory safety issues

Memory safety issues are the cause of most security CVEs (Common Vulnerabilities and Exposures) we encounter. Undefined behavior can be abused by an attacker to take control of the program or to leak privileged information. If you try to access an out-of-bounds array element in a memory-safe language, the program will just crash with a panic or raise an error, which is predictable behavior.
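To make the "predictable behavior" point concrete, here is a minimal sketch in Rust. The helper names `checked_read` and `indexing_panics` are made up for this example, and `catch_unwind` is used only so the panic can be observed without ending the process (this assumes the default unwinding panic strategy).

```rust
use std::panic;

/// Checked lookup: returns `None` instead of reading past the end.
fn checked_read(data: &[i32], index: usize) -> Option<i32> {
    data.get(index).copied()
}

/// Plain indexing is bounds-checked at runtime and panics when out of range.
/// Returns `true` if the access panicked.
fn indexing_panics(data: Vec<i32>, index: usize) -> bool {
    panic::catch_unwind(move || data[index]).is_err()
}

fn main() {
    let data = vec![10, 20, 30];
    println!("{:?}", checked_read(&data, 1)); // Some(20)
    println!("{:?}", checked_read(&data, 7)); // None
    // The out-of-bounds index never reads neighboring memory; it panics.
    println!("{}", indexing_panics(data, 7)); // true
}
```

Either way, the out-of-bounds access is caught: the program gets a `None` it has to handle, or it stops with a panic, rather than silently reading whatever happens to sit next to the array.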

This is why memory-related bugs in C/C++ systems often result in CVEs and emergency patches. There are other memory-unsafe behaviors in C/C++, like accessing pointers to stack frames that have been popped, accessing memory that has been deallocated, iterator invalidation, and so on. Memory-safe languages, even ones that are not the safest, still protect against such security issues.

If we take a look at stats, we can see that:

- About 70% of all CVEs at Microsoft are memory safety issues.
- Two-thirds of Linux kernel vulnerabilities come from memory safety issues.
- An Apple study found that 60-70% of vulnerabilities in iOS and macOS are memory safety vulnerabilities.
- Google estimated that 90% of Android vulnerabilities are memory safety issues.
- 70% of all Chrome security bugs are memory safety issues.
- An analysis of 0-days that were discovered being exploited in the wild found that more than 80% of the exploited vulnerabilities were memory safety issues.
- Some of the most popular security issues of all time are memory safety issues:
  - Slammer worm, WannaCry, Trident exploit, HeartBleed, Stagefright, Ghost

That's a huge chunk of CVEs, and of course, it's no surprise that most of it is from C/C++ systems 🤷

Imagine a world without memory safety issues. Imagine the amount of developer time saved, amount of money saved, amount of resources saved. Sometimes I wonder why we still use C/C++. Why do we trust humans, against all available evidence, to handle memory manually? And this is without considering other non-CVE memory issues like memory leaks, memory efficiency, and so on.

Thread safety issues

Though not as notorious as memory safety, thread safety is also a cause of major headaches for developers and can result in security issues.

Thread safety issues can cause two types of vulnerabilities:

- Information loss caused by one thread overwriting information from another
  - Pointer corruption that allows privilege escalation or remote execution
- Integrity loss due to information from multiple threads being interlaced
  - The best-known attack of this type is called a TOCTOU (time of check to time of use) attack, which is a race condition between checking a condition (like a security credential) and using the results.

Both information loss and integrity loss can be exploited and lead to security issues. While thread safety-related exploits are harder to pull off and less common than memory safety ones, they are still possible.
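One common way to close the gap between check and use in shared in-memory state is to do both while holding the same lock. The sketch below shows that idea in Rust; the `Account` type and the `run_withdrawals` helper are hypothetical and exist only for illustration. Note that this covers shared memory only; a TOCTOU race on something external, such as a file, needs other measures.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical account type, used only for illustration.
struct Account {
    balance: Mutex<u64>,
}

impl Account {
    /// The check and the update happen while the same lock guard is held,
    /// so no other thread can change the balance in between.
    fn withdraw(&self, amount: u64) -> bool {
        let mut balance = self.balance.lock().unwrap();
        if *balance >= amount {
            *balance -= amount;
            true
        } else {
            false
        }
    }
}

/// Spawns `threads` workers that each try to withdraw `amount` once.
/// Returns (number of successful withdrawals, final balance).
fn run_withdrawals(initial: u64, amount: u64, threads: usize) -> (usize, u64) {
    let account = Arc::new(Account { balance: Mutex::new(initial) });
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let account = Arc::clone(&account);
            thread::spawn(move || account.withdraw(amount))
        })
        .collect();
    let successes = handles
        .into_iter()
        .map(|h| h.join().unwrap())
        .filter(|ok| *ok)
        .count();
    let remaining = *account.balance.lock().unwrap();
    (successes, remaining)
}

fn main() {
    // Ten threads race to withdraw 30 from a balance of 100.
    let (successes, remaining) = run_withdrawals(100, 30, 10);
    println!("successes = {}, remaining = {}", successes, remaining);
}
```

Because the balance check cannot be separated from the subtraction, exactly three withdrawals succeed no matter how the threads are scheduled, and the balance never goes negative.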

Type safety issues

While not as critical as memory and thread safety, lack of type safety can also lead to security issues, and type safety is important for ensuring memory safety.

Low-level exploits are possible in languages that are not type-safe, as an attacker can manipulate the data structure and change the data type to gain access to privileged information. Although this type of exploit is pretty rare, it's not unheard of.

Why Rust?

Now that we understand how important programming safety is, let's see why Rust is one of the safest languages and how it avoids most of the security issues we normally encounter with languages like C/C++.

For those not familiar, Rust is a high-level multi-paradigm language. It's ideal for functional and imperative programming. It has very modern and, in my opinion, the best tooling for a programming language. Though it was originally designed as a systems programming language, its advantages and flexibility have made it suitable for all sorts of use cases as a general-purpose language.

"Rust throws around some buzzwords in its docs, but they are not just marketing buzz; they actually mean it with full sincerity, and they matter a lot."

Rust's safety guarantee

The safety guarantee is one of the most important aspects of Rust; Rust is memory-safe, null-safe, type-safe, and thread-safe by design.

If the compiler detects unsafe code, it will refuse to compile that code by default. You have to go out of your way to break those guarantees using the unsafe keyword. So even when you have to write unsafe code, you are making it explicit, and issues can be traced to specific code blocks.
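As a small sketch of what that explicit opt-in looks like, the function below dereferences a raw pointer. The function name `read_via_raw_pointer` is made up for this example.

```rust
/// Reads a value through a raw pointer. Creating the pointer is safe;
/// dereferencing it is not, so the compiler requires an `unsafe` block.
fn read_via_raw_pointer(value: &i32) -> i32 {
    let ptr: *const i32 = value;
    // Without the `unsafe { ... }` wrapper, this dereference does not compile.
    // SAFETY: `ptr` was created from a live reference, so it is valid and aligned.
    unsafe { *ptr }
}

fn main() {
    let n = 42;
    println!("{}", read_via_raw_pointer(&n)); // 42
}
```

The `unsafe` block does not switch off the borrow checker; it only permits a few extra operations such as this dereference, which is why a code review can focus on those marked blocks.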

Memory safety in Rust

Rust ensures memory safety at compile time using its innovative ownership mechanism and the borrow checker built into the compiler. The compiler does not allow memory unsafe code unless it's explicitly marked as unsafe in an unsafe block or function. This static compile-time analysis eliminates many types of memory bugs, and with some additional runtime checks, Rust guarantees memory safety.

There is no concept of null at the language level. Instead, Rust provides the Option enum, which can be used to mark the presence or absence of a value. This makes the resulting code null safe and much easier to deal with, and you will never encounter null pointer exceptions in Rust.
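Here is a minimal sketch of how `Option` replaces null. The `find_user` lookup and its hard-coded names are hypothetical.

```rust
/// Hypothetical lookup: returns the user's name if the id is known.
fn find_user(id: u32) -> Option<&'static str> {
    match id {
        1 => Some("alice"),
        2 => Some("bob"),
        _ => None,
    }
}

/// The caller has to handle the `None` case before it can use the value.
fn greeting(id: u32) -> String {
    match find_user(id) {
        Some(name) => format!("Hello, {}!", name),
        None => String::from("Hello, stranger!"),
    }
}

fn main() {
    println!("{}", greeting(1)); // Hello, alice!
    println!("{}", greeting(9)); // Hello, stranger!
}
```

Since `Option<&str>` is a different type from `&str`, forgetting the `None` branch is a compile error rather than a runtime surprise.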

The ownership and borrowing mechanisms make Rust one of the most memory-efficient languages while avoiding pitfalls with manual memory management and garbage collection. It has memory efficiency and speeds comparable to C/C++, and memory safety that's better than garbage-collected languages like Java and Go.
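The following sketch shows ownership and borrowing in a few lines. The function names are made up for this example, and the commented-out line marks the kind of mistake the compiler rejects.

```rust
/// Takes ownership of the vector; its memory is freed when this function returns.
fn consume(values: Vec<i32>) -> usize {
    values.len()
}

/// Borrows the data immutably; the caller keeps ownership.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

/// Borrows the vector mutably; only one such borrow can exist at a time.
fn push_double(values: &mut Vec<i32>, x: i32) {
    values.push(x * 2);
}

fn main() {
    let mut values = vec![1, 2, 3];
    push_double(&mut values, 4); // the mutable borrow ends with this call
    let total = sum(&values); // shared, read-only borrow
    let count = consume(values); // ownership moves into `consume`
    // println!("{:?}", values); // would not compile: `values` was moved
    println!("total = {}, count = {}", total, count); // total = 14, count = 4
}
```

No garbage collector runs here and nothing is freed by hand: the vector is released exactly once, when its final owner goes out of scope.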

I've written detailed articles about memory management in different languages in my personal blog, so check them out if you are interested in learning more about memory management in Java, Rust, JavaScript, and Go.

Type safety in Rust

Rust is statically typed, and it guarantees type safety through strict compile-time type checks and by guaranteeing memory safety. This is not special, as most modern languages are statically typed. Rust also allows some level of dynamic typing with the dyn keyword and the Any type when required. Type safety holds even in those cases: trait objects are type-checked at compile time, and downcasts from Any are checked at runtime.
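As a short sketch of the dynamic typing escape hatch, the function below accepts any value as `&dyn Any` and recovers the concrete type with a checked downcast. The function name `describe` is made up for this example.

```rust
use std::any::Any;

/// Describes a dynamically typed value. The concrete type is recovered
/// with a checked downcast, which yields `None` on a mismatch.
fn describe(value: &dyn Any) -> String {
    if let Some(n) = value.downcast_ref::<i32>() {
        format!("i32: {}", n)
    } else if let Some(s) = value.downcast_ref::<String>() {
        format!("String: {}", s)
    } else {
        String::from("unknown type")
    }
}

fn main() {
    println!("{}", describe(&7i32)); // i32: 7
    println!("{}", describe(&String::from("hi"))); // String: hi
    println!("{}", describe(&1.5f64)); // unknown type
}
```

There is no way in safe Rust to reinterpret the `f64` as an `i32` here; a wrong guess about the type simply produces `None`.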

Thread safety in Rust

Rust guarantees thread safety using the same concepts it uses for memory safety, and it provides standard library features like channels, mutexes, and Arc (atomically reference-counted) pointers. In safe Rust, you can have either one mutable reference to a value or any number of read-only references to it at any given time. The ownership mechanism makes it impossible to cause an accidental data race on shared state. This lets us focus on the code and leave it to the compiler to worry about data shared between threads.
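Here is a minimal sketch of both approaches using only the standard library: shared state behind `Arc<Mutex<_>>`, and message passing over a channel. The function names `shared_counter` and `channel_sum` are made up for this example.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Shared state: each thread increments a counter behind `Arc<Mutex<_>>`.
fn shared_counter(threads: usize, increments: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments {
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

/// Message passing: workers send values over a channel instead of sharing memory.
fn channel_sum(workers: u32) -> u32 {
    let (tx, rx) = mpsc::channel();
    for id in 1..=workers {
        let tx = tx.clone();
        thread::spawn(move || tx.send(id * 10).unwrap());
    }
    // Drop the original sender so `rx.iter()` ends once all workers are done.
    drop(tx);
    rx.iter().sum()
}

fn main() {
    println!("{}", shared_counter(4, 1000)); // 4000
    println!("{}", channel_sum(3)); // 60
}
```

If you try to share a plain `usize` and mutate it from several threads without the `Mutex`, the program does not compile, which is the guarantee described above.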

Other Rust features

I wrote about my impressions of Rust in a detailed post on my blog where I explain Rust's excellent features that make it unique. Here is a short summary of those features:

- Zero-cost abstractions: Rust offers true zero-cost abstractions, which means that you can write code in your preferred style, with layers of abstraction, and generally pay no runtime penalty for it. Few languages offer this, and it is one reason Rust is so fast. The Rust compiler aims to generate the same efficient machine code regardless of the style of code you write, so functional-style code typically performs as well as its imperative counterpart.
- Immutable by default: Values in Rust are immutable, or read-only, by default. Mutability has to be declared explicitly. This, along with the ability to pass by value or reference, makes it super easy to write functional code without side effects.
- Pattern matching: Rust has excellent support for advanced pattern matching. Pattern matching is used extensively for error handling and control flow in Rust.
- Advanced generics, traits, and types: Rust has advanced generics and traits with type aliasing and type inference support. Though generics can easily become complex when combined with lifetimes, they are one of the most powerful features of Rust.
- Macros: There is also support for metaprogramming using macros. Rust supports both declarative macros and procedural macros. Macros can be used like annotations, attributes, and functions.
- Great tooling and one of the best compilers: Rust has one of the best compilers and the best tooling I have seen and experienced (compared to the JS world, JVM languages, Go, Python, Ruby, C#, PHP, and C/C++). It also has excellent documentation, which is shipped with the tooling for offline use. How awesome is that!
- Excellent community and ecosystem: Rust has one of the most vibrant and friendly communities. The ecosystem is quite young but is one of the fastest-growing.
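A few of the language features above can be sketched in one short program: explicit mutability, imperative and iterator styles side by side, pattern matching on an enum, and a declarative macro. The `Shape` type, the function names, and the `max_of!` macro are all made up for this example.

```rust
/// Imperative style: explicit loop and a mutable accumulator.
fn sum_even_squares_loop(values: &[u32]) -> u32 {
    let mut total = 0; // mutability must be declared with `mut`
    for &v in values {
        if v % 2 == 0 {
            total += v * v;
        }
    }
    total
}

/// Functional style: iterator adapters computing the same result.
fn sum_even_squares_iter(values: &[u32]) -> u32 {
    values.iter().filter(|&&v| v % 2 == 0).map(|&v| v * v).sum()
}

/// Hypothetical shape type for the pattern matching example.
enum Shape {
    Circle { radius: f64 },
    Rect { width: f64, height: f64 },
}

/// `match` must cover every variant, or the code does not compile.
fn area(shape: &Shape) -> f64 {
    match shape {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rect { width, height } => width * height,
    }
}

/// Declarative macro: expands to the maximum of one or more expressions.
macro_rules! max_of {
    ($x:expr) => { $x };
    ($x:expr, $($rest:expr),+) => {{
        let a = $x;
        let b = max_of!($($rest),+);
        if a > b { a } else { b }
    }};
}

fn main() {
    let values = [1, 2, 3, 4];
    println!("{}", sum_even_squares_loop(&values)); // 20
    println!("{}", sum_even_squares_iter(&values)); // 20
    println!("{}", area(&Shape::Rect { width: 2.0, height: 3.0 })); // 6
    println!("{:.2}", area(&Shape::Circle { radius: 1.0 })); // 3.14
    println!("{}", max_of!(3, 9, 4)); // 9
}
```

Whether the two `sum_even_squares` versions compile to identical machine code depends on the compiler version and optimization level, so treat the performance claim as something to measure for your own code.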

Usually, a programming language would offer a choice between safety, speed, and high-level abstractions. At the very best, you can pick two of those. For example, with Java/C#/Go, you get safety and high-level abstractions at the cost of runtime overhead, whereas C++ gives you speed and abstractions at the cost of safety. But Rust offers all three and a good developer experience as a bonus. I don't think many other mainstream languages can claim that.

"Rust, not Firefox, is Mozilla's greatest industry contribution."

– TechRepublic

This doesn't mean there are no downsides, and Rust is definitely not a silver bullet. There are issues like the steep learning curve and complexity of the language. But it's the closest thing to a silver bullet, in my opinion. That doesn't mean you should just start using Rust for everything. If a use case requires speed, concurrency, building system tools, or building CLIs, then Rust is an ideal choice. Personally, I would recommend Rust over C/C++ for any use case unless you are building a tool for a legacy platform that Rust does not support.

Learn more about Rust and security

If you want to learn more about Rust and security in general, check out these additional resources.

If you liked this tutorial, chances are you'll enjoy the others we publish. Please follow @oktadev on Twitter and subscribe to our YouTube channel to get notified when we publish new developer tutorials.

Top comments (6)

Alex Lohr • Mar 21 '22

When you start rust, it feels like you're battling the borrow checker and the type system and your only allies are all those really helpful APIs and macros that make it at least partially bearable, until you finally get the concept of borrowing and now the borrow checker becomes your pair programmer buddy who points out those parts where your code could become unsafe.

Deepu K Sasidharan • Mar 22 '22

So true

Eduard • Apr 17 '22

Exactly, that's the point where you realize all the code you had written in C or C++ was actually unsound or not easily proven to be sound.

Selvakumar Jawahar • Mar 26 '22 • Edited

This article is a bit misleading. Check the list of CVEs listed for Rust: cve.mitre.org/cgi-bin/cvekey.cgi?k.... Many of them are for the same vulnerabilities that are mentioned in the article. By no means am I saying Rust is not a good language. All I want to point out is that memory safety issues, and resource safety issues in general, cannot be fully avoided just by moving to Rust. This is a much deeper topic. Claiming that moving to Rust will solve these problems is incorrect.
One primary reason is that hardware itself is fundamentally unsafe.

Eduard • Apr 17 '22

"hardware itself is fundamentally unsafe."
unless you are dealing with a broken cpu or something like that, there is nothing unsafe about hardware

if you write rust, you are safe behind all of its checks. yes, unsafe rust is a thing, but it allows us to make small abstractions over unsafe code, that can be proven to be safe to use; once this is done, any user of the code can be sure their rust code is safe as well

Deepu K Sasidharan • Apr 3 '22 • Edited

Thanks for the comment. Rust provides a way to write memory-unsafe code (which I have mentioned in the article), and with that anyone can end up with Rust code that causes CVEs, but that is not the default; you have to explicitly write unsafe blocks for that. The chance of developers writing unsafe code in Rust is way lower compared to C/C++, where the default is unsafe. And to be fair, if you take a closer look, many of those CVEs are from crates that rely on underlying C code, and some others are from non-memory-safety issues. And as I mentioned in the conclusion, Rust is not a silver bullet, and it would be hard to avoid writing unsafe code, at least when consuming underlying OS/hardware stuff. But Rust does drastically reduce the possibility of memory safety issues by default and makes it easier to reason about unsafe code when you have to write it.