Luca Barbato

Posted on Feb 24, 2020

Oxidizing code

#rust

Last week I presented a neat tool called c2rust to my local Rust meetup.
Next month, if everything returns to normal after the current local emergency, we'll have a hacknight about it.

This is yet another intentionally terse blog post; I will probably write more about it in other posts.

Oxidizing C

Converting code to Rust is commonly called oxidizing, with C being the main target.

Since nearly the beginning of the current Rust, there have been multiple efforts to automatically convert a C codebase to Rust, as a way to jumpstart writing something that is hopefully as fast as the original but less prone to memory faults and all the other mistakes Rust effectively prevents at compile time.

corrode was the first quite effective tool for that. It is written in Haskell and had a number of shortcomings. c2rust reimplemented the great ideas from it in Rust and progressed further. It is still not perfect, but it is much easier to use. I won't detail how to use it here, but I'll do so in another post.

Why?

I'm usually against rewriting code for the sake of rewriting it: if something works well enough, there is little reason to spend time redoing it in another language.

Usually wrapping it is more cost-effective: bindgen makes using C-API libraries a breeze.

But then I started to feel the burden of maintaining them. The integration problems do pile up:

You have to document how to build the original library (and deal with its own build system warts and issues).
If you want to make the crate build its own private copy, you are basically feeling all the pain of packaging (I wrote the autotools crate while facing this).
Supporting cross-compilation is far from easy, while cross-compiling Rust to even strange platforms such as wasm32-wasi is fairly straightforward.

Depending on the size of the library, oxidizing it may turn out to be the more effort-effective choice in the long run.

The Plan

Ideally I want the target library to end up written in Rust, with a nice idiomatic API for the Rust side, while also providing a C-API variant that is indistinguishable from the original C library.

The Process

Here is a quick list of steps and the tools that make them much easier to complete.

Make a thin wrapper of the C library

You may use bindgen to produce a -sys crate in minutes.

Since we are going to use this crate as scaffolding, we can keep it simple.
Here is a full example build.rs:

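    use std::env;
    use std::fs::File;
    use std::io::Write;
    use std::path::PathBuf;

    // Placeholder: the pkg-config name of the library, as declared in Cargo.toml.
    const MY_LIBRARY: &str = "mylib";

    fn main() {
        // Locate the system library through pkg-config.
        let libs = metadeps::probe().unwrap();
        let headers = libs.get(MY_LIBRARY).unwrap().include_paths.clone();

        let out_path = PathBuf::from(env::var("OUT_DIR").unwrap());

        let mut builder = bindgen::builder().header("data/mylib.h");

        // Forward the include paths to clang.
        for header in headers.iter() {
            builder = builder.clang_arg("-I").clang_arg(header.to_str().unwrap());
        }

        // Turn the doc comments in the generated code into plain comments.
        let s = builder
            .generate()
            .unwrap()
            .to_string()
            .replace("/**", "/*")
            .replace("/*!", "/*");

        let mut file = File::create(out_path.join("mylib.rs")).unwrap();

        // write_all fails loudly instead of silently writing a partial file.
        file.write_all(s.as_bytes()).unwrap();
    }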

See the bindgen documentation for a full walk-through.

Machine-convert the C code to quasi-Rust

c2rust can automatically convert most of the C language into a large amount of fairly ugly code that rustc can grok.

Its manual guides you through the proper way of incrementally changing the code using its refactor tool.

Write comparative tests

The built-in test harness may be enough, but quickcheck or even cargo-fuzz might help if you want to be extra sure you aren't missing corner cases while reworking the code.

#[test]
fn does_it_work() {
    let a = sys::my_call();
    let b = nat::my_call();

    assert_eq!(a, b);
}

NOTE: When dealing with floating point values you will have to check that the difference between the values is smaller than a small epsilon.
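A minimal sketch of such a check, assuming both implementations return an f64. The approx_eq helper and the two my_call stand-ins are hypothetical, and the tolerance has to be tuned to the algorithm under test.

```rust
/// Returns true when `a` and `b` differ by less than `epsilon`.
/// Hypothetical helper, not part of any crate mentioned in this post.
fn approx_eq(a: f64, b: f64, epsilon: f64) -> bool {
    (a - b).abs() < epsilon
}

/// Stand-in for the wrapped C function.
mod sys {
    pub fn my_call() -> f64 {
        0.1 + 0.2
    }
}

/// Stand-in for the Rust port: same result up to rounding.
mod nat {
    pub fn my_call() -> f64 {
        0.3
    }
}

#[test]
fn does_it_work_with_floats() {
    let a = sys::my_call();
    let b = nat::my_call();

    // assert_eq!(a, b) would fail here because of rounding.
    assert!(approx_eq(a, b, 1e-9), "{} and {} differ too much", a, b);
}
```

An absolute tolerance like this one is fine when the values have a known range; for values spanning many orders of magnitude a relative tolerance is usually the safer choice.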

cargo-kcov or the still-nightly -Z profile + grcov and ccguard are useful to see how much of the code your test suite covers.

Make the code pretty

You may leverage the refactor tool provided by c2rust or do without:

Remove all the #![feature(...)] lines; you can usually do without them.
Replace the libc function calls.
Replace the manual allocations with normal constructors.
Replace the "relooped" code with saner control flow, possibly leveraging iterators.
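To give an idea of the cleanup, here is a hand-written approximation of the pointer-and-counter style a transpiler tends to produce, next to an idiomatic rewrite. It is not actual c2rust output, and sum_raw and sum are hypothetical names.

```rust
use std::os::raw::c_int;

/// Transpiler-style code: raw pointer, mutable counter, manual loop.
///
/// # Safety
/// `data` must point to `len` readable `c_int` values.
pub unsafe extern "C" fn sum_raw(data: *const c_int, len: usize) -> c_int {
    let mut acc: c_int = 0;
    let mut i: usize = 0;
    while i < len {
        // SAFETY: the caller guarantees `data` points to `len` valid items.
        acc = acc.wrapping_add(unsafe { *data.add(i) });
        i = i.wrapping_add(1);
    }
    acc
}

/// Idiomatic rewrite: a slice and an iterator, no `unsafe` needed.
pub fn sum(data: &[c_int]) -> c_int {
    data.iter().fold(0, |acc, &x| acc.wrapping_add(x))
}
```

Keeping both versions around for a while lets the comparative tests check one against the other.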

Make the code (as) fast

criterion or any of the tools I mentioned here will help you in your optimization journey.

Prepare an idiomatic API

Ideally the original C-API can live in a capi.rs file while the idiomatic one could stay in api.rs.

You might call the C-API from the Rust-API initially and end up having the C-API wrap the Rust-API once you are done.
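A minimal sketch of that end state, with both layers in one file for brevity. The names sum and mylib_sum are hypothetical, not functions of any real library; the comments say which file each would live in.

```rust
/// Idiomatic side, what would live in api.rs.
/// Sums the samples, wrapping on overflow.
pub fn sum(data: &[i32]) -> i32 {
    data.iter().fold(0, |acc, &x| acc.wrapping_add(x))
}

/// C side, what would live in capi.rs: it only converts types and delegates.
/// Toolchains older than Rust 1.82 spell the attribute `#[no_mangle]`.
///
/// # Safety
/// `data` must be null or point to `len` readable `int32_t` values.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn mylib_sum(data: *const i32, len: usize) -> i32 {
    if data.is_null() {
        return 0;
    }
    // SAFETY: upheld by the caller, see the contract above.
    let samples = unsafe { std::slice::from_raw_parts(data, len) };
    sum(samples)
}
```

With this layout all the logic sits behind the safe function, and the unsafe surface shrinks to the pointer-to-slice conversion.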

Make sure to write non-comparative tests by the time you complete this step.

Make the Rust library usable from C

I wrote cargo-c to make the process simple; see here if you want to know more.

$ cargo install cargo-c
$ cargo cinstall --prefix=/usr --destdir=/tmp/staging
$ sudo cp -a /tmp/staging/* /

That should be all you need to mention in the README.

Coming next

I still have to write another post regarding optimizing rav1e and the next rav1e release. I skipped blogging about 0.3 since I did talk a lot about it during the last FOSDEM.

Thanks

To Edoardo Morandi, who volunteered to help with the speexdsp-rs effort, and to all the rust-torino members.
