Introduction
I've been dreaming about completing the fast.ai course for about 4 years now. I've done some of the projects in the book, and they're fun; however, I've never liked the magical nature of the fastai library.
So I've decided to write this series as a means of understanding the underlying libraries and concepts. And since I love the language, and I'm a glutton for punishment, I've decided to do all of this in Rust. I've found the dfdx crate to be really approachable, and it has the same kind of strong typing that I love about the Rust language as a whole.
In particular, from the documentation:
dfdx is a cuda accelerated tensor and neural network library, written entirely in rust!
Additionally, it can track compile time shapes across tensor operations, ensuring that all your neural networks are checked at compile time.
I'm very excited about the compile time checking of the neural networks, but on the other hand, that also makes it more difficult to create neural networks on the fly. You have to make sure everything type checks. But once it does, you can be sure your dimensions line up properly.
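To make that concrete, here's a quick sketch of the kind of mistake the compiler rejects (using the tensor types we'll set up properly below):

use dfdx::prelude::*;

fn main() {
    let dev = AutoDevice::default();
    let a: Tensor<Rank1<3>, f32, _> = dev.tensor([1.0, 2.0, 3.0]);
    let b: Tensor<Rank1<4>, f32, _> = dev.tensor([4.0, 5.0, 6.0, 7.0]);

    // This line does not compile: addition is only defined for tensors
    // with matching shapes, so the mismatch is caught before anything runs.
    // let c = a + b;
}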
By no means am I an expert on anything I'm talking about; this is as much of a learning experience for me as I assume it is for you, dear reader. That being said, if you want to read the ramblings of an engineer attempting to learn how to build neural networks in a language that is, by most metrics, terrible for the field, then please, by all means, read on.
Part 1 of this series will only cover setting up the environment and showing off the tensor math capabilities of the dfdx crate. We'll discuss actually setting up a neural network in part 2.
Setting up
As is customary with these types of articles, I must start with the obligatory explanation of how to install the tools I'm going to be using.
We'll just be using the standard Rust installation method. I'm running Linux, so I'll copy the installation command below; macOS is the same. For Windows, you can click on the link in this paragraph and download the rustup-init.exe file appropriate for your CPU architecture.
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
This will install the Rust compiler and the cargo toolchain that we will use to fetch dependencies and build our code into native binaries.
Next, let's navigate to a comfortable directory where we can start coding. I use $HOME/git for all of my code.
cd $HOME/git/
mkdir -p articles/fastai-rust/
cd articles/fastai-rust/
Then we can use cargo to create our Rust project. I'm just going to go by chapters for right now.
cargo new --bin chapter1
cd chapter1/
Now if you run cargo run in this folder, you'll see the ceremonial "Hello, world!".
➜ cargo run
Compiling chapter1 v0.1.0 (/home/klah/git/articles/fastai-rust/chapter1)
Finished dev [unoptimized + debuginfo] target(s) in 0.25s
Running `target/debug/chapter1`
Hello, world!
Let's do something slightly interesting in the next section.
Doing something interesting
Now "Hello, world!" is all well and good, but that's just a starting place. Let's figure out how to get something mildly interesting going. Right now, I'm thinking that doing some simple tensor math is a good starting place. We can get to the actual meat of the course after we've seen the type of thing that dfdx
can do.
First let's add some useful crates, and our intended dfdx crate. I'm adding the env_logger and log crates for ease of logging.
cargo add env_logger log dfdx
And if we edit our main.rs file a bit:
fn main() {
    env_logger::Builder::new()
        .filter_level(log::LevelFilter::Info)
        .init();
}
This is the magic incantation that I use a lot in order to get sane default logging. It's nice because the default format includes a timestamp and the module that the log line came from, which gives us rudimentary timing at second-level granularity without having to inject specific timing code. That granularity will come in handy, since some of the steps we run later will take a long time.
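For example, with that builder in place, a plain log::info! call picks up the timestamp and module prefix for free:

// Anywhere after the init() call above:
log::info!("starting tensor demo");
// Prints something like:
// [2023-11-18T03:55:42Z INFO chapter1] starting tensor demo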
use dfdx::prelude::*;

fn main() {
    // ...
    let dev = AutoDevice::default();
}
This line creates a "device" that handles the actual construction of tensors and models. It can be either the CPU or an NVIDIA GPU running CUDA. We need a device to create tensors because, if we're creating them on a GPU, we can't just allocate a vector in main memory; we have to allocate it in GPU memory. The device helps you create tensors and initialize them with data.
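The device is also how you create tensors without spelling out every element. Here's a minimal sketch using the zeros and sample_normal constructors (names from my reading of the dfdx docs, so treat them as assumptions worth double-checking):

use dfdx::prelude::*;

fn main() {
    let dev = AutoDevice::default();

    // A 2x3 tensor of zeros, allocated wherever `dev` lives (CPU or GPU).
    let zeros: Tensor<Rank2<2, 3>, f32, _> = dev.zeros();

    // A length-4 tensor filled with samples from a standard normal distribution.
    let noise: Tensor<Rank1<4>, f32, _> = dev.sample_normal();

    println!("zeros = {:?}", zeros.array());
    println!("noise = {:?}", noise.array());
}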
// ... Rest of code
let a: Tensor<Rank1<3>, f32, _> = dev.tensor([1.0, 2.0, 3.0]);
let b: Tensor<Rank1<3>, f32, _> = dev.tensor([4.0, 5.0, 6.0]);
let c = a.clone() + b.clone();
log::info!("\n{:?} + {:?} = {:?}", a.array(), b.array(), c.array());
These lines create two one-dimensional tensors, add them together, and log the result. Tensor addition is element-wise: each element of one tensor is added to the corresponding element of the other. So we can see from the output: [1.0, 2.0, 3.0] + [4.0, 5.0, 6.0] = [5.0, 7.0, 9.0]
Calling the .array() method turns the tensor into something that implements Debug in a sensible manner. In particular, it turns a one-dimensional tensor into an array of the same size. We'll see how it similarly converts a multidimensional tensor into a multidimensional array in the next step.
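As a quick sketch, here's the concrete type you get back for a tensor like a above (the annotation on arr is only there to show the type):

use dfdx::prelude::*;

fn main() {
    let dev = AutoDevice::default();
    let a: Tensor<Rank1<3>, f32, _> = dev.tensor([1.0, 2.0, 3.0]);

    // For a Rank1<3> tensor, .array() hands back a plain [f32; 3].
    let arr: [f32; 3] = a.array();
    println!("{:?}", arr); // [1.0, 2.0, 3.0]
}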
// ... Rest of code
let e: Tensor<Rank2<2, 3>, f32, _> = dev.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]);
let f = dev.tensor([7.0, 8.0, 9.0]).broadcast();
let g = e.clone().powf(0.5) + (f.clone() * 2.0);
// Log the result
log::info!("\n √({:?}) \n+ (2.0 * {:?}) \n= {:?}", e.array(), f.array(), g.array());
In these lines, I decided to do something different. The first line just creates a two-dimensional tensor with a 2x3 shape. The line after also creates a two-dimensional tensor, but it does so by broadcasting a one-dimensional tensor into a second dimension, giving it the same shape as e.
This is done with fancy Rust type system trickery. Because we add f to e and store the result in g, Rust knows they must have the same shape, and therefore it knows to broadcast f into a 2x3 tensor.
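If you'd rather not lean on inference, you can also name the destination shape yourself with a turbofish. A small sketch (this is the broadcast API as I understand it, so verify against the docs):

use dfdx::prelude::*;

fn main() {
    let dev = AutoDevice::default();
    let row: Tensor<Rank1<3>, f32, _> = dev.tensor([7.0, 8.0, 9.0]);

    // Name the target shape explicitly; the `_` lets the compiler work
    // out which axes to replicate along.
    let f = row.broadcast::<Rank2<2, 3>, _>();
    println!("{:?}", f.array()); // [[7.0, 8.0, 9.0], [7.0, 8.0, 9.0]]
}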
In the third line of the snippet, I wanted to show some more advanced math operations. So I am taking the element-wise square root of e, doubling f, and adding the results together.
The final log line just makes the output easier to read by putting each tensor on its own line.
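And powf isn't the only element-wise operation available. A few more from my reading of the dfdx docs (a sketch, so treat the exact names as assumptions):

use dfdx::prelude::*;

fn main() {
    let dev = AutoDevice::default();
    let t: Tensor<Rank1<3>, f32, _> = dev.tensor([1.0, 4.0, 9.0]);

    println!("{:?}", t.clone().sqrt().array()); // [1.0, 2.0, 3.0]
    println!("{:?}", t.clone().exp().array());  // e^x for each element
    println!("{:?}", (t * 0.5).array());        // scalar ops are element-wise too
}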
Putting it all together
So here I've collected all the lines together into a single program, so you can see all the code in one place.
use dfdx::prelude::*;

fn main() {
    // Initialize logging
    env_logger::Builder::new()
        .filter_level(log::LevelFilter::Info)
        .init();

    // Initialize dfdx device, either CPU or CUDA depending on availability
    let dev = AutoDevice::default();

    // Create two tensors and add them together
    let a: Tensor<Rank1<3>, f32, _> = dev.tensor([1.0, 2.0, 3.0]);
    let b: Tensor<Rank1<3>, f32, _> = dev.tensor([4.0, 5.0, 6.0]);
    let c = a.clone() + b.clone();

    // Log the result
    log::info!("\n{:?} + {:?} = {:?}", a.array(), b.array(), c.array());

    // It even supports higher dimensional tensors
    let e: Tensor<Rank2<2, 3>, f32, _> = dev.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]);
    let f = dev.tensor([7.0, 8.0, 9.0]).broadcast();
    let g = e.clone().powf(0.5) + (f.clone() * 2.0);

    // Log the result
    log::info!("\n √({:?}) \n+ (2.0 * {:?}) \n= {:?}", e.array(), f.array(), g.array());
}
And if we run that we can see some simple tensor math getting performed.
➜ cargo run
Compiling chapter1 v0.1.0 (/home/klah/git/articles/fastai-rust/chapter1)
Finished dev [unoptimized + debuginfo] target(s) in 0.84s
Running `target/debug/chapter1`
[2023-11-18T03:55:42Z INFO chapter1]
[1.0, 2.0, 3.0] + [4.0, 5.0, 6.0] = [5.0, 7.0, 9.0]
[2023-11-18T01:37:13Z INFO chapter1]
√([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
+ (2.0 * [[7.0, 8.0, 9.0], [7.0, 8.0, 9.0]])
= [[15.0, 17.414213, 19.73205], [16.0, 18.236069, 20.44949]]
Next time
In the next installment, I plan to start working on the dog/cat categorizer that is our first model from chapter 1.
This is not nearly as easy to do in Rust as it is in Python, so there will be a number of steps to get us into a position where we can actually create the model and run it.