Maksim Ryndin

Posted on Jul 30 • Updated on Aug 27

Cairo for Rust devs I (verifiable computation)

#rust #cairo #starks

This is a series of articles about Cairo usually mentioned in the context of web3 and Starknet smart contracts but here we would like to expose it as a language for creating provable and verifiable programs which allows you to outsource the computation in a trustless manner¹.

It is a work-in-progress. Updates are expected. Experienced Caironautes are welcome to point out where I am wrong.

Why Cairo when Rust is already so awesome
Cairo
- Mind Map
- Setup
- Input and output
- Provers
- Proof
Conclusion
References

Why Cairo when Rust is already so awesome

Let's say, you have a task of finding arbitrage opportunities between different currency markets². You can model this problem via finding the negative cycle in the graph where vertices are currencies and edges are pair markets. The graph is directed and weighted where weights are logarithms of exchange rates. Bellman-Ford shortest path algorithm is usually applied in this case (we will focus on verifiable computation here and on the implementation in the next article so don't worry about the algorithm now).

The time complexity of the algorithm is O(|V||E|) (where V is a set of vertices and E is a set of edges) so you're going to outsource this computation for large graphs to a cloud provider but due to the money-related nature of the problem you want to make it reliably - you do want to verify the result without re-running the algorithm (i.e. be sure that the provider actually ran the binary with the algorithm). In other words, you want to ensure a computational integrity via a succinct verification of the output³.

Hm... How would we make it?

The conventional thinking would suggest us to use hashing of binaries (actually hashes are the right direction) but wouldn't guarantee the unchanged runtime (memory can be changed), inputs can be changed. So traditional approaches focus on how to protect the binary and input from being tampered in any way. But there is always some room for cheating.

To our relief brainy cryptographers have come up with various proof schemes. In our example, the cloud provider is a prover who runs the expensive computation and delivers the output and the proof⁴ to the verifier (us). One of the latest state-of-the-art proof schemes is STARK which in layman's (my) terms basically records the computation execution trace which is then randomly sampled by verifier and checked against predefined constraints⁵.

It is possible to integrate this execution tracing into an existing Rust program using some of general STARK libraries (e.g. Winterfell) but it would require knowing the low-level things very well (see the example for very simple computation) and be sure that all computations are traced (and all constraints are enforced).
Would it be nice to abstract away the low-level complexities and work with a familiar Rust syntax? Welcome to Cairo!

Cairo

Cairo is a language for creating programs which are allow to verify computations without re-running the program. Cairo is heavily inspired by Rust⁶ (move semantics and ownership, traits, core types, tooling) but due to the nature of verifiable computation the underlying machinery (memory model and a non-deterministic execution) is completely different. Our knowledge of Rust should help to quickly get used to the syntax and control flow but we should be careful when thinking over the computation.

Mind map

Rust	Cairo	Purpose
rustc	cairo	compiler
cargo	scarb	multitool (dependencies, build, test)
Cargo.toml	Scarb.toml	project configuration
rustup	asdf	toolchain management
crates.io	scarbs.xyz	package registry
Rust Playground	cairovm.codes	online compilation

Setup

I expect that you have Rust already installed. Let's also install scarb (asdf is a recommended way) and make a project⁷:

$ scarb new bellman_ford_cairo

If we open the generated repository (there is the official VS code extension⁸ to highlight the syntax), we will see the familiar structure.

Input and output

Let think on the input of data. In the context of a pricing data and different currency pairs we can expect a sparse graph. So a lean representation would be to use a vector of tuples Vec<(String, String, f32)> where strings denote currency names (like USD or EUR) and f32 is an exchange rate. All sounds good except... except some minor things:

There are no utf8 String's in Cairo - alternatives are short strings (up to 31 ascii bytes) and long strings (byte arrays of ascii bytes).
There is no builtin dynamically growing modifiable array like Vec in Rust (but with Cairo 2.7.0 it is expected). There is an appendable-only (and poppable from the front) Array which can serve as a FIFO-queue and Felt252Dict<T> which can serve as a basis to implement library-defined Vec.
There are no floats in Cairo. So either we can revert to library-defined fixed-point numbers or multiply input floats in advance by some integer constant and drop the fractional part.

And we also may stuck how to provide the input as there is no such thing as std wrappers to libc to read stdin or a file. It is like Rust in #[no_std]. Let's investigate which options we have:

$ scarb cairo-run --help

There is a section Arguments which says that a program argument is "a JSON array of numbers, decimal bigints or recursive arrays of those" and the main function should have a corresponding signature (there is an example [1, 2, [3, 4, 5]]` to `fn main(t: (u64, u64), v: Array<u64>)). So let's remove an example code from lib.cairo and try the following main function:

fn main(data: Array<(u16, u16, felt252)>) {
    println!("number of edges: {}", data.len());
}

Here we encode currency names with u16 (they represent graph edges as a pair of vertices) and felt252 represents an exchange rate (a weight of the edge). Here we should talk a little bit more about the CPU architecture which is expected by Cairo program. The machine word is 252 bits wide and is represented by felt252 type (Field Elemen*t*, because of the underlying math). So every type is Serialization of Cairo types eventually represented via felt252. So basically the maximum number we can get is 252 bits and if we are going to encode floats by multiplying them with some integer constant, felt252 seems to be a right choice.

Let's compile and run the program (think cargo run):

$ scarb cairo-run "[[1, 2, 300]]"

Output:

number of edges: 1
Run completed successfully, returning []

It works! Ok, what about an output? returning [] hints that we can try the same way as an input. We expect that the algorithm returns a path corresponding to the arbitrage opportunity. Let's try:

fn main(data: Array<(u16, u16, felt252)>) -> Array<u16> {
    println!("number of edges: {}", data.len());
    array![1, 2, 3, 4] // path 1 -> 2 -> 3 -> 4
}

The output:

number of edges: 1
Run completed successfully, returning [2837, 2841]

What??? By the way, you may get another numbers as I did in other runs as the execution is non-deterministic. Let's first inspect array! macro. We see that it is a code-generation plugin to the compiler which basically instantiate an array ArrayTrait::new() and appends every element to it. Why ArrayTrait and not just Array? Digging further leads us to Array implementation.

We cannot find a familiar trait ArrayTrait<T> but there is again a macro:

#[generate_trait]
pub impl ArrayImpl<T> of ArrayTrait<T> {

Searching over the codebase of Cairo - and here it is. What does it generate? In Cairo defining methods for a type requires a trait as methods are defined only for a trait. So #[generate_trait] reduces a boilerplate generating a corresponding trait definition out of the implementation for a type. You can always define a corresponding trait manually (e.g. check the library-implemented Queue). Thus the associated (i.e. doesn't require an instance) method new is called on ArrayTrait.
Let's come back to the output. We do specify the output as Array<u16>. And the output of 2 numbers instead of 4 suggests that we deal with some compact serialization scheme.
Let's test this hypothesis with a bigger int and larger array:

fn main(data: Array<(u16, u16, felt252)>) -> Array<u128> {
    println!("number of edges: {}", data.len());
    // u128::MAX but there is not MAX const 
    // for unsigned integers
    let elem = 340282366920938463463374607431768211455; 
    array![elem, elem, elem, elem, elem, elem]
}

It again returns two numbers:

number of edges: 1
Run completed successfully, returning [2845, 2851]

Damn! Notice that 2851 - 2845 = 6 while earlier it was 2841 - 2837 = 4 so it looks like slice bounds as it corresponds to the number of integers we hope to output. Let's find Run completed successfully, returning in the compiler source code. It seems that scarb runs this binary. And the output is Vec<Felt252>.

/// The ran function return value.
#[derive(Debug, Eq, PartialEq, Clone)]
pub enum RunResultValue {
    /// Run ended successfully, returning the memory of the non-implicit returns.
    Success(Vec<Felt252>),
    /// Run panicked, returning the carried error data.
    Panic(Vec<Felt252>),
}

"returning the memory of the non-implicit returns"... Here the hint - it's a memory. Cairo's memory model is a contiguous immutable array. Perhaps, these two felt252 numbers denote the start and end indices in the memory? scarb cairo-run --help suggests an argument --print-full-memory. Let's try that for the original program:

fn main(data: Array<(u16, u16, felt252)>) -> Array<u16> {
    println!("number of edges: {}", data.len());
    array![1, 2, 3, 4]
}

and run

$ scarb cairo-run "[[1, 2, 300]]" --print-full-memory

The output

number of edges: 1
Run completed successfully, returning [2837, 2841]
Full memory: [_, 290341444919459839, 1, 
... 
6744073709483335, 4130, 594, 4166, 500, 1, 2, 300, 49, 1997209042069643135709344952807065910992472029923670688473712229447419591075, 
0, 2463311330861459210825190905860592116187541770, 19, 
1, 2, 3, 4, ]

Hm... Let's try to use 2837 as an index to this "contiguous" array of memory. Just replace "_" with 0 (it is just to feed it into python) and let's feed the memory array into a python interpreter:

>>> memory = [0, 290341444919459839, 1, 
... 
6744073709483335, 4130, 594, 4166, 500, 1, 2, 300, 49, 1997209042069643135709344952807065910992472029923670688473712229447419591075, 
0, 2463311330861459210825190905860592116187541770, 19, 
1, 2, 3, 4]
>>> memory[2837]
1
>>> memory[2838]
2
>>> memory[2839]
3
>>> memory[2840]
4
>>> memory[2841]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range

Nice. At least now we understand what these two numbers in the output mean - they denote output start and end indices in a memory array. Ok, but let's remember our initial goal - we need to outsource the computation and then be able to verify it. Perhaps, for that thing there is a more convenient way to get the output?⁹

Provers

There are several provers for STARKs and some convenient wrappers for Cairo programs. I'm going to use Stone as it is the one used by Starknet. Let's try to prove and verify our program. The readme provides an example.

The prover requires a memory dump, a trace file and an AIR¹⁰ input after the execution but cairo-run cli doesn't expose these arguments.

We haven't yet checked how our compiled "binary" looks like. Let's inspect target directory. It contains bellman_ford_cairo.sierra.json. It is our program as a Sierra intermediate representation, think as of LLVM IR. I wouldn't go into any details, just recommend the articles by Mathieu Saugier

We cannot run a json file. It should be compiled to a target:

cairo assembly (casm) interpreted by a cairo virtual machine
wasm executed by wasm vm
native host assembly - see cairo native

If we check what cairo-run uses, it is a Rust vm. Let's clone the vm repository and run

$ cargo install --bin cairo1-run --path cairo1-run

then we get a cairo vm interpreter and let's run this interpreter in our repository:

$ cairo1-run src/lib.cairo \
    --layout=small \
    --air_public_input=public_input.json \
    --air_private_input=private_input.json \
    --trace_file=trace.bin \
    --memory_file=memory.bin \
    --proof_mode

We need to specify:

the layout as we need an output
the public input (our graph) as the provider should prove to verifier that he/she knows an optimal route to the specific graph and not that he/she knows an optimal route to some undisclosed graph. In other words, our input is public.

If we run the command above... we get an error:

thread 'main' panicked at cairo1-run/src/main.rs:181:18:
called `Result::unwrap()` on an `Err` value: Failed to find development corelib.
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

So 1) there is some form of linking 2) if there is a development corelib, then there is a release one. There is an instruction in README for scarb projects - we can provide a Sierra program to the interpreter and add

[cairo]
enable-gas = false

to Scarb.toml¹¹.

So let's try

$ scarb --release build
$ cairo1-run target/release/bellman_ford_cairo.sierra.json \
    --layout=small \
    --air_public_input=public_input.json \
    --air_private_input=private_input.json \
    --trace_file=trace.bin \
    --memory_file=memory.bin \
    --proof_mode

And we get... Error: Runner(MissingMain). >_<. Let's try the dev mode again:

$ scarb build
$ cairo1-run target/dev/bellman_ford_cairo.sierra.json \
    --layout=small \
    --air_public_input=public_input.json \
    --air_private_input=private_input.json \
    --trace_file=trace.bin \
    --memory_file=memory.bin \
    --proof_mode

The error Error: IlegalInputValue. Ah, we should provide an input:

$ cairo1-run target/dev/bellman_ford_cairo.sierra.json \
    --layout=small \
    --air_public_input=public_input.json \
    --air_private_input=private_input.json \
    --trace_file=trace.bin \
    --memory_file=memory.bin \
    --proof_mode \
    --args "[[1, 2, 300]]"

Again in the readme "only allowed Array<felt252> as return and input value in the proof mode". Ok, let's change the code:

fn main(data: Array<felt252>) -> Array<felt252> {
    println!("number of edges: {}", data.len());
    array![1, 2, 3, 4]
}

$ scarb build
$ cairo1-run target/dev/bellman_ford_cairo.sierra.json \
    --layout=small \
    --air_public_input=public_input.json \
    --air_private_input=private_input.json \
    --trace_file=trace.bin \
    --memory_file=memory.bin \
    --proof_mode \
    --args "[1 2 300]"

And it works!!! Hooray!! There two other useful arguments: --print_output and --args_file that allow a readable output and a large input graph.

$ cat graph.txt
[1 2 300]
$ cairo1-run target/dev/bellman_ford_cairo.sierra.json \
    --layout=small \
    --air_public_input=public_input.json \
    --air_private_input=private_input.json \
    --trace_file=trace.bin \
    --memory_file=memory.bin \
    --proof_mode \
    --args_file graph.txt \
    --print_output
[DEBUG]                                 (raw: 1997209042069643135709344952807065910992472029923670688473712229447419591075) 
[DEBUG]                                 (raw: 0)
[DEBUG] number of edges: 3
                (raw: 2463311330861459210825190905860592116187542282)
[DEBUG]                                 (raw: 19                             )

Program Output : [1 2 3 4]

Yeeeeee! Before finishing this section let's fix the release mode. As the dev profile works, it seems that there is a difference in profiles. It seems that sierra-replace-ids affects this.

In Scarb.toml we have:

[cairo]
enable-gas = false
sierra-replace-ids = true

and

$ scarb --release build
$ cairo1-run target/release/bellman_ford_cairo.sierra.json \
    --layout=small \
    --air_public_input=public_input.json \
    --air_private_input=private_input.json \
    --trace_file=trace.bin \
    --memory_file=memory.bin \
    --proof_mode \
    --args_file graph.txt \
    --print_output
[DEBUG]                                 (raw: 1997209042069643135709344952807065910992472029923670688473712229447419591075) 
[DEBUG]                                 (raw: 0)
[DEBUG] number of edges: 3
                (raw: 2463311330861459210825190905860592116187542282)
[DEBUG]                                 (raw: 19                             )

Program Output : [1 2 3 4]

The output is the same. Note [DEBUG] output is related to cairo1-run. In the repository we should now have files:

memory.bin
private_input.json
public_input.json
trace.bin

It seems that we have all the artifacts to generate a proof.

Proof

I've prepared a ready-to-use (and arm64 compatible!) stone prover with Docker (thus you outsourced me the prover's compilation and I outsourced Docker image building to Github in a trusted way :)).

Then in our repository run:

$ docker run --rm -it -v $(pwd):/tmp maksimryndin/starknet-stone:latest prover \
    -logtostderr \
    --out_file=/tmp/proof.json \
    --private_input_file=/tmp/private_input.json \
    --public_input_file=/tmp/public_input.json \
    --prover_config_file=/tmp/cpu_air_prover_config.json \
    --parameter_file=/tmp/cpu_air_params.json

The files cpu_air_prover_config.json and cpu_air_params.json I took from Stone e2e test.

The output:

  what():  src/starkware/stark/stark.cc:187: Fri parameters do not match stark degree bound. Expected FRI degree from FriParameters: 8192. STARK: 131072
Stack trace (most recent call last):
<...>

Do you have any idea what it means? Me - no :)). Let's try to explore the source code of the prover. The number 131072 denotes the actual degree - trace_length attribute of Air class which seems to be instantiated from the statement. The statement is a combination of public and private inputs, air parameters file. 8192 denotes the expected which is calculated from cpu_air_params.json with the function GetFriExpectedDegreeBound. The sample cpu_air_params.json is:

{
    "field": "PrimeField0",
    "stark": {
        "fri": {
            "fri_step_list": [
                0,
                4,
                3
            ],
            "last_layer_degree_bound": 64,
            "n_queries": 18,
            "proof_of_work_bits": 24
        },
        "log_n_cosets": 4
    },
    "use_extension_field": false
}

So GetFriExpectedDegreeBound calculates it like 8192 = last_layer_degree_bound * 2.pow(0) * 2.pow(4) * 2.pow(3) or in logarithm base 2 terms: log(trace_length) = log(last_layer_degree_bound) + ∑fri_step_list (this is almost the note). The error raises on testing an equality between an actual and an expected degrees. To tune the air configuration in a meaningful way we have to learn some theory but for now let's just add an additional FRI step of 4 to hold the equation:

{
    "field": "PrimeField0",
    "stark": {
        "fri": {
            "fri_step_list": [
                0,
                4,
                3,
                4
            ],
            "last_layer_degree_bound": 64,
            "n_queries": 18,
            "proof_of_work_bits": 24
        },
        "log_n_cosets": 4
    },
    "use_extension_field": false
}

If we run the prover again (make sure to update memory and trace paths in private_input.json), it outputs:

I0730 10:39:04.829474     1 profiling.cc:58] Prover started
I0730 10:39:05.030071     1 memory_cell.inl:121] Filled 12256 vacant slots in memory: 171 holes and 12085 spares.
I0730 10:39:19.560891     1 stark.cc:425] Trace cells count:
Log number of rows: 17, first trace n_cols: 23, interaction trace n_cols: 2, Total trace cells: 3276800
I0730 10:40:03.643604     1 prover_main_helper_impl.cc:178] Byte count: 88424
Hash count: 1344
Commitment count: 6
Field element count: 1419
Data count: 1
I0730 10:40:03.658634     1 profiling.cc:85] Prover finished in 58.8287 sec

Note that total number of rows (2.pow(17)) is 131072. And we can verify the computation:

$ docker run --rm -it -v $(pwd):/tmp maksimryndin/starknet-stone:latest verifier \
    -logtostderr \
    --in_file=/tmp/proof.json
I0730 11:14:42.393836     1 cpu_air_verifier_main.cc:39] Proof verified successfully.

Conclusion

In this article we learnt about verifiable computations with Cairo, compiled a basic program, explored input/output for Cairo programs, and proved/verified the basic program. As a result of our trial-and-error path, the convenient wrapper tools were born which we will use further. All the source code is in the repository.

Now it's time to digest it and learn some cool things from the references 8 - 12 (the order is a recommended way).

In the next article(s) we will implement Bellman-Ford algorithm, try to break/attack the proof scheme, investigate the time complexity of the verification, an overhead over the Rust version.

References

Cairo book (accessed in July 2024)
Introducing Hints in Cairo programming language (accessed in July 2024)
Cairo0 docs (accessed in July 2024)
Stwo Prover: The next-gen of STARK scaling is here (accessed in July 2024)
Under the hood of Cairo 1.0: Exploring Sierra (accessed in July 2024)
Zero Knowledge Systems You Can Trust (accessed in July 2024)
Unleashing the power of the Stone Prover (accessed in July 2024)
STARK Math: The Journey Begins (accessed in July 2024)
Arithmetization I (accessed in July 2024)
Arithmetization II (accessed in July 2024)
Low Degree Testing (accessed in July 2024)
Anatomy of a STARK (accessed in July 2024)

The phrase "reliable outsourcing of computation" credits to Lurk language. Volunteers are invited to create an article "Lurk for List/Clojure devs". It would be interesting to compare both languages for verifiable computation. ↩
in these articles we aren't tailored to web3 at all but they are within my ongoing effort to seize arbitrage opportunities in crypto. For those interested there is the Telegram group uniting MEV (re)searchers. ↩
Another prominent use case in the context of web3: we can move some computation from onchain to offchain in a trustless manner (Starknet is all around this). ↩
A verifier is actually protected against a computationally bound prover so it is not a strict proof, rather an argument of knowledge but I will stick with the word "proof" to avoid over-complication. ↩
Besides a computational integrity, there can be another (optional) requirement of zero-knowledge (ZK) when the prover doesn't reveal some information about the input. The rabbit hole is deep that there is no single article - follow the suggested reading. By the way, many STARK-provers still (July 2024) do not implement perfect ZK, e.g. Winterfell, miniSTARK, Stark Platinum Prover. Also refer to the article. ↩
Actually, there is Cairo0 which was the first version of a language for STARK-verifiable computation. It is basically an assembly language for STARK CPU. You may still encounter it in some projects. In this article and in general, mentioning Cairo means talking about Cairo1, a Rust-like language. ↩
It seems that a separation between binary and library crates in Cairo is via inclusion/exclusion of main function in lib.cairo. ↩
Make sure to choose Cairo 1.0 from Starkware Industries. ↩
There are also hints which were first introduced in Cairo0 and haven't yet implemented in Cairo1 in the official compiler. And they can possibly allow for more interactions with the outer world. ↩
See Stark math articles on Arithmetization. ↩
Gas metering should protect a prover from infinite loops as relations between a prover and a verifier are trustless. But for the moment I have no idea why "cairo1-run skips gas checks when running". ↩

DEV Community

Cairo for Rust devs I (verifiable computation)

Why Cairo when Rust is already so awesome

Cairo

Mind map

Setup

Input and output

Provers

Proof

Conclusion

References

Top comments (0)

Read next

Ratatui Audio with Rodio: Sound FX for Rust Text-based UI

Battleship Game in RUST

Sometimes all we need is a good safe word! Even ChatGPT!

Tree data structures in Rust with tree-ds (#1: Getting Started)