## DEV Community

Matteo Bertini

Posted on • Updated on

# Procedural macro in Rust 101

Cover Image by Photo by Ricardo Gomez Angel on Unsplash

# Procedural macro in Rust 101

Today, with Rust 1.51, no macro is required! (see below)

How to pick a function and make it a macro with added superpowers

One upon a time I had the need to take a list of items and return all the combinations of all couples.

This task in Python is easily accomplished using the built-in itertools.combinations:

`itertools.combinations(iterable, r)`
Return `r` length subsequences of elements from the input iterable.
Combinations are emitted in lexicographic sort order. So, if the input iterable is sorted, the combination tuples will be produced in sorted order.

That offers a baseline implementation in Python (the real one is in C and based on iterators).

``````def combinations(iterable, r):
# combinations('ABCD', 2) --> AB AC AD BC BD CD
# combinations(range(4), 3) --> 012 013 023 123
pool = tuple(iterable)
n = len(pool)
if r > n:
return
indices = list(range(r))
yield tuple(pool[i] for i in indices)
while True:
for i in reversed(range(r)):
if indices[i] != i + n - r:
break
else:
return
indices[i] += 1
for j in range(i+1, r):
indices[j] = indices[j-1] + 1
yield tuple(pool[i] for i in indices)
``````

What we have here is a functions that takes a list and an integer `r` and returns a tuple of length `r`... hmm, not (yet) possible in plain Rust, but let's focus on my problem, I currently only need couples!

Here is a basic porting in Rust (generic enough to be modified for `r > 2`):

``````fn combinations2<T>(pool: &Vec<T>) -> Vec<[&T; 2]> {
let r = 2; // <----- is the length of this ^
let mut res = vec![];
let n = pool.len();
if r > n {
return res;
}
let mut indices: Vec<_> = (0..r).collect();
// we cannot `yield` (yet) in Rust, so we need an accumulator
res.push([&pool[indices[0]], &pool[indices[1]]]);
loop {
// since this is not a generator like in Python, we cannot `return`, so the code
// is slightly changed, `find` replaces the first `for` in Python.
match (0..r).rev().find(|&i| indices[i] != (i + n - r)) {
Some(i) => {
indices[i] += 1;
for j in (i + 1)..r {
indices[j] = indices[j - 1] + 1;
}
res.push([&pool[indices[0]], &pool[indices[1]]]);
}
None => break,
}
}
res
}
``````

Months later, I tried to achieve something using basic macros by example1 in Rust, but with no luck, it's not easy to take a single integer and expand2 it into a list of something, the code burden was going too high, and I'd have to chose upfront some maximum value for the `r` number 3.

So I gave up till the more powerful procedural macro reached the almost stable point!

But here be dragons! `syn`, `proc_macro`, `proc_macro2`, `quote!`, WTF?

## Parsing and quoting

Luckily the `syn` crate repository contains some very useful examples, in particular: examples/lazy-static is An example of parsing a custom syntax within a `functionlike!(...)` procedural macro.4

But lets start from the `syn` crate, that quoting from the README is:

Syn is a parsing library for parsing a stream of Rust tokens into a syntax tree of Rust source code.

So, given some Rust-like code, Syn will give us a syntax tree, ready to be used through all the basic types of the syntax tree, like `Expr`, `Ident`, `Token`, `Type`, and many others.
As noted in the README, Syn offers `proc_macro2` a Proc macro shim , that allows Syn to support back to Rust 1.15.

The little quirk of `proc_macro2` is that we are not allowed to use everything from a single place, because:

In general all of your code should be written against `proc-macro2` rather than `proc-macro`. The one exception is in the signatures of procedural macro entry points, which are required by the language to use `proc_macro::TokenStream.`

Once we have parsed our input, we need to generate back some Rust code, and here kicks in the `quote` crate, that:

... provides the `quote!` macro for turning Rust syntax tree data structures into tokens of source code.

Our workflow will be:

1. define some syntax for our procedural macro5, for example `combinations!(comb3, 3)` that will declare a function `comb3(values: &[T]) -> Vec<[T; 3]>`,
2. define a `struct` to store the parsed parts from the macro call: `comb3` and `3`,
3. use those values to parametrize the specialized code of `combinations2`,
4. return the abstract new function and build back the Rust code for the compiler,
5. call the new macro-generated function!

## Putting the pieces together

The first piece of the puzzle is a `struct` that will contain the user values in the macro definition, in this case we have only a name and a number:

``````struct  Combinations {
name: syn::Ident,
n: syn::LitInt,
}
``````

And now we need some way to parse something into this `struct`, and here kicks in the `Parse` trait a Parsing interface implemented by all types that can be parsed in a default way from a token stream:

``````impl Parse for Combinations {
fn parse(input: ParseStream) -> Result<Self> {
let name = input.parse()?;
input.parse::<syn::Token![,]>()?;
let n = input.parse()?;
Ok(Combinations { name, n })
}
}
``````

Thanks to the Rust type inference super powers, calling `input.parse()?` let us parse the values into the right type or return and error if the parsing fails.
The strange line `input.parse::<syn::Token![,]>()?;` is a special syntax to parse simple tokens.
Now that we have parsed our input, we can write our super minimal macro, to test that everything worked as expected:

``````#[proc_macro]
pub fn minimal(input: TokenStream) -> TokenStream {
let Combinations { name, n } = parse_macro_input!(input as Combinations);
quote!{
fn #name() -> i32 {
#n
}
}
}
``````

Ops, compilation error :/

``````   Compiling comb v0.1.0 (/home/naufraghi/Documents/src/circle-rs.git/comb)
error[E0308]: mismatched types
--> comb/src/lib.rs:27:5
|
25 |   pub fn minimal(input: TokenStream) -> TokenStream {
|                                         ----------- expected `proc_macro::TokenStream` because of return type
26 |       let Combinations { name, n } = parse_macro_input!(input as Combinations);
27 | /     quote!{
28 | |         fn #name() -> i32 {
29 | |             #n
30 | |         }
31 | |     }
| |_____^ expected struct `proc_macro::TokenStream`, found struct `proc_macro2::TokenStream`
|
= note: expected type `proc_macro::TokenStream`
found type `proc_macro2::TokenStream`
``````

Remember, `syn` offers a drop-in replacement for everything, except `TokenStream`, that is supposed to be the original from the compiler interfaces. Luckily the conversion from one `into()` the other is simple.

``````#[proc_macro]
pub fn minimal(input: TokenStream) -> TokenStream {
let Combinations { name, n } = parse_macro_input!(input as Combinations);
(quote!{
fn #name() -> i32 {
#n
}
}).into()
}
``````

Without the extra parentheses you will have another strange compilation error, sadly is not even possible to expand the macro to see exactly what's happening, ...

``````   Compiling comb v0.1.0 (/home/naufraghi/Documents/src/circle-rs.git/comb)
error: expected expression, found `.`
--> comb/src/lib.rs:31:6
|
31 |     }.into()
|      ^ expected expression
error[E0308]: mismatched types
--> comb/src/lib.rs:27:5
|
27 | /     quote!{
28 | |         fn #name() -> i32 {
29 | |             #n
30 | |         }
31 | |     }.into()
| |_____^ expected (), found struct `proc_macro2::TokenStream`
|
= note: expected type `()`
found type `proc_macro2::TokenStream`
``````

Did I said expand? Yes, we can look the macro expanded code with the `cargo expand` command, and this is the result:

``````\$ cargo expand --bin comb
Checking circle v0.1.0 (/home/naufraghi/Documents/src/circle-rs.git)
Finished dev [unoptimized + debuginfo] target(s) in 0.13s
#![feature(prelude_import)]
#![no_std]
#[prelude_import]
use ::std::prelude::v1::*;
#[macro_use]
extern crate std as std;
fn mini3() -> i32 { 3 }    // <----- here is our expansion!
// ...
``````

Summing all up, we have this pipeline:

``````TokenStream -> parse_macro_input!(...) -> quote!{...} -> TokenStream
``````

And inside `quote!{}` the variables are accessed with the `#name` syntax.

### Extra powers

Here we are, we have written a lot more code to reach a point we already know how to reach with the good old macros by example, but we had a problem, how to expand a number into a list, because this is the monster of this level:

``````res.push([&pool[indices[0]], &pool[indices[1], ...]])
``````

We need some way to accumulate a little piece of syntax, `&pool[indices[#i]]`, and at the end join the parts into a comma separated list. Digging a bit into `syn`, what we need is `syn::punctuated`:

A punctuated sequence of syntax tree nodes separated by punctuation.
Lots of things in Rust are punctuated sequences.

• The fields of a struct are `Punctuated<Field, Token![,]>`. ...

What we can do is:

``````#[proc_macro]
pub fn punctuated(input: TokenStream) -> TokenStream {
let Combinations { name, n } = parse_macro_input!(input as Combinations);
let indices: syn::punctuated::Punctuated<_, syn::Token![,]> = (0..n.value() as i32)
.map(|i| quote! { #i })  // <-- add the quoted variable, `i` as a token
.collect();
(quote! {
fn #name() -> Vec<i32> {
vec![#indices]  // put the inner part of the list into something
}
})
.into()
}
``````

We can finally call the macro `punctuated!(punct4, 4);` that expands into:

``````fn punct4() -> Vec<i32> { <[_]>::into_vec(box [0i32, 1i32, 2i32, 3i32]) }
``````

We now have all the pieces to make our combinations function generic on the size r of the returned array:

``````#[proc_macro]
/// `combinations!(comb, N)` macro will define a function `comb` that takes a `&[T]` and returns
/// a vector of arrays of length _N_, `Vec<[&T; N]>`, covering all the combinations of _N_ elements
/// from the source vector.
pub fn combinations(input: TokenStream) -> TokenStream {
let Combinations { name, n } = parse_macro_input!(input as Combinations);
let pool_indices: syn::punctuated::Punctuated<_, syn::Token![,]> = (0..n.value() as usize)
.map(|i| quote! { &pool[indices[#i]] })
.collect();
let comb = quote! {
fn #name<T>(pool: &[T]) -> Vec<[&T; #n]> {
let r = #n;
let mut res = vec![];
let n = pool.len();
if r > n {
return res;
}
let mut indices: Vec<_> = (0..r).collect();
res.push([#pool_indices]);
loop {
match (0..r).rev().find(|&i| indices[i] != (i + n - r)) {
Some(i) => {
indices[i] += 1;
for j in (i + 1)..r {
indices[j] = indices[j - 1] + 1;
}
res.push([#pool_indices]);
},
None => break,
}
}
res
}
};
comb.into()
}
``````

Here we are, we can now define with a simple macro call a function with the wanted size, in the future we may also create and call a macro in a single shot, but not yet 5.

A final note, the macro can be used only from another crate, the suggested layout is highlighted in the lazy-static example.

Thanks to @dodomorandi for the review of the first draft and for the suggestions about a more idiomatic `combinations2`, and for his `Iterator` based implementation, and thanks to @lu-zero for the review and for the suggestion to avoid `syn` and search for a simpler method, task I left as an exercise for the reader.

#### Iterator based implementation

``````#[proc_macro]
pub fn iter_combinations(input: TokenStream) -> TokenStream {
let Combinations { name, n } = parse_macro_input!(input as Combinations);
let pool_indices: syn::punctuated::Punctuated<_, syn::Token![,]> = (0..n.value() as usize)
.map(|i| quote! { &self.pool[self.indices[#i]] })
.collect();

let initial_indices: syn::punctuated::Punctuated<_, syn::Token![,]> =
(0..n.value() as usize).collect();

let iter_name = proc_macro2::Ident::new(
&format!("CombIndicesIter{}", n.value()),
proc_macro2::Span::call_site(),
);

let comb = quote! {
pub struct #iter_name<'a, T> {
pool: &'a [T],
indices: [usize; #n],
started: bool,
}

impl<'a, T> Iterator for #iter_name<'a, T> {
type Item = [&'a T; #n];

fn next(&mut self) -> Option<Self::Item> {
if !self.started {
self.started = true;
Some([#pool_indices])
} else {
let n = self.pool.len();
(0..#n).rev().find(|&i| self.indices[i] != i + n - #n)
.map(|i| {
self.indices[i] += 1;
for j in (i + 1)..#n {
self.indices[j] = self.indices[j - 1] + 1;
}
[#pool_indices]
})
}
}
}

fn #name<T>(pool: &[T]) -> impl Iterator<Item = [&T; #n]> {
#iter_name {
pool,
indices: [#initial_indices],
started: false,
}
}
};
comb.into()
}
``````

# "Modern" Rust `combinations()`

Today, with Rust 1.51, there is no need to use any* macro:

``````use std::convert::TryInto;

fn combinations<T, const R: usize>(pool: &Vec<T>) -> Vec<[&T; R]> {
let mut res = vec![];
let n = pool.len();
if R > n {
return res;
}
// may have used `array-init` https://docs.rs/array-init/2.0.0/array_init/index.html
macro_rules! into_array {
(\$it:expr) => {
\$it.collect::<Vec<_>>().as_slice().try_into().expect("should be of size R")
};
}
let mut indices: [usize; R] = into_array!(0..R);
let reversed: [usize; R] = into_array!((0..R).rev());
res.push(into_array!(indices.iter().map(|i| &pool[*i])));
loop {
let idx = (|| {
for i in reversed.iter() {
if indices[*i] != (*i + n - R) {
return Some(*i);
}
}
None
})();
if let Some(i) = idx {
indices[i] += 1;
for j in (i + 1)..R {
indices[j] = indices[j - 1] + 1;
}
res.push(into_array!(indices.iter().map(|i| &pool[*i])));
} else {
break;
}
}
res
}

fn main() {
let numbers = vec![1i32,2,3,4];
let combs: Vec<[_; 2]> = combinations(&numbers);
// `let combs = combinations::<_, 2>(&numbers)` works, but IMHO the constraint
// on the returned type is more explanatory and futureproof
println!("{:?}", combs);
}
``````

1. Super in-depth intro by DanielKeep@github: A Practical Intro to Macros in Rust 1.0

2. I'd have to expand `2 -> 0, 1`, and then map it to `&pool[indices[0]], &pool[indices[1]]`

3. Again by DanielKeep, The Little Book of Rust Macros: Counting

4. One may try to avoid `syn` and use a combination of macro by example and procedural macros, but this is left as an exercise for the reader.

5. Sadly the usage of a macro in expression position, like `let res = combinations!(&data, 3)`, is feature gated behind `#![feature(proc_macro_hygiene)]` and limited to the nightly edition, so we use another approach.