loading...
Cover image for Rust builder pattern with types

Rust builder pattern with types

mindflavor profile image Francesco Cogno ・7 min read

Intro

Rust has a very rich type system. It also has move semantics. Using these two features together you can build your APIs using the builder pattern. Let's see an actual example on why you would want to use the builder pattern and how to implement is (well, I'm showing you a way to implement it, not necessarily the best one).

Example

Suppose we have a function that takes a lot of parameters: some of them are mandatory, other are optional. For example:

fn cook_pasta(
    pasta_type: String,
    pasta_name: Option<String>,
    pasta_length: u64,
    altitude: u64,
    water_type: Option<String>,
) {
    // your code here
}

This method has two optional parameters, and three mandatory ones. Using it is ugly:

cook_pasta("Penne".to_owned(), None, 100, 300, Some("Salty".to_owned()));

The problem here is we have to explicitly tell the method to ignore an optional field (pasta_name) and also explicitly specify the other optional field (water_type). Being an Option forces us to wrap the optional field with Some(). Also positional fields are hard to interpret. For example we have two u64 fields: without looking at the method signature is really hard to tell what's what. Is 100 referring to the pasta_length or to the altitude?
This hampers readability as we have to jump back and forth in our code just to figure out what is happening.

With the builder pattern we want to achieve something like this:

cook_pasta()
    .with_pasta_type("Penne".to_owned())
    .with_pasta_length(100)
    .with_altitude(300)
    .with_water_type("Salty".to_owned())
    .execute();

This syntax is really better: we know with a glance what is pasta_length (100) and what is altitude (300). Also we do not need to wrap the optional fields in Some() nor we need to use None for optional ones we do not want to pass to the function.
Readability is way better at the expense of a extra execute() method call. But how to achieve it?

Builder struct

The trick here is to have a builder object that we continually move around, adding fields in the process. The builder object will own all the method fields. Something like this:

#[derive(Debug, Clone, Default)]
struct CookPastaBuilder {
    pub pasta_type: String,
    pub pasta_name: Option<String>,
    pub pasta_length: u64,
    pub altitude: u64,
    pub water_type: Option<String>,
}

Now out cook_pasta() function (we'll call it cook_pasta2() to differentiate if from the previous version) just creates a default instance of that structure.

fn cook_pasta2() -> CookPastaBuilder {
    CookPastaBuilder::default()
}

Our cook_pasta code will then be run using the CookPastaBuilder defined above:

impl CookPastaBuilder {
    fn execute(&self) {
        // your code here
    }
}

As it is now you can use it like this:

let mut cpb = cook_pasta2();
cpb.water_type = Some("Penne".to_owned());
cpb.pasta_length = 100;
cpb.altitude = 300;
cpb.water_type = Some("Salty".to_owned());

cpb.execute();

Not quite what we want but better than the original method. This solution has two problems: firstly you have no way to enforce the mandatory fields. If you forget to set a mandatory field you will notice it at runtime (bad, bad, bad!). Secondly, the syntax is cumbersome. We'll address the second issue first.

Moving around

Instead of giving access to inner fields we can expose them with moving methods. For example:

impl CookPastaBuilder {
    fn with_pasta_type(self, pasta_type: String) -> CookPastaBuilder {
        CookPastaBuilder {
            pasta_type,
            pasta_name: self.pasta_name,
            pasta_length: self.pasta_length,
            altitude: self.altitude,
            water_type: self.water_type,
        }
    }

Notice three things:

  1. We are consuming the self object. This means we can no longer call methods on it (self in the method definition).
  2. We construct a new CookPastaBuilder copying all the fields from the previous one except the field we want to assign (in the example pasta_type).
  3. We pass the newly generated CookPastaBuilder to the caller (it becomes the new owner). This way we can chain these calls as they were on the same object. In reality we are changing CookPastaBuilders at each call.

Also we can get rid of the Some() wraps putting it inside the setter function:

    fn with_water_type(self, water_type: String) -> CookPastaBuilder {
        CookPastaBuilder {
            pasta_type: self.pasta_type,
            pasta_name: self.pasta_name,
            pasta_length: self.pasta_length,
            altitude: self.altitude,
            water_type: Some(water_type),
        }

Now we can do this:

cook_pasta2()
        .with_pasta_type("Penne".to_owned())
        .with_pasta_length(100)
        .with_altitude(300)
        .with_water_type("Salty".to_owned())
        .execute();

Nice!
But we still have to address the mandatory field check. For example this compiles fine:

cook_pasta2().execute();

But it's not fine at all (for instance, cooking pasta without actually having the pasta in the first place will not go well).

Types all around

Suppose now we have two builders instead of one. CookPastaBuilderWithoutPastaType and CookPastaBuilderWithPastaType (yes, I know, they suck but please bear with me).

If we define the execute method on the WithPasta variant only we can make sure no one will be able to call it on the WithoutPasta variant. At compile time. Which is good.

So our logical flow will be:

  1. Call of cook_pasta2() will generate a CookPastaBuilderWithoutPastaType.
  2. Calling with_pasta_type(..) will consume the CookPastaBuilderWithoutPastaType and return a CookPastaBuilderWithPastaType.
  3. Now calling execute() will work because CookPastaBuilderWitPastaType implements the method.

If we were to call execute() without calling with_pasta_type(..) first we would get a compiler error.

But, hey, we have just handled one mandatory field. We have three of them! Sure you can come up with something like CookPastaBuilderWithoutPastaTypeWithoutAltitudeWithoutPastaLength and all the permutations (in this case 12) like CookPastaBuilderWithPastaTypeWithoutAltitudeWithPastaLength and CookPastaBuilderWithoutPastaTypeWithoutAltitudeWithPastaLength... but there is a better way of doing so in less keystrokes.

Generics?

We can use generics to achieve a better compromise. We have three mandatory fields:

  1. pasta_type
  2. pasta_length
  3. latitude

We can simulate the presence or absence of a value with a boolean type. Something like:

#[derive(Debug, Default)]
pub struct Yes;
#[derive(Debug, Default)]
pub struct No;

Our new class becomes:

struct CookPastaBuilder<PASTA_TYPE_SET, PASTA_LENGTH_SET, ALTITUDE_SET>

These generic types can either be No or Yes indicating the fact the specific field has been assigned. For example setting the pasta_type field:

    fn with_pasta_type(
        self,
        pasta_type: String,
    ) -> CookPastaBuilder<Yes, PASTA_LENGTH_SET, ALTITUDE_SET> {
        CookPastaBuilder {
            pasta_type,
            pasta_name: self.pasta_name,
            pasta_length: self.pasta_length,
            altitude: self.altitude,
            water_type: self.water_type,
        }
    }

This function will return a new CookPastaBuilder instance with the PASTA_NAME_SET set on Yes so we know the field has been set. Notice that we are returning PASTA_LENGHT_SET and ALTITUDE_SET in this case because we do not want to change the underlying type for those fields (they can either be set or not and we won't change that here).

Doing so for every mandatory field means we end up with a type like this:

CookPastaBuilder<Yes, Yes, Yes>

This gives us the compile-time guarantee that our caller has set all the mandatory fields. All we need to do is to constrain the execution of the execute() function to this specific type:

impl CookPastaBuilder<Yes, Yes, Yes> {
    fn execute(&self) {
        // your code here
    }
}

Result

From the client perspective our API is still the same, this invocation works:

cook_pasta2()
    .with_pasta_type("Penne".to_owned())
    .with_pasta_length(100)
    .with_altitude(300)
    .with_water_type("Salty".to_owned())
    .execute();

But this does not because we forgot to set the mandatory altitude parameter:

cook_pasta2()
    .with_pasta_type("Penne".to_owned())
    .with_pasta_length(100)
    .with_water_type("Salty".to_owned())
    .execute();

The error will be like this one:

The no method named execute found for type CookPastaBuilder<Yes, Yes, No> in the current scope is not very helpful but at least you catch the error before runtime. At least the generic type names will help you determine what is missing:

That is if you add the necessary WHEREs to the struct declaration:

#[derive(Debug, Clone, Default)]
pub struct CookPastaBuilder<PASTA_TYPE_SET, PASTA_LENGTH_SET, ALTITUDE_SET>
where
    PASTA_TYPE_SET: ToAssign,
    PASTA_LENGTH_SET: ToAssign,
    ALTITUDE_SET: ToAssign,
{
    pasta_type_set: PhantomData<PASTA_TYPE_SET>,
    pasta_length_set: PhantomData<PASTA_LENGTH_SET>,
    altitude_set: PhantomData<ALTITUDE_SET>,

    pasta_type: String,
    pasta_name: Option<String>,
    pasta_length: u64,
    altitude: u64,
    water_type: Option<String>,
}

Last bit: the Phantoms

This is what we wanted: a builder pattern with elegant syntax, compile time checks. But we are using a fake generic, that is a generic not actively used in our code. Rust complains about this so in order to make him happy we add three new fields, one for each generic, of the type PhantomData. Do not worry, it won't appear at runtime so there is no cost in adding it (besides the nuisance of having them at all).

The final code if this one. Let me know what you think about it!

use std::fmt::Debug;
use std::marker::PhantomData;

#[derive(Debug, Default)]
pub struct Yes;
#[derive(Debug, Default)]
pub struct No;

pub trait ToAssign: Debug {}
pub trait Assigned: ToAssign {}
pub trait NotAssigned: ToAssign {}

impl ToAssign for Yes {}
impl ToAssign for No {}

impl Assigned for Yes {}
impl NotAssigned for No {}

pub fn cook_pasta(
    pasta_type: String,
    pasta_name: Option<String>,
    pasta_length: u64,
    altitude: u64,
    water_type: Option<String>,
) {
    // your code here
    println!(
        "cooking pasta! -> {:?}, {:?}, {:?}, {:?}, {:?}",
        pasta_type, pasta_name, pasta_length, altitude, water_type
    );
}

#[derive(Debug, Clone, Default)]
pub struct CookPastaBuilder<PASTA_TYPE_SET, PASTA_LENGTH_SET, ALTITUDE_SET>
where
    PASTA_TYPE_SET: ToAssign,
    PASTA_LENGTH_SET: ToAssign,
    ALTITUDE_SET: ToAssign,
{
    pasta_type_set: PhantomData<PASTA_TYPE_SET>,
    pasta_length_set: PhantomData<PASTA_LENGTH_SET>,
    altitude_set: PhantomData<ALTITUDE_SET>,

    pasta_type: String,
    pasta_name: Option<String>,
    pasta_length: u64,
    altitude: u64,
    water_type: Option<String>,
}

impl<PASTA_TYPE_SET, PASTA_LENGTH_SET, ALTITUDE_SET>
    CookPastaBuilder<PASTA_TYPE_SET, PASTA_LENGTH_SET, ALTITUDE_SET>
where
    PASTA_TYPE_SET: ToAssign,
    PASTA_LENGTH_SET: ToAssign,
    ALTITUDE_SET: ToAssign,
{
    pub fn with_pasta_type(
        self,
        pasta_type: String,
    ) -> CookPastaBuilder<Yes, PASTA_LENGTH_SET, ALTITUDE_SET> {
        CookPastaBuilder {
            pasta_type_set: PhantomData {},
            pasta_length_set: PhantomData {},
            altitude_set: PhantomData {},
            pasta_type,
            pasta_name: self.pasta_name,
            pasta_length: self.pasta_length,
            altitude: self.altitude,
            water_type: self.water_type,
        }
    }

    pub fn with_pasta_name(
        self,
        pasta_name: String,
    ) -> CookPastaBuilder<PASTA_TYPE_SET, PASTA_LENGTH_SET, ALTITUDE_SET> {
        CookPastaBuilder {
            pasta_type_set: PhantomData {},
            pasta_length_set: PhantomData {},
            altitude_set: PhantomData {},
            pasta_type: self.pasta_type,
            pasta_name: Some(pasta_name),
            pasta_length: self.pasta_length,
            altitude: self.altitude,
            water_type: self.water_type,
        }
    }

    pub fn with_pasta_length(
        self,
        pasta_length: u64,
    ) -> CookPastaBuilder<PASTA_TYPE_SET, Yes, ALTITUDE_SET> {
        CookPastaBuilder {
            pasta_type_set: PhantomData {},
            pasta_length_set: PhantomData {},
            altitude_set: PhantomData {},
            pasta_type: self.pasta_type,
            pasta_name: self.pasta_name,
            pasta_length,
            altitude: self.altitude,
            water_type: self.water_type,
        }
    }

    pub fn with_altitude(
        self,
        altitude: u64,
    ) -> CookPastaBuilder<PASTA_TYPE_SET, PASTA_LENGTH_SET, Yes> {
        CookPastaBuilder {
            pasta_type_set: PhantomData {},
            pasta_length_set: PhantomData {},
            altitude_set: PhantomData {},
            pasta_type: self.pasta_type,
            pasta_name: self.pasta_name,
            pasta_length: self.pasta_length,
            altitude,
            water_type: self.water_type,
        }
    }

    pub fn with_water_type(
        self,
        water_type: String,
    ) -> CookPastaBuilder<PASTA_TYPE_SET, PASTA_LENGTH_SET, ALTITUDE_SET> {
        CookPastaBuilder {
            pasta_type_set: PhantomData {},
            pasta_length_set: PhantomData {},
            altitude_set: PhantomData {},
            pasta_type: self.pasta_type,
            pasta_name: self.pasta_name,
            pasta_length: self.pasta_length,
            altitude: self.altitude,
            water_type: Some(water_type),
        }
    }
}

impl CookPastaBuilder<Yes, Yes, Yes> {
    pub fn execute(&self) {
        // your code here
        println!("cooking pasta! -> {:?}", self);
    }
}

pub fn cook_pasta2() -> CookPastaBuilder<No, No, No> {
    CookPastaBuilder::default()
}

fn main() {
    cook_pasta("Penne".to_owned(), None, 100, 300, Some("Salty".to_owned()));

    cook_pasta2()
        .with_pasta_type("Penne".to_owned())
        .with_pasta_length(100)
        .with_water_type("Salty".to_owned())
        .with_altitude(300)
        .execute();
}

Happy Coding

Francesco Cogno

Discussion

pic
Editor guide
Collapse
seanconnell profile image
Sean Connell

I enjoyed this a lot, here's a java adaptation relying on downcasts and type erasure to accomplish mostly (slightly less ergo) the same effect.

import java.util.Objects;
import java.util.Optional;

/**
 * Demonstrates compile time failure for unsafe construction of objects via builders allowing both required and optional parameters.
 *
 * Shamelessly stolen from https://dev.to/mindflavor/rust-builder-pattern-with-types-3chf and adapted to java.
 */
class Scratch {

    /**
     * A demonstration of all our efforts in a nicely usable compile time failure for the api if required args are omitted.
     * Try adding and removing them, and notice how the optional args don't effect the overall compilation but do effect
     * the runtime print messages.
     */
    public static void main(String[] args) {
        final ReadyToLaunchRocket rocket = SafeBuilderProvider.build(
                SafeBuilderProvider.builder()
                // these three (fuel, ignition, target) are required. If you omit any of these lines, you'll get a compiler error about it. It looks like:

                // Error:(153, 31) incompatible types: Scratch.SafeBuilder<Scratch.RequiredArgProvided,Scratch.RequiredArgOmitted,Scratch.RequiredArgProvided>
                // cannot be converted to Scratch.SafeBuilder<Scratch.RequiredArgProvided,Scratch.RequiredArgProvided,Scratch.RequiredArgProvided>

                // which is relatively readable, I think. (Obviously pardon the Scratch.prefix here, that would just be the classname normally)
                .loadFuel(new Fuel())
                .primeIgnition(new Ignition())
                .programTarget(new Target())
                // these two are optional. They can be before/during/after the required args, it doesn't matter
                .breakChampagneBottle(new ChampagneBottle())
                .addSweetDecal(new SweetDecal())
        );

        // use those safely constructed values!
        rocket.fire();
    }

    //The core type check which is more or less a phantom type boolean for "value is set on builder"
    interface ValueIsSet {}
    interface RequiredArgProvided extends ValueIsSet {}
    interface RequiredArgOmitted extends ValueIsSet {}

    //various classes required to construct a rocket
    static class Ignition {}
    static class Fuel {}
    static class Target {}
    //and two optional ones, to demonstrate how to add optional args that don't constrain the overall result
    static class SweetDecal {}
    static class ChampagneBottle {}

    /**
     * Our sample class that requires non-null values from the builder, always
     */
    static class ReadyToLaunchRocket {
        //pretend there's some use of these values. It doesn't matter for the demonstration which is all compiler tricks.
        private final Ignition i;
        private final Fuel f;
        private final Target t;

        public ReadyToLaunchRocket(Ignition i, Fuel f, Target t) {
            Objects.requireNonNull(i);
            Objects.requireNonNull(f);
            Objects.requireNonNull(t);
            this.i = i;
            this.f = f;
            this.t = t;
        }

        public void fire() {
            System.out.println("Rocket firing with ignition " + i + " using fuel " + f + " and heading towards " + t);
        }
    }

    /**
     * Our type safe builder that disallows expressing unsafe constructions by tracking required parameters as phantom types
     */
    interface SafeBuilder<IGNITION_PRIMED extends ValueIsSet, FUEL_LOADED extends ValueIsSet, TARGET_PROGRAMMED extends ValueIsSet> {
        SafeBuilder<RequiredArgProvided, FUEL_LOADED, TARGET_PROGRAMMED> primeIgnition(Ignition i);
        SafeBuilder<IGNITION_PRIMED, RequiredArgProvided, TARGET_PROGRAMMED> loadFuel(Fuel f);
        SafeBuilder<IGNITION_PRIMED, FUEL_LOADED, RequiredArgProvided> programTarget(Target t);
        SafeBuilder<IGNITION_PRIMED, FUEL_LOADED, TARGET_PROGRAMMED> addSweetDecal(SweetDecal d);
        SafeBuilder<IGNITION_PRIMED, FUEL_LOADED, TARGET_PROGRAMMED> breakChampagneBottle(ChampagneBottle b);
    }

    /**
     * This can be considered the "unsafe" layer. The only trick we're using here
     * downcasting to allow us to re-use this object for the next step in the validation chain.
     *
     * Because all of the phantom types/generics are erased at runtime, this will never throw.
     *
     * The only thing the author of classes like this needs to be sure of is that they set the
     * correct "RequiredArgProvided" and "No" values on the return, or you'll be unsafe. Since this is
     * relatively easy to get right/see in CR this is a reasonble solution.
     *
     * This still can't save us from the evil of nullable types, so if somebody tries to
     * defeat us with .primeIgnition(null) we'll only catch it at runtime the usual way.
     * Hopefully the compiler error will help people not do that.
     */
    static class SafeBuilderImpl implements SafeBuilder<ValueIsSet, ValueIsSet, ValueIsSet> {
        Ignition i;
        Fuel f;
        Target t;
        Optional<SweetDecal> d = Optional.empty();
        Optional<ChampagneBottle> b = Optional.empty();

        @Override
        public SafeBuilder<RequiredArgProvided, ValueIsSet, ValueIsSet> primeIgnition(Ignition i) {
            Objects.requireNonNull(i);
            this.i = i;
            return (SafeBuilder<RequiredArgProvided, ValueIsSet, ValueIsSet>) (SafeBuilder) this;
        }

        @Override
        public SafeBuilder<ValueIsSet, RequiredArgProvided, ValueIsSet> loadFuel(Fuel f) {
            Objects.requireNonNull(f);
            this.f = f;
            return (SafeBuilder<ValueIsSet, RequiredArgProvided, ValueIsSet>) (SafeBuilder) this;
        }

        @Override
        public SafeBuilder<ValueIsSet, ValueIsSet, RequiredArgProvided> programTarget(Target t) {
            Objects.requireNonNull(t);
            this.t = t;
            return (SafeBuilder<ValueIsSet, ValueIsSet, RequiredArgProvided>) (SafeBuilder) this;
        }

        @Override
        public SafeBuilder<ValueIsSet, ValueIsSet, ValueIsSet> addSweetDecal(SweetDecal d) {
            this.d = Optional.of(d);
            return this;
        }

        @Override
        public SafeBuilder<ValueIsSet, ValueIsSet, ValueIsSet> breakChampagneBottle(ChampagneBottle b) {
            this.b = Optional.of(b);
            return this;
        }
    }

    /**
     * This is the beginning and the end of the type flow. It starts us out with a builder
     * that has all "No" phantom types and receives the fully constructed "RequiredArgProvided" types.
     */
    static abstract class SafeBuilderProvider {
        static SafeBuilder<RequiredArgOmitted, RequiredArgOmitted, RequiredArgOmitted> builder() {
            return (SafeBuilder<RequiredArgOmitted, RequiredArgOmitted, RequiredArgOmitted>) (SafeBuilder) new SafeBuilderImpl();
        }

        //Unlike rust, which has type specific static dispatch, we can't make the full builder
        //experience work. We need to pass back to another reciever method that can constrain the types
        //of the builder to "fully constructed"
        static ReadyToLaunchRocket build(SafeBuilder<RequiredArgProvided, RequiredArgProvided, RequiredArgProvided> builder) {
            SafeBuilderImpl privateBuilder = (SafeBuilderImpl)(SafeBuilder)builder;
            if (privateBuilder.b.isPresent()) {
                System.out.println("Let's celebrate this occasion by christening the new ship with some champagne! " + privateBuilder.b.get());
            }
            if (privateBuilder.d.isPresent()) {
                System.out.println("Cubert says that sweet decals make this ship go faster, let's put this one on! " + privateBuilder.d.get());
            }
            return new ReadyToLaunchRocket(privateBuilder.i, privateBuilder.f, privateBuilder.t);
        }
    }
}
Collapse
mindflavor profile image
Francesco Cogno Author

Really nice! Thank you! The static ReadyToLaunchRocket build trick is really clever!

Collapse
tzachshabtay profile image
tzachshabtay

I'm not convinced this is a better way. Like you said, the compilation error is an issue. Also, while the call itself might be more readable, if you turn to read the trait to understand what it does, you're in for a world of pain- so much noise. And third, it won't scale: if I want to cook pasta a million times, there's a lot of unneeded copies and allocations. I think the real solution should be in the language itself, with named and default arguments: internals.rust-lang.org/t/pre-rfc-...

Collapse
mindflavor profile image
Francesco Cogno Author

I agree, the copies are ugly. I expect them to be optimized away by the compiler tough (I haven't checked, it should be an interesting thing to do).

Named and default arguments just postpone the problem. If you have many optional fields your code will likely include a lot of if Some(val)... to handle them.
These ifs will be a runtime penalty - unless the compiler can again optimize away. Using types you can achieve static dispatch and incur in no runtime penalty (at expense at slower compilation and bigger code size).