Zelenya
Types and Compilers are our friends

The (natural) language we use is influenced by how we feel and what we think, but it also works the other way: just by the way we approach things, we can make our experience better or worse.

So, when we say "the compiler is in the way" or "I have to fight the compiler", we are setting ourselves up for failure. And I don't mean some gigantic failures ‒ I mean small, shitty, everyday frustrations. There is no need to live like this.

And it's normal to be frustrated ‒ I'm not trying to put rose-tinted glasses on anyone, just trying to share a different perspective on some things.

Good and Bad

Disclaimer: I don't want to go too deep into details or muddy the definitions. Let's establish an abstraction level: we'll focus on moving more work out of runtime to build/compilation time.

We don't have to search too long to get to the most common arguments on this topic:

"Good"

  • catching bugs
  • enforcing correctness
  • improving performance (optimizations)

"Bad"

  • introducing latency before changes go live (slow)
  • introducing cryptic error messages
  • introducing unnecessary work (bad for prototyping)

It's somewhat common to treat those as trade-offs. Some people say that those "bads"/"frustrations" are there because the compilers are doing solid work for us. Others say it's not worth it. And we keep going in circles for years.

There is some truth to that. However, I don't think of those as "frustrations" nor as balanced trade-offs. And here's why.

Bad? Introducing latency

Things are relative. Slowness is relative. When we want to run a program and check our changes, there is a difference between waiting for javascript, purescript, and scala to run. Okay.

However, when we account for the time it takes bugs (or programming errors) to surface, we see a more noticeable difference.

For example, imagine we added a new status to a reporting job: "dismissed", ran the program (or even deployed it), opened the dashboard, updated the state, checked that it's there, and were done. Pretty sure this could be accomplished with javascript xTimes faster than with scala. Ok, but. If this status is used somewhere else, we didn't notice, javascript didn't care, and as a result, our users faced the bug. Not ok.

We traded off waiting seconds for the compiler to verify that all enum usages are exhaustive for a runtime error hours later. Was it worth it? Still valid to ask and question that.

And yes, compilation times can be really slow, but there is often room for improvement. And I don't mean just waiting for the compiler team to do the improvements ‒ I mean rolling up our sleeves. I know it's nice to have things out of the box, but if you want faster builds, you might need to do something yourself: reorganize modules for incremental-friendly compilation, figure out the optimal build configurations, tweak the CI caches, and so on.
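For example, in a Rust project, Cargo build profiles are one such knob (a sketch for illustration; the right numbers depend on the project):

```toml
# Cargo.toml: trade optimization for faster dev feedback
[profile.dev]
opt-level = 0   # don't optimize our own code in dev builds
debug = 1       # limited debug info builds faster than full

# ...but optimize heavy dependencies once; the results are cached
[profile.dev.package."*"]
opt-level = 3
```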

Bad? Introducing error messages

Similar situation with error messages.

There is elm with friendly compilation error messages, haskell with infamous ones, and then there are null pointer exceptions at runtime. I don't know about you, but I'd take an unfamiliar haskell compilation error message over a vague null pointer runtime exception any time.

And let's face it: If I see a compilation error, it means I made an error. It's not the compiler's fault (at least most of the time). And it's okay to make mistakes during development, it's a great time to make mistakes – we can learn from those! Look at this beautiful error: E0507. We can read it like a book.
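As an illustration (a minimal example constructed for this point, not from the original post), here is code that would trigger E0507, along with the small change that satisfies the compiler:

```rust
struct Job {
    name: String,
}

fn take_name(job: &Job) -> String {
    // Returning `job.name` directly would be:
    // error[E0507]: cannot move out of `job.name`
    //               which is behind a shared reference
    job.name.clone() // clone instead of moving out of the borrow
}

fn main() {
    let job = Job { name: String::from("report") };
    println!("{}", take_name(&job));
}
```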

Not all errors are the same. Not sure we can learn much from the compiler telling us that we forgot a semicolon. If you are so smart, Java, why don't you put a semicolon there yourself?!

And of course, individual error messages can and should be improved.

Good. Catching bugs

Remember how we talked about adding a new status to a reporting job?

enum Status {
    InProgress,
    Done,
    Failed,
    Dismissed,
}

If we miss a case in a match, the compiler complains:

match Status::InProgress {
  Status::Done => println!("Done"),
  Status::InProgress => println!("Not done"),
}

// error[E0004]: non-exhaustive patterns: `Status::Failed` and `Status::Dismissed` not covered
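For completeness, a version that satisfies the checker might look like this (wrapped in a helper function for illustration):

```rust
enum Status {
    InProgress,
    Done,
    Failed,
    Dismissed,
}

// Every variant is acknowledged, so adding a fifth status later
// will produce a compile error here instead of a silent gap.
fn describe(status: &Status) -> &'static str {
    match status {
        Status::Done => "Done",
        Status::InProgress => "Not done",
        Status::Failed | Status::Dismissed => "Not running",
    }
}

fn main() {
    println!("{}", describe(&Status::InProgress));
}
```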

Those are incredibly useful when we need to express constraints on the values. But we need support from a compiler (to properly create, pass, and consume those).

Also, I'm not saying one compiler is good and another is bad; I want to paint a picture around expectations and intentions. For instance, the go compiler has a weak offering for statuses like this. We can fake enums / sum types, but it's still quite error-prone; for example, nobody warns us if a case in a switch is missing.

type Status int

const (
    InProgress Status = iota
    Done
    Failed
    Dismissed
)

s := InProgress

switch s {
case Done:
    fmt.Println("Done")
case InProgress:
    fmt.Println("Not done")
}

// No complaints

Let's look at another example.

Rust has a single-threaded-only primitive Rc (in other words, it's not thread-safe). If we try to use it concurrently, the compiler won't let us:

let value = std::rc::Rc::new(42);

// ERROR: `Rc<i32>` cannot be sent between threads safely
let handle = std::thread::spawn(move || {
    println!("value = {value}");
});

The compiler refuses to let the value cross a thread boundary before the program ever runs (through the Send trait). Nothing is stopping us from making those sorts of mistakes in java. For instance, we can forget to use volatile on a variable (and the changes won't be noticed by other threads), or we can use a non-atomic operation on concurrent collections – we always need to be aware and use the right combination of keywords and primitives.

import java.util.concurrent.ConcurrentHashMap;

public class Main {
    public static void main(String[] args) throws InterruptedException {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("count", 0);

        Thread t1 = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                map.put("count", map.get("count") + 1); // not atomic!
            }
        });

        Thread t2 = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                map.put("count", map.get("count") + 1); // not atomic!
            }
        });

        t1.start();
        t2.start();
        t1.join();
        t2.join();

        System.out.println("Expected: 2000, Got: " + map.get("count"));
        // The result is going to be different on each run, not 2000
    }
}

Note that you can use a thread-safe Arc there:

let value = std::sync::Arc::new(42);

let handle = std::thread::spawn(move || {
    println!("value = {value}");
});

handle.join().unwrap();
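For contrast, here's a sketch of the same counter in Rust: to share mutable state across threads at all, we're pushed toward an explicitly atomic type, so the non-atomic bug above can't be written by accident.

```rust
use std::sync::atomic::{AtomicI32, Ordering};
use std::sync::Arc;
use std::thread;

fn run_counter() -> i32 {
    let count = Arc::new(AtomicI32::new(0));

    let handles: Vec<_> = (0..2)
        .map(|_| {
            let count = Arc::clone(&count);
            thread::spawn(move || {
                for _ in 0..1000 {
                    // fetch_add is atomic; a plain `i32` or `Cell<i32>`
                    // wouldn't be allowed to cross the thread boundary
                    count.fetch_add(1, Ordering::SeqCst);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    count.load(Ordering::SeqCst)
}

fn main() {
    println!("Expected: 2000, Got: {}", run_counter()); // always 2000
}
```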

One can argue for a skill issue or whatever, but we won't go there.

The rust compiler and type system are quite expressive, allowing us to be expressive in turn. We can eliminate classes of bugs, as well as classes of thoughts. We don't have to think about using the wrong collection because it's not even possible. With the freed-up thinking capacity, we can prioritize mastering things that actually matter – in writing or review, we can focus on the core intent of the code.

And it's not about thinking less; sometimes we have to think more. In general, it's about thinking about different things (and at different times).


I am a big fan of the "parse don't validate" approach. For example, using an explicit non-empty list instead of normal arrays or lists guarded by comments, runtime errors, or whatever. A non-empty collection isn't always better or even a possible alternative to a collection. So, we have to think and decide upfront if enforcing a non-empty requirement is actually the right thing to do (it depends on what and for whom we are optimizing). But if we do decide, nobody can mess it up.

-- Please pass a non-empty array, please!
foo :: Array Int -> _ Unit
foo items = case head items of
  Nothing -> throwException (error "Unexpected empty array")
  Just x  -> ...

bar :: NonEmptyArray Int -> _ Unit
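The same idea can be sketched in Rust (`NonEmptyVec` is a hypothetical type made up for this example; crates for this exist, but the point is the shape):

```rust
// A vector that is non-empty by construction.
struct NonEmptyVec<T> {
    head: T,
    tail: Vec<T>,
}

impl<T> NonEmptyVec<T> {
    // The only constructor requires at least one element,
    // so the empty case is unrepresentable downstream.
    fn new(head: T, tail: Vec<T>) -> Self {
        NonEmptyVec { head, tail }
    }

    fn first(&self) -> &T {
        &self.head // no Option, no panic: emptiness is impossible
    }

    fn len(&self) -> usize {
        1 + self.tail.len() // always at least 1
    }
}

fn main() {
    let items = NonEmptyVec::new(1, vec![2, 3]);
    println!("first = {}, len = {}", items.first(), items.len());
}
```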

Now, imagine:

  • We have a giant list (or text) that we need to carry around and render in multiple places in our system.
  • In some places, we want to show it raw, in others – sorted.
  • Showing the wrong thing in the wrong place is bad, and sorting is expensive.

We can express this with types.

sort :: RawText -> SortedText

foo :: RawText -> ActionA

bar :: SortedText -> ActionB 

baz :: RawText -> SortedText -> ActionC

This way, we can never mess it up (never mix things up, never forget to sort, and never sort twice). In languages like haskell and rust, those safety nets are zero-cost abstractions. Compilers take care of those – they don't appear at runtime.
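Those signatures can be sketched in Rust with newtype wrappers (the names mirror the ones above; the rendering function is made up for illustration):

```rust
// The wrapper types cost nothing at runtime but let the
// compiler track whether the text has been sorted.
struct RawText(Vec<String>);
struct SortedText(Vec<String>);

// Sorting consumes the RawText, so we can't sort twice
// or keep using the raw version by mistake.
fn sort(raw: RawText) -> SortedText {
    let mut lines = raw.0;
    lines.sort();
    SortedText(lines)
}

fn render_sorted(text: &SortedText) -> String {
    text.0.join(", ")
}

fn main() {
    let raw = RawText(vec!["b".into(), "a".into()]);
    let sorted = sort(raw);
    println!("{}", render_sorted(&sorted));
    // `render_sorted(&raw)` would be a type error (and `raw` is moved)
}
```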

Good. Improving performance

When it comes to optimizations, it's rarely as simple as Y times slower compilation = Z times faster runtime. Regardless, compilers are often aware of things that we're not thinking about or are too lazy to think about. So, why not let them assist us?

Take, for instance, the react compiler. It's probably not the first association that comes to mind when we talk about compilers, but it's a good illustration. This build-time tool automatically optimizes the application by handling memoization (effectively applying useCallback, useMemo, useRef, and other tweaks and optimizations in the right places), so we don't need to worry about them.

"...freeing you from this mental burden so you can focus on building features."

Once again, back to freeing our thinking. And, just to be clear, I'm not advocating for thoughtless development – we need to be realistic about managing our thinking resources. Of course, we lose some control. What if we add a minute to the build for a 100ms rendering improvement for the users? Is this a fair tradeoff?

Bad for prototyping?

Out of all of these, the meme about prototypes gets on my nerves the most.

First, consider the short-term vs. long-term effects of prototypes. Why does nobody talk about the long term? Why can't we admit that most prototypes stay in production?

  • Prototypes are products.
  • The impact of failure is high.

I know why: it requires introspection and admitting mistakes, and nobody wants to do that...

After seeing dozens or hundreds of prototypes, I'm pretty confident that languages with fearless refactoring shine here. Haskell, PureScript, etc. are really good when it comes to delivering with rapidly changing requirements while staying reasonably correct and maintainable.

Rust is not famous for its refactoring – the meme is that if you choose the wrong abstraction, you have to rewire everything. However, even then, I'd still take refactoring a project in rust over javascript 9 times out of 10.


Ok, let's take a step back and simplify to the short term. Let's imagine an extreme example: 1 out of 10 experiments actually stays. Or even, 1 out of 10 one-off scripts doesn't get thrown away the next day. In this case, ok, I agree, we don't need anything maintainable. The impact of failure is low.

However, it's still a myth that something like python is the best for this. It still 100% depends on the individual or team experience. I think throw-away scripts are the best proof. I've seen people do those in bash, python, scala, rust, typescript... I can keep going.

If a person is good at something, they will be good at it when they need something fast.

Sure, we can kickstart a blog on rails xTimes faster than in haskell (from scratch). But we are not making hello worlds. What is this metric? If I'm in a company that has 20 production projects in haskell, and 0 in ruby, I bet it would be much faster to start the 21st one in haskell.

Hold on, what about tests and linters?

The initial abstraction is a bit leaky. We included the "react compiler" build tool, but I want to push back on some other build tools. A bit. It's somewhat common to hear:

"We don't need proper type systems, we write tests and catch things with linters."

Both are the wrong tools for that job. Simple as that. Those also add latency, vague errors, and unnecessary work. And, on top of that, they are more error-prone, add flakiness, increase the size of the codebase, and are easy to neglect (or forget).

I use linters and tests a lot, and they are important. But as supplemental tools. It's like real supplements. Compare eating junk food along with taking a bunch of multivitamins vs. eating nutritious food and a couple of vitamins you personally need.


Top comments (1)

Kevin 心学

Cannot agree more.

I tend to think that untyped languages are becoming the standard for low-TTL code, and typed languages the standard for high-TTL code.

And even in the realm of low-TTL (e.g. data migration), a maturely typed language is capable of providing reliable and ready-to-use utilities.

Thanks for your article!