This post is part of a series called functional fundamentals. See the introduction and index; that's also where you can request specific topics.
Note that I am not an authoritative figure on these subjects. If you see something you think might be incorrect or incomplete, please point it out in a comment; let's prevent mistakes from spreading :-)
The Curry-Howard correspondence states that formulating a mathematical proof and writing a (typed, functional) program are basically the same thing. It isn't just an analogy: it's an isomorphism, meaning the two behave in exactly the same way.
Why does that matter? For programming, it means that if we can compile a program of type a, we have a proof that executing the program will result in a value of type a. There cannot be any errors or other results that aren't an a.
That is, provided you don't have any infinite recursion in your program, in which case the program will never terminate and you won't get your a.
Now, most of this we already got from typed lambda calculus, which is the foundation of functional programming. However, the Curry-Howard correspondence is a very useful frame of thought when dealing with slightly more advanced types.
To really understand its value, you will have to dive deeper into the subject. In this post specifically, I will explore the isomorphism between intuitionistic propositional logic and simple types.
A crash course in logic
Do not worry if you don't get all of this on the first go; it is a lot to process. University courses tend to spend one or two weeks on propositional logic. Take your time and read it over a few times. Replace symbols with their meaning if that helps you.
If you're familiar with classical logic: intuitionistic logic is mostly the same, but it does not assume ⊢ p ∨ ¬p and ¬¬p ⊢ p.
If you're not familiar with classical logic: don't worry, you were not supposed to understand any of that, the rest of this section is for you.
There are many 'types' of logic around. Each has its own set of rules. We will deal with intuitionistic logic, also called constructive logic. Like many other logic systems, it deals with truth; a proposition can be true or not true.
Essentially, it defines the basis for formulas about truth, much like addition, subtraction, division and multiplication define formulas about numbers.
Much like numbers in arithmetic, truth values in logic have operators. Each takes truth values and produces truth values. Let's take a look at these operators. I'm going to denote true as ⊤ and false as ⊥.
Conjunction
A conjunction, denoted ∧, is like the AND operator in programming; it becomes true when both operands are true.
E.g. p ∧ q has the following truth table:
p | q | p ∧ q |
---|---|---|
⊥ | ⊥ | ⊥ |
⊥ | ⊤ | ⊥ |
⊤ | ⊥ | ⊥ |
⊤ | ⊤ | ⊤ |
(As for how to interpret truth tables: the first line in this truth table means that when p is false (⊥) and q is false (⊥), p ∧ q is also false (⊥). We can substitute other formulas for p and q.)
Disjunction
A disjunction, denoted ∨, is like the OR operator in programming; it becomes true when either operand is true.
E.g. p ∨ q has the following truth table:
p | q | p ∨ q |
---|---|---|
⊥ | ⊥ | ⊥ |
⊥ | ⊤ | ⊤ |
⊤ | ⊥ | ⊤ |
⊤ | ⊤ | ⊤ |
Implication
An implication is denoted →. It means "if the first operand is true, then the second operand must also be true". Note that this "if" is NOT like conditionals in programming, which are more akin to "if and only if" and deal with control flow rather than truth. Implication, on the other hand, does not give a damn about what happens when the condition is false.
E.g. p → q can be read as "p implies q" and has the following truth table:
p | q | p → q |
---|---|---|
⊥ | ⊥ | ⊤ |
⊥ | ⊤ | ⊤ |
⊤ | ⊥ | ⊥ |
⊤ | ⊤ | ⊤ |
As you can see, "p implies q" is only false if p is true, but q is not.
An example: if I tell you "I will give you 10 bucks if you fetch me a beer", you have the right to call me a liar if you fetch me a beer but I do not give you 10 bucks. On the other hand, if you don't fetch me a beer but I still give you 10 bucks, you can't really say that I lied, because I did not specify what would happen if you didn't fetch me a beer. (Interesting fact: humans get a lot better at logic when you formulate it as a cheater-detection problem.)
Proofs
So far we've learned the basic components of formulas. However, unlike with numbers, there aren't many cases in which we need to compute the truth value of some formula. What's more interesting is formulating proofs.
We say A ⊢ B (read: A proves B) to mean that B can be proven from A (in whatever proof system we're using; intuitionistic logic in our case).
We can also say A ⊨ B (read: A entails B) to indicate that B must be true if A is true.
A proof system can be incorrect (a logician would say "not sound") or incomplete, so the two are different.
⊢ and ⊨ are operators in the meta-language, which is a fancy way of saying that they tell us something about formulas but are not part of them. Pretty much in the same way that a for loop cannot be part of an expression (5 + for (...) {...} / 7).
We separate multiple formulas with commas: A ⊢ B, C means we can prove both B and C from A, and A, B ⊢ C means we can prove C if we have both A and B.
Now, let's look at a few axioms (an axiom is something that is assumed to be correct):
- p ⊢ p : anything is a proof of itself. Well, duhhhh.
- p ∧ q ⊢ p : also no surprise; if a conjunction is true, each of its operands must also be true.
- p, p → q ⊢ q : again, pretty clear; if we know that p is true, and p implies q (p → q), q must also be true.
- p → r, q → r ⊢ (p ∨ q) → r : if we know that p implies r and q implies r, then as long as either p or q is true, r will be true as well.
We also have associativity and commutativity of conjunction and disjunction, but you're likely familiar with those concepts. There are also some more axioms which I've left out because they're less relevant for functional programming.
Of course, we can drop whatever propositions we have proven if we no longer care for them; they're still true, we just don't write them down.
We can now prove e.g. a → (b → c), a, b ⊢ c:

a → (b → c), a, b | |
---|---|
⊢ b → c | we prove b → c using p, p → q ⊢ q : substitute a for p and b → c for q |
⊢ c | now we have proven b → c, we can use p, p → q ⊢ q, substituting b for p and c for q (remember b is still true from before) |
Back to functional programming
From the last example, you may have started to get an inkling as to what on earth this has to do with functional programming. The operators you've seen for logic correspond to type operators in functional programming.
In Elm syntax (≅ is the operator for isomorphism):
A tuple or record is a conjunction
type alias Foo = (A, B)
≅ Foo = A ∧ B
An enum type is a disjunction
type Foo = Left A | Right B
≅ Foo = A ∨ B
A function is an implication
type alias Foo = A -> B
≅ Foo = A → B
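To make the correspondence concrete, here is a small Elm sketch (the type and function names are mine, purely for illustration) in which the axioms from earlier become ordinary programs; the fact that they compile is the proof:

```elm
-- p ∧ q ⊢ p : from a conjunction (a tuple) we can always extract an operand
projectLeft : ( p, q ) -> p
projectLeft ( x, _ ) =
    x


-- p, p → q ⊢ q : modus ponens is just function application
modusPonens : p -> (p -> q) -> q
modusPonens x f =
    f x


-- p → r, q → r ⊢ (p ∨ q) → r : case analysis on a disjunction (an enum type)
type Either p q
    = Left p
    | Right q


caseAnalysis : (p -> r) -> (q -> r) -> Either p q -> r
caseAnalysis onLeft onRight disjunction =
    case disjunction of
        Left x ->
            onLeft x

        Right y ->
            onRight y
```

Each type signature is a formula; each function body is a proof of it.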
So when you are writing a function with type String -> Int, you are writing a proof that you can get an Int (Int = ⊤) if you have a String (String = ⊤), or M ⊢ String → Int.
The premise (stuff before the ⊢) M consists of all the functions and variables that are native, imported or that you've written previously.
The compiler will check that your proof is correct (you have no compiler errors).
Take a moment to let that sink in.
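For instance, the proof of a → (b → c), a, b ⊢ c that we worked out earlier corresponds to an ordinary Elm function (just a sketch; the name proof is mine):

```elm
-- a → (b → c), a, b ⊢ c : take the three premises as arguments,
-- then apply modus ponens (function application) twice to obtain a c
proof : (a -> (b -> c)) -> a -> b -> c
proof f x y =
    f x y
```

The compiler accepting this definition is precisely the check that the proof is valid.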
Now, imagine that one of the formulas you used in the premise is incorrect: your proof becomes invalid! That's essentially the problem with side-effects. If we have some function p -> q that has a side-effect, i.e. it depends on some non-constant 'state' r that we don't know about, what it really corresponds to is a proof of p ∧ r → q (side note: r could be a property of p, but we haven't discussed that kind of logic yet, so don't overthink it). Or, if the function throws an error in certain cases, you could say the function proves p → (q ∨ Error), just not p → q. The point is: the type signature of an imperative function does not constitute a proof. It's primarily an efficiency feature, not a correctness one.
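As a rough sketch of what "honest" types for those two situations could look like in Elm (these aliases are my own, just for illustration):

```elm
-- p ∧ r → q : the hidden state r becomes an explicit operand of a conjunction
type alias NeedsState p r q =
    ( p, r ) -> q


-- p → (q ∨ error) : possible failure becomes part of a disjunction in the result
type alias MayFail error p q =
    p -> Result error q
```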
Other stuff
This isn't all there is to the Curry-Howard correspondence; there are still more advanced type systems, corresponding to more advanced or just different logics. It is a topic of active research today, e.g. in trying to prove that a certain parallel implementation is correct. One of Rust's main innovations, its ownership system, is based on affine types, a close relative of the linear types that come from linear logic.
I think, however, that this is more than enough information for a first post about the subject.
Further reading:
- implies vs entails vs provable
- I've briefly mentioned infinite recursion and how it can screw up your proof. This is unavoidable given the halting problem. One possible solution for this problem is to drop Turing completeness and resort to total functional programming.
Top comments (7)
Great article. I have a doubt. You mentioned that the type checker for imperative programs can only give a guess as to whether the function will indeed return the correct type, due to side effects in code. Is that also valid for languages with exceptions, like Java and C#? Java, for example, also has product types and can have sum types if you are willing to jump through some hoops (using inheritance). Java's exceptions make the effective return type into a sum type of (return value or thrown exception). Won't similar inference laws apply, except with the type changed to reflect the possibility of exceptions?
For example: f = p -> (q or e), where e is a thrown exception, simply becomes f = (p or e) -> (q or e), and another function g = (q or e) -> (r or e); thus the arguments for each function have a sum type of original arguments or error.
"Composing" (altered to check for error and short-circuit) these will become: g o f = (p or e) -> (q or e) -> (r or e)
In Java code we cannot ignore a checked exception; we have to handle it by catching it, or throw it and exit.
Is there something I am missing here?
I'd like to first clarify a few things for other readers; you seem to be quite a bit ahead of what I expected for my intended audience :-)
Sum types are essentially disjunctions / enum types. If you have two types A and B, which can hold respectively a and b possible values, their sum type (disjunction / enum type / union) C can hold a + b values, hence the name.
Similarly, product types (conjunctions / tuples) can hold a * b values, because any combination of an A and a B is valid.
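A quick illustration (my own example) in Elm, using Bool (2 values) and a three-valued type:

```elm
type Fruit
    = Apple
    | Banana
    | Cherry


-- product type: 2 * 3 = 6 possible values
type alias BoolAndFruit =
    ( Bool, Fruit )


-- sum type: 2 + 3 = 5 possible values
type BoolOrFruit
    = FromBool Bool
    | FromFruit Fruit
```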
I am not intimately familiar with Java and C#, so they might have a "declarative mode" I just don't know about, in which case please do correct me. From what I remember of Java, it was rather simple to cause an unhandled exception, e.g. ArrayOutOfBounds. But I will assume there is a 'safe mode' where you must handle all exceptions.
In this case, you are right in saying that exceptions are part of the type system and shouldn't be considered a side effect. If exceptions had been the only side effect, that would make the language declarative! (You can look at SAC for an imperative-looking but declarative language.)
However, we still have hidden state in the form of static variables and non-const, non-unique references. That's essentially the p ∧ r → q example.
Now you're thinking with Monads :-)
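(A sketch of that short-circuiting composition in Elm, using Result.andThen; the names are illustrative:)

```elm
-- compose two functions that may fail, short-circuiting on the first error
composeMayFail : (p -> Result e q) -> (q -> Result e r) -> (p -> Result e r)
composeMayFail f g =
    \x -> f x |> Result.andThen g
```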
In Java, there are checked exceptions and unchecked exceptions. Checked ones can be caught in the code; unchecked ones, like ArrayOutOfBounds, result in a runtime exception. Runtime exceptions do not have to be caught, i.e. code will compile fine if you ignore them. OTOH, checked exceptions have to be caught.
You are right in that I had not considered global variables. I was thinking that they can also be added to the type signature: f = ((p and s) or e) -> ((q and s) or e), where s is all state variables. But I can see that there is too much pollution occurring with these additions.
Regarding functional languages like Haskell, how do they handle fatal errors like the machine running out of memory or being sent a kill -9 signal? Is that also built into the type system?
Finally, is there any book you can recommend for getting better at functional programming? I tried Learn You a Haskell, but after 5 chapters I kind of never got back into it.
I'll need to learn FP to understand what that means :)
Sure! That's what you would do in declarative languages.
The functional part of a program is just declarations, so it doesn't really 'run'. Because of this it can't run out of memory or be interrupted either. That all happens in the runtime, IO Monad, or similar imperative structure, which can indeed crash in the ways you mentioned.
If the entire system were purely functional it'd be a different story, but I'm not aware of such systems existing in reality.
You seem to have the right mindset to apply FP in practice. Learning the formal definition of a Monad can wait a bit. I have a post planned for it in the future.
Unfortunately no, I personally don't learn well from books at all. Most of my uni books were left unopened.
Learning Elm can be a good starting point; see the guide and the beginners' Slack channel.
Haskell and Idris certainly have more features, but those very same features can get in the way of understanding the basics. Their type systems for instance use more than just propositional logic (another subject for the future).
I laughed hard!
Do you need a compiler to do this, or can it be done with a separate program where no compiler exists for the language?
There are proof checkers. I'm not familiar with them though. I believe Coq does something similar.
Could you do this for an interpreted language? Sure, but strongly typed, functionally pure languages are generally compiled. (if I'm not mistaken, you'd be losing out on a virtually free speedup by not compiling).
Agda and Coq are both languages that also forbid infinite loops and thus have type systems that allow you to prove logical statements (via the Curry-Howard correspondence) about your programs.
It is cool to know your program has certain properties for all inputs by proving it, but also very time consuming :)