When we think of esoteric programming languages, most of us quickly gravitate towards one in particular: the infamous Brainf*ck - a super minimal programming language that achieves Turing Completeness in just six simple instructions (not including the two for I/O).
The incessantly lovely Brainf*ck was created in 1993 by Urban Müller in an attempt to construct a usable - albeit barely - programming language with a compiler under 1024 bytes. Why 1024 bytes? Well, he had been inspired by Wouter van Oortmerssen's FALSE, a stack-based language with a compiler of exactly 1024 bytes. So I guess you could say it was a competition of sorts.
In this article, purely for the joy of it, we will see that Brainf*ck is actually an informal dialect of a programming language invented in Italy more than 50 years ago: Corrado Böhm's P″ (pronounced "P Prime Prime" or "P Double-Prime).
I do expect that the reader has basic familiarity with Brainf*ck - although, if you've never heard of it, I definitely recommend you take a look. Oh, and don't worry; despite the scary name, it's really very easy to learn (note, I didn't say "easy to use"!).
Let's talk a bit about Corrado Böhm's place in the history of Computer Science before we get into the nuts and bolts of his decidedly strange creation.
Corrado Böhm and the Structured Program Theorem
You can skip this section if you're only interested in the details and not the motivation.
When we talk about "programming" today, we are almost universally referring to what is known as "structured programming". That is, writing programs that employ the use of block structures, functions and loops. Pretty simple, huh? In fact, this all seems so obvious nowadays that we don't feel the need to qualify so specifically what we mean - so we use more general terms like "imperative" and "declarative" to describe our favorite paradigms of programming. The whole "structured" part is kind of a given.
But it wasn't always this way...
Many of the earliest programming languages, including heavy-hitters like BASIC, FORTRAN and COBOL1, were unstructured: Statements were sequentially ordered, generally one per-line. And those lines were given labels or numbered in such a way that a program could perform an unconditional jump from one part of the program to the other.
And when I say "jump", I really mean jump! Execution was not returned back to the calling context as is the case with a function call (unless, of course, it was physically asked to with another jump). One could, at least theoretically, jump straight into the middle of a
SUBROUTINE or an
IF2. Following the execution of a program meant tracing all of the jumps. Such a messy form of execution eventually gave rise to a phrase we still hear a lot of today (but never about our own code, of course): "Spaghetti code"3!
In case you haven't worked it out already; yes, I am talking about the notorious
The early '60s were dominated by the debate over structured vs. non-structured programming. This was further ignited when, in 1966, a landmark paper titled "Flow diagrams, Turing Machines and Languages with only Two Formation Rules" was published by Corrado Böhm and Giuseppe Jacopini. This paper4 provided the theoretical foundation for structured programming (and therefore the abolishment of
GO TO) by proving that one can compute any computable function with only three simple control structures:
- Execute one subprogram, and then another subprogram (sequence)
- Execute one of two subprograms according to the value of a boolean expression (selection/branching)
- Execute a subprogram until a boolean expression is true (iteration)
To demonstrate this in relation to Turing machines, they, of course, used a programming language: P″
P″, the mother of the Turing tarpits
P″ is a very simple. In fact, it's so simple that the whole thing can get a bit confusing. Kinda' like those movies that are so bad they loop back around to being good again. The language consists of only four characters:
) that act upon a Turing machine with infinite tape (main memory) and a finite alphabet (the set of things that can be written in a memory cell - such as the digits
Böhm defined the syntax rules5 as follows:
λ, R ∈ P″
q₁, q₂ ∈ P″implies
q₁q₂ ∈ P″
q ∈ P″implies
(q) ∈ P″
- Only the expressions that can be derived from rules 1, 2 and 3 belong to P″.
Right. Let's try that without the set notation:
- The characters
Rare syntactically valid programs in P″.
- Programs can be composed. So, if
q₂are programs in P″, then so is their composition:
- Any valid program can be iterated over. So, if
qis a program in P″, then so is
(q). This will make sense when we get to the semantics.
- That's it!
Böhm then goes on to explain the language semantics, which I've summarized and simplified here:
- The machine has a finite alphabet of length
N > 1.
0is considered a special, blank symbol. So each memory cell can contain any value from
N. Because we are dabbling in the theoretical world of Turing machines, we can say that the exact value of
Nis precisely what it needs to be for the computation at hand.
- Execution starts at the leftmost symbol and proceeds to the right until there is nothing left to compute.
Ris the operation of shifting the tapehead (a.k.a data pointer) forward to the right one cell (if there is one).
λis the operation of replacing the symbol,
c, at the tapehead with
(c+1) mod (N+1)and then shifting the tapehead to the left one cell. Note the modulus operation - so if
N = 5then trying to increment
5will result in
0(the blank symbol) because
6 mod 6 = 0.
qis any valid program, is interpreted as a while loop that iterates while the current cell does not contain the blank symbol (
0). Endless loops are possible, of course.
So, we can think of P″ as being a near-literal implementation of a Turing machine in a similar sense that we can see the original Lisp as being deeply related to Church' Lambda Calculus. The former is just far less useful to the practicing programmer.
In a paper published in 1964 for the International Computation Center in Rome6, Böhm proved that P″ was Turing-complete, which makes it the first structured programming language that did not contain a
GO TO instruction but instead relied upon iteration. Djikstra would go on to cite Böhm and Jacopini and in his now-famous paper, Go To Statement Considered Harmful, which would help to solidify their place in computer science folklore.
Here is a simple program to add two numbers together for a Turing machine where
N = 3 and main memory looks something like this
( λRλRλR # Decrement current cell (remember that N+1 wraps back around to 0, so decrementing is equivalent to N increments). R # Move tapehead right. λR # Increment current cell. λRλRλRλ # Move tapehead left by decrementing the current cell and then executing one final λ. ) # Iterate if the current cell is not 0. This means we iterate twice in this example.
At the end of execution main memory will look like this:
Ok, cool, I guess. Maybe you can see some minor similarities to Brainf*ck, but we are definitely missing some things. And even in esoteric programming language terms, it's all very clunky!
But, luckily for us, Böhm provided several small abstractions that allow us to write much simpler programs. And in doing so we will see that Brainf*ck and P″ are equivalent.
Let's take a look.
When two become one
There are a few instructions that Brainf*ck gives us that we don't get out-of-the-box with P″. Specifically, we still need to fill in the following gaps:
- Increment the current cell (without moving the tapehead!).
- Decrement the current cell.
- Move the tapehead leftward.
Let's define those as follows:
I = λR
- Increment the current cell and move left, then move right.
D = I^N
- This is defined as
I's in a row. So for a Turing machine where
N = 5it's literally
IIIII. The best way to think of
Dis as a macro that expands to
I's rather than a runtime construct like a function.
L = Dλ
- Decrement the current cell and then increment it (to restore the original value) and move left.
To recap, we have now now defined the following instructions:
Now it's starting to look familiar! And so, finally, let's write the previous program again with our new abstractions:
(D R I L)
And how does the same program look in Brainf*ck?
[- > + <]
Interesting! Although we've decided upon slightly different syntax, these two programs are now semantically identical.
I might just take a moment here to mention that there is something very fulfilling about defining a programming language in which
DRILL is a valid program (although, I must admit, Böhm used different names). I think I'll call my dialect "Braindriller".
Let's try another program7 that moves a value from cell
0 two places to the right (cell
R R ( D ) L L ( D R R I L L )
> > [ - ] < < [ - > > + < < ]
Well, I don't know about you, but I'm sold. I implore you to keep trying other Brainf*ck programs but, considering the two models of computation are now the same, it's my hypothesis that they will all be instruction-for-instruction identical.
One final note - you'll notice that we've skipped on the I/O instructions
. that exist in most dialects of Brainf*ck. This is because the theoretical nature of P″ means that practical things like input and output would serve no purpose - we are only interested in the effect of the instructions upon the machine state (memory cells). But if you were to build a P″ interpreter - and I encourage you to do so - then you could implement I/O exactly as Brainf*ck does.
If you've made it this far - well done! And I'm sure the mental energy you've just exerted will all pay off in the long run. Think about it: Next time you're talking to someone who brings up Brainf*ck, you can tell them all about the wonders of P″ and the genius of Corrado Böhm.
As always, corrections and feedback are always more than welcome. :)
- BASIC, FORTRAN and COBOL all added support for structured programming in later versions.
- I guess this would make coroutines somewhat simple to implement.
- It's not known exactly who coined this term, but according to Wikipedia, an early usage comes from the Guy L. Steele's 1977 paper: Macaroni is better than spaghetti
- This paper is known to be cited more than it's read. OK, I admit, I tried to get through it all. I really did. But before you judge me - give it a shot yourself.
- Had Backus-Naur Form been popular in '64, it would have looked something like this:
<program> ::= R|λ|<program><program>|(<program>).
- I have searched high-and-low for a copy of this paper. The ICC was decommissioned in the late-60s and most of their publications are very difficult to find now. I've tracked a physical copy down to the British Library, but am yet to get my hands on it.
- Courtesy of esolangs.
- Cover image, originally from corradobohm.it, shows the attendees of the first Lambda Calculus Conference (1979). Corrado has the white hair in the second row.