Alex Esoposting

Posted on Jul 17, 2021

Brainfuck Assembly Language Tutorial

#esoteric #brainfuck #tutorial

So you have your personal Brainfuck Processing Unit (or at least its simulation) and you want to do something with it. BAL is based on brainfuck, a language famous for its mind breaking programming experiences, but if you learn to think in a certain way neither of them is that hard. Don't raise your expectations too high though, I'm not building a game or a web app in this tutorial.

Basics

Like brainfuck but better!

BAL uses eight one-character commands: +-><[],.. Each of them can take an argument represented by a number directly after the command. Just like in brainfuck the commands operate on a linear memory using a single pointer. Anything that's not a command or a number is a comment and is ignored by compilers.

+/-: add/subtract the argument from the selected cell.
>/<: move the pointer up or down the memory according to the argument.
[/]: jump forwards/backwards some number of commands if the selected cell is zero/nonzero.
,/.: input/output the selected cell. Argument use depends on the BPU build, I will ignore it completely here.

+-><[] commands are called "add commands" and their argument must be at least one. It's also assumed to be one if you don't specify it, but that's a bad practice and leads to less readable code. ,. commands are "I/O commands" ant their default is zero. There will also be a maximum argument depending on the BPU architecture. This tutorial assumes an 8-bit BPU, so max arguments are 32 for add commands and 31 for I/O commands.

Simple loops

The most basic thing in BAL

Having seen such a limited set of tools you might start to question whether it's possible to program in BAL at all! How can one write a complex program if every command only ever involves a single cell? The answer is loops. While it's true that any single command only uses one cell at most if you wrap them in a loop you can create commands that use multiple cells and involve conditions.

Set cell to zero

Loop base

Let's look at the simplest useful loop: setting a cell to zero. There is no command in BAL that would allow to set a cell to a constant value. Instead you need to use subtraction in a loop that repeats until the cell is zero. You should also only subtract one to make sure you don't jump over zero. Here's the code with comments:

In memory explanations cells are divided with pipes (|)
and the current cell is marked by *asterisks*.
Starting memory: *some number*
[3    Jump over the loop if the cell is already zero
  -1  Subtract one
]1    Jump back until the cell is zero
Final memory: *zero*

You might recognise this construction as an inverse case of a for loop. In another language this could be written as for (i = cell; i > 0; i--) pass;. It's very important to think about actions in BAL in terms of "repeat something N times" and most loops will end with -1 ]X.

Moving values between cells

Loop body

Again, there's no single command to move a value from cell to cell. Thinking with fors, what is the action to repeat N times to move N somewhere? It's not hard to realise it's just "add 1", which gives us the following code:

Starting memory: *N* | zero 
[6    Jump over the loop if the cell is already zero
  >1  Move over to the cell that the value is moved to
  +1  Increment it
  <1  Move back to the original cell
  -1  Decrement it
]4    Repeat until original cell is zero
Final memory: *zero* | N

This code actually doesn't move the value between cells, it increments zero by the value of a cell. If the target cell was something other than zero it would still end up being incremented by the same amount, giving us addition for free.

Another thing achievable for free with this code is multiplication by a constant. If instead of adding one N times you add a constant N times you get N multiplied by the constant.

Multiplication

Nested loops

Let's again start by thinking with fors: what's the action to repeat N times to get M*N? It's adding M, and we already know how to do that, so let's see the code.

Starting memory: *M* | N | target (zero)
[11
  >1  Move over to N
  Add N to the target
  [6 >1 +1 <1 -1 ]4
  <1  Get back to M
  -1 Close up the loop
]9
Final memory: *zero* | zero | N

This looked promising, but the N was moved from source to the target, not copied, so it wasn't there to be added on the next loops. Instead of just moving N to target you need to use another helper cell initialised to zero. Then you move N to target and helper, and after that move helper back to N which leaves it ready for the next loop.

Starting memory: *M* | N | target (zero) | helper (also zero)
[20
  >1  Move over to N
  Move N to the target and helper
  [8
    >1 +1  Increment target
    >1 +1  Increment helper
    <2 -1  Decrement N
  ]6
  >2 Move to helper
  Move helper back to N
  [6 <2 +1 >2 -1 ]4
  <3  Get back to M
  -1 Close up the loop
]18
Final memory: *zero* | N | N*M

Conditional jumps

AKA Ifs

In the previous section most of the work was done by the ] command which jumps back and facilitates looping: executing the same code multiple times. Now is the time to take a closer look at the [ command which jumps forward and facilitates ifs: skipping parts of the code based on a condition.

Skipping loops

Been there, done that

I've already shown examples of the most commonly used forward jumps in BAL: skipping over loops that shouldn't be entered. This is exactly how they are used in pure brainfuck as well. Usually they aren't necessary for the program to work correctly in BAL but they can make it several times faster, depending on the loop.

Let's consider the simplest loop: -1 ]1. It starts by subtracting one, and then repeats it until the cell is zero. The problem arises when the cell is zero before this loop starts. Since BAL is supposed to be executed on hardware solutions, subtracting one from zero will result in overflow and wrap back to the maximum value the cell can hold (255 for 8-byte solutions), and then repeat the loop the maximum possible number of times until it's back at zero. That's why it's usually worth it to spare this one additional memory cell for a [ command before most loops.

Standard ifs

Normal constructions that you know and love

The only data type brainfuck can handle is integers, and the only comparison operation available is "equals". That doesn't sound like much, but it's enough to make basic "if" blocks that branch depending on a number in a cell. In normal languages it might look like this:

if n == 0 then
  do something
else if n == 1 then
  do something else
else if n == 2 then
  do something even more different
else
  do the final thing
end

Execution would skip over sections until finding the one that does fulfill the condition, execute it and skip the rest to continue execution after the end of the if block. BAL only allows jumps when the equality condition is fulfilled, but not on "is not equal to" condition. This follows from the fact that [ offers jumping if the number is equal to zero. This means the code requires some reorganisation:

if n == 0 goto N0
if n == 1 goto N1
if n == 2 goto N2
  do the final thing
  goto END
N0:
  do something
  goto END
N1:
  do something else
  goto END
N2:
  do something even more different
END:

This still includes comparisons to numbers other than zero, but checking if N is equal to X can be converted to checking if N-X is equal to zero which can be easily done in BAL. Let's see what the code looks like using BAL:

[9     Jump if N equals zero
-1 [9  Jump if N minus one equals zero
-1 [9  Jump if N minus two equals zero
do the final thing
-1 ]1  Set N to zero for the jump
[6
do something
[4  N is already zero here so it can be used for jump
do something else
[2
do something even more different

For this example I considered every "do" a single command so jump arguments are quite small, but they will need to be adjusted to whatever happens in each branch and to the amount of branches. Jumping to the end can be done using a cell different than N, but it's important that every branch ends on the same cell unless you deliberately want the program to move.

Moving loops

Complicated stuff

In the first section I talked about BAL loops as for loops, but actually they are more like while loops. They can repeat something N times if you go back to the same cell and decrement it at the end of the loop, but but it's sometimes useful to change the cell in another way or even not come back to the same cell altogether.

Text reversing

Ultimate test of skill

The task is to read a string of bytes representing ASCII symbols terminated by a newline, and then return that string in reverse order.

To reverse a piece of text you need to take each letter as input, store them all and return in reverse order. Seems like a trivial task, but storing a variable number of letters and reading them means moving through a variable number of cells which requires moving loops.

Let's break down the task into smaller parts:

Setup
Reading loop
- Read a byte
- Check if it's newline
- If not then move to the next cell
- If yes then move to the writing loop
Writing loop
- Move to the next saved byte
- Check if it's the end
- If not write the byte
- If yes finish the program

Because newline (ASCII 10) terminates the string it will never appear in it and can be safely used as a bumper telling the writing loop to stop:

Starting memory: *zero* | anything
+10 >1  Set cell to the bumper value and move over
Intermediate memory: bumper | *anything*

Then the reading loop starts. It checks for the ASCII code for newline.

Intermediate memory: bumper | *anything* | more of anything
,       Read a byte
-10 [6  Don't enter the loop if newline
  +10   Restore original value
  >1 ,  Move over and read a new byte
-10 ]4  Stop looping if newline
Memory after one cycle: bumper | byte | *anything minus ten* | more of anything
Intermediate memory after the loop terminates: bumper | many bytes | *zero*

The loop moves one cell right each time, saving respective bytes to consecutive cells. The writing loop reverses this process.

Intermediate memory: bumper | many bytes | *zero*
<1      Move to the last byte
-10 [6  Don't enter the loop if already at the end
  +10   Restore original value
  . <1  Write and move to the next byte
-10 ]4 Stop looping if at the end
Memory after one cycle: bumper | many bytes | *byte* | zero
Final memory: *zero* | many bytes

Moving loops are a powerful tool with many use cases. It's important to always know that the loop will either hit a bumper or terminate in some other way, otherwise it could overwrite other data.

Final remarks

You're almost ready

The text reversing program could almost be a valid program for the BPU, but it needs some minor corrections. The first one is that after finishing the program the computer would not stop but go on to execute whatever else is stored in memory, and in this case it is the text given by the user. When interpreted as code it is just garbage and would lead to unexpected behaviour.

To prevent this from happening each program should include a main loop that loops always, no matter what. In our case it just requires to add some code at the end:

+10 .  Set the bumper back to newline and print it
]19    Jump back to the start

If the jump is too big to fit in a single argument you can insert intermediate jumps right after jumps in the code. For example if you have this piece of code: ]3 ]30 then when it's executed normally either the first jump will take place or no jump will take place. On the other hand if you jump directly into the ]30 it will continue jumping back allowing for jumps bigger than whatever your maximum argument is.

The other minor issue with the program is that both the program pointer and the data pointer start at the first memory cell which is occupied by the first command of the program. The data pointer has to be moved to a code-free area first, and only then the program can be executed. This is accomplished by adding a >22 at the start of the setup, which gives the complete and working program:

>22
+10 >1  Set cell to the bumper value and move over
,       Read a byte

-10 [6  Don't enter the loop if newline
  +10   Restore original value
  >1 ,  Move over and read a new byte
-10 ]4  Stop looping if newline

<1      Move to the last byte
-10 [6  Don't enter the loop if already at the end
  +10   Restore original value
  . <1  Write and move to the next byte
-10 ]4 Stop looping if at the end

+10 .  Set the bumper back to newline and print it
]19    Jump back to the start

Conclusion

Go and write code!

Now you know how to use [ and ] commands to create advanced loops and conditionals. With such powerful tools you can write anything! Just remember to carefully plan your memory usage and the BPU is your oyster.

You can find the BPU simulation as well as a compiler and some examples in the PBU github repo. Have fun with the language!

DEV Community