Danie Palm

Posted on Dec 25, 2021 • Edited on Dec 28, 2021

Church encoding in the concatenative language Joy

#concatenative #joy #church

I've previously written about Church encoding in the Joy programming language. The article focused on booleans, logic and predicates, but didn't touch on numbers. Church numerals have been haunting me ever since, and after stumbling on two other concatenative languages, Dawn and Stck, things have finally fallen in place for me (avoiding an obvious pun here) and I've been able to simplify the Church boolean scheme as well. Instead of
updating my previous article, which would be a rewrite anyway, I'm doing a new article on Church encoding in Joy.

This article is technically part of the series Elixir of Life, but is intended to be self-contained.

Joy

Feel free to skip this intro to Joy if you've read any of my other Joy articles.

Joy is a functional, stack-based concatenative programming language developed by Manfred von Thun. It has many similarities to lambda calculus, combinatory logic, and Lisp. Like combinatory logic it differs from lambda calculus in that it doesn't feature lambda abstraction, the way in which formal parameters are specified in lambda calculus, and therefore does not have the concept of free variables. Joy differs from combinatory logic in that it is not based on function application, but rather on function composition. All joy functions, often called combinators, take a stack as its sole input and returns an updated stack as result. Joy uses postfix notation, also often called reverse Polish notation.

Joy is a fully-fledged programming language packed with all kinds of convenience functions that facilitate everyday program flow, such as conditional execution, repeated execution, recursion, and IO. It also comes with a rich set of types. But in this article, we begin to show how almost all of that can be thrown away and rebuilt from only a handful of functions and quotations. A quotations is essentially a program literal that takes the form of a list of functions that remain unevaluated unless explicitly executed. The following set of combinators, which can actually be reduced even more, is enough to render Joy Turing-complete:

dup  - duplicates the top of the stack
pop  - removes the top of the stack
swap - swaps the two tops elements of the stack
i    - executes a quotation (removes wrapping []'s)
unit - quotes the top of the stack (wraps it in []'s)
cons - prepends the element below the top of the stack to a quotation on top
cat  - concatenates the two top elements (quotations) of the stack

Here is a simple Joy program: [] [dup] [pop] swap i i, it evaluates to []. Terms are executed from left to right. We first place [], [dup], and [pop] on the stack. Then swap reads [pop] and [dup] from the stack, switches them around and puts them back on the stack. Next i unquotes [dup] which was on top of the stack, and thus causes it to be executed, duplicating [pop] which was right below it on the stack. Finally, the next i unquotes [pop] which then removes its duplicate that was below it on the stack. This leaves only [] on the stack.

Church encoding

Church encoding is a scheme by which any calculable operator and its data (operands) can be expressed purely in terms of functions. It has traditionally been explored mostly in the context of lambda calculus, but works well in concatenative languages, and - I imagine - Lisp, and combinatory logic. In fact, it works in
any Turing-complete language.

In simple terms, you only need a single primitive: the function. All other primitives (numbers, booleans, pairs, lists) can be encoded in terms of functions.

For example, in Joy (remember it uses postfix notation)it is possible to express the predicate 1 + 2 = 3 as follows:

1 2 + 3 =

where each of 1, 2, +, 3, and = is a composite function that is defined in terms of a handful of primitive functions such as dup, pop, swap, cat, unit, and cons. When executed, it leaves the value true on the stack (since the sum of 1 and 2 does in fact equal 3), where true is also a composite function defined in terms of a handful of primitives.

According to our scheme, the definition of true happens to be:

true == [pop]

But there are numerous different schemes, each with its own pros and cons (not avoiding an obvious pun here). There is no canonical representation of any literal, and its semantics as an operator (say +) or operand (say 1) arises from the scheme as a whole and how it maps onto logic and arithmetic.

In this article we employ a scheme in which literals are represented as quotations - function compositions that remain unevaluated until explicitly evaluated.

Church booleans

Booleans can be represented as functions that expect two values on the stack, returning the one argument if true and the other if false. In other words, we can write reduction rules for true and false as follows:

A B true i == A

and

A B false i == B

These reduction rules lead to the following definitions:

A B true i == A
A B true i == A B pop
A B true i == A B [pop] i
    true   ==     [pop]

A B false i == B
A B false i == B A pop
A B false i == A B swap pop
A B false i == A B [swap pop] i
    false   ==     [swap pop]

Don't stress if you don't quite yet follow the derivation process, we'll get back to it in more detail.

Note that we could easily reverse the definitions of true and false. As long as we construct the boolean operators correctly, the definitions of true and false are largely arbitrary.

While it is convenient to think of A and B as arguments of the functions true and false, they are in fact two elements on top of a stack that is the only argument of the functions.

Let's see true in action:

A B true i
A B [pop] i   # definition of true
A B pop       # i unquotes the element on top of the stack
A             # pop discards the top of the stack

Admittedly, there is nothing "boolean" here yet. The boolean nature of true and false will only appear once we introduce some operators. Note the need for i (apply) after the boolean function. It is required because true and false are quotations and therefore "inert" unless applied.

The first operator that we'll consider is and.

The `and` operator

The semantics we expect from and is as follows:

false false and == false
false true and == false
true false and == false
true true and == true

And we achieve that with the following definition of and:

and == dup i

It is easy to verify that the above definition works as expected, but it
is not immediately clear how we arrived at the solution. Let's start with
some reduction rules and work our way down to a definition. In case B is
true, the reduction rule is:

A B and == A

And in case B is false, the rule is:

A B and == B

In order to derive a definition for and we want to rewrite the reduction rules in the form:

A B and == A B C

where C is the definition of and. We do this by systematically introducing function compositions to the right hand side that are equivalent to the identity function id. This is similar to how introducing + 0 or + x - x to to a mathematical expression has no effect on its value.

We start with the case where B is true.

A B and == A                 # if B is true, only A counts
        == A B pop           # since B pop == id
        == A B [pop] i       # i unquotes [pop]
        == A B true i        # definition of true
        == A B B i           # since B is true

And in case B is false:

A B and == B                 # if B is false, we can ignore A
        == B A pop           # since A pop is a no-op
        == A B swap pop      # A B swap == B A
        == A B [swap pop] i  # [swap pop] i == swap pop
        == A B false i       # definition of false
        == A B B i           # since B is false

And so we arrive at a common reduction rule that is independent of the value of B:

A B and == A B B i
        == A B dup i

And therefore:

and == dup i

The `or` operator

We expect the following semantics from an or operator:

false false or == false
false true or == true
true false or == true
true true or == true

Using the same deductive process as before, we arrive at the following
definition:

or == [dup] swap unit cat i swap i
   == [dup] dip swap i      # defining dip == swap unit cat i

At first glance this is quite a bit more involved than the and operator,
but we'll get to a more elegant definition below.

The `not` operator

The not operator should simply negate the boolean that is on top of the
stack:

false not == true
true not == false

Let us once again work our way down from reduction rules to a definition. If A is true, then:

A not == false
      == false true pop     # true pop == id
      == false true [pop] i # quote then unquote
      == false true true i  # definition of true
      == false true A i     # since A is true

And similarly, when A is false:

A not == true
      == true false pop
      == false true swap pop
      == false true [swap pop] i
      == false true false i
      == false true A i  # since A is false

Now, regardless of the value of A:

A not == false true A i
A not == A [false true] dip i
  not == [false true] dip i

Armed with not, the definition of or simplifies to:

or == dup not i

We can also define nand == and not and nor == or not. Incidentally, nand
and nor are the only two functionally complete operators, meaning that reduction rules for any other boolean operator can be written purely in terms of either nand or nor. Let's briefly look at how nand can be implemented using only nor:

A B nand == A A nor B B nor nor A A nor B B nor nor nor

And, for symmetry, how nor can be implemented using only nand:

A B nor == A A nand B B nand nand A A nand B B nand nand nand

The reader is invited to check the veracity of these reduction rules and to derive definitions for nand and nor from them.

Conditionals

So far we have seen how choosing true == [pop] and false == [swap pop] has
allowed us to devise definitions for all the usual boolean operators. Our choice, which is not the only possible choice, is based on the idea of selectively discarding functions. I.e., pop the top of the stack, or pop the value below the top of the stack.

It is not surprizing then that we can also define functions that behave like
conditionals or if-then-else blocks. We have essentially already come across a primitive if-then-else block in the definition of not.

We can take this one step further and define branch with the following reduction rule:

cond [then] [else] branch == then # if cond is true
cond [then] [else] branch == else # if cond is false

cond [then] [else] branch == [then] i # if cond is true
cond [then] [else] branch == [else] i # if cond is false

cond [then] [else] branch == [then] i # if cond is true
cond [then] [else] branch == [else] i # if cond is false

cond [then] [else] branch == [then] [else] pop i # if cond is true
cond [then] [else] branch == [then] [else] swap pop i # if cond is false

cond [then] [else] branch == [then] [else] [pop] i i # if cond is true
cond [then] [else] branch == [then] [else] [swap pop] i i # if cond is false

cond [then] [else] branch == [then] [else] true i i # if cond is true
cond [then] [else] branch == [then] [else] false i i # if cond is false

cond [then] [else] branch == [then] [else] cond i i # if cond is true
cond [then] [else] branch == [then] [else] cond i i # if cond is false

# from this point onward the reduction rule is valid for any value of cond
cond [then] [else] branch == [then] [else] cond i i
cond [then] [else] branch == cond [then] [else] dig2 i i
cond [then] [else] branch == cond [then] [else] unit cons dip i i
                   branch == unit cons dip i i

We now have a simple conditional. If the condition is true we execute a then-block, otherwise we execute an else-block. It would be more useful if we also had a conditional block, so to speak. I.e. a quotation that, when executed, yields a boolean. This is where comparators like =, <, and > would be useful.

In order to arrive at definitions for such comparators, we need to look at Church numbers first.

Church numerals

Where Church booleans were centred around optional execution, Church numbers are centred around multiple execution.

A quick note on terminology. I will use "numeral" and "number" interchangeably, but they are not the same thing. "Numeral" is syntax, "number" is semantics. Both the Arabic numeral "3" and the Roman numeral "III" can have the meaning of the number three in the right context.

The semantics we are looking for is this:

[A] 0 i ==
[A] 1 i == A
[A] 2 i == A A
[A] 3 i == A A A
.
.
.
[A] N i == A A A ... A

A definition for 0 that satisfies the above reduction rule is 0 == [pop]. Similarly, 1 == [i] will do as a definition for 1. For 2, things are slightly more complicated:

[A] 2 i == A A
        == [A A] i
        == [A] [A] cat i
        == [A] dup cat i
        == [A] [dup cat i] i
      2 == [dup cat i]

We are hoping for some kind of pattern to emerge, so that we don't need a new definition for every
numeral. Let's take a look at 3:

[A] 3 i == A A A
        == [A A A] cat i
        == [A A] [A] cat i
        == [A] [A] [A] cat cat i
        == [A] [A] dup cat cat i
        == [A] dup dup cat cat i
        == [A] [dup] 2 i [cat] 2 i i
        == [A] [[dup] 2 i [cat] 2 i i] i
      3 == [[dup] 2 i [cat] 2 i i]

That gives us a definition of 3 in terms of its predecessor 2. We could expand the definition
all the way and replace 2 with [dup cat i], but we won't gain much in doing so. Let's rather find out whether the pattern continues. I.e. can we define 4 in terms of its predecessor 3? It turns out that we can - the reader is invited to find a derivation - and it also turns out that the pattern holds for all n where n > 0.

We now have a way to apply a quotation n times, but that in itself doesn't really mean that 2 has the semantics of the number two. Not without operators.

As a stepping stone to addition, the first operator that we'll consider is succ, the successor function.

The successor function `succ`

We expect the following semantics from a successor function:

0 succ == 1
1 succ == 2
2 succ == 3
.
.
.

In order to derive a definition for succ, we will exploit the fact that numbers are defined in
terms of their predecessors. That is, if N is our starting point, then by carefully positioning and manipulating N to arrive at [[dup] N i [cat] N i i], we will have found a definition for succ:

N succ == [[dup] N i [cat] N i i]
.
.
.
  succ == unit [i] cat [[dup]] swap dup [[cat]] swap [i] cat cat cat cat

Addition

Addition is the repetition of succession. Adding m to n is equivalent to calculating the m*th successor of *n. This leads us to the following reduction rule:

N M + == N succ succ ... succ         # succ repeated M times
N M + == N [succ] M i
N M + == N M [succ] swap i
    + == [succ] swap i

For example:

2 3 +
2 3 [succ] swap i
2 [succ] 3 i
2 succ succ succ
3 succ succ
4 succ
5

Multiplication

Multiplication can be defined in terms of addition. The product of n and m is equivalent to adding n m times to 0. That is, 2 times 3 is 0 + 2 + 2 + 2. We can generalize multiplication with the following reduction rule:

N M × == 0 N + N + ... N +      # "N +" repeated M times
N M × == 0 [N +] M i
N M × == [N +] M 0 bury2 i
N M × == N [+] cons M 0 bury2 i
N M × == N M [[+] cons] dip 0 bury2 i
    × == [[+] cons] dip 0 bury2 i

For example:

2 3 ×
2 3 [[+] cons] dip 0 bury2 i
2 [+] cons 3 0 bury2 i
[2 +] 3 0 bury2 i
0 [2 +] 3 i
0 2 + 2 + 2 +
2 2 + 2 +
4 2 +
6

Exponentiation

The derivation of the pow function is very similar to that of ×. However, instead of repetitively adding a value to 0, we now rather multiply 1 repetitively by some value. That is 2^3 is equivalent to 1 times 2 times 2 times 2. Or, in general:

N M pow == 1 N × N × ... N ×       # N × repeated M times
N M pow == 1 [N ×] M i
.
.
.
    pow == [[×] cons] dip 1 bury2 i

For example:

2 3 pow
2 3 [[×] cons] dip 1 bury2 i
2 [×] cons 3 1 bury2 i
[2 ×] 3 1 bury2 i
1 [2 ×] 3 i
1 2 × 2 × 2 ×
2 2 × 2 ×
4 2 ×
8

Predecessor function

So far, we've only dealt with functions that increase the value of a number. To get to subtraction, division, and roots, we need to start with
a predecessor function pred. Here is what we expect from a predecessor function:

0 pred == 0       # I'll explain this later
1 pred == 0
2 pred == 1
3 pred == 2
.
.
.

Defining pred using Church encoding is quite complicated. Here is how we will go about it. We want to start at 0 and repetitively add 1 to it, but instead of repeating this n times, we'll repeat it n - 1 times. In a sense, that just defers the problem, because now we have to figure out when to stop. The secret lies in applying some
function that will act like succ n times, except for the very first application. This we can achieve by maintaining a boolean along with our numerical value that keeps track of whether we are at the first function application or not. This leads us to the following reduction rule:

N pred == 0 false [[succ true] [true] branch] N i

From which we can derive a definition:

pred == 0 false dig2 [[succ true] [true] branch] swap i pop

To illustrate how it works, let's consider 2 pred:

2 pred
2 0 false dig2 [[succ true] [true] branch] swap i pop
0 false 2 [[succ true] [true] branch] swap i pop
0 false [[succ true] [true] branch] 2 i pop
0 false [succ true] [true] branch [succ true] [true] branch pop
0 true [succ true] [true] branch pop
0 succ true pop
1 true pop
1

Note that the predecessor of 0 is 0. The encoding we are dealing with here only applies to the natural numbers, so
no negatives. Extending the encoding to integers and reals and even complex numbers is possible, but not within
the scope of this article.

Subtraction

Equipped with pred, we can move on to subtraction (-):

N M - == N [pred] M i

And thus:

- == [pred] swap i

Church comparators and predicates

So far we have looked at Church booleans and Church numbers. We even had to combine them to arrive at a definition for pred. We now
turn to predicates. The first predicate we look at is is-zero. In order to arrive at a definition for is-zero, we will exploit the
fact that the numeral 0 can be used to apply some quoted function zero times, whereas any other numeral n, will apply the quoted
function n times. If we choose the quotation to be [pop false] and start with a value of true, then if we apply the function zero times,
we end up with true, which is what we want. If on the other hand we apply the function once or more, we will pop the initial true value
and replace it with false, only to pop false and replace it with false again, repeat n times. This leads us to the following reduction rules:

0 is-zero == true
1 is-zero == true pop false
2 is-zero == true pop false pop false
3 is-zero == true pop false pop false pop false
.
.
.
N is-zero == true [pop false] N i

And thus:

is-zero == [true [pop false]] dip i

Also, since n - m = 0 when m is larger than or equal to n, we can also make the following definitions:

<= == - is-zero

and

>= == swap - is-zero

Finally, if n <= m and n >= m, then we know that n = m:

m n = == m n <= m n >= and
m n = == [m n] i <= [m n] i >= and
m n = == [m n] [m n] [i <=] dip i >= and
m n = == [m n] [m n] [i <=] swap unit cat i i >= and
m n = == [m n] dup [i <=] swap unit cat i i >= and
m n = == m [n] cons dup [i <=] swap unit cat i i >= and
m n = == m n unit cons dup [i <=] swap unit cat i i >= and
    = ==     unit cons dup [i <=] dip i >= and

Advanced numerical operators

We have now built up a sizeable arsenal of composite operators and operands. It is time to address division and root extraction.

Division

The basic principle that we'll follow is that the quotient is the number of times that you can subtract the divisor from the dividend before you reach 0.
In other words 10 2 ÷ 5 = means that 10 [2 -] 5 i 0 = (subtracting 2 five times from 10 yields 0). In other words, in order to determine N M ÷ we want to start with a quotient of 1 (Q) and increment its value iteratively until the following no longer holds:

N [M -] Q i 0 >=

One way to achieve this is to use the y combinator, which makes anonymous recursion possible. In the base case, when our condition no longer holds, we terminate and only leave the quotient on the stack. In the recursive case, we increment the quotient and recheck the condition. The resulting reduction rule for division is:

N M ÷ == 0 true [
  bury2
  [
    succ dup [N [M -]] dip i 0 >= dig2 i
  ]
  [swap pop]
  branch
] y

The objective is to extract N and M to the front of the right hand side. The result is a definition for ÷:

÷ == [
  [[bury2] dip] dip
  [
    [[succ dup ] dip] dip [[-] cons] cons cons dip i 0 >= dig2 i
  ] cons cons
  [swap pop]
  branch
] cons cons [0 true] dip y

Root extraction

Determining the *n*th root is very similar to division. But instead of repetitive subtraction, we now employ repetitive division. This time we increment the root R until the following no longer holds:

N [M ÷] R i 1 >=

The reduction rule for root extraction looks like this:

N M √ == 0 true [
                    bury2
                    [
                      succ dup [N [M ÷]] dip i 1 >= dig2 i
                    ]
                    [swap pop]
                    branch
                  ] y

From which we can derive the definition of √:

√ == [
  [[bury2] dip] dip
  [
    [[succ dup ] dip] dip [[÷] cons] cons cons dip i 1 >= dig2 i
  ] cons cons
  [swap pop]
  branch
] cons cons [0 true] dip y

Putting it all together

We started out with the ambition to devise an encoding using only basic functions by which the following would evaluate to true:

1 2 + 3 =

Let's put it to the test (taking a few shortcuts here and there):

1 2 + 3 =
3 3 =
3 3 unit cons dup [i <=] dip i >= and      # definition of =
3 [3] cons dup [i <=] dip i >= and
[3 3] dup [i <=] dip i >= and
[3 3] [3 3] [i <=] dip i >= and
[3 3] i <= [3 3] i >= and
3 3 <= 3 3 >= and
3 3 - is-zero 3 3 swap - is-zero and
3 3 - is-zero 3 3 swap - is-zero and
0 is-zero 3 3 swap - is-zero and
0 is-zero 3 3 - is-zero and
0 is-zero 0 is-zero and
0 [true [pop false]] dip i 0 [true [pop false]] dip i and
true [pop false] 0 i true [pop false] 0 i and
true true and
true true [dup] dip swap i                # definition of and
true dup true swap i
true true true swap i
true true true i
true true [pop] i                         # definition of true
true true pop
true

Applications

Besides the ability to do simple arithmetic and logic, we can also put Church-encoding numerals to other use.

Digging and burying

One such application lies in generalizing stack shuffling functions like dig and bury.

The dig family of functions are used to dig values up from deep down on the stack. dig, or dig1 is equivalent to swap, whereas dig2 brings the second value below the top to the top. And so on. Here are definitions for some dig functions:

dig1 == [] cons dip
dig2 == [] cons cons dip
dig3 == [] cons cons cons dip

Clearly, the deeper you want to dig, the more cons functions are required. We can generalize dig as follows:

dign == [] [cons] N i dip

The bury family does the opposite. It buries the top of the stack. Once again, bury1 is equivalent to swap. Here are definitions for some bury functions:

bury1 == [[] cons] dip swap i
bury2 == [[] cons cons] dip swap i
bury3 == [[] cons cons cons] dip swap i

And likewise, we can generalize this to:

buryn === [[] [cons] N i] dip swap i

Summary of definitions

Miscellaneous helpers:

dip   == swap unit cat i
y     == [dup cons] swap cat dup cons i
dign  == [] [cons] N i dip
buryn == [[] [cons] N i] dip swap i

Boolean operators:

true   == [pop]
false  == [swap pop]
and    == dup i
or     == [dup] dip swap i
not    == [false true] dip i
nand   == and not
nor    == or not

Numerical operators:

N       == [[dup] M i [cat] M i i]
succ    == unit [i] cat [[dup]] swap dup [[cat]] swap [i] cat cat cat cat
+       == [succ] swap i
×       == [[+] cons] dip 0 bury2 i
pow     == [[×] cons] dip 1 bury2 i
pred    == 0 false dig2 [[succ true] [true] branch] swap i pop
-       == [pred] swap i
is-zero == [true [pop false]] dip i
÷       == [[[bury2] dip] dip [[[succ dup ] dip] dip [[-] cons] cons cons dip i 0 >= dig2 i] cons cons [swap pop] branch] cons cons [0 true] dip y
√       == [[[bury2] dip] dip [[[succ dup ] dip] dip [[÷] cons] cons cons dip i 1 >= dig2 i] cons cons [swap pop] branch] cons cons [0 true] dip y

where N is any natural number and M is the predecessor of N.

Conditionals and comparators:

branch == unit cons dip i i
<=     == - is-zero
>=     == swap - is-zero
=      == unit cons dup [i <=] dip i >= and

DEV Community

Church encoding in the concatenative language Joy

Joy

Church encoding

Church booleans

The `and` operator

The `or` operator

The `not` operator

Conditionals

Church numerals

The successor function `succ`

Addition

Multiplication

Exponentiation

Predecessor function

Subtraction

Church comparators and predicates

Advanced numerical operators

Division

Root extraction

Putting it all together

Applications

Digging and burying

Summary of definitions

Sources and further reading:

Top comments (0)

Joy

Church encoding

Church booleans

The and operator

The or operator

The not operator

Conditionals

Church numerals

The successor function succ

Addition

Multiplication

Exponentiation

Predecessor function

Subtraction

Church comparators and predicates

Advanced numerical operators

Division

Root extraction

Putting it all together

Applications

Digging and burying

Summary of definitions

Sources and further reading:

The `and` operator

The `or` operator

The `not` operator

The successor function `succ`