Menestret Martin

Posted on Feb 21, 2019 • Originally published at geekocephale.com

Anatomy of an algebra

#scala #functional

[Check my articles on my blog here]

I will try to group here, in an anatomy atlas, basic notions of functional programming that I find myself explaining often lately into a series of articles.

The idea here is to have a place to point people needing explanations and to increase my own understanding of these subjects by trying to explain them the best I can.
I'll try to focus more on making the reader feel an intuition, a feeling about the concepts rather than on the perfect, strict correctness of my explanations.

Part 1: Anatomy of functional programming
Part 2: Anatomy of an algebra
Part 3: Anatomy of a type class
Part 4: Anatomy of semi groups and monoids
Part 5: Anatomy of functors and category theory
Part 6: Anatomy of the tagless final encoding - Yet to come !

What is an algebra ?

Functional programmers tend to talk a lot about algebras, so a good starting point would be to understand what an algebra is.
A simple definition would be (from Wikipedia):

In its most general form, an algebra is the study of mathematical symbols and the rules for manipulating these symbols

Then to rephrase it in simpler terms, it is a system formed by:

Symbols, items or "things"
Operations on thoses "things"
Properties and rules about those operations

A simple algebra example would be:

Symbols: the integer numbers
Operations: addition and multiplication
Properties and rules:
- Closure: the result of the two operations is itself in integer number
- Associativity: a + (b + c) = (a + b) + c and a × (b × c) = (a × b) × c
- Commutativity: a + b = b + a and a × b = b × a
- Identity elements: a + 0 = a and a * 1 = a
- And so on (check the link if you want to see the rest)

How does it relates to FP ?

Domain modeling

When modeling a business domain, in functional programming, we separate strongly data from behaviors (see the Data/behavior relationship section).

To do that:

We create types on one side
Pure functions manipulating those types
And these functions have to respect some strict business logic

Doesn't that ring any bell ?

We manipulate algebras !

Symbols: our business types
Operations: our business functions acting on those types
Properties and rules: our business logic implemented in these functions, and our property based tests
- We won't cover those here, but to give you an idea, these are not unit tests, but tests based on function properties that they must always verify (such as: "A combination of several sales cannot result in a negative price") while tested against a wide range of more or less random values generated by a test framework

A concrete example

To make it clearer, let's take a dummy example.

Say we have to produce sales reports, we would probably model our domain like:

type Amount = Int
type Price = Int

final class ID(val value: Int) extends AnyVal
final case class Item(id: ID, price: Price)
final case class Sales(items: List[(ID, Amount)], totalPrice: Price)

trait SalesOperations {
    def combineAll(sales: List[Sales]): Sales
    def generateReport(sales: Sales): String
}

In that case:

Symbols: Amount, Price, ID, Item, Sales, etc. (our business types)
Operations: combineAll, generateReport (operations on those types)
Properties and rules: the business logic and the tests we haven't implemented here

That's pretty much it !

It happens frequently to have shared behaviors that are common to several types.
We want to be able to design functions that can be run on those different types and have the expected behavior depending on their input types. Those functions are said then said to be polymorphic.

That's a problem we solve with type classes.

What is an ADT ?

ADT stands for algebraic data type. Well, we talked a bit about algebras but we didn't talk about types yet.

What is a type ?

A type or a data type, is just a classification of data, a set of values or inhabitants, that we use to tell the compiler how we intend to use data so it can check that we respect the rules (that's the type checking).

Boolean is a type that classifies these two values: true, false
String is a type that reprents an infinity of inhabitants, all the possible characters combinations
Option[A] is a type that represents all the values of type A (wrapped in a Some) + the value None
- Option has a number of inhabitants equals to the number of inhabitants of the type A + 1 (the None value)
- Option[Boolean] has 3 inhabitants: Some(true), Some(false), None
Unit is a type that has only 1 inhabitant, the value: ()
Noting is a type that has no member (there is no way to create a value whose type is Nothing)

What is an algebraic data type then ?

Here is Wikipedia's definition of an algebraic data type:

In computer programming, especially functional programming and type theory, an algebraic data type is a kind of composite type, i.e., a type formed by combining other types.

We call these new types ADTs because we create them by combining existing types... well, guess what ? We use an algebra again !

This is algebra is the following:

Symbols: existing types: primitive types, other existing ADTs, ... (Sales, Boolean, String, Int, ...)
Operations: sum and product (we'll see what they mean)
Properties and rules: some properties and laws about these sum and product operations (we'll also see what they mean)

Sum

sum is the action of combining by "summing" their respective values together.

You can see it as a way to define that: values of a sum type can only be either a value of this OR that type (the OR is the key intuition here).

It is done, in Scala, usually by using:

sealed traits and their instances as case classes or case objects

Or less often:

The Either[A, B] type

That way, when you declare:

sealed trait CoinSide
case object Heads extends CoinSide
case object Tails extends CoinSide

You are creating an ADT CoinSide which is a sum type (created with a sum operation) which has two and only two possible values (or inhabitants), Heads or Tails.

The same would be achieved with:

type CoinSide = Either[Heads, Tails]

Whose possible values would then be, Left(Heads) or Right(Tails).

These are just two different encodings for the same thing.

Sum properties

The sum operation also have properties.

I won't go into full details here but (these examples would equally be true with sealed traits + their implementations instead of Either type):

The number of values of a sum type is the sum of the values of its composing types members (just as you would assume for addition on integers)
- Boolean has two inhabitants: true and false
- Either[Boolean, Boolean] which is a sum type of two types Boolean has four inhabitants:
  - Left(true)
  - Left(false)
  - Right(true)
  - Right(false)
- Either[Boolean, Either[Boolean, Boolean]]] is a sum type between a Boolean and another sum type of a Boolean and a Boolean
  - Well guess what ? It has 2 + (2 + 2) values !
It has an identity element which is the Nothing type which has no values at all
- Either[Boolean, Nothing] is a sum type of a Boolean with the sum identity. Because there is no way to create a value of type Nothing, it does not exist, so there is no way to construct a Right, it has only two values:
  - Left(true)
  - Left(false)
It is associative, Either[Boolean, Either[Boolean, Boolean]]] is the same as Either[Either[Boolean, Boolean]],Boolean], you can enumerate the values, you'll see (well isomorphic, we have the same "expressive power" with both representations, but let's say they are the same) !
And so on (I'll give you more material at the end if you want to keep diving)

Product

product is the action of combining two or more types together by "multiplying" their respective values together.

You can see it as a way to define that values of a product type are the combination of values of this AND that type (the AND is the key intuition here).

It is done, in Scala usually by using:

case classes

Or less often

tuples

That way, when you declare:

case class TwoCoinsAndABoolean(fst: CoinSide, snd: CoinSide, b: Boolean)
// or
type TwoCoinsAndABoolean = (CoinSide, CoinSide, Boolean)

These are just two different encodings for the same thing.

You are creating an ADT TwoCoinsAndABoolean which is a product types (created with a product operation) which has the number of values of its members multiplied.

In our case 8 values (2 * 2 * 2):

TwoCoinsAndABoolean(Heads, Heads, true) or (Heads, Heads, true)
TwoCoinsAndABoolean(Heads, Tails, true) or (Heads, Tails, true)
TwoCoinsAndABoolean(Tails, Heads, true) or (Tails, Heads, true)
TwoCoinsAndABoolean(Tails, Tails, true) or (Tails, Tails, true)
TwoCoinsAndABoolean(Heads, Heads, false) or (Heads, Heads, false)
TwoCoinsAndABoolean(Heads, Tails, false) or (Heads, Tails, false)
TwoCoinsAndABoolean(Tails, Heads, false) or (Tails, Heads, false)
TwoCoinsAndABoolean(Tails, Tails, false) or (Tails, Tails, false)

You can observe that product types values are the cartesian product of their composing types values !

Product properties

The product operation also have properties.

I won't go into full details here but (these example would equally be true with case classes instead of tuples):

The number of values of a product types is the product of the number of the values of its combining members (as you would assume for multiplication on integers)
- Boolean has two inhabitants: true and false
- (Boolean, Boolean) which is a product type of two types Boolean has four values:
  - (true, true)
  - (true, false)
  - (false, true)
  - (false, false)
- (Boolean, (Boolean, Boolean) is a product type between a Boolean and another product type of a Boolean and a Boolean
  - Well guess what ? It has 2 * (2 * 2) values !
It has an identity which is the famous Unit type which has only one inhabitant, the value ()
- (Boolean, Unit) is a product type of a Boolean with the product identity. It has two values:
  - (true, ())
  - (false, ())
It is associative, (Boolean, (Boolean, Boolean) is the same as ((Boolean, Boolean),Boolean), you can enumerate the values, you'll see (well isomorphic, we have the same "expressive power" with both representations, but let's say they are the same) !
And so on (I'll give you more material at the end if you want to keep diving)

Mixed types

Of course, as we saw in some of the examples, ADTs can be combinations of other ADTs, such as sum of products, products of sums, product of products and so on.

More material

If you want to keep diving deeper, some interesting stuff can be found on my FP resources list and in particular:

Functional and Reactive Domain Modeling (An awesome book about functional domain modeling using algebras)
Kinds of types in Scala
Why do Functional Programmers always talk about Algebra(s)?

Conclusion

All that long digression was only meant to show you that, creating new composite data types the way we do in pure FP is done by using an algebra.

That's why these composite types are called Algebraic Data Types :).

To sum up, we learnt:

What was an algebra
How algebras were used to model domains in a neat way and how it was adequate to the pure functional programming approach
What was a type
How creating new data types was done by using an algebra composed by types and operations on them and thus why the types created that were called algebraic data types

I'll try to keep that blog post updated.
If there are any additions, imprecision or mistakes that I should correct or if you need more explanations, feel free to contact me on Twitter or by mail !

Top comments (4)

Comment deleted

Menestret Martin • Mar 13 '19 • Edited

Hi,

Well the 4 values you listed, wrapped in a Right, are the Right part of the Either[Boolean, Either[Boolean, Boolean]]]:

-Right(Left(true))
-Right(Left(false))
-Right(Right(true))
-Right(Right(false))

Then you have the Left part of the Either[Boolean, Either[Boolean, Boolean]]] which are the following values:

-Left(true)
-Left(false)

Which makes 6 values, is that ok for you ?

Comment deleted

Menestret Martin • Mar 14 '19

No, it was a good question, maybe I should have listed all the values to make my point. Indeed if we had been working with (Either[Boolean], Either[Either[Boolean]]) we would have had 8 values !

Thanks for you question.

DEV Community