Definition of Semigroup
Semigroup is magma algebraic structure with additional requirement - associativity property of operation. So in exactly the same way how we describe magma as pair (S, *)
where S
is a set (in our context set equal type), and *
is binary, closed operation over S
defined as (S,S) -> S
, we describe semigroup, with one additional rule:
// associativity rule of operation *
forall a,b,c in S, a * (b * c) = (a * b) * c = a * b * c
In other words - grouping of operations has no meaning for the end result, there is no difference if we compose firstly a
with b
or b
and c
, the result should not change.
Basic algebra operations for numbers like +
, *
are associative. It is no difference if you do (1 + 2) + 3
or 1 + (2 + 3)
, the result will be the same. Semigroup is exactly about the notion of addition behavior, but semigroup is a concept abstracted from numbers, we can define it in other areas, in programming areas will have a form of types.
Note There are also many non-associative operations like exponentiation, subtraction or division
What it has to programming
In programming semigroups are visible very often, not only in number operations, but in other types also. I believe you were doing many times string concatenation or array concatenation, these operations are nominal examples of semigroups. We can say - every time you join two things together, for sure you have magma, and almost for sure you have semigroup too.
Humble examples
Pair (string, +)
forms a semigroup in TypeScript
[TS]
const a = "Hello";
const c = " ";
const b = "World";
const concat1 = (a + b) + c;
const concat2 = a + (b + c);
// concat1 and concat2 are the same - "Hello World"
Pair (Array<any>, Array.prototype.concat)
forms a semigroup in TypeScript.
[TS]
type S = Array<number>; // type contains all lists with number type inside
const a: S = [1, 2]; // a is member of S
const b: S = [3, 4]; // b is member of S
const c = a.concat(b); // c is member of S
c // [1,2,3,4]
That proofs that concat
is an operation together with type Array<any>
forms a magma, but does concat
form a semigroup also?
const res1 = a.concat(b.concat(c)); // read us a `conc` ( b `conc` c)
const res2 = a.concat(b).concat(c);// read us (a `conc` b) `conc` c
// all two results are [1,2,3,4,5,6]
Above is more like a test then the proof, but you need to believe me that concat
with set List<any>
forms a semigroup.
Note Take a second look at this
Array
example, we are using here typical for OOP implicitthis
passing. Isn't it a problem for pure mathematical concepts? No, not at all. We can represent mathematical abstractions also in OOP.
Type level semigroup
Algebra is visible not only at value level, but also at type level. Algebraic data types are great examples of such algebraic structures. As we have seen before in first article of the series - we have three kinds of algebraic data types - sums (+)
, products (*)
, exponentials (^)
. Exponential types are not associative, as exponential operation is not, that said - exponentiation does not form a semigroup.
[TS]
type Status = "Idle" | "Pending" | "Error" | "Success" // sum
type User = { name: string, lastname: string } // product
Sum type semigroup (or)
Let's take the first sum type and analyze how it is related to +
operation and if it is a semigroup. Take a look that we can consider every element of the the sum type as element with value equal 1, if we consider then |
as +
we will have an expression 1 + 1 + 1 + 1
and its 4
, exactly how many members we have of the type Status
. That is to say |
operation is a notion of addition algebra.
Numbers set, with +
operation forms a semigroup, does sum type also forms a semigroup. Lets check that.
[TS]
type Status1 = ("Idle" | "Pending") | "Error" | "Success"
type Status2 = "Idle" | ("Pending" | "Error") | "Success"
type Status3 = "Idle" | "Pending" | ("Error" | "Success")
All above are evaluated to the same type, so we can say pair (any, |)
forms a semigroup.
Note Not all usages of
|
form a sum type in TS, but all form a semigroup. For exampletype NotSum = 'a' | 'a' | 'b'
is evaluated as'a' | 'b'
, so the sum is 2, and should be 3.
Product type semigroup (and)
Product type is a notion of *
in the type level, this is because the amount of possible elements of the type is multiplicated for every joined member. For example tuple (string, string)
has string
* string
amount of elements. Lets take very small type like Boolean
with two members True
, False
and create a pair type - (boolean, boolean)
.
[TS]
type PairBool = [boolean, boolean]; // tuple syntax in TypeScript
// all possible values/members:
[true,true]
[true,false]
[false, false]
[false, true]
We have four members, original type has two members, it means we have multiplicated the size of the type - (bool, bool) ≅ bool * bool ≅ 2 * 2
. That is why we can say product type is a notion of multiplication at the type level.
Is it semigroup?
[TS]
type User1 = (Id & Name) & Lastname;
type User2 = Id & (Name & Lastname);
type User3 = {
id: string,
name: string,
lastname: string
}
All types above are equal to Id & Name & Lastname
, so yes pair (object, &)
and tuple syntax (any, any, ...)
and record syntax forms a semigroup.
Note OCaml syntax for product types is
*
symbol, for examplestring * int
is a type of tuple(string, int)
Custom semigroup
Semigroups are interesting because are notion of joining elements, notion of addition, we can use this notion in our custom types. Lets create custom domain type - Money
and add to it some algebraic semigroup notion. But wait, lets firstly create some naive implementation of adding money operations.
[ELM]
-- Naive implementation, not a semigroup, only magma
type Currency = PLN | USD | GBP
type Money = Money { amount: Float, currency: Currency}
addMoney : Money -> Money -> Money
addMoney (Money a) (Money b) =
if a.currency == b.currency then
Money {a | amount = a.amount + b.amount}
else
Money a
-- example values of type Money
m1 = Money {amount = 1.0, currency = PLN}
m2 = Money {amount = 2.0, currency = PLN}
m3 = Money {amount = 3.0, currency = GBP}
-- using
addMoney (addMoney m1 m2) m3
We have created a type - Money
and closed operation over this type addMoney
. Function addMoney
adds amounts only if currency match, if not it does not add them, in this situation it returns the first argument.
Let's for a moment think about behavior of addMoney
, if currency is the same it will add them, if not it will result by the left argument, and totally skip the right. Very implicit behavior, very not predictable one. Our naive implementation is not associative, how we group operations does matter. Such implementation is not predictable and risky to use, its very error prone as if you add firstly a
with b
and c
after the result will be different from adding firstly b
and c
and a
after. The fact that we don't have a semigroup has a meaning, such abstraction is just not the best one.
Lets try to make it better. Below the second try for the same operation.
-- [ELM]
-- implementation is a semigroup
type Currency
= PLN
| USD
| GBP
type Money
= Money { amount : Float, currency : Currency }
-- dictionary with rates to USD
rateToUSD : Dict Currency Float
rateToUSD = Dict.fromList [(PLN, 0.5), (GBP, 1.2), (USD, 1)]
addMoney : Money -> Money -> Money
addMoney (Money a) (Money b) =
if a.currency == b.currency then
Money { a | amount = a.amount + b.amount }
else
let
aRate = Dict.get a.currency rateToUSD
bRate = Dict.get b.currency rateToUSD
amountA = a.amount * withDefault 1 aRate
amountB = b.amount * withDefault 1 bRate
finalAmount = (amountA + amountB) / (withDefault 1 aRate)
in
Money {a | amount = finalAmount }
Yes it is more code. The difference is that instead of skipping the situation where the currency does not match, we are recalculating the amount by the rate to USD currency, we add amounts as USD and recalculate it again to the original currency of the left argument. The new behavior means that we always will calculate, never skip it. Thanks to such behavior we got associativity of the addMoney
operation, and abstraction is well more predictable.
Pair (Money, addMoney)
forms a semigroup.
Haskell the Heaven of algebra
Haskell by its language features like ad-hoc polymorphism and parametric polymorphism is the best language for creating custom algebraic structures. Not surprising Haskell already has abstraction for semigroup, what is left for the developer is to implement instance of it. Continuing our example with Money
I will recreate the same algebra in Haskell.
[Haskell]
import Data.Map
data Currency = USD | PLN | GBP deriving (Show, Eq, Ord)
usdRates = fromList [(PLN, 0.5), (GBP, 1.2), (USD, 1.0)]
data Money = Money {amount :: Float, currency :: Currency} deriving Show
instance Semigroup Money where
Money am1 c1 <> Money am2 c2 =
if c1 == c2 then
Money (am1+am2) c1
else
let
amount1 = am1 * (findWithDefault 1 c1 usdRates)
amount2 = am2 * (findWithDefault 1 c2 usdRates)
finalUSDAmount = amount1 + amount2
in Money (finalUSDAmount / (findWithDefault 1 c1 usdRates)) c1
-- values of type Money:
m1 = Money { amount = 1.0, currency = PLN }
m2 = Money { amount = 2.0, currency = GBP }
m3 = Money { amount = 3.0, currency = PLN }
In Haskell Semigroup typeclass has <>
infix operator. Now we can simply use it.
[Haskell]
m1 <> m2 <> m3
(m1 <> m2) <> m3
m1 <> (m2 <> m3)
in comparison to Elm code:
[Elm]
addMoney (addMoney m1 m2) m3
addMoney m1 (addMoney m2 m3)
Difference is significant. As I said before, Haskell is a heaven for definition of own algebras. They look and feel as addition and multiplication on numbers, and because we made our operation associative, it also behaves like them.
Exercise for the reader take a look at the code you lately wrote, do you see semigroups there?
What next in the series
Great. We know what is Semigroup in next article we will go one step further into Monoid.
If you are interested in notifications about next articles please follow me on dev.to and twitter.
Top comments (2)
Really good series! Thanks ❤️
Thank you. I will try to write more in the series, but other activities blocked me a little. Also there is not such many people who are reading it sadly.