DEV Community

Cover image for My Struggle Learning Set Theory as a Programmer
Seth T.T
Seth T.T

Posted on

My Struggle Learning Set Theory as a Programmer

Cover image from Wikimedia Commons

I'm Seth, an undergrad student with an established background in computer science.

This article introduces the bare fundamentals of set theory, then compares mathematical sets with some common data structures and type systems found in computer science.


Sets

Paraphrasing Wikipedia,

A set is a collection of different things,

and

A set contains elements which can be information of any kind: numbers, shapes, people, ...

Okay, my bedroom is a collection of laundry and furniture, so Room = {laptop charger, old sock, soft bed}.

Next,

Sets are uniquely characterized by their elements. Two sets that have precisely the same elements are equal.

This implies that any set containing exactly {laptop charger, old sock, soft bed} is literally my room.

This took me a while to understand! Last thing,

Given two sets, X is a subset of Y if every element of X is also an element of Y.

So, for example, the set X = {1, 3} is a subset of Y = {1, 2, 3} because Y contains every element of X.

Now, let's see how pure mathematical sets compare with real-world computer data structures.

Data Structures

Here's a few data structures that built my preconceptions about mathematical sets:

  • Array<T>
  • Set<T>

Being familiar with these data structures helped me tremendously while learning set theory, but later came to hinder me in some interesting ways.

Computers are bound to certain time- and space- constraints that affect how data structures behave. In conceptual math, there is no "computer" moving information around; the information just.. exists, hanging in a timeless void.

Arrays

An array may contain duplicate information, unlike a mathematical set. Otherwise, my intuition about arrays helped me a lot when learning set theory.

// :(
const bad: Array<number> = [1, 1, 1, 1];
Enter fullscreen mode Exit fullscreen mode

Sets

Set (data structure) introduces a guarantee of uniqueness. But, a key distinction is that two mathematical sets containing identical elements are the same object.

// These are identical sets,
// but not the same object.
const A: Set<number> = new Set([1, 2, 3]);
const B: Set<number> = new Set([1, 2, 3]);
Enter fullscreen mode Exit fullscreen mode

Consider that both A and B have unique memory addresses, effectively duplicating the contents of the set. Mathematical sets don't allow such duplication.

Armed with a proper intuition about how sets might behave on a computer, let's look at type systems.

Type Systems

Consider the following code:

const A: Array<number> = [1, 2, 3];
Enter fullscreen mode Exit fullscreen mode

Here, number is a mathematical set containing all possible numeric values, and A is a mathematical set containing {1, 2, 3}. Therefore, A is a subset of number.

This nicely demonstrates why programmers like using strongly-typed programming languages.

A strongly typed programming language's compiler will strongly enforce that all of your variables are subsets of their type.

So, you couldn't have code like this:

const A: Array<string> = [1, 2, 3];
Enter fullscreen mode Exit fullscreen mode

because neither 1, 2 nor 3 exist in the set of all possible strings.

But wait a minute! Both number and string are sets, but they don't consume memory in the way that [1, 2, 3] does! This property of "tangibility" is a key aspect of modern computer science, but mathematics has no such delineation between compile-time and run-time.

While learning set theory, adapting this intuition was by far my biggest struggle.

The Edge of Space

Imagine the x-y coordinate plane (formally, a Cartesian product of the real numbers, ℝxℝ).

x y plane

The coordinate plane does not exist on some background-space. It isn't contained by anything. Rather, the coordinate plane just.. exists.

Computers, on the other hand, have a memory-space that contains everything. Absurdly, every variable in every running program is a subset of that memory-space!

Contrast this with types, like number, string, and Dog. These types are much more like the x-y coordinate plane: they don't really exist anywhere, rather, they define a space on which everything else is allowed to exist.

Such was my struggle when learning set theory from a CS-background. I came from a place where data exists somewhere, consumes space, and takes time to create. Pure math is timeless, beyond space, and honestly it fucking scares me. I'll stick with code, please!

Addendum: Code Example

I was building a little example while writing this article, and I decided to include it for fun.

Given two sets, A and B:

// A is a set
const A: Set<number> = new Set([1, 3]);
// B is a set
const B: Set<number> = new Set([1, 2, 3]);
Enter fullscreen mode Exit fullscreen mode

we can define the subset function like so:

/**
 * @typeParam X - A type, like number or string
 * @typeParam Y - A type, like number or string
 * @param a - Set containing values of type X
 * @param b - Set containing values of type Y
 * @return true if a is a subset of b
 **/
function subset<X, Y>(a: Set<X>, b: Set<Y>): boolean {
    return a.filter(x => b.includes(x)).length === a.length;
}

subset(A, B); // true
Enter fullscreen mode Exit fullscreen mode

Here, we see that X and Y are types. That makes them intangible sets (loose definition), because they consume no memory and you can't print them. Rather, X and Y define the possibility space of values which a and b are allowed to hold.

Further, we see that a and b are variables. That makes them tangible sets (by contrast), because they consume memory and can be interacted with. The type system guarantees that a is a subset of X, and b is a subset of Y, but it cannot guarantee that a is a subset of b. This computation, sadly, must be done at run-time.


That's all I've got.

Find me on github

Top comments (0)