dalerank

Posted on Mar 25

Playfull programming (0x1)

#generalizations #values #objects

Generalizations in C++

Imagine for a moment that C++ is not a set of odd keywords and linker errors, but simply another way to talk about the world around us: people, numbers, colors, events, and cats. We are used to thinking of programming as something purely technical, where it matters to memorize syntax, place semicolons, and "guess" what the compiler wants right now.

But if we ask ourselves "what does a program actually operate on?", it suddenly turns out that behind all those int, struct, and template there are rather simple and clear ideas: things, their properties, groups of similar things, and rules by which some things turn into others.

And when trying to explain what objects, types, and other fundamental concepts of computer science are, one inevitably has to go beyond purely technical language and speak of more general categories of ideas that humanity has been working with for thousands of years—and it is here that the words "entity", "kind", and "genus" become useful.

Abstract and Concrete Entities
When philosophers and logicians speak of abstract entities, they mean individual things that do not exist in space and time the way a table, a person, or a computer do, but as something unchanging: for example, the number 13 or the color blue as such were not born at some moment and do not "die" after some time; they are not objects of the physical world but ideas we work with in our heads and in mathematics.

A concrete entity, by contrast, is always tied to history, to a moment of appearance and a moment of disappearance: Socrates was once born and once died; any country as a political entity was created on a certain date, and although the country continues to exist, it is clear that at some point it will no longer exist or may change radically.

Attributes and Identity
At the same time, we can speak not only of entities themselves but also of their attributes—the correspondence between a concrete entity and an abstract entity that describes some property: Socrates' eye color can be described as a concrete instance of an abstract color; the number of German states as a concrete value of the abstract natural number.

If at some moment we take a "snapshot" of a concrete entity and fix the full set of its attributes here and now, we see that over time individual attributes may change while identity remains, and it is identity—this very primitive but deep sense of "it is still the same one"—that lets us say that a person, a country, or a program object continues to be itself even as its properties change.

Kinds and Genera
Next arises the need to group entities in some way, and here the notions of kind and genus enter the stage; each of these categories can itself be abstract or concrete. When we speak of an abstract kind, we describe the common properties of a whole family of abstract entities.

Thus the kind "natural number" covers all individual numbers 0, 1, 2, 3, and so on (zero may or may not be included in the naturals and may be treated as the empty set); the kind "color" covers all possible shades from dark blue to bright orange, and we can reason about them as something general without tying ourselves to a concrete instance.

A concrete kind, by contrast, describes a set of attributes for a family of concrete entities: when we say "human", we mean a concrete kind that includes all humans with a certain set of characteristics such as biology, consciousness, etc.; when we say "continent", we describe a concrete kind that includes Europe, America, or Asia, each with boundaries, population, area, and other attributes. They differ as entities, but within the kind "human" or "continent" we can speak of recurring structural properties and model them in code as objects of one data type.

Functions as Rules
An important role in this picture is played by the notion of function, and here the mathematical definition carries over into programming almost verbatim: a function is a rule that maps some set of abstract entities, called arguments and belonging to certain kinds, to another abstract entity, called the result, belonging possibly to another kind.

A classic example from mathematics is the successor function, which maps each natural number to the next one—the function "n → n + 1" in terms of the natural-number kind; another, more figurative example is a color-blending function that maps two arguments of kind "color" to a third color as the result of mixing them—and if you have ever written code that takes two Color values and returns a new Color, you were in effect implementing such a function.

In programming we live inside this definition: any function call in C++ is the application of a rule to arguments that are concrete "casts" of abstract entities, and the result is a new object that can also be viewed as a concrete embodiment of an abstract entity of a given kind.

// Kind: natural number
using Natural = unsigned int;

// Successor function: n → n + 1
// Rule that maps each Natural to the next Natural
Natural successor(Natural n) {
    return n + 1;
}

Genera as Generalizations of Kinds
If we move up one level of abstraction and speak of genera, they let us talk not only about concrete values or even kinds, but about large classes of notions.

Abstractly, a genus is a way to describe a set of different abstract kinds that are similar in some respect: for example, the genus "number" includes kinds such as "natural number", "integer", "real number", each with its own rules of existence, but we can reason about them as number in general without specifying the kind; the genus "binary operator" includes both arithmetic operations (addition, multiplication) and logical operations (and, or) and bitwise operations, but all of them follow the general scheme "take two arguments and produce a result".

A concrete genus, in turn, describes a set of different concrete kinds similar in some respects: "mammal" describes different concrete kinds such as human, cat, and whale; "biped" describes another classification that includes humans, birds, and perhaps some fictional creatures—and one and the same entity (e.g. Socrates) can be viewed simultaneously as an instance of the kind "human" and as a member of the genera "mammal" and "biped". It is important to understand that any entity belongs to exactly one kind, which defines the rules of its construction or existence, but can belong to many genera, each of which describes only one aspect of its properties—and in programming it is the same: an object of one concrete type can implement many interfaces or concepts, each emphasizing only part of its behavior.

// Genus: template—a rule over kinds
// This is the "genus" level: Pair<T, U> is not a concrete kind
// but a rule that maps two kinds to a new kind
template<typename T, typename U>
struct Pair {
    T first;
    U second;
};


// Function at the genus level: for any kind T
// returns a pair of two elements of that kind
template<typename T>
Pair<T, T> make_pair_of(T a, T b) {
    return {a, b};
}

Types in C++ as Kinds
If we build a bridge from this philosophical picture of the world to our world of compilers and C++, we can see that a type in the language plays the role of both a concrete kind and an abstract kind, depending on the level we look at.

From the program's point of view, a class:

struct Person {
   std::string name;
   int age;
};

defines the kind "person in our model", specifying the set of attributes we chose as essential: name and age; and an object:

Person p{"Socrat", 70};

is a concrete entity—one person in our program with its attribute values at the current moment in time.

From the point of view of type theory in the compiler, "type" is closer to an abstract kind: the compiler works not with a concrete p but with the set of all possible values of type Person; it knows their size, layout, copy and destruction rules, and checks that you apply functions (in our sense—rules) correctly to entities of the corresponding kinds—i.e. that you do not pass a std::string to a function expecting double, and vice versa.

Evolution of Type Representation in Compilers

Historically, different compilers implemented this model differently at the level of internal program representation, but the conceptual picture was always similar: in the compiler's internal data structures there are entities corresponding to abstract kinds and genera—nodes of the type tree, tables of information about classes, functions, and templates; there are entities corresponding to concrete entities—variables, objects, temporaries that appear and disappear during program execution; and there are functions that, in the form of intermediate representation (IR), become a set of rules for transforming one set of values into another.

In early compilers like cfront, many of these things were encoded in the textual C code produced by the translator, and the notions of kind and genus existed only in the author's head and in the design. As GCC, Clang, and MSVC evolved, increasingly sophisticated type systems and checking emerged that formalize these categories and allow, for example, at the LLVM IR level, to speak of a type as a well-defined "kind" of values for which equivalence, compatibility, and validity of operations can be checked.

More information:
boosty: https://boosty.to/dalerank
github: https://github.com/dalerank/playful_programming_cpp

DEV Community

Playfull programming (0x1)

Top comments (0)