loading...
Cover image for Catch the flag!

Catch the flag!

belinde profile image Franco Traversaro ・7 min read

Have you ever seen configurations written as a concatenation of constants? Something like error_reporting = E_ERROR|E_CORE_ERROR: what kind of sorcery is this? It turns out this is a real common (and smart) way to pack a lot of boolean flags into a single value.

So let's make something that use binary flags. I'll use PHP, but the concepts are applicable to any language with support of boolean bitwise operators. They're quite common, you'll admit. So, let's assume that we have a function that set the table when we invite our friends home for dinner. Sometimes we do a very special dinner with appetizer, first course, second course and dessert (I'm from Italy, so it's not uncommon), sometimes just a first course and dessert, sometimes any other permutation. So we do something like this:

function setTheTable(
    bool $appetizer = false,
    bool $firstCourse = false,
    bool $secondCourse = false,
    bool $dessert = false
): void
{
    if ($appetizer) {
        // do something
    }
    if ($firstCourse) {
        // do something
    }
    // here the others checks
}

Well, that's really ugly. If we need just dessert we must call an embarassing setTheTable(false, false, false, true). If we need to add support for a welcome drink AND coffee AND liqueur (Italy, remember?) we must add 3 others variables, all of them enqueued to the previous ones because we want backward compatibility... Really, a terrible solution. It could be better if we use a configuration structure, in PHP arrays are the best solution:

function setTheTable(array $conf = []): void
{
    if ($conf['appetizer'] ?? false) {
        // do something
    }
    if ($conf['firstCourse'] ?? false) {
        // do something
    }
    // here the others checks
}

That's better, but still kinda sucks. Now we depend on a bunch of custom strings: that's ok if the function is the only one that use them, but if some other parts of the project depends on them you can see the problems: string mispelled can bring bugs quite difficult to spot, and the IDE cannot help you. That's the work for constants, or enums, or other similar language constructs, but if we want to use an object instead of an array we could have problems using them. What a mess! We want just something really simple:

function setTheTable(int $conf = 0): void
{
    if (/* some magic here for appetizer */) {
        // do something
    }
    if (/* some magic here for first course */) {
        // do something
    }
    // here the others checks
}

But how can we achieve this?

The most important information you have to know is that a decimal number can be written in binary form. Here's a quick reference with the numbers we'll use in this article; I'll put also a decomposition, later you'll find why:

Decimal Binary Decomposition
1 0001 1
2 0010 2
3 0011 2 + 1
4 0100 4
5 0101 4 + 1
6 0110 4 + 2
7 0111 4 + 2 + 1
8 1000 8

Stick your attention to those number with no decomposition: 1, 2, 4 and 8. They are all powers of 2, and in their binary form we can find just one 1 and all zeroes. Keep in mind this detail: powers of 2 are important.

And now for something completely different whe must analyze what bitwise operators do on binary numbers, because they'll be the key of our flag system. We normally use boolean operators, like if ($a AND $b), that consider the operands as a whole boolean value: so we know that true AND false is false, true OR false is true, true XOR true is false, NOT true is false.

Bitwise operators do the same thing, but at a lower level, bit by bit. The notation is also different. C language has taught us to write & for bitwise AND, | for OR, ^ for XOR and ~ for NOT. So we can try a simple 0001 | 0010: we'll write in column, in order to make clear the idea:

0001 → 1
0010 → 2
0011 → 1|2 = 3

Do you see the logic? The operator has checked the first character of the two operands and applied a normal boolean logic: 0 is false, 1 is true, so false OR false gets false, a.k.a. 0; the second character is the same; the thirdth is false OR true, so true, a.k.a. 1; the fourth is true OR false, so another 1. Pretty simple, isn't it?

Maybe now you'll say "Ok, bitwise OR is an addition, because 1|2=3". Well, definitely no. Try 1|1:

0001 → 1
0001 → 1
0001 → 1|1 = 1

So, bitwise operators sticks to just one character at a time: the "addition" has not a "carry". Let's have another example, with bitwise AND:

0111 → 7
0101 → 5
0101 → 7&5 = 5

That's interesting: all the 1 in 0101 are paired with a 1 in 0111, so the result is another 0101.

We know (or you'll know in a few seconds) that 5 is deconstructed in 4 + 1, that is 0100 + 0001, because in binary numbers each character "gain" a value based on his position, in powers of 2, from right to left; we can write 0101 as 0*8 + 1*4 + 0*2 + 1*1, and if you strip off the 0 multiplication and simplify the *1, you'll get just 4+1. Pretty simple, isn't it? So we can say that "destructuring" is like an inversed bitwise OR, because 1|4 is equal to 0001 | 0100, that is 0101, our 5.

Now, take a deep breath. Try to call 1, 2, 4, and 8 with other names, like appetizer, firstCourse, secondCourse and dessert. So 5, 0101, is:

  • 0 dessert (0*8)
  • 1 secondCourse (1*4)
  • 0 firstCourse (0*2)
  • 1 appetizer (1*1)

They seems like flags, am I right? So we can try this approach (but remember, the values must be all non-zero differents powers of two):

// Constants declaration
define( 'APPETIZER',     1 );
define( 'FIRST_COURSE',  2 );
define( 'SECOND_COURSE', 4 );
define( 'DESSERT',       8 );

// Our beloved function
function setTheTable(int $conf = 0): void
{
    // here the checks
}

// Our business logic
setTheTable(FIRST_COURSE|DESSERT);
setTheTable(SECOND_COURSE);
setTheTable(FIRST_COURSE|DESSERT|APPETIZER|SECOND_COURSE);

That's good! Now we have constants that our IDE can recognize, the order of parameters is ininfluent, we can add whatever other value without breaking compatibility. We have just one limitation: we're using integers, so we have 32 or 64 bits available (the limits depend on language, architecture, seasonal weather, astral conjunctions and so on; RTFM is the only way to know the real limit). But if you're thinking about a function with more than 32 boolean switches you have more than a problem.

But... how can we check for flags? For example, the recurring 5, that we write as APPETIZER|SECOND_COURSE: we need to check if the APPETIZER is enabled or not. In other words, check if in the deconstruction of the configuration parameter we have the value for APPETIZER. That's a bitwise AND operation:

0101 → 5 → APPETIZER|SECOND_COURSE
0001 → 1 → APPETIZER
0001 → 5&1 = 1 → APPETIZER

And if we check for DESSERT? Let's see:

0101 → 5 → APPETIZER|SECOND_COURSE
1000 → 8 → DESSERT
0000 → 5&8 = 0 → we don't have a constant for that, so it's simply a false value

Cool! Now we can fill those "some magic here" comments:

function setTheTable(int $conf = 0): void
{
    if ($conf & APPETIZER) {
        // do something
    }
    if ($conf & FIRST_COURSE) {
        // do something
    }
    // here the others checks
}

Now that's pretty! Simple and concise. And now we can do some cool trick, like check any flag from a selection: we could check if we have appetizer or dessert or both:

if ($conf & (APPETIZER|DESSERT)) {
    // something
}

How does it works? First we calculate APPETIZER|DESSERT: 0001|1000 = 1001, and then we check if $conf has some 1 in common; let's see some examples with different configurations:

Literal $conf $conf Result Literal result
APPETIZER 0001 0001 APPETIZER
APPETIZER⎮FIRST_COURSE 0011 0001 APPETIZER
APPETIZER⎮DESSERT 1001 1001 APPETIZER⎮DESSERT
FIRST_COURSE 0010 0000
FIRST_COURSE⎮SECOND_COURSE 0110 0000

So the result carry the information about which flag was effectively on. So we can also check that ALL the required flags are active:

if ((APPETIZER|DESSERT) === ($conf & (APPETIZER|DESSERT))) {
    // something
}

That's a little verbose, but if you do this check just once it's ok. But what if you do a lot of dinner with just first course and dessert, and a lot of complete dinner with all the components? Well, you can define derived constant!

// Basic constants declaration
define( 'APPETIZER',     1 );
define( 'FIRST_COURSE',  2 );
define( 'SECOND_COURSE', 4 );
define( 'DESSERT',       8 );

// Derived constant declaration
define( 'SMALL_DINNER',  FIRST_COURSE|DESSERT );
define( 'FULL_DINNER',  APPETIZER|FIRST_COURSE|SECOND_COURSE|DESSERT );

// We can use derived constant like any other:

if ($conf & SMALL_DINNER) {
    // we have a first course, or a dessert, or both
}

if (SMALL_DINNER === $conf & SMALL_DINNER) {
    // we surely have a first course AND a dessert
}

That's also convenient because now you can play with negations! Let's see:

setTheTable( FULL_DINNER & ~DESSERT );

How does it works? First let's see what's the value of FULL_DINNER: it's defined as APPETIZER|FIRST_COURSE|SECOND_COURSE|DESSERT, so it's 0001|0010|0100|1000 = 1111. After we do ~DESSERT, and it's ~1000 = 0111. The last step is the AND of those two values: 1111&0111 = 0111. We have effectively removed the dessert from the configuration, what a shame. Now, for just 4 flags it's not so useful, but try to think with a lot of flags, like in PHP, the configuration of error reporting: E_ALL & ~E_NOTICE is waaaaay better than listing all the other available levels. And if another level will be added in the remote future, the developers will take care of changing the real value of E_ALL, so your configuration will remain valid. This brings an important suggestion: never, never, never use directly the numerical values, use only the literal constants; numbers can change their meanings, on the other hand constants are, well, constants!

Just a last trick: if you prefer, you can write your constants directly in binary integer notation (if your language of choice supports it). When you'll get used with binary flags it will be useless, but at the beginning can make concepts clearer:

define( 'APPETIZER',     0b0001 ); // 1
define( 'FIRST_COURSE',  0b0010 ); // 2
define( 'SECOND_COURSE', 0b0100 ); // 4
define( 'DESSERT',       0b1000 ); // 8

--

This was just an introduction to binary flags; there are some interesting trick also while editing those values, but I'll think that for this article ça suffit. If these concepts will be usefull please let me know, this was my first article on dev.to and I'm craving for feedback!

Posted on by:

Discussion

markdown guide