DEV Community

Laurent Dami

Beautiful Perl feature: low-precedence boolean operators 'and', 'or'

Beautiful Perl series

This post is part of the beautiful Perl features series. See the introduction post for general explanations about the series.

Today's feature is rare among programming languages: Perl has two different syntaxes for expressing the same boolean operations. Read on to understand why this is clever and beautiful.

Boolean algebra and short-circuit evaluation

The basic operations of Boolean algebra are fundamental in programming; therefore they are present in all programming languages. In the C language, later followed by many other languages, including Perl, logical conjunction (the "AND" operator) is written &&, logical disjunction (the "OR" operator) is written ||, and logical negation (the "NOT" operator) is written !. Conjunction has higher precedence than disjunction, so a && b || c && d is parsed as (a && b) || (c && d) and not a && (b || c) && d. The unary negation operator has even higher precedence.

Mathematically speaking, both AND and OR operators are commutative: the left and right operands can be switched without affecting the result. But in programming languages the order of operands does matter, because operands have an evaluation cost; therefore the system usually spares the cost of evaluating the right operand whenever it has enough information from the left operand to compute the result of the boolean operation: this is called short-circuit evaluation. For example in the conjunction $x < 9999 && is_prime($x), if $x is a large number, the left condition is false, so the whole expression is necessarily false and there is no need to do an expensive computation to decide if the number is prime. Similarly, when the left operand of a disjunction is true, then the whole expression is true and the right operand need not be evaluated.
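
The effect can be observed with a small sketch built around the conjunction mentioned above. The is_prime function here is a hypothetical, naive implementation (it is not taken from any library), instrumented with a counter so that we can see how often it is really called:

```perl
use strict;
use warnings;

my $calls = 0;

# Hypothetical naive primality test (trial division),
# instrumented to count how many times it is invoked.
sub is_prime {
    my ($n) = @_;
    $calls++;
    return 0 if $n < 2;
    for (my $d = 2; $d * $d <= $n; $d++) {
        return 0 if $n % $d == 0;
    }
    return 1;
}

for my $x (7, 9998, 1_000_003) {
    # is_prime() is only evaluated when the left operand is true
    my $verdict = ($x < 9999 && is_prime($x)) ? "small prime" : "not a small prime";
    print "$x: $verdict\n";
}

print "is_prime was called $calls time(s)\n";   # 2 calls for 3 numbers
```

For the third number the left operand is false, so is_prime is never called for it.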

Short-circuit evaluation is not only a question of performance: it also has important consequences when operand expressions have side-effects, like changing the value of a global variable or interacting with files, sockets or other elements of the operating system. Because of short-circuiting, the value of the left operand determines if the side-effects of the right operand will happen or not. In languages of C lineage, many common idioms take advantage of this feature for avoiding runtime exceptions: conditions checked in the left operand must be met before operations in the right operand are attempted. For example the left condition might be to check if a file is present before attempting to open it, or check if a number is different from zero before using it as denominator in a division. Here is a C example borrowed from Wikipedia:

#include <ctype.h>    // isalpha
#include <stdbool.h>  // bool
#include <stddef.h>   // NULL

bool isFirstCharValidAlpha(const char* p) {
    // 1) no unneeded isalpha() execution with p == NULL,
    // 2) no SEGFAULT risk
    return p != NULL && isalpha((unsigned char)p[0]);
}

Perl expressions are also statements

In languages other than Perl, examples of boolean expressions where operands have side-effects are not frequent, apart from the family of idioms just mentioned. Expressions with side-effects may even be considered inadvisable. Perl is different, because expressions are also statements, as explained in perlsyn:

The only kind of simple statement is an expression evaluated for its side-effects.

Therefore a boolean combination of expressions is also a statement. Likewise, instructions in real life may also be combinations of more atomic instructions:

Add four spoons of sugar and dissolve it into the preparation, or replace it by honey.

The above sentence indeed can be seen as a boolean expression with short-circuit: either you add sugar and then you must dissolve it and there is no need for honey, or you don't use sugar, so there is no need to dissolve anything, and you just have to incorporate the honey.

This way of expressing composite instructions is used every day in natural language; therefore, expressing programming instructions in a similar way helps to make them easily understandable by humans, often better than sequences of if-then-else constructs, at least when the composite instruction only contains two or three individual clauses. Most programming languages would require the individual clauses to be enclosed in parentheses, which impedes readability.
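
As a rough sketch, the recipe sentence could be transcribed as follows. The three subroutines and the $have_sugar flag are hypothetical, invented for this illustration; each subroutine returns true when its step succeeded:

```perl
use strict;
use warnings;

my @steps;            # records what was actually done
my $have_sugar = 0;   # hypothetical pantry state: no sugar today

sub add_sugar { return 0 unless $have_sugar; push @steps, 'add sugar'; return 1 }
sub dissolve  { push @steps, 'dissolve sugar'; return 1 }
sub add_honey { push @steps, 'add honey';      return 1 }

# "Add sugar and dissolve it, or replace it by honey."
add_sugar() and dissolve()
  or add_honey();

print join(', ', @steps), "\n";   # prints "add honey"
```

One caveat of this transcription: the or branch also runs when the first step succeeds but the second one fails, so this form is only safe when the middle clause reliably returns a true value.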

Boolean expressions in Perl: two sets of operators, same semantics, different precedence

Perl's brilliant solution for improving the readability of composite instructions is to offer another set of boolean operators, written as the English words and, or, not, with exactly the same semantics as the usual &&, || and !, but with lower precedence. Actually, these word versions of boolean operators have the lowest precedence of all in Perl's table of operators, which means that statements between such keywords will be nicely parsed as individual clauses. Let us look at a simple example:

say "tic", "tac" and say "toe";

The intended result is two lines of output

tictac
toe

and it indeed works that way. The first say is a list operator: looking to its right, it has very low precedence, so the commas that follow build a list that will be fed as arguments to say. Only the word operators not, and, or (and xor) rank even lower, so that list stops at the keyword and, and the statement is equivalent to the following parenthesized version:

say("tic", "tac") and say("toe");

Now since we are using parentheses, we might as well write

say("tic", "tac") && say("toe");

with exactly the same result. By contrast, the && version without parentheses

say "tic", "tac" && say "toe"

produces a different output:

toe
tic1

because everything after the first say was treated as an argument list to that say, so an equivalent parenthesized version would be:

say("tic", ("tac" && say("toe")));

The expression "tac" && say "toe" needs to be evaluated first, which is why the output has toe on the first line. Since the left operand "tac" is true, the boolean && returns the value of its right operand say("toe"), which is 1 because say returns true on success. Then the outer say can do its job, and since it received two arguments "tic" and 1, it prints tic1 on the second line.
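
When in doubt about how Perl groups an expression, one possible check is to ask the core module B::Deparse to print the code back with full parenthesization (its -p option). The sketch below uses print instead of say and variables instead of string literals; these are choices made for this illustration only, to keep the deparsed text free from feature pragmas and constant folding. The anonymous subroutines are never called, they are only inspected:

```perl
use strict;
use warnings;
use B::Deparse;

my ($x, $y, $z) = ('tic', 'tac', 'toe');

# The -p option asks B::Deparse to add parentheses everywhere,
# which makes the grouping chosen by the parser explicit.
my $deparse = B::Deparse->new('-p');

my $low  = $deparse->coderef2text(sub { print $x, $y and print $z });
my $high = $deparse->coderef2text(sub { print $x, $y && print $z });

print "with 'and': $low\n";
print "with '&&':  $high\n";
```

The exact layout of the deparsed text may vary between Perl versions, but the parentheses show which operands belong to which print.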

This example with say was good for explanations, but is unlikely to be written like this in real programs. More realistic situations involve other list operators, like this very common idiom with open:

open my $fh, '<', $filename
  or die "failed to open '$filename': $!";

Here is another example with mkdir; this also shows that of course several successive boolean connectors can be combined:

-d $dir                                # does this directory exist ?
  or mkdir $dir                        # if not, try to create it
  or die "failed to mkdir '$dir': $!"; # if the creation failed, report the error
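
To see why the low-precedence or matters in the open idiom, here is a small sketch comparing it with ||. The path is hypothetical and is simply assumed not to exist; eval blocks are used to observe whether die was triggered:

```perl
use strict;
use warnings;

# Hypothetical path, assumed not to exist on this machine
my $filename = '/nonexistent/dir/no-such-file.txt';

# Low-precedence 'or': open(...) is evaluated first, die fires on failure.
my $low_ok = eval {
    open my $fh, '<', $filename
      or die "failed to open\n";
    1;
};

# High-precedence '||' binds to $filename alone, so this is parsed as
# open(my $fh, '<', ($filename || die ...)): the failure goes unnoticed.
my $high_ok = eval {
    open my $fh, '<', $filename || die "failed to open\n";
    1;
};

print "with 'or': ", ($low_ok  ? "no error reported" : "died"), "\n";
print "with '||': ", ($high_ok ? "no error reported" : "died"), "\n";
```

With ||, the die clause could only fire if $filename itself were false, which is almost never what was intended.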

Assignment operations

Examples so far were mainly with the or operator. Usage of and is more often combined with assignment operations: a variable takes the value of an expression, and if that value is not empty, then a follow-up operation is performed.

$n_items = $aref && @$aref
  and say "this array contains $n_items items"
  or  say "this array is empty";

This example assumes a variable $aref to be a reference to an array. The first line checks if the reference is indeed defined, and if so, if it refers to a non-empty array. Depending on the result, the appropriate follow-up message is printed.

For this example to work, the $n_items variable must already be declared earlier in the code. Unfortunately it wouldn't work to write my $n_items = ... here, because a my declaration only takes effect from the next statement onwards; therefore, at the point where the string "this array contains $n_items items" is compiled, variable $n_items would not be declared yet, and under use strict the program would be rejected with a compile-time error.
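
This scoping rule can be verified with a short experiment. In the sketch below the offending statement is compiled from a string, so that the error can be captured in $@ instead of aborting the script; the variable names are chosen for this illustration only:

```perl
use strict;
use warnings;

my @items = (1, 2, 3);

# Compile the statement from a string, so that the compile-time error
# can be captured in $@ instead of aborting the whole script.
my $compiled = eval q{
    my $count = @items
      and print "this array contains $count items\n";
    1;
};
my $error = $@;
print "declaration inside the statement: ",
      ($compiled ? "accepted\n" : "rejected: $error");

# Declaring the variable in an earlier statement works as expected.
my $n_items;
$n_items = @items
  and print "this array contains $n_items items\n";
```

The first attempt is rejected with a message about an undeclared variable, while the second one prints the expected sentence.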

The $n_items excerpt above is a single expression containing both && and and. Perl's table of operator precedence has been carefully crafted so that the assignment operator = (and its derivatives +=, *=, etc.) has lower precedence than boolean operators in their symbolic syntax &&, || or !, but higher precedence than those operators in their English syntax and, or, not. Therefore the value assigned to $n_items is $aref && @$aref, including the && operator, but it does not include what comes after the and.
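
The position of = between the two families of operators can be illustrated with a minimal sketch; the variable names are invented for this example:

```perl
use strict;
use warnings;

my ($first, $second) = (0, 5);
my @log;

# '||' binds tighter than '=': parsed as $symbolic = ($first || $second)
my $symbolic = $first || $second;

# 'or' binds looser than '=': parsed as ($word = $first) or push(...)
my $word;
$word = $first
  or push @log, "assignment yielded a false value";

print "symbolic=$symbolic word=$word log entries=", scalar(@log), "\n";
# prints "symbolic=5 word=0 log entries=1"
```

In the first statement the fallback value becomes part of what is assigned; in the second one the assignment is completed first, and the right-hand clause is only a follow-up action.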

The result of a boolean operator is the last operand

The example of the previous section assigns a boolean expression to a variable $n_items whose name indicates that it is not meant to receive a constant true or false but an integer number! This may be surprising for readers accustomed to statically typed languages like Java or C++, where the result of a boolean expression can only take one of two boolean values.

By contrast, boolean expressions in Perl, as in several other dynamically typed languages (JavaScript, Python, Ruby, etc.), return the last evaluated value. That value therefore can have a dual purpose: not only can it be tested for truthiness (where any value different from undef, 0, "0" or "" means "true"), but it can also be used for its regular scalar value, as a number, string or reference. Many common idioms in Perl and other similar languages take advantage of this feature for writing compact expressions. A more verbose but more explicit version of the assignment above would be:

$n_items = defined $aref ? @$aref > 0 ? @$aref
                                      : 0
                         :              undef;
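
Here is a short sketch of the values actually returned by || and && in a few typical cases; the %config hash and the variable names are hypothetical:

```perl
use strict;
use warnings;

my %config = (name => '', retries => 0);   # hypothetical settings

# '||' returns its right operand when the left one is false
my $name    = $config{name}    || 'anonymous';
my $retries = $config{retries} || 3;

my $full  = [10, 20, 30];
my $empty = [];
my $missing;                               # undef: no reference at all

my $n_full    = $full    && @$full;     # 3: the array in scalar context yields its size
my $n_empty   = $empty   && @$empty;    # 0: the reference is true, the size is 0
my $n_missing = $missing && @$missing;  # undef: left operand returned, right one never evaluated

printf "%s %s %s %s %s\n",
       $name, $retries, $n_full, $n_empty, $n_missing // 'undef';
# prints "anonymous 3 3 0 undef"
```

Note that the last assignment is safe even under use strict, because short-circuiting prevents the attempt to dereference an undefined value.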

Real example

To conclude with a real example, here is an excerpt from my module Data::Domain with quite heavy use of idiomatic and / or constructs:

sub _inspect {
  my ($self, $data) = @_;

  looks_like_number($data)
    or return $self->msg(INVALID => $data);

  if (defined $self->{-min}) {
    $data >= $self->{-min}
      or return $self->msg(TOO_SMALL => $self->{-min});
  }
  if (defined $self->{-max}) {
    $data <= $self->{-max}
      or return $self->msg(TOO_BIG => $self->{-max});
  }
  if (defined $self->{-not_in}) {
    grep {$data == $_} @{$self->{-not_in}}
      and return $self->msg(EXCLUSION_SET => $data);
  }

  return;
}

As always in Perl, there are several ways to do it. Here the stylistic choice was to use if statements for checking the main conditions, but and / or expressions for secondary checks. Another way to write the condition on -min could be:

  defined $self->{-min}
    and $data < $self->{-min}
    and return $self->msg(TOO_SMALL => $self->{-min});

or perhaps

  return $self->msg(TOO_SMALL => $self->{-min})
    if defined $self->{-min} and $data < $self->{-min};
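
To experiment with this style outside of Data::Domain, here is a self-contained sketch. The function check_number, its min and max options and the message strings are hypothetical simplifications written for this post; they are not the module's API:

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical stand-alone checker: returns an error string,
# or nothing (undef in scalar context) when the value is acceptable.
sub check_number {
    my ($data, %opt) = @_;

    looks_like_number($data)
      or return "INVALID: $data";

    defined $opt{min}
      and $data < $opt{min}
      and return "TOO_SMALL: minimum is $opt{min}";

    defined $opt{max}
      and $data > $opt{max}
      and return "TOO_BIG: maximum is $opt{max}";

    return;
}

for my $value (5, 15, 50, 'abc') {
    my $error = check_number($value, min => 10, max => 20);
    print "$value: ", ($error // 'ok'), "\n";
}
```

Each clause reads like a sentence: the condition comes first, and the early return only happens when all conditions chained by and are true.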

Low precedence and/or in other programming languages

Apart from Perl, dual sets of boolean operators with different precedence levels are mainly found in Ruby and PHP, two programming languages that were heavily influenced by Perl. Apparently their designers considered that this was a good idea! By contrast, a language like Python, whose guiding principle is that there should be one, and preferably only one, obvious way to do it, did not adopt such a design.

Apart from stylistic preferences, the reason why low-precedence boolean operators in Perl are so effective is that they combine especially well with the other features of the language mentioned above, like the fact that expressions are also instructions, and the fact that boolean expressions return the last evaluated value.

Conclusion

Perl's invention of a double set of operators for the same purpose was a bold step, clearly on a different path than usual endeavors in programming language design that often seek to minimize the set of operators, with features as orthogonal as possible.

Yet Perl's choice integrates particularly well with other technical features of the language and with its general philosophy. Algorithmic steps can be written in a way that closely reflects natural language and natural thinking, without heavy syntactic if-then-else constructs and without verbose parentheses.

About the cover picture

Since the seminal Camel Book, Perl has been associated with the image of a camel (actually a dromedary). Here we have a camel carrying a performing organist ... a nice illustration for a beautiful Perl feature. This woodcut print is part of a 16th-century Triumphal Procession series commissioned by the Holy Roman Emperor Maximilian I. The image is borrowed from the Deutsche Fotothek.
