Beautiful Perl feature: trailing commas

#perl #programming #beautifulperl

Beautiful Perl series

This post is part of the beautiful Perl features series.
See the introduction post for general explanations about the series.

The last two posts, about lexical scoping and dynamic scoping, addressed deep subjects and were therefore very long; this time we'll discuss trailing commas, a much lighter subject ... that nevertheless deserves detailed comments!

Trailing commas, basic principle

Programming languages that support trailing commas are able to parse a list like

(1, 2, 3, 5, 8,)

without generating an error. In other words, it is legal to put a comma after the last item in the list; that comma is a no-op, so the list above is equivalent to (1, 2, 3, 5, 8). Of course the trailing comma is optional, not mandatory.

On a single line, as in that example, the trailing comma looks pointless; its value becomes apparent when the list is written over several lines:

my @fruits = (
              "apples",
              "oranges",
              "bananas",
             );

Here the trailing comma facilitates later changes to the code, should these become necessary. In this example if we want to comment out bananas, or to switch the order of the fruits, we can operate on single lines without having to care about which fruit comes last.
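As a minimal sketch of such an edit (the variable names @before and @after are made up for this illustration), the list below is shown once in its original form and once after commenting out bananas and swapping the two remaining lines; no other line had to be touched:

```perl
use strict;
use warnings;

# Original list: one item per line, every line ending with a comma.
my @before = (
              "apples",
              "oranges",
              "bananas",
             );

# Same list after the edit: "bananas" is commented out and the two
# remaining lines are swapped. The commas never had to be adjusted.
my @after = (
             "oranges",
             "apples",
             # "bananas",
            );

print scalar(@before), " fruits before, ", scalar(@after), " after\n";
# prints "3 fruits before, 2 after"
```

Without the trailing comma, commenting out the last line would leave the list ending on a comma anyway, so Perl would still accept it; in languages that reject trailing commas, the same edit would require fixing the previous line as well.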

This feature transposes a principle already familiar from blocks in Perl, and in other languages where the semicolon separates statements rather than terminating them: the last statement in a block may end with a semicolon or not, as one prefers:

if ($is_simple) {do_simple_stuff()} # no semicolon
else            {initialize();
                 process();
                 cleanup();         # with semicolon
                }

Other arguments are commonly put forward in favor of trailing commas, for instance that diffs in version control systems are cleaner, or that code is easier to generate programmatically. Going into such arguments would take too much space here, but discussions on the matter can easily be found on the Internet.

On the other hand, some people object to trailing commas, arguing that enumerations in natural language never end with a comma, but rather with a full stop or another punctuation mark that closes the list. This is certainly true, but a sentence in natural language, once uttered, is not rewritten, whereas program code is rewritten quite frequently; that is where trailing commas prove useful.

History and comparison with other languages

Long ago the venerable ANSI C language already accepted trailing commas in array initializers; the same thing for enums was added in C99. Both features propagated into languages of C heritage, like C++, Java, PHP, etc., and of course Perl. But Perl went further: since the beginning (version 1.0 in 1987), Perl has supported trailing commas in all kinds of lists, including arguments to subroutines, assignments to lists of variables and, more recently, subroutine signatures:

my ($first,
    $second,
    @others,
   ) = @ARGV;

draw_diagram(Perl   => 1987,
             Python => 1991,
             Java   => 1995,
             );

sub transfer_content($source, 
                     $destination,
                     %options,
                    ) { ... }
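The three snippets above can be turned into a small runnable sketch. The bodies of draw_diagram and transfer_content are hypothetical, invented only so that the calls return something observable, and the argument values are made up as well; the sketch assumes Perl 5.20 or later for signatures:

```perl
use strict;
use warnings;
use feature 'signatures';                # needs Perl 5.20 or later
no warnings 'experimental::signatures';  # only matters before Perl 5.36

# Hypothetical body: returns the language names sorted by year.
sub draw_diagram(%year_of) {
    return sort { $year_of{$a} <=> $year_of{$b} } keys %year_of;
}

# Hypothetical body: just describes what would be transferred.
sub transfer_content($source,
                     $destination,
                     %options,
                    ) {
    my $mode = $options{mode} // "copy";
    return "$mode $source to $destination";
}

# Trailing comma in an assignment to a list of variables
my ($first,
    $second,
    @others,
   ) = ("a", "b", "c", "d");

# Trailing comma in subroutine arguments
my @timeline = draw_diagram(Perl   => 1987,
                            Python => 1991,
                            Java   => 1995,
                           );

my $summary = transfer_content("in.txt",
                               "out.txt",
                               mode => "move",
                              );

print "@timeline\n";   # prints "Perl Python Java"
print "$summary\n";    # prints "move in.txt to out.txt"
```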

I strongly suspect that this early design decision in Perl influenced the later design of other programming languages, although I couldn't find any evidence to prove it; influences are rarely recorded in design documents! Here is a historical picture:

Python has had trailing commas since the beginning (1991), but with some peculiarities that will be discussed below;
Java has accepted trailing commas since the beginning (1995), but only in array initializers and enums, like in C; a request to extend this to other lists was formulated in 2006 but was ignored;
JavaScript has accepted trailing commas in array literals since the beginning (1995); support in object literals was added in ES5 (2009), and support in function definitions and calls in ES2017. The global picture is well documented in the MDN documentation, and lots of examples are shown in a blog post by LogRocket. However JavaScript has a strange edge case with array literals, which will be discussed below. Furthermore, beware that JSON, which is not a programming language but a data interchange format closely related to JavaScript, does not support trailing commas;
C++ inherited from C a restricted use of trailing commas; a recent proposal (2025) to extend the support to more general use cases is still pending;
Kotlin added general support for trailing commas in version 1.4.0 (2020);
PHP extended its support for trailing commas in version 8.0 (2020).

So nowadays there seems to be a clear tendency towards adoption of trailing commas in many major languages, and some languages are still working on extending their support in this area.

Edge cases in other languages

Trailing commas in Perl are true no-ops, in every context. Furthermore, repeated commas in the middle of a list, or several commas at the end, are also allowed, without any semantic consequence; this is not very useful, but has the nice property of requiring absolutely no reasoning from the programmer, for example when large chunks of code are reshuffled in the course of a refactoring operation. By contrast, Python and JavaScript have edge cases, as shown in the rest of this section.
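The behavior on the Perl side can be checked with a short sketch; the variable names are made up for this illustration:

```perl
use strict;
use warnings;

my @plain   = (1, 2, 3);
my @trailed = (1, 2, 3,);      # one trailing comma
my @doubled = (1, 2, , 3,,);   # extra commas in the middle and at the end

# All three lists hold exactly the same three elements.
print scalar(@plain), " ", scalar(@trailed), " ", scalar(@doubled), "\n";
# prints "3 3 3"

# The same holds for key/value lists and for single-element lists.
my %year_of = (Perl => 1987,, Python => 1991,,);
my @single  = (123,);
print scalar(keys %year_of), " keys, ", scalar(@single), " element\n";
# prints "2 keys, 1 element"
```

Note that only commas after an element are ignored in this way; a comma at the very start of a list, as in (, 1), is a syntax error.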

Python: the tuple exception

In Python the list expression [1, 2, 3,] is legal, but the expressions [1, 2, , 3,] and [1, 2, 3,,] are not: in other words, repeated commas in the middle of a list or multiple trailing commas are not allowed. This is just a minor syntactic restriction, not very annoying.

A more severe peculiarity is the "tuple exception", namely the fact that these two expressions are semantically different:

(123)    # single value
(123,)   # tuple with one member

Actually, the syntax with the trailing comma is the only literal syntax for a singleton tuple. Another situation, related to the first, comes from the fact that tuples in some contexts are written without parentheses; so these two lines are again semantically different:

x = 123  # assign a scalar value to x
x = 123, # assign a singleton tuple to x

whereas with tuples of more than one element, the trailing comma is a true no-op:

x = 1, 2, 3  # assign a triple to x
x = 1, 2, 3, # same thing

Furthermore, in singleton lists (as opposed to singleton tuples), trailing commas are also a no-op:

y = [1]      # assign a singleton list to y
y = [1,]     # same thing

So when trailing commas appear in Python code, some thought is required to get the proper meaning.

JavaScript: the sparse array exception

Now let us consider trailing commas in JavaScript. Within object literals, they are true no-ops:

{x:1, y:2, }   // equivalent to {x:1, y:2}

but, as in Python, it is not possible to write repeated commas in the middle or several trailing commas:

{x:1, , y:2, } // Uncaught SyntaxError: Unexpected token ','
{x:1, y:2,, }  // same error

The situation with array literals is quite different: there the syntax admits intermediate commas and several trailing commas, but semantically they occupy slots in the array:

[1, , 2, 3,,,] // [ 1, <1 empty item>, 2, 3, <2 empty items> ]

This example is an array of length 6, but with only 3 occupied slots - something that is called a sparse array in JavaScript parlance. Empty slots in the array are not equivalent to undefined; their meaning depends on the operation applied to the array, as explained in this MDN document.

As in Python, when trailing commas appear in JavaScript code, some thought is required to get the proper meaning.

Wrapping up

Trailing commas are a relatively minor topic, but it is interesting to observe that Perl made wise decisions about this feature right from its initial design. Trailing commas are treated consistently in all Perl programming constructs, without any surprises for the programmer. In comparison to C, Perl widened the contexts where trailing commas are admitted, and this was probably an inspiration to many other programming languages. Isn't that beautiful?

About the cover picture

The cover picture shows the initial bars of the fugue BWV 885 by Johann Sebastian Bach, in a manuscript which is not in the hand of the composer but was transmitted to us by his son-in-law Johann Christoph Altnickol. The theme of that fugue is very recognizable, as it repeats the same note seven times before moving to something else, so it reminded me of repeated commas! But this is just a wink; in reality, the illustration is absolutely not relevant to the main theme of this article, because the repetition of notes in the fugue theme is by no means a no-op: on the contrary, it is a strong reinforcement. On a violin a similar effect could be obtained by performing a crescendo on a long note; but on a harpsichord or an organ a note cannot be changed after the initial attack, so repetition is another device to convey the idea of reinforcement.