DEV Community

Paul J. Lucas
Obscure C99 Array Features

#c

Introduction

C99 introduced a number of new features for arrays. Even though C99 is over 20 years old, you seldom see these new features used in the wild. Because of that, you’re less likely to be familiar with them and so less likely to use them in your own code (but maybe you shouldn’t!). So here’s a tour of those features.

Flexible Array Members

Introduced in C99, the last member of a struct with more than one named member may be a flexible array member, that is, an array of an unspecified size:

struct s {
    size_t n;
    double d[];            // Flexible Array Member
};

Typically, such a struct serves as a “header” for a larger region of memory, perhaps containing a binary file read from disk.

Note that it’s up to your code to somehow remember how big the array is. (This can, of course, be stored in a member that precedes the array in the struct.)

When sizeof is applied to such a struct, it’s as if the array isn’t there — except there may be some additional padding.

While you can have such a struct on the stack, it’s not useful since no space is set aside for the array (hence, accessing the array is undefined behavior). To be useful, such a struct has to be allocated on the heap, which is when the size of the array is specified:

struct s *ps = malloc( sizeof(struct s) + sizeof(double[n]) );

Lastly, note that assignments among such structs do not copy the array (because the compiler has no idea how big it is):

struct s *ps1, *ps2;
// ...
*ps1 = *ps2;               // copies only 'n' member

Incidentally, C++ never adopted flexible array members from C99.
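To make the allocation and copying points concrete, here’s a minimal sketch. The helper names s_new() and s_dup() are hypothetical (they’re not part of any library), and the sketch assumes n is small enough that the size arithmetic doesn’t overflow:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct s {
    size_t n;              // number of elements in d
    double d[];            // flexible array member
};

// Hypothetical helper: allocates a struct s with room for n doubles,
// all set to 0.  Returns NULL if malloc() fails.
struct s* s_new( size_t n ) {
    struct s *const ps = malloc( sizeof(struct s) + n * sizeof(double) );
    if ( ps != NULL ) {
        ps->n = n;
        for ( size_t i = 0; i < n; ++i )
            ps->d[i] = 0.0;
    }
    return ps;
}

// Hypothetical helper: copies both the header and the array, which plain
// struct assignment would not do.  Returns NULL if malloc() fails.
struct s* s_dup( struct s const *src ) {
    assert( src != NULL );
    size_t const size = sizeof(struct s) + src->n * sizeof(double);
    struct s *const dst = malloc( size );
    if ( dst != NULL )
        memcpy( dst, src, size );
    return dst;
}
```

Storing the length in n is what lets s_dup() know how many bytes to copy; the caller is responsible for calling free() on whatever either helper returns.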

Variable Length Arrays

Prior to C99, all arrays had to be declared to be of a fixed length (known at compile-time). Introduced in C99, variable length arrays (VLAs, not to be confused with flexible array members) can be declared to be of a variable length (not known until run-time). For example:

void f( size_t n ) {
    int a[n];

Not only that, but the sizeof operator, historically a compile-time operator, is now sometimes a run-time operator — when its argument is a VLA:

size_t sz = sizeof(a) / sizeof(a[0]); // sz = n

(The first sizeof is evaluated at run-time because its argument is a VLA; the second sizeof is still evaluated at compile-time.)
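As a small sketch of the run-time sizeof, the hypothetical function below declares a VLA and returns its element count. It assumes the compiler supports VLAs and that n is greater than zero and small enough to fit on the stack:

```c
#include <stddef.h>

// Hypothetical helper: returns the element count of a local VLA of length n.
// Assumes n > 0 (a zero-length VLA is undefined behavior) and that n is small.
size_t vla_count( size_t n ) {
    int a[n];                           // VLA: length not known until run-time
    return sizeof(a) / sizeof(a[0]);    // first sizeof is evaluated at run-time
}
```

Calling vla_count(10) yields 10 and vla_count(3) yields 3, even though both calls go through the same sizeof expression.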

Variable Length Arrays Caveat

One serious caveat to VLAs is that, if the length is too big, it will silently overflow the stack. Additionally, unlike malloc() returning NULL upon failure, there's no way to detect when a VLA overflows the stack. Hence, if the size can be “too big,” your code has to guard against it:

#define A_LEN_MAX 1024     // a macro, so an array of this size is not a VLA
// ...

if ( n > A_LEN_MAX ) {
    // Do something else, e.g., return an error?
}
int a[n];

However, if you know that A_LEN_MAX is the maximum safe size, then you might as well just declare a to be of that size and not use a VLA.

Incidentally, you could do small size optimization:

void f( size_t n ) {
    int a[A_LEN_MAX];
    int *const p = n <= A_LEN_MAX ? a : malloc( n * sizeof(*p) );
    // use only 'p' to access array
    if ( p != a )
        free( p );
}

That is, if n isn’t too big, use the (fixed-size) array on the stack; otherwise, use a dynamically sized array on the heap. This has the advantage of saving the calls to malloc() and free() for “small” n yet still works for “large” n. But notice that a VLA is not being used.
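Here’s a fuller sketch of the small size optimization applied to a function that needs scratch space. The median() function and int_cmp() comparator are hypothetical examples, and A_LEN_MAX is assumed to be a macro so that the stack array is a fixed-size array rather than a VLA:

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define A_LEN_MAX 1024     // assumed "safe" stack array length, in elements

// Comparator for qsort(): orders ints ascending without overflow.
static int int_cmp( void const *pi, void const *pj ) {
    int const i = *(int const*)pi, j = *(int const*)pj;
    return (i > j) - (i < j);
}

// Hypothetical example: stores the (upper) median of v[0..n-1] in *result
// without modifying v.  Returns false if n is 0 or malloc() fails.
bool median( int const *v, size_t n, int *result ) {
    if ( n == 0 )
        return false;
    int a[A_LEN_MAX];                   // fixed-size array, not a VLA
    int *const p = n <= A_LEN_MAX ? a : malloc( n * sizeof(*p) );
    if ( p == NULL )
        return false;
    memcpy( p, v, n * sizeof(*p) );     // sort a copy, use only 'p'
    qsort( p, n, sizeof(*p), int_cmp );
    *result = p[ n / 2 ];
    if ( p != a )
        free( p );
    return true;
}
```

Unlike the shorter snippet, this version also checks the result of malloc(), since the “large” path can fail in a way that the stack path can’t report.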

Hence, the moral is: use VLAs only when you don’t know the size at compile time but can guarantee that it won’t be “too big.” However, this is pretty much never true.

In hindsight, VLAs, though they seem convenient at times, are problematic, so much so that C11 made VLAs an optional feature. Incidentally, C++ never adopted VLAs from C99.

Array Syntax for Parameters

As you should be aware, array syntax can be used to declare function parameters, but, as you should also be aware, it’s just syntactic sugar since the compiler rewrites such parameters as pointers:

void f( int a[] );         // int *a

Note that I’m intentionally writing “array syntax for parameters” and not “array parameters” because “array parameters,” despite appearances, simply don’t exist in C.

The only potential benefit of using array syntax for parameters is that it conveys to the human reader that a is presumed to be a pointer to at least one int rather than to exactly one int. However, it’s only a presumption and not a guarantee since you can call such a function with a null pointer:

f( NULL );                 // f’s 'a' will be NULL

Note that adding a size doesn’t help:

void f( int a[10] );       // int *a

While again this might convey to a human reader that a is presumed to be an array of 10 ints, the compiler ignores the size.

Array syntax for parameters in C is a remnant of how pointers are declared in New B (the precursor to C). See The Development of the C Language, Dennis M. Ritchie, April, 1993.

Non-Null Array Syntax Pointers for Parameters

One of the features added in C99 was the ability to declare an “array” function parameter that must not be null and be of a minimum size:

void f( int a[static 10] );

If you pass either NULL or an array that has fewer than 10 ints, the behavior is undefined, though the compiler will typically warn you when it can detect the mistake at compile time.

This marks yet another overloading of the static keyword in C since this static has nothing to do with either linkage or duration. Incidentally, C++ never adopted this syntax.

Also incidentally, C99 did not introduce a parallel way to specify that a pointer parameter must not be null.
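As a minimal sketch of how static in a parameter reads in practice, the hypothetical sum3() below tells callers that it needs a non-null pointer to at least 3 ints:

```c
// Hypothetical example: callers promise a non-null pointer to at least 3 ints.
int sum3( int const a[static 3] ) {     // still just: int const *a
    return a[0] + a[1] + a[2];
}
```

Passing an array of 4 ints is fine since static gives only a minimum; the function simply ignores the extra element.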

Array Syntax for Parameters Caveat

Using array syntax for function parameters can also be dangerous:

void f( int a[10] ) {      // int *a
    for ( size_t i = 0; i < sizeof(a)/sizeof(*a); ++i ) {
        // ...

The intention is to iterate over all the elements of the array, but, despite the sizeof expression being correct for an array, a is, again, not an array, but a pointer; so you’ll get the size of a pointer divided by the size of an int. Fortunately, gcc will warn about this.
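The usual way around this is to pass the length separately and to compute it with sizeof only where the real array is in scope. Here’s a sketch; the names sum() and ARRAY_SIZE are mine, chosen for illustration:

```c
#include <stddef.h>

// Element count of a real array.  Don't apply it to a parameter since a
// parameter declared with array syntax is really a pointer.
#define ARRAY_SIZE(A)  (sizeof(A) / sizeof((A)[0]))

// Hypothetical helper: sums the first n ints to which a points.
int sum( int const a[], size_t n ) {    // int const *a
    int total = 0;
    for ( size_t i = 0; i < n; ++i )
        total += a[i];
    return total;
}
```

At the call site, where v is declared as int v[] = { 1, 2, 3, 4 }, the call sum( v, ARRAY_SIZE(v) ) passes 4 as the length and returns 10.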

Qualified Array Syntax for Parameters

To drive home that parameters declared with array syntax really are pointers, you can change them:

int ra[10];                // real array

void f( int pa[] ) {       // int *pa
    ++ra;                  // error (as expected)
    ++pa;                  // OK (surprisingly)

C99 also added the ability to qualify the rewritten pointer:

void f( int pa[const] ) {  // int *const pa
    ++pa;                  // error now

In addition to const, you can also qualify the pointer with volatile and restrict (CVR).

Note that neither of these:

void f( int const pa[] );  // pointer to const int
void f( const int pa[] );  // same as above

is the same as int pa[const]: a const outside the [] refers to the int and not to pa.

Why doesn’t the compiler convert parameters with array syntax to const pointers? Because const wasn’t a part of C when Ritchie invented it.

Incidentally, C++ never adopted this syntax.
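Of the three qualifiers, restrict is probably the one most likely to be useful here. As a sketch, the hypothetical copy_ints() below uses it to say that the two “arrays” don’t overlap:

```c
#include <stddef.h>

// Hypothetical example: restrict inside the [] qualifies the rewritten
// pointers, so this is the same as:
//
//      void copy_ints( size_t n, int *restrict dst, int const *restrict src );
//
// The caller promises that dst and src don't overlap.
void copy_ints( size_t n, int dst[restrict], int const src[restrict] ) {
    for ( size_t i = 0; i < n; ++i )
        dst[i] = src[i];
}
```

Note that the const in int const src[restrict] is outside the [], so it applies to the ints, while the restrict inside the [] applies to the pointer.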

Variable Length Array Syntax for Parameters

C99 also added the ability to use VLAs for function parameters:

void f( size_t n, int a[n] ) { // int *a

That is, the size of the “array” is given by an integral parameter that precedes it. Note, however, that a is still a pointer. Despite having the size information at run-time, sizeof(a) will still return the size of the pointer. Hence, this “feature” serves only to convey to the human reader that n is the presumed size of the “array.”

However, this “feature” is actually useful for multidimensional arrays. But before we get to that, a quick refresher on multidimensional array syntax for function parameters.

Multidimensional Array Syntax for Parameters

As you should be aware, array syntax can also be used to declare function parameters for multidimensional “arrays”:

void f( int a[10][20] );   // int (*a)[20]

The rule that the compiler converts array syntax for a function parameter into a pointer applies only to the first (left-most) dimension; the remaining dimension(s) keep their “array-ness.” Hence, a is a pointer to a real array of 20 ints.

Note that the parentheses are necessary: without them, it would be an array of 20 pointers to int. FYI: to help decipher cryptic C declarations, you can use cdecl.

Pointers to array don’t often occur in C programs since the name of an array “decays” into a pointer to its first element. In most cases, this is good enough even though the size information is lost. However, a pointer to an array retains the array’s size as part of the type, so assignments between pointers to arrays of different size are warned about:

int (*p3)[3];              // pointer to array 3 of int
int (*p5)[5];              // pointer to array 5 of int
p5 = p3;                   // warning: incompatible pointers

In particular, given:

int a[10];
int *pi = a;               // pointer to int (via decay)
int (*pa)[10] = &a;        // pointer to array 10 of int

both pi and pa point to the same location in memory (here, &a[0]), but a “pointer to array” is an entirely different thing from a pointer that results from array decay. For pi, the compiler “forgets” the size of the array to which it points; for pa, it “remembers” the size.

Part of the reason pointers to array aren’t used much is that it’s clunky to access array elements since you have to dereference the pointer first:

int e1 = (*p3)[1];         // must dereference p3 first

However, you can dereference the pointer once into another pointer then use that pointer:

int *p = *p3;
int e1 = p[1];             // same as: (*p3)[1]
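To show that a pointer to an array really does “remember” the size, here’s a sketch with a hypothetical length_of() function whose parameter is a pointer to an array of exactly 10 ints:

```c
#include <stddef.h>

// Hypothetical helper: the array's length is part of the parameter's type,
// so sizeof(*pa) is the size of the whole array, not of a pointer.
size_t length_of( int (*pa)[10] ) {
    return sizeof(*pa) / sizeof((*pa)[0]);   // compile-time constant: 10
}
```

Given int a[10], you’d call it as length_of( &a ), passing the address of the array rather than letting the array decay; passing the address of an array of a different length draws the incompatible pointer warning described above.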

Multidimensional VLA Parameters

Multidimensional array syntax for parameters can be used for VLAs:

void f( size_t m, size_t n, int a[m][n] ) {

Since the compiler rewrites only the first array dimension as a pointer, the above is really:

void f( size_t m, size_t n, int (*a)[n] ) {

that is, a is a pointer to a VLA of n ints. In this case, the VLA is actually a useful feature since the n allows the compiler to know the length of each row of the array. Additionally, sizeof (the first one below) once again becomes a run-time operator:

size_t sz = sizeof(*a) / sizeof(**a); // sz = n

Unlike VLAs in general, VLAs used for function parameters are safe since the actual arrays passed to the function can be (and often are) normal arrays:

void f( size_t m, size_t n, int a[m][n] ) {
    // ...
}

void g( void ) {
    int a[10][20];
    f( 10, 20, a );
}

There’s no new VLA being created at run-time here, so it can’t overflow the stack.
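Here’s a sketch of a function body that takes advantage of this. The name matrix_sum() is hypothetical; the point is that a[i][j] works with ordinary two-dimensional indexing because the compiler knows each row has n ints:

```c
#include <stddef.h>

// Hypothetical example: sums an m-by-n matrix passed using VLA syntax.
long matrix_sum( size_t m, size_t n, int a[m][n] ) {    // int (*a)[n]
    long total = 0;
    for ( size_t i = 0; i < m; ++i )
        for ( size_t j = 0; j < n; ++j )
            total += a[i][j];           // row stride comes from n
    return total;
}
```

The same function works for a normal array of any dimensions, e.g., an int a[2][3] passed as matrix_sum( 2, 3, a ), as long as the m and n passed match the array’s actual dimensions.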

Multidimensional VLA Parameter Declarations

When declaring (as opposed to defining) functions, C allows you to omit the parameter names. If you do that, however, there’s no name with which to specify the size of a VLA, so C99 added a new syntax for this case:

void f( size_t, size_t, int[][] );    // error
void f( size_t, size_t, int[*][*] );  // OK

That is, you use * to denote a VLA of an unnamed size. Note that since the first array dimension is always converted to a pointer, the * is needed only starting with the second dimension:

void f( size_t, size_t, int[][*] );   // same as previous

Hence, you never need * when using single dimension array syntax.
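As a sketch of how a declaration using * pairs up with its definition, here’s a hypothetical fill() function. The prototype omits the parameter names while the definition, which needs them, names the sizes:

```c
#include <stddef.h>

// Declaration: parameter names omitted, so * stands in for the sizes.
void fill( size_t, size_t, int[*][*], int );

// Definition: * is allowed only in declarations, so the sizes are named here.
void fill( size_t m, size_t n, int a[m][n], int value ) {
    for ( size_t i = 0; i < m; ++i )
        for ( size_t j = 0; j < n; ++j )
            a[i][j] = value;
}
```

The two are compatible, so the declaration could live in a header and the definition in a .c file, as usual.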

Conclusion

C99 added the new array features of:

  1. Flexible Array Members.
  2. VLAs (which are unsafe, so you probably shouldn’t use them).
  3. VLAs for function parameters (which are safe, but really only useful for multidimensional arrays).
  4. static that requires non-null, minimum-sized arrays be passed for parameters.
  5. The ability to qualify, with const, volatile, or restrict, the rewritten pointers for parameters declared with array syntax.

Top comments (3)

Andrew Clayton

Thumbs up for saying VLAs should be avoided. You can compile with -Wvla to warn about such usage.

Another array related item that was officially standardised in C99 is the Flexible Array Member.

Paul J. Lucas

You can compile with -Wvla to warn about such usage.

The problem with that, at least with gcc, is that it also warns about VLA function parameters which are safe.

Another array related item that was officially standardised in C99 is the Flexible Array Member.

Oops! I forgot about them. I'm pretty sure I've never written any C program that used them. Anyway, I've added a section describing them.

Andrew Clayton

You can compile with -Wvla to warn about such usage.

The problem with that, at least with gcc, is that it also warns about VLA function parameters which are safe.

Clang warns also. Whenever I've used multi-dimensional arrays (not that often) I've made them arrays of pointers.