Paul J. Lucas

Posted on Jul 14

Attributes in C23 and C++

#c #cpp

Introduction

An attribute in a C or C++ program is a bit of extra information attached to a declaration, statement, or function. It conveys something that neither compilers nor humans can know or intuit just from looking at the code, and that compilers can use to produce better diagnostics or better optimized code.

C++11 introduced a new syntax for attributes that was later adopted into C23. The full syntax is a bit baroque, but the basic syntax is simply:

[[ attribute-list ]]

that is, a sequence of one or more attributes separated by commas and enclosed between double square brackets, where an attribute is simply an identifier. For example, the standard library function exit() is now declared as:

[[noreturn]] void exit( int status );

which tells both compilers and humans that the function never returns — something the compiler can’t intuit from the name alone. (Of course, humans can intuit that in this case, but not in all cases; and the idea is primarily to help the compiler in this case.)
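As a rough illustration of the list form, the sketch below attaches two attributes to one function, first as a single comma-separated list and then as two separate specifiers, which mean the same thing. The function names are hypothetical, and nodiscard and maybe_unused (both described later) are used here only to show the syntax.

```cpp
// Hypothetical helpers, not from any library: they only illustrate syntax.

// Two attributes in one comma-separated list.
[[nodiscard, maybe_unused]]
int digit_value( char c ) {
  return c >= '0' && c <= '9' ? c - '0' : -1;
}

// The same two attributes as separate specifiers; the meaning is identical.
[[nodiscard]] [[maybe_unused]]
int hex_value( char c ) {
  if ( c >= 'a' && c <= 'f' )
    return c - 'a' + 10;
  return digit_value( c );
}
```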

Prior to C23 or C++11, the only way to attach attributes was using compiler-specific syntax such as __attribute__ for gcc and clang, or __declspec for MSVC.

Historically, C11 actually introduced attributes, but as keywords — well, an attribute (singular): _Noreturn. Hence, exit() used to be declared as:

_Noreturn void exit( int status );

Why? The C Committee is fairly conservative. My guess is, at the time, they didn’t want to introduce a whole new syntax. However, keywords for attributes are problematic in that:

1. If a particular attribute isn’t supported by a particular compiler, the compiler has no choice but to assume it’s a syntax error.
2. It offers no mechanism for compiler vendors to add their own attributes without causing code using them to break on other compilers (due to problem 1).

The [[...]] syntax is simply better because the compiler knows it’s an attribute declaration. If a particular attribute isn’t supported, the compiler will simply ignore it (or at most warn if you enable warnings about unsupported attributes), hence your code will still compile.

In C23, the C Committee saw the light and adopted the [[...]] syntax, deprecating the keyword syntax. Obviously, the C and C++ Committees are separate and adopt things at different rates. Despite that, they do try to maximize compatibility between the two languages (eventually). Currently, there’s a common subset of attributes that are supported by both C23 and C++, but also attributes that are supported by only one or the other.

Attribute Placement

In general, attributes can be inserted almost anywhere in a program: before or after declarations or statements. Where they’re inserted influences what they’re an attribute of. For example, when an attribute is placed before a declaration, such as in:

[[maybe_unused]] int debug_open_count, debug_close_count;

it applies to all of the things being declared; in contrast, when it’s applied after a declaration, such as in:

int x, debug_count [[maybe_unused]], y;

it applies only to the variable to its immediate left.

Note that inserting an attribute after a declaration has started (above, by int) but before a particular thing being declared is an error:

int x, [[maybe_unused]] debug_count, y;  // error

Structure, union, and enumeration declarations can also have attributes, for example:

struct [[maybe_unused]] debug_info {
  // ...
};

but the attribute must go where shown; putting an attribute either before or after is an error.

Why weren’t such placements allowed? I don’t know. My only guess is that they make parsing harder.

Standard Attributes

Some attributes can also take arguments between parentheses as we’ll see.

`[[assume]]`

Languages: Not in C; C++23.
Syntax: [[assume(expression)]]
Attribute of: Null statements only.
For: Speed optimization.

The assume attribute tells the compiler to assume that the given expression is true. Consequently, it may be able to generate more efficient code. For example:

[[assume( n != 0 && (n & (n - 1)) == 0 )]];

tells the compiler to assume that n is a power of two. Given that, the compiler will likely be able to optimize better, e.g., substitute cheaper shifts for more expensive multiplications or divisions. Note that the expression is not evaluated.

What happens if the assumption turns out to be false at run-time? It results in undefined behavior. Because this is so, you should:

Use assume sparingly, typically only when squeezing out a bit more performance where it really matters, e.g., in tight loops or functions that are called frequently. (You should have profiled your code both with and without assume to see if it actually makes a significant difference.)
Be really, really sure that the assumption is always true.

One way to make using assume safer is always to pair it with assert, for example:

assert( n > 0 );
[[assume( n > 0 )]];

In a debug build, the assert will check your assumption at run-time; the assume will provide for better optimization. If your program passes your test suite, then you can disable assertions for a production build yet still benefit from better optimized code.
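One way to avoid writing the condition twice is to wrap the pair in a macro. The following is only a sketch: ASSUME, ASSERT_ASSUME, and mod_pow2 are hypothetical names, and the feature test assumes a compiler that provides __has_cpp_attribute. Where assume isn’t supported, the macro quietly falls back to the assert alone.

```cpp
#include <cassert>

// ASSUME expands to [[assume(...)]] only if the compiler reports support.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(assume)
#    define ASSUME(EXPR)  [[assume(EXPR)]]
#  endif
#endif
#ifndef ASSUME
#  define ASSUME(EXPR)    ((void)0)   /* fallback: no optimization hint */
#endif

// Hypothetical helper: checked in debug builds, assumed in every build.
#define ASSERT_ASSUME(EXPR) \
  do { assert( EXPR ); ASSUME( EXPR ); } while (0)

// Computes x % n, where the caller promises n is a power of two.
unsigned mod_pow2( unsigned x, unsigned n ) {
  ASSERT_ASSUME( n != 0 && (n & (n - 1)) == 0 );
  return x % n;
}
```

Calling mod_pow2( 13, 8 ) yields 5. Passing an n that isn’t a power of two trips the assert in a debug build and is undefined behavior in a build that has assertions disabled and assume enabled.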

`[[deprecated]]`

Languages: C23; C++14.
Syntax: [[deprecated]], [[deprecated("reason")]]
Attribute of: Anything.
For: Diagnostics.

This attribute should be fairly obvious in that it marks the thing it’s attached to as deprecated such that if the thing is used in the program, the compiler will generate a warning — meaning you shouldn’t use it because it’s presumably going to be removed eventually.

If reason (an arbitrary string literal) is given, the compiler will include that in the warning message. Typically, it’s the reason the thing was deprecated or what you should use instead. For example:

[[deprecated("Use get_user_id instead")]]
int get_user( char const *name );

`[[fallthrough]]`

Languages: C23; C++17.
Syntax: [[fallthrough]]
Attribute of: Null statements only (at end of case or default).
For: Diagnostics.

As you know, cases fall through into the next case unless you break (or continue, return, or longjmp). You can request that the compiler warn you if you forget. However, if you really want to fall through, you can use fallthrough to suppress the warning, for example:

switch ( argc ) {
  case 2:
    if ( !freopen( fout_path, "w", stdout ) )
      file_error( fout_path );
    [[fallthrough]];
  case 1:
    if ( !freopen( fin_path, "r", stdin ) )
      file_error( fin_path );
}
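Fall-through is also handy when each case should do everything the cases after it do. The sketch below is a self-contained variation on that idea; the enumerators and the log_mask function are hypothetical, invented only for illustration.

```cpp
// Hypothetical bit flags for log levels.
enum : unsigned {
  LEVEL_WARN  = 1u << 0,
  LEVEL_INFO  = 1u << 1,
  LEVEL_DEBUG = 1u << 2
};

// Higher verbosity enables its own level plus every lower one.
unsigned log_mask( int verbosity ) {
  unsigned mask = 0;
  switch ( verbosity ) {
    case 3:
      mask |= LEVEL_DEBUG;
      [[fallthrough]];          // intentional: also enable INFO
    case 2:
      mask |= LEVEL_INFO;
      [[fallthrough]];          // intentional: also enable WARN
    case 1:
      mask |= LEVEL_WARN;
      break;
  }
  return mask;
}
```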

`[[indeterminate]]`

Languages: Not in C; C++26.
Syntax: [[indeterminate]]
Attribute of: Block variable or function parameter.
For: Speed optimization.

Before indeterminate can be explained, you may need to be reminded that, in both C and C++, an uninitialized variable’s value is indeterminate; reading such a variable results in undefined behavior, for example:

int main() {
  int x;                // indeterminate value
  printf( "%d\n", x );  // undefined behavior
}

C++26 changed this such that an uninitialized variable’s value is merely erroneous — which means the value you get is “garbage,” but reading such a variable does not result in undefined behavior. This is a good thing because such a common bug will be easier to debug.

A variable marked indeterminate restores the pre-C++26 behavior in which the variable’s value starts off as indeterminate, for example:

[[indeterminate]] int x;

Why would you want to do that? In rare cases, treating a variable as indeterminate will yield slightly better performance. Since the attribute restores undefined behavior, you’d better know what you’re doing.

`[[likely]]` & `[[unlikely]]`

Languages: Not in C; C++20.
Syntax: [[likely]], [[unlikely]]
Attribute of: if, else, case.
For: Speed optimization.

An if, else, or case can be marked likely or unlikely when that path is known to be taken very often or very rarely. This lets the compiler arrange the generated code to favor the expected path, for example:

if ( n > 0 ) [[likely]]
  // ...

or:

switch ( err ) {
  [[likely]] case ERR_NONE:
    return;
  // ...
}

For either an if or else, the attribute goes after; for a case, it goes before. (Why the inconsistency? I don’t know.)

Note that “likely” and “unlikely” mean something like 99% of the time, not 51%. Hence, you should mark something likely or unlikely only if it’s very likely or unlikely.

Using either attribute also tells human readers which path is expected to be taken.
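A typical use is marking rare error or corner-case paths. Here is a minimal sketch, assuming that out-of-range inputs almost never occur; clamp_percent is a hypothetical function, and whether the hint changes the generated code at all is up to the compiler.

```cpp
// Hypothetical helper: clamps a value into the range 0..100.
// The two out-of-range paths are assumed to be very rare.
int clamp_percent( int value ) {
  if ( value < 0 ) [[unlikely]]
    return 0;
  if ( value > 100 ) [[unlikely]]
    return 100;
  return value;                 // the expected, common path
}
```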

`[[maybe_unused]]`

Languages: C23; C++17.
Syntax: [[maybe_unused]]
Attribute of: Anything.
For: Diagnostics.

A variable or function marked maybe_unused tells the compiler not to warn that the variable or function is unused. This can be useful when checking the result of a function with an assert:

[[maybe_unused]] int rv = f();
assert( rv == 0 );

When compiled with NDEBUG defined, the assert and use of rv will disappear. If that was the only use of rv, then the compiler would warn you. Marking rv as maybe_unused would suppress the warning.
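The attribute can also go on a function parameter, which is useful when a signature is dictated by someone else, such as a callback type. In this sketch, visit_fn, visit_double, and apply are all hypothetical names.

```cpp
// Hypothetical callback type that always passes a context pointer.
using visit_fn = int (*)( int value, void *context );

// This particular callback has no need for the context.
int visit_double( int value, [[maybe_unused]] void *context ) {
  return value * 2;
}

// Calls the callback with no context at all.
int apply( visit_fn fn, int value ) {
  return fn( value, nullptr );
}
```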

`[[nodiscard]]`

Languages: C23; C++17.
Syntax: [[nodiscard]], [[nodiscard("reason")]] (C++20)
Attribute of: Functions; structure, union, or enumeration declarations.
For: Diagnostics.

A function marked nodiscard such as:

[[nodiscard]] int initialize();

when called where you discard its return value such as:

int main( int argc, char const *argv[] ) {
  // ...
  initialize();
  // ...
}

will cause the compiler to warn you:

warning: ignoring return value of function 'nodiscard'

Alternatively, you can declare a structure, union, or enumeration with nodiscard such as:

enum [[nodiscard]] error_code {
  ERROR_NONE,
  ERROR_USAGE,
  // ...
};

Then whenever any function returns a value of error_code, it’s nodiscard implicitly.

I strongly encourage you to annotate every non-void function with nodiscard and omit it only for functions where the return value can safely be discarded.

In hindsight, every function should have been implicitly nodiscard and there should have been an ok_discard attribute instead to indicate that it’s OK to discard a return value. Making such a change now to either C or C++ would cause warnings for practically every program in existence.
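If you do need to discard a nodiscard value on purpose, the usual way to say so is an explicit cast to void, which compilers are encouraged to treat as intentional. The following sketch shows both a used and a deliberately discarded result; check_arg_count and run are hypothetical functions.

```cpp
enum [[nodiscard]] error_code {
  ERROR_NONE,
  ERROR_USAGE
};

// Hypothetical function: implicitly nodiscard because of its return type.
error_code check_arg_count( int argc ) {
  return argc >= 2 ? ERROR_NONE : ERROR_USAGE;
}

int run( int argc ) {
  if ( check_arg_count( argc ) != ERROR_NONE )  // result used: no warning
    return 1;
  (void)check_arg_count( argc );    // explicitly discarded: no warning expected
  return 0;
}
```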

`[[noreturn]]`

Languages: C23; C++11.
Syntax: [[noreturn]]
Attribute of: Functions only.
For: Diagnostics.

As mentioned previously, noreturn is used for functions that never return. You should always use it for such functions.

What happens if a noreturn function returns anyway? You guessed it: undefined behavior.
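The same attribute works for your own functions. In the sketch below, fatal_error and parse_level are hypothetical; the point is that, because fatal_error is marked noreturn, the compiler knows control can’t reach the end of parse_level and so has no reason to complain about a missing return.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical error handler: prints a message and terminates the program.
[[noreturn]] void fatal_error( char const *msg ) {
  std::fprintf( stderr, "fatal: %s\n", msg );
  std::exit( EXIT_FAILURE );
}

// Hypothetical parser: maps a level letter to a number.
int parse_level( char c ) {
  switch ( c ) {
    case 'w': return 1;
    case 'i': return 2;
    case 'd': return 3;
  }
  fatal_error( "unknown level" );
  // No return statement is needed here: control never gets this far.
}
```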

`[[no_unique_address]]`

Languages: Not in C; C++20.
Syntax: [[no_unique_address]]
Attribute of: Non-static, non-bit-field members of structures or classes.
For: Space optimization.

Before no_unique_address can be explained, you may need to be reminded that, in C++, although a class T can be declared with no non-static data members, sizeof(T) must be > 0, for example:

struct my_alloc {
  void* operator new( std::size_t size );
};
static_assert( sizeof(my_alloc) > 0 );

However, when an object of such a class is a non-static data member of another class, it can be marked no_unique_address to allow the compiler to give it no size:

template<typename T, typename Alloc>
class my_container {
  // ...
private:
  [[no_unique_address]] Alloc _alloc;  // no space used
  // ...
};

FYI, standard C doesn’t allow empty structures, so no special rule is needed to make its size > 0. That means no_unique_address couldn’t ever be adopted into C unless that rule changes first, which is unlikely.
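To see the effect, you can compare the sizes of two otherwise identical structures. This is only a sketch with hypothetical type names; the attribute merely permits the saving, so the second assertion uses <= rather than < to stay true on compilers that ignore it.

```cpp
struct empty_alloc { };       // hypothetical stateless allocator

struct plain_buffer {
  empty_alloc alloc;          // occupies at least one byte, plus padding
  int        *data;
};

struct compact_buffer {
  [[no_unique_address]] empty_alloc alloc;  // may occupy no storage
  int        *data;
};

static_assert( sizeof(empty_alloc) > 0,
               "even an empty class has a non-zero size" );
static_assert( sizeof(compact_buffer) <= sizeof(plain_buffer),
               "the attribute never makes the structure larger" );
```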

`[[reproducible]]`

Languages: C23; not in C++.
Syntax: [[reproducible]]
Attribute of: Functions only.
For: Speed optimization.

A reproducible function is one that, given the same values for arguments (if any), will always return the same result. For example, given:

[[reproducible]] double sqrt( double x );

the compiler may elide repeated calls that pass the same literal or constant:

double a[2][2] = {
  { sqrt(2),       0 },
  {       0, sqrt(2) }
};

Because the compiler now knows sqrt(2) always returns the same result, it can call the function only once and use the result twice. (Formally, this is idempotent.)

A reproducible function must also not cause side-effects aside from its internal state (if any) or parameters. (Formally, this is effectless.)

`[[unsequenced]]`

Languages: C23; not in C++.
Syntax: [[unsequenced]]
Attribute of: Functions only.
For: Speed optimization.

An unsequenced function is a stricter form of a reproducible one. Not only is the function one that, given the same values for arguments (if any), will always return the same result, it will do so regardless of when it’s called or in what order relative to everything else. Hence, sqrt should be declared as:

[[unsequenced]] double sqrt( double x );

The compiler could decide to call sqrt(2) once at the start of the program and use its result thereafter.

An unsequenced function must also not maintain any internal state between calls that affects the result, i.e., any static or thread_local variables it uses must be const and not volatile. (Formally, this is stateless.)

Compiler-Specific Attributes

The attribute syntax also allows for attributes to have a prefix or namespace (even in C), for example:

[[gnu::nonnull(1)]]

This allows for compiler-specific attributes since the standard will never include all of them.
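Although an unsupported attribute is simply ignored, it may still draw a warning, so some code hides compiler-specific attributes behind a macro. Here is a minimal sketch, assuming a compiler that provides __has_cpp_attribute; NONNULL_ARGS and name_length are hypothetical names.

```cpp
#include <cstring>

// Expands to the gcc/clang attribute where available, else to nothing.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(gnu::nonnull)
#    define NONNULL_ARGS(...)  [[gnu::nonnull(__VA_ARGS__)]]
#  endif
#endif
#ifndef NONNULL_ARGS
#  define NONNULL_ARGS(...)    /* nothing */
#endif

// Hypothetical function: parameter 1 must never be a null pointer.
NONNULL_ARGS(1)
std::size_t name_length( char const *name ) {
  return std::strlen( name );
}
```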

Conclusion

Use of attributes improves both diagnostics and optimization of your programs. You should use them wherever appropriate.
