Paul J. Lucas

Posted on Oct 19, 2023 • Edited on Feb 16

setjmp(), longjmp(), and Exception Handling in C

Introduction

As you know, C doesn’t have exceptions like C++ does. However, C does have two functions, setjmp() and longjmp(), that can be used together to return through more than one level of the call stack, typically as an error-handling mechanism when there’s no other way for a deeply nested function to stop and return an error.

Additionally, it seems that some C programmers, once they become familiar with C++ exceptions, try to use setjmp() and longjmp() to implement exception-like handling in C. While it is somewhat possible, setjmp() and longjmp() have several major restrictions that unfortunately bleed into even using them for exception-like handling in C. Despite this, we’ll derive and implement just such a mechanism.

The Basics

Let’s jump (no pun intended) right in with an example:

#include <setjmp.h>

static jmp_buf env;

void g( void );  // Forward declaration; g() is defined below.

void f( void ) {
  // ...
  if ( setjmp( env ) == 0 ) {
    // When setjmp() is called, it returns 0 (the first time)
    // so this code is executed.
    // This code roughly corresponds to "try" code.
    g();
  } else {
    // The second time setjmp() returns non-0 (if ever), this
    // code is executed instead.
    // This code roughly corresponds to "catch" code.
  }
}

void g( void ) {
  // ...
  if ( disaster_strikes )
    longjmp( env, 1 );     // Roughly corresponds to "throw".
  // ...
}

What happens is:

The function f() is called.
f() calls setjmp(), which saves a copy of the current “execution context” into the variable env and returns 0, which means “proceed normally.”
f() calls g().
In g(), some disaster strikes, so g() calls longjmp() using the previously saved execution context in env and passes the value 1.
The program execution “jumps” back to the exact point where the setjmp() was called.
Now setjmp() “returns” a second time with the value passed to longjmp(), in this case 1.
Since 1 != 0, the code in the else block is executed instead.

That’s it. While that doesn’t seem complicated, the devil, as ever, is in the details.
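
To make the round trip concrete, here is a minimal self-contained sketch of the example above. The disaster_strikes flag is passed in as a parameter and run() is a hypothetical name; both are assumptions made only so that each path can be exercised.

```c
#include <assert.h>             // Only for the checks in the usage note.
#include <setjmp.h>

static jmp_buf env;

// Hypothetical stand-in for g(): "throws" when asked to.
static void g( int disaster_strikes ) {
  if ( disaster_strikes )
    longjmp( env, 1 );          // Never returns; setjmp() returns 1 instead.
}

// Returns 0 if g() returned normally, 1 if it jumped back via longjmp().
int run( int disaster_strikes ) {
  if ( setjmp( env ) == 0 ) {   // First return: the "try" path.
    g( disaster_strikes );
    return 0;
  } else {                      // Second return: the "catch" path.
    return 1;
  }
}
```

With these definitions, assert( run( 0 ) == 0 ) and assert( run( 1 ) == 1 ) should both hold.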

The Details

`setjmp()` Call Restrictions

Calls to setjmp() have a number of restrictions. The first is that it must be called only in one of the following ways:

setjmp(env)
{ for | if | switch | while } ( [!]setjmp(env) [ relop expr ] )

that is:

A plain call to setjmp() (optionally cast to void); or:
One of for, if, switch, or while for which the setjmp() must constitute the entire controlling expression; followed by:
An optional !; followed by:
The call to setjmp(); followed by:
An optional relational or equality operator and an integer constant expression.

Specifically, you may not save the return value from setjmp():

int rv;
// ...
rv = setjmp( env );  // Not allowed.

Calling setjmp() in any other way results in undefined behavior.
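
If the value is needed later, one way to stay within these rules is to let a switch do the copying. The following is a minimal sketch; jump_with() and the values it recognizes are made up for illustration.

```c
#include <assert.h>             // Only for the checks in the usage note.
#include <setjmp.h>

static jmp_buf env;

// Returns the value that setjmp() produced on its second return,
// without ever writing "rv = setjmp( env )".
int jump_with( int val ) {
  int rv;
  switch ( setjmp( env ) ) {    // Allowed: entire controlling expression.
    case 0:
      longjmp( env, val );      // Does not return.
    case 1:
      rv = 1;
      break;
    case 2:
      rv = 2;
      break;
    default:
      rv = -1;                  // Any value this sketch doesn't expect.
      break;
  }
  return rv;
}
```

For example, jump_with( 2 ) returns 2 and jump_with( 3 ) returns -1.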

`volatile` Variables

Any variables local to the function in which setjmp() is called that are modified between the calls to setjmp() and longjmp() must be declared volatile. For example:

void f( void ) {
  int volatile count = 0;
  if ( setjmp( env ) == 0 ) {
    ++count;
  }
  // ...
}

Why? One of the things setjmp() does is save the values of the CPU registers. If the compiler has put a local variable into a register, then when longjmp() is called and setjmp() returns for the second time, the registers are restored to their saved values. For the above example, if count were not declared volatile, then:

The initial value of count (in a register) is 0.
setjmp() saves this value.
The code then does ++count setting its value to 1.
If longjmp() is called, setjmp() returns for the second time restoring the values of all registers — including the register used by count whose value is 0.

The use of volatile prevents the compiler from storing the value of a variable in a register (among other things), hence it’s stored in the local stack frame instead and so is unaffected by register value restoration.
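
As a small illustration of a variable that must survive the jump, here is a sketch of a retry counter. The names fail() and attempts_needed() are hypothetical; the point is only that attempts is modified after setjmp() and read again after longjmp(), so it is declared volatile.

```c
#include <assert.h>             // Only for the checks in the usage note.
#include <setjmp.h>

static jmp_buf env;

static void fail( void ) {
  longjmp( env, 1 );
}

// Hypothetical retry loop: keeps jumping back until the attempt number
// exceeds fail_times, then returns how many attempts were made.
int attempts_needed( int fail_times ) {
  int volatile attempts = 0;    // Modified after setjmp(), read after longjmp().
  setjmp( env );                // Plain call: every retry resumes here.
  ++attempts;
  if ( attempts <= fail_times )
    fail();
  return attempts;
}
```

Here attempts_needed( 0 ) returns 1 and attempts_needed( 3 ) returns 4. Without volatile, the value of attempts after the jump would be indeterminate.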

Variable Length Arrays

Variable length arrays (VLAs) must not be used at all when using setjmp() and longjmp(). If there are VLAs anywhere in the call stack between the longjmp() and the setjmp(), they will likely cause a memory leak.

Resource Leaks

Any resources (memory, FILEs, etc.) acquired between the time setjmp() and longjmp() were called are not released since C has no concept of destructors as the call stack unwinds. So if you malloc() memory or open files, you must figure out a way to release these resources.
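
One possible approach, sketched below under simplifying assumptions, is to keep each resource in a volatile local of the function that calls setjmp() and to release it on a path that both the normal and the longjmp() case reach. The names process(), run(), and live_buffers are hypothetical; live_buffers exists only so the sketch can be checked.

```c
#include <assert.h>             // Only for the checks in the usage note.
#include <setjmp.h>
#include <stdlib.h>

static jmp_buf env;
static int live_buffers;        // Outstanding allocations, for checking only.

static void process( char *buf, int fail ) {
  if ( fail )
    longjmp( env, 1 );
  buf[0] = 'x';
}

// Returns 0 on success, 1 if process() jumped out, -1 if malloc() failed.
int run( int fail ) {
  char *volatile buf = NULL;    // volatile: assigned after setjmp().
  int result;
  if ( setjmp( env ) == 0 ) {
    buf = malloc( 16 );
    if ( buf == NULL )
      return -1;
    ++live_buffers;
    process( buf, fail );
    result = 0;
  } else {
    result = 1;
  }
  // Both paths arrive here, so the buffer is released exactly once.
  free( buf );
  --live_buffers;
  return result;
}
```

After run( 0 ) and run( 1 ), live_buffers should be back to 0. This only covers resources owned by the function that called setjmp(); anything acquired deeper in the call chain still needs its own bookkeeping.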

`longjmp()` Details

Calling longjmp() returns to the function that called setjmp(). That function must still be on the call stack. For example, you cannot “wrap” calls to setjmp():

int wrap_setjmp( jmp_buf env ) {  // Don’t wrap setjmp().
  if ( setjmp( env ) == 0 )
    return 0;
  return 1;
}

void f( void ) {
  if ( wrap_setjmp( env ) == 0 ) {
    // ...
  }
}

You can’t because longjmp() jumps back to the setjmp(), but in this case that call was inside wrap_setjmp(), which has already returned. This results in undefined behavior.
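
If the goal of wrapping is only to hide the call, a macro works where a function does not, because the expansion places setjmp() directly in the caller’s stack frame. A minimal sketch, using the hypothetical macro name IF_CONTEXT_SAVED:

```c
#include <assert.h>             // Only for the checks in the usage note.
#include <setjmp.h>

static jmp_buf env;

// Expands inside the calling function, so the saved context belongs to
// a frame that is still live when longjmp() is called.
#define IF_CONTEXT_SAVED(ENV)   if ( setjmp( ENV ) == 0 )

static void g( int fail ) {
  if ( fail )
    longjmp( env, 1 );
}

// Returns 0 if g() returned normally, 1 if it jumped back.
int f( int fail ) {
  IF_CONTEXT_SAVED( env ) {
    g( fail );
    return 0;
  }
  return 1;
}
```

Here f( 0 ) returns 0 and f( 1 ) returns 1. The macro includes the if itself so that the setjmp() comparison remains the entire controlling expression.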

The second argument to longjmp() can be any non-zero value you want. It becomes the value that setjmp() returns the second time. For example, you can use it for error codes:

#define EX_FILE_IO_ERROR    0x0101
#define EX_FILE_NOT_FOUND   0x0102

void f( void ) {
  switch ( setjmp( env ) ) {
    case 0:
      read_file( ".config" );
      break;
    case EX_FILE_IO_ERROR:
      // ...
      break;
    case EX_FILE_NOT_FOUND:
      // ...
      break;
  } // switch
}

Incidentally, you cannot meaningfully pass 0 as the second argument: if you do, longjmp() will silently change it to 1.
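
A practical consequence is that a switch on setjmp() probably wants a default case, so that an unexpected code (including the 1 that a 0 turns into) isn’t silently ignored. The sketch below replaces read_file() with a hypothetical stub, read_file_stub(), that jumps with whatever code it’s given.

```c
#include <assert.h>             // Only for the checks in the usage note.
#include <setjmp.h>

#define EX_FILE_IO_ERROR    0x0101
#define EX_FILE_NOT_FOUND   0x0102

static jmp_buf env;

// Hypothetical stand-in for read_file(): jumps with the given code, or
// returns normally if the code is negative.
static void read_file_stub( int code ) {
  if ( code >= 0 )
    longjmp( env, code );
}

// Returns 0 on success, the error code if recognized, else -1.
int load( int simulated_code ) {
  switch ( setjmp( env ) ) {
    case 0:
      read_file_stub( simulated_code );
      return 0;
    case EX_FILE_IO_ERROR:
      return EX_FILE_IO_ERROR;
    case EX_FILE_NOT_FOUND:
      return EX_FILE_NOT_FOUND;
    default:                    // Includes 1, which longjmp( env, 0 ) becomes.
      return -1;
  }
}
```

Here load( -1 ) returns 0, load( EX_FILE_NOT_FOUND ) returns EX_FILE_NOT_FOUND, and load( 0 ) returns -1 because the 0 arrives as 1.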

Exception Handling in C

Given all the aforementioned restrictions, is it possible to implement exception-like handling in C using setjmp() and longjmp()? Yes, but with several rules that must be followed.

Requirements

A self-imposed requirement is that a proper exception-like mechanism in C should look as close as possible to C++ exceptions. Many exception-like implementations for C out there require using ugly macros in stilted ways. What we want is to be able to write “natural looking” code like:

void f( void ) {
  try {
    g();
  }
  catch ( EX_FILE_NOT_FOUND ) {
    // ...
  }
}

void g( void ) {
  // ...
  FILE *f = fopen( "config.h", "r" );
  if ( f == NULL )
    throw( EX_FILE_NOT_FOUND );
  // ...
}

We also want to be able to nest try blocks, either directly in the same function, or indirectly in called functions. This means we’ll need multiple jmp_buf variables and a linked list linking a try block to its parent, if any. We can declare a data structure to hold this information:

typedef struct cx_impl_try_block cx_impl_try_block_t;

struct cx_impl_try_block {
  jmp_buf               env;
  cx_impl_try_block_t  *parent;
};

static cx_impl_try_block_t *cx_impl_try_block_head;
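
To show how such a list behaves, here is a minimal sketch of pushing and popping stack-allocated blocks. All names (try_block_push(), try_block_pop(), try_block_depth(), nested_depth()) are hypothetical helpers for illustration, not part of the implementation derived in this article.

```c
#include <assert.h>             // Only for the checks in the usage note.
#include <setjmp.h>
#include <stddef.h>

typedef struct try_block try_block_t;

struct try_block {
  jmp_buf       env;
  try_block_t  *parent;
};

static try_block_t *try_block_head;   // Innermost active block, or NULL.

static void try_block_push( try_block_t *tb ) {
  tb->parent = try_block_head;
  try_block_head = tb;
}

static void try_block_pop( void ) {
  try_block_head = try_block_head->parent;
}

// Counts how many blocks are currently active.
static int try_block_depth( void ) {
  int depth = 0;
  for ( try_block_t *tb = try_block_head; tb != NULL; tb = tb->parent )
    ++depth;
  return depth;
}

// Simulates entering two nested blocks and reports the depth seen inside.
int nested_depth( void ) {
  try_block_t outer, inner;           // Live in this function's stack frame.
  try_block_push( &outer );
  try_block_push( &inner );
  int const depth = try_block_depth();
  try_block_pop();
  try_block_pop();
  return depth;
}
```

nested_depth() returns 2 and leaves try_block_head NULL again. Because the nodes live on the stack, every push has to be matched by a pop before the enclosing function returns.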

Implementing `try`

Clearly, try will have to be a macro; but that expands into what? Code is needed that allows:

Storage of a cx_impl_try_block local to the scope of the try.
Code to be specified between { ... } for the try block.

The only thing in C that gives us both is a combination of for and if:

#define try                                             \
  for ( cx_impl_try_block_t cx_tb = cx_impl_try_init(); \
        ???; ??? )                                      \
    if ( setjmp( cx_tb.env ) == 0 )

where cx_impl_try_init() is:

cx_impl_try_block_t cx_impl_try_init( void ) {
  static cx_impl_try_block_t const tb;
  return tb;
}

But what do we put for the for loop condition and increment expressions? The loop needs to execute only once, so the condition has to return true the first time and false the second. We can add a “state” to cx_impl_try_block:

enum cx_impl_state {
  CX_IMPL_INIT,       // Initial state.
  CX_IMPL_TRY,        // No exception thrown.
};
typedef enum cx_impl_state cx_impl_state_t;

struct cx_impl_try_block {
  jmp_buf               env;
  cx_impl_try_block_t  *parent;
  cx_impl_state_t       state;
};

For more about enumerations, see Enumerations in C.

We can then implement a function for the for loop condition that initializes cx_tb and returns true the first time it’s called and false the second:

bool cx_impl_try_condition( cx_impl_try_block_t *tb ) {
  switch ( tb->state ) {
    case CX_IMPL_INIT:
      tb->parent = cx_impl_try_block_head;
      cx_impl_try_block_head = tb;
      tb->state = CX_IMPL_TRY;
      return true;
    case CX_IMPL_TRY:
      cx_impl_try_block_head = tb->parent;
      return false;
  } // switch
}

With this, we can augment the definition of try to be:

#define try                                             \
  for ( cx_impl_try_block_t cx_tb = cx_impl_try_init(); \
        cx_impl_try_condition( &cx_tb ); )              \
    if ( setjmp( cx_tb.env ) == 0 )

Given that, we don’t need anything for the for loop increment expression.
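
The run-exactly-once behavior can be checked in isolation. The following sketch mirrors the two-state version above but uses hypothetical names (DEMO_TRY, try_condition(), count_body_runs()) and a designated initializer in place of an init function.

```c
#include <assert.h>             // Only for the checks in the usage note.
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct try_block try_block_t;
typedef enum { ST_INIT, ST_TRY } demo_state_t;

struct try_block {
  jmp_buf       env;
  try_block_t  *parent;
  demo_state_t  state;
};

static try_block_t *try_block_head;

static bool try_condition( try_block_t *tb ) {
  switch ( tb->state ) {
    case ST_INIT:                     // First test: push and run the body.
      tb->parent = try_block_head;
      try_block_head = tb;
      tb->state = ST_TRY;
      return true;
    case ST_TRY:                      // Second test: pop and leave the loop.
      try_block_head = tb->parent;
      return false;
  }
  return false;                       // Not reached; keeps compilers quiet.
}

#define DEMO_TRY                                              \
  for ( try_block_t demo_tb = { .state = ST_INIT };           \
        try_condition( &demo_tb ); )                          \
    if ( setjmp( demo_tb.env ) == 0 )

// Returns how many times the block body ran.
int count_body_runs( void ) {
  int runs = 0;
  DEMO_TRY {
    ++runs;
  }
  return runs;
}
```

count_body_runs() returns 1, and try_block_head is NULL again afterwards, which shows the condition function both pushed and popped the block.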

Implementing `throw`

When implementing throw, it will be extremely helpful if the “exception” thrown contained the file and line whence it was thrown:

#define throw(XID) \
  cx_impl_throw( __FILE__, __LINE__, (XID) )

But that means we need another data structure to hold the exception information and a global exception object:

struct cx_exception {
  char const *file;
  int         line;
  int         thrown_xid;
};
typedef struct cx_exception cx_exception_t;

static cx_exception_t cx_impl_exception;

We also need to add more to cx_impl_try_block:

struct cx_impl_try_block {
  jmp_buf               env;
  cx_impl_try_block_t  *parent;
  cx_impl_state_t       state;
  int                   thrown_xid;  // Thrown exception, if any.
  int                   caught_xid;  // Caught exception, if any.
};

Given that, we can implement cx_impl_throw() as:

_Noreturn
void cx_impl_throw( char const *file, int line, int xid ) {
  cx_impl_exception = (cx_exception_t){
    .file = file,
    .line = line,
    .thrown_xid = xid
  };
  if ( cx_impl_try_block_head == NULL )
    cx_terminate();
  cx_impl_try_block_head->state = CX_IMPL_THROWN;
  cx_impl_try_block_head->thrown_xid = xid;
  longjmp( cx_impl_try_block_head->env, 1 );
}

If throw() is called but cx_impl_try_block_head is NULL, it means there’s no active try block which means the exception can’t be caught, so just call cx_terminate():

_Noreturn
static void cx_terminate( void ) {
  fprintf( stderr,
    "%s:%d: unhandled exception %d (0x%X)\n",
    cx_impl_exception.file, cx_impl_exception.line,
    cx_impl_exception.thrown_xid,
    (unsigned)cx_impl_exception.thrown_xid
  );
  abort();
}

We’ll also need a couple of new states to distinguish them from the un-thrown state CX_IMPL_TRY:

enum cx_impl_state {
  CX_IMPL_INIT,       // Initial state.
  CX_IMPL_TRY,        // No exception thrown.
  CX_IMPL_THROWN,     // Exception thrown, but uncaught.
  CX_IMPL_CAUGHT,     // Exception caught.
};

and update cx_impl_try_condition() accordingly:

bool cx_impl_try_condition( cx_impl_try_block_t *tb ) {
  switch ( tb->state ) {
    case CX_IMPL_INIT:
      tb->parent = cx_impl_try_block_head;
      cx_impl_try_block_head = tb;
      tb->state = CX_IMPL_TRY;
      return true;
    case CX_IMPL_TRY:
    case CX_IMPL_THROWN:
    case CX_IMPL_CAUGHT: 
      cx_impl_try_block_head = tb->parent;
      if ( tb->state == CX_IMPL_THROWN )
        cx_impl_do_throw();  // Rethrow uncaught exception.
      return false;
  } // switch
}

It’s also necessary to split part of cx_impl_throw() into cx_impl_do_throw() so it can be called directly to rethrow an uncaught exception:

_Noreturn
void cx_impl_do_throw( void ) {
  if ( cx_impl_try_block_head == NULL )
    cx_terminate();
  cx_impl_try_block_head->state = CX_IMPL_THROWN;
  cx_impl_try_block_head->thrown_xid = cx_impl_exception.thrown_xid;
  longjmp( cx_impl_try_block_head->env, 1 );
}

_Noreturn
void cx_impl_throw( char const *file, int line, int xid ) {
  cx_impl_exception = (cx_exception_t){
    .file = file,
    .line = line,
    .thrown_xid = xid
  };
  cx_impl_do_throw();
}

Implementing `catch`

Just like in the original example, we can #define catch as:

#define catch(XID) \
  else if ( cx_impl_catch( (XID), &cx_tb ) )

and implement cx_impl_catch() as:

bool cx_impl_catch( int catch_xid, cx_impl_try_block_t *tb ) {
  if ( tb->caught_xid == tb->thrown_xid )
    return false;
  if ( tb->thrown_xid != catch_xid )
    return false;
  tb->state = CX_IMPL_CAUGHT;
  tb->caught_xid = tb->thrown_xid;
  return true;
}

The first if checks for the case when the same exception is thrown from a catch block. Once an exception is caught at the current try/catch nesting level, it can never be recaught at the same level. By returning false for all catches at the current level, the code in cx_impl_try_condition() will pop us up to the parent level, if any, where this check will fail (because the parent’s caught_xid will be 0) and we can possibly recatch the exception at the parent level.
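
The matching rule can be exercised without any setjmp() machinery. This sketch keeps only the two ID fields and omits the state update; demo_catch() and demo_catch_sequence() are hypothetical names.

```c
#include <assert.h>             // Only for the checks in the usage note.
#include <stdbool.h>

// Only the two fields that the matching logic reads and writes.
typedef struct {
  int thrown_xid;
  int caught_xid;
} demo_try_block_t;

static bool demo_catch( int catch_xid, demo_try_block_t *tb ) {
  if ( tb->caught_xid == tb->thrown_xid )
    return false;               // Already caught once at this level.
  if ( tb->thrown_xid != catch_xid )
    return false;               // A different exception.
  tb->caught_xid = tb->thrown_xid;
  return true;
}

// Exception 5 is thrown; catch(4) and then catch(5) are offered; then 5
// is thrown again from the catch block and catch(5) is offered once more.
// Returns the three results packed as decimal digits.
int demo_catch_sequence( void ) {
  demo_try_block_t tb = { .thrown_xid = 5, .caught_xid = 0 };
  int const wrong_id   = demo_catch( 4, &tb );
  int const right_id   = demo_catch( 5, &tb );
  int const same_again = demo_catch( 5, &tb );  // thrown_xid is still 5.
  return wrong_id * 100 + right_id * 10 + same_again;
}
```

demo_catch_sequence() returns 10: the wrong ID doesn’t match, the right one does, and the same exception is refused the second time at this level.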

Implementing `finally`

Even though C++ doesn’t have finally like Java does, C doesn’t have destructors to implement RAII, so having finally would be useful to clean up resources (free() memory, fclose() files, etc.).

It turns out that adding finally isn’t difficult. The big difference is that the for loop has to execute twice: once to run the original try/catch code and a second time to run finally code. However, for the second iteration, setjmp() must not be called again. This can be achieved by adding another state of CX_IMPL_FINALLY and an if in the definition of try:

#define try                                             \
  for ( cx_impl_try_block_t cx_tb = cx_impl_try_init(); \
        cx_impl_try_condition( &cx_tb ); )              \
    if ( cx_tb.state != CX_IMPL_FINALLY )               \
      if ( setjmp( cx_tb.env ) == 0 )

The implementation of finally therefore trivially becomes:

#define finally                                \
    else /* setjmp() != 0 */ /* do nothing */; \
  else /* cx_tb.state == CX_IMPL_FINALLY */

The implementation of cx_impl_try_condition() also needs to account for the new CX_IMPL_FINALLY state:

bool cx_impl_try_condition( cx_impl_try_block_t *tb ) {
  switch ( tb->state ) {
    case CX_IMPL_INIT:
      tb->parent = cx_impl_try_block_head;
      cx_impl_try_block_head = tb;
      tb->state = CX_IMPL_TRY;
      return true;
    case CX_IMPL_CAUGHT:
      tb->thrown_xid = 0;      // Reset for finally case.
      // fallthrough
    case CX_IMPL_TRY:
    case CX_IMPL_THROWN:
      tb->state = CX_IMPL_FINALLY;
      return true;
    case CX_IMPL_FINALLY:
      cx_impl_try_block_head = tb->parent;
      if ( tb->thrown_xid != 0 )
        cx_impl_do_throw();    // Rethrow uncaught exception.
      return false;
  } // switch
}

The CX_IMPL_TRY, CX_IMPL_THROWN, and CX_IMPL_CAUGHT states now return true to execute the for loop once more for the finally block, if any. In the CX_IMPL_FINALLY state, we have to remember whether the exception was caught or not. We could have added a separate flag for this, but we can alternatively just reset tb->thrown_xid to 0 and check for non-0 later to know whether to rethrow the exception.

With the addition of finally, we can now write code like this:

void read_file( char const *path ) {
  FILE *f = fopen( path, "r" );
  if ( f == NULL )
    throw( EX_FILE_NOT_FOUND );
  try {
    do_something_with( f );
  }
  finally {
    fclose( f );
  }
}
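
To check that a finally block runs whether or not an exception passes through it, the pieces can be assembled into one compact sketch. It follows the logic described in this article but is not the library’s code: every name (TRY, CATCH, FINALLY, the demo_ functions, EX_DEMO, cleanups) is hypothetical, and a plain int stands in for the global exception object. Like the code above, it assumes that the try-block state, which is modified only through pointers, is read back from memory after longjmp().

```c
#include <assert.h>             // Only for the checks in the usage note.
#include <setjmp.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct demo_tb demo_tb_t;
typedef enum { ST_INIT, ST_TRY, ST_THROWN, ST_CAUGHT, ST_FINALLY } demo_state_t;

struct demo_tb {
  jmp_buf       env;
  demo_tb_t    *parent;
  demo_state_t  state;
  int           thrown_xid, caught_xid;
};

static demo_tb_t *demo_head;          // Innermost active try block.
static int        demo_pending_xid;   // Stand-in for the exception object.

static void demo_do_throw( void ) {
  if ( demo_head == NULL ) {
    fprintf( stderr, "unhandled exception %d\n", demo_pending_xid );
    abort();
  }
  demo_head->state = ST_THROWN;
  demo_head->thrown_xid = demo_pending_xid;
  longjmp( demo_head->env, 1 );
}

static void demo_throw( int xid ) {
  demo_pending_xid = xid;
  demo_do_throw();
}

static bool demo_try_condition( demo_tb_t *tb ) {
  switch ( tb->state ) {
    case ST_INIT:
      tb->parent = demo_head;
      demo_head = tb;
      tb->state = ST_TRY;
      return true;
    case ST_CAUGHT:
      tb->thrown_xid = 0;             // Caught: nothing to rethrow later.
      // fallthrough
    case ST_TRY:
    case ST_THROWN:
      tb->state = ST_FINALLY;         // One more pass for the finally block.
      return true;
    case ST_FINALLY:
      demo_head = tb->parent;
      if ( tb->thrown_xid != 0 )
        demo_do_throw();              // Propagate to the parent block.
      return false;
  }
  return false;                       // Not reached.
}

static bool demo_catch( int xid, demo_tb_t *tb ) {
  if ( tb->caught_xid == tb->thrown_xid || tb->thrown_xid != xid )
    return false;
  tb->state = ST_CAUGHT;
  tb->caught_xid = tb->thrown_xid;
  return true;
}

#define TRY                                                   \
  for ( demo_tb_t tb_local = { .state = ST_INIT };            \
        demo_try_condition( &tb_local ); )                    \
    if ( tb_local.state != ST_FINALLY )                       \
      if ( setjmp( tb_local.env ) == 0 )
#define CATCH(XID)  else if ( demo_catch( (XID), &tb_local ) )
#define FINALLY     else /* no catch matched */; else

enum { EX_DEMO = 1 };
static int cleanups;                  // Counts finally-block executions.

static void guarded( int xid ) {
  TRY {
    if ( xid != 0 )
      demo_throw( xid );
  }
  FINALLY {
    ++cleanups;
  }
}

// Returns 1 if guarded() let EX_DEMO propagate to the catch here, else 0.
int run( int xid ) {
  int volatile caught = 0;
  TRY {
    guarded( xid );
  }
  CATCH( EX_DEMO ) {
    caught = 1;
  }
  return caught;
}
```

Calling run( 0 ) returns 0 and run( EX_DEMO ) returns 1; in both cases cleanups is incremented once, so the finally block ran before the exception reached the outer catch. Compilers may emit “dangling else” warnings for the macros.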

The C Exception Library

The end result is the C Exception Library. It has several additions not discussed here so as not to make the article even longer. The additions are:

The macros are actually named cx_try, cx_catch, cx_throw, and cx_finally to avoid possible name collisions. However, if you want the traditional names, just do:
```
#define CX_USE_TRADITIONAL_KEYWORDS 1
#include <c_exception.h>
```
catch() is able to take zero arguments to mean “catch any exception.”
throw() is able to take zero arguments to mean “rethrow the caught exception.”
throw() is able to take an optional second argument of “user data” that’s copied into cx_exception.
Since C doesn’t have inheritance, it’s impossible to create exception hierarchies. Instead, via cx_set_xid_matcher(), you can set a custom function to compare exception IDs. This at least allows you to, say, catch any exception in a certain range.
Via cx_set_terminate_handler(), you can set a custom function to be called by cx_terminate().
The variables cx_impl_exception and cx_impl_try_block_head are thread_local so the library will work correctly in multithreaded programs.

Restrictions

Even though the C Exception Library meets all of our requirements, there are still a number of restrictions:

The requirement of volatile variables and prohibition of VLAs still apply. There’s simply no way around these.
Within a try, catch, or finally block, you must never break unless it’s within your own loop or switch due to the use of the for loop in the implementation.
Similarly, within a try, catch, or finally block, you must never goto outside the blocks or return from the function. In addition to the finally block, if any, not being executed, cx_impl_try_block_head will become a dangling pointer to the defunct cx_tb variable. (Fortunately, this situation can be detected, so the library checks for it and reports the error.)
Within a try or catch block, continue will jump to the finally block, if any. (Perhaps this is a nice thing?)
When compiling your own code using the library, you have to use the compiler options -Wno-dangling-else and -Wno-shadow (or equivalent for your compiler) to suppress warnings. There’s simply no way to use {} to suppress “dangling else” warnings nor use unique names for cx_tb to avoid the “no shadow” warnings and keep the “natural looking” code.

Conclusion

While this C exception library works, there are a number of restrictions that are easily forgotten because wrong code doesn’t look wrong. It’s not clear whether the utility of being able to throw and catch exceptions outweighs the restrictions and pitfalls. Only actual users of the library will be able to answer.

What sort of C programs might benefit from using such a library? Obviously, smaller, simpler programs can continue to use more traditional error-checking mechanisms. However, classes of C programs that might benefit from exception handling are those that either employ long function call chains or callbacks where it’s either too difficult or impossible to check for and handle errors all the way back through the call stack.

Acknowledgement

This article was inspired by Remo Dentato’s article and my implementation is based on his.

Top comments (2)

Serpent7776 • Oct 23 '23

This is exactly how Postgres manages errors.
Their model is somewhat simpler, based on do while loop and if statement.
They also support finally block, but there might be either finally or catch, not both.

github.com/postgres/postgres/blob/...

Recently I went through integrating this with extension written in C++ using exceptions. That was "fun".

Paul J. Lucas • Oct 24 '23

Their implementation is similar in that they use both setjmp() and longjmp() as well as macros, but, other than those superficial similarities, it's completely different.

My implementation uses more “natural looking” try/catch code, catches specific exceptions, and allows both catch and finally.
