Remo Dentato

Posted on Jun 8

Polymorphic C (2/2)

#api #c #programming

Introduction
Default to 0
- Pointers to NULL
Supply missing arguments
- Skipping arguments
- A Microsoft quirk
Counting arguments
- Multiuse example
Zero arguments
Conclusion

Introduction

To complete the recap of how we can implement Polymorphism in C, let's review how we can handle different function signatures.

Note that I did not invent these techniques; I found them over the years here and there (a site, a book, a magazine, ...), then I adopted and massaged them to make them useful to me.

Since we all build on someone else's previous work, I'm sharing them here in the hope they might be of help.

Default to 0

A very simple case is when the last, optional argument drives the behaviour of your function and can default to 0:

// Example:   move(steps [, offset]) where offset is 0 if unspecified:
// The actual function is move_f()
#define move(s,...) move_f(s, __VA_ARGS__ + 0)
void move_f(int steps, int offset);

If called with move(10,3), it will expand to move_f(10, 3 + 0).
If called with move(10), it will expand to move_f(10, + 0), which is perfectly fine for the compiler.

Now, in move_f() you can check for offset and behave accordingly.
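
As a minimal sketch of the idea, here is a complete version where move_f() simply returns the final position. That behaviour is hypothetical, chosen only to have something observable. The sketch also assumes that the compiler accepts move(10) with an empty variadic part, which GCC and Clang do.

```c
// Sketch: move(steps [, offset]) with offset defaulting to 0.
// Assumes the compiler accepts an empty variadic part (GCC and Clang do).
#define move(s, ...) move_f(s, __VA_ARGS__ + 0)

// Hypothetical implementation: returns the final position.
int move_f(int steps, int offset)
{
    return steps + offset;
}

// move(10, 3) expands to move_f(10, 3 + 0)
// move(10)    expands to move_f(10, + 0)
```

With these definitions, move(10, 3) evaluates to 13 and move(10) evaluates to 10.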

Pointers to NULL

The same can be done for pointers, but since they are dangerous beasts, it is better to put some more effort into it:

// Example: transfer(items [, aux_info])
#define transfer(i,...) transfer_g(i, __VA_ARGS__ +0)
#define transfer_g(i,a) _Generic((a), \
                           info_t *: transfer_f(i,(info_t *)(a)),\
                                int: transfer_f(i,(info_t *)NULL) \
                        )
int transfer_f(int items, info_t *aux);

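Here we use _Generic to ensure that the second argument is of the proper type and that it defaults to NULL.

Note that the C standard only guarantees that a constant 0 converts to a null pointer; a null pointer is not required to have an all-zero representation, and converting a non-constant int whose value is 0 to a pointer is implementation-defined. Selecting NULL explicitly in the int branch avoids relying on how (info_t *)(a) would be converted.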

When called as transfer(32, new_info), it will expand to transfer_g(32, new_info + 0). Assuming that new_info is of type info_t * (as it should be), the value (new_info + 0) is also of type info_t *, so transfer_g() selects the first branch and expands to transfer_f(32,(info_t *)(new_info + 0)).

When called as transfer(32), it will expand to transfer_g(32, +0). The second argument is an int (as per the C standard), so the second branch of _Generic is selected and the macro expands to transfer_f(32,(info_t *)NULL).

If the second argument is of any other type, there will be a compilation error, since no branch of _Generic matches.

Note that in the info_t *: branch of _Generic, casting a is mandatory; otherwise, the compiler will complain when the second argument is missing, because a is then an int.
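
To see the two branches at work, here is a small self-contained sketch. The content of info_t and the behaviour of transfer_f() are assumptions made up for the example; only the two macros come from the text above.

```c
#include <stddef.h>   // NULL

// Hypothetical auxiliary information; any complete struct type would do.
typedef struct {
    int multiplier;
} info_t;

#define transfer(i, ...) transfer_g(i, __VA_ARGS__ + 0)
#define transfer_g(i, a) _Generic((a),                  \
            info_t *: transfer_f(i, (info_t *)(a)),     \
                 int: transfer_f(i, (info_t *)NULL)     \
        )

// Hypothetical implementation: scales the items when aux info is present.
int transfer_f(int items, info_t *aux)
{
    return aux ? items * aux->multiplier : items;
}
```

Given an info_t whose multiplier is 3, transfer(32, &info) goes through the first branch and returns 96, while transfer(32) goes through the second branch and returns 32.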

Supply missing arguments

Say you have a function where more than one last argument can be optional: f(a [,j [,k]]).

A simple method to implement it involves augmenting the list of arguments to ensure the proper number of arguments is always provided to the function:

// Select the n-th argument (0-based count)
#define arg_0( x,...)           x
#define arg_1(_0, x,...)        x
#define arg_2(_0,_1, x, ...)    x

// Example:   move(steps [, offset [, direction]]) :
#define move(...) move_f(arg_0(__VA_ARGS__),\
                         arg_1(__VA_ARGS__ , 0), \
                         arg_2(__VA_ARGS__ ,'N','N') \
                        )
void move_f(int steps, int offset, int direction);

The expansion of move(13, -3, 'S') will be:

move_f( arg_0(13, -3, 'S') , arg_1(13, -3, 'S', 0) , arg_2(13, -3, 'S', 'N', 'N'))
              ╰────╮                 ╭──╯                  ╭────────╯
move_f(            13       ,       -3         ,          'S' )

The expansion of move(13, -3) will be:

move_f( arg_0(13, -3) , arg_1(13, -3, 0) , arg_2(13, -3, 'N', 'N'))
           ╭──╯               ╭────╯             ╭────────╯
move_f(    13         ,      -3          ,      'N' )

The expansion of move(13) will be:

move_f( arg_0(13)      , arg_1(13, 0)  , arg_2(13, 'N', 'N'))
              ╰───╮                ╰──╮           ╭──────╯
move_f(           13   ,              0   ,      'N' )

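Note how the default for the last argument ('N' in the example) must appear twice: when only one argument is passed, the first 'N' fills the slot of the missing second argument and the second 'N' is the one that arg_2 picks.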
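
The three expansions can be checked with a small sketch. Here move_f() just records its arguments in a global; move_call_t and last_move are hypothetical names introduced for the demonstration. As before, the sketch relies on the compiler accepting an empty variadic part.

```c
#define arg_0(x, ...)          x
#define arg_1(_0, x, ...)      x
#define arg_2(_0, _1, x, ...)  x

// move(steps [, offset [, direction]])
#define move(...) move_f(arg_0(__VA_ARGS__),         \
                         arg_1(__VA_ARGS__, 0),      \
                         arg_2(__VA_ARGS__, 'N', 'N'))

// Hypothetical bookkeeping: remember the arguments of the last call.
typedef struct {
    int steps;
    int offset;
    int direction;
} move_call_t;

move_call_t last_move;

void move_f(int steps, int offset, int direction)
{
    last_move = (move_call_t){ steps, offset, direction };
}
```

After move(13), last_move holds 13, 0 and 'N'; after move(13, -3, 'S'), it holds 13, -3 and 'S'.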

Skipping arguments

A noticeable drawback is that if you want to provide the third argument, you must provide the second argument. It would be nice if we could just omit the ones we are not interested in.

Provided the types are different, we can ask _Generic for help:

#define arg_0( x,...)           x
#define arg_1(_0, x,...)        x
#define arg_2(_0,_1, x, ...)    x

// We need a different type than int for the direction
#define N ((char)'N')
#define S ((char)'S')
#define E ((char)'E')
#define W ((char)'W')

// Example:   move(steps [, offset [, direction]]) :
#define move(...) move_g(arg_0(__VA_ARGS__),\
                         arg_1(__VA_ARGS__ , N), \
                         arg_2(__VA_ARGS__ , N, N) \
                        )

#define move_g(s, t, d) _Generic( (t), \
                            int: move_f(s, (int)(t), d), \
                           char: move_f(s, 0, (char)(t))  \
                         )

void move_f(int steps, int offset, char direction);

To better understand how it works, let's check the macro expansion of move(13, -3, S):

move_g( arg_0(13, -3, S) , arg_1(13, -3, S, N) , arg_2(13, -3, S, N, N))
              ╰────╮               ╭──╯               ╭────────╯
move_g(            13    ,        -3           ,      S )
                   │               │                  │
move_f(            13    ,        -3           ,      S )

For move(13, -3) :

move_g( arg_0(13, -3)    , arg_1(13, -3, N) , arg_2(13, -3, N, N))
              ╰────╮               ╭──╯            ╭────────╯
move_g(            13    ,        -3        ,      N )  // -3 is an int
                   │               │               │
move_f(            13    ,        -3        ,      N )

For move(13, E) :

move_g( arg_0(13, E)   , arg_1(13, E, N) , arg_2(13, E, N, N))
              ╰────╮               │             ╭──────╯
move_g(            13  ,           E     ,       N )  // E is a char
                   │               ╰─────────────╮
move_f(            13  ,           0     ,       E )

And for move(13) :

move_g( arg_0(13) , arg_1(13, N) , arg_2(13, N, N))
              ╰╮              │          ╭──────╯
move_g(        13  ,          N  ,       N )  // N is a char
               │              ╰──────────╮
move_f(        13  ,          0  ,       N )

This technique allows for more flexibility, but it's a little bit more complicated. For example, the default for the third argument is repeated three times, and the default for the second parameter (0) is in the move_g() macro, not in the original move() macro.
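
Here is a compact sketch of the skipping variant, with a recording move_f() (move_call_t and last_move are hypothetical names used only for the demonstration). It also shows why the directions are defined as macros with a cast: in C, a plain character constant such as 'E' has type int, so it would be taken as an offset.

```c
#define arg_0(x, ...)          x
#define arg_1(_0, x, ...)      x
#define arg_2(_0, _1, x, ...)  x

// Directions must have type char to be told apart from an int offset.
#define N ((char)'N')
#define S ((char)'S')
#define E ((char)'E')
#define W ((char)'W')

#define move(...) move_g(arg_0(__VA_ARGS__),        \
                         arg_1(__VA_ARGS__, N),     \
                         arg_2(__VA_ARGS__, N, N))

#define move_g(s, t, d) _Generic((t),               \
             int: move_f(s, (int)(t), d),           \
            char: move_f(s, 0, (char)(t))           \
        )

// Hypothetical bookkeeping: remember the arguments of the last call.
typedef struct {
    int  steps;
    int  offset;
    char direction;
} move_call_t;

move_call_t last_move;

void move_f(int steps, int offset, char direction)
{
    last_move = (move_call_t){ steps, offset, direction };
}

// move(13, E)   -> move_f(13, 0, E)      the offset is skipped
// move(13, 'E') -> move_f(13, 'E', N)    'E' is an int, so it is an offset!
```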

A Microsoft quirk

The preprocessor of the Microsoft compiler (cl) might need an extra level of macro expansion to ensure everything works correctly. To stay safe, you may want to add the macro arg_x, which ensures the end value is expanded one more time:

#define arg_x(...)           __VA_ARGS__
#define arg_0( x,...)        x
#define arg_1(_0, x,...)     x
#define arg_2(_0,_1, x, ...) x

// Example:   move(steps [, offset [, direction]]) :

#define move(...) move_f(arg_x(arg_0(__VA_ARGS__)),\
                         arg_x(arg_1(__VA_ARGS__ , 0)), \
                         arg_x(arg_2(__VA_ARGS__ , 'N', 'N')) \
                        )
void move_f(int steps, int offset, char direction);

I didn't check if the latest version of cl still requires this, but arg_x is harmless and will keep you safe from possible bugs.

Counting arguments

The most flexible way to handle a variable number of arguments is to count how many of them there are.
This allows for specifying entirely different functions for each signature (with the help of _Generic, if needed).

Unfortunately, the C preprocessor still lacks a way to count the number of arguments passed to a macro, and, to stay within C11 boundaries, we have to resort to some additional macros.

This is the most common way to count the number of arguments (up to a maximum of 4):

#define ARG_CNT(_1,_2,_3,_4,_N, ...) _N
#define ARG_COUNT(...)   ARG_CNT(__VA_ARGS__, 4, 3, 2, 1, 0)

For example, the macro ARG_COUNT(a,b,c,d) will be expanded to:

ARG_COUNT(a, b, c, d)

  ARG_CNT(a, b, c, d, 4, 3, 2, 1, 0)
          │  │  │  │  │
         _1,_2,_3,_4,_N
                      |
                      4

The macro ARG_COUNT(a,b) will be expanded to:

ARG_COUNT(a, b)

  ARG_CNT(a, b, 4, 3, 2, 1, 0)
          │  │  │  │  │
         _1,_2,_3,_4,_N
                      |
                      2

As long as there are between one and four arguments, _N will match the number of passed arguments.

The limit here is that we can only count up to a predefined number (4 in the example above) but this is not a real limitation: the number of arguments of a function should not be too high, and, in any case, you can extend the list of numbers in ARG_COUNT() and the list of arguments in ARG_CNT() to match your needs.
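
As an illustration of how the lists can be extended, here is a sketch that counts up to 8 arguments. ARG_CNT8 and ARG_COUNT8 are hypothetical names chosen for this example.

```c
// Same idea, extended to count up to 8 arguments.
// ARG_CNT8 and ARG_COUNT8 are hypothetical names used for this sketch.
#define ARG_CNT8(_1, _2, _3, _4, _5, _6, _7, _8, _N, ...) _N
#define ARG_COUNT8(...) ARG_CNT8(__VA_ARGS__, 8, 7, 6, 5, 4, 3, 2, 1, 0)

// The count is a plain integer constant, usable at compile time.
// The arguments are discarded by the preprocessor, so a, b, c, ...
// do not need to be declared anywhere.
_Static_assert(ARG_COUNT8(a) == 1, "one argument");
_Static_assert(ARG_COUNT8(a, b, c) == 3, "three arguments");
_Static_assert(ARG_COUNT8(a, b, c, d, e, f, g, h) == 8, "eight arguments");
```

Keep in mind that going beyond the limit is not detected: with one argument too many, _N silently becomes that extra argument instead of a count.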

Now we need to map our function so that, for example, move(14) maps to move_1(14), and move(14,5) to move_2(14,5):

#define ARG_JOIN(x ,y)   x ## y
#define ARG_VRG(x, y)    ARG_JOIN(x, y)

The macro ARG_VRG() creates a new identifier from two pieces. It delegates to ARG_JOIN() because the operands of ## are not macro-expanded: the extra level ensures that an argument like ARG_COUNT(...) is replaced by its value before the pasting happens. For example, the expansion of ARG_VRG(move_,2) is the identifier move_2.
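
The difference between the two levels can be observed with a tiny sketch. COUNT, handler_2 and handler_COUNT are hypothetical names, introduced only to make the two results visible.

```c
#define ARG_JOIN(x, y)  x ## y
#define ARG_VRG(x, y)   ARG_JOIN(x, y)

// Hypothetical names, only to make the difference observable.
#define COUNT 2
int handler_2     = 20;   // what we want to reach
int handler_COUNT = -1;   // what a direct paste produces

// ARG_VRG(handler_, COUNT)  -> COUNT is expanded first -> handler_2
// ARG_JOIN(handler_, COUNT) -> pasted as written       -> handler_COUNT
```

Using ARG_JOIN() directly pastes the name COUNT as it is written, which is why the dispatching macros always go through ARG_VRG().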

Let's put everything together:

#define ARG_CNT(_1,_2,_3,_4,_N, ...) _N
#define ARG_COUNT(...)   ARG_CNT(__VA_ARGS__, 4, 3, 2, 1, 0)
#define ARG_JOIN(x ,y)   x ## y
#define ARG_VRG(x, y)    ARG_JOIN(x, y)

#define N ((char)'N')
#define S ((char)'S')
#define E ((char)'E')
#define W ((char)'W')

// A specific macro for each function is needed to avoid possible conflicts with other functions using `ARG_VRG()`.
#define move_arg(f,...) ARG_VRG(f, ARG_COUNT(__VA_ARGS__))(__VA_ARGS__)

#define move(...)       move_arg(move_, __VA_ARGS__)     
#define move_1(s)       move_f(s,0,N)
#define move_2(s, t)   _Generic( (t), \
                            int: move_f(s, (int)(t), N), \
                           char: move_f(s, 0, (char)(t))  \
                        )
#define move_3(s,t,d)   move_f(s,t,d)
void move_f(int steps, int offset, char direction);
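
To check how each call is dispatched, here is a self-contained sketch of the counting approach. The recording move_f(), move_call_t and last_move are hypothetical and exist only to make the result of each call visible.

```c
#define ARG_CNT(_1, _2, _3, _4, _N, ...) _N
#define ARG_COUNT(...)  ARG_CNT(__VA_ARGS__, 4, 3, 2, 1, 0)
#define ARG_JOIN(x, y)  x ## y
#define ARG_VRG(x, y)   ARG_JOIN(x, y)

#define N ((char)'N')
#define S ((char)'S')
#define E ((char)'E')
#define W ((char)'W')

#define move_arg(f, ...) ARG_VRG(f, ARG_COUNT(__VA_ARGS__))(__VA_ARGS__)
#define move(...)        move_arg(move_, __VA_ARGS__)
#define move_1(s)        move_f(s, 0, N)
#define move_2(s, t)     _Generic((t),                      \
                              int: move_f(s, (int)(t), N),  \
                             char: move_f(s, 0, (char)(t))  \
                         )
#define move_3(s, t, d)  move_f(s, t, d)

// Hypothetical bookkeeping: remember the arguments of the last call.
typedef struct {
    int  steps;
    int  offset;
    char direction;
} move_call_t;

move_call_t last_move;

void move_f(int steps, int offset, char direction)
{
    last_move = (move_call_t){ steps, offset, direction };
}

// move(14)       -> move_1(14)       -> move_f(14, 0, N)
// move(14, 5)    -> move_2(14, 5)    -> move_f(14, 5, N)
// move(14, W)    -> move_2(14, W)    -> move_f(14, 0, W)
// move(14, 5, S) -> move_3(14, 5, S) -> move_f(14, 5, S)
```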

Multiuse example

Let's look at another example around the following object:

  typedef struct {
    int x;
    int y;
    int RGB;
  } obj_t;

Let's focus on the setters/getters for RGB. We want to be able to:

- Get the current value of the RGB field
- Set the RGB value passing an integer (e.g. 0xFFE0A0)
- Set the RGB value passing the three components R, G, and B separately
- Set the RGB value passing the name of the color (e.g. "White", "Pink", ...)

Normally, one would create the following API:

  int get_rgb(obj_t *obj);
  int set_rgb_from_int(obj_t *obj, int rgb );
  int set_rgb_from_str(obj_t *obj, char *);
  int set_rgb_from_rgb(obj_t *obj, int r, int g, int b );

and call the appropriate function depending on what is needed.

But let's say we want to create a sort of "super function" that can do all of the above:

  int obj_rgb(obj_t *obj, ...);

and will select the proper function for us. Here's how we can make it:

#define ARG_CNT(_1,_2,_3,_4,_N, ...) _N
#define ARG_COUNT(...)   ARG_CNT(__VA_ARGS__, 4, 3, 2, 1, 0)
#define ARG_JOIN(x ,y)   x ## y
#define ARG_VRG(x, y)    ARG_JOIN(x, y)

#define obj_rgb_arg(f,...)  ARG_VRG(f, ARG_COUNT(__VA_ARGS__))(__VA_ARGS__)

#define obj_rgb(...)        obj_rgb_arg(obj_rgb_, __VA_ARGS__)     

#define obj_rgb_1(o)        get_rgb(o)

// uintptr_t requires <stdint.h>
#define obj_rgb_2(o, c)    _Generic( (c), \
                                 int: set_rgb_from_int(o, (int)((uintptr_t)(c))), \
                              char *: set_rgb_from_str(o, (char *)((uintptr_t)(c))) \
                            )

#define obj_rgb_4(o,r,g,b)  set_rgb_from_rgb(o,r,g,b)

Armed with these, we can write something like:

   int old_rgb = obj_rgb(my_obj);

   obj_rgb(my_obj, "Black");

   obj_rgb(my_obj, 255,255,255);

   obj_rgb(my_obj, old_rgb);

Don't be distracted by the fact that mixing getters and setters might be seen as bad design. The point here is to show what is possible; the actual API design must follow sane principles of usability, understandability, and maintainability.
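
For completeness, here is a self-contained sketch of the whole "super function". The bodies of the four functions are hypothetical (in particular, the name lookup only knows two colors); the macros follow the scheme described above.

```c
#include <stdint.h>   // uintptr_t
#include <string.h>   // strcmp

typedef struct {
    int x;
    int y;
    int RGB;
} obj_t;

// Hypothetical implementations of the four functions of the API.
int get_rgb(obj_t *obj)                   { return obj->RGB; }
int set_rgb_from_int(obj_t *obj, int rgb) { return obj->RGB = rgb; }
int set_rgb_from_rgb(obj_t *obj, int r, int g, int b)
{
    return obj->RGB = (r << 16) | (g << 8) | b;
}
int set_rgb_from_str(obj_t *obj, char *name)
{
    // Only two names are known in this sketch.
    if (strcmp(name, "White") == 0) return obj->RGB = 0xFFFFFF;
    if (strcmp(name, "Black") == 0) return obj->RGB = 0x000000;
    return -1;  // unknown color name, RGB is left untouched
}

#define ARG_CNT(_1, _2, _3, _4, _N, ...) _N
#define ARG_COUNT(...)  ARG_CNT(__VA_ARGS__, 4, 3, 2, 1, 0)
#define ARG_JOIN(x, y)  x ## y
#define ARG_VRG(x, y)   ARG_JOIN(x, y)

#define obj_rgb_arg(f, ...) ARG_VRG(f, ARG_COUNT(__VA_ARGS__))(__VA_ARGS__)
#define obj_rgb(...)        obj_rgb_arg(obj_rgb_, __VA_ARGS__)

#define obj_rgb_1(o)        get_rgb(o)
// The casts through uintptr_t keep the branch that is NOT selected valid.
#define obj_rgb_2(o, c)     _Generic((c),                                   \
                 int: set_rgb_from_int(o, (int)((uintptr_t)(c))),           \
              char *: set_rgb_from_str(o, (char *)((uintptr_t)(c)))         \
            )
#define obj_rgb_4(o, r, g, b) set_rgb_from_rgb(o, r, g, b)
```

Note that there is no obj_rgb_3: a call with exactly three arguments fails to compile, which is the behaviour we want here.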

Zero arguments

All the techniques above have one thing in common: they assume there is at least one argument. Sometimes it would be nice to be able to call f() or f(x), but it is not very common.

My vrg library allows for zero arguments in a C11-compatible way; have a look at it if you are interested.

However, I feel it is too complicated and, frankly, the cases for zero arguments are very few.

The actual solution for handling zero arguments is to use the new __VA_OPT__ preprocessor feature introduced by C23 (and already available in some form as an extension in gcc, for example).

Let's wait for C23 support to be more widespread in C compilers and we'll revisit these techniques.
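
In the meantime, here is a minimal sketch of what counting from zero could look like with __VA_OPT__. It assumes a compiler that supports it (C23 mode, or a recent GCC or Clang as an extension); tick(), tick_f() and total_ticks are hypothetical names made up for the example.

```c
// Sketch only: requires __VA_OPT__ (C23, or a recent GCC/Clang as an extension).
#define ARG_CNT(_1, _2, _3, _4, _N, ...) _N
// The comma after __VA_ARGS__ is emitted only if there is at least one argument.
#define ARG_COUNT(...)  ARG_CNT(__VA_ARGS__ __VA_OPT__(,) 4, 3, 2, 1, 0)
#define ARG_JOIN(x, y)  x ## y
#define ARG_VRG(x, y)   ARG_JOIN(x, y)

// Hypothetical example: tick() advances by 1, tick(n) advances by n.
#define tick(...)  ARG_VRG(tick_, ARG_COUNT(__VA_ARGS__))(__VA_ARGS__)
#define tick_0()   tick_f(1)
#define tick_1(n)  tick_f(n)

int total_ticks = 0;

// Adds n to the running total and returns the new total.
int tick_f(int n)
{
    total_ticks += n;
    return total_ticks;
}

// tick()  -> ARG_COUNT() is 0  -> tick_0()  -> tick_f(1)
// tick(5) -> ARG_COUNT(5) is 1 -> tick_1(5) -> tick_f(5)
```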

Conclusion

We showed what C allows us to do about polymorphism. As often happens with C, the language only offers the basic building blocks; it's up to the programmer to put them together in the correct way.

Mixing the techniques described here to handle function signatures with the techniques described in the previous article, you can achieve most of the (good) things modern object-oriented languages can do.

Just one caveat about complexity. It's far too easy to be tempted to write long, complex macros just to allow for a slightly nicer function call. One should always balance the complexity of the code against the actual benefit it provides.

Remember: since there is no bug in non-existent code, the less code, the better!

If you find yourself staring at a series of complicated, indirect macro expansions, you might have gone too far with your polymorphic desires.
