Cheng Shao

Thoughts about configure scripts and feature vectors

For a long time, I’ve been wanting to rant about stuff like configure scripts. They indirectly contribute a lot to my worktime headaches these days. Given I’m restarting personal blogging here, let’s see if I can turn that rant into a post.

What’s configure

Suppose you need to compile some software from a source tarball. The old unix tradition goes like ./configure && make && make install. Nothing fancy about the make part, but why is configure needed in the first place?

Well, configure is just a shell script that probes the build environment and generates a C header file to be included in the source code. Each project has its own configure script that probes for different things: headers, functions, any feature whose build-time existence the source code needs to check so it can provide a fallback code path when it’s absent.
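
For concreteness, here’s what such a generated header might look like (a hypothetical excerpt; autoconf-based projects usually call it config.h or similar):

```c
/* config.h -- hypothetical excerpt of an autogenerated feature header. */
#define HAVE_UNISTD_H 1      /* probe for unistd.h succeeded */
#define HAVE_CLOCK_GETTIME 1 /* probe for clock_gettime succeeded */
/* #undef HAVE_FOO */        /* probe for foo failed on this platform */
```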

Say that the source code needs to call the foo function, which exists on just some of the project’s supported platforms. If configure detects foo, it’ll write a #define HAVE_FOO 1 line to the generated header. The source code can then include the auto-generated feature header and use CPP directives like #if defined(HAVE_FOO) to decide whether function foo exists in the build environment.
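
In code, the pattern looks roughly like this (a minimal sketch; config.h, foo and get_value are hypothetical names):

```c
#include "config.h" /* the autogenerated feature header */

#if defined(HAVE_FOO)
extern int foo(void); /* provided by the platform */
int get_value(void) { return foo(); }
#else
int get_value(void) { return 0; } /* fallback code path when foo is absent */
#endif
```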

configure is typically auto-generated from a template using autoconf. In some projects it can be a hand-written python script. There are also build systems like cmake that take over configure's role completely, probing the build environment on their own and generating the feature header.

Anyway, my rants here are only about the idea of build-time feature detection, not about how configure is actually implemented (although that’s also annoying enough for its own blog post).

What’s a feature vector

How many HAVE_ macros do you have in your project?

```
~/ubuntu/ghc$ grep -rIF HAVE_ | wc -l
842
```

Wait a sec. Most of those should be mere duplicates; for instance, HAVE_FOO is very likely to occur in multiple source locations. One should really count how many distinct features (headers, functions, etc.) are checked by configure.

```
~/ubuntu/ghc$ grep -rIF AC_CHECK_ | wc -l
171
```

The number above is a lower bound, since in autoconf a single AC_CHECK_HEADERS or AC_CHECK_FUNCS line can check multiple entities at once.

Now, we can introduce the concept of a “feature vector”: an N-dimensional boolean vector, where N is the number of things you check at build time. Each possible value of the feature vector is a point in the feature space, specifying one build-time configuration.
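
As a quick C sketch of the concept (the struct and the constant are purely illustrative):

```c
#include <stdbool.h>

#define N_FEATURES 171 /* e.g. the AC_CHECK_ count from above */

/* One value of this struct is one point in the feature space,
 * i.e. one build-time configuration. */
typedef struct {
  bool have[N_FEATURES]; /* have[i]: does feature i exist at build time? */
} feature_vector;
```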

How large is the feature space?

  • Definitely not as large as 2^N. Most of the dimensions aren’t orthogonal; one can imagine clusters of things that either exist as a whole or don’t exist at all.
  • Still, way larger than the space of configurations people actually test on CI to avoid bit-rotting.

My rant is about the second point above.

Why the rant

  • In GHC, one can pass various configure arguments to enable/disable features like unregisterised codegen, large address space, the native IO manager, etc. The default configuration passes the test suite, but once you start messing with the configure arguments, expect failing test cases. At the very least, those cases should be explicitly marked fragile/broken in those configurations!
  • In GHC, unix, and probably other places I’ve hacked on and forgotten: the API evolves, but people forget to update the code in the #else -guarded branches, because it’s not tested on CI; maybe that particular CPP-checked thing is assumed to exist on all platforms. Well, WASI is a rather restricted platform, so all those bitrotten branches came back to bite me when I targeted WASI (see the sketch below).
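
Here’s a hypothetical illustration of that bitrot pattern: the #if branch got updated when an API changed, the #else branch didn’t, and nothing notices until someone builds on a platform without HAVE_BAR:

```c
#include "config.h"

#if defined(HAVE_BAR)
extern long bar(int flags); /* kept up to date: bar grew a flags argument */
long get_bar(int flags) { return bar(flags); }
#else
/* Stale fallback: nobody updated this signature when get_bar grew its
 * flags argument, and no CI job compiles it, so it silently bit-rotted. */
long get_bar(void) { return 0; }
#endif
```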

There’s nothing wrong with the need to write portable code and do build-time feature detection. We all know untested code is bad, but far fewer people are aware that untested feature vectors are also bad!

Also, this isn’t just a matter of “code coverage”. It’s perfectly possible to achieve a high coverage rate by testing against just a few feature vectors, while leaving the potentially broken build-time configurations in the dark.

How to solve it

Most software written with tons of #ifdefs lacks the feature vector mindset, and doesn’t have the testing logic to:

  • Perform QuickCheck-style random testing in the feature space: generate a feature vector, run the tests against it, and “shrinking” is just moving the point closer to the known-to-work base point, i.e. the default config you get when configuring on a typical platform without any custom arguments. This allows discovering failures that arise from complex & unintended interactions between different dimensions of the feature vector (a sketch follows after this list).
  • Hide certain auto-detected features that do exist. This allows testing for restricted/exotic platforms while still running the tests on a common platform. It’s not trivial to implement, especially for things in standard libraries, but it should be possible by using the poison pragma, creating cc wrappers, or even isolated sysroots (also sketched below).
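
A minimal sketch of the first idea, in C. Everything here is hypothetical: run_suite stands in for “reconfigure, rebuild and run the testsuite against a given feature vector”, and the stub version below fakes a failure triggered by the interaction of two disabled dimensions, so that shrinking has something to find:

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define N 8 /* dimensions under test; real projects have far more */

/* The known-to-work base point: the default configuration. */
static const bool base[N] = {true, true, true, true, true, true, true, true};

/* Stub standing in for "reconfigure + rebuild + run testsuite". Here it
 * simulates a bug triggered by the interaction of two disabled features. */
static bool run_suite(const bool v[N]) { return v[2] || v[5]; }

/* Shrinking: flip differing dimensions back toward the base point one at
 * a time, keeping only the flips still needed to reproduce the failure. */
static void shrink(bool v[N]) {
  for (int i = 0; i < N; i++) {
    if (v[i] != base[i]) {
      bool saved = v[i];
      v[i] = base[i];
      if (run_suite(v)) /* passes again: this flip mattered, restore it */
        v[i] = saved;
    }
  }
}

int main(void) {
  bool v[N];
  for (int trial = 0; trial < 100; trial++) {
    for (int i = 0; i < N; i++)
      v[i] = rand() & 1; /* a random point in the feature space */
    if (!run_suite(v)) {
      shrink(v);
      printf("minimal failing feature vector:");
      for (int i = 0; i < N; i++)
        printf(" %d", v[i]);
      putchar('\n');
      return 1;
    }
  }
  puts("no failing feature vector found");
  return 0;
}
```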
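
And a minimal sketch of the second idea, using GCC’s poison pragma. The header name is hypothetical; the idea is to force-include it through a cc wrapper (e.g. one that appends -include hide_clock_gettime.h), so that configure’s own probes and any direct use of clock_gettime fail to compile, simulating a platform without it:

```c
/* hide_clock_gettime.h -- hypothetical header, force-included via a cc
 * wrapper. Include the system header first so its own declaration doesn't
 * trip the poison, then poison the identifier: any later use of
 * clock_gettime becomes a compile error. */
#include <time.h>
#pragma GCC poison clock_gettime
```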
