Discussion on: The uninitialized variable anathema: non-deterministic C++

View post

Probably the best option would be to design language rules allowing to write to uninitialized variables, but not to read them, based on a static, compile-time analysis of the code. I'm not 100% sure but I remember C# does this. This allows to safely initialize a variable by using two different expressions in the two branches of an if-else sentence, while avoiding double initialization at run time. The compiler must tag each variable read access in the text (as legal or ilegal), by computing all the possible paths leading there and making sure each of them includes at least a write acces to the variable.

edA‑qa mort‑ora‑y • Dec 28 '17

This is not possible to do. There is no way a compiler can know, via static analysis whether a program uses an uninitialized variable or not. I believe it was proven to be an unsolvable problem, or rather an NP-complete problem over the size of the code.

In limited cases, such as local variables, some basic deducations can be done. But, since it's not possible to do fully, the compiler must err on the side of caution. It will assume all uses are invalid unless it has a simple path it can prove it was set.

For this reason you get a lot of false positive warnigns about uninitialized variables -- I see them when compling in C++ with warnings on. I recall having them in C# as well.

Carlos Ureña • Dec 28 '17

Of course, it cannot be done in the general case, you're ok. I was thinking in simple cases where the compiler can assert the variable is explicitly accessed inside its declaration scope. In fact, you can declare a variable 'v' (uninitialized) and then immediately pass its address as a parameter to a function 'f' whose source is not available to the compiler at that moment (by using something like f(&v) in C/C++ syntax). Thus the compiler cannot tell whether 'f' correctly initializes 'v' or not, or whether 'f' incorrectly reads it. In order to handle this, the language must be extended with in/out tags (and rules) for parameters. If 'p' is the single formal parameter of 'f', then 'p' must be tagged as 'in', 'out' or both in 'f' declaration. Then the call 'f(&v)' by using an uninitialized variable is correct if 'p' is tagged as 'out', but not if it is tagged as 'in/out', or 'in'. When (independently) compiling 'f', the compiler ensures 'p' is accessed legally according to its tag. This is how C# works. This still does not handles it properly when a function directly access a global variable, but we all know this side-effects must not be used.....