> Option 1) seemed like the easiest one, but it also felt a bit like kicking the can down the road – plus, it introduced the question of which standard to use.
Arguably, that's the sanest one: you can't expect the old C code to follow the rules of the new versions of the language. In a better world, each source file would start with something like
#pragma lang_ver stdc89
and it would automatically kick off the compatibility mode in the newer compilers, but oh well. Even modern languages such as Go miss this obvious solution.On the topic of the article, yeah, sticking anything other than 0 or 1 into C99 bool type is UB. Use ints.
If you’re just a packager, it’s your job to get the package to build and work correctly; for your own sanity, you should be making minimal changes to the underlying code to facilitate that. Get it building with the old language version and file a bug report.
edition =
statement. //go:build go1.18
tells the toolchain to use Go 1.18. A slightly different syntax was used prior to Go 1.17 but the feature itself has existed since Go 1.0.Well, to be pedantic, the entire point of the C standard, and the standard body, is that you should expect it to work, as long as you're working within the standard!
In any case: blindly switching what is essentially a typedef-ed int into _Bool has no business working as expected, since _Bool is a rather quirky type.
Modern compilers are bloated as hell huge things which makes it a bit impractical, but if it was a normal thing to do then we'd probably have optimized the binary sizes somewhat.
I just really like the idea of including _everything_ you need for the project. Also ensures that weird problems like this dont happen. As an extra benefit, if you included the compiler source and a bootstrapping path instead of just the latest binary, then you could easily include project specific compiler / language extensions with no extra effort.
If I were the author, I would skip the part about using the compiler explorer and reading the assembly. When I write C code, I need to satisfy the rules of the C language. Unless I’m debugging the compiler or dealing with performance issues, my experience is that reading the generated assembler and understanding it is usually a slow way of debugging. The author eventually did compile with -fsanitize=undefined but I would honestly do that first. It prints a nice error message about the exact issue as the author has shown.
Understanding what the C compiler generates is interesting, but without a corresponding intuition about the optimizer passes, such understanding is shallow and unlikely to be generalized to other problems in the future. You probably won’t even remember this the next time you debug another undefined behavior. On the other hand, if you were to know the optimizer passes employed by the compiler and could deduce this code from that, then it is a useful exercise to enhance your intuition about them.
I do agree that knowledge of compiler optimizations is really important to working this way, though you'll eventually pick them up anyway. I don't see much value in looking at -O0 or -Og disassembly. You want the strongest stuff the compiler can generate if you're going to do this, which is usually either -O3 or -Oz (both of which are strong in their own ways). -O0 disassembly is... just so much pain for so little gain. Besides, -O3 breaks more stuff anyway!
For someone without this level of experience (and who isn't interested in learning)... yeah, I can see why you'd want to do this another way. But if you've got the experience already, it's plenty fast enough.
All of that is less true in the microcontroller world where compilers change more slowly and your product will likely be locked to a specific compiler version for its entire lifecycle anyway (and certainly you don't have to worry about end users compiling with a different compiler). In that case maybe getting deeply involved in your compiler's internals makes more sense.
You'd never want (except in extreme desperation) to use this knowledge to to justify undefined behavior in your code. You use it to make sure you don't have any UB around! Strategies like "I wrote a null pointer check, why isn't it showing up anywhere in the assembly?" can be really helpful to resolve problems.
Yikes. I think this article undersells the point somewhat. This line of code undermines the type system by spraying -1's into an array of structs, so the only surprise to me is that it took this long to break.
typedef struct
{
// If false use 0 for any position.
// Note: as eight entries are available,
// we might as well insert the same name eight times.
boolean rotate;
// Lump to use for view angles 0-7.
short lump[8];
// Flip bit (1 = flip) to use for view angles 0-7.
byte flip[8];
} spriteframe_t;
which is okay to splat with all-ones as long as `boolean` is either a typedef for a fundamental type(*) or an enum – because C enums are just ints in a trenchcoat and have no forbidden bit patterns! The C99 `bool`/`_Bool` is, AFAICS, the first type in C that has fewer values than possible bit patterns.So yeah, on C99 and C++ this always had UB and could've broken at any time – though I presume compiler devs were not particularly eager to make it ill-behaved just because. But in pre-C99 it's entirely fine, and `rotate == true || rotate == false` could easily be false without UB.
---
(*) other than `char` for which setting the MSB is… not UB but also not the best idea in general.
> When boolean is an enum, the compiler does exactly what I expected – the == false condition checks if the value is equal to 0, and the == true condition checks if the value is equal to 1.
> However, when boolean is actually _Bool, the == false check is transformed into != 1, and the == true check is transformed into != 0 – which makes perfect sense in the realm of boolean logic. But it also means that for a value of 255, hilarity ensures: since 255 is neither 0 nor 1, both conditions pass!
So a value of 255 also makes both checks fail for the enum, but because of the ordering of the code, it was expected to evaluate as != false.
Had the check: ``` if (sprtemp[frame].rotate == false) ```
been written as: ``` if (sprtemp[frame].rotate != true) ```
then it would work for the `bool` type, but not the `enum` type, at least in C23 mode. Assumedly the C++ mode (effectively) treated the boolean as an enum, or possibly as `false == 0`, `true != 0`.
Seeing "== false" and variations thereof always triggers the suspicion that its author doesn't fully understand boolean expressions. I have once seen the even worse "(x == false) == true".
while(p) {
// do something with p
...
p = p->next;
}
if(err=foo()) {
printf("Error %d occurred\n", err);
...
}That would make it slightly easier to do things like memset()'ing a vector of boolean, or a struct containing a boolean like in this case. Backwards compatibility with pre-_Bool boolean expressions in C99 probably made that a non starter in any case.
Converting a 1-bit integer to a byte-sized or word-sized integer, by using the same extension rules as for any other size (i.e. by using either sign extension or zero extension), yields as the converted value for "true" either "1" for the unsigned integer interpretation or the value with all ones (i.e. "-1") for the signed integer interpretation.
So you could have "unsigned bool" and "signed bool", exactly like you have "unsigned char" and "signed char", to choose between the 2 possible representations.
Historical note: this was the case in QBasic, where true was defined as -1.
However when the use of Boolean algebra is embedded in some bigger theories, there are cases when the mapping to 0 and 1 becomes mandatory, e.g. in relationship with the theory of probabilities or with the theory of binary polynomials, where the logical operations can be mapped to arithmetic or algebraic operations.
The mapping to 0 and 1 is fully exploited in APL and its derivatives, where it enables the concise writing of many kinds of conditional expressions (in a similar manner to how mask registers are used in GPUs and in AVX-512).
Yes? That's precisely what I meant when I said that the traditional presentation of mathematical logic get it wrong: it assigns 0 to FALSE and 1 to TRUE, but it can be done other way around.
> Ah-ha! The generated instructions were ever so slightly different. This would be great news, if it wasn't for me forgetting about one little detail: I have zero knowledge of x86 assembly.
Lol'd at this. I've been there: "ah hah! I found you. hrm, now what does this mean..."
TFA makes me thankful my work doesn't involve C / C++. Learning it earlier in life was enough.
some_var[1] = 500;
however, seeing the output, what was actually run must have been some_var[1] = 1;That practice misses the point of what it means for a value to be a boolean. (Even in C, if it's actually an int.)
Ah yes, obfuscation
if (sprtemp[frame].rotate == false)
Note that this is explicitly comparing two values, which is very different from checking whether a single value is true. Surely you wouldn't expect -1 == 0 to evaluate to true.I wouldn't, no - but that's exactly what's happening in the test case.
Likewise, I wouldn't expect -1 == 1 to evaluate to true, but here we are.
The strict semantics of the new bool type may very well be "correct", and the reversed-test logic used by the compiler is certainly understandable and defensible - but given the long-established practice with integer types - i.e "if(some_var) {...}" and "if(!some_var) {...}" - that non-zero is "true" and zero is "false", it's a shame that the new type is inconsistent with that.
No, but that's not what happened. What happened was that 255 == 0 was true. That's a bug.
if (something == true)
I haven't done so ever since (1997), and thus I avoid the contrary (with == false) as well, using ! instead. But I would be a lot less ashamed if I knew that there are such conditions in production software.
I would also never guess that the problem described in the article may occur...
That being said, for just testing the value, using the zero/nonzero test that every (?) cpu has is enough; I'm not sure what is achieved here with this more complex test.