Show HN: Xr0 – Vanilla C Made Safe with Annotations (https://news.ycombinator.com/item?id=37536186, 2023-09-16, 13 comments)
Xr0 Makes C Safer than Rust (https://news.ycombinator.com/item?id=39858240, 2024-03-28, 41 comments)
Xr0: C but Safe (https://news.ycombinator.com/item?id=39936291, 2024-04-04, 144 comments)
Show HN: Xr0 is a Static Debugger for C (https://news.ycombinator.com/item?id=40472051, 2024-05-05, 4 comments)
I've had projects which stalled for a few months or even a year, but generally if I said I'll get to it "soon" and two years later I haven't, that's not getting done. There are two boxes of books I planned to unbox "soon" after discovering that the bookshelves for my new flat were full & so I had nowhere to put them. That was when Obama was still President of the United States of America. Don't expect me to ever get around to unboxing those books, I clearly haven't even missed them.
And they got stuck with the bounds checker and loops.
But other such checkers are far more advanced, with a better contract syntax.
But "We currently model all buffers as infinitely-sized. We will be adding out-of-bounds checking soon." That's the hard problem.
It's not particularly difficult for the prover. You essentially just need to translate the C into your ATP language of choice, with a bunch of constraints that check undefined behaviour never occurs. Tools such as Frama-C and CBMC have supported this for a long time.
The difficult part is for the user, as they need to add assertions all over the place to keep the prover on the right track and to break the program down into manageable sections. You also want a bunch of tooling to make things easier for the user, which is a problem that can be as difficult as you want it to be, since there's always going to be one more pattern you can detect and add a custom handler/error message for.
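To make "assertions all over the place" concrete, here is a minimal sketch in plain C. It uses the standard assert() rather than the annotation syntax of Xr0, Frama-C or any other tool, and copy_bounded is a hypothetical helper made up for illustration. The point is only the shape: a precondition at the top, and a fact restated inside the loop so a checker does not have to rediscover it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copies min(dst_len, src_len) ints from src to dst
 * and returns how many elements were copied. */
size_t copy_bounded(int *dst, size_t dst_len, const int *src, size_t src_len)
{
    /* Precondition: both buffers must exist. */
    assert(dst != NULL && src != NULL);

    size_t n = src_len < dst_len ? src_len : dst_len;

    for (size_t i = 0; i < n; ++i) {
        /* Loop invariant, stated explicitly: every access is in bounds. */
        assert(i < dst_len && i < src_len);
        dst[i] = src[i];
    }
    return n;
}
```

A verifier would try to prove these assertions for all inputs, whereas a normal build only checks them at run time for the inputs it actually sees.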
I know that. Did that decades ago. I wanted to see how they did the annotations and proofs. What can you express? Partially filled arrays are moderately hard. Sparsely initialized arrays need a lot of scaffolding. This is an area where most "safe C" variants have foundered.
Ownership semantics are described in every serious C interface. Linters for checking them have also existed for decades. I find the notion that Rust invented it incredibly stupid. Rust just has different ownership semantics and makes them an enforced part of the language (arguably a good idea). And yes, they of course also do bounds-checking.
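As an illustration of ownership being "described" in a C interface, here is a small sketch. The function names are hypothetical and the convention shown (ownership stated in the doc comment) is just one common style, not something any particular linter requires.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of s, or NULL if allocation fails.
 * Ownership: the caller owns the result and must release it with free(). */
char *name_dup(const char *s)
{
    assert(s != NULL);            /* precondition: s must be a valid string */
    size_t len = strlen(s) + 1;   /* include the terminating '\0' */
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Ownership: borrows s for the duration of the call only;
 * it is neither freed nor retained. */
size_t name_length(const char *s)
{
    assert(s != NULL);
    return strlen(s);
}
```

Nothing in the language enforces these comments, which is the difference from Rust that the comment above points at.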
#include <stdio.h>
int main()
{
    int i;
    return i;
}
the behaviour is undefined because i’s value is indeterminate.
D solves that problem by initializing variables with the default initializer (in this case 0) if there is not an explicit initializer.
That was in the first version of D because I had spent days tracking down an erratic bug that turned out to be an uninitialized variable.
int test()
{
        int i;
        return i;
}
using clang on my Mac mini, and:
clang -c test.c
and it compiled without complaint.
> clang -Wall test.c
test.c:4:16: warning: variable 'i' is uninitialized when used here [-Wuninitialized]
    4 |         return i;
      |                ^
test.c:3:14: note: initialize the variable 'i' to silence this warning
    3 |         int i;
      |              ^
      |               = 0
1 warning generated.
For the oldest compiler I have access to, VC++ 6.0 from 1998:
warning C4700: uninitialized local variable 'i' used
Not really, as C had even more diverse implementations pre-standardization. I would say the situation is now much less diverse under the rule of GCC and Clang. (Yeah, MSVC also exists.)
I tried pretty hard to make D a warning-less language, but still some crept in grump grump.
Have fun with this one:
for (int i = 0; i < end; ++i);
foo(i);
One of the best programmers I know came up to me with this loop and told me my C compiler was broken because the loop was only executed once. I pointed at the ; and you can guess the rest.
I added a warning for that in the C compiler, and for D simply disallowed it. I've noticed that some C compilers have since added a warning for that as well. The C folks should just make it illegal.
I've also fixed printf in D so that:
char* p;
printf("%d\n", p);
gives an error message, and the right format to use for `p`. It was a little thing, but it sure found a lot of incorrect formats in my code.
I often have code that looks like this:
for (ptr = start; random_condition (*ptr); ptr = ptr->next);
for (ptr = ptr->next; other_condition (*ptr); ptr = ptr->prev);
... [do action]
for (ptr = end; to_be_deleted (*ptr) && (delete (ptr), TRUE); ptr = ptr->prev);
I wouldn't be happy about your policy.
> I've also fixed printf in D so that [...] gives an error message
Just last week I had a case where the C compiler complained that I should use %lld for long long, but the printf implementation shipped with the compiler doesn't support that. Thus using %ld, even though it is undefined behaviour, was the correct action. I wouldn't like my language making up more work for me for no reason.
for (ptr = start; random_condition (*ptr); ptr = ptr->next) { }
Then anyone reading your code will know the empty loop was intentional. BTW, many C compilers warn about the ; on a for loop.
Have you ever discovered this bug:
if (condition);
doThis();
It's a hard one to see in a code review. Yes, that's also disallowed in D.
> I should use %lld for long long, but the printf implementation shipped with the compiler doesn't support that.
Weird that your compiler would support long long, but with a broken Standard library. I don't see that as a feature. You can always cast the argument to long, and at least you won't get undefined behavior.
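A minimal sketch of the cast suggested here, assuming the value is known (or checked) to fit in a long. format_as_long is a hypothetical helper, and it writes into a buffer with snprintf only so that the result is easy to inspect.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: formats v with "%ld" for a printf that lacks "%lld".
 * Returns the number of characters snprintf would write, or -1 if v does
 * not fit in a long (in which case nothing is written). */
int format_as_long(char *buf, size_t size, long long v)
{
    assert(buf != NULL && size > 0);

    /* Refuse values the cast would not preserve. */
    if (v < LONG_MIN || v > LONG_MAX)
        return -1;

    /* The argument type now matches the format, so this is well defined. */
    return snprintf(buf, size, "%ld", (long)v);
}
```

On a target where long is 32 bits, the range check is what keeps the cast honest; where long and long long have the same width it never fires.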
Could do that, I just don't like the look. :-)
> Have you ever discovered this bug:
Actually no, because the coding style I use either puts it on a single line or with braces. I never indent just a single line.
So:
if (condition) doThis ();
or:
if (condition)
{
    doThis ();
}
> Weird that your compiler would support long long, but with a broken Standard library.
Yeah, for your information it was GCC combined with newlib for arm-none-eabi shipped with Debian/MS Windows.
Those are not what I was mentioning. It was the use of ; immediately following for(). No loop body.
> Have you ever discovered this bug: [...]
Why? What does it cost you?
int x = void;
will do it.
To be clear I think D default initializing is better than C leaving uninitialized. I just don't think it's optimal since the issue isn't one of convenience but rather bug prevention.
I don't dislike it per se. I think it's an improvement over not having it but I also think that it's sub-optimal in many scenarios when compared to simply failing the build. Since it's a tradeoff to prevent bugs at the expense of language brevity I don't think it should be surprising that not everyone agrees about it.
My personal preference is the "void" that D has for uninitialized in addition to empty braces or something similarly terse for default initialization.
Slight tangent but somehow C++ ended up with the worst of all possible worlds. Some types uninitialized by default, other types default initialized, terse initialization syntax variants that look highly similar but do different things, and much of it context dependent.
struct S { int a; float f; }
S s;
For C, having an error when the initializer is omitted is better than nothing. However, that is not part of the C Standard, and you'll be relying on having a C compiler with that extension. Otherwise there is undefined behavior. To do it right means it needs to be standardized.
if (turing_machine_halts(tm)) return malloc(1); else return NULL;
How is this handled?
This is "valid" C, but I wholly support checking tools that reject it.
This sounds like a very simple form of abstract interpretation, how do you handle the issues it generally brings?
For example if after one branch you don't converge, but after two you do, do you accept that? What if this requires remembering the relationship between variables? How do you represent these relationships?
Historically this has been a tradeoff between representing the state space with high or perfect precision, which can require an exponential amount of memory or computation, and approximating it with an abstract domain, which tends to lose precision after certain operations.
Add dynamically sized values (e.g. arrays) and loops/recursion and now you also need to simulate a possibly unbounded number of iterations.
you should refactor so that it's representable.
> Add dynamically sized values (e.g. arrays) and loops/recursion and now you also need to simulate a possibly unbounded number of iterations.
regions are hard. You kinda have to reject regions that are not uniform. loops you can find a fixpoint for.
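For anyone who hasn't seen what "finding a fixpoint" for a loop looks like, here is a toy sketch using the interval domain. This is a textbook technique picked for illustration, not a description of how Xr0 works, and all names are hypothetical. It analyses the counter of `for (i = 0; i < end; ++i)`, assuming 0 < end < INT_MAX.

```c
#include <assert.h>
#include <limits.h>

/* Abstract value: every possible value of i lies in [lo, hi]. */
typedef struct { int lo, hi; } interval;

/* Least upper bound of two intervals. */
static interval join(interval a, interval b)
{
    interval r = { a.lo < b.lo ? a.lo : b.lo, a.hi > b.hi ? a.hi : b.hi };
    return r;
}

/* Widening: a bound that is still moving jumps straight to its extreme,
 * which forces termination at the cost of precision. */
static interval widen(interval old, interval next)
{
    interval r = { next.lo < old.lo ? INT_MIN : old.lo,
                   next.hi > old.hi ? INT_MAX : old.hi };
    return r;
}

/* Abstract effect of one iteration: apply the guard i < end, then ++i.
 * Assumes i.lo < end, which holds for every interval passed in below. */
static interval loop_body(interval i, int end)
{
    if (i.hi > end - 1)
        i.hi = end - 1;     /* guard: i <= end - 1 */
    i.lo += 1;              /* ++i */
    i.hi += 1;
    return i;
}

/* Interval for i at the loop head (including the exit test). */
interval loop_head_interval(int end)
{
    assert(end > 0 && end < INT_MAX);
    const interval init = { 0, 0 };
    interval head = init;

    for (;;) {
        interval next = join(head, loop_body(head, end));
        if (next.lo == head.lo && next.hi == head.hi)
            break;                      /* fixpoint reached */
        head = widen(head, next);
    }
    /* One narrowing step recovers the precision widening threw away. */
    return join(init, loop_body(head, end));
}
```

With end = 10, widening first overshoots to [0, INT_MAX] and the narrowing step brings it back to [0, 10]. The precision question raised above is exactly about the cases where no such recovery is possible, for instance when the bound depends on a relationship between two variables that intervals cannot express.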
In those cases you generally try to rewrite it in another way
It's odd that so many people promote Rust, yet we don't even use static analysis and validators for C or C++.
How about enforcing coding standards automatically first, before switching to a new language?
Rust restricts the shape of program you are able to write so that it's possible to statically guarantee memory safety.
> Does it require annotations or can it validate any c code?
If you had clicked through you would see that it requires annotations.
C and C++ don't require static analysis, and it's difficult to set up, and so most of us slide down the incentive gradient of using C / C++ without any helpers except CMake and gdb.
Rust requires it, so the noobies use it, so in 40 years the experts will accept it.
Is it though? I've only ever had to run "scan-build make" for my projects and it spits out a full folder of HTML pages that details any static analysis issues, and I didn't have to change my build system at all.
This version of Qt has leaks on exit so you need to ignore them when running asan/valgrind etc...
I agree it's not that hard and should be standard. The same goes for enabling all reasonable warnings and treating warnings as errors.
There is some use, how much I don't know. I guess it should be established best practice by now. Also run test suites with valgrind.
Historically many of the C/C++ static analyzers were proprietary. I haven't checked lately but I think Coverity was (is?) free for open source projects.