In Linux it'd be tough to implement this because errors are usually raised as a side effect of returning some negative value, but also because you have code like:
    err = -EIO;
    /* ... nothing else sets err here ... */
    return err;
But instrumenting every function that returns a negative int would be impossible (and wrong). And there are also cases where the error is saved in (e.g.) a bottom half and returned by the next available system call.

The system call wrappers could all have explicitly set errno to 0 on success, but they didn't.
Because it's plainly unnecessary. It'd be a waste today, and even more so on a PDP-11 in the 1970s.
This design choice reflects the POSIX philosophy of minimizing overhead and maximizing flexibility for system programming. Frequent calls to write(), for example, would be hindered by having to reset errno on every call and re-check it against every return value - especially in cases where a lot of write()s are queued.
Or .. a library function like fopen() might internally call multiple system calls (e.g., open(), stat(), etc.). If one of these calls sets errno but the overall fopen() operation succeeds, the library doesn’t need to clear errno. For instance, fopen() might try to open a file in one mode, fail with an errno of EACCES (permission denied), then retry in another mode and succeed. The final errno value is irrelevant since the call succeeded.
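In a sketch, that retry pattern might look like the following - open_config() is a made-up illustration of what a library routine could do internally, not how any real fopen() is implemented:

    /* Hypothetical library routine: the first attempt may fail and set
       errno; a successful retry does not clear it. */
    #include <errno.h>
    #include <stdio.h>

    FILE *open_config(const char *path)
    {
        FILE *f = fopen(path, "r+");  /* may fail with errno = EACCES */
        if (f == NULL)
            f = fopen(path, "r");     /* retry read-only; may succeed */
        /* On success, errno may still hold EACCES from the first attempt;
           the caller must not look at it unless f == NULL. */
        return f;
    }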
This mechanism minimizes overhead by not requiring errno to be reset on success.
It allows flexible, efficient implementations of library and system calls and encourages robust programming practices by mandating return-value checks before inspecting errno.
It supports complex, retry-based logic in libraries without unnecessary state management - and it preserves potentially useful debugging information.
You only care about errno when you know an actual error occurred. Until then, ignore it.
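In code, the usual pattern is roughly this (a minimal sketch; write_or_report() is just an illustrative name):

    /* Check the return value first; only then is errno meaningful. */
    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    static int write_or_report(int fd, const void *buf, size_t len)
    {
        ssize_t n = write(fd, buf, len);
        if (n == -1) {                 /* an actual error occurred */
            fprintf(stderr, "write: %s\n", strerror(errno));
            return -1;
        }
        return 0;                      /* success: errno is simply ignored */
    }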
This is similar to other systems-level conventions in such environments - for example, setting a hard Reset-Reason or Fail-Reason register in static/non-volatile memory somewhere, for later assessment.
IMHO, the thing that's most peculiar about this is that folks these days think of it as weird/quaint - when in fact, it makes a lot of sense if you think about it.
If it's important to you to find these cases, use clang's static analyzer to perform deep source code analysis via scan-build. See also cppcheck and pc-lint, or PVS-Studio, each of which has means by which you can catch this error.
Plus, we're talking about POSIX here. You don't have a time machine. Shall we argue about just how much POSIX software is out there, working perfectly fine with this technique?
Sure, in New Fangled Language De Jour™, return as many tuples as your heart desires.
But don't expect POSIX to play along ..
I'd argue this is in spite of choosing a path that makes maintainable software more difficult than it needs to be. Constraints change over time, and the thought process that made this practice rational no longer coheres with modern-day constraints. Maintaining software is now (much, much) more expensive than the performance minutiae that led to this cost.
> Sure, in New Fangled Language De Jour™, return as many tuples as your heart desires.
This isn't related to language at all—C may not have tuples, but structs are an equivalent.
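For instance, a sketch of the struct-as-tuple idea (read_result and read_some() are made-up names, not an existing API):

    /* Carry the result and the error together instead of multiplexing
       them onto one int. */
    #include <errno.h>
    #include <unistd.h>

    struct read_result {
        ssize_t count;   /* bytes read when err == 0 */
        int     err;     /* 0 on success, otherwise an errno value */
    };

    static struct read_result read_some(int fd, void *buf, size_t len)
    {
        struct read_result r = { read(fd, buf, len), 0 };
        if (r.count == -1)
            r.err = errno;
        return r;
    }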
I don’t really blame the people in 1970 for coming up with this design, but it’s 2025 now; we can agree that it has problems. Tape recorders were also a neat idea, but I can record a thousand times more on my phone now, often at higher quality. By modern standards, they suck.
That is quite expensive. Obviously you need to physically add the register to the chip.
After that the real work comes. You need to change your ISA to make the register addressable by machine code. The PDP-11 had 8 general-purpose registers, so it used 3 bits everywhere to address a register; now we sometimes need 4. Many opcodes work on 2 registers, so we need 8 out of 16 bits to address both where before we only needed 6. And the PDP-11 had a fixed 16-bit instruction encoding, so either we move to 18-bit instructions or make more radical changes to the ISA.
This quickly spirals into significant amounts of work versus encoding results and error values into the same register.
A classic worse-is-better example.
There are quite a few registers (in all the ISAs I'm familiar with) that are defined as not preserved across calls; kernels already have to wipe them in order to avoid leaking kernel-specific data to userland, so one of them could easily hold additional information.
EDIT: additionally, it's been a long time since the register names we're familiar with in an ISA actually matched the physical registers in a chip.
By 1983, operating system vendors designing their APIs ab initio were already making APIs that just used separate registers for error and result returns. Sinclair QDOS was one well-known example. MS-DOS version 2 might have done things the PDP-11 way, but by the time of MS-DOS version 4 people were already inventing INT calls that used multiple registers to return things. OS/2 was always returning a separate error value in 1987. Windows NT's native API has always been returning a separate NTSTATUS, not doubled up with anything else, since the 1990s.
Some kernels return error status as a CPU flag or otherwise separately from the returned value. But that's very hard to use in C, so the typical convention for a syscall wrapper is to return a non-negative number for success and -error for failure; if negative numbers are valid return values, you've got to do something else.
* https://jdebp.uk/FGA/function-calling-conventions.html#Watca...
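Roughly what such a wrapper does, as a sketch - raw_read() here stands in for whatever issues the actual system call and returns -error on failure; this is not any particular libc's code:

    #include <errno.h>
    #include <sys/types.h>

    extern long raw_read(int fd, void *buf, size_t len);  /* hypothetical */

    ssize_t my_read(int fd, void *buf, size_t len)
    {
        long r = raw_read(fd, buf, len);
        if (r < 0) {          /* kernel reported -error     */
            errno = -r;       /* stash it in errno          */
            return -1;        /* collapse to the usual C -1 */
        }
        return r;
    }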
I think strtol was one such function, but there were others.
Thus, to distinguish between an overflow and a legitimate maximum value, you need to set errno to 0 before calling it, because something else you called previously may have already set it to ERANGE.
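Which gives you the documented strtol() dance, roughly (parse_long() is just an illustrative wrapper):

    #include <errno.h>
    #include <stdlib.h>

    static int parse_long(const char *s, long *out)
    {
        char *end;
        errno = 0;                 /* clear any stale ERANGE */
        long v = strtol(s, &end, 10);
        if (end == s)
            return -1;             /* no digits at all */
        if (errno == ERANGE)
            return -1;             /* overflow or underflow */
        *out = v;
        return 0;
    }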
* https://github.com/jdebp/nosh/blob/trunk/source/socket_conne...
kqueue() can apparently return the error right in the data of the kevent, but I'm still using poll() so cannot confirm; whilst I can confirm that kqueue/kevent is alas not as truly consistent as one might expect. (Someone recently tried to move FreeBSD devd to kqueue, and hit various problems of FreeBSD devices that are still, even in version 14, not yet kqueue-ready.)
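If I've read the manual right (unverified, per the above), the EV_RECEIPT route looks something like this - the registration error comes back in the event's data field with EV_ERROR set, rather than via errno alone:

    #include <sys/types.h>
    #include <sys/event.h>
    #include <sys/time.h>
    #include <stdio.h>

    static int add_read_filter(int kq, int fd)
    {
        struct kevent change, result;
        EV_SET(&change, fd, EVFILT_READ, EV_ADD | EV_RECEIPT, 0, 0, NULL);
        if (kevent(kq, &change, 1, &result, 1, NULL) == -1)
            return -1;                            /* the call itself failed */
        if ((result.flags & EV_ERROR) && result.data != 0) {
            fprintf(stderr, "registration failed: errno %d\n",
                    (int)result.data);
            return -1;
        }
        return 0;
    }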
One of the great misfortunes of traditional software [and network] design is a lack of visibility throughout the stack. The author here talks about "multiple return values", which is to find out multiple pieces of information from some other piece of code. But that code calls other code, and that other code calls more, all with its own information that might be useful for you to know.
Good software design is cohesive and loosely coupled. That means you should not know, or depend on, the internal workings and variables of some other component. But at the same time, when problems happen, it is useful to know what happened in some other component, maybe even 3 components down the line. In particular, you can usually determine the cause of failures by examining just the inputs and the outputs of a function. Examine the inputs and outputs of every function in the system, and that's enough to identify or recreate most bugs. (i/o, system resource pressure, and network interruption are the last bits of info you need, but harder to gather)
But I'm not aware of this capability (examining the input and output of components multiple levels away from the current code) existing as a software design pattern. Within one component, sure, but outside it? If you load a different component into yours, maybe it exposes attributes and whatnot to you. But what about the components that component uses? And what if we're leaving the immediate computing environment? I still want to know what was going on further down the line.
Such a solution exists within systems-of-systems, as in distributed tracing. But only to an omnipotent observer in a faraway land. I want my code to know what happened elsewhere, if only to report a more accurate error message than "500 internal server error". I can count at least 20 times in the past 6 months that I have encountered a web app whose frontend literally did nothing when I clicked a button, and only upon opening up the browser's inspection tools did I see a backend API returning an actual error message. But an equal number of times, just "500 error" or similar. I want to see "the add-user api call failed because you do not have permissions to add users", or "the server that tried to process this request ran out of disk space", and I want to press a button that automatically composes an e-mail to the company with a bug report that includes all the details. Can you build that today with existing software design?
Sure, if you spend 120 hours building the distributed tracing and observability infrastructure (or pay Datadog a quarter million for it), and 80 more hours to train the devs how to use it. But we shouldn't need infrastructure. The software can carry relevant data to-and-fro; let it carry more than just "errno".
* https://tty0.social/@JdeBP/114816928464571239
Even some of the later augmentations in MS/PC/DR-DOS did things like return an error code in AX and the result in (say) CX instead of using AX and CF.
Except for symlinks. `fgetxattr` requires a file opened for read or write, but symlinks can only be opened as `O_PATH`.
If you need thread-specific local paths, just use one of the *at() variants that let you explicitly specify a directory handle that your path is relative to.
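A minimal sketch of that approach (open_relative() is a made-up helper):

    #include <errno.h>
    #include <fcntl.h>
    #include <unistd.h>

    static int open_relative(const char *dir, const char *name)
    {
        int dirfd = open(dir, O_DIRECTORY | O_RDONLY);
        if (dirfd == -1)
            return -1;
        int fd = openat(dirfd, name, O_RDONLY);  /* resolved relative to dirfd,
                                                    not the process-wide cwd  */
        int saved = errno;
        close(dirfd);                            /* don't let close() disturb */
        errno = saved;                           /* the caller's errno        */
        return fd;
    }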