> When using an Io.Threaded instance, the async() function doesn't actually do anything asynchronously — it just runs the provided function right away.
While this is a legal implementation strategy, this is not what std.Io.Threaded does. By default, it uses a configurably sized thread pool to dispatch async tasks. It can, however, be statically initialized with init_single_threaded, in which case it does have the behavior described in the article.
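Roughly, a sketch of the two initialization paths, assuming an allocator-taking init and an io() accessor (the exact signatures may differ on current master):

const std = @import("std");

pub fn main() void {
    // Thread-pool-backed: the default behavior described above.
    var threaded: std.Io.Threaded = .init(std.heap.page_allocator);
    defer threaded.deinit();
    const io = threaded.io();
    _ = io; // io.async(...) would dispatch to the pool here.

    // Statically initialized, no pool: async() just runs the function
    // right away, which is the behavior the quoted article describes.
    var single: std.Io.Threaded = .init_single_threaded;
    const single_io = single.io();
    _ = single_io;
}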
The only other issue I spotted is:
> For that use case, the Io interface provides a separate function, asyncConcurrent() that explicitly asks for the provided function to be run in parallel.
There was a brief moment where we had asyncConcurrent() but it has since been renamed more simply to concurrent().
(To my understanding this is pretty similar to how Go solves asynchronicity, except that in Go's case the "token" is managed by the runtime.)
Zig actually also solved the coloring problem in the old and abandoned async-await design, because the compiler simply stamped out a sync or async version of the same function based on the calling context (this works because everything is a single compilation unit).
Well, yes, but in this case the colors (= effects) are actually important. The implications of passing an effect through a system are nontrivial, which is why some languages choose to promote that effect to syntax (Rust) and others choose to make it a latent invariant (Java, with runtime exceptions). Zig chooses another path not unlike Haskell's IO.
I mean, the concept of "function coloring" in the first place is itself an artificial distinction, invented to complain about the incongruent methods of dealing with "do I/O immediately" versus "tell me when the I/O is done". Those two methods of I/O are so different that they really require very different application designs on top of them: in the sync I/O case, I'm going to design my parser to output a DOM, because there's little benefit to not doing so; in the async I/O case, I'm instead going to have a streaming API.
I'm still somewhat surprised that "function coloring" has become the default lens to understand the semantics of async, because it's a rather big misdirection from the fundamental tradeoffs of different implementation designs.
But most functions don't. They require some POD or float, string or whatever that can be easily and cheaply constructed in place.
I do wonder if there's more magic to it than that, because it's not like that isn't trivially possible in other languages. The issue is that it's actually a huge footgun when you mix things like this.
For example your code can run fine synchronously but will deadlock asynchronously because you don't account for methods running in parallel.
Or said another way, some code is thread safe and some code isn't. Coloring actually helps with that.
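To make that concrete, here's a hypothetical sketch of the classic failure mode, using plain std.Thread rather than any real Io runtime: run the two tasks sequentially (which is effectively what a blocking Io does with async) and it's fine; run them in parallel and the opposite lock order can deadlock.

const std = @import("std");

var m1: std.Thread.Mutex = .{};
var m2: std.Thread.Mutex = .{};

fn taskA() void {
    m1.lock();
    defer m1.unlock();
    m2.lock(); // in parallel: blocks forever if taskB already holds m2
    defer m2.unlock();
}

fn taskB() void {
    m2.lock();
    defer m2.unlock();
    m1.lock(); // in parallel: blocks forever if taskA already holds m1
    defer m1.unlock();
}

pub fn main() !void {
    // Sequential execution: no problem.
    taskA();
    taskB();

    // Parallel execution, as a threaded Io might schedule the two
    // tasks: the opposite lock order means this can deadlock.
    const t1 = try std.Thread.spawn(.{}, taskA, .{});
    const t2 = try std.Thread.spawn(.{}, taskB, .{});
    t1.join();
    t2.join();
}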
There is no 'async' anywhere yet in the new Zig IO system (in the sense of the compiler doing the 'state machine code transform' on async functions).
AFAIK the current IO runtimes simply use traditional threads or coroutines with stack switching. Bringing code-transform-async-await back is still on the todo-list.
The basic idea is that the code which calls into IO interface doesn't need to know how the IO runtime implements concurrency. I guess though that the function that's called through the `.async()` wrapper is expected to work properly both in multi- and single-threaded contexts.
I meant this more as simply an analogy to the devX of other languages.
>Bringing code-transform-async-await back is still on the todo-list.
The article makes it seem like "the plan is set", so I do wonder what that todo item looks like. Is this simply the plan for async IO?
> is expected to work properly both in multi- and single-threaded contexts.
Yeah... about that....
I'm also interested in how that will be solved. RTFM? I suppose a convention could be that your public API must be thread safe and if you have a thread-unsafe pattern it must be private? Maybe something else is planned?
There's currently a proposal for stackless coroutines as a language primitive: https://github.com/ziglang/zig/issues/23446
It's valuable to library authors who can now write code that's agnostic of the users' choice of runtime, while still being able to express that asynchronicity is possible for certain code paths.
var a_future = io.async(saveFile, .{io, data, "saveA.txt"});
var b_future = io.async(saveFile, .{io, data, "saveB.txt"});
const a_result = a_future.await(io);
const b_result = b_future.await(io);
In Rust or Python, if you make a coroutine (by calling an async function, for example), then that coroutine will not generally be guaranteed to make progress unless someone is waiting for it (i.e. polling it as needed). In contrast, if you stick the coroutine in a task, the task gets scheduled by the runtime and makes progress when the runtime is able to schedule it. But creating a task is an explicit operation and can, if the programmer wants, be done in a structured way (often called "structured concurrency") where tasks are never created outside of some scope that contains them.

From this example, if the example allows the thing that is "io.async"ed to progress all by itself, then I guess it's creating a task that lives until it finishes or is cancelled by getting destroyed.
This is certainly a valid design, but it’s not the direction that other languages seem to be choosing.
Neither future is guaranteed to do anything until .await(io) is called on it. Whether it starts immediately (possibly on the same thread), is queued on a thread pool, or yields to an event loop is entirely dependent on the Io runtime the user chooses.
> When using an Io.Threaded instance, the async() function doesn't actually do anything asynchronously — it just runs the provided function right away. So, with that version of the interface, the function first saves file A and then file B. With an Io.Evented instance, the operations are actually asynchronous, and the program can save both files at once.
Andrew Kelley’s blog (https://andrewkelley.me/post/zig-new-async-io-text-version.h...) discusses io.concurrent, which forces actual concurrency, and it’s distinctly non-structured. It even seems to require the caller to make sure that they don’t mess up and keep a task alive longer than whatever objects the task might reference:
var producer_task = try io.concurrent(producer, .{
    io, &queue, "never gonna give you up",
});
defer producer_task.cancel(io) catch {};
Having personally contemplated this design space a little bit, I think I like Zig's approach a bit more than I like the corresponding ideas in C and C++, as Zig at least has defer and tries to be somewhat helpful in avoiding the really obvious screwups. But I think I prefer Rust's approach or an actual GC/ref-counting system (Python, Go, JS, etc) even more: outside of toy examples, it's fairly common for asynchronous operations to conceptually outlast single function calls, and it's really, really easy to fail to accurately analyze the lifetime of some object, and having the language prevent code from accessing something beyond its lifetime is very, very nice. Both the Rust approach of statically verifying the lifetime and the GC approach of automatically extending the lifetime mostly solve the problem.

But this stuff is brand new in Zig, and I've never written Zig code at all, and maybe it will actually work very well.
It wouldn't be quite as bad as the perennial "I thought Go is fast, why is it slow when I spawn a full goroutine and multiple channel operations to add two integers together a hundred million times" question, but it would still be a fairly expensive operation. See also the fact that Go had a fairly sensible iteration idiom before the recent iterator support was added: doing a range across a channel... as long as you don't mind running a full channel operation and internal context switch for every single thing being iterated, which in fact quite a lot of us do mind.
(To optimize pure Python, one of the tricks is to ensure that you get the maximum value out of all of the relatively expensive individual operations Python does. For example, it's already handling exceptions on every opcode, so you could win in some cases by using exceptions cleverly to skip running some code selectively. Go channels are similar; they're relatively expensive, on the order of dozens of cycles, so you want to make sure you're getting sufficient value for that. You don't have to go super crazy, they're not like a millisecond per operation or something, but you do want to get value for the cost, by either moving non-trivial amount of work through them or by taking strong advantage of their many-to-many coordination capability. IO often involves moving around small byte slices, even perhaps one byte, and that's not good value for the cost. Moving kilobytes at a time through them is generally pretty decent value but not all IO looks like that and you don't want to write that into the IO spec directly.)
This allows library authors to write their code in a manner that's agnostic to the Io runtime the user chooses: synchronous, threaded, evented with stackful coroutines, or evented with stackless coroutines.
Of course even if that exact queue is not itself selectable, you can still implement a Go channel with select capabilities in Zig. I'm sure one exists somewhere already. Go doesn't get access to any magic CPU opcodes that nobody else does. And languages (or libraries in languages where that is possible) can implement more capable "select" variants than Go ships with that can select on more types of things (although not necessarily for "free", depending on exactly what is involved). But it is more than a queue, which is also why Go channel operations are a bit to the expensive side, they're implementing more functionality than a simple queue.
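For illustration, here's a minimal bounded channel sketch in Zig using std.Thread primitives. It supports send/recv only, with no select, which is exactly the extra machinery that makes Go's channels more than a simple queue:

const std = @import("std");

fn Channel(comptime T: type, comptime capacity: usize) type {
    return struct {
        const Self = @This();

        buf: [capacity]T = undefined,
        head: usize = 0,
        len: usize = 0,
        mutex: std.Thread.Mutex = .{},
        not_empty: std.Thread.Condition = .{},
        not_full: std.Thread.Condition = .{},

        // Blocks while the buffer is full, like a Go send on a full channel.
        pub fn send(self: *Self, item: T) void {
            self.mutex.lock();
            defer self.mutex.unlock();
            while (self.len == capacity) self.not_full.wait(&self.mutex);
            self.buf[(self.head + self.len) % capacity] = item;
            self.len += 1;
            self.not_empty.signal();
        }

        // Blocks while the buffer is empty, like a Go receive.
        pub fn recv(self: *Self) T {
            self.mutex.lock();
            defer self.mutex.unlock();
            while (self.len == 0) self.not_empty.wait(&self.mutex);
            const item = self.buf[self.head];
            self.head = (self.head + 1) % capacity;
            self.len -= 1;
            self.not_full.signal();
            return item;
        }
    };
}

pub fn main() void {
    var ch: Channel(u32, 8) = .{};
    ch.send(42);
    std.debug.assert(ch.recv() == 42);
}

Supporting select over several of these would additionally need a shared wakeup mechanism on top, which is part of why Go's channel operations cost more than a bare queue.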
Everything is "just another thing" if you ignore the advantage of abstraction.
E.g.
doSomethingAsync().defer
This removes the stupid parentheses forced by precedence rules, which is the biggest issue with async/await in other languages.
Although, it does seem like dependency injection is becoming a popular trend in Zig, first with Allocator and now with Io. I wonder if a dependency injection framework within the std could reduce the amount of boilerplate all of our functions will now require. Every struct or bare fn now needs two fields/parameters by default.
Storing interfaces as fields in structs is becoming a bit of an anti-pattern in Zig. There are still use cases for it, but you should think twice about it being your go-to strategy. There's been a recent shift in the standard library toward "unmanaged" containers, which don't store a copy of the Allocator interface; instead, Allocators are passed to any member function that allocates.
Previously, one would write:
var list: std.ArrayList(u32) = .init(allocator);
defer list.deinit();
for (0..count) |i| {
    try list.append(@intCast(i));
}
Now, it's:
var list: std.ArrayList(u32) = .empty;
defer list.deinit(allocator);
for (0..count) |i| {
    try list.append(allocator, @intCast(i));
}
Or better yet:
var list: std.ArrayList(u32) = .empty;
defer list.deinit(allocator);
try list.ensureUnusedCapacity(allocator, count); // Allocate up front
for (0..count) |i| {
    list.appendAssumeCapacity(@intCast(i)); // No try or allocator necessary here
}

Please, anything but a dependency injection framework. All parameters and dependencies should be explicit.
IMO every low level language's async thing is terrible and half-baked, and I hate that this sort of rushed job is now considered de rigueur.
(IMO we need a language that makes the call stack just another explicit data structure, like assembly, and that has linearity, "existential lifetimes", and locations that change type over the control flow, to approach the question. No language is very close.)
In my current job, I mostly write (non-async) python, and I find it to be a performance footgun that you cannot trivially tell when a method call will trigger I/O, which makes it incredibly easy for our devs to end up with N+1-style queries without realizing it.
With async/await, devs are always forced into awareness of where these operations do and don’t occur, and are much more likely to manage them effectively.
FWIW: The zig approach also seems great here, as the explicit Io function argument seems likely to force a similar acknowledgement from the developer. And without introducing new syntax at that! Am excited to see how well it works in practice.
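As a sketch of what that acknowledgement looks like in a signature (the std.Io parameter threading is from the article; the function, its body, and routing file reads through io are made up for illustration):

const std = @import("std");

// The io parameter makes it visible at every call site that this
// function may perform I/O, the same way an explicit Allocator
// parameter makes allocation visible.
fn loadConfig(io: std.Io, dir: std.fs.Dir, gpa: std.mem.Allocator) ![]u8 {
    _ = io; // placeholder: the eventual file APIs are expected to take io
    return dir.readFileAlloc(gpa, "config.json", 64 * 1024);
}

pub fn main() !void {
    // A real program would get io from a runtime such as std.Io.Threaded;
    // `undefined` here just keeps the sketch self-contained.
    const bytes = try loadConfig(undefined, std.fs.cwd(), std.heap.page_allocator);
    defer std.heap.page_allocator.free(bytes);
}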
1) It tracks a code property which is usually omitted in sync code (i.e. most languages do not mark functions with "does IO"). Why is IO more important than "may panic", "uses bounded stack", "may perform allocations", etc.?
2) It implements an ad-hoc problem-specific effect system with various warts. And working around those warts requires re-implementation of half of the language.
Rust could use these markers as well.
What the heck did I just read. I can only guess they confused Haskell for OCaml or something; the former is notorious for requiring that all I/O be represented as values of some type encoding the full I/O computation. There's still coloring, since you can't hide it, only promote it to a more general color.
Plus, isn't Go the go-to example of this model nowadays?
Every other example I've seen encodes the execution model in the source code.
From the article:
std.Io.Threaded - based on a thread pool.
    -fno-single-threaded - supports concurrency and cancellation.
    -fsingle-threaded - does not support concurrency or cancellation.
std.Io.Evented - work-in-progress [...]
Should `std.Io.Threaded` not be split into `std.Io.Threaded` and `std.Io.Sequential` instead? Single threaded is another word for "not threaded", or am I wrong here?