Zig's new plan for asynchronous programs
80 points
2 hours ago
| 13 comments
| lwn.net
| HN
AndyKelley
26 minutes ago
[-]
Overall this article is accurate and well-researched. Thanks to Daroc Alden for due diligence. Here are a couple of minor corrections:

> When using an Io.Threaded instance, the async() function doesn't actually do anything asynchronously — it just runs the provided function right away.

While this is a legal implementation strategy, this is not what std.Io.Threaded does. By default, it will use a configurably sized thread pool to dispatch async tasks. It can, however, be statically initialized with init_single_threaded, in which case it does have the behavior described in the article.

The only other issue I spotted is:

> For that use case, the Io interface provides a separate function, asyncConcurrent() that explicitly asks for the provided function to be run in parallel.

There was a brief moment where we had asyncConcurrent() but it has since been renamed more simply to concurrent().

reply
ethin
24 minutes ago
[-]
One thing the old Zig async/await system theoretically allowed me to do, which I'm not certain how to accomplish with this new io system without manually implementing it myself, is suspend/resume: where you could suspend the frame of a function and resume it later. I've held off on taking a stab at OS dev in Zig because I was really, really hoping I could take advantage of that neat feature: configure a device or submit a command to a queue, suspend the function that submitted the command, and resume it when an interrupt from the device is received. That was my idea, anyway. Idk if that would play out well in practice, but it was an interesting idea I wanted to try.
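For reference, the pattern described above looked roughly like this under the old, since-removed suspend/resume syntax. This is a historical sketch only: it does not compile with current Zig, and every name besides the keywords (`suspend`, `resume`, `anyframe`, `@frame`) is hypothetical:

```zig
// Old (pre-0.11) Zig async syntax, kept here only as a sketch.
// `Command` and `submitToQueue` are hypothetical driver-side names.
var pending_frame: ?anyframe = null;

fn submitAndWait(cmd: Command) void {
    submitToQueue(cmd); // hand the command to the device queue
    suspend {
        // This block runs as the function suspends: publish the frame
        // so the interrupt handler can find and resume it later.
        pending_frame = @frame();
    }
    // Execution continues here once the handler calls `resume`.
}

fn irqHandler() void {
    if (pending_frame) |frame| {
        pending_frame = null;
        resume frame; // continue the suspended submitAndWait
    }
}
```

Whether something equivalent comes back presumably depends on the stackless coroutine proposal linked elsewhere in this thread.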
reply
NooneAtAll3
20 minutes ago
[-]
what's the point of implementing cooperative "multithreading" (coroutines) with preemptive one (async)?
reply
woodruffw
1 hour ago
[-]
I think this design is very reasonable. However, I find Zig's explanation of it pretty confusing: they've taken pains to emphasize that it solves the function coloring problem, which it doesn't: it pushes I/O into an effect type, which essentially behaves as a token that callers need to retain. This is a form of coloring, albeit one that's much more ergonomic.

(To my understanding this is pretty similar to how Go solves asynchronicity, except that in Go's case the "token" is managed by the runtime.)
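Concretely, the "token" behavior looks something like this. This is a hypothetical sketch: `loadConfig` and `readFileAlloc` are made-up names, and only the convention of passing a `std.Io` parameter comes from the article and the snippets quoted in this thread:

```zig
// Any function that performs I/O must receive the io capability...
fn loadConfig(io: std.Io, path: []const u8) ![]u8 {
    return readFileAlloc(io, path); // hypothetical helper
}

// ...and must forward it to anything it calls, so "does I/O" propagates
// up through signatures much like an effect type or a checked exception.
fn readFileAlloc(io: std.Io, path: []const u8) ![]u8 {
    _ = io;
    _ = path;
    return error.Todo; // body elided; only the signatures matter here
}
```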

reply
flohofwoe
1 hour ago
[-]
If calling the same function with a different argument would be considered 'function coloring', then every function in a program is 'colored' and the word loses its meaning ;)

Zig actually also solved the coloring problem in the old and abandoned async/await solution, because the compiler simply stamped out a sync or async version of the same function based on the calling context (this works because everything is a single compilation unit).

reply
adamwk
37 minutes ago
[-]
The subject of the function coloring article was callback APIs in Node, so an argument you need to pass to your IO functions is very much in the spirit of colored functions and has the same limitations.
reply
jakelazaroff
32 minutes ago
[-]
In Zig's case you pass the argument whether or not it's asynchronous, though. The caller controls the behavior, not the function being called.
reply
woodruffw
1 hour ago
[-]
> If calling the same function with a different argument would be considered 'function coloring', then every function in a program is 'colored' and the word loses its meaning ;)

Well, yes, but in this case the colors (= effects) are actually important. The implications of passing an effect through a system are nontrivial, which is why some languages choose to promote that effect to syntax (Rust) and others choose to make it a latent invariant (Java, with runtime exceptions). Zig chooses another path not unlike Haskell's IO.

reply
jcranmer
50 minutes ago
[-]
> If calling the same function with a different argument would be considered 'function coloring', then every function in a program is 'colored' and the word loses its meaning ;)

I mean, the concept of "function coloring" in the first place is itself an artificial distinction invented to complain about the incongruent methods of dealing with "do I/O immediately" versus "tell me when the I/O is done"--two methods of I/O that are so very different that it really requires very different designs of your application on top of those I/O methods: in a sync I/O case, I'm going to design my parser to output a DOM because there's little benefit to not doing so; in an async I/O case, I'm instead going to have a streaming API.

I'm still somewhat surprised that "function coloring" has become the default lens to understand the semantics of async, because it's a rather big misdirection from the fundamental tradeoffs of different implementation designs.

reply
rowanG077
1 hour ago
[-]
If your function suddenly requires a (currently) unconstructable instance "Magic" which you now have to pass in from somewhere top-level, that indeed suffers from the same issue as async/await. Aka function coloring.

But most functions don't. They require some POD or float, string or whatever that can be easily and cheaply constructed in place.

reply
doyougnu
50 minutes ago
[-]
Agreed. The Haskeller in me screams "You've just implemented the IO monad without language support".
reply
dundarious
1 hour ago
[-]
There is a token you must pass around, sure, but because you use the same token for both async and sync code, I think analogizing with the typical async function color problem is incorrect.
reply
jayd16
1 hour ago
[-]
Actually it seems like they just colored everything async and you pick whether you have worker threads or not.

I do wonder if there's more magic to it than that, because it's not like that isn't trivially possible in other languages. The issue is that it's actually a huge footgun when you mix things like this.

For example your code can run fine synchronously but will deadlock asynchronously because you don't account for methods running in parallel.

Or said another way, some code is thread safe and some code isn't. Coloring actually helps with that.

reply
flohofwoe
1 hour ago
[-]
> Actually it seems like they just colored everything async and you pick whether you have worker threads or not.

There is no 'async' anywhere yet in the new Zig IO system (in the sense of the compiler doing the 'state machine code transform' on async functions).

AFAIK the current IO runtimes simply use traditional threads or coroutines with stack switching. Bringing code-transform-async-await back is still on the todo-list.

The basic idea is that the code which calls into the IO interface doesn't need to know how the IO runtime implements concurrency. I guess though that the function that's called through the `.async()` wrapper is expected to work properly both in multi- and single-threaded contexts.
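A rough sketch of what that looks like from the caller's side. Only `std.Io.Threaded` and `init_single_threaded` are named upthread; the `.io()` accessor is an assumption by analogy with the Allocator interface, and `doWork` is made up:

```zig
pub fn main() !void {
    // Statically-initialized single-threaded instance: io.async will just
    // run the callee inline. A thread-pool-backed instance would be set up
    // here instead, with no change to doWork below.
    var instance: std.Io.Threaded = .init_single_threaded;
    const io = instance.io(); // assumed accessor

    try doWork(io);
}

// Written once, against the interface: doWork cannot tell whether io.async
// dispatches to a thread pool, an event loop, or runs synchronously.
fn doWork(io: std.Io) !void {
    _ = io;
}
```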

reply
jayd16
57 minutes ago
[-]
> There is no 'async'

I meant this more as simply an analogy to the devX of other languages.

>Bringing code-transform-async-await back is still on the todo-list.

The article makes it seem like "the plan is set" so I do wonder what that Todo looks like. Is this simply the plan for async IO?

> is expected to work properly both in multi- and single-threaded contexts.

Yeah... about that....

I'm also interested in how that will be solved. RTFM? I suppose a convention could be that your public API must be thread safe and if you have a thread-unsafe pattern it must be private? Maybe something else is planned?

reply
messe
49 minutes ago
[-]
> The article makes it seem like "the plan is set" so I do wonder what that Todo looks like. Is this simply the plan for async IO?

There's currently a proposal for stackless coroutines as a language primitive: https://github.com/ziglang/zig/issues/23446

reply
rowanG077
1 hour ago
[-]
Having used Zig a bit as a hobby: why is it more ergonomic? Using await vs passing a token has similar ergonomics to me. The one thing you could say is that using some kind of token makes it dead simple to have different tokens. But that's really not something I run into often at all when using async.
reply
messe
1 hour ago
[-]
> The one thing you could say is that using some kind of token makes it dead simple to have different tokens. But that's really not something I run into often at all when using async.

It's valuable to library authors who can now write code that's agnostic of the users' choice of runtime, while still being able to express that asynchronicity is possible for certain code paths.

reply
rowanG077
1 hour ago
[-]
But that can already be done using async await. If you write an async function in Rust for example you are free to call it with any async runtime you want.
reply
messe
1 hour ago
[-]
But you can't call it from synchronous Rust. Zig is moving toward all sync code also using the Io interface.
reply
amluto
1 hour ago
[-]
I find this example quite interesting:

    var a_future = io.async(saveFile, .{io, data, "saveA.txt"});
    var b_future = io.async(saveFile, .{io, data, "saveB.txt"});

    const a_result = a_future.await(io);
    const b_result = b_future.await(io);
In Rust or Python, if you make a coroutine (by calling an async function, for example), then that coroutine will not generally be guaranteed to make progress unless someone is waiting for it (i.e. polling it as needed). In contrast, if you stick the coroutine in a task, the task gets scheduled by the runtime and makes progress when the runtime is able to schedule it. But creating a task is an explicit operation and can, if the programmer wants, be done in a structured way (often called “structured concurrency”) where tasks are never created outside of some scope that contains them.

From this example, if the thing that is "io.async"ed is allowed to progress all by itself, then I guess it's creating a task that lives until it finishes or is cancelled by getting destroyed.

This is certainly a valid design, but it’s not the direction that other languages seem to be choosing.

reply
jayd16
1 hour ago
[-]
C# works like this as well, no? In fact C# can (will?) run the async function on the calling thread until a yield is hit.
reply
throwup238
20 minutes ago
[-]
So do Python and Javascript. I think most languages with async/await also support noop-ing the yield if the future is already resolved. It’s only when you create a new task/promise that stuff is guaranteed to get scheduled instead of possibly running immediately.
reply
nmilo
1 hour ago
[-]
This is how JS works
reply
messe
1 hour ago
[-]
It's not guaranteed in Zig either.

Neither future is guaranteed to do anything until .await(io) is called on it. Whether it starts immediately (possibly on the same thread), is queued on a thread pool, or yields to an event loop is entirely dependent on the Io runtime the user chooses.

reply
amluto
8 minutes ago
[-]
It’s not guaranteed, but, according to the article, that’s how it works in the Evented model:

> When using an Io.Threaded instance, the async() function doesn't actually do anything asynchronously — it just runs the provided function right away. So, with that version of the interface, the function first saves file A and then file B. With an Io.Evented instance, the operations are actually asynchronous, and the program can save both files at once.

Andrew Kelley’s blog (https://andrewkelley.me/post/zig-new-async-io-text-version.h...) discusses io.concurrent, which forces actual concurrency, and it’s distinctly non-structured. It even seems to require the caller to make sure that they don’t mess up and keep a task alive longer than whatever objects the task might reference:

    var producer_task = try io.concurrent(producer, .{
        io, &queue, "never gonna give you up",
    });
    defer producer_task.cancel(io) catch {};
Having personally contemplated this design space a little bit, I think I like Zig’s approach a bit more than I like the corresponding ideas in C and C++, as Zig at least has defer and tries to be somewhat helpful in avoiding the really obvious screwups. But I think I prefer Rust’s approach or an actual GC/ref-counting system (Python, Go, JS, etc) even more: outside of toy examples, it’s fairly common for asynchronous operations to conceptually outlast single function calls, and it’s really really easy to fail to accurately analyze the lifetime of some object, and having the language prevent code from accessing something beyond its lifetime is very, very nice. Both the Rust approach of statically verifying the lifetime and the GC approach of automatically extending the lifetime mostly solve the problem.

But this stuff is brand new in Zig, and I’ve never written Zig code at all, and maybe it will actually work very well.

reply
et1337
2 hours ago
[-]
I’m excited to see how this turns out. I work with Go every day and I think Io corrects a lot of its mistakes. One thing I am curious about is whether there is any plan for channels in Zig. In Go I often wish IO had been implemented via channels. It’s weird that there’s a select keyword in the language, but you can’t use it on sockets.
reply
jerf
1 hour ago
[-]
Wrapping every IO operation into a channel operation is fairly expensive. You can get an idea of how fast it would work now by just doing it, using a goroutine to feed a series of IO operations to some other goroutine.

It wouldn't be quite as bad as the perennial "I thought Go is fast why is it slow when I spawn a full goroutine and multiple channel operations to add two integers together a hundred million times" question, but it would still be a fairly expensive operation. See also the fact that Go had fairly sensible iteration semantics before the recent iteration support was added by doing a range across a channel... as long as you don't mind running a full channel operation and internal context switch for every single thing being iterated, which in fact quite a lot of us do mind.

(To optimize pure Python, one of the tricks is to ensure that you get the maximum value out of all of the relatively expensive individual operations Python does. For example, it's already handling exceptions on every opcode, so you could win in some cases by using exceptions cleverly to skip running some code selectively. Go channels are similar; they're relatively expensive, on the order of dozens of cycles, so you want to make sure you're getting sufficient value for that. You don't have to go super crazy, they're not like a millisecond per operation or something, but you do want to get value for the cost, by either moving non-trivial amount of work through them or by taking strong advantage of their many-to-many coordination capability. IO often involves moving around small byte slices, even perhaps one byte, and that's not good value for the cost. Moving kilobytes at a time through them is generally pretty decent value but not all IO looks like that and you don't want to write that into the IO spec directly.)

reply
osigurdson
1 hour ago
[-]
At least Go didn't take the dark path of having async/await keywords. In C# that is a real nightmare, and it's necessary to use sync-over-async anti-patterns unless you're willing to rewrite everything. I'm glad Zig took this "colorless" approach.
reply
rowanG077
1 hour ago
[-]
Where do you think the Io parameter comes from? If you change some function to do something async, you now suddenly require an Io instance. I don't see the difference between having to modify the call tree to be async vs modifying the call tree to pass in an Io token.
reply
messe
1 hour ago
[-]
Synchronous Io also uses the Io instance now. The coloring is no longer "is it async?", it's "does it perform Io?"

This allows library authors to write their code in a manner that's agnostic to the Io runtime the user chooses: synchronous, threaded, evented with stackful coroutines, or evented with stackless coroutines.

reply
rowanG077
1 hour ago
[-]
Rust also allows writing async code that is agnostic to the async runtime used. Subsuming async under Io doesn't change much imo.
reply
ecshafer
1 hour ago
[-]
Have you tried Odin? Its a great language thats also a “better C” but takes more Go inspiration than Zig.
reply
kbd
1 hour ago
[-]
One of the harms Go has done is to make people think its concurrency model is at all special. “Goroutines” are green threads and a “channel” is just a thread-safe queue, which Zig has in its stdlib https://ziglang.org/documentation/master/std/#std.Io.Queue
reply
jerf
1 hour ago
[-]
A channel is not just a thread-safe queue. It's a thread-safe queue that can be used in a select call. Select is the distinguishing feature, not the queuing. I don't know enough Zig to know whether you can write a bit of code that says "either pull from this queue or that queue when they are ready"; if so, then yes they are an adequate replacement, if not, no they are not.

Of course even if that exact queue is not itself selectable, you can still implement a Go channel with select capabilities in Zig. I'm sure one exists somewhere already. Go doesn't get access to any magic CPU opcodes that nobody else does. And languages (or libraries in languages where that is possible) can implement more capable "select" variants than Go ships with that can select on more types of things (although not necessarily for "free", depending on exactly what is involved). But it is more than a queue, which is also why Go channel operations are a bit to the expensive side, they're implementing more functionality than a simple queue.

reply
jeffbee
37 minutes ago
[-]
If we're just arguing about the true nature of Scotsmen, isn't "select a channel" merely a convenience around awaiting a condition?
reply
0x696C6961
1 hour ago
[-]
What other mainstream languages have pre-emptive green threads without function coloring? I can only think of Erlang.
reply
smw
1 hour ago
[-]
I'm told modern Java (loom?) does. But I think that might be an exhaustive list, sadly.
reply
femiagbabiaka
1 hour ago
[-]
Maybe not mainstream, but Racket.
reply
dlisboa
33 minutes ago
[-]
It was special. CSP wasn't anywhere near the common vocabulary back in 2009. Channels provide a different way of handling synchronization.

Everything is "just another thing" if you ignore the advantage of abstraction.

reply
LunicLynx
20 minutes ago
[-]
Pro tip: use postfix keyword notation.

Eg.

doSomethingAsync().defer

This removes stupid parentheses because of precedence rules.

Biggest issue with async/await in other languages.

reply
qudat
2 hours ago
[-]
I'm excited to see where this goes. I recently did some io_uring work in zig and it was a pain to get right.

Although, it does seem like dependency injection is becoming a popular trend in zig, first with Allocator and now with Io. I wonder if a dependency injection framework within the std could reduce the amount of boilerplate all of our functions will now require. Every struct or bare fn now needs (2) fields/parameters by default.

reply
messe
1 hour ago
[-]
> Every struct or bare fn now needs (2) fields/parameters by default.

Storing interfaces as fields in structs is becoming a bit of an anti-pattern in Zig. There are still use cases for it, but you should think twice about it being your go-to strategy. There's been a recent shift in the standard library toward "unmanaged" containers, which don't store a copy of the Allocator interface; instead, Allocators are passed to any member function that allocates.

Previously, one would write:

    var list: std.ArrayList(u32) = .init(allocator);
    defer list.deinit();
    for (0..count) |i| {
        try list.append(@intCast(i)); // i is a usize; cast to u32
    }
Now, it's:

    var list: std.ArrayList(u32) = .empty;
    defer list.deinit(allocator);
    for (0..count) |i| {
        try list.append(allocator, @intCast(i));
    }
Or better yet:

    var list: std.ArrayList(u32) = .empty;
    defer list.deinit(allocator);
    try list.ensureUnusedCapacity(allocator, count); // Allocate up front
    for (0..count) |i| {
        list.appendAssumeCapacity(@intCast(i)); // No try or allocator necessary here
    }
reply
Mond_
1 hour ago
[-]
Yes, and it's good that way.

Please, anything but a dependency injection framework. All parameters and dependencies should be explicit.

reply
SvenL
2 hours ago
[-]
I think and hope that they don't do that. As far as I remember their mantra was "no magic, you can see everything that is happening". They wanted to be a simple and obvious language.
reply
qudat
1 hour ago
[-]
That's fair, but the same argument can be made for Go's verbose error handling. In that case we could argue that `try` is magical, although I don't think anyone would want to take that away.
reply
dylanowen
1 hour ago
[-]
This seems a lot like what the scala libraries Zio or Kyo are doing for concurrency, just without the functional effect part.
reply
Ericson2314
45 minutes ago
[-]
This is a bad explanation because it doesn't explain how the concurrency actually works. Is it based on stacks? Is there a heavy runtime? Is it stackless and everything is compiled twice?

IMO every low level language's async thing is terrible and half-baked, and I hate that this sort of rushed job is now considered de rigueur.

(IMO We need a language that makes the call stack just another explicit data structure, like assembly and has linearity, "existential lifetimes", locations that change type over the control flow, to approach the question. No language is very close.)

reply
ecshafer
2 hours ago
[-]
I like the look of this direction. I am not a fan of the `async` keyword that has become so popular in some languages that then pollutes the codebase.
reply
Dwedit
24 minutes ago
[-]
Async always confused me as to when a function would actually create a new thread or not.
reply
davidkunz
2 hours ago
[-]
In JavaScript, I love the `async` keyword as it's a good indicator that something goes over the wire.
reply
warmwaffles
2 hours ago
[-]
Async usually ends up being a coloring function that knows no bounds once it is used.
reply
amonroe805-2
2 hours ago
[-]
I’ve never really understood the issue with this. I find it quite useful to know what functions may do something async vs which ones are guaranteed to run without stopping.

In my current job, I mostly write (non-async) python, and I find it to be a performance footgun that you cannot trivially tell when a method call will trigger I/O, which makes it incredibly easy for our devs to end up with N+1-style queries without realizing it.

With async/await, devs are always forced into awareness of where these operations do and don’t occur, and are much more likely to manage them effectively.

FWIW: The zig approach also seems great here, as the explicit Io function argument seems likely to force a similar acknowledgement from the developer. And without introducing new syntax at that! Am excited to see how well it works in practice.

reply
newpavlov
1 hour ago
[-]
In my (Rust-colored) opinion, the async keyword has two main problems:

1) It tracks a code property which is usually omitted in sync code (i.e. most languages do not mark functions with "does IO"). Why is IO more important than "may panic", "uses bounded stack", "may perform allocations", etc.?

2) It implements an ad-hoc problem-specific effect system with various warts. And working around those warts requires re-implementation of half of the language.

reply
echelon
1 hour ago
[-]
> Why is IO more important than "may panic", "uses bounded stack", "may perform allocations", etc.?

Rust could use these markers as well.

reply
newpavlov
1 hour ago
[-]
I agree. But it should be done with a proper effect system, not a pile of ad hoc hacks built on abuse of the type system.
reply
ecshafer
1 hour ago
[-]
Is this Django? I could maybe see that argument there. Some frameworks and ORMs can muddy that distinction. But in most of the code I've written, it's really clear whether something will lead to IO or not.
reply
warmwaffles
6 minutes ago
[-]
I've watched many changes over time where a non-async function uses an async call, and then the function eventually becomes marked as async. Once the majority of functions get marked as async, what was the point of that boilerplate?
reply
debugnik
1 hour ago
[-]
> Languages that don't make a syntactical distinction (such as Haskell) essentially solve the problem by making everything asynchronous

What the heck did I just read. I can only guess they confused Haskell for OCaml or something; the former is notorious for requiring that all I/O is represented as values of some type encoding the full I/O computation. There's still coloring since you can't hide it, only promote it to a more general colour.

Plus, isn't Go the go-to example of this model nowadays?

reply
gf000
1 hour ago
[-]
Haskell has green threads. Plus nowadays Java also has virtual threads.
reply
debugnik
1 hour ago
[-]
And I bet those green threads still need an IO type of some sort to encode anything non-pure, plus usually do-syntax. Comparing merely concurrent computations to I/O-async is just weird. In fact, I suspect that even those green threads already have a "colourful" type, although I can't check right now.
reply
codr7
1 hour ago
[-]
Love it, async code is a major pita in most languages.
reply
giancarlostoro
1 hour ago
[-]
When Microsoft added Tasks / Async Await, that was when I finally stopped writing single threaded code as often as I did, since the mental overhead drastically went away. Python 3 as well.
reply
codr7
16 minutes ago
[-]
Isn't this exactly the mess Zig is trying to get out of here?

Every other example I've seen encodes the execution model in the source code.

reply
cies
1 hour ago
[-]
I like Zig and I like their approach in this case.

From the article:

    std.Io.Threaded - based on a thread pool.

      -fno-single-threaded - supports concurrency and cancellation.
      -fsingle-threaded - does not support concurrency or cancellation.

    std.Io.Evented - work-in-progress [...]
Should `std.Io.Threaded` not be split into `std.Io.Threaded` and `std.Io.Sequential` instead? Single threaded is another word for "not threaded", or am I wrong here?
reply