Essentially, we solved the problem of writing our stack in a bulk-oriented way that Nvidia kernels can optimize. Think Apache Arrow, pure vectorized dataframe pipelines, etc. However, cuDF is 'eager', with per-step CPU/GPU control-plane coordination even when the data plane lives on the GPU. Polars in theory moves to lazy scheduling, which could allow deforestation optimizations that fuse work into bigger bulk GPU-side macro steps, but in practice it doesn't get there. Nvidia's efforts to cut Python asyncio costs for multi-tenant etc. flows didn't pan out either. So enabling moving more of this to the GPU is super interesting.
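A toy sketch of why lazy helps here, assuming nothing about the cuDF/Polars internals: with an eager API each step materializes an intermediate (and, on GPU, round-trips the control plane), while a lazy plan can fuse ("deforest") the steps into one pass before executing. The `Plan` type and its methods below are made up for illustration, not any real library's API.

```rust
// Eager: each step materializes a result before the next one runs.
fn eager(v: Vec<i64>) -> Vec<i64> {
    let a: Vec<i64> = v.into_iter().map(|x| x + 1).collect(); // step 1 materializes
    a.into_iter().map(|x| x * 2).collect() // step 2 materializes
}

// Lazy: record the plan first, then run all steps fused in one traversal,
// i.e. one "macro step" instead of one launch per subexpression.
struct Plan {
    ops: Vec<fn(i64) -> i64>,
}

impl Plan {
    fn new() -> Self {
        Plan { ops: Vec::new() }
    }
    fn map(mut self, f: fn(i64) -> i64) -> Self {
        self.ops.push(f);
        self
    }
    // One fused pass: no intermediate vectors between the steps.
    fn collect(self, v: Vec<i64>) -> Vec<i64> {
        v.into_iter()
            .map(|x| self.ops.iter().fold(x, |acc, f| f(acc)))
            .collect()
    }
}

fn main() {
    let lazy = Plan::new().map(|x| x + 1).map(|x| x * 2).collect(vec![1, 2, 3]);
    assert_eq!(lazy, eager(vec![1, 2, 3]));
    assert_eq!(lazy, vec![4, 6, 8]);
    println!("ok");
}
```

The same shape is why deferring execution matters on GPU: the planner sees the whole pipeline, so the per-step CPU<>GPU coordination can collapse into a single bulk launch.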
Will be watching!
Re: heterogeneous workloads: I'm told by a friend in HPC that the old advice about avoiding divergent branches within warps is no longer much of an issue – is that true?
GPU-wide memory is not quite as scarce on datacenter cards or systems with unified memory. One could also have local executors with local futures that are `!Send` and place in a faster address space.
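To make the `!Send` point concrete, here is a minimal sketch of a local executor using only the standard library: a no-op waker plus a poll loop that happily drives a future holding an `Rc` (which makes the future `!Send`). The names `noop_waker` and `block_on_local` are illustrative, not from any project mentioned in the thread.

```rust
use std::future::Future;
use std::pin::pin;
use std::rc::Rc;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// A no-op waker: good enough for an executor that just polls in a loop.
fn noop_waker() -> Waker {
    fn clone(p: *const ()) -> RawWaker {
        RawWaker::new(p, &VTABLE)
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

// A local executor: accepts futures that are not Send, since everything
// stays on one executor (here one thread; on a GPU, one block/SM and
// its faster local address space).
fn block_on_local<F: Future>(fut: F) -> F::Output {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            Poll::Pending => {} // a real executor would park or yield here
        }
    }
}

fn main() {
    // Rc is !Send, so this future is !Send; a work-stealing executor
    // would reject it at compile time, but a local one is fine with it.
    let shared = Rc::new(41);
    let answer = block_on_local(async move { *shared + 1 });
    assert_eq!(answer, 42);
    println!("{answer}");
}
```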
Training pipelines are full of data-preparation steps that are first written for the CPU and then moved to the GPU, always weighing what to keep on the CPU and what to put on the GPU, when it's worth creating a tensor, or whether to tile instead. I guess your company is betting on solving problems like this (and async/await is needed for serving inference requests directly on the GPU, for example).
My question is a little bit different: how do you want to handle the SIMD question? Should a Rust function run on the warp as a machine with 32-wide arrays as its data types, or should we always "hope" for autovectorization to work (especially with Rust's iterator helpers)?
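The two styles in question can be sketched on the CPU; the function names and the fixed 32-lane chunk are illustrative only. The first writes the warp width into the types; the second leans on the compiler, which tends to autovectorize simple element-wise zips like this well, while gathers, branches, and data-dependent loops often defeat it.

```rust
// Warp-width explicit style: operate on fixed 32-wide chunks,
// mirroring one warp's lanes.
const WARP: usize = 32;

fn saxpy_warp_style(a: f32, x: &[f32; WARP], y: &mut [f32; WARP]) {
    for lane in 0..WARP {
        y[lane] = a * x[lane] + y[lane];
    }
}

// Iterator style: rely on the compiler to autovectorize a plain
// element-wise zip. No lane width appears in the signature.
fn saxpy_iter_style(a: f32, x: &[f32], y: &mut [f32]) {
    for (yi, xi) in y.iter_mut().zip(x) {
        *yi = a * *xi + *yi;
    }
}

fn main() {
    let x = [1.0f32; WARP];
    let mut y = [2.0f32; WARP];
    saxpy_warp_style(3.0, &x, &mut y);
    assert!(y.iter().all(|&v| v == 5.0));

    let xs = vec![1.0f32; 100];
    let mut ys = vec![2.0f32; 100];
    saxpy_iter_style(3.0, &xs, &mut ys);
    assert!(ys.iter().all(|&v| v == 5.0));
    println!("ok");
}
```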
The anticipated benefits are similar to the benefits of async/await on CPU: better ergonomics for the developer writing concurrent code, better utilization of shared/limited resources, fewer concurrency bugs.
GPUs are still not practically-Turing-complete in the sense that there are strict restrictions on loops/goto/IO/waiting (there are a bunch of band-aids to make it pretend it's not a functional programming model).
So I am not sure retrofitting a Ferrari to cosplay an Amazon delivery van is useful other than for tech showcase?
Good tech showcase though :)
I understand that with newer GPUs you have clever partitioning/pipelining such that block A takes branch A while block B takes branch B, with syncs/barriers, essentially relying on some smart 'oracle' to schedule these in a way that still fits the SIMT model.
It still doesn't feel Turing complete to me. Is there an Nvidia doc you can refer me to?
> In SIMT, all threads in the warp are executing the same kernel code, but each thread may follow different branches through the code. That is, though all threads of the program execute the same code, threads do not need to follow the same execution path.
This doesn't say anything about dependencies of multiple warps.
I am just saying it's not as flexible/cost-free as it would be on a 'normal' von Neumann-style CPU.
I would love to see Rust-based code that obviates the need to write CUDA kernels (including compiling to different architectures). It feels icky to use/introduce things like async/await in the context of a GPU programming model which is very different from a traditional Rust programming model.
You still have to worry about different architectures and the streaming nature at the end of the day.
I am very interested in this topic, so I am curious to learn how the latest GPUs help manage this divergence problem.
Here with the async/await approach, it seems like there needs to be manual bookkeeping at runtime to know what has finished and what has not, and _then_ a decision about which warp to put this new computation in. Do you anticipate a measurable performance difference?
https://devblogs.microsoft.com/dotnet/bing-on-dotnet-8-the-i...
You mention futures are cooperative and GPUs lack interrupts, but GPU warps already have a hardware scheduler that preempts at the instruction level. Are you intentionally working above that layer, or do you see a path to a future executor that hooks into warp scheduling more directly to get preemptive-like behavior?
In years prior I wouldn't have even bothered, but it's 2026 and AMD's drivers actually come with a recent version of torch that 'just works' on windows. Anything is possible :)
(Beyond that, "executing the same code" on multiple instances of a single coroutine ought to be sometimes possible on an opportunistic basis.)
I assume tokio-like, i.e. work-stealing?
I hope they can minimize the bookkeeping costs, because I don't see it gaining traction in AI if it hurts big-kernel performance.
Is the goal with this project (generally, not specifically async) to have an equivalent to e.g. CUDA, but in Rust? Or is there another intended use-case that I'm missing?
I am, bluntly, sick of Async taking over Rust ecosystems. Embedded and web/HTTP have already fallen. I'm optimistic this won't take hold in GPU; we'll see. Async splits the ecosystem. I see it as the biggest threat to Rust staying a useful tool.
I use Rust on the GPU for the following: 3D graphics via WGPU, cuFFT via FFI, custom kernels via Cudarc, and ML via Burn and Candle. Thankfully these are all Async-free.
> Async splits the ecosystem. I see it as the biggest threat to Rust staying a useful tool.
Someone somewhere convinced you there is an async coloring problem. That person was wrong: async is an inherent property of some operations. Adding it as a type-level construct gives visibility to those inherent behaviors, and with that, more freedom in how you compose them.
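One way to see the "visibility" point in code: an `async fn`'s signature tells the caller it may suspend, and the returned future is an ordinary value the caller composes before choosing how to run it. The function names are invented for illustration, and the tiny poll-loop driver stands in for a real runtime.

```rust
use std::future::Future;
use std::pin::pin;
use std::task::{Context, Poll, Waker};

// The signature makes the latency explicit: callers see a future and
// decide how to compose and run it, rather than blocking implicitly.
async fn fetch_len(payload: &str) -> usize {
    payload.len() // stand-in for an actual awaited I/O call
}

async fn total(a: &str, b: &str) -> usize {
    // Both futures are plain values until awaited, so ordering and
    // concurrency are the caller's choice, visible in the types.
    fetch_len(a).await + fetch_len(b).await
}

// Minimal driver so the example runs without an async runtime.
fn run<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let mut cx = Context::from_waker(Waker::noop());
    loop {
        if let Poll::Ready(v) = fut.as_mut().poll(&mut cx) {
            return v;
        }
    }
}

fn main() {
    assert_eq!(run(total("ab", "cde")), 5);
    println!("ok");
}
```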
flip the colouring problem on its head
Our code looks like pure pandas (fancier SQL) wrapped as HTTP service (arrow instead of json), so the expressivity is more of a step backwards. We already did the work of turning awkward irregular code into relational pipelines that GPUs love.
Our problems are:
- Multi-tenancy. Our users get to time-share GPUs, so when many GPU tasks, big and small, come in, we want them co-scheduled across the many GPUs and their many cores. GPUs are already more cost-effective per watt than CPUs, but we think we can get 2x+ here, which is significant.
- Constant overheads. One job can be deep, with many operations, so round-tripping each control-plane step (think each SQL subexpression) between CPU and GPU is silly and adds up. Small jobs are dominated by embarrassing overheads that preclude certain use cases. We are thinking of adding CPU hot paths to avoid this, but we'd rather just fix the GPU path.