The sample code in the article very much runs on consumer GPUs - I can run it on my mobile GPU (RTX 4050) without any issues. Really, it should work on any NVIDIA GPU with a Volta or newer architecture (that is what we tested).
And no, there is no extension for that - that is exactly what we have developed. The hostcall layer allows the GPU to invoke more or less arbitrary CPU functions and get data back from them.
I/O, networking, timers, etc. - all of that can be implemented by calling CPU helpers from the GPU. As long as you write a small bit of boilerplate code, you should be able to add hostcalls for more or less anything you can do on the CPU.
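
To make the request/reply idea concrete, here is a rough sketch of the general pattern. It is not the article's implementation: plain Python threads stand in for the CPU and the GPU, a queue stands in for whatever memory both sides can see, and every name in it (`HOSTCALL_ADD`, `HOSTCALL_NOW_NS`, `hostcall`, `host_loop`) is made up for illustration.

```python
import queue
import threading
import time

# Hypothetical hostcall IDs; illustrative names, not taken from the article.
HOSTCALL_NOW_NS = 0
HOSTCALL_ADD = 1

# The "small bit of boilerplate": one CPU handler registered per hostcall ID.
HANDLERS = {
    HOSTCALL_NOW_NS: lambda: time.monotonic_ns(),
    HOSTCALL_ADD: lambda a, b: a + b,
}


def host_loop(requests):
    """CPU side: service requests until a None sentinel arrives."""
    while True:
        item = requests.get()
        if item is None:
            break
        call_id, args, reply = item
        # Look up the handler by ID, run it on the CPU, send the result back.
        reply.put(HANDLERS[call_id](*args))


def hostcall(requests, call_id, *args):
    """Simulated GPU side: post a request, then block until the CPU replies."""
    reply = queue.Queue(maxsize=1)
    requests.put((call_id, args, reply))
    return reply.get()


def run_demo():
    requests = queue.Queue()
    host = threading.Thread(target=host_loop, args=(requests,))
    host.start()
    # The calling thread plays the role of the GPU here.
    total = hostcall(requests, HOSTCALL_ADD, 2, 3)
    now = hostcall(requests, HOSTCALL_NOW_NS)
    requests.put(None)  # tell the host loop to shut down
    host.join()
    return total, now
```

In this toy version, adding a new hostcall means adding one ID and one entry in `HANDLERS`; the caller side does not change. A real GPU version would need a different transport between device and host, but the dispatch-by-ID shape is the part this sketch is meant to show.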