If everything improves over time, at some point a good chunk of tasks won’t need to be done in data centers or be subject to the whims of a few frontier AI labs.
How close are we to that? Or is my thinking flawed?
A good software to start is LM Studio [1]. Another popular alternative is Ollama [2].
A better software when you're used to it all is llama.cpp as it's usually a bit faster and more frequently updated [3].
A good place to get models is HuggingFace, particularly the Unsloth models [4]
Most popular models lately to run on "regular" gaming PC's, workstations, Macs etc are: Qwen 3.5 9b, Qwen 3.6 35B-A3B, Qwen 3.6 27B, Gemma 4.
But there are hundreds or thousands of other models and different quantizations, finetunes, etc, etc. Have fun :)
[0] https://www.reddit.com/r/LocalLLaMA/