PS: I think MCP/Tool Calls are a boondoggle and LLMs yearn to just run code. It's crazy how much better this works than JSON schema etc.
(Off-topic AMA question: Did you see my voxel grid visibility post?)
We use a ton of smaller models (embeddings, vibe checks, TTS, ASR, etc.), and if we had enough scale we'd try to run those locally for users with big enough GPUs.
(You mean the voxel grid visibility from 2014?! I'm sure I did at the time... but I left MC in 2020 so I don't even remember my own algorithm right now)
ONNX and DirectML seem sort of promising right now, but it's all super raw. Even if that worked, local models are bottlenecked by VRAM, and that's never been more expensive. And we need to fit 6 GB of game in there as well. Even if _that_ worked, we'd need to timeslice the compute inside the frame so the game doesn't hang for a second. And then we'd get to fight every driver in existence :) Basically it's just not possible unless you have a full-time expert dedicated to this, IMO. Maybe it'll change!
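To make the timeslicing point concrete, here's a minimal sketch of budgeting work per frame. Everything here is invented for illustration (the function names, the 2 ms budget, and the stand-in workload); real inference would only be sliceable if the kernels themselves could be broken into steps this small:

```python
import time


def run_sliced(task_steps, frame_budget_s=0.002):
    """Drive an iterable of small work steps, yielding control back to the
    caller (the game loop) whenever the per-frame time budget is spent.
    Hypothetical sketch: names and budget are assumptions, not a real API."""
    it = iter(task_steps)
    done = False
    while not done:
        start = time.perf_counter()
        while time.perf_counter() - start < frame_budget_s:
            try:
                next(it)  # one small unit of "inference" work
            except StopIteration:
                done = True
                break
        yield  # hand control back so the frame can render


# Usage: the game loop pulls one slice per frame until the work is exhausted.
work = (x * x for x in range(100_000))  # stand-in for sliceable model compute
frames = 0
for _ in run_sliced(work):
    frames += 1  # in a real loop, render the frame here
```

The point of the sketch is the failure mode it sidesteps: running the whole workload inline would stall rendering for its full duration, while slicing spreads the same total compute over many frames.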
About the voxel visibility: yeah, that was awesome, I remember :) Long story short, MC is CPU-bound and the frustum clipping's CPU cost wasn't paid off by the reduced overdraw, so it wasn't worth it. Then a guy called Jonathan Hoof rewrote the whole thing, splitting it into a 360° scan done on another thread when you changed chunks, plus an in-frustum walk that worked completely differently. I don't remember the details, but it did fix the ravine issue entirely!
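Since I don't remember the details, take this with a grain of salt, but the "walk" half of such a scheme might look roughly like a BFS over chunks that only expands through neighbors the precomputed scan marked as reachable. Everything below (the `connected` table, the 2D chunk grid, the function names) is an invented stand-in for whatever the real per-chunk connectivity data was:

```python
from collections import deque


def neighbors(c):
    """4-connected neighbors of a chunk coordinate (2D for simplicity)."""
    x, z = c
    return [(x + 1, z), (x - 1, z), (x, z + 1), (x, z - 1)]


def visible_chunks(start, connected, in_frustum):
    """BFS from the camera's chunk, expanding only into neighbors that are
    both inside the frustum and marked reachable by the (hypothetical)
    off-thread scan. connected[(a, b)] is True when chunk b can be seen
    from inside chunk a."""
    seen = {start}
    queue = deque([start])
    while queue:
        c = queue.popleft()
        for n in neighbors(c):
            if n in seen or not in_frustum(n):
                continue
            if connected.get((c, n), False):
                seen.add(n)
                queue.append(n)
    return seen


# Usage: a 3x3 chunk grid where (2, 2) is walled off from everything.
grid = [(x, z) for x in range(3) for z in range(3)]
connected = {
    (c, n): True
    for c in grid
    for n in neighbors(c)
    if n in grid and c != (2, 2) and n != (2, 2)
}
result = visible_chunks((0, 0), connected, in_frustum=lambda c: True)
```

The appeal of this split is that the expensive connectivity analysis happens off the hot path (on chunk change, on another thread), so the per-frame walk is just a cheap graph traversal.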
Some other cool ones I've seen: https://store.steampowered.com/app/2542850/1001_Nights/ https://www.playsuckup.com/
https://www.dexerto.com/gaming/where-winds-meet-players-are-...
https://www.rockpapershotgun.com/where-winds-meet-player-con...