It’s a CPU designed for an AI cluster. Their last CPU, Grace, was the same thing, and no one called it agentic.
Vera now just has more performance/more bandwidth. It’s cool, I’d like to have one of these clusters, but this is not new.
It’s marketed as agentic AI because that’s fashionable in 2026.
https://www.redpanda.com/blog/nvidia-vera-cpu-performance-be...
Is Apple complicit in killings because operators planned missions on MacBooks? Dell? Microsoft?
Xeons, Epycs, whatever this is - they are all also typically optimized for power efficiency. That's how they can fit so many CPU cores in 200-300W.
If they're going to build CPUs, I wish they had used RISC-V instead. They are already using it somewhat.
The CPU is integrated with two Rubin GPUs but each of the CPU cores has dedicated FP8 acceleration as well.
1. https://www.nvidia.com/en-us/data-center/vera-rubin-nvl72/
It's somewhat different from how x86 chips do simultaneous multithreading (SMT),
It's quite impressive what purpose-built inference hardware can/will do once everyone stops trying to build some kind of best-at-everything model.
x86 vendors and Apple already sell CPUs with integrated memory and high-bandwidth interconnects. And I bet eventually Intel's beancounter board will wake up and allow engineering to make one, too.
But competition is good for the market.
Nah, as so-called "analysts" expected. The no-effort crybabies deriding Apple for being "behind on AI" have turned out to be, shocker of shockers, wrong. Anyone who even put a few minutes of thought into Apple's business realized that it (and its customers) didn't stand to benefit much from "AI."
It's sad that Apple hurried to pander to these clowns, only to be derided further... and to encounter the appropriate apathy from customers, who were and are doing just fine without asinine "AI" gimmicks.
In any case, that article is also looking forward to next-gen models like the sparse Gemini model Google trained for Siri. Apple Silicon simply isn't powerful enough to compete for that inference.
AFAIK they still dominate on clock rate, which I was surprised to see when doing some back-of-the-envelope calculations regarding core counts.
I felt my 8-core i9-9900K was inadequate, so I shopped around for something from AMD, and IIRC whatever the chip I found gained in core count was outweighed by what it gave up in clock rate, so it's possible that at full utilization my i9 is still close to the best I can get at the price.
Not sure if I’m the typical consumer in this case however.
>8 Cores and 16 processing threads, based on AMD "Zen 5" architecture
which is the same thread geometry as my 9900K.
My main concerns at the time were:
1. More cores for running large workloads on k8s, since I had just upgraded to 128 GB of RAM
2. More thread-level parallelism for my C++ code
Naively I thought that, ceteris paribus and assuming good L1 cache utilization, having more physical cores with a higher clock rate would be the ticket for 2.
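By that naive metric the two chips come out almost identical; very rough numbers, using boost clocks I may be misremembering:

    9900K:   8 cores x ~5.0 GHz ~= 40 "core-GHz"
    9800X3D: 8 cores x ~5.2 GHz ~= 41.6 "core-GHz"

Yet every review I've seen puts the 9800X3D far ahead on multithreaded work, so that model is clearly missing something.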
Does the 9800X3D have a wider pipeline or is it some other microarchitectural feature that makes it faster?
You can research the microarchitecture differences if you want (it's a fascinating world), or you can just skip to looking at benchmarks/reviews. It's a little hard to compare across quite that large a generation gap, but e.g. https://gamersnexus.net/cpus/rip-intel-amd-ryzen-7-9800x3d-c... or https://www.phoronix.com/review/amd-ryzen-7-9800x3d-linux/2
The problem is not that gaming GPUs are in demand, it’s that selling silicon to AI center buildouts is so absurdly profitable right now they just aren’t making many gaming GPUs.
If you can only get so many mm^2 of dies from TSMC, might as well make 50x selling to AI providers.
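To put rough (and possibly stale) numbers on it: an RTX 4090's AD102 die is around 608 mm^2 and the card listed at ~$1,600, while an H100 die is around 814 mm^2 and the board goes for roughly $25,000-30,000. That's already about an order of magnitude more revenue per mm^2 of silicon before you even look at margins, so the wafer-allocation decision writes itself.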
Can someone explain what the Vera CPU does that a traditional CPU doesn't?
Cursor seems to be doing exactly that, though.
I did see they have unified CPU/GPU memory, which may reduce the cost of host-to-device transfers, especially now that we're probably moving more and more memory around with longer-context tasks.
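Not Vera-specific, but here's a minimal CUDA sketch of the difference unified memory makes: with the traditional split, every launch on fresh data is preceded by an explicit cudaMemcpy, while a managed allocation hands the same pointer to both CPU and GPU. The kernel and sizes are invented for illustration; the API calls are standard CUDA runtime ones.

    // Minimal sketch: explicit host<->device copies vs. a unified/managed allocation.
    #include <cuda_runtime.h>
    #include <stdio.h>
    #include <stdlib.h>

    __global__ void scale(float *x, int n, float a) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) x[i] *= a;
    }

    int main() {
        const int n = 1 << 20;

        // Traditional path: separate host and device buffers, explicit staging copies.
        float *h = (float *)malloc(n * sizeof(float));
        float *d;
        cudaMalloc((void **)&d, n * sizeof(float));
        for (int i = 0; i < n; i++) h[i] = 1.0f;
        cudaMemcpy(d, h, n * sizeof(float), cudaMemcpyHostToDevice);
        scale<<<(n + 255) / 256, 256>>>(d, n, 2.0f);
        cudaMemcpy(h, d, n * sizeof(float), cudaMemcpyDeviceToHost);

        // Unified path: one allocation, visible to CPU and GPU through the same pointer.
        float *u;
        cudaMallocManaged(&u, n * sizeof(float));
        for (int i = 0; i < n; i++) u[i] = 1.0f;      // CPU writes it directly
        scale<<<(n + 255) / 256, 256>>>(u, n, 2.0f);  // GPU works on the same pointer
        cudaDeviceSynchronize();

        printf("%f %f\n", h[0], u[0]);                // both should print 2.000000

        free(h); cudaFree(d); cudaFree(u);
        return 0;
    }

On discrete GPUs the managed path is serviced by page migration; my understanding is that on a coherent CPU-GPU design like Grace (and presumably Vera/Rubin) it's backed by the actual fabric, which is where the long-context win would come from.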
(Could be both)
So they make inference cheaper and the models get even worse. Or Jensen Huang has AI psychosis. Or both.
Here is a new business idea for Nvidia: Give me $3000 in a circular deal which I will then spend on a graphics card.
I keep expecting to see fabric gains: something where the host chip has a better way to talk to other host chips.
It's hard to deny the advantages of central switching as something easy and effective to build, but on the other side, the high-radix systems Google has been building are just amazing. Microsoft's Maia 200 put a gobsmacking 2.8 Tb/s of Ethernet on the chip, but it still feels like so little, like such a bare start. For reference, PCIe 6.0 x16 is a bit shy of 1 Tb/s, so that's vaguely ~45 lanes' worth.
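Back of the envelope on that comparison (assuming the raw PCIe 6.0 lane rate of 64 GT/s, before encoding/protocol overhead):

    PCIe 6.0 x16:  16 x 64 Gb/s ~= 1.02 Tb/s raw per direction
    2.8 Tb/s / 64 Gb/s per lane ~= 44 lanes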
It will be interesting to see what other bandwidth-heavy workloads evolve over time, or whether this throughput era really ends up serving AI alone. Hoping CXL or someone else slims down the overhead and latency of attachment, soon-ish.
Maia 200: https://www.techpowerup.com/345639/microsoft-introduces-its-...
Once you need to reach beyond L2/L3 it is often the case that perfectly viable experiments cannot be executed in reasonable timeframes anymore. The current machine learning paradigm isn't that latency sensitive, but there are other paradigms that can't be parallelized in the same way and are very sensitive to latency.
From the "fridge purpose-built for storing only yellow tomatoes" and "car only built for people whose last name contains the letter W" series.
When will this insanity end? It's a completely normal, garden-variety ARM SoC; it'll run Linux, same as every other ARM SoC does. It's about as related to "Agentic $whatever" as your toaster is.
These things have hardware FP8 support and a 1.8 TB/s full-mesh interconnect between CPUs and GPUs. We can argue about the "agentic" bit, but those are features that don't really matter for any workload other than AI.
To misquote the old quip about politicians:
How can you tell a marketer is lying?
Answer: Their mouth is moving.
You lost me here, but you still got my upvote. The difference between Tauri and Electron is minor compared to the difference between local-first and cloud SaaS.
Seems like a triumph of hype over reality.
China can do breathless hype just as well as Nvidia.
Wanted to do general-purpose stuff? Too bad, we ratcheted the price of everything up, and then started producing only chips designed to run "AI" workloads.
Oh, you wanted a local machine? Too bad, we priced you out, but you can rent time with an AI!
Feels like another ratchet on the “war on general purpose computing” but from a rather different direction.