This is not targeted at consumers. It’s competing with nVidia’s high RAM workstation cards. Think $10K price range, not $1-2K.
The 160GB of LPDDR5X chips alone is expensive enough that they couldn’t release this at the $2K price point unless they felt like giving it away (which they don’t)
Any higher and it's not really a disruption.
They made a dent in the HPC market / Top500 with Intel Max.
It will be interesting to see if they can make a dent in the AI inference market (presumably datacenter/enterprise).
If the internal bus architecture is anything similar to QPI, getting the 'different' parts to communicate reliably is probably also a pain.
Intel for intel on your Intels, perhaps.
Prices are set by what the market will bear, not the lowest possible price where they could break even on the BOM and manufacturing costs.
The high cost of the LPDDR5X should be a clue that this is going to be in the $10K range, not the $2K range.
That don't run CUDA?
https://en.wikipedia.org/wiki/TMS34010
> The TMS34010 can execute general purpose programs and is supported by an ANSI C compiler.
> The successor to the TMS34010, the TMS34020 (1988), provides several enhancements including an interface for a special graphics floating point coprocessor, the TMS34082 (1989). The primary function of the TMS34082 is to allow the TMS340 architecture to generate high quality three-dimensional graphics. The performance level of 60 million vertices per second was advanced at the time.
Like these, there were several others over the IBM PC's own history.
Not, as I assume you mean, vector graphics like SVG, and renderers like Skia.
Only in the consumer market - which is why the GeForce 256 release left game devs whose engines used GL feeling smug, since they immediately benefited from hardware T&L, which was the original function of earlier GPUs (to the point that more than one "3D GPU" was an i860 or a few of them with custom firmware and some DMA glue, doing mostly vector ops on transforms, and a bit of lighting as a treat).
The consumer PC market looked different because games wanted textures, and the first truly successful 3D accelerator was the 3Dfx Voodoo, which was essentially a rasterizer chip plus a texture-mapping chip, with everything else done on the CPU.
Fully programmable GPUs were also a thing in the 2D era, with things like TIGA, where at least one package I heard of pretty much implemented most of X11 on the GPU.
This was of course all driven by what the market demanded. The original "GPUs" were driven by the needs of professional work like CAD, military, etc., where most of the time you were operating in wireframe, and Gouraud/Phong-shaded triangles were for fancier visualizations.
Games, on the other hand, really wanted textures (though the limitations of consoles like the PSX meant that some games were mostly simple colour-shaded triangles, like Crash Bandicoot), and offloading texture mapping was a major improvement for gaming.
Yeah, I remember all the hype about the first Nvidia chip that offloaded “T&L” from the CPU.
Per-pixel matmul (which is what you really need for anything resembling GPGPU) came with Shader Model 2.0, circa 2002; Radeon 9700, the GeForce FX series and the likes. CUDA didn't exist (nor really any other form of compute shaders), but you could wrangle it with pixel shaders, and some of us did.
Granted, if you didn't have the “squared blend” extension, it would be an approximation, but still a pretty convincing one.
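For anyone who never saw that pre-CUDA trick: conceptually, you stored your matrices in textures and wrote a pixel shader whose output at pixel (i, j) was the dot product of row i of A and column j of B, then drew a full-screen quad to evaluate it for every output element. A rough NumPy sketch of the mapping - not real shader code, and all the helper names are made up for illustration:

```python
import numpy as np

# Pre-CUDA GPGPU in a nutshell: matrices live in textures, and the "compute
# kernel" is a pixel shader evaluated once per output pixel. This sketch only
# illustrates the mapping, not actual shader or driver code.

def texture(data):
    """Pretend a 2D array is a float texture we can sample."""
    return np.asarray(data, dtype=np.float32)

def pixel_shader(i, j, tex_a, tex_b):
    """What the per-pixel program does for output pixel (i, j):
    sample row i of A and column j of B, multiply-accumulate."""
    return float(np.dot(tex_a[i, :], tex_b[:, j]))

def render_pass(tex_a, tex_b):
    """One full-screen quad draw = run the shader for every output pixel."""
    rows, cols = tex_a.shape[0], tex_b.shape[1]
    out = np.empty((rows, cols), dtype=np.float32)
    for i in range(rows):
        for j in range(cols):
            out[i, j] = pixel_shader(i, j, tex_a, tex_b)
    return out

a = texture(np.random.rand(4, 8))
b = texture(np.random.rand(8, 3))
assert np.allclose(render_pass(a, b), a @ b, atol=1e-4)
```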
The chips are so valuable now that NVIDIA will end up owning a chunk of every major tech company; everyone is throwing cash and shares at them as fast as they can.
First, they're not even an also-ran in the AI compute space. Nobody is looking to them for roadmap ideas. Intel does not have any credibility, and no customer is going to be going to Nvidia and demanding that they match Intel.
Second, what exactly would the competitors react to? The only concrete technical detail is that the cards will hopefully launch in 2027 and have 160GB of memory.
The cost of doing this is really low, and the value of potentially getting into the pipeline of people looking to buy data center GPUs in 2027 soon enough to matter is high.
Samples of new products also have to go out to third party developers and reviewers ahead of time so that third party support is ready for launch day and that stuff is going to leak to competitors anyway so there's little point in not making it public.
The other thing is enterprise sales is ridiculously slow. If Intel wants corporate customers to buy these things, they've got to announce them ~a year ahead, in order for those customers to buy them next year when they upgrade hardware.
Then of course Linux took over everywhere except the desktop.
But then Linux on that same commodity hardware was lower-cost yet.
Semiconductors are like container ships: extremely slow and hard to steer. You plan today the products you'll release in 2030.
Intel has practically nothing to show for an AI capex boom for the ages. I suspect that Intel is talking about it early for a shred of AI relevance.
Not release anything?
There'll be a good market for comparatively "lower power / good enough" local AI. Check out Alex Ziskind's analysis of the B50 Pro [0]. Intel has an entire line-up of cheap GPUs that perform admirably for local use cases.
This guy is building a rack on B580s and the driver update alone has pushed his rig from 30 t/s to 90 t/s. [1]
0: https://www.youtube.com/watch?v=KBbJy-jhsAA
1: https://old.reddit.com/r/LocalLLaMA/comments/1o1k5rc/new_int...
Yeah, even RTXs are limited in this space due to the lack of tensor cores. It's a race to integrate more cores and faster memory buses. My suspicion is this is more of a me-too product announcement so they can play partner to their business opportunities and continue greasing their wheels.
If you're planning a supercomputer to be built in 2027, you want to look at what's on the roadmap.
Stock number go up
The public co valuations of quickly depreciating chip hoarders selling expensive fever dreams to enterprises are gonna pop though.
Spending 3-7 USD for 20 cents in return, plus 95% project failure rates for quarters on end, isn't gonna go unnoticed on Wall St.
As for efficiency, replacing one programmer in a group of 10 with AI will already increase productivity and lower the cost, in most cases. In reality, adding AI accounts to the existing group works even better. This is _now_, not hopes or sci-fi.
That's why I'm saying there is no way back. An 'AI winter' is as likely as a smartphone winter.
But that's the foundation.
And there is a plateau in real money spent on AI chips.
You're ignoring a whole group of economic and finance professionals as well as - if you're inclined to listen to their voices more - Sama calling it a bubble.
If not for AI spending, the US already would be in a recession.
So your argument might sound nice and practical from a purely scientific perspective or the narrow use case of AI coding support, but it's entirely detached from reality.
Career finance professionals are calling it a bubble, not due to some suddenly found deep technological expertise, but because public cos like FAANG et al. are engaging in typical bubble-like behavior: shifting capex off their balance sheets into SPACs co-financed by private equity.
This is not a consumer debt bubble, it's gonna be a private market bubble.
But as all bubbles go, someone's gonna be left holding the bag, with society covering for the fallout.
It'll be a rate hike, or some Fortune X00 enterprises cutting their non-ROI AI bleed, or an AI fanboy like Oracle over-leveraging themselves and then watching their credit default swaps go "Boom!", cutting off their financing.
...and again, this is assuming AI capability stops growing exponentially in the widest possible sense (today, 50%-task-completion time horizon doubles ~7 months).
This won’t be in the price range of an old Dell server or a fun impulse buy for a hobbyist. 160GB of raw LPDDR5X chips alone is not cheap.
This is a server/workstation grade card and the price is going where the market will allow. Consider that an nVidia card with almost half the RAM is going to cost $8K or more. That price point is probably the starting point for where this will be priced, too.
(My guess is Intel's card is only going to have about 400 GB/s bandwidth.)
https://www.linkedin.com/posts/storagereview_storagereview-a...
Makes me wonder whether Gelsinger put all this in motion, or if the new CEO lit a fire under everyone. Kinda a shame if it's the former...
Whatever happened with new products today must've been started before he left.
I assume that hasn't changed.
Local inference is an interesting proposition because today, in real life, the NV H300 and AMD MI-300 clusters operated by OpenAI and Anthropic run in batching mode, which slows users down as they're forced to wait for enough similar-sized queries to arrive. For local inference, no waiting is required - so you could potentially get higher throughput.
Or, to be more specific, what is the speed when your GPU is out of RAM and it's reading from main memory over the PCI-E bus?
PCI-E 5.0: 64 GB/s @ x16, or 32 GB/s @ x8
2x 48GB (96GB) of DDR5 in an AM5 rig: ~50 GB/s
Versus the ~300GB/s+ possible with a card like this, it's a lot faster for large 'dense' models. Yes, even an NVIDIA 3090 is ~900GB/s of bandwidth, but it's only 24GB, so even a card like this Xe3P is likely to 'win' because of the higher memory available.
Even if it's 1/3rd of the speed of an old NVIDIA card, it's still 6x+ the speed of what you can get in a desktop today.
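Rough back-of-the-envelope on why capacity can beat raw bandwidth for dense-model decode: every generated token has to stream every weight once, so the time per token is roughly the sum of bytes-per-tier divided by that tier's bandwidth. A sketch with made-up example numbers (ignoring compute, KV cache, MoE, etc.):

```python
# Memory-bound decode estimate for a dense model: time/token is approximately
# sum(bytes_in_tier / tier_bandwidth). All numbers below are illustrative
# guesses, not benchmarks.

def tokens_per_sec(tiers):
    """tiers: list of (GB of weights in that tier, GB/s of that tier)."""
    seconds_per_token = sum(gb / bw for gb, bw in tiers)
    return 1.0 / seconds_per_token

model_gb = 100  # hypothetical dense model size after quantization

# 24 GB card at ~900 GB/s, with the remainder streamed over PCI-E 5.0 x16 (~64 GB/s)
split = tokens_per_sec([(24, 900), (model_gb - 24, 64)])

# Hypothetical 160 GB card at ~300 GB/s holding the whole model
big_vram = tokens_per_sec([(model_gb, 300)])

print(f"24 GB fast card + PCI-E spill: ~{split:.2f} tok/s")   # ~0.8 tok/s
print(f"160 GB slower card:            ~{big_vram:.2f} tok/s") # ~3.0 tok/s
```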
How is this better?
To me, the price point is what matters. It's going to be slow with ddr5. The 5090 today is much faster. But sure big ram.
RTX pro 6000 with 96gb of ram will be much faster.
So I'm thinking price point is below the 6000, above the 5090.
It’s gonna be slowwww
It’s gonna be what, 273 GB/s of VRAM bandwidth at most? Might as well buy an AMD 395+ with 128GB right now for the same inference performance and slightly less VRAM.
If it's fast LPDDR5X (9600 MT/s) on a 512-bit bus (8 64-bit channels - actually multiples of quad 16-bit subchannel nonsense), it could be upwards of 600 GB/s. Lots of bandwidth, like the beefy Macs have.
For context: if you have a 160GB dense ML model in VRAM and you're just running 600GB/sec, you can do... roughly 4 tokens per second AT BEST. That massive amount of VRAM is unusable if it's slow.
512-bit LPDDR5X is most likely just 512 GB/s with typical LPDDR5X that's not overly expensive. I would be HIGHLY surprised if they gave it the more expensive RAM that'd break 600 GB/s. The Intel B60 is at 456 GB/s, and that's using GDDR6.
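If you want to plug in your own guesses, the arithmetic behind those figures is just transfer rate times bus width for peak bandwidth, and bandwidth divided by model size for the best-case dense decode rate. A quick sketch (all inputs are assumptions about an unannounced card):

```python
# Peak DRAM bandwidth and the memory-bound decode ceiling for a dense model.
# All inputs are guesses; adjust to taste.

def peak_gbps(mt_per_s, bus_bits):
    """Transfers/s times bytes moved per transfer across the whole bus."""
    return mt_per_s * (bus_bits / 8) / 1000  # GB/s

def max_tokens_per_sec(bandwidth_gbps, model_gb):
    """Upper bound: each token must read every weight once."""
    return bandwidth_gbps / model_gb

print(peak_gbps(9600, 512))           # ~614 GB/s, the optimistic LPDDR5X case
print(peak_gbps(8000, 512))           # ~512 GB/s, more typical LPDDR5X
print(max_tokens_per_sec(600, 160))   # ~3.75 tok/s for a 160 GB dense model
```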
Honestly, you're better off waiting for regular DDR6 to come out in a year and just build a system using that.
What's the point of a card that's going to be released around the same time as DDR6, when DDR6 will be faster? Might as well use cheaper system RAM if the card's VRAM isn't any faster than your system RAM.