BarraCUDA Open-source CUDA compiler targeting AMD GPUs
180 points
6 hours ago
| 15 comments
| github.com
| HN
h4kunamata
4 hours ago
[-]
>Requirements

>A will to live (optional but recommended)

>LLVM is NOT required. BarraCUDA does its own instruction encoding like an adult.

>Open an issue if theres anything you want to discuss. Or don't. I'm not your mum.

>Based in New Zealand

Oceania sense of humor is like no other haha

The project owner strongly emphasizes the no LLM dependency; in a world of AI slop this is so refreshing.

The sheer amount of knowledge required to even start such a project is really something else, and proving the manual wrong on the machine language level is something else entirely.

When it comes to AMD, "no CUDA support" is the biggest "excuse" to join NVIDIA's walled garden.

Godspeed to this project; the more competition there is, the less NVIDIA can keep destroying PC parts pricing.

reply
querez
4 hours ago
[-]
> The project owner strongly emphasizes the no LLM dependency; in a world of AI slop this is so refreshing.

The project owner is talking about LLVM, a compiler toolkit, not an LLM.

reply
kmaitreys
2 hours ago
[-]
It's actually quite easy to spot if LLMs were used or not.

A very small total number of commits, AI-like documentation and code comments.

But even if LLMs were used, the overall project does feel steered by a human, given some decisions like not using bloated build systems. If this actually works then that's great.

reply
butvacuum
2 hours ago
[-]
Since when is squashing noisy commits an AI activity instead of good manners?
reply
natvert
2 hours ago
[-]
Says the clawdbot
reply
wild_egg
3 hours ago
[-]
This project very most definitely has significant AI contributions.

Don't care though. AI can work wonders in skilled hands and I'm looking forward to using this project

reply
ZaneHam
2 hours ago
[-]
Hello! I didn't realise my project was posted here but I can actually answer this.

I do use LLMs (specifically Ollama), particularly for test summarisation and writing up some boilerplate, and I've also used Claude/ChatGPT on the web when my free tier allows. It's good for when I hit problems such as AMD SOP prefixes being different than I expected.

reply
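For readers wondering what "does its own instruction encoding" and "SOP prefixes" refer to, here is a minimal sketch in C. It is not taken from BarraCUDA. It assumes the 32-bit SOP2 scalar format as laid out in AMD's public ISA guides (a fixed 0b10 prefix in the top two bits, then opcode, destination and two source fields), and the function name `encode_sop2` is hypothetical. Opcode numbers differ between GPU generations, so none are hard-coded here.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed SOP2 field layout, per AMD's public ISA guides:
 *   ENCODING [31:30] = 0b10   (the format "prefix")
 *   OP       [29:23]
 *   SDST     [22:16]
 *   SSRC1    [15:8]
 *   SSRC0    [7:0]
 * encode_sop2 is a hypothetical helper for illustration only. */
static uint32_t encode_sop2(uint32_t op, uint32_t sdst,
                            uint32_t ssrc0, uint32_t ssrc1)
{
    assert(op < 128u);    /* 7-bit opcode field */
    assert(sdst < 128u);  /* 7-bit destination field */
    assert(ssrc0 < 256u); /* 8-bit source fields */
    assert(ssrc1 < 256u);

    /* Pack the prefix and every field into one 32-bit instruction word. */
    return (0x2u << 30) | (op << 23) | (sdst << 16) | (ssrc1 << 8) | ssrc0;
}
```

A compiler back end that skips LLVM needs one such packer per instruction format, and a wrong prefix or field offset yields a word the hardware decodes as a different instruction, which is presumably why mismatches with expectations are painful to debug.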
magicalhippo
3 hours ago
[-]
> Oceania sense of humor is like no other haha

Reminded me of the beached whale animated shorts[1].

[1]: https://www.youtube.com/watch?v=ezJG0QrkCTA&list=PLeKsajfbDp...

reply
ekianjo
1 hour ago
[-]
LLVM, nothing to do with LLMs
reply
colordrops
1 hour ago
[-]
I'm still blown away that AMD hasn't made it their top priority. I've said this for years. If I was AMD I would spend billions upon billions if necessary to make a CUDA compatibility layer for AMD. It would certainly still pay off, and it almost certainly wouldn't cost that much.
reply
ddtaylor
1 hour ago
[-]
AMD did hire someone to do this and IIRC he did, but they were afraid of Nvidia lawyers and he released it outside of the company?
reply
andy_ppp
1 hour ago
[-]
Moving target, honestly just get PyTorch working fully (loads of stuff just doesn’t work on AMD hardware) and also make it work on all graphics cards from a certain generation. The support matrix needed across GFX cards, architectures and software is quite astounding, but still, yes, they should have at least that working, along with equivalent custom kernels.
reply
dboreham
1 hour ago
[-]
Unrelated: just returned from a month in NZ. Amazing people.
reply
samrus
3 hours ago
[-]
> >LLVM is NOT required. BarraCUDA does its own instruction encoding like an adult.

> The project owner strongly emphasizes the no LLM dependency; in a world of AI slop this is so refreshing.

"Has tech literacy deserted the tech insider websites of Silicon Valley? I will not believe it is so. ARE THERE NO TRUE ENGINEERS AMONG YOU?!"

reply
bigyabai
3 hours ago
[-]
> and proving the manual wrong on the machine language level

I'll be the party pooper here, I guess. The manual is still right, and no amount of reverse-engineering will fix the architecture AMD chose for their silicon. It's absolutely possible to implement a subset of CUDA features on a raster GPU, but we've been doing that since OpenCL and CUDA is still king.

The best thing the industry can do is converge on a GPGPU compute standard that doesn't suck. But Intel, AMD and Apple are all at odds with one another, so CUDA's hedged bet on industry hostility will keep paying dividends.

reply
piker
4 hours ago
[-]
> # It's C99. It builds with gcc. There are no dependencies.

> make

Beautiful.

reply
parlortricks
3 hours ago
[-]
You gotta love it, simple and straight to the point.
reply
esafak
4 hours ago
[-]
Wouldn't it be funny and sad if a bunch of enthusiasts pulled off what AMD couldn't :)
reply
h4kunamata
3 hours ago
[-]
Many projects turned out to be far better than proprietary because open-source doesn't have to please shareholders.

What sucks is that such projects at some point become too big and make so much noise that big tech companies are forced to buy them, and everybody gets fuck all.

All it takes to beat a proprietary walled garden is somebody with knowledge and a will to make things happen. Linus with git and Linux is the perfect example of it.

Fun fact, BitKeeper said fuck you to the Linux community in 2005, Linus created git within 10 days.

BitKeeper made their code open source in 2016, but by then nobody knew who they were lol

So give it time :)

reply
bri3d
3 hours ago
[-]
The lack of CUDA support on AMD is absolutely not because AMD "couldn't" (although I certainly won't deny that their software has generally been lacking); it's clearly a strategic decision.

Supporting CUDA on AMD would only build a bigger moat for NVidia; there's no reason to cede the entire GPU programming environment to a competitor and indeed, this was a good gamble; as time goes on CUDA has become less and less essential or relevant.

Also, if you want a practical path towards drop-in replacing CUDA, you want ZLUDA; this project is interesting and kind of cool but the limitation to a C subset and no replacement libraries (BLAS, DNN, etc.) makes it not particularly useful in comparison.

reply
enlyth
2 hours ago
[-]
Even disregarding CUDA, NVidia has had like 80% of the gaming market for years without any signs of this budging any time soon.

When it comes to GPUs, AMD just has the vibe of a company that basically shrugged and gave up. It's a shame because some competition would be amazing in this environment.

reply
cebert
1 hour ago
[-]
What about PlayStation and Xbox? They use AMD graphics and are a substantial user base.
reply
ekianjo
1 hour ago
[-]
Because AMD has the APU category that mixes x86_64 cores with powerful integrated graphics. Nvidia does not have that.
reply
fdefitte
2 hours ago
[-]
Agreed on ZLUDA being the practical choice. This project is more impressive as a "build a GPU compiler from scratch" exercise than as something you'd actually use for ML workloads. The custom instruction encoding without LLVM is genuinely cool though, even if the C subset limitation makes it a non-starter for most real CUDA codebases.
reply
guerrilla
3 hours ago
[-]
> couldn't

More like wouldn't* most of the time.

Well isn't that the case with a few other things? FSR4 on older cards is one example right now. AMD still won't officially support it. I think they will though. Too much negativity around it. Half the posts on r/AMD are people complaining about it.

reply
DiabloD3
3 hours ago
[-]
Because FSR4 is currently slower on RDNA3 due to lack of support of FP8 in hardware, and switching to FP16 makes it almost as slow as native rendering in a lot of cases.

They're working the problem, but slandering them over it isn't going to make it come out any faster.

reply
guerrilla
3 hours ago
[-]
> Because FSR4 is currently slower on RDNA3 due to lack of support of FP8 in hardware, and switching to FP16 makes it almost as slow as native rendering in a lot of cases.

It works fine.

> They're working the problem, but slandering them over it isn't going to make it come out any faster.

You have insider info everyone else doesn't? They haven't said any such thing yet last I checked. If that were true, they should have said that.

reply
wmf
3 hours ago
[-]
We have HIP at home.
reply
skipants
47 minutes ago
[-]
Perusing the code, the translation seems quite complex.

Shout out to https://github.com/vosen/ZLUDA which is also in this space and quite popular.

I got ZLUDA to generally work with ComfyUI well enough.

reply
ByThyGrace
2 hours ago
[-]
How feasible is it for this to target earlier AMD archs down to even GFX1010, the original RDNA series aka the poorest of GPU poor?
reply
monster_truck
43 minutes ago
[-]
Don't let anyone dissuade you, it's going to be annoying but it can be done. When diffusion was new and ROCm was still a mess I was manually patching a lot to get a vii, 1030, then 1200 working well enough.

It's a LOT less bad than it used to be, AMD deserves serious credit. Codex should be able to crush it once you get the env going

reply
bravetraveler
4 hours ago
[-]
> No HIP translation layer.

Storage capacity everywhere rejoices

reply
whizzter
5 hours ago
[-]
Not familiar with CUDA development, but doesn't CUDA support C++? Skipping Clang/LLVM and going "pure" C seems to be quite limiting in that case.
reply
hackyhacky
3 hours ago
[-]
Honestly I'm not sure how good LLVM's support for AMD GFX11 machine code is. It's a pretty niche backend. Even if it exists, it may not produce ideal output. And it's a huge dependency.
reply
bri3d
59 minutes ago
[-]
Quite good, it’s first party supported by AMD (ROCm LLVM, with a lot upstreamed as well) where it’s fairly widely used in production.

This project is a super cool hobby/toy project but ZLUDA is the “right” drop in CUDA replacement for almost any practical use case.

reply
h4kunamata
3 hours ago
[-]
Real developers never depended on AI to write good quality code; in fact, the amount of slop code flying left and right is due to LLMs.

Open-source projects are being inundated with PRs from AIs; not depending on them doesn't limit a project.

That project owner seems pretty knowledgeable about what is going on, and keeping it free of dependencies is not an easy skill. Many developers would have written the code with tons of dependencies and copy/paste from an LLM. Some call the latter coding :)

reply
gsora
3 hours ago
[-]
LLVM and LLM are not the same thing
reply
brookman64k
3 hours ago
[-]
LLVM (Low Level Virtual Machine) != LLM (Large Language Model)
reply
gzread
2 hours ago
[-]
Nice! It was only a matter of time until someone broke Nvidia's software moat. I hope Nvidia's lawyers don't know where you live.
reply
yodon
4 hours ago
[-]
<checks stock market activity>
reply
phoronixrly
4 hours ago
[-]
Putting a registered trademark in your project's name is quite a brave choice. I hope they don't get a c&d letter when they get traction...
reply
cadamsdotcom
4 hours ago
[-]
Maybe a rename to Barra. Everyone will still get the pun :)
reply
HenrikB
4 hours ago
[-]
... or Baccaruda or Baba-rara-cucu-dada (https://youtu.be/2tvIVvwXieo)
reply
dboreham
2 hours ago
[-]
Or bacaruda.
reply
bee_rider
2 hours ago
[-]
I wonder if they could change the name to Barracuda if pressed. The capitalization is all that keeps it from being a normal English word, right?
reply
Alifatisk
4 hours ago
[-]
Are you thinking of Seagate Barracuda?
reply
adzm
3 hours ago
[-]
They mean the CUDA part
reply
latchkey
2 hours ago
[-]
Note that this targets GFX11, which is RDNA3. Great for consumer, but not the enterprise (CDNA) level at all. In other words, not a "cuda moat killer".
reply
gclawes
3 hours ago
[-]
What's the benefit of this over tinygrad?
reply
bri3d
3 hours ago
[-]
Completely different layer; tinygrad is a library for performing specific math ops (tensor, nn), this is a compiler for general CUDA C code.

If your needs can be expressed as tensor operations or neural network stuff that tinygrad supports, might as well use that (or one of the ten billion other higher order tensor libs).

reply
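To make the "general CUDA C code" versus "tensor library" distinction above concrete, here is a rough sketch of the execution model a CUDA compiler has to lower, emulated serially in plain C. Everything in it is hypothetical and for illustration only: `dim1` stands in for CUDA's built-in `blockIdx`/`blockDim`/`threadIdx`, and `launch_vec_add` replaces a real `<<<grid, block>>>` launch with two nested loops. It says nothing about how BarraCUDA or tinygrad work internally.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical one-dimensional stand-in for CUDA's index built-ins. */
typedef struct { unsigned x; } dim1;

/* Kernel body: each logical thread computes exactly one output element. */
static void vec_add_thread(dim1 block_idx, dim1 block_dim, dim1 thread_idx,
                           const float *a, const float *b, float *c, size_t n)
{
    size_t i = (size_t)block_idx.x * block_dim.x + thread_idx.x;
    if (i < n) /* guard: threads past the end of the data do nothing */
        c[i] = a[i] + b[i];
}

/* Serial emulation of launching grid_dim blocks of block_dim threads. */
static void launch_vec_add(unsigned grid_dim, unsigned block_dim,
                           const float *a, const float *b, float *c, size_t n)
{
    assert(block_dim > 0u);
    for (unsigned blk = 0; blk < grid_dim; ++blk)
        for (unsigned t = 0; t < block_dim; ++t)
            vec_add_thread((dim1){blk}, (dim1){block_dim}, (dim1){t},
                           a, b, c, n);
}
```

A tensor library hands you a fixed menu of such operations. A compiler for general CUDA C has to accept an arbitrary per-thread body like `vec_add_thread` and turn it into GPU machine code.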
7speter
1 hour ago
[-]
Will this run on cards that don’t have ROCm/latest ROCm support? Because if not, it's only gonna be a tiny subset of a tiny subset of cards that this will allow CUDA to run on.
reply
sam_goody
4 hours ago
[-]
Wow!! Congrats to you on launch!

Seeing insane investments (in time/effort/knowledge/frustration) like this makes me enjoy HN!!

(And there is always the hope that someone at AMD will see this and actually pay you to develop the thing.. Who knows)

reply
latchkey
2 hours ago
[-]
See also: https://scale-lang.com/

Write CUDA code. Run Everywhere. Your CUDA skills are now universal. SCALE compiles your unmodified applications to run natively on any accelerator, ending the nightmare of maintaining multiple codebases.

reply