The problem with LLVM has always been that it takes a long time to produce code. The linked post promises a new backend that produces a slower artifact, but does so 10-20x faster. This is great for debug builds.
This doesn't mean compilation as a whole gets that much quicker, though. There are three steps in compilation:
- Front end: transforms source code into LLVM's intermediate representation (IR)
- Backend: this is where LLVM comes in. It accepts LLVM IR and transforms it into machine code
- Linking: a separate program links the artifacts produced by LLVM.
How long does each step take? That really depends on the program we're trying to compile. This blog post contains timings for one example program (https://blog.rust-lang.org/2023/11/09/parallel-rustc/) to give you an idea. It also depends on whether LLVM is asked to produce a debug build (not performant, but quicker to produce) or a release build (fully optimised, takes longer).
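To make the stages concrete, here's a toy C file you can push through each step by hand with clang (rustc exposes the equivalent stages through its --emit flag, and clang's -ftime-report will break down where the time goes):

    /* pipeline.c -- a toy program to run through each stage by hand. */
    #include <stdio.h>

    int main(void) {
        printf("hello\n");
        return 0;
    }

    /* Front end: clang -S -emit-llvm pipeline.c -o pipeline.ll
     *            (source -> LLVM IR, no machine code yet)
     * Backend:   clang -c pipeline.ll -o pipeline.o
     *            (LLVM IR -> machine code; compare -O0 vs. -O2 here)
     * Linking:   clang pipeline.o -o pipeline
     *            (invokes the system linker, e.g. ld, lld, or mold)
     */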
The 10-20x improvement described here doesn’t work yet for clang or rustc, and when it does it will only speed up the backend portion. Nevertheless, this is still an incredible win for compile times because the other two steps can be optimised independently. Great work by everyone involved.
I agree that front-ends are a big performance problem and both rustc and Clang (especially in C++ mode) are quite slow. For Clang with LLVM -O0, 50-80% is front-end time, with TPDE it's >98%. More work on front-end performance is definitely needed; maybe some things can be learned from Carbon. With mold or lld, I don't think linking is that much of a problem.
We now support most LLVM-IR constructs that are frequently generated by rustc (most notably, vectors). I just haven't gotten around to actually integrating it into rustc and getting performance data.
> The 10-20x improvement described here doesn’t work yet for clang
Not sure what you mean here, TPDE can compile C/C++ programs with Clang-generated LLVM-IR (95% of llvm-test-suite SingleSource/MultiSource, large parts of the LLVM monorepo).
This is the old "correctness versus performance" problem, and we already know that "faster but wrong" isn't meaningfully faster; it's just wrong. Anybody can give a wrong answer immediately, so that's not useful at all.
The really difficult thing would be to write a new compiler backend with a coherent IR that everybody understands and you'll stick to. Unfortunately, you can be quite certain that after you've done the incredibly hard work to build such a thing, a lot of people's assessment of your backend will be:
1. The code produced was 10% slower than LLVM, never use this, speed is all that matters anyway and correctness is irrelevant.
2. This doesn't support the Fongulab Splox ZV406 processor made for six years in the 1980s, whereas LLVM does, therefore this is a waste of time.
But why would you bother, when with those same skills and a lot less time, you could fork LLVM, correct its IR semantics yourself (unilaterally), and then push people to use your fork?
(I.e. the EGCS approach to forcing the upstream to fix their shit.)
> This doesn't support the Fongulab Splox ZV406 processor made for six years in the 1980s, whereas LLVM does, therefore this is a waste of time.
AFAIK, the various Fongulab Sploxes that LLVM has targets for, are mostly there to act as forcing functions to keep around features that no public backend would otherwise rely on, because proprietary, downstream backends rely on those features. (See e.g. https://q3k.org/lanai.html — where the downstream ISA of interest is indeed proprietary, but used to be public before an acquisition; so the contributor [Google] upstreamed an implementation of the old public ISA target.)
As to the first point, I suspect this is a foundational problem. Like, suppose you realise the concrete used to make a new skyscraper was the wrong mixture. In a sense this is a small change, there's nothing wrong with the elevators, the windows, cabling, furnishing, air conditioning, and so on. But, to "fix" this problem you need to tear down the skyscraper and replace it. Ouch.
I may be wrong, I have never tried to solve this problem. But I fear...
Move away from classical UNIX compiler pipelines.
These days, however, I would rather invest in improving LLMs' ability to generate executables directly. The time to mix AI into compiler development has come, and in terms of value, new classical programming languages are like doing yet another UNIX clone.
Also correctness guarantees? Hahaha... I'll pretend you didn't just claim C++ has correctness guarantees on par with other languages, LLVM or otherwise. C++ gives you next to nothing with respect to correctness guarantees.
But I can prove that this comment wasn’t LLM generated -> fuck you.
(LLMs don’t swear)
LLMs decline when asked to say fuck you. Gemini: “I am unable to respond to that request.” Claude: “I’d rather not use profanity unprompted.”
But allowing a fuck you would need a modification to the rules anyway, I suppose.
As far as I can tell it gets Pareto improvements far above LLVM, Cranelift, and any WebAssembly backend out there. You'd expect there to be a rush to either adopt their techniques or at least find arguments why they wouldn't work for generalist use cases, but instead it feels like the maintainers of the above projects have absolutely no curiosity about it.
Yes, there's no roadmap, but it's flat-out wrong that most contributions are from university students, because hardly any contributions are from university students. That's actually an issue, because we should be more supportive of student contributors! You can literally look at the contributors on GitHub and see that probably the top 100 are professionals contributing as part of their day job.
Thanks for pointing out that this is a pile of useless knowledge; maybe I should waste my time reading other, more appropriate stuff instead.
Should we go through an agenda?
I have no relation to the authors.
How come? The Copy-and-Patch Compilation paper reports:
> The generated code runs [...] 14% faster than LLVM -O0.
I don't have time right now to compare your approach and benchmark to theirs, but I would have expected comparable performance from what I had read back then.
The fact that they don't do a comparison against LLVM on larger benchmarks/functions or any other code they haven't written themselves makes that single number rather questionable for a general claim of being faster than LLVM -O0.
When used in a C/C++ compiler, the stencils correspond to individual (or a few) LLVM-IR instructions, which leads to bad runtime performance. Also, as mentioned, register allocation becomes a problem for the Copy-and-Patch approach on larger functions.
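For readers unfamiliar with the approach, here's a hedged mini-model in C of that granularity problem. Real copy-and-patch memcpy's pre-compiled machine-code stencils and patches operand "holes" into the bytes; this sketch only models the per-instruction chaining, which is exactly what prevents combining neighbouring operations (and the fixed "register" slots hint at why register allocation degrades on larger functions):

    #include <stdio.h>

    typedef struct insn insn;
    typedef void (*stencil_fn)(const insn *ip, long *regs);

    struct insn {
        stencil_fn fn;  /* which pre-compiled stencil to "copy" */
        int dst, a, b;  /* operands that would be "patched" in */
        long imm;
    };

    static void dispatch(const insn *ip, long *regs) { ip->fn(ip, regs); }

    /* One stencil per IR-level operation: nothing here can fuse the
       constant materialisation with the add, unlike real instruction
       selection. */
    static void st_const(const insn *ip, long *regs) {
        regs[ip->dst] = ip->imm;
        dispatch(ip + 1, regs);
    }

    static void st_add(const insn *ip, long *regs) {
        regs[ip->dst] = regs[ip->a] + regs[ip->b];
        dispatch(ip + 1, regs);
    }

    static void st_ret(const insn *ip, long *regs) { (void)ip; (void)regs; }

    int main(void) {
        long regs[4] = {0};
        const insn code[] = {
            { st_const, 0, 0, 0, 2  },  /* r0 = 2       */
            { st_const, 1, 0, 0, 40 },  /* r1 = 40      */
            { st_add,   2, 0, 1, 0  },  /* r2 = r0 + r1 */
            { st_ret,   0, 0, 0, 0  },
        };
        dispatch(code, regs);
        printf("%ld\n", regs[2]);  /* prints 42 */
        return 0;
    }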
I thought they used the same technique (pre-generating machine code snippets in a high-level language)?
Your work is greatly appreciated. With unit tests everywhere, faster compiling is more important than ever.
In JIT compilation, a fast baseline is always useful. LLVM is obviously not a great fit (the IR is slow to generate and inspect), but for projects that don't want to roll their own IR and use LLVM for optimized builds anyway, this is an easy way to drastically reduce the startup latency. (There is a JIT case study showing the overhead of LLVM-IR in Section 7/Fig. 10 in the paper.)
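To illustrate the pattern, here's a minimal sketch using LLVM's C API (MCJIT for brevity; real projects would use ORC, and as I understand it a baseline like TPDE-LLVM takes the same in-memory module and swaps out only the backend):

    /* jit_demo.c -- JIT a trivial sum(a, b) through LLVM.
       Build (roughly): cc jit_demo.c $(llvm-config --cflags --ldflags \
           --libs core executionengine mcjit native) -o jit_demo */
    #include <llvm-c/Core.h>
    #include <llvm-c/ExecutionEngine.h>
    #include <llvm-c/Target.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        LLVMLinkInMCJIT();
        LLVMInitializeNativeTarget();
        LLVMInitializeNativeAsmPrinter();

        /* Generating the IR -- the part that is slow to build and
           inspect compared to a bespoke baseline IR. */
        LLVMModuleRef mod = LLVMModuleCreateWithName("jit_demo");
        LLVMTypeRef i64 = LLVMInt64Type();
        LLVMTypeRef params[] = { i64, i64 };
        LLVMTypeRef fn_ty = LLVMFunctionType(i64, params, 2, 0);
        LLVMValueRef sum = LLVMAddFunction(mod, "sum", fn_ty);

        LLVMBuilderRef b = LLVMCreateBuilder();
        LLVMPositionBuilderAtEnd(b, LLVMAppendBasicBlock(sum, "entry"));
        LLVMBuildRet(b, LLVMBuildAdd(b, LLVMGetParam(sum, 0),
                                     LLVMGetParam(sum, 1), "tmp"));

        char *err = NULL;
        LLVMExecutionEngineRef ee;
        if (LLVMCreateExecutionEngineForModule(&ee, mod, &err)) {
            fprintf(stderr, "engine creation failed: %s\n", err);
            return 1;
        }

        /* MCJIT compiles the module to machine code on lookup -- the
           step a fast baseline backend speeds up. */
        long (*fp)(long, long) = (long (*)(long, long))
            (uintptr_t)LLVMGetFunctionAddress(ee, "sum");
        printf("%ld\n", fp(2, 40));  /* prints 42 */

        LLVMDisposeBuilder(b);
        LLVMDisposeExecutionEngine(ee);
        return 0;
    }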
> And if a project is not large one then build times should not be that much of a problem.
I disagree -- I'm always annoyed when my builds take longer than a few seconds, and typically my code changes involve fewer compilation units than I have CPU cores (even when working on LLVM). There's also this study [1] from Google, which claims that even modest improvements in build times improve productivity.
[1]: https://www.computer.org/csdl/magazine/so/2023/04/10176199/1...
I'm all for faster -O1 build times, though. Point taken.
(Which... that seems like an interesting idea, now that I think of it. What, if anything, could such a flag hint the compiler to do?)
Favor throughput over latency?
What does TPDE stand for? Even on their own pages (their GitHub, their docs, their landing pages), they never define it.
I wonder what such a "typical" subset is. How exotic does something have to be for it not to work?