Linear scan register allocation on SSA
48 points
5 days ago
| 3 comments
| bernsteinbear.com
| HN
kragen
1 day ago
[-]
Oh man, block-argument-SSA makes so much more sense to me than Φ-SSA. That alone is going to change my life. Thank you, Max!

A lot of this process is the same as graph-coloring register allocation on SSA, of which I found an extensive explanation in Appel's textbook. I think it's maybe time for me to go back to it and do the exercises so I really understand it. The book unfortunately predates the linear allocator age.

The link to https://brrt-to-the-future.blogspot.com/2019/03/reverse-line..., which is a learner's introduction to the LuaJIT reverse linear scan allocator, also seems valuable.

reply
xorvoid
19 hours ago
[-]
Because block-argument is actually the “right way”. The phi-node is very un-natural.. in the same sort of way that pi should have been tau.

Also, notice the connection here between Phi nodes and Continuation Passing Style (CPS). It because obvious with the block-arg form because it’s just the same thing. Jumps to blocks are just calls that don’t return.

reply
kragen
15 hours ago
[-]
Yes, I'd never understood what people meant when they said that SSA was the same thing as CPS, but with block-argument SSA it's clear.
reply
o11c
1 day ago
[-]
> In fact, you might even consider not allocating a register greedily. What might that look like? I have no idea.

One case I'm aware of: if your ISA supports arbitrary memory operands like x86, rarely-used variables can be operated-on entirely on the stack. Historically this was something ICC did better than GCC, though it became much less relevant with the shift to 64-bit bringing more registers.

reply
fanf2
1 day ago
[-]
Most x86 instructions only support one memory operand so you can’t completely avoid register allocation. It isn’t a full-on hardcore CISC like 68k or VAX.
reply
fuhsnn
1 day ago
[-]
x86 memory operands are also nice for binary size, my amateur compiler sometimes produces smaller objects than peers despite having no load-store elimination, all by greedily opting for memory operands during instruction selection; kinda amusing 'cause I'm now worrying my future optimization backend might actually cause a regression in size.
reply
mtklein
1 day ago
[-]
I found this especially nice working with large SIMD constants.
reply
fuhsnn
5 days ago
[-]
Another great overview of Go compiler's register allocation: https://developers.redhat.com/articles/2024/09/24/go-compile...
reply