I haven't used a non-laptop GPU in some time, but that is a crazy amount of "idle" power consumption. Is this normal for cards like this?
For server gear it’s more common to have less dynamic power and voltage switching because it produces more predictable performance and latency.
I could certainly come up with alternative theories about memory compression and prefetching if we were talking about texture reads.
[0] - https://stackoverflow.com/questions/11227809/why-is-conditio...
GPUs do branch prediction? I thought they didn't bother and try to minimize wasted effort by using high amounts of concurrent threads?
~257.5 teraflops for normal distribution, versus ~268 teraflops uniform, reported on the first graph.
I would have liked to see a straight graph of performance vs. clock speed, for each type of data. Pick your data statistics, then pick the peak performance clock speed accordingly.
And for actual runs, from a pre-run sampled curve.
Power limiting does not improve performance but it does improve efficiency. You might be able to get 90% of the performance for only 70% of the power usage, for example. It does not make the card go faster though.
This is precicely because of the efficiency. The lower efficiency of the higher speed triggers a much lower performance sooner.
https://clehaxze.tw/gemlog/2025/04-21-programming-tensotrren...
https://clehaxze.tw/gemlog/2026/01-22-the-real-tenstorrent-t...