RISC-V and Floating-Point
32 points
1 day ago
| 3 comments
| fprox.substack.com
| HN
Netch
1 day ago
[-]
From a bystanderʼs POV it is excessively hard to memorize all the mess with multiple different extensions. The naming style doesnʼt alleviate the task. But this is a typical issue in the whole RISC-V ecosystem.

What Iʼm slightly confused for is that all these extensions, useful for a minor part of applications, arenʼt moved to longer instructions (6-byte).

reply
fidotron
10 minutes ago
[-]
From the point of view of the RISC-V architects the "users" are the chip designers who are engaged in a sort of build-your-own-instruction-set situation, and this kind of makes sense, but does contribute to it being a mess.

They are absolutely in denial as to the downstream effects of this on the software ecosystem. Android, for example, for native support had enough fun dealing with relatively few ARM variants (and x86/MIPS etc), and identifying chip features at run time was reliant on the board support software getting it right (hint: it didn't).

reply
brucehoult
1 day ago
[-]
The average bystander doesn't have to care, just buy a machine implementing the RVA23 profile (standard set of extensions) and be happy.

If you're building your own embedded hardware then you determine what your needs actually are e.g. do you need double precision? half precision? vector?. Then you choose a chip implementing that. Then you copy the ISA string from your chip's documentation to the `-march=` argument for GCC/Clang and be happy.

It's not hard and you don't have to think about it unless you very specifically want to.

reply
fweimer
2 hours ago
[-]
How does the average bystander buy an RVA23 machine today?
reply
self
1 hour ago
[-]
reply
fweimer
57 minutes ago
[-]
The Arace purchase link for the Jupiter 2 kit says it's “in stock“, but it's actually for a discount coupon. The actual system can only pre-ordered. The Sipeed web site does not say anything about shipping timelines, and the products are not offered in their AliExpress store. I think the Sipeed boards are in preorder, too.

Of course, neither of these are machines. And the average bystander probably isn't used to importing computer parts directly from China, either.

reply
bjourne
3 hours ago
[-]
The average bystander might want to write high-performance code for their risc-v cpu. Then they must know precisely which instructions are available and what the performance implications of using them are. E.g., the difference between a shared and non-shared fp register file is huge.
reply
LoganDark
3 minutes ago
[-]
Do high-performance RISC-V CPUs (that you can actually buy) still exist? SiFive Unleashed was great but IIRC it was a single batch that never returned.
reply
rwmj
3 hours ago
[-]
For the "average bystander" they're going to buy an OS and compatible hardware, or if they're the average programmer they're going to use a compiler and libraries that solve the problem already for them. Very very few people need to worry about the details.
reply
akarpathy
2 hours ago
[-]
The average programmer still must inform their compiler what to use. Claude Code can assist with this.
reply
andrepd
2 hours ago
[-]
You need Claude Code to copy a string into your config/make file?
reply
panick21_
3 hours ago
[-]
You think the average person writes performance optmized code?

If you are on that level then you know pretty well what you are targeting. And even then in 99% of cases you just look at the top level profile.

If you do performance analysis for some specific embeded project that is not using a standard profile, then its a bit more work, but hardly impossible.

reply
bjourne
2 hours ago
[-]
Bruh, the "average person" won't buy a riscv-based computer in decades. The average bystander to the riscv project indeed writes high-performance code for their, so far, mostly non-existent or emulated riscv processors.
reply
panick21_
28 minutes ago
[-]
Your seriously arguing the the avg person write performance code so critical that minor difference in hardware implementation are relevant? Most people write code that isn't that performance critical, fireware or they are porting existing software over. A extreme minority of people that interact with an ISA is hand optimizing code.
reply
rwmj
3 hours ago
[-]
I agree the naming is annoying. However the extensions aren't too hard to understand. They are much more regular than on other architectures. I wrote a deep dive into them here: https://research.redhat.com/blog/article/risc-v-extensions-w...

Also groups of extensions are consolidated into Profiles, so in practice you don't really care about individual extensions. You'll only care that the hardware supports eg RVA23.

reply
camel-cdr
1 day ago
[-]
> From a bystanderʼs POV it is excessively hard to memorize all the mess with multiple different extensions

It's the same for other ISAs.

> What Iʼm slightly confused for is that all these extensions, useful for a minor part of applications, arenʼt moved to longer instructions (6-byte).

Because these instructions don't need it. There will be future >4-byte instructions, for things thay can't resonably be done in 4-bytes, e.g. much larger immediates.

reply
pclmulqdq
2 hours ago
[-]
It's way worse on RISC-V. There are maybe 5 x86 or ARM variants to care about at any given time, even if you want to hyper-optimize your code. RISC-V has a soup of literally 100s of extensions with non-uniform use and support.
reply
camel-cdr
24 minutes ago
[-]
There are a lot more ARM extensions than people are aware of. E.g. debian uses ARMv8-A with FEAT_FP and FEAT_AdvSIMDas a base. Yes, floating-point and SIMD are optional in ARMv8-A, as are the following ISA extensions, only including ones that add instructions and excluding the AArch32 stuff: FEAT_CRC32, FEAT_AES, FEAT_PMULL, FEAT_SHA1, FEAT_SHA256, FEAT_RDM, FEAT_F32MM, FEAT_F64MM, FEAT_I8MM, FEAT_LSMAOC, FEAT_SHA3, FEAT_SHA512, , FEAT_SM3, FEAT_SM4, FEAT_SVE, FEAT_EPAC, FEAT_FCMA, FEAT_JSCVT, FEAT_LRCPC, FEAT_DotProd, FEAT_FHM, FEAT_FlagM, FEAT_LRCPC2, FEAT_BTI, FEAT_FRINTTS, FEAT_FlagM2, FEAT_MTE, FEAT_MTE2, FEAT_RNG, FEAT_SB, FEAT_BF16, FEAT_DGH, FEAT_EBF16, FEAT_CSSC, ...

Also fun: FEAT_LittleEnd, FEAT_MixedEnd, FEAT_BigEnd

All of that was just 64-bit ARMv8.x-a, there is a lot more stuff, once you go to R or M profiles, 32-bit and previous versions.

The reason this is mostly not a problem, is that distros converged on a minimum of 64-bit ARMv8-A + FP + SIMD, which will also happen with RVA23 for RISC-V.

Just for fun, here are the Zen4 ISA flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl tsc_reliable nonstop_tsc cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3 dnowprefetch osvw topoext perfctr_core ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512_bf16 clzero xsaveerptr arat npt nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsave_vmload avx512vbmi umip avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid fsrm

Compared to RVA23 written out: rv64imafdcbv_zicsr_zicntr_zihpm_ziccif_ziccrse_ziccamoa_zicclsm_zic64b_za64rs_zihintpause_zba_zbb_zbs_zicbom_zicbop_zicboz_zfhmin_zkt_zvfhmin_zvbb_zvkt_zihintntl_zicond_zimop_zcmop_zcb_zfa_zawrs_svbare_svade_ssccptr_sstvecd_sstvala_sscounterenw_svpbmt_svinval_svnapot_sstc_sscofpmf_ssnpm_ssu64xl_sha_supm_zifencei

reply
benj111
1 hour ago
[-]
What are you imagining? If this is desktop then most of the extensions are going to be standard.

The only reason they're optional is because I'm using the same instruction set on my Pico, so no it doesn't have floating point, and I believe it has integer divide but I wouldn't be surprised if it didn't.

And the extensions are in groups, a good chunk of which are compressed instructions, which unless you're writing assembly, you don't need to worry about.

In fact most of this you don't need to worry about unless youre writing assembly.

reply
imtringued
1 hour ago
[-]
Electronics distributors search engines tend to work extremely poorly and if you try to overload them with an absurd variety of niche extensions, then nobody is going to find the right RISC V MCU for their needs.
reply
wg0
1 hour ago
[-]
> It's the same for other ISAs.

No they are not. See the Intel Software Programmer Volumes. Highly detailed, highly structured and highly specific.

reply
Joker_vD
1 hour ago
[-]
You're joking, right?
reply
pjmlp
53 minutes ago
[-]
It is like Khronos APIs but in hardware, design by comittee at its best.
reply
NooneAtAll3
1 hour ago
[-]
looks like there's no mention of soft-floats support, like in Hazard3 cores

see f.e.: https://wren.wtf/shower-thoughts/marks-magic-multiply/

reply
andrepd
2 hours ago
[-]
Some of the complexity that comes with this really comes from the complexity of IEEE734 itself, plus the fragmentation of alternatives at lower precision.

I would have loved if the article mentioned the efforts at integrating Posits [0] in risc-v. While IEEE734 compatibility will obviously be necessary for any foreseeable future, it would be nice if the industry could settle on a better alternative which avoids many of the flaws with IEEE floats.

[0] https://github.com/andrepd/posit-rust

reply