Milk-V Titan: A $329 8-Core 64-bit RISC-V mini-ITX board with PCIe Gen4x16
129 points
6 days ago
| 15 comments
| cnx-software.com
| HN
alexrp
8 hours ago
[-]
Most people would be better off waiting for the multiple RVA23 boards that are supposed to come out this year, at least if they don't want to be stuck running custom vendor distros. "RVA23 except V" at this price point and at this point in time is a pretty bad value proposition.

It's honestly a bit hard to understand why they bothered with this one. No hate for the Milk-V folks; I have 4 Jupiters sitting next to me running in Zig's CI. But hopefully they'll have something RVA23-compliant out soon (SpacemiT K3?).

reply
camel-cdr
6 hours ago
[-]
> But hopefully they'll have something RVA23-compliant out soon (SpacemiT K3?).

A handful of developers already have access to SpacemiT K3 hardware, which is indeed RVA23 compliant and already runs Ubuntu 26.04.

geekbench: https://browser.geekbench.com/v6/cpu/16145076

rvv-bench: https://camel-cdr.github.io/rvv-bench-results/spacemit_x100/... (which has instruction throughput measurements and more)

reply
ahoka
4 hours ago
[-]
This is around the performance of a Core 2 Duo, if I understand correctly?
reply
camel-cdr
3 hours ago
[-]
The single core performance is roughly in the middle between Pi4 Cortex-A72 and Pi5 Cortex-A76.

It's slightly faster than a 3GHz Core 2 Duo in scalar single-threaded performance, but it has 8 cores instead of two and more SIMD performance. There are also 8 additional SpacemiT-A100 cores with 1024-bit wide vectors, which are more like an additional accelerator.

The geekbench score is a bit lower than it should be, because at least three benchmarks are still missing SIMD acceleration on RISC-V (File Compression, Asset Compression, Ray Tracer), and the HTML5 browser test is also missing optimizations.

I'd estimate it should be able to get to the 500 range with comparable optimization to other architectures.

The Milk-V Titan mentioned in the original post is actually slightly faster in scalar performance, but has no RISC-V Vector support at all, which causes its Geekbench score to be way lower.

reply
ahoka
1 hour ago
[-]
That’s actually decent, thanks.
reply
oxxoxoxooo
3 hours ago
[-]
Do you happen to know how one accesses/uses those A100 cores?
reply
camel-cdr
3 hours ago
[-]
No.

The problem is that you can't migrate threads between cores with different vector length.

The currently installed Ubuntu 26.04 image lists 16 cores in htop, but you can only run applications on the first 8 (e.g. taskset -c 10 fails). If you query what's running on the A100 cores, you see things like "kworker" processes.

I suspect that it should be possible to write a custom kernel module that runs on the A100s with the current kernel, but I'm not sure.

I expect it will definitely be possible to boot an OS on only the 8 A100 cores.

We'll have to see if they manage to figure out how to add support for explicitly pinning user mode processes to the cores.

The ideal configuration would be to have everything run only on the X100s, but with an opt-in mechanism to run a program only on an A100 core.
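
For anyone who wants to probe this on their own board, here is a minimal sketch of the check that taskset is effectively doing. It assumes Linux and Python's os.sched_setaffinity, and it assumes the kernel rejects the pin with an OSError for cores that user processes may not run on; the helper name pinnable_cpus is made up for illustration.

```python
import os

def pinnable_cpus(candidates):
    """Return the CPU ids from `candidates` that this process can be pinned to.

    Each id is tried on its own with sched_setaffinity; an OSError (for
    example EINVAL) is treated as "user processes cannot run there".
    The original affinity mask is restored before returning.
    """
    original = os.sched_getaffinity(0)
    usable = []
    try:
        for cpu in candidates:
            try:
                os.sched_setaffinity(0, {cpu})
                usable.append(cpu)
            except OSError:
                # The kernel refused to pin this process to `cpu`.
                pass
    finally:
        os.sched_setaffinity(0, original)
    return usable
```

Calling pinnable_cpus(range(os.cpu_count() or 1)) on the image described above would presumably list only the first 8 cores, but I have not verified what it prints there.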

reply
6SixTy
3 hours ago
[-]
Something is odd here: the Core 2 Duo only has up to SSE 4.1, while the RVA23 instruction set is analogous to x86-64-v3. I find it hard to believe that the SpacemiT K3 matched a Core 2 Duo single-core score while leveraging those new instructions.

To wit, the Geekbench 6.5.0 RISC-V preview has 3 files, 'geekbench6', 'geekbench_riscv64', and 'geekbench_rv64gcv', which are presumably the benchmark executables, named after the instruction sets they support. This makes the score an unreliable narrator of performance, as someone could have run one of the other executables and the posted score would not be genuine. And that's on top of the perennial remark that the benchmark(s) could simply not be optimized for RISC-V.

reply
adgjlsfhk1
2 hours ago
[-]
If it's anything like the K1, I wouldn't be surprised if Core 2 performance was on the table. The released specs are ~Sandybridge-Haswell like, but those were architectures made by (at the time) the top CPU manufacturer and were carefully balanced to maximize performance while minimizing transistors. SpacemiT is playing on easy mode (they are making a chip on a ~2-4x smaller process node and aren't pioneering bleeding edge techniques), but balancing an out of order CPU is still tough, and it's totally possible to lose 50% of theoretical IPC if you don't have the memory bandwidth, cache hierarchy, scheduling etc.
reply
6SixTy
1 hour ago
[-]
Cache issues add another layer here, if it's not the whole issue. Device tree patches for the K3 have 2 clusters of 4 cores with shared 4MB L2 cache per cluster. Core 2 Duo P8400 has 3MB L2 shared between 2 cores, and Sandybridge-Haswell have per core L2 and shared L3.
reply
irusensei
7 hours ago
[-]
I feel this is becoming a bit of a tech urban legend, like "ZFS requires ECC".

As far as I understand, the RVA23 requirement is an Ubuntu thing only, and only for current non-LTS and future releases. The current LTS doesn't have such a requirement, and neither do other distributions that support riscv64, such as Fedora and Debian.

So no, you are not stuck running custom vendor distros because of this, but rather because of the weird device drivers and boot systems that have no mainline support.

reply
alexrp
7 hours ago
[-]
I'm fairly sure I recall Fedora folks signaling that they intend to move to RVA23 as soon as hardware becomes generally available.

It is of course possible that Debian sticks with RV64GC for the long term, but I seriously doubt it. It's just too much performance to leave on the table for a relatively new port, especially when RVA23 will (very) soon be the expected baseline for general-purpose RISC-V systems.

reply
rwmj
4 hours ago
[-]
As someone from the Fedora/RISC-V project, it'll depend on what our users want. We cannot support both RV64GC and RVA23 (because we don't have the build or software infra to do it) so we have to be careful when we move. Doing something like building with RV64GC generally but having targeted optimizations - perhaps two kernel variants and some libraries - might be possible, but also isn't easy.

Things are different for CentOS / RHEL where we'll be able to move to RVA23 (and beyond) much more aggressively.

reply
znpy
3 hours ago
[-]
First things first: thank you for your work.

That being said: does it make sense to keep a new but low-performance platform alive? As the platform is new and likely doesn’t have many users, wouldn’t it make sense to nudge (as in “gently push”) users towards a higher-performance platform?

Chances are the low-performance platform will die anyway, and Fedora will not be exploiting the full offering of the high-performance platform.

reply
rwmj
2 hours ago
[-]
It's about what users think in our forums: https://discussion.fedoraproject.org/tag/risc-v-sig
reply
fweimer
7 hours ago
[-]
I'm not completely sure, but I suspect Fedora will stick to the current baseline for quite some time.

But the baseline is quite minimal. It's biased towards efficient emulation of the instructions in portable C code. I'm not sure why anyone would target an enterprise distribution to that.

On the other hand, even RVA23 is quite poor at signed overflow checking. Like MIPS before it, RISC-V is a bet that we're going to write software in C-like languages for a long time.

reply
camel-cdr
5 hours ago
[-]
> On the other hand, even RVA23 is quite poor at signed overflow checking

When I tried to measure the impact of -ftrapv in RVA23 and armv9, it was roughly the same: https://news.ycombinator.com/item?id=46228597#46250569

reminder:

    unsigned 64-bit:
    add: RV: add+bltu       Arm: adds+bcc
    sub: RV: sub+bltu       Arm: subs+bcs
    mul: RV: mulhu+mul+beqz Arm: umulh+mul+cbz
    
    unsigned 32-bit:
    add: RV: addw+bgeu     Arm: adds+bcc
    sub: RV: subw+bgeu     Arm: subs+bcs
    mul: RV: mul+slli+beqz Arm: umul+cmp lsr 32

    signed 64-bit:
    add: RV: add+slt+slti+beq  Arm: adds+bvc
    sub: RV: sub+slt+slti+beq  Arm: subs+bvs
    mul: RV: mulh+mul+srai+beq Arm: smulh+mul+cmp asr 63
    
    signed 32-bit:
    add: RV: addw+add+beq   Arm: adds+bvc
    sub: RV: subw+sub+beq   Arm: subs+bvs
    mul: RV: mul+sext.w+beq Arm: smul+asr+cmp asr 31
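
To make the flag-free 64-bit add sequences above concrete, here is a rough Python model of what they compute. It is only a sketch: the function names are made up, and Python integers stand in for 64-bit registers by masking to 64 bits.

```python
MASK64 = (1 << 64) - 1

def to_signed64(x):
    """Reinterpret the low 64 bits of x as a two's-complement signed value."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x

def uadd64_overflows(a, b):
    """Models add + bltu: the wrapped sum is below an operand iff a carry occurred."""
    total = (a + b) & MASK64          # add
    return total < a                  # bltu total, a

def sadd64_overflows(a, b):
    """Models add + slt + slti + branch.

    Without overflow, the sum is smaller than a exactly when b is negative,
    so overflow is the case where those two comparisons disagree.
    """
    total = to_signed64(a + b)        # add (wraps to 64 bits)
    sum_below_a = total < a           # slt
    b_negative = b < 0                # slti b, 0
    return sum_below_a != b_negative  # beq skips the trap when they agree
```

For example, sadd64_overflows(2**63 - 1, 1) is True, while sadd64_overflows(-1, 1) is False.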
reply
IshKebab
6 hours ago
[-]
> On the other hand, even RVA23 is quite poor at signed overflow checking.

On the other hand it avoids integer flags which is nice. I doubt it makes a measurable performance impact either way on modern OoO CPUs. There's going to be no data dependence on the extra instructions needed to calculate overflow except for the branch, which will be predicted not-taken, so the other instructions after it will basically always run speculatively in parallel with the overflow-checking instructions.

reply
fweimer
5 hours ago
[-]
It's nice for a C simulator to avoid condition codes. It's not so nice if you want consistent overflow checks (e.g., for automatically overflowing from fixnums to bignums).

Even with XNOR (which isn't even part of RVA23, if I recall correctly), the sequence for doing an overflow check is quite messy. On AArch64 and x86-64, it's just the operation followed by a conditional jump: https://godbolt.org/z/968Eb1dh1

reply
adgjlsfhk1
1 hour ago
[-]
Non-flag based overflow checks are still pretty cheap. The overflow check is only 1 extra instruction for unsigned (both add and multiply), and 3/4 extra for signed overflow (see https://godbolt.org/z/nq1nb5Whr for details). It's also worth noting that in many cases, the overflow checks will be removable or simplifiable by the compiler entirely (e.g. if you're adding 1 or know the sign of one of the operands etc). As such, the extra couple instructions are likely worthwhile if it makes designing a wider core easier. Signed overflow instructions would be reasonable to add, but it's not like modern high performance cores are bottlenecked by scalar instructions that don't touch memory anyway.
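
As an illustration of the multiply case, here is a rough Python model of the high-half checks (mulhu for unsigned, mulh plus an arithmetic shift for signed). It is a sketch with made-up function names, not taken from the godbolt link, and Python's unbounded integers stand in for the 128-bit product.

```python
MASK64 = (1 << 64) - 1

def to_signed64(x):
    """Reinterpret the low 64 bits of x as a two's-complement signed value."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x

def umul64_overflows(a, b):
    """Models mulhu + branch: overflow iff the upper 64 bits are non-zero."""
    high = (a * b) >> 64              # mulhu
    return high != 0                  # beqz skips the trap when high is zero

def smul64_overflows(a, b):
    """Models mulh + mul + srai + branch.

    The product fits in 64 bits iff the upper half equals the sign
    extension of the lower half (all zeros or all ones).
    """
    product = a * b
    low = to_signed64(product)        # mul
    high = product >> 64              # mulh (Python's >> is arithmetic)
    return high != (low >> 63)        # srai low, 63 then compare
```

For example, smul64_overflows(2**62, 2) is True, and smul64_overflows(-2**63, 1) is False.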
reply
IshKebab
6 hours ago
[-]
I don't think you'll be able to get away from custom distros even with RVA23. It solves the problem of binary compatibility - everything compiled for RVA23 should be pretty portable at the instruction level (won't help with the usual glibc nonsense of course).

But RVA23 doesn't help with the hardware layer - it's going to be exactly the same as ARM SBCs where there's no hardware discovery mechanism and everything has to be hard-coded in the Linux device tree. You still need a custom distro for Raspberry Pi for example.

I believe there has been some progress in getting RISC-V ACPI support, and there's at least the intent of making mconfigptr do something useful - for a while there was a "unified discovery" task group, but it seems like there just wasn't enough manpower behind it and it disbanded.

https://github.com/riscvarchive/configuration-structure/blob...

https://riscv.atlassian.net/browse/RVG-50

reply
alexrp
5 hours ago
[-]
> You still need a custom distro for Raspberry Pi for example.

Are you sure that's still the case? I just checked the Raspberry Pi Imager and I see several "stock" distro options that aren't Raspbian.

Regardless, I take your point that we're reliant on vendors actually doing the upstreaming work for device trees (and drivers). But so far the recognizable players in the RISC-V space do all(?) seem to be doing that, so for now I remain hopeful that we can avoid the Arm mess.

reply
IshKebab
2 hours ago
[-]
I'm not totally sure, but I would imagine those stock distros actually have dedicated packages for Raspberry Pi kernel images.

See this for example: https://www.phoronix.com/news/Raspberry-Pi-5-Ethernet-Linux

If you look at the patch series, it's directly adding information about the address of the ethernet device. That's the sort of thing that would be discovered automatically in the x86 world. It wouldn't need to be hard-coded into the kernel for each individual board that is supported.

reply
Findecanor
8 hours ago
[-]
I've noticed that the sentence “Compliant with RVA23 excluding V extension” has apparently been a bit confusing to some reporters in the tech press lately.

It means that the UR-DP1000 chip would have been RVA23-compliant if only it had supported the V (Vector) extension. The Vector extension is mandatory in the RVA23 profile.

There are other chips out there even closer to being RVA23-compliant, that have V but not a couple of scalar extensions. The latter have been emulated in software using trap handlers, but there was a significant performance penalty. V is such a big extension, with many instructions and requiring more resources, that I don't think that it would be worth the effort.
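
If you want to check a board yourself, a quick sketch like the following tells you whether V is advertised. It assumes the lowercase, underscore-separated ISA string format that Linux prints on the "isa" line of /proc/cpuinfo on RISC-V, it does not handle version suffixes such as "i2p1", and the function names are made up.

```python
def parse_isa(isa):
    """Split an ISA string such as 'rv64imafdc_zicsr_zba' into extension names.

    Single-letter extensions follow the 'rv32'/'rv64' prefix; multi-letter
    extensions are separated by underscores.
    """
    isa = isa.strip().lower()
    if not isa.startswith(("rv32", "rv64")):
        raise ValueError(f"unrecognised ISA string: {isa!r}")
    base, *multi = isa[4:].split("_")
    extensions = set(base) | {name for name in multi if name}
    if "g" in extensions:
        # 'g' is shorthand for the general-purpose set.
        extensions |= {"i", "m", "a", "f", "d", "zicsr", "zifencei"}
    return extensions

def has_vector(isa):
    """True if the full V extension is listed (not just Zve* or Zv* subsets)."""
    return "v" in parse_isa(isa)
```

With this, has_vector("rv64imafdcv_zba_zbb") is True and has_vector("rv64imafdc_zba_zbb_zve32x") is False, since multi-letter names containing a "v" are kept as separate tokens.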

reply
fidotron
8 hours ago
[-]
> The latter have been emulated in software using trap handlers, but there was a significant performance penalty.

This is a thing SoC vendors have done before without informing their customers until it's way too late. Quite a few players in that industry really do have shockingly poor ethical standards.

reply
fweimer
7 hours ago
[-]
I'm not sure if it's intentional. AWS doesn't have CPU features in their EC2 product documentation, either. It doesn't necessarily mean that they can disable CPU features for instances covered by existing customer contracts.
reply
fidotron
7 hours ago
[-]
> I'm not sure if it's intentional

This is the sort of comment that makes people lose faith in HN.

There totally are cases where it's intentional, and no they are not discussed on the internet for obvious reasons. People in the industry will absolutely know what I'm on about.

reply
fweimer
6 hours ago
[-]
I didn't intend to dismiss your experience. From the opposite (software) side, these things are hard to document, and unclear hardware requirement documentation results from the complexity and (perhaps) unresolved internal conflict.
reply
PunchyHamster
7 hours ago
[-]
I'm sure it's in a footnote in the datasheet.
reply
fidotron
7 hours ago
[-]
No, they really are that grimy and will pull tricks like this until you call them out on them.

They will then issue errata later, after millions of devices have been shipped.

reply
reactordev
7 hours ago
[-]
In 6pt Mandarin.
reply
drob518
7 hours ago
[-]
The good: eight cores. The bad: it’s slow and still no V extension. On the bright side, it uses DDR4, so you might be able to find RAM for it. “Titan” feels like some wishful over marketing.
reply
dwood_dev
8 hours ago
[-]
I'm surprised we have not seen more investment into RISC-V from Chinese firms. I would think they want to decouple from ARM and the west in general as a dependency. Maybe they view the coup of ARM China as having secured ARM for the time being and not as much pressure?

Either way, it's currently hard to be excited about RISC-V ITX boards with performance below that of a RPi5. I can go on AliExpress right now and buy a mini itx board with a Ryzen 9 7845HX for the same price.

reply
cbm-vic-20
6 hours ago
[-]
I'm surprised we haven't seen more investment from Indian firms. India is really trying to raise their game in the tech economy beyond the services industry. You don't need the most cutting-edge chip fabrication equipment to manufacture these processors.
reply
rwmj
4 hours ago
[-]
The Indian government was actually a very early adopter, they had one of the first RISC-V processors that wasn't based on the vanilla Rocket designs, starting back in like 2016 or so. Unfortunately they made some other weird decisions, like not supporting compressed instructions so the chip wasn't compatible with any mainstream Linux distros, and I don't think the project really went anywhere. Although looking at Wikipedia the project seems to still exist.

https://en.wikipedia.org/wiki/SHAKTI_(microprocessor)

reply
oofbey
5 hours ago
[-]
Services industry is gonna be tough in the age of AI agents.
reply
lazide
5 hours ago
[-]
If by really trying you mean a bunch of PR stunts and diverted funds, sure.

Are you aware of any actual credible attempts?

reply
dev_l1x_be
2 hours ago
[-]
China is vested in Loongson.

https://en.wikipedia.org/wiki/Loongson

reply
crote
8 hours ago
[-]
China also has LoongArch.
reply
6SixTy
4 hours ago
[-]
LoongArch is a weird mix of MIPS and RISC-V. There's not much that would be gained by investing a whole bunch into LoongArch that couldn't also be done to RISC-V, if at all.
reply
PunchyHamster
7 hours ago
[-]
It's not a better architecture, so the gain is a few pennies more per chip at the cost of A LOT of work... work that can't just run Android or much else out of the box.
reply
c0wb0yc0d3r
7 hours ago
[-]
Isn’t that kind of like saying automated testing (for apps written without testing in mind) isn’t worth it because you have to spend time getting code into a state that is testable?

I do agree that it takes a lot of work to get something usable, and so I think we are a ways off from mainstream RISC-V. I do also think there is a lot more value for low power devices like embedded/IoT or instances where you need special hardware. Facebook uses it to make special video transcoding hardware.

reply
z3512
2 hours ago
[-]
Getting the chip stable is one thing. Getting all the various software applications that people use tested on the chip is another, and a lot of hardware companies won’t have experience there. Then there is customer code…
reply
ginko
8 hours ago
[-]
> I would think they want to decouple from ARM and the west in general as a dependency.

Why would you think that? ARM is not like x86 CPUs where you get the completed devices as a black box. Chinese silicon customers have access to the full design. I guess it's not completely impossible to hide backdoors at that level but it'd be extremely hard and would be a huge reputational risk if they were found.

They also can't really be locked out of ARM since if push comes to shove, Chinese silicon makers would just keep making chips without a license.

reply
the_biot
6 hours ago
[-]
I did catch one vendor using a HAL across a whole SoC product line, a very low-level HAL that sat between SoC hardware registers and kernel drivers. It effectively made the drivers use scrambled register locations on the AHB etc, but if you resolved what the HAL did, the registers matched ARM's UART etc IP. So I figured they were ducking license fees for ARM peripherals.
reply
sylware
7 hours ago
[-]
arm has toxic IP locks.

Everybody sane will want to move away from them; there is nothing Chinese-specific about it.

The most performant RISC-V implementations are from the US if I am not too mistaken.

Wonder if that hardware can handle an AMD 9070 XT (resizable bar). If so, we need the steam client to be recompiled for RISC-V and some games... if this RISC-V implementation is performant enough (I wish we would have trashed ELF before...)

reply
fweimer
7 hours ago
[-]
Is there an actual U.S. RISC-V CPU that achieves competitive performance? I think the performance leaders are currently based in China.

There's a difference between announcement, offering IP for licensing (so you still have to make your own CPUs), shipping CPUs, and having those CPUs in systems that can actually boot something.

reply
sylware
4 hours ago
[-]
For instance, SiFive in the US, but last time I checked, the RVA23 CPUs on their workstation boards did not have cache-line-size vector instructions (only 128 bits, i.e. SSE-grade, I think). RVA23 mandates the same "sweet spot" cache line size as x86_64: 64 bytes/512 bits.
reply
geerlingguy
7 hours ago
[-]
Box64 already runs Steam (and a good number of games) on RISC-V.
reply
sylware
4 hours ago
[-]
Yep... but RISC-V large and performant implementations won't access the latest silicon process for a good while... better to push for native support to help.
reply
easygenes
7 hours ago
[-]
As a point of comparison, the Radxa Orion O6 shipped a year ago as a 12 core ARMv9 board in the same form factor and TDP, for $100 less, with 5x the single core performance (and including a competent iGPU, NPU and VPU). These are very much developer/tinkerer only boards as is.
reply
mkesper
5 hours ago
[-]
reply
fc417fc802
4 hours ago
[-]
It appears to cost about twice as much as the Titan these days. Not sure if that's RAM, tariffs, or something else.
reply
roughly
3 hours ago
[-]
These are still very early days for RISC-V, but I’m always happy to see things progress in this space. No, this isn’t a viable desktop for the average consumer, but if it makes the architecture more accessible for the types of weirdos who tend to pave the way for the rest of us, it’s good.
reply
dev_l1x_be
2 hours ago
[-]
I just got a Milk-V Mars and it is a bit rough around the edges. The Discord group helped me get Debian running on it, but I would not say it was simple or easy to get it working.
reply
justaboutanyone
5 hours ago
[-]
Where are the RVA23 boards that have been hinted at for so long?
reply
Sparkyte
4 hours ago
[-]
Considering you can get a more core dense package on x86 for that price it is better to wait.

2.0 GHz isn’t a whole lot of performance for a RISC-V system.

reply
fulafel
8 hours ago
[-]
Is it really this much slower than a Raspberry Pi 5? https://browser.geekbench.com/v5/cpu/compare/23667112?baseli...

tldr; 236 vs 666 single core score

reply
normie3000
8 hours ago
[-]
From your link it seems to be 3x slower. It's not clear to me why this comparison is relevant.
reply
fulafel
8 hours ago
[-]
I was wondering about the value proposition. But I guess it's more like a dev / tinkering board then.
reply
moffkalast
1 hour ago
[-]
Yep, the Pi 5 is an ARM though and you have to license the architecture to use it. RISC-V is open source, but currently sucks in pretty much all aspects compared to any other. This is still way in the early dev and build support phase.
reply
fweimer
6 hours ago
[-]
For integer workloads it seems closer to 60% of RPi 5 performance. There are some benchmarks that depend on vector support or dedicated CPU instructions for good results, and they skew the results.
reply
6SixTy
2 hours ago
[-]
A TL;DR doesn't explain everything. The Milk-V Titan doesn't have Vector instructions or crypto, while the Pi 5 does. It's very clearly a broken benchmark.

This is why a bunch of RISC-V people won't buy boards without RV Vector instructions.

reply
throwaway85825
3 hours ago
[-]
Until the RISC-V ecosystem receives upstreamed and maintained support, it's going to be a hard sell vs x86 that 'just works'.
reply
flykespice
3 hours ago
[-]
Really, I don't get why anyone would buy these pricey RISC-V development boards over much cheaper ARM-based variants that are faster.

What is the target audience for these development boards anyway?

reply
15155
1 hour ago
[-]
People who think "it's open source bro! Boo ARM!" without understanding how peripheral IP works.
reply
TheRealPomax
2 hours ago
[-]
But why are we still slapping on woefully tiny stock coolers that spin so fast your room sounds like an RC racing arena?
reply
fnord77
6 hours ago
[-]
Kinda pricey? You can get an entire M4 Mac Mini for $499
reply
singinishi
8 hours ago
[-]
So slow.
reply
webdevver
6 hours ago
[-]
riscv is going to start having brand issues with these hardware offerings (if it doesn't already.)

sure, prototypes are good. but maybe it shouldn't be sold as a general product, because it implies that the sellers deem it a good product (when it obviously isn't.)

maybe it should be a closed offering, e.g. we're only making 1000, and we're only sending them to select few specialists/reviewers.

reply