FilterHN

What if AMD FX had "real" cores? [video]

32 points

by zdw

5 days ago

| 7 comments

| youtube.com

magicalhippo

2 days ago

I'm a bit curious why this couldn't have been covered earlier by simply having the OS disable some cores on a 4-module, 8-thread part. After all, the video does point out that if only one thread uses a module, it has full access to all the resources of that module.

Also, the benchmark is clock-for-clock, so while the older Phenom II looks like it's ahead, the Bulldozer should be able to go faster still.

All that said, I really enjoyed this retrospective look.
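As a rough illustration of the "one thread per module" idea above, here is a minimal Python sketch that picks one logical CPU from each module. It assumes the OS exposes module siblings as Linux-style CPU lists (for example the strings found in `thread_siblings_list` under sysfs); the eight-entry layout below is hypothetical, and the helper names `parse_cpu_list` and `one_cpu_per_module` are made up for this example.

```python
def parse_cpu_list(text):
    """Parse a Linux-style CPU list such as "0-1" or "0,4" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def one_cpu_per_module(sibling_lists):
    """Return the lowest-numbered logical CPU of every distinct sibling group."""
    # Each logical CPU reports the same sibling list as its module partner,
    # so collecting the lists into a set leaves one entry per module.
    modules = {tuple(parse_cpu_list(s)) for s in sibling_lists}
    return sorted(m[0] for m in modules)


# Hypothetical 4-module, 8-thread layout: CPUs (0,1), (2,3), (4,5), (6,7) share a module.
siblings = ["0-1", "0-1", "2-3", "2-3", "4-5", "4-5", "6-7", "6-7"]
chosen = one_cpu_per_module(siblings)
print(chosen)
```

On Linux, a list like `chosen` could then be handed to `os.sched_setaffinity(0, chosen)` to restrict the current process to one CPU per module. Whether a given kernel reports Bulldozer modules as thread siblings is an assumption here, not something the thread confirms.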

zokier

2 days ago

There were some benchmarks at the time with disabled cores, for example: https://www.hardware.fr/articles/842-9/efficacite-cmt.html

Yizahi

2 days ago

They were real though. How many ALUs were there on, say, an FX-8350? 8 ALUs. How many FPUs were there? 8 FPUs, each 128 bits wide. What alternative definition of a core does this not satisfy? The CPU was underperforming at the time, and Intel fans were trying to equate their Hyperthreading with AMD's core organization, but they were always real cores.

dragontamer

2 days ago

8 Integer ALUs, 4 Vector FPUs, 8x L1 d-caches but only 4x L2 d-Caches.

And perhaps most importantly: 4x decoders/4x L1 iCache. IIRC, the entire damn chip was decoder-bound.

--------

Note: AMD Zen has 4x Integer pipelines and 4x FPU pipelines __PER CORE__. Modern high-performance systems CANNOT have a single 2x-pipeline FPU shared between two cores (averaging one pipeline per core). Modern Zen is closer to 4x pipelines per core, maybe more depending on how you count load/store units.
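The per-core shares implied by those counts can be worked out with a few lines of Python. This is only a sketch of the arithmetic: the unit counts are the ones quoted in this comment, the two pipelines per shared FPU and four FPU pipelines per Zen core are also taken from the comment, and the dictionary keys are names invented for the example.

```python
from fractions import Fraction

CORES = 8  # the "8 core" FX part discussed in the thread

# Unit counts as quoted in the comment above (key names are illustrative).
unit_counts = {
    "integer_cores": 8,
    "vector_fpus": 4,
    "l1_dcaches": 8,
    "l2_caches": 4,
    "decoders": 4,
    "l1_icaches": 4,
}

# Share of each resource available per core if everything is split evenly.
per_core = {name: Fraction(count, CORES) for name, count in unit_counts.items()}

# Each shared FPU is described as having 2 pipelines, split across 2 cores.
PIPES_PER_SHARED_FPU = 2
bulldozer_fpu_pipes_per_core = Fraction(
    unit_counts["vector_fpus"] * PIPES_PER_SHARED_FPU, CORES
)
zen_fpu_pipes_per_core = Fraction(4)  # "4x FPU pipelines per core" per the comment

for name, share in per_core.items():
    print(f"{name}: {share} per core")
print(f"FPU pipelines per core: Bulldozer {bulldozer_fpu_pipes_per_core}, "
      f"Zen {zen_fpu_pipes_per_core}")
```

With these numbers, every front-end resource (decoders, L1 instruction cache) and the FPU come out at half a unit per core, which is the sharing the comment is pointing at.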

dannyw

2 days ago

Yup. The limited decoders meant your pipeline just wasn’t flowing every cycle, because many of the stages were sitting idle.

dragontamer

2 days ago

Note that Intel's modern e-Core has 3x decoders per core. When code is straight, they alternate (decoder#1 / decoder#2 / decoder#3). When code is branchy, they split up across different jumps aka if/else statements.

Shrinking the decoder on Bulldozer was clearly the wrong move for the FX series / AMD. Today's chips are going for a wide decoder (e.g. Apple can do 8x decode per clock tick), a deep opcode cache (AMD Zen has a large opcode cache allowing a 6x-wide lookup per clock tick), or Intel's new and interesting multiple-decoder thing.

sidewndr46

2 days ago

How do you know the behavior of the decoding portion of Intel's E-cores? Do you work for them?

AlotOfReading

2 days ago

People use clever code to tease out microarchitectural details and scour through public information to work these things out. Agner Fog is one example. His microarchitecture analysis documents 3x decoders for the Tremont microarchitecture, the predecessor to Gracemont (what's currently used for E-cores).

https://www.agner.org/optimize/microarchitecture.pdf

zokier

2 days ago

The architecture of Intel cores is widely discussed and publicized. Here are some details for the E-cores mentioned: https://chipsandcheese.com/p/skymont-intels-e-cores-reach-fo...

> Leapfrogging fetch and decode clusters have been a distinguishing feature of Intel’s E-Core line ever since Tremont. Skymont doubles down by adding another decode cluster, for a total of three clusters capable of decoding a total of nine instructions per cycle.

dragontamer

2 days ago

Intel tells you this in their optimization manuals and white papers.

They want you to write code that takes advantage of their speedups. Agner Fog is a better writer (a sibling comment already linked to Agner Fog's stuff). But I also like referencing the official manuals and whitepapers as primary source documents.

Hard to beat Intel's documents on Intel chips, after all.

Zardoz84

2 days ago

I had a few FX CPUs (and I still have them stored): the early cheap 4-core parts and the later-generation 8-core parts (FX 8370E). I can say that if you run code that scales well across multiple CPUs, it excels at it (I can share an n-problem simulator that I used as a benchmark back in the day). They even aged far better than some Intel CPUs of the time, because they had 8 cores.

FX cores had their issues. One of them was that AMD bet too early, and too hard, that the future was a high number of cores.

zokier

2 days ago

The problem was that even for multithreaded workloads, the "8 core" FX-8150 did not always win against 4 hyperthreaded Intel cores. That is pretty apparent from e.g. the benchmarks here: https://www.phoronix.com/review/intel_corei7_3770k

You can easily see the multithreaded workloads there because you have the six core 3960X as comparison too.

HankStallone

2 days ago

I'm actually replacing the FX-8350 in my fileserver next week, because I was running ffmpeg on it and it kept crashing about a minute into the job, so I assume it was overheating either the CPU or something on the motherboard.

It's almost 10 years old, so I can't complain. And I think I got a check for $2 or something like that from the class-action suit.

doublepg23

2 days ago

Definitely worth replacing for the performance at this point, but is it possible it just needs a repaste? The thermal paste would've definitely dried out over 10 years, and that can cause overheating symptoms.

close04

2 days ago

I've been running one daily for the past ~12-13 years, and the stability is impeccable, but the performance is as you'd imagine. It's more likely that the motherboard's age and the degradation of various components would lead to instability than the CPU itself.

HankStallone

2 days ago

Good point. I was kind of itching to upgrade that box anyway, but maybe I should repaste it and make it a backup server.

puskavi

1 day ago

Wouldn't be surprised if the caps on the mobo have been cooked by all the heat.

stn8188

2 days ago

That was a neat video; I wasn't aware of the FX architecture in that detail. I loved my FX series... I had a 6300 that got me through engineering school, and now the same basic desktop serves as my kids' gaming computer (though I was able to upgrade to a cheap 8350). It definitely still holds its own with the older games that I let the kids play!

dannyw

2 days ago

It was good value! There are few good-value CPUs, sadly. I remember using my Ryzen 3300X for years and years; it got me by on a budget.

FancyFane

2 days ago

The Phenom II will always have a special place in my heart, being the CPU of choice in my first build in 2011. It's wild to see it's still being compared to modern CPUs, and winning against the competition in select benchmarks.

ahartmetz

2 days ago

I completely skipped the FX disaster / Intel dominance phase by holding on to a Phenom II X6. At the time, my upgrade policy was "when twice the performance is available for the same price as the old part". That never quite happened with Intel's 4 core parts.

flyinghamster

2 days ago

One of my old builds was a Phenom II X2 550 Black, where I found that I could either overclock it, or unlock two more cores, but not both. I chose the cores, and it ran that way for a long time. That was one of the best bang-for-the-buck deals I ever ran into for a CPU.

Zardoz84

2 days ago

They had real cores. It's only that each pair of cores shared the floating-point unit.

wmf

2 days ago

Nothing could really save the FX series. It had lower performance than Intel with twice the die size.