https://thechipletter.substack.com/p/the-arm-story-part-1-fr...
https://thechipletter.substack.com/p/the-arm-story-part-2-ar...
https://thechipletter.substack.com/p/the-arm-story-part-3-cr...
This interview with ARM's first CEO Robin Saxby is also really entertaining and informative. His energy is infectious:
What those early Archimedes systems demonstrated was that the whole thing did actually work, but their design was not what the market needed. Saxby was the right guy to lead them, and his energy just seems like something else.
Sophie (a major computing hero for many in the UK in my era) went on to do esoteric VLIW work at Broadcom for ADSL iirc.
IIRC Steve Furber and colleagues considered the licensing model and decided it would never work. Saxby made it work. All credit to him too for standing down before he overstayed his welcome and keeping out of the limelight since.
Yes, re Sophie it was Firepath I think as per this presentation (2014).
https://old.hotchips.org/wp-content/uploads/hc_archives/hc14...
Part 3: https://arstechnica.com/gadgets/2023/01/a-history-of-arm-par...
* SuperH [0], 32bit only, now basically dead, but microcontrollers are still available
* AVR32 [1], 32 bit only, also quite dead
* ARC [2], 32/64bit, still quite popular in automotive applications
[0]: https://en.wikipedia.org/wiki/SuperH
With my (limited) understanding of how ARM conquered the market, I guess this turned out to be a very consequential cost-saving measure.
Don't x86 chips also use microcode? There are several differences between RISC and CISC that aren't mentioned here.
(also, Sophie was called Roger at that point in time, so the article has been retconned)
EDIT: Though later, in the Pentium era, x86 started to do simple instructions like `ADD AX, [BX]` without microcode.
In theory PLAs and ROMs are fully equivalent. In practice, while the ROM can accept any possible "microcode", a PLA might have to be enlarged if you want to change some of the "micro instruction". This need to change the hardware to change the functionality of an instruction is what makes me consider this design hardwired instead of microcoded.
[EDIT] Another issue is that the ARM1 has three pipeline stages. The "microcode" here is not used for the fetch and decode stages, only the execute one. So though register to register operations take 3 clock cycles to execute, only one "micro instruction" is needed (the second line in the table).
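A toy Python sketch of the ROM-vs-PLA point: the 2-bit "opcode" and 4-bit control words here are invented for illustration, not ARM1's actual encoding, but the structural difference is the same.

```python
# Toy sketch of the ROM-vs-PLA distinction; the 2-bit "opcode" and
# 4-bit control words are invented, not ARM1's actual encoding.

ROM = [0b0001, 0b0011, 0b0001, 0b0011]  # one word stored per address

def rom_lookup(opcode):
    # A ROM decodes the address and returns the stored word; changing an
    # instruction's behaviour is just a data change.
    return ROM[opcode]

# A PLA shares product terms: 'x' is a don't-care, so one AND-plane row
# can cover several addresses, which is what keeps it smaller than a ROM.
PLA_TERMS = [
    ("x0", 0b0001),  # covers opcodes 00 and 10
    ("x1", 0b0011),  # covers opcodes 01 and 11
]

def pla_lookup(opcode_bits, terms):
    out = 0
    for pattern, word in terms:
        if all(p in ("x", b) for p, b in zip(pattern, opcode_bits)):
            out |= word  # the OR plane combines every matching term
    return out

# Both structures agree on all four opcodes...
for op in range(4):
    assert rom_lookup(op) == pla_lookup(f"{op:02b}", PLA_TERMS)

# ...but making opcode 11 also assert bit 3 is a one-word edit in the
# ROM, versus growing the PLA by a whole new AND-plane row:
ROM[3] |= 0b1000
PLA_TERMS.append(("11", 0b1000))
assert rom_lookup(3) == pla_lookup("11", PLA_TERMS) == 0b1011
```

Which is exactly the "enlarge the PLA to change a micro instruction" problem: the ROM's capacity is fixed by its address space, while the PLA's capacity is fixed by how many product-term rows were laid out.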
So, reading the citation fully, it seems that Furber doesn't really dive deep into the ARM1, instead saving a deep dive for the ARM2 as well as an additional chapter about the changes then made for the ARM3. I think kens might be steelmanning the position.
> In theory PLAs and ROMs are fully equivalent. In practice, while the ROM can accept any possible "microcode", a PLA might have to be enlarged if you want to change some of the "micro instruction". This need to change the hardware to change the functionality of an instruction is what makes me consider this design hardwired instead of microcoded.
Traditionally, the difference would be defined as whether or not it's structured as an address being decoded to create a one hot enable signal for a single row of the array at a time. When you take the FSM signals as a sub address, and the init, interrupt, and decoded instr bits as a segment address, this is what you see here. And that matches the structure seen in traditional CISC microcode ROMs.
Additionally, there is extra space on the die for an additional few rows.
> [EDIT] Another issue is that the ARM1 has three pipeline stages. The "microcode" here is not used for the fetch and decode stages, only the execute one. So though register to register operations take 3 clock cycles to execute, only one "micro instruction" is needed (the second line in the table).
The pipelining doesn't really matter here. The 486, for instance, more or less completed one (simple) instruction per clock, but had a rather deep pipeline for the time, so those instructions had several cycles of latency. Those simple instructions were also a single micro-op despite being processed in several pipeline stages. And micro decode was not the first stage of the pipeline either: the 486 had fetch, decode 1, decode 2, execute, and writeback stages, and didn't start emitting micro instructions until the output of the third stage.
I think the actual explanation is that the CISC ops are decoded to more or less the same or similar types of RISC ops, but requiring more physical hardware to do the decode, correct?
The tradeoff here being lower memory for instructions, but more silicon+transistors needed for decode hardware.
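A hypothetical decoder sketch of that tradeoff; the micro-op format and the "tmp" register are invented for illustration, but the shape matches the `ADD AX, [BX]` case mentioned earlier:

```python
# Hypothetical CISC-to-micro-op decoder; micro-op tuples and the "tmp"
# register name are invented for illustration.

def decode(cisc_instr):
    op, dst, src = cisc_instr
    if src.startswith("["):             # memory operand, e.g. "[BX]"
        base = src.strip("[]")
        return [("LOAD", "tmp", base),  # extra micro-op to fetch the operand
                (op, dst, "tmp")]       # then an ordinary reg-reg operation
    return [(op, dst, src)]             # register operand: one micro-op

# One compact CISC instruction expands to two RISC-like micro-ops: the
# instruction stream stays small, and the decode hardware pays for it.
micro_ops = decode(("ADD", "AX", "[BX]"))
assert micro_ops == [("LOAD", "tmp", "BX"), ("ADD", "AX", "tmp")]
```

A RISC front end skips this expansion step entirely, which is part of why its decode is so much cheaper in transistors, at the cost of a bigger instruction footprint in memory.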
What I didn't realise is that RISC existed and was an IBM initiative prior to Dave Patterson's research and coining of the term.
I remember the quote but not the source :(
So this is not an urban legend after all, and it's about the first ever ARM CPU! Very cool story indeed.