In a 2014 interview¹ with the Computer History Museum, Bill Mensch said:
> We think of ARM as our prodigal son. Got to do what they got to do, and it's not something I want to do, but I'm proud of them. They started with my technology, and I wish them well.
¹: https://archive.computerhistory.org/resources/access/text/20...
The 816 has a 24-bit address bus, but its registers are limited to 16 bits. So there's no way to represent a "full" address in a register; in other words, a pointer can't be held in a single register.
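To make the width mismatch concrete, here is a minimal Python sketch of how a 24-bit address is composed from an 8-bit bank and a 16-bit offset. The function names are made up for illustration and only model the arithmetic, not any real 65816 tooling.

```python
REGISTER_BITS = 16  # width of A, X, Y on the 65816 in native mode


def full_address(bank: int, offset: int) -> int:
    """Combine an 8-bit bank and a 16-bit offset into a 24-bit address."""
    if not 0 <= bank <= 0xFF:
        raise ValueError("bank must fit in 8 bits")
    if not 0 <= offset <= 0xFFFF:
        raise ValueError("offset must fit in 16 bits")
    return (bank << 16) | offset


def split_address(address: int) -> tuple[int, int]:
    """Split a 24-bit address back into (bank, offset)."""
    if not 0 <= address <= 0xFFFFFF:
        raise ValueError("address must fit in 24 bits")
    return address >> 16, address & 0xFFFF


def fits_in_register(value: int) -> bool:
    """True if the value can be held in one 16-bit register."""
    return 0 <= value < (1 << REGISTER_BITS)


pointer = full_address(0x12, 0x3456)
print(f"{pointer:06X}")           # 123456
print(fits_in_register(pointer))  # False
```

Any pointer outside bank 00 has to be carried as two pieces, which is why it ends up in memory or spread across a bank register and a 16-bit register rather than in one register.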
Unfortunately this core, staying true to the 816, still has a 16-bit program counter and the awkward banked address scheme of the 816. I assume it has the same restriction of limiting stack and direct-page to the 00-bank, too?
The 816 really felt to me like Mensch just kind of bolted a little circuitry on the side of the 6502. It doesn't have the simplistic elegance of the 6502.
A "better" 32-bit 65xx I think would be just to widen all registers (including program counter) out to 32-bits and leave it at that, forgetting about binary compatibility. A big linear memory.
With the 816, you can sometimes avoid switching with sequences like AND #$00FF. But with the 832, that could involve AND #$000000FF, which is more expensive, especially with the data bus still being 8-bit. It would have been better to repurpose the WDM opcode for a prefix or another switching instruction.
On top of that, the 65832 datasheet is dated March 1990. For comparison, the 80486 was launched in 1989, the 68040 in 1990, and the ARM6 in 1992. This design already looked clunky compared to the 68008, much less what its contemporaries would have been.
I've done a little 65816 coding and I quickly learned that it was best to standardize on register sizes throughout most of the code and for routine calls, only switching to optimize things. 8-bit A and 16-bit X and Y made the most sense by far for small-scale asm code. It let you work in 8 bits when dealing with registers and using common 8-bit variables, while being able to loop over larger data structures with X and Y as memory addresses and counters, and also manipulate 16-bit values to some degree (copying, incrementing/decrementing).
Other common CPUs of the time instead either had different register names for different sizes (80x86), or coded the size in the instruction (68000). This avoided the mode bits and issues with code written for different modes (which even affected instruction length of the 65816 when using immediate data).
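A rough Python sketch of the instruction-length issue: on the 65816 the same opcode byte takes a 1-byte or 2-byte immediate operand depending on the M flag (accumulator width) or X flag (index width), so a disassembler has to track those flags. The helper names below are hypothetical and the opcode tables cover only a handful of immediate-mode instructions.

```python
# Immediate-mode opcodes whose operand width follows the M flag (accumulator).
M_DEPENDENT = {0xA9: "LDA", 0x29: "AND", 0x09: "ORA", 0x69: "ADC"}
# Immediate-mode opcodes whose operand width follows the X flag (index registers).
X_DEPENDENT = {0xA2: "LDX", 0xA0: "LDY", 0xE0: "CPX", 0xC0: "CPY"}


def immediate_length(opcode: int, m_flag: bool, x_flag: bool) -> int:
    """Total instruction length in bytes; a set flag means 8-bit width."""
    if opcode in M_DEPENDENT:
        narrow = m_flag
    elif opcode in X_DEPENDENT:
        narrow = x_flag
    else:
        raise ValueError(f"opcode {opcode:#04x} is not in this sketch's tables")
    return 1 + (1 if narrow else 2)


def decode_first(code: bytes, m_flag: bool, x_flag: bool) -> tuple[str, int, int]:
    """Decode the first instruction as (mnemonic, operand, length)."""
    opcode = code[0]
    length = immediate_length(opcode, m_flag, x_flag)
    name = M_DEPENDENT.get(opcode) or X_DEPENDENT[opcode]
    operand = int.from_bytes(code[1:length], "little")
    return name, operand, length


code = bytes([0xA9, 0xFF, 0x00])
# 16-bit accumulator: one 3-byte instruction, LDA #$00FF.
print(decode_first(code, m_flag=False, x_flag=False))
# 8-bit accumulator: a 2-byte LDA #$FF, and the trailing $00 is then read as the next opcode (BRK).
print(decode_first(code, m_flag=True, x_flag=False))
```

The same three bytes are one instruction in one mode and two in the other, which is the hazard that size-in-the-instruction encodings such as the 68000's avoid.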
The 32-bit version, called the W65C832, only exists as a datasheet. This is an FPGA version.
"builds with the regular opensource FPGA tools: yosys, nextpnr-ice40, icepack, and iceFUNprog. I recently added support for the Tang Nano 20k board (should work with the 9k model too)"
This is a "fantasy" 32-bit extension of the 65816 that WDC specced out but never built. It wouldn't help with getting the IIGS on MiSTer.