I've always wondered if it would be possible to create an SDK that abstracts the N64 graphics hardware and exposes some modern primitives: lighting, shading, tools to bake lighting as this demo does, etc. The N64 has some pretty unique hardware for its generation; more details on the hardware are here on Copetti.org:
However, there is a large caveat: 1. you have to think of the system as a graphics card with a CPU bolted on, and 2. the graphics system is directly exposed.
Graphics chip architecture ends up being an ugly, hateful, incompatible mess, and as such the vendors of said accelerators generally avoid publishing reference documents for them, preferring to publish intermediate APIs instead: things like OpenGL, DirectX, CUDA, Vulkan. Mainly so that, under the hood, they can keep them an incompatible mess (if you never publish a reference, you never have to maintain hardware backwards compatibility; the upside is they can create novel designs, the downside is no one can use them directly). So when you do get direct access to them, as in that generation of game console, you sort of instinctively recoil in horror.
Footnote on graphics influence: OpenGL came out of SGI, and Nvidia was founded by ex-SGI engineers.
The Reality Coprocessor (RCP) doesn't look like any of the graphics hardware that previously came out of SGI. Despite the marketing, it is not a shrunk-down SGI workstation.
It approaches the problem in very different ways and is actually more advanced in many respects. SGI workstations had strictly fixed-function pixel pipelines, but the RCP's pixel pipeline is semi-programmable. People often describe it as "highly configurable" instead of programmable, but it was the start of what led to modern pixel shaders. The RCP could do many things in a single pass that would require multiple passes of blending on an SGI workstation.
And later SGI graphics cards don't seem to have taken advantage of these innovations either. SGI hired a bunch of new engineers (with experience in embedded systems) to create the N64, and then once the project was finished they made them redundant. The new technology created by that team never had a chance to influence the rest of SGI. I get the impression that SGI was afraid such low-cost GPUs would cannibalise their high-end workstation market.
BTW, the console that looks most like a shrunk-down 90s SGI workstation is actually Sony's PlayStation 2: a fixed-function pixel pipeline with a huge amount of blending performance to facilitate complex multi-pass blending effects. Though SGI wouldn't have let programmers have access to the vector units and DMAs like Sony did; SGI would have abstracted it all away behind OpenGL.
------------------
But in a way, you are kind of right. The N64 was the most forwards looking console of that era, and the one that ended up the closest to modern GPUs. Just not for the reason you suggest.
Instead, some of the ex-SGI employees that worked on the N64 created their own company called ArtX. They were originally planning to create a PC graphics card, but ended up with the contract to first create the GameCube for Nintendo (The GameCube design shows clear signs of engineers overcompensating for flaws in the N64 design). Before they could finish, ArtX were bought by ATI becoming ATI's west-coast design division, and the plans for a PC version of that GPU were scrapped.
After finishing the GameCube, that team went on to design the R3xx series of GPUs for ATI (Radeon 9700, etc).
The R3xx is more noteworthy for having a huge influence on Microsoft's DirectX 9.0 standard, which is basically the start of modern GPUs.
So in many ways, the N64 is a direct predecessor to DirectX 9.0.
I haven't programmed for either console. Which features show this in what sense?
On the N64, the CPU always ends up bottlenecked by memory latency. The RAM latency is quite high to start with: the CPU sits idle for ~40 cycles whenever it misses the cache, assuming the RCP is idle. If the RCP is not idle, contention with it can sometimes push that well over 150 cycles.
Kaze Emanuar has a bunch of videos (like this one https://www.youtube.com/watch?v=t_rzYnXEQlE) going into detail about this flaw.
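To put rough numbers on it (my own back-of-envelope, assuming a simplistic one-instruction-per-cycle CPU and an arbitrary 2% miss rate, using the latency figures above):

    /* Back-of-envelope: how much a ~93.75 MHz N64 CPU loses to cache misses.
     * Assumes (simplification, not a measurement) 1 instruction per cycle when
     * not stalled, an arbitrary 2% miss rate, and the ~40 / ~150 cycle
     * penalties quoted above. */
    #include <stdio.h>

    int main(void) {
        const double cpu_mhz        = 93.75;  /* N64 CPU clock               */
        const double miss_idle      = 40.0;   /* stall cycles, RCP idle      */
        const double miss_contended = 150.0;  /* stall cycles, RCP busy      */
        const double miss_rate      = 0.02;   /* assumed: 2 misses / 100 ins */

        double cpi_idle      = 1.0 + miss_rate * miss_idle;       /* ~1.8 */
        double cpi_contended = 1.0 + miss_rate * miss_contended;  /* ~4.0 */

        printf("effective MIPS, RCP idle: %.0f\n", cpu_mhz / cpi_idle);
        printf("effective MIPS, RCP busy: %.0f\n", cpu_mhz / cpi_contended);
        return 0;
    }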
The GameCube fixed this flaw in multiple ways. They picked a CPU with a much better cache subsystem: the PowerPC 750 had multi-way caches instead of a direct-mapped one, plus a quite large L2 cache. Their customisations added special instructions to stream graphics commands without polluting the caches, resulting in far fewer cache misses.
And when it does miss the cache, the latency to main memory is under 20 cycles (despite the GameCube's CPU running at 5x the clock speed). The engineers picked main memory with super low latency.
To fix the issue of bus contention, they created a complex bus arbitration scheme and gave CPU reads the highest priority. The GameCube also has much less traffic on the bus to start with, because many components were moved out of unified memory.
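On the "stream graphics commands without polluting the caches" point: the mechanism homebrew docs describe is the CPU's write-gather pipe, an uncached port that packs repeated stores into 32-byte bursts headed for the GPU FIFO. A rough sketch of what that looks like (address and layout taken from homebrew documentation, so treat it as illustrative rather than the official SDK):

    /* Sketch: pushing graphics commands through the GameCube's write-gather
     * pipe.  Every store to this uncached address is gathered into a 32-byte
     * burst and sent straight to the GPU FIFO, so command traffic never
     * allocates or evicts cache lines.  Address and layout are taken from
     * homebrew documentation -- illustrative, not the official SDK. */
    typedef union {
        volatile unsigned char  u8;
        volatile unsigned short u16;
        volatile unsigned int   u32;
        volatile float          f32;
    } WGPipe;

    #define WG_PIPE ((WGPipe *)0xCC008000)

    static void send_vertex(float x, float y, float z, unsigned int rgba) {
        WG_PIPE->f32 = x;    /* each store hits the same address...        */
        WG_PIPE->f32 = y;    /* ...the write-gather hardware packs them up */
        WG_PIPE->f32 = z;
        WG_PIPE->u32 = rgba; /* and the caches never see any of it         */
    }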
---------------------------
The N64 famously had only 4KB of TMEM (texture memory). Textures had to fit in just 4KB, and to enable mipmapping they had to fit in half that. This led to most N64 games using very small textures stretched over very large surfaces with bilinear filtering, which gave N64 games a distinctive design language.
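For a sense of scale, the 4KB budget is just area times bytes-per-texel arithmetic (the palette detail for CI4 is from memory):

    #include <stdio.h>

    /* Bytes used by a w x h texture at bpp bits per texel -- pure arithmetic. */
    static unsigned tex_bytes(unsigned w, unsigned h, unsigned bpp) {
        return w * h * bpp / 8;
    }

    int main(void) {
        /* 4KB budget, roughly half of it if the mip chain has to fit too. */
        printf("32x64 RGBA16: %u bytes\n", tex_bytes(32, 64, 16)); /* 4096 -> fills TMEM      */
        printf("32x32 RGBA16: %u bytes\n", tex_bytes(32, 32, 16)); /* 2048 -> leaves mip room */
        printf("64x64 CI4:    %u bytes\n", tex_bytes(64, 64, 4));  /* 2048 + palette space    */
        return 0;
    }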
Once again, the engineers fixed this flaw in two ways. First, they made the texture memory work as a cache, so textures no longer had to fit inside it. Second, they bumped its size from 4KB all the way to 1MB, which was massive overkill, way bigger than any other GPU of the era. Even today's GPUs only have ~64KB of cache for textures.
---------------------------
The fillrate of the N64 was quite low, especially when using the depth buffer and/or doing blending.
So the GameCube got a dedicated 2MB of memory (embedded DRAM) for its framebuffer. Rendering no longer touches main memory at all. The depth buffer is now free (no reason not to enable it), and blending is more or less free too.
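Back-of-envelope on why 2MB is enough (the 640x528 maximum is the commonly quoted figure, so treat it as approximate):

    #include <stdio.h>

    int main(void) {
        /* GameCube embedded framebuffer, back-of-envelope:
         * roughly 640x528 maximum, with 24-bit colour and 24-bit depth. */
        unsigned w = 640, h = 528;
        unsigned bytes = w * h * (3 /* colour */ + 3 /* depth */);
        printf("%ux%u colour+Z = %u bytes (~%.2f MB of the 2MB eFB)\n",
               w, h, bytes, bytes / (1024.0 * 1024.0));
        return 0;
    }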
Rasterisation was one of the major causes of bus contention on the N64, so the embedded framebuffer has the side-effect of solving the bus contention problem too.
---------------------------
On the N64, the RSP was used for both vertex processing and sound processing. Not exactly a flaw, it saved on hardware. But it did mean any time spent processing sound was time that couldn't be spent rendering graphics.
The GameCube got a dedicated DSP for audio processing. The audio DSP also got its own pool of memory (once again reducing bus contention).
As for vertex processing, that was all moved into fixed-function hardware. (There aren't that many GPUs that did transform and lighting in fixed-function hardware: earlier GPUs often implemented it on DSPs (like the N64's RSP), and the industry very quickly switched to vertex shaders.)
The standard API was pretty much OpenGL, generating in-memory command lists that could be sent to the RSP.
However, the RSP was a completely programmable MIPS processor (with parallel SIMD instructions).
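For anyone who hasn't seen it, the command-list style looks roughly like this with libultra's gSP/gDP macros. Written from memory, and with all the real setup (matrices, render modes, segment addressing) omitted, so take it as a sketch rather than SDK documentation:

    #include <ultra64.h>

    /* Sketch: building an in-memory command list (a "display list") that the
     * RSP consumes.  Written from memory, with the real setup (matrices,
     * render modes, segment addressing) omitted -- a sketch, not SDK docs. */
    static Vtx tri_verts[3];   /* filled in elsewhere */
    static Gfx glist[16];

    Gfx *build_triangle(void) {
        Gfx *g = glist;

        gDPPipeSync(g++);
        gDPSetCombineMode(g++, G_CC_SHADE, G_CC_SHADE); /* vertex colour only     */
        gSPVertex(g++, tri_verts, 3, 0);                /* load 3 verts at slot 0 */
        gSP1Triangle(g++, 0, 1, 2, 0);                  /* draw them              */
        gSPEndDisplayList(g++);

        return glist;          /* later handed to the RSP graphics task */
    }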
One of my favorite tricks in the RDP hardware was that it used the parity bits in the Rambus memory to store the coverage bits for its anti-aliasing.
Good point. The software APIs are where you do see the strong SGI influence. It's not OpenGL, but it's clearly based on their experience with OpenGL. The resulting API is quite a bit better than those of the other 5th-gen consoles.
It's only the hardware (especially RDP) that has little direct connection to other SGI hardware.
But per-vertex lighting was kind of old and boring even by 1995, and it massively limited your art style. You really wanted per-pixel lighting.
The GameCube's vertex pipeline was very fixed function, but its Pixel pipeline was quite programmable. Far more programmable than the N64. It was basically equivalent to the Xbox's pixel shaders, more advanced in some ways. But because it wasn't exposed with the pixel shader programming model, many people don't consider it to be "programmable" at all.
*And in many ways, you shouldn't consider the Xbox and other DirectX 8.0 shaders to be fully programmable. You were limited to 8-16 instructions, with no control flow at all. On the GameCube, instead of 8-16 instructions you had 16 stages, each roughly equivalent to an instruction. The N64 had just two stages, which were less flexible. True fixed-function pixel pipelines (like on the PS1, PS2 or Dreamcast) have just a single stage, and very little configurability.
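To make the "stages are roughly instructions" point concrete, here is a little software model of the per-stage equations. The formulas themselves are the documented ones ((a - b) * c + d for the N64 combiner, d plus a lerp for the GameCube TEV); everything around them is just illustration:

    /* Software model of "configurable stage" pixel pipelines.
     * N64 RDP combiner stage:  out = (a - b) * c + d        (2 stages max)
     * GameCube TEV stage:      out = d + (1 - c)*a + c*b    (16 stages max)
     * a, b, c, d are selected per stage from textures, vertex colours,
     * constants, or the previous stage's output -- so chaining stages is
     * roughly like running a tiny straight-line shader, one op per stage. */
    typedef struct { float r, g, b; } Col;

    static Col rdp_stage(Col a, Col b, Col c, Col d) {
        Col o = { (a.r - b.r) * c.r + d.r,
                  (a.g - b.g) * c.g + d.g,
                  (a.b - b.b) * c.b + d.b };
        return o;
    }

    static Col tev_stage(Col a, Col b, Col c, Col d) {
        Col o = { d.r + (1.0f - c.r) * a.r + c.r * b.r,
                  d.g + (1.0f - c.g) * a.g + c.g * b.g,
                  d.b + (1.0f - c.b) * a.b + c.b * b.b };
        return o;
    }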
Those "texture tricks" are per-pixel lighting. Many of them aren't possible on fixed function GPUs like the PS2, they required both textures and a reasonably programmable pixel pipeline.
Even today, most per-pixel lighting is done with a mix of textures and shaders.
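In its simplest form, that mix is: sample a normal (or light) value from a texture at each pixel and do the lighting math there, rather than interpolating a per-vertex result. A minimal model, generic shader math with no particular console API implied:

    /* Per-pixel lighting in its simplest form: the normal comes from a
     * texture sampled at this pixel (a normal map) and N.L is evaluated per
     * pixel, instead of lighting at the vertices and interpolating the
     * result.  Generic shader math, no particular console API implied. */
    typedef struct { float x, y, z; } V3;

    static float per_pixel_diffuse(V3 n /* from normal map */, V3 l /* to light */) {
        float d = n.x * l.x + n.y * l.y + n.z * l.z;   /* N . L                 */
        return d > 0.0f ? d : 0.0f;                    /* clamp backfacing to 0 */
    }
    /* final pixel = albedo_texel * per_pixel_diffuse(normal_texel, light_dir) */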
The system has seen a dozen of its most popular games decompiled [1] into readable source files, which enables easy porting to PC without an emulator. It also enables a ton of mods to be written, many of which will run on the original hardware.
There are numerous Zelda fan remakes [2]. Complete games with new dungeons and storylines.
The Mario 64 scene is on fire. Kaze has deeply optimized the game [3] and is building his own engine and sequels. If you like technical deep dives into retro tech, his channel is pure gold.
Folks are making crazy demos for the platform, such as Portal [4], which unfortunately attracted the attention of Valve's lawyers.
Lost games, such as Rare's Dinosaur Planet [5], have leaked, been brought up to near production ready status, been decompiled, and have seen their own indie resurgence.
[1] https://wiki.deco.mp/index.php/N64
[2] https://m.youtube.com/watch?v=bZl8xKDUryI
[3] https://m.youtube.com/channel/UCuvSqzfO_LV_QzHdmEj84SQ
The whole channel is gold. He has dozens of deep dives like this: https://m.youtube.com/watch?v=DdXLpoNLywg
And his game and engine are beautiful: https://youtu.be/Drame-4ufso
Curious - why the desire to have it run on GL 2.1?
Turns out that perfect-precision weapons on an m+kb setup are actually not much fun to play with. The movement is so limited compared to the brutal precision a mouse offers that everything just dies really, really fast.
I wish I hadn't thought of a significantly better architecture for my 2d-pixel-art-game-maker-maker this weekend. Now it'll be another month before I can release it :(
- Limited map size
- Limited color palette I think
- and more!
As for Pokémon, the Nintendo 64 launched in June 1996, and the first Pokémon game was Pokémon Snap, released nearly three years after the console, in March 1999.
What was tricky was a separate technique to get real cubemaps working on the PS2.
Unfortunately, these came too late to actually ship in any PS2 games. The SH trick might have been used in the GameCube game “The Conduit”. Same team.
http://research.tri-ace.com/Data/Practical%20Implementation%...
Any details on that?
Except... the triangle UVs will often cross over between multiple squares. With the above texture, they will cross over into the white area and make the white visible on the mesh. So you fill the white area with a duplicate of the texels from the square that is adjacent on the cube. That won't work for huge triangles that span more than 1.5 squares, but it's good enough given an appropriate mesh.
Probably would have been better to just use a lat-long projection texture like https://www.turais.de/content/images/size/w1600/2021/05/spru... Or, maybe store the cubemap as independent squares and subdivide any triangles that cross square boundaries.
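For reference, the lat-long lookup is just two trig calls per direction. Standard spherical-mapping math, nothing PS2-specific:

    #include <math.h>

    /* Direction vector -> lat-long (equirectangular) texture coordinates.
     * Standard spherical-mapping math, nothing PS2-specific.  Assumes
     * (x, y, z) is normalised and y is "up". */
    static void latlong_uv(float x, float y, float z, float *u, float *v) {
        const float PI = 3.14159265f;
        float cy = y < -1.0f ? -1.0f : (y > 1.0f ? 1.0f : y);
        *u = 0.5f + atan2f(z, x) / (2.0f * PI);   /* longitude -> [0,1] */
        *v = acosf(cy) / PI;                      /* latitude  -> [0,1] */
    }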
HN folks are probably familiar with raster interrupts (https://en.wikipedia.org/wiki/Raster_interrupt) and "racing the beam." I always associated this with the Atari 800. You weren't "supposed" to be able to do stuff like https://youtu.be/GuHqw_3A-vo?t=33, but Display List Interrupts made that possible.
What I didn't know until recently was how much the Atari 2600's games owed to this kind of craziness: https://www.youtube.com/watch?v=sJFnWZH5FXc
It's stuff like this that makes me think that if hardware stopped advancing, we'd still be able to figure out more and more interesting stuff for decades!
What I find more impressive are efforts like FastDoom or the various Mario-64 optimization projects which squeeze significantly better performance out of old hardware. Sometimes even while adding content and features. Maybe there is a connection between demo sceners and more comprehensive efforts?
GT3 heatwave summarizes it well.
"I showed a demo of GT3 that showed the Seattle course at sunset with the heat rising off the ground and shimmering. You can’t re-create that heat haze effect on the PS3 because the read-modify-write just isn’t as fast as when we were using the PS2. There are things like that."
https://old.reddit.com/r/ps2/comments/1cktw88/gran_turismos_...
https://youtu.be/ybi9SdroCTA?t=4103
It's not trying to simulate a real heat haze the way new engines like UE5 do, which just tanks fps. It uses "tricks" instead. And honestly, looking at RTX tanking frame rates, I would rather have these cheap tricks.
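The class of trick being described is framebuffer feedback: read back the frame you just rendered and redraw part of it with slightly perturbed coordinates. A generic sketch of the idea, not GT3's actual code (which isn't public), assuming a 640x480 maximum frame size:

    #include <math.h>
    #include <string.h>

    /* Generic sketch of a heat-haze style framebuffer feedback effect: copy
     * the frame that was just rendered, then rewrite a band of it with a
     * small animated horizontal offset per scanline.  This is the class of
     * read-modify-write trick being described, not GT3's actual code.
     * Assumes a frame no larger than 640x480. */
    void heat_haze(unsigned int *fb, int width, int height,
                   int y0, int y1, float time) {
        static unsigned int copy[640 * 480];
        memcpy(copy, fb, (size_t)width * height * sizeof *fb);

        for (int y = y0; y < y1; ++y) {
            /* wobble grows toward the bottom of the band, like rising heat */
            float amp = 3.0f * (float)(y - y0) / (float)(y1 - y0);
            int   dx  = (int)(amp * sinf(time * 6.0f + y * 0.15f));
            for (int x = 0; x < width; ++x) {
                int sx = x + dx;
                if (sx < 0)       sx = 0;
                if (sx >= width)  sx = width - 1;
                fb[y * width + x] = copy[y * width + sx];
            }
        }
    }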
A 299MHz MIPS runs this:
Shadow of the Colossus... https://www.youtube.com/watch?v=xMKtYM8AzC8
GoW2 https://youtu.be/IpKLwIIdvuk?si=TjifKmlYsUuvhk0F&t=970
FFXII https://youtu.be/NytHoYOs_4M?si=jE1Fxy40khEvV6Bn&t=51
GT4 https://www.youtube.com/watch?v=F6lZIxk_h9g (THE BOOTSCREEN crying)
Black (Renderware was a crazy engine) https://youtu.be/bZBjcwyq7fQ?si=Pev5ifpksJm4X6Oi&t=356
Valkyrie profile 2 https://youtu.be/9ScjO4NuUtA?si=Z29cR-hLsT2pnP2I&t=38
Rogue Galaxy https://youtu.be/iR1evzyl-7Q?si=fldm3-NnuFxOITMn&t=624
Burnout 3 https://www.youtube.com/watch?v=_r5r0nE1sA4
Jak and Daxter, Ratchet.
For GC - RE4, Metroid, The Zeldas... ofc. Looks crazy good.
I kneel.
MIPS CPUs are amazing; they can do wonders with few cycles. Just look at the PSP, or SGI's IRIX workstations.
Also, the PS2 "GPU" is not the same as the R4k CPU. BTW, on the PS2... the Deus Ex port sucked badly against the PC port; it couldn't fully handle the Unreal Engine.
Yes, the PS2 did crazy FX, but with really small levels for the mentioned port; bear in mind DX was almost 'open world' for a huge chunk of the game.
A Pentium is much faster than the PlayStation's MIPS CPU for game logic, and a 3dfx card's 50 MPixels/s fillrate matches the PlayStation's 60 MPixels/s. The Pentium FPU, though, is no match for the PlayStation's GTE at 90-300K triangles per second, meaning you would have to rely on CPU power alone for geometry processing (as the contemporary Bleem did), resulting in 166-233MHz Pentium minimum requirements. MMX would be of no help here; it was barely used, and then only in a few games for audio effects.
The PSX "GPU" just worked with integers, and that's it. Any decent compiler such as GCC, with flags like -ffast-math, can emulate both the dead-simple MIPS CPU and the fixed-point GPU (where no floats are used at all) while taking tons of shortcuts. MMX? Ahem, MPEG decoding for videos. If you did things right you could even bypass the BIOS decoding and just call the OS's MPEG decoding DLLs (as PPSSPP does with FFmpeg), dropping the emulation CPU usage to nearly nothing and letting your media player framework do the work for you.
MMX was meant for anything you would normally use a traditional DSP for (also fixed point). Intel envisioned software modems and audio processing; in reality it was criminally underused and fell into the 'too much effort for little gain' hole. Intel's big marketing win was paying Ubi Soft a cool $1 million for POD's "Designed for MMX" ad right on the box https://www.mobygames.com/game/644/pod/cover/group-3790/cove... while the game implements _one optional audio filter_ using MMX. Microsoft also didn't like Intel's Native Signal Processing (NSP) initiative and killed it https://www.theregister.com/1998/11/11/microsoft_said_drop_n...
MP3 you could decode on a Pentium ~100, so why bother; MPEG a Pentium ~150 will play flawlessly as long as the graphics card can scale it in hardware. I would love to see the speed difference decoding MPEG with ffmpeg on a Pentium 166 with and without MMX. A contemporary study shows up to 2x speedup in critical sections of image processing algorithms but only marginal gains for the MP3/MPEG cases https://www.cs.cmu.edu/~barbic/cs-740/mmx_project.html
>drop the emulation CPU usage to a halt
The PlayStation 1 doesn't support MPEG.
Now, could you implement the GTE with MMX? Certainly, yes. But again, why bother when a 166-233MHz CPU is already enough to accomplish the same thing with the integer unit alone.
Also, PAL resolutions for instance were a bit like SVGA, but the PSX ran most games at 320x240, as scaling them on fuzzy CRTs gave them almost free antialiasing. 320x240 was easier to get away with on a couch/bedroom TV viewed from across the room than Unreal or Quake 2-based games at 640x480.
The PSX pushed tons of triangles? At 320x240 on my 14" Nokia TV they could look astounding; not so much on a PC desktop, rendered on a better-quality PC CRT where 320x240 would look so-so.
This is why a Pentium 100 could not pull off PlayStation 1 games with just an accelerated rasterizer like a 3dfx. Multiplying matrices is expensive: you either get specialized hardware to do it, or you hire Carmack and Abrash to use every trick in the book to avoid as much computation as possible. Lowering the resolution does nothing; you still have to cull, rotate, scale, perspective-correct and light the same amount of geometry per frame.
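Rough numbers on why: even the bare per-vertex work is a 3x3 matrix multiply plus a perspective divide, which is exactly what the GTE does in a handful of cycles. A sketch of that inner loop (my arithmetic, using the triangle-rate figures above):

    /* The per-vertex work the PSX's GTE does in hardware: rotate/translate
     * by a 3x3 matrix, then perspective-project.  That is 9 multiplies,
     * 9 adds and a divide per vertex before any lighting or clipping.  At
     * ~100K triangles/s (~300K vertices/s) that is millions of multiplies
     * per second on top of all game logic -- painful on a Pentium 100, a
     * few cycles per vertex for dedicated hardware.  (The real GTE is
     * fixed-point; floats here are just for clarity.) */
    typedef struct { float m[3][3]; float t[3]; } Xform;

    static void transform_project(const Xform *xf, const float in[3],
                                  float *sx, float *sy, float focal) {
        float v[3];
        for (int i = 0; i < 3; ++i)
            v[i] = xf->m[i][0] * in[0] + xf->m[i][1] * in[1]
                 + xf->m[i][2] * in[2] + xf->t[i];

        /* perspective divide -- the GTE has a dedicated fast divider for this */
        *sx = focal * v[0] / v[2];
        *sy = focal * v[1] / v[2];
    }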
Original Xbox titles like Jet Set Radio Future etc. also still look amazing.
Sorta. The GoW2 video was captured on PCSX2 and likely benefited from upscaling and other niceties in that clip. Didn't look through the rest of them. Either way, GoW2 was an incredible achievement on PS2.