Show HN: I cross-compiled llama.cpp to run on Windows XP
2 points | 1 hour ago | 3 comments | okt.ai | HN
Had a dumb thought: what if someone in 2003 could run a local LLM on their machine? XP desktop, rolling hills wallpaper, maybe Winamp in the corner—and you just chat with an AI locally.

I saw there were some attempts on Reddit, so I tried it myself.

Cross-compiled llama.cpp from macOS targeting Windows XP 64-bit. Main hurdles: downgrading cpp-httplib to v0.15.3 (newer versions explicitly block pre-Win8), replacing SRWLOCK/CONDITION_VARIABLE with XP-compatible threading primitives, and the usual DLL hell.

Qwen 2.5-0.5B runs at ~2-8 tokens/sec on period-appropriate hardware. Not fast, but it works.

Video demo and build instructions in the write-up.

Claude helped with most of the debugging on the build system. I just provided the questionable life choices.

vintagedave
9 minutes ago
> XP-era hardware doesn’t have AVX. Probably doesn’t have AVX2 or FMA either. But SSE4.2 is safe for most 64-bit CPUs from 2008 onward:

It won't; FMA only arrived with AVX2-era CPUs. If you target 32-bit, you'd only be "safe" with SSE2... if you really want a challenge, you'd use the Pentium Pro CPU feature set, i.e. the x87 FPU.

I have to admit I'd be really curious what that looked like! You'd definitely want to use the fast math option.

This is an awesome effort, btw, and I enjoyed reading your blog immensely.
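
To make the instruction-set point above concrete, here is a minimal C++ sketch (not from the thread or the write-up) that asks the CPU which of these extensions it has and picks the most capable tier. It assumes GCC or Clang on x86 and relies on their `__builtin_cpu_supports` builtin. `CpuFeatures`, `detect_cpu_features`, and `best_tier` are hypothetical names used only for illustration, and the tier labels are not llama.cpp build options.

```cpp
#include <cassert>
#include <string>

// Hypothetical container for the extensions discussed in the comment above.
struct CpuFeatures {
    bool sse2;
    bool sse4_2;
    bool avx;
    bool avx2;
    bool fma;
};

// Query the host CPU. Assumes GCC or Clang on x86/x86-64; on any other
// compiler or architecture every field is simply left false.
CpuFeatures detect_cpu_features() {
    CpuFeatures f{};
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    f.sse2   = __builtin_cpu_supports("sse2") != 0;
    f.sse4_2 = __builtin_cpu_supports("sse4.2") != 0;
    f.avx    = __builtin_cpu_supports("avx") != 0;
    f.avx2   = __builtin_cpu_supports("avx2") != 0;
    f.fma    = __builtin_cpu_supports("fma") != 0;
#endif
    return f;
}

// Pick the most capable tier available, falling back to plain x87 FPU code
// (the Pentium Pro-style baseline) when not even SSE2 is present.
std::string best_tier(const CpuFeatures& f) {
    if (f.avx2 && f.fma) return "avx2+fma";
    if (f.avx)           return "avx";
    if (f.sse4_2)        return "sse4.2";
    if (f.sse2)          return "sse2";
    return "x87";
}
```

Calling `best_tier(detect_cpu_features())` on the target machine gives a quick hint about which SIMD level a build could safely assume there.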

vintagedave
1 hour ago
Really shows what could be achieved back then -- and in a sense, how little the OS versions we have today add.

Challenge: could you build for 32-bit? From memory, few people used XP64; it was essentially one of the Server editions, and it was only with Vista and Windows 7 that people started migrating to 64-bit.

dandinu
31 minutes ago
That's pretty accurate. I'm always amazed how much we move forward with technology, just to later realize we already had it 15 years ago.

Regarding your question:

I have a 32-bit XP version as well, and I actually started with that one.

The problem I was facing was that it's naturally limited to 4GB RAM, out of which only 3.1GB are usable (I wanted to run some beefier models, and 64-bit does not have that RAM limit).

Also, the 32bit OS kept freezing at random times, which was a very authentic Windows XP experience, now that I think about it. :)
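
As a rough illustration of the RAM limit mentioned above (a back-of-the-envelope sketch, not something from the write-up), the size of the weights alone can be estimated as parameters × bits per weight / 8 and compared with the 3.1GB figure. The parameter counts below are round numbers, and `weight_bytes`, `fits_in_ram`, and `kUsable32BitBytes` are hypothetical names. The estimate ignores the KV cache, file-format overhead, and the fact that a single 32-bit Windows process normally gets even less address space than the machine has RAM.

```cpp
#include <cassert>
#include <cstdint>

// The 3.1GB usable figure from the comment, taken here as 3.1 GiB in bytes.
constexpr std::uint64_t kUsable32BitBytes = 31ULL * (1ULL << 30) / 10;

// Approximate size of the weights alone: parameters * bits per weight / 8.
std::uint64_t weight_bytes(std::uint64_t n_params, unsigned bits_per_weight) {
    assert(bits_per_weight > 0);
    return n_params * bits_per_weight / 8;
}

// True if the weights plus a caller-supplied overhead estimate fit in the
// given budget. The overhead defaults to zero, which is optimistic.
bool fits_in_ram(std::uint64_t n_params, unsigned bits_per_weight,
                 std::uint64_t budget_bytes,
                 std::uint64_t overhead_bytes = 0) {
    return weight_bytes(n_params, bits_per_weight) + overhead_bytes
           <= budget_bytes;
}
```

With these assumptions, a model of roughly 0.5B parameters at 8 bits per weight needs about 0.5GB and fits easily, while a roughly 7B model at 4 bits per weight needs about 3.5GB for the weights alone and does not.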

vintagedave
4 minutes ago
> out of which only 3.1GB are usable

That would be a real issue. I vaguely recall methods to work around this - various mappings, some Intel extension for high memory addressing, etc: https://learn.microsoft.com/en-us/windows/win32/memory/addre...

Maybe unrealistic :( I doubt this is drop-in code.

vintagedave
1 hour ago
> Eventually found it via a GitHub thread for LegacyUpdate.

Can you share that link in the blog? This is the equivalent of one of those forum posts, 'never mind, solved it.' It's helpful to share what you learned for those coming later :)

dandinu
50 minutes ago
There is a full technical write-up in the GitHub repo in "WINDOWS_XP.md": https://github.com/dandinu/llama.cpp/blob/master/WINDOWS_XP....

Sorry for failing to mention that.

Link to vcredist thread: https://github.com/LegacyUpdate/LegacyUpdate/issues/352
