Show HN: EasyWheels – Pre-built CUDA wheels, never compile flash-attn again
1 point
5 hours ago
| 1 comment
| easywheels.io
davidkny22
I spent a full day trying to get mamba-ssm working on Windows. Compiler errors, CUDA architecture mismatches, manual source patches, rebuild after rebuild: it drove me insane. Then I checked the GitHub issues and found hundreds of people stuck in the same hell.

The problem is bigger than one package. deepspeed on PyPI is source-only. flash-attn has pre-built wheels, but they're spread across multiple community GitHub repos. Some packages only cover one or two CUDA versions. Others are months out of date. And if the exact combo you need doesn't exist anywhere, you're compiling from source.

I built EasyWheels to make installing GPU wheels easy. It mirrors pre-built wheels from every upstream source I can find, and for the gaps where nobody has built a wheel, I build them myself on GPU infrastructure.

```

$ pip install easywheels
$ easywheels install deepspeed
Detected: Python 3.12, CUDA 12.8, RTX 4090 (sm_89)
Found: deepspeed-0.18.9+cu128-cp312-cp312-linux_x86_64.whl
Done in 9 seconds.

```
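For anyone unfamiliar with the naming: that filename is a standard PEP 427 wheel name, `{name}-{version}-{python tag}-{abi tag}-{platform tag}.whl`, with the CUDA build riding along as a `+cu128` local version segment. Splitting it is mechanical. A rough sketch (not the actual CLI code, just the idea):

```python
def parse_wheel_name(filename: str) -> dict:
    """Split a PEP 427 wheel filename into its five components.

    The name and version can themselves contain hyphen-free segments,
    so we split from the right: the last three fields are always the
    python tag, abi tag, and platform tag.
    """
    stem = filename[: -len(".whl")]
    name, version, py_tag, abi_tag, plat_tag = stem.rsplit("-", 4)
    return {
        "name": name,
        "version": version,     # e.g. "0.18.9+cu128" (local version carries the CUDA build)
        "python": py_tag,       # e.g. "cp312"
        "abi": abi_tag,
        "platform": plat_tag,   # e.g. "linux_x86_64"
    }

w = parse_wheel_name("deepspeed-0.18.9+cu128-cp312-cp312-linux_x86_64.whl")
```

Matching then reduces to comparing those fields against the detected environment.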

2,200+ wheels across 10 packages, CUDA 12.4 through 13.0, Python 3.10-3.13. The open source CLI detects your environment and pulls the right wheel. No guessing, no compiling. Linux builds are covered heavily. Windows build-out is in progress.
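The detection step is less magic than it sounds. Roughly, it boils down to something like this (a simplified sketch, not the actual CLI; the real tool also reads the GPU's compute capability):

```python
import platform
import re
import shutil
import subprocess

def detect_environment():
    """Return (python_tag, cuda_tag) for wheel selection.

    The Python tag comes from the running interpreter; the CUDA tag is
    parsed from `nvidia-smi` header output when a driver is installed,
    otherwise None.
    """
    major, minor = platform.python_version_tuple()[:2]
    python_tag = f"cp{major}{minor}"                  # e.g. "cp312"

    cuda_tag = None
    if shutil.which("nvidia-smi"):
        out = subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout
        m = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", out)
        if m:
            cuda_tag = f"cu{m.group(1)}{m.group(2)}"  # e.g. "cu128"
    return python_tag, cuda_tag
```

With those two tags plus the platform tag, picking a wheel is a dictionary lookup instead of a compile.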

Free 14-day trial, then $9-19/mo or $2/wheel. I'm a student. I wish I could give this away, but building wheels takes a ton of GPU time and that costs money.

Source: https://github.com/davidkny22/easywheels-cli

Ask me anything about the build pipeline or why GPU packaging is this broken. And if you run into trouble setting it up, I'm happy to help.

Doing ML research shouldn't be this hard. I'll do everything in my power to make sure it isn't ever again.
