How can I do my research as a GPU poor?
23 points | 1 month ago | 10 comments | HN
I need to train 70B-parameter LLMs for my research on world simulation and self-improving systems. Right now I can only train small models on my 8GB 3050, or for a couple of dollars on the cloud, but I lack the resources to train better, faster models. What's your advice?
protocolture
1 month ago
[-]
Have been looking at this myself and it seems the advice isn't good.

It's basically Cost/Speed/Size: pick 2, or maybe even 1.

Some people have been able to run large LLMs on older, slower CUDA GPUs. A lot of them are truly ancient and have found their way back to eBay simply due to market conditions. They aren't good, but they work for a lot of use cases.

There are a couple of janky implementations that run on AMD instead, but reviews have been so mixed that I decided not to even test them. Ditto multi-GPU setups. I thought that having access to sixteen 8GB AMD cards from old mining rigs would set me in good stead, but apparently it benches roughly the same as just using a server with heaps of RAM because of the way the jobs are split up.

The cloud services seem to be the best option at the moment. But if you're going to spend 1000 bucks rather than 100, it might be worth just forking out for the card.

Also honestly, hoping that someone else has a better idea in this thread because it will be useful to me too.

reply
gperkins978
1 month ago
[-]
A100s are cheap. Are those bad now?
reply
protocolture
1 month ago
[-]
Define cheap?

I'm seeing an average cost of $15k+ on feebay.

I think anyone with $15k could put together a rig with enough VRAM to handle a decent model.

The question is more focused on a budget that seems sub-$2k.

reply
luussta
1 month ago
[-]
thanks!
reply
worstspotgain
1 month ago
[-]
I'd say the first step is to rule LoRA in or out. If LoRA is an option, it buys you more than just the rig savings (rough sketch after the list):

- You can deploy multiple specialized LoRAs for different tasks

- It massively reduces your train-test latency

- You get upstream LLM updates for "free"; maybe you can even add the training to your CI
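
A minimal, untested sketch of what ruling LoRA in or out looks like in practice, using Hugging Face transformers + peft; the base model name is just a placeholder, pick whatever fits your VRAM:

    # LoRA sketch: freeze the base model, train small low-rank adapters on top.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import LoraConfig, get_peft_model

    base = "meta-llama/Llama-3.1-8B"  # placeholder; use a model that fits your card
    tokenizer = AutoTokenizer.from_pretrained(base)
    model = AutoModelForCausalLM.from_pretrained(
        base, torch_dtype=torch.float16, device_map="auto"
    )

    lora_cfg = LoraConfig(
        r=16,                                 # rank of the low-rank update matrices
        lora_alpha=32,                        # scaling factor for the adapters
        target_modules=["q_proj", "v_proj"],  # attention projections only
        lora_dropout=0.05,
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_cfg)
    model.print_trainable_parameters()        # typically well under 1% of the weights

    # ...then run your usual Trainer / training loop; only the adapters get gradients.

Only the adapter weights train, so the gradients and optimizer state stay tiny compared to full fine-tuning, which is where most of the VRAM savings come from.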

reply
luussta
1 month ago
[-]
thanks!
reply
jebarker
1 month ago
[-]
Is there a different angle on the research you can take that doesn't require training models of that size?

When choosing research problems, it's important to follow not only what's interesting but also what's feasible given your resources. I frequently shelve research ideas because they're not feasible given my skills, time, data, resources, etc.

reply
RateMyPE
1 month ago
[-]
AWS, Azure, and GCP have programs for startups and for researchers that give free credits on their respective platforms. Try applying for those.

They usually give between $1000 and $5000 worth of credits, and they may have other requirements, like being enrolled in college, but you should check each program to find out more.

reply
_davide_
1 month ago
[-]
I have a couple of M40s with 24GB on a desktop computer; I had to tape the PCIe connector to "scale it down to 1x". It's OK for inference and playing around with small training runs, but I can barely run inference on a quantized 70B model. Training anything bigger than 3B parameters on a human timescale is impossible. Either you scale it down or ask for a sponsor. It's frustrating, because IT has always been approachable, until now...
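
For reference, a rough, untested sketch of what squeezing a quantized 70B onto a 24GB card looks like, assuming transformers + bitsandbytes + accelerate are installed, with a placeholder model id:

    # 4-bit quantized inference; whatever doesn't fit on the GPU spills to CPU RAM.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    model_id = "meta-llama/Llama-2-70b-hf"  # placeholder 70B checkpoint
    bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)

    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=bnb,
        device_map="auto",  # offloads the layers that don't fit onto CPU -- slow
    )
    inputs = tok("Hello", return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=32)
    print(tok.decode(out[0], skip_special_tokens=True))

Even in 4-bit, a 70B is roughly 35GB of weights, so most of it ends up offloaded, which is why generation crawls.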
reply
8organicbits
1 month ago
[-]
Are you part of a university? Do they have resources or funding you can access?
reply
luussta
1 month ago
[-]
I'm not part of any university
reply
RecycledEle
1 month ago
[-]
Look into buying a used Dell C410x GPU chassis (they're $200 on my local Craigslist) and a Dell R720. Then add some cheap GPUs.

For a few thousand dollars you might get some processing power.

But you will never be able to pay your electric bill.

reply
thijson
1 month ago
[-]
Is there a cheap GPU you'd suggest? I was looking at NVIDIA P40s, as they have 24 GB of RAM and cost a few hundred dollars.
reply
gperkins978
1 month ago
[-]
I was going to recommend an A100, but they are not available cheaply anymore. I set someone up with two of them that I managed to score used from Amazon for like $400 each. They had 40GB of RAM, but it was HBM and super fast. There were mining cards based on these as well, but I have no idea what was stripped down to make them cheaper. The only issue is that you need to add heatsinks if they are not going in a server, and in a rack-mounted server there needs to be proper cooling (big dog fans).

But I just looked again, and A100s are not around at reasonable prices anymore. I cannot even find the old mining equivalents (they used to be everywhere, and CHEAP). Perhaps many people are building similar systems now. After the Ethereum merge it was a great time to build.

reply
NBJack
1 month ago
[-]
Yes, just ensure you can keep them cool. I have an old M40 whose 24GB of VRAM opened up my options, installed in a traditional case with a 3D-printed cooler adapter + fan. While it isn't always fun getting older cards to work, it is certainly viable (and scalable if needed).
reply
luussta
1 month ago
[-]
thanks!
reply
muzani
1 month ago
[-]
reply
dragonwriter
1 month ago
[-]
That project is for distributed inference; OP is looking for training solutions, not inference.
reply
Log_out_
1 month ago
[-]
Do an internship in Nvidia driver development?
reply
newsoul
1 month ago
[-]
Choose Your Weapon: Survival Strategies for Depressed AI Academics

https://arxiv.org/abs/2304.06035

reply