This setup will be used internally for uncensored chat, coding, image gen and analysis.
We're thinking of using a combo of hardware:
- RTX 4090 GPU (heard it's a beast)
- Threadripper Pro 5955WX (anyone used this one before?)
- 1TB NVMe SSD
What are your picks for a local AI setup? And what’s the minimum budget to achieve it?
How many GPUs you need depends entirely on the size of your team, how often they'll use it, and the size of the models you're comfortable running.
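To make "size of the models" concrete, here's a rough back-of-envelope I use for VRAM sizing. The rule of thumb (weights at N bits per parameter, plus some headroom for KV cache and activations) is an assumption, not an exact figure; real usage varies with context length and batch size:

```python
def vram_gb(params_b, bits_per_weight=16, overhead=1.2):
    """Rough VRAM estimate in GB for serving a dense model.

    params_b: model size in billions of parameters.
    bits_per_weight: 16 for fp16, 8 or 4 for common quantizations.
    overhead: ~20% headroom for KV cache/activations (assumption).
    """
    weight_gb = params_b * bits_per_weight / 8  # 1B params at 8 bits ~= 1 GB
    return weight_gb * overhead

# A 70B model even at 4-bit wants roughly 42 GB -- more than one 24 GB 4090.
print(round(vram_gb(70, bits_per_weight=4)))  # -> 42
```

So on a single 4090 you're realistically looking at ~7B-13B models quantized, or ~30B at 4-bit if you're tight on context.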
I generally recommend renting instances on something like RunPod to build a good estimate of your actual usage before committing a bunch of money to hardware.
That said, it's been 6 months since I used it, and I feel like the proprietary models (gpt-mini/o1-mini, claude-sonnet, gemini-flash) are just so much faster and cheaper than self-hosted. The real value of local is that your data stays private and the model doesn't silently change out from under you, like 3.5 becoming 3.5-new.
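The "cheaper" part is easy to sanity-check with a break-even calculation. All the numbers below are illustrative assumptions (GPU price, power draw, electricity rate, monthly API spend), not quotes; plug in your own:

```python
def breakeven_months(hardware_cost, power_watts, kwh_price, api_monthly):
    """Months until owned hardware beats paying for an API.

    Assumes the box runs 24/7 and ignores depreciation/resale value.
    """
    power_monthly = power_watts / 1000 * 24 * 30 * kwh_price
    saved_per_month = api_monthly - power_monthly
    return hardware_cost / saved_per_month

# e.g. a $2500 GPU drawing 350 W at $0.15/kWh, vs ~$100/month of API usage
print(round(breakeven_months(2500, 350, 0.15, 100), 1))  # -> 40.2
```

If your team's API bill is under a couple hundred a month, the payback period runs into years, which is why renting first (or just using the APIs) usually wins unless privacy is the point.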
What do you plan to use for "uncensored chat"? A lot of the open-source models are trained to be censored too, and I've honestly had much better luck getting recent OpenAI chats to be uncensored.