This setup will be used internally for uncensored chat, coding, image gen and analysis.
We're thinking of using a combo of hardware:
- RTX 4090 GPU (heard it's a beast)
- Threadripper Pro 5955WX (anyone used this one before?)
- 1TB NVMe SSD
What are your picks for a local AI setup? And what’s the minimum budget to achieve it?
How many GPUs you need depends entirely on the size of your team, how often they'll use it, and the size of the models you're comfortable running.
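To make "size of the models" concrete, here's a rough back-of-envelope I use for VRAM sizing. The rule of thumb (weights at N bits per parameter, plus some headroom for KV cache and activations) is an assumption, not an exact figure; real usage varies with context length and batch size:

```python
def vram_gb(params_b, bits_per_weight=16, overhead=1.2):
    """Rough VRAM estimate in GB for serving a dense model.

    params_b: model size in billions of parameters.
    bits_per_weight: 16 for fp16, 8 or 4 for common quantizations.
    overhead: ~20% headroom for KV cache/activations (assumption).
    """
    weight_gb = params_b * bits_per_weight / 8  # 1B params at 8 bits ~= 1 GB
    return weight_gb * overhead

# A 70B model even at 4-bit wants roughly 42 GB -- more than one 24 GB 4090.
print(round(vram_gb(70, bits_per_weight=4)))  # -> 42
```

So on a single 4090 you're realistically looking at ~7B-13B models quantized, or ~30B at 4-bit if you're tight on context.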
I generally recommend renting instances on something like RunPod to build a good estimate of your actual usage before committing a bunch of money to hardware.
That said, it's been 6 months since I used it, and I feel like the proprietary models (gpt-mini/o1-mini, claude-sonnet, gemini-flash) are just so much faster and cheaper than self-hosted. The real value of local is that your data stays private and the model doesn't silently change out from under you, like 3.5 becoming 3.5-new.
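The "cheaper" part is easy to sanity-check with a break-even calculation. All the numbers below are illustrative assumptions (GPU price, power draw, electricity rate, monthly API spend), not quotes; plug in your own:

```python
def breakeven_months(hardware_cost, power_watts, kwh_price, api_monthly):
    """Months until owned hardware beats paying for an API.

    Assumes the box runs 24/7 and ignores depreciation/resale value.
    """
    power_monthly = power_watts / 1000 * 24 * 30 * kwh_price
    saved_per_month = api_monthly - power_monthly
    return hardware_cost / saved_per_month

# e.g. a $2500 GPU drawing 350 W at $0.15/kWh, vs ~$100/month of API usage
print(round(breakeven_months(2500, 350, 0.15, 100), 1))  # -> 40.2
```

If your team's API bill is under a couple hundred a month, the payback period runs into years, which is why renting first (or just using the APIs) usually wins unless privacy is the point.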
What do you plan to use for "uncensored chat"? A lot of the open-source models are trained to be censored too, and I've honestly had much better luck getting recent OpenAI chats to be uncensored.