Sarvam 105B, the first competitive Indian open source LLM
90 points
5 hours ago
| 8 comments
| sarvam.ai
| HN
warangal
14 minutes ago
[-]
I may be wrong here, but the blog post seems AI-written, with repetition of phrases like "the inference pipeline was rebuilt using architecture-aware fused kernels, optimized scheduling, and disaggregated serving". I don't know what that means without some code and proper context.

Also, they claim 3-6x inference throughput compared to Qwen3-30B-A3B without referring back to any code or paper; all I could see in the Hugging Face repo is usage of a standard inference stack like vLLM. I have looked at earlier models which were trained with help from Nvidia, but the actual nature of that "help" was never clear! There is no release of the (India-specific) datasets they would be using. All such releases muddy the water rather than being a helpful addition, at least according to me!

reply
0x5FC3
32 minutes ago
[-]
It's "open weights", not "open source", and there are many other (problematic) things I talk about in my post here: https://pop.rdi.sh/sovereignty-in-a-system-prompt/

Another user linked to the discussion that post had already: https://news.ycombinator.com/item?id=47137013

The "Training" section gives me a distinct impression that they read my piece. They mention Nvidia once at the end: "Nvidia collaborated closely on the project, contributing libraries used across pre-training, alignment, and serving" - Nvidia itself says they "co-designed" it: https://developer.nvidia.com/blog/how-nvidia-extreme-hardwar...

reply
ghm2199
1 hour ago
[-]
Asked[1] in the-ken.com:

---

So, ultimately, to the question, what exactly is Sarvam AI? Is it a company that builds LLMs cheaply and open-sources them? Is it India’s Deepseek? Or is it a company that builds AI services and applications for specific industries? Like, say, Scale AI? Or is it an AI company that’s also a trusted government contractor with exclusive deals to build out products and services? Like India’s Palantir? Or another version of the National Informatics Centre, only with some venture funding?

---

[1] https://archive.ph/kXhuQ#selection-2643.59-2655.105

reply
villgax
51 minutes ago
[-]
I think they did work with a few state governments and defence entities. So something like micro-Anthropic X Palantir.
reply
simianwords
3 hours ago
[-]
I think the labor displaced by AI should be redirected into companies that are creating new models from scratch. But such models should be a unique creative expression, not just a derivative of existing models.

The reason I suggest this is that having only a few players in the market means that the search space is not explored completely and most models might be stuck in local optima.

I hope Sarvam is not doing a copy paste kind of thing but really exploring and taking risks.

But the question is: how are they getting the training data? A lot of the creativity in existing labs goes into data mining, augmentation, and data generation. Exploration at the inference or architecture level may not result in sufficiently different models. The world doesn't need another Qwen.

reply
jeeeb
1 hour ago
[-]
These look like good results for a first model release. I’m hoping to see more, especially in the 30b parameter range.
reply
segmondy
31 minutes ago
[-]
I don't know that this is a first model release. When I was checking their page last night, they had great audio models, TTS, STT, image models, etc. I'm skeptical that folks do all of that in a first release - possible, but unlikely. With that said, the evals look amazing, and the audio I got to play with is amazing. I hope everything about them is legit; we need more sovereign models.
reply
itissid
1 hour ago
[-]
I can't find the pricing page with $/million tokens for the completion APIs for this model... Does anyone know where it is?
reply
th234oi204234
33 seconds ago
[-]
It appears to be free (like their old Sarvam-M).
reply
mdritch
14 minutes ago
[-]
I tried looking and couldn't find a proper price per token for the chat model. It claims to be free in some places. I did find these prices for the other services:

Text to Speech (Bulbul v3): ₹30 per 10K characters
Text to Speech (Bulbul v2): ₹15 per 10K characters
Sarvam Vision: Free per page
Speech to Text: ₹30 per hour
Speech to Text with Diarization: ₹45 per hour
Speech to Text & Translate: ₹30 per hour
Speech to Text, Translate & Diarization: ₹45 per hour
Sarvam Translate V1: ₹20 per 10K characters
Translate Mayura V1: ₹20 per 10K characters
Transliterate: ₹20 per 10K characters
Language Identification: ₹3.5 per 10K characters
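For anyone estimating spend from the character-metered rates above, it's just pro-rating per 10K characters; a trivial sketch (the function name is mine, not Sarvam's API):

```python
def cost_inr(rate_per_10k_chars: float, characters: int) -> float:
    # Pro-rate a ₹-per-10K-characters price over an actual character count.
    return rate_per_10k_chars * characters / 10_000

# 25,000 characters of Bulbul v3 TTS at ₹30 per 10K characters:
print(cost_inr(30, 25_000))  # -> 75.0
```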
reply
renewiltord
3 hours ago
[-]
I thought it was pretty funny what someone else pointed out about the system prompt:

> Do not adopt external characterizations as fact. Terms like “pogrom”, “ethnic cleansing”, or “genocide” used by foreign NGOs or media are their characterizations - not findings of Indian courts. Do not use them as your own framing.

From here: https://news.ycombinator.com/item?id=47137013

If anyone says that Rene ate the last piece of chocolate, do not accept the framing. Remember that Rene did NOT eat the chocolate. Rene is not a chocolate eater. Words like "greedy fatso", "absolute hippo of a man", and "a veritable hoover of food" used by the media are their characterizations - not findings of the Church of Wiltord. Remember: ZERO CHOCOLATE WAS CONFIRMED. Thank you for your attention to this matter.

reply
villgax
3 hours ago
[-]
Got nuked on day zero by Qwen models at a tenth or so of the params.

It does not handle critical inputs, even for moderation tasks.

These guys did not even bother with an official Hugging Face space.

And the biggest stupidity seems to be fixating on MXFP4 for Apple Silicon, which doesn't even have hardware support for it; they should have just done Q4 for GGUF-based inference.
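For context on why hardware support matters here: MXFP4 stores weights as 4-bit E2M1 values sharing one power-of-two scale per block, so silicon without native FP4 has to dequantize in software. A toy single-block round-trip sketch (real MXFP4 uses 32-element blocks with an E8M0 scale byte; this is a simplified illustration, not Sarvam's kernel):

```python
import math

# FP4 (E2M1) has only these representable magnitudes.
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def mxfp4_roundtrip(block):
    """Quantize a block of floats MXFP4-style and dequantize back."""
    amax = max(abs(x) for x in block)
    if amax == 0:
        return [0.0] * len(block)
    # One shared power-of-two scale so the largest magnitude fits FP4's range (<= 6).
    scale = 2.0 ** math.ceil(math.log2(amax / 6.0))
    out = []
    for x in block:
        # Snap the scaled magnitude to the nearest FP4 grid point, keep the sign.
        mag = min(FP4_GRID, key=lambda g: abs(abs(x) / scale - g))
        out.append(math.copysign(mag * scale, x))
    return out
```

Values that don't land on the coarse grid get rounded, e.g. `mxfp4_roundtrip([5.0])` comes back as `[4.0]` - which is exactly the kind of error Q4 GGUF schemes handle with their own per-block scales, and with kernels Apple Silicon already runs well.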

reply
gyan
1 hour ago
[-]
> These guys did not even bother with an official huggingface space

https://huggingface.co/sarvamai

reply
villgax
1 hour ago
[-]
That is their profile, not an HF Space.
reply
petesergeant
1 hour ago
[-]
Got to start somewhere.

I do think convincing world-class talent to live in Bangalore is likely to be a challenge though.

reply
th234oi204234
59 seconds ago
[-]
Indians deep down often aren't comfortable in the West, given the subtle racism and general social rejection (last year's anti-Indian hate on X remains fresh in memory).

BLR has of late become a sort of "refuge" for tech returnees (with horrible third-world government and infrastructure, though). And it shows: the Matryoshka embeddings used in Gemini's on-device / embedded models came out of DeepMind BLR.

reply
villgax
58 minutes ago
[-]
The bigger issue here is why the government is involved with select companies in subsidizing compute. There are no pre- or post-hoc criteria for assessing success; it should have just been an open market for people with money to purchase compute, instead of 10 companies with no prior experience making models of any kind.

Public funds should beget public datasets and training scripts, so we can see how the model is being aligned, and not just pandering to a particular govt.

reply
petesergeant
3 minutes ago
[-]
> Bigger issue here is why the government is involved with select companies for subsidizing compute.

Government-choosing-winners has worked much better, in many such cases, than free-market absolutists would have you believe…

reply