FilterHN

Meta Keeps Delaying the Release of Its New AI Model to Developers

67 points

by mekpro

1 day ago

| 7 comments

| wsj.com

zeroonetwothree

1 day ago

I’ve used it at Meta. It’s very bad; if they released it in its current state, it would be laughed at. I imagine they need to improve quality massively before it’s viable to release.

anonym00se1

1 day ago

This is what I suspected. Wang was a generationally bad hire.

He has Meta SWEs making $250k+/year labeling data in AAI. He has exactly one move and it's this: https://i.imgflip.com/atotpp.jpg

futuraperdita

1 day ago

“We don’t have good, unique data” is a pretty good out to keep Wang collecting extreme payouts and fleecing the leadership that has built trust in him, because it also plays to the one place where he’s demonstrated profitable expertise. It’s plausible enough to not be seen as a deflection.

I’m not sure the incentives are really aligned when you’re pouring that much cash and liquid RSUs at someone on normal vesting schedules. News stories of some of the acquisitions state that there are engineers in Meta’s AI organisation clearing 8 figures of compensation. If you didn’t think the strategy was successful, it’s rational (if not very principled) to continue to make excuses as to why until the gravy train stops and then use that to fund your retirement and the things you’d want to do instead.

leosanchez

1 day ago

But didn't Zuck say it will replace junior to mid-level engineers on the Joe Rogan podcast or something?

fnordpiglet

1 day ago

And look what else that podcast has brought us.

bamboozled

1 day ago

The podcast of truths

krackers

1 day ago

Great way to filter out garbage benchmarks (many have Meta Muse at the top)

enahs-sf

1 day ago

Occam’s razor tells me it’s probably because it’s not good. Perhaps running a company like Survivor in a pressure cooker is not an effective management strategy.

GoToRO

1 day ago

Also, when you finally make it better, the others make theirs even better and you are still behind.

DonsDiscountGas

1 day ago

Seemed to work when it comes to selling ads. I'm thinking training LLMs is harder than Anthropic and OpenAI make it look.

cyanydeez

1 day ago

I'm guessing both OpenAI and Anthropic have transitioned to prompt magic and fine-tuning rather than trying to keep building LLMs at scale. The fact that Qwen and other models are impressive, small and perfectly suitable for most work means every dollar you're spending on trying to train larger models is a losing prop.

vinni2

20 hours ago

> every dollar you're spending on trying to train larger models is a losing prop

You probably don’t know how smaller models are trained, then. Most of them are distilled from larger models or trained on data generated by larger models. If work on larger models stops, there is no magical way smaller models will keep getting better.

cyanydeez

14 hours ago

You're arguing with capitalism, not science or engineering.

vinni2

10 hours ago

Why don’t you argue with science and engineering then?

ilaksh

1 day ago

The article makes it unclear if they are building a new model or if it's just the API. But I am guessing it's the API.

So it's "release to developers" rather than "new AI model". They cannot ship the API.

I would assume you would just provide an OpenAI-compatible endpoint or two? But maybe they are not doing it that way.

Who knows what they are doing though. Maybe Meta has some kind of global API mesh thing and they can't quite make it work with vLLM or SGLang or something. Maybe they are building out a whole metered cloud IaaS for AI from scratch and that's just how long it takes. Maybe it's not technical complexity and just one of the managers is a problem.

Maybe they are delaying the API release until another more competitive model finishes training and testing.

mekpro

1 day ago

An API server is not a hard problem, and it would not make sense to postpone it indefinitely. I think the more likely explanation is model quality.

Too bad for Meta, and a very sad day for Llama.

xnx

1 day ago

Completely forgot that Meta was doing AI (and certainly spending billions doing so). They've got a lot of money, but are far behind on experience, talent, technology, and infrastructure.

zdragnar

1 day ago

Don't forget llama.cpp came about when Meta released the weights to their LLaMA LLM. They've been in the game for a while, just not anywhere near the top of the scoreboard since.

rhdunn

19 hours ago

llama.cpp is great. However, Llama 4 was a misstep for them: it was too big, so it was out of reach of the LocalLlama crowd and hard to train/customize into different variants, as has happened with the smaller models on Hugging Face. 70B seems to be about the limit there, with smaller models being easier to run and customize.

ramshanker

1 day ago

If they release a model comparable to OpenAI's or Anthropic's, will there be any reason left for the $1T valuations of other companies? At that point, revenue will simply be proportional to the gigawatts available. Whoever has the energy wins.

verdverm

1 day ago

DeepSeek and friends already exist, yet $1T valuations still exist. I think we are nearing a point where inference and cost metrics become the primary optimization for a while. Both capacity and costs are going to drive it from the demand side. I've personally moved to open weights already, and am now setting up vendor calls to make them available at work.

I'm talking with OpenCode and Fireworks, and would appreciate any recommendations that have SOC 2 and the like.

killingtime74

1 day ago

Cloudflare hosts the best models (DeepSeek/Kimi)

azinman2

1 day ago

*best open models

verdverm

1 day ago

They seem to be missing a number of them; DeepSeek V4 did not show up in their list / search. They seem to be slow to offer the latest options, and a number of them are MIA.

havaloc

1 day ago

If you spend more than 1 minute on Facebook, you realize what they are potentially training their models on, and it is not good. Their advertising algorithm is very good, I'll give them that.

tosh

1 day ago

https://archive.is/ia01T