FilterHN

mlmonkey

3 hours ago

[-]

For those interested, Wired ran a backstory about the Attention is All You Need paper 2 years ago: https://www.wired.com/story/eight-google-employees-invented-...

It gives some context on the contributions of each of the authors. About Shazeer, from the article:

Shazeer’s joining the group was critical. “These theoretical or intuitive mechanisms, like self-attention, always require very careful implementation, often by a small number of experienced ‘magicians,’ to even show any signs of life,” says Uszkoreit. Shazeer began to work his sorcery right away. He decided to write his own version of the transformer team’s code. “I took the basic idea and made the thing up myself,” he says. Occasionally he asked Kaiser questions, but mostly, he says, he “just acted on it for a while and came back and said, ‘Look, it works.’” Using what team members would later describe with words like “magic” and “alchemy” and “bells and whistles,” he had taken the system to a new level.

SiempreViernes

2 hours ago

[-]

> Using what team members would later describe with words like “magic” and “alchemy” and “bells and whistles,”

Ok, these people have all gotten extensive training on how to hype for the non-technical crowd without saying anything of substance.

ahmadyan

27 minutes ago

[-]

As a hacker, I kinda like Noam's code. I had to implement a TC MoE kernel, and stumbled upon his code from [tensor2tensor](https://github.com/tensorflow/tensor2tensor/blob/master/tens...) and I think "alchemy" is justified. Dude writes some beautiful kernels.

He also saw that LLMs would replace search before anyone else did, and it is really something to look at LaMDA's or GPT-1's output and think: yeah, this will answer all of our questions one day.

jvican

2 minutes ago

[-]

There's no doubt about Noam's abilities. But I read through that code, and struggle to see its 'magic' or 'alchemy'. Can you elaborate on what you find especially good about that code? (You may assume GPU kernel programming knowledge on my end.)

eli_gottlieb

3 minutes ago

[-]

Also, evaluating complicated functions with numerical stability and automatic differentiation is hard.

https://news.ycombinator.com/newsguidelines.html

dang

1 hour ago

[-]

"Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith."

anon_shill

1 hour ago

[-]

Does that apply to quotes from an article? They seemed to be criticizing a second or third degree source for being PR, which feels fair.

dang

55 minutes ago

[-]

Yes, in the sense that if there's nothing interesting to say about a quote then there's no reason to copy it into the thread.

nrds

50 minutes ago

[-]

And, of course, "interesting" means "interesting to dang"; whether apparently technical sources have apparently received PR training is therefore not "interesting". Drop the "my preferences are actually just objective truth" routine for once. Why is it so painful to admit you curate this site by preference?

epihelix

9 minutes ago

[-]

The "bells and whistles" label sounds more dismissive / pejorative to me. An odd, and not a particularly nice, thing to say. Makes me wonder how the "magic" and "alchemy" terms were intended in this case, also.

petilon

4 hours ago

[-]

Noam Shazeer was one of the lead authors of the seminal paper "Attention Is All You Need", which introduced the transformer architecture. (From Wikipedia)

4 hours ago

[-]

This understates his criticality. The author list was randomized, but the critical idea was truly his. Wonder what this says about GDM …

HarHarVeryFunny

3 hours ago

[-]

The architecture was Shazeer's, but the rough idea came from Jakob Uszkoreit who initiated the project.

Uszkoreit wanted to build a more efficient/scalable language/seq2seq model that could take advantage of GPU parallelism (replacing RNNs which were the main approach to sequence modelling at that time).

Uszkoreit's insight was that although language appears sequential, it is in fact part parallel, part hierarchical, as can be seen in linguists' sentence parse trees, where at each level there is parallelism/independence between the branches of the tree, with them getting combined at the next level up. This is what gave rise to the idea of a model that consisted of a stack of parallel processing layers (transformer layers). I believe that attention was also part of the plan from day one, as this had already been proven to be valuable (Bahdanau) with RNN seq2seq modelling.

So, this is what Uszkoreit wanted to build, but by his own account he failed to come up with an implementation that matched or outperformed the prevailing RNN approach that he wanted to replace. At this point, Uszkoreit mentioned the idea to Shazeer, who got on board and eventually arrived at a performant architecture which was then pared back by an ablation process resulting in the initial encoder-decoder Transformer architecture. Shazeer later came up with the mixture-of-experts architecture, and also other optimizations after he left to found character.ai.

abixb

1 hour ago

[-]

Curious about others' contributions to the paper, such as those of Vaswani, Parmar, Jones and Gomez. What sucks about co-authorship in research papers is that you don't get a clean breakdown of who contributed what, and the distribution (in more cases than not) is very much like a Pareto distribution.

I'm talking from plenty of group project experience here.

senordevnyc

1 hour ago

[-]

Can you expound on the ablation process? Is that referring to a stripping down of the data or weights or something? Or a stripping down of the transformer architecture structurally? Just curious

tedd4u

1 hour ago

[-]

You train the model then do a baseline evaluation. Then you evaluate many variants where you have removed or nulled out different layers or chunks of the model. By comparing the performance of those mutated models to the baseline you can learn a lot about the model. What parts don't have much value and can be removed, the location of "functions" or "facts." Etc. Google it.
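To make the procedure above concrete, here is a minimal sketch of that kind of ablation loop. Everything in it is a toy assumption for illustration: the "model" is just a list of functions standing in for layers, and `run_model`, `evaluate`, and `ablation_study` are hypothetical names, not anything from the paper or tensor2tensor. Note also that some ablation studies retrain each variant from scratch rather than only re-evaluating a trained model.

```python
from typing import Callable, Dict, List, Tuple

Layer = Callable[[float], float]


def run_model(layers: List[Layer], x: float) -> float:
    """Apply each layer in order (toy stand-in for a forward pass)."""
    for layer in layers:
        x = layer(x)
    return x


def evaluate(layers: List[Layer], dataset: List[Tuple[float, float]]) -> float:
    """Mean squared error over (input, target) pairs; lower is better."""
    errors = [(run_model(layers, x) - y) ** 2 for x, y in dataset]
    return sum(errors) / len(errors)


def ablation_study(layers: List[Layer],
                   dataset: List[Tuple[float, float]]) -> Dict[int, float]:
    """Replace one layer at a time with the identity and report how much
    the error grows relative to the unmodified baseline."""
    baseline = evaluate(layers, dataset)
    deltas = {}
    for i in range(len(layers)):
        ablated = layers[:i] + [lambda x: x] + layers[i + 1:]
        deltas[i] = evaluate(ablated, dataset) - baseline
    return deltas


# Toy "trained" model for the target y = 2x + 1; the middle layer does nothing.
layers = [lambda x: 2 * x, lambda x: x + 0.0, lambda x: x + 1]
dataset = [(float(x), 2.0 * x + 1.0) for x in range(5)]

deltas = ablation_study(layers, dataset)
for index, delta in deltas.items():
    print(f"layer {index}: error increase {delta}")
```

A layer whose removal barely moves the error (the middle one here) is a candidate for being pared away; a large increase suggests the layer carries real weight.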

flebron

3 hours ago

[-]

Source for this? The notion of attention dates to a content-addressable lookup during sequence alignment (as well as, concurrently, memory lookups in neural Turing machines). Attention had been used in other models, like GRUs and LSTMs with attention. The Vaswani et al. paper did not introduce attention, just removed everything _but_ attention (and FFW) from the network. Are you claiming the "critical idea" of removing the GRU and LSTM parts and just keeping attention was "truly" Noam's?
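For anyone who wants the "content-addressable lookup" reading spelled out, here is a minimal pure-Python sketch of scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V, for a single head with no masking. The function names and the list-of-lists representation are assumptions made for illustration; this is not the tensor2tensor implementation.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries: List[Vector],
                                 keys: List[Vector],
                                 values: List[Vector]) -> List[Vector]:
    """For each query, weight every value by how well its key matches."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of the query to each key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        # Weighted average of the values: a "soft" lookup.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs


keys = [[10.0, 0.0], [0.0, 10.0]]
values = [[1.0, 0.0], [0.0, 1.0]]
# This query lines up with the first key, so the result is close to the first value.
print(scaled_dot_product_attention([[10.0, 0.0]], keys, values))
```

With a query that matches no key better than another, the weights are uniform and the output is simply the average of the values, which is what distinguishes this from a hard dictionary lookup.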

daemonologist

3 hours ago

[-]

At some point in late 2017 the paper was updated with this additional detail:

    Equal contribution. Listing order is random. Jakob proposed replacing RNNs with self-attention and started the effort to evaluate this idea. Ashish, with Illia, designed and implemented the first Transformer models and has been crucially involved in every aspect of this work. Noam proposed scaled dot-product attention, multi-head attention and the parameter-free position representation and became the other person involved in nearly every detail. Niki designed, implemented, tuned and evaluated countless model variants in our original codebase and tensor2tensor. Llion also experimented with novel model variants, was responsible for our initial codebase, and efficient inference and visualizations. Lukasz and Aidan spent countless long days designing various parts of and implementing tensor2tensor, replacing our earlier codebase, greatly improving results and massively accelerating our research.

In any case, if the authors considered their contributions equal, that's good enough for me.

1 hour ago

[-]

Thanks - wanted to point to this, and indeed should have worded my claim more precisely. And yes, am aware of prior work on attention. (I need to look it up, but I recall Noam saying publicly that he wouldn’t have agreed to random ordering of contributions if he knew this was going to be this big).

mi_lk

3 hours ago

[-]

I didn't know we could just say things now. Ah, right, we're on the internet.

d4rkp4ttern

3 hours ago

[-]

Is this a generally well known thing?

1 hour ago

[-]

Nope, but it’s not particularly unknown either. It shouldn’t be a surprise; he had remarkable research contributions before and after (separately, he was also an IMO gold medalist).

markdown

4 hours ago

[-]

Even more important, I wonder what it says about HBW...

khazhoux

3 hours ago

[-]

Even if we knew, we’d still fail to understand GHO

fastball

3 hours ago

[-]

But more importantly the impact this has on TLAs

gzer0

4 hours ago

[-]

Some context for people who haven’t followed the full loop: Shazeer was a long-time Google researcher, joined Google in 2000, and was one of the co-authors of “Attention Is All You Need.”

He left Google in 2021 to co-found Character.AI. In 2024, Google brought him and some Character.AI researchers back via a licensing/talent deal with Character.AI (reportedly around $2.7B). He was then made a Gemini co-lead.

Now he’s leaving Google again for OpenAI.

Exciting times!

https://youtu.be/v0gjI__RyCY?is=nz77XP4KiJy7L1AX

paulmist

2 hours ago

[-]

I first saw Noam on Dwarkesh’s podcast together with Jeff Dean. Recommended if you want a taste of Google folks' take on things.

rhipitr

2 hours ago

[-]

At this point, is it even pay that’s tempting, or is it more about what they get to do? I would assume Google could easily pay them what OpenAI can, unless, as an older company, it’s harder for Google to match something really out there.

p1necone

2 hours ago

[-]

Yeah my current feeling is that once I had double digit millions earning further money would be pretty meaningless to me, and the difference between 'large salary' and 'even larger salary' would be even more meaningless, but who knows maybe it really would change me. I kind of assume people like this are primarily chasing the most interesting/impactful work though.

nrds

47 minutes ago

[-]

The problem with this belief is that it implies that all of bigtech is massively overpaying for top talent who would happily stay on for pennies. While bigtech overpaying talent is more plausible than any other bigcorp doing so, it's still rather unlikely.

zaat

1 hour ago

[-]

It gets to the point where what you do is the main question while payment is barely a minor concern way earlier than that point, at least in my experience. You don't need to be in the top AI research tier for that.

dudus

2 hours ago

[-]

How can an acquired dude leave after less than 2 years?

mikeyouse

2 hours ago

[-]

OpenAI pays for the earn out he would’ve otherwise received at Google + a new comp package. Made up numbers, if Google still owed him $10M for lasting the full two years, OpenAI can just pay him market rate +$10M.

deadbabe

1 hour ago

[-]

Yes, but what about the audacity of it? Get paid a lot to join a company but then decide to get up and leave again 2 years later? He just wants to be passed around?

mlmonkey

10 minutes ago

[-]

There's a possibility that he lost out in internal political battles, and things weren't going his way. Google is full of battle-hardened political warriors who will do anything (subterfuge, sabotage, etc.) to win battles. It is possible that a guy who just wants to build cool shit would feel like a misfit in such an environment.

drevil-v2

34 minutes ago

[-]

Oh my goodness think of the poor multi-trillion dollar company!! No honour among thieves these days...

mikeyouse

1 hour ago

[-]

You can have character and be loyal to Google (lol) or make $xx million… I’m not surprised when people choose the latter.

quantumink

1 hour ago

[-]

I would argue it's not the millions though, but rather that sweet rare compute - OpenAI has more of it for his interests than anyone - it is understandable why an exceptional mind would prioritize access to greater capabilities above all else.

cubefox

2 hours ago

[-]

> Exciting times!

What is exciting about this?

kkotak

1 hour ago

[-]

Right?! Unless you think this move is going to generate general excitement in our lives, it's just another rich guy moving from one high paying job to another.

dude250711

2 hours ago

[-]

Maybe he figured out a good way to short AI companies?

ddmma

2 hours ago

[-]

Hopefully will get to the conclusion that "Hopfield Networks is All You Need"

Natfan

3 hours ago

[-]

this character.ai? https://www.bbc.co.uk/news/articles/ce3xgwyywe4o

mlmonkey

3 hours ago

[-]

Oui!

root-parent

3 hours ago

[-]

The Netflix documentary will reveal he was secretly working for Sam Altman the whole time... (Cue diabolical VC-backed evil laugh.)

Google lost three critical years chasing AGI, and got acquired by SpaceX, now a Dyson Sphere startup whose pitch deck is just: "What if we put a paywall around the Sun?"

gniv

10 hours ago

[-]

Wow. What could possibly have caused him to quit so soon after coming back?

I hope this is not accurate but I'm afraid it is: https://x.com/signulll/status/2067446889956430273

afavour

4 hours ago

[-]

https://nitter.net/signulll/status/2067446889956430273 for those who don't want to click the above

er4hn

4 hours ago

[-]

signulll is more of an anonymous sh*tposter than a known industry insider, but I think this does capture the sama contribution to OpenAI very well. At least from the view of an outsider who follows this stuff based on vibes.

thewebguyd

4 hours ago

[-]

That twitter story isn't anything unique to OpenAI or Google, it's just classic "big public corp vs private startup" culture. Once you have to worry about the SEC, shareholders, antitrust, regulations, lawsuits, etc. it's very, very difficult to avoid turning into "big corp" culture.

Sama, and any other founder, will always have a difficult fight against bureaucracy, and once you let a little bit in, the bureaucracy's sole purpose becomes to grow itself.

electriclove

4 hours ago

[-]

Google and Apple both need a culling similar to what Elon did with Twitter after taking over.

quentindanjou

3 hours ago

[-]

I disagree. It's not about the culling, it has never been, and actually, it makes things worse. You spend countless hours and tons of money recruiting talented people not to lay them off because you don't want a bureaucratic org.

If the issue is inefficiency, tons of meetings, too much team alignment, etc., then that's the issue you need to tackle, and these issues can already appear in a 50-100 employee company. Sure, that's an easier problem to solve at a smaller size, but unless you hired people for no reason, these people have a very specific set of problems to tackle and are often, in these companies, the best in class at tackling them. Culling half of the company isn't going to make things better.

(And X rehired part of the laid-off engineers)

zipy124

2 hours ago

[-]

That depends on who you are firing. There are many job roles whose primary output is meetings and documents.

What percentage of Google employees are engineers...

whatever1

3 hours ago

[-]

Google bloat gave us transformers. Apple bloat gave us a usable touchscreen-only pocket computer (famously, an entire org within Apple had developed an iPod-based approach that was competing with what was released).

The leaps forward need bloat. A startup can execute in one specific direction way better.

Now back to your point, what did X deliver with its lean ops? It seems that it needed 2 bailouts (one from xAI, and one from SpaceX).

objclxt

2 hours ago

[-]

> Google and Apple both need a culling similar to what Elon did with Twitter after taking over.

You could cut Google's size by 40% and they'd still have more corporate employees than Apple.

(Google has ~190k employees, Apple has ~160k but 50k of those are retail staff, so ~110k corporate)
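A quick check of that arithmetic, using only the rough headcounts quoted in the comment above (they are the commenter's approximations, not verified figures):

```python
# Approximate headcounts as quoted in the comment (assumed, not verified).
google_total = 190_000
apple_total = 160_000
apple_retail = 50_000
cut_percent = 40

# Integer arithmetic avoids floating-point rounding on the percentage.
google_after_cut = google_total * (100 - cut_percent) // 100
apple_corporate = apple_total - apple_retail

print(f"Google after a {cut_percent}% cut: {google_after_cut}")
print(f"Apple corporate staff: {apple_corporate}")
print(f"Google still larger: {google_after_cut > apple_corporate}")
```

On these numbers the claim holds, though only by a few thousand people, so it is sensitive to how rough the headcounts are.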

VirusNewbie

1 hour ago

[-]

Google is competing with nvidia (TPU), AWS (GCP), Netflix (youtube), Tesla (waymo self driving), OpenAI (Gemini), Microsoft (Workspace), Apple (Android)....

HDThoreaun

4 hours ago

[-]

Google is facing a legitimate innovator's dilemma here. It makes sense to have all this process when you're protecting a $4.5 trillion golden goose. The tragedy here is that one predictable outcome of this situation is Google deciding to considerably cut research funding when they figure out it just serves to bootstrap future competitors.

eikenberry

4 hours ago

[-]

This is when it makes sense to split your business up into multiple smaller businesses. The government should be doing this via anti-trust but they have dropped the ball there so, at this point, the corps really need to just do it to themselves to better compete.

ginko

4 hours ago

[-]

Wasn't that what the whole Alphabet re-org was supposed to do?

breppp

3 hours ago

[-]

Alphabet has Google, with 99% of the profit through Ads, Search, Cloud, Gmail, YouTube, etc.,

and tens of money-losing companies that make balloons or whatnot.

4 hours ago

[-]

If I had to make a guess, money played a role lol.

karmasimida

3 hours ago

[-]

He is close to being, or already is, a billionaire; not sure more money will do much heavy lifting.

efficax

3 hours ago

[-]

you'd be surprised! people seem to have a limitless appetite for that money stuff. they just can't get enough of it, i've found

tedd4u

1 hour ago

[-]

I know some pretty wealthy people. They are very aware of those who are 10x wealthier than them. If Noam has 1B, he is probably pretty aware of those that have 10B. He's met them and seen their properties, scope, and powers. Likewise, they are thinking about those that have 100B, and those are thinking about Elon, who now has "four commas."

eloisant

2 hours ago

[-]

Most are happy to stop at being multimillionaires, but of course we don't hear about them. The focus on hungry billionaires is survivorship bias.

We don't hear about Tom from MySpace.

VirusNewbie

1 hour ago

[-]

How much money do you have? or are you just commenting from the peanut gallery?

busymom0

3 hours ago

[-]

This reads like an episode of Silicon Valley. I wish that show was rebooted, they'd have so much funny material nowadays.

swader999

2 hours ago

[-]

I think real life has far eclipsed the absurdity of the original show. They might have a hard time competing with just the news nowadays.

cguess

1 hour ago

[-]

Even back then Mike Judge said he had to tone down the absurdity he saw on fact-finding trips to the Bay Area. He said no one would believe how absolutely stupid so much of what he saw was.

busymom0

1 hour ago

[-]

Or they might give tech companies more ideas!

tedd4u

1 hour ago

[-]

Three comma guy would now be four comma guy

busymom0

1 hour ago

[-]

Wonder what the modern version of the hotdog app would be?

beng-nl

3 hours ago

[-]

I loved that show. The love that went into it really shows.

Sadly the gap between reality and satire has shrunk.

But yes. I also wish that show would come back.

Noam Shazeer would be Google's head dreamer.

HarHarVeryFunny

2 hours ago

[-]

The gap between reality and satire was apparently already very small back when the show was written. The creator, Mike Judge (who also created Beavis & Butthead, and Idiocracy) had worked in Silicon Valley as a developer and based the show on what he saw. Apparently it was very popular with SV insiders precisely because it was so accurate.

dekhn

1 hour ago

[-]

Judge also consulted with various teams at places like Google; I worked with one of the guys who provided details that later showed up on the show (as well as many plushies). He didn't watch the show because "it hit too close to home"

zipy124

2 hours ago

[-]

And Office Space!

jmaw

3 hours ago

[-]

Gilfoyle was really ahead of the times with Son of Anton.

cubano

3 hours ago

[-]

Your dream may be only a prompt away.

tyre

4 hours ago

[-]

going to go with "money" and a lot of BS from altman

https://www.youtube.com/watch?v=gilk-76W9rE&t=60

nostrademons

3 hours ago

[-]

[Edit: note that my comment was reparented, it was originally a response to someone claiming Noam was another "Scam Altman". I don't mind the reparenting or the killing of the original subthread, but I feel like this is necessary context to understand this.]

Noam is the real deal, he was pretty legendary within old-time ('00s) Google engineering. Paul Buchheit had a story about interviewing him with the "how to write a spellchecker" question and then him coming up with something better than the state-of-the-art, then basically delivering Google's spell corrector in his first 2-week Noogler project.

root-parent

3 hours ago

[-]

If he is supposedly extremely smart, then surely he would have known what he was doing. So how can anyone claim all this was just an accident?

"Google and Character.AI agree to settle lawsuits over teen suicides" - https://www.axios.com/2026/01/07/google-character-ai-lawsuit...

Be aware...very disturbing: https://www.judiciary.senate.gov/imo/media/doc/e2e8fc50-a9ac...

janalsncm

2 hours ago

[-]

Is this genuinely confusing for people? He helped invent the transformer, he didn’t solve content moderation.

mikeryan

2 hours ago

[-]

> he didn’t solve content moderation.

Considering what character.ai is, maybe he should have at least taken a shot at it.

freejazz

2 hours ago

[-]

Just from reading the threads here it seems readily apparent that he then went to start this company that did these bad things. Does not seem confusing at all?

Noumenon72

1 hour ago

[-]

Wow, he was using AI to solve problems in 2000 already, that spell corrector being trained on the Web and becoming the first widely used AI tool. Decades ahead.

https://old.reddit.com/r/singularity/comments/1u8xc9m/most_l...

kxxx

4 hours ago

[-]

Seems like there are some insights here!

edit: it seems the post has been removed but comments are viewable.

1 liner summary:

To put it lightly, the dude was politically outspoken and held strong beliefs.

dgellow

3 hours ago

[-]

https://www.yahoo.com/news/articles/google-cracks-down-posts... seems to have some context

statuslover9000

1 hour ago

[-]

Shazeer is an aggressive Zionist, and while Altman is better at reading the room, he has previously aligned himself with Israel: https://www.timesofisrael.com/openais-sam-altman-says-israel...

UltraSane

1 hour ago

[-]

What does Zionist mean when Israel has existed as a Jewish state for 78 years? I'm genuinely asking because the way the word is used doesn't make sense to me. There aren't similar terms for other countries to just stay the same, like for China to keep being run by the CCP. Every other country is assumed to have ontological inertia except for Israel.

Rumudiez

43 minutes ago

[-]

I'm confused, is 78 years a long time? even the US is considered a toddler by empirical terms. zionism wasn't a thing until a minority group had the loudest voice in the room when the allies were discussing what to do with all the european refugees after ww2, and it happened to align well with the brits abandoning their failed colony in the region due to disputes with the locals

https://en.wikipedia.org/wiki/History_of_Palestine

Rumudiez

38 minutes ago

[-]

here's a quote from wikipedia. it was an utter land grab and an easy way out of responsibility for those in power

> The League of Nations gave Britain mandatory power over Palestine in 1922. British rule and Arab efforts to prevent Jewish migration led to growing violence between Arabs and Jews, causing the British to announce its intention to terminate the Mandate in 1947. The UN General Assembly recommended partitioning Palestine into two states: Arab and Jewish. However, the situation deteriorated into a civil war. The Arabs rejected the Partition Plan, the Jews ostensibly accepted it, declaring the independence of the State of Israel in May 1948 upon the end of the British mandate. Nearby Arab countries invaded Palestine, Israel not only prevailed, but conquered more territory than envisioned by the Partition Plan. During the war, 700,000, or about 80% of all Palestinians fled or were driven out of territory Israel conquered and were not allowed to return, an event known as the Nakba (Arabic for 'catastrophe') to Palestinians. Starting in the late 1940s and continuing for decades, about 850,000 Jews from the Arab world immigrated ("made Aliyah") to Israel.

frollogaston

27 minutes ago

[-]

Yes, this is the important thing to know. I've heard way too many conversations that go back and forth about every act of vengeance in either direction after this, it's all noise. Partition plan started this. But I wouldn't call it an easy way out of responsibility; UK's leaders took a clear and binding position in favor of Zionism.

Also, it was Ottoman territory for hundreds of years up to WWI. I've had friends tell me for some reason about how Palestine was an independent country before... literally wasn't.

UltraSane

20 minutes ago

[-]

You didn't actually answer my question. How does using the word for people who want to create a Jewish state make sense when a Jewish state has existed for 78 years?

frollogaston

12 minutes ago

[-]

One reasonable possibility is they're referring to people like Ben-Gvir who have themselves claimed that Zionism means fighting for Israeli control over more territory like the West Bank. I don't know whether Zionists 78 years ago would've agreed, it's possible.

To some it still means favoring any existence of a Jewish state. The inertia isn't there because, aside from the original partition plan being pushed by the UK, other countries have attacked Israel several times later in ways it wouldn't have withstood without outside support.

wk_end

24 minutes ago

[-]

IMO people just use the term to mean “pro-Israel” rather than in any reference to the original meaning ("supporter of the idea of a Jewish state"). Which could mean any combination of “pro-American support for Israel”, “support for Israel in their various military actions”, “opposed to the creation of a Palestinian state”, “a belief that Israel should continue to exist as a Jewish state”, and so on. It's more about the broad political alignment than the specific meaning of the word.

UltraSane

16 minutes ago

[-]

Thank you for actually answering my question. That is very vague and explains why I find the word so annoying.

nsbk

3 hours ago

[-]

Alright. OpenAI feels like a better fit for him after all

reasonableklout

21 hours ago

[-]

Very bad news for Gemini - the brief comeback with 2.5 Pro last year looked to be driven by Noam

4 hours ago

[-]

Don't think it matters in the long run to be honest. The models have no moat, they are becoming a commodity.

Besides that, Google is in a pretty good position, they're not bleeding money on AI like Anthropic/OpenAI, and they own product verticals where they can integrate it. Plus they have a mature ads-model which is what might actually drive a bit of revenue for LLMs.

fourseventy

4 hours ago

[-]

I think the 'models have no moat' thing is overblown. Only like 3-4 companies in the entire world have cutting edge models, that means there is some kind of moat...

rvnx

4 hours ago

[-]

Money.

That's their moat.

Maybe also stolen copyrighted content that cannot be found anywhere else now, so they are the only ones who can train on it.

gordonhart

1 hour ago

[-]

Meta has tons of that, but no frontier contender. Clearly there’s _something_ more to the equation than money

seydor

4 hours ago

[-]

money. but it eventually runs out

rvnx

4 hours ago

[-]

A little IPO is the solution.

Don't we all want to automatically and passively invest in a company losing billions of dollars?

At least we can diversify our portfolio from SpaceX.

tcp_handshaker

4 hours ago

[-]

Pre-Quote: "We are all going to lose, hundreds of billions"

maxdo

4 hours ago

[-]

Yeah, sure, look at Anthropic's revenue: what is it if not the moat? You can argue about how long it will last, but for them a good model = the fastest-growing company ever.

rvnx

3 hours ago

[-]

Revenue is not a metric of success at all.

Grabbing market share is easy if you have investors who are ready to burn cash infinitely: find a hot niche, buy a banana for 1 USD, sell it for 0.10 USD.

Example: Cursor, they became popular because they were selling ChatGPT unlimited for 20 USD / month.

When they launched, just a reskinned VS Code, "fastest growing AI company"

No coincidence they were bought by SpaceX, who wants to consolidate revenue, even if it's nonsense, as long as it helps other investors to exit. It shows rapid growth.

Profit is the real moat.

One example: Nvidia. Proprietary tooling, proprietary IP, proprietary hardware, no alternative, expensive.

signatoremo

2 hours ago

[-]

Revenue is moat. Ask Amazon. Or Alibaba. Or Temu.

You don't know what Cursor's game plan was. Maybe acquisition was their plan.

Buying at $1 and selling for $0.1 is still viable as long as they have money in the bank, until they achieve their goals. Most startups start out that way. Even giving away their services for free.

Obviously there will be failures. Doesn't mean they have no moat. Can you say a business with 100 customers and $1000 debt is less viable than one with a single customer and no debt?

dabbz

3 hours ago

[-]

I feel like the "models have no moat" paradigm died when a single model expanded past the memory of a single GPU slice. The moat is hosting the model. Even paying a server host to run a rack of GPUs has an immense startup cost, and then you're still struggling to compete on the add-ons layered on top of the model (prompts, validation loops, etc.). You can only throw so much money at a problem.

xyzsparetimexyz

50 minutes ago

[-]

Many different companies host the open source models. Where's the moat there?

root_axis

4 hours ago

[-]

And they've had some initial success with TPUs which could be a major differentiator in the future.

4 hours ago

[-]

Yup, and they have the Apple partnership for now as well. Much better position generally than OpenAI in my opinion.

xnx

4 hours ago

[-]

> models have no moat

Possibly true. Any smart innovations developed by one organization will be smuggled into others.

Training, inference, and data-collection infrastructure are definitely moats. High-volume usage feedback is also hard to come by for new entrants.

thewebguyd

4 hours ago

[-]

And Google has all of those. Custom silicon, more data than anyone else and probably the most comprehensive data collection system, and phones in the hands of 73% of the global smartphone-using population to push Gemini into, to get high-volume usage feedback and even more telemetry and data.

observationist

4 hours ago

[-]

I don't think you're honestly accounting for the engineering behind the progress models are making. If it was just a matter of compute on hand and iterating, Meta would be neck and neck with Ant, OAI, and Google, but clearly you've gotta have more.

Noam has a deep expertise in these systems at every level, both algorithmically and at production scale, and knows how to leverage things at different levels.

It's not like Google won't have anyone else that can do what he does, but at the same time, it's an implicit criticism of Google's culture, operations, development, and overall AI program. Shazeer is well past the point where the paycheck is the deciding factor, although I'm certain he is very well paid. Having the freedom to innovate and build free from the corporate fuckery of Google and Facebook is probably more valuable than the pay raise he got with the move, and OAI has the advantage of not having to cope with decades of corporate cruft and inertia. They'll get there - all corporations do - but they're relatively young enough to still be nimble.

xyzsparetimexyz

48 minutes ago

[-]

> Noam has a deep expertise in these systems at every level

As do thousands of people at this point. You think the head of DeepSeek doesn't?

4 hours ago

[-]

I honestly don't think that matters for multiple reasons:

1. There are already multiple "sota" models on the market that compete with only marginal gains between them (OpenAI, Anthropic, Google/Gemini) and some that are catching up (DeepSeek, Qwen,..).

2. The fact that something is a hard engineering problem does not mean it's generating revenue. So while what you said is true, deep expertise is required to push the industry forward, I don't think that is going to matter for the bottom line of these companies. Hence why I think the models don't give a company any 'moat' in a capitalist economy.

aykutseker

1 hour ago

[-]

AI hiring starting to look like sports free agency.

Karpathy to Anthropic, now Noam to OpenAI.

throwaway314155

1 hour ago

[-]

I thought Karpathy was going to OpenAI?

https://x.com/karpathy/status/2056753169888334312?lang=en

tnorthcutt

1 hour ago

[-]

fancyfredbot

3 hours ago

[-]

Question one: How much did this cost OpenAI?

Question two: Why are OpenAI spending that money taking talent from Google, who can definitely outspend them for talent, and not Anthropic, who are leading the market and are at least somewhat financially constrained.

supern0va

3 hours ago

[-]

Reporting on this seems to indicate that people at Anthropic are significantly more loyal, and that attempts to poach by OpenAI and Meta have been largely unsuccessful.

bootsmann

2 hours ago

[-]

Their options are probably insane sunk cost, hard to steal an engineer who has Xm in potential gains if they choose to stay.

supern0va

2 hours ago

[-]

People seem to have turned down offers that would have netted out more upside for them, so it doesn't seem to just be that. Anthropic seems to lure in the true believers, whereas people are highly skeptical of Sam's motivations these days (particularly after how much safety/alignment has been reportedly cut).

But I'm sure for at least some folks, this is true, given recent valuations.

HPMOR

23 hours ago

[-]

Wow - Google paid a couple billion dollars to bring Noam back. Really impressive by OAI if this reporting is accurate!

alexcos

17 hours ago

[-]

It is accurate. Confirmed by Noam himself on X https://x.com/i/status/2067400851438932297

4 hours ago

[-]

Love the choice of words by Noam- exceptional team for OpenAI, amazing team for GDM.

john_strinlai

3 hours ago

[-]

it would sound weird to say either word twice in such a short blurb.

david_shi

4 hours ago

[-]

Love this type of detailed textual analysis.

xnx

21 hours ago

[-]

This does suck for Google. Noam will take a lot of Google trade secrets with him to OpenAI. Google's bench is deeper than this one guy though.

dansquizsoft

31 minutes ago

[-]

Trade secrets? Like how to invent a trillion dollar technology and then sit on it for years while others eat your lunch with it? Like how to consistently release models of inferior quality to others' despite infinite compute and engineering talent and insane profitability in your legacy businesses?

bigyabai

2 minutes ago

[-]

Not really sure what you're talking about. Apple just licensed Gemini for Siri, and Google and their TPU hardware are starting to hit primetime audiences that OpenAI can only dream of.

biffles

22 hours ago

[-]

Surprised to not see more comments on this, especially given the popularity of the Anthropic/Karpathy article. What a win for OpenAI - and what a loss for Google, just 2 years after paying $2.7bn to bring Noam back into the fold. Does not bode well for Gemini long-term... Or could be a signal for how deeply they are leaning into world models.

pixelp3

15 hours ago

[-]

I think nobody they acquired from Character.AI is at Google anymore.

yigalirani

4 hours ago

[-]

is it partly due to alleged antisemitism at google?

mrcwinn

22 minutes ago

[-]

I would just love to hear your definition of that word.

uejfiweun

2 hours ago

[-]

Any sources on this?

mosfets

2 hours ago

[-]

How to be a legend like him?

ai_fry_ur_brain

4 hours ago

[-]

It's getting pretty lame that we talk about these guys like they're football players transferring teams.

https://www.yahoo.com/news/articles/google-cracks-down-posts...

chubot

3 hours ago

[-]

In this case, it's not a new thing ... back in 2005 (yes 21 years ago), people talked about the achievements of Noam Shazeer at Google (and Jeff Dean and Sanjay, etc)

I always appreciated Jeff having a level head ... which this article seems to confirm:

Noumenon72

1 hour ago

[-]

I wonder if the ideological censorship described in your link is part of why Noam decided to leave.

matthew_hre

4 hours ago

[-]

Speak for yourself, my Fantasy Developer League is crushing it this season

glaslong

3 hours ago

[-]

How do I ̷g̷a̷m̷b̷l̷e̷ sports bet on this

ai_fry_ur_brain

4 hours ago

[-]

I feel like there was a scene in Silicon Valley about a developer fantasy league.

Scene_Cast2

4 hours ago

[-]

Krazam already has a video covering this exact idea.

https://www.youtube.com/watch?v=KIZt9YPAPZo

kirubakaran

3 hours ago

[-]

Fantasy FAANGball

ttoinou

4 hours ago

[-]

It could be the opposite. Those are really useful people, they deserve this more than football players

ai_fry_ur_brain

4 hours ago

[-]

Idk, football players actually make a bunch of people happy and entertained. 80% of the United States wishes this tech never existed.

What they're working on is just making people's jobs and skills obsolete and trying to invent machines that will concentrate the world's wealth into the hands of the people who own those machines.

ttoinou

3 hours ago

[-]

Very few people follow football so closely that the actual frontier work of the best players matters. Out of 30 friends I know who like football, only 1 of them could explain what’s going on in the field technically. For most people, pro players are replaceable.

Popular entertainment and unique progress of human civilization can’t be really compared either

iooi

3 hours ago

[-]

This "guy" is worth on the order of all football players put together.

tayo42

1 hour ago

[-]

I think it's more about how the products that impact our lives might change and what might flow down to us because of that.

bookofjoe

4 hours ago

[-]

What's the AI equivalent of NIL?

mrandish

3 hours ago

[-]

This situation is kind of like backend NIL value. His value to OAI isn't just the work he'll do "on the playing field", it's the perceptual value of "OAI just hired the guy Google paid >$2B to get back" right before their IPO.

Nebasuke

3 hours ago

[-]

Have you seen the Krazam fantasy FAANGball sketch? https://www.youtube.com/watch?v=KIZt9YPAPZo

It's funny, but with the AI hires/moves it feels more like satire now.

bbeonx

4 hours ago

[-]

wait this is kinda brilliant tho

krembo

4 hours ago

[-]

We're a community of geeks. We admire Tesla, Feynman, Linus and such. For me they are far greater than football players

Catloafdev

4 hours ago

[-]

I hope this doesn't impact Google's progress on open models.

CamperBob2

4 hours ago

[-]

Is Shazeer known to be opposed to open-weight releases?

Catloafdev

4 hours ago

[-]

OpenAI hasn't released open weights since GPT-OSS-20/120B. Google has the Gemma line.

I wouldn't expect OpenAI to start releasing open weight competitive models again, but I could be wrong.

irishcoffee

3 hours ago

[-]

Their models are the only moat they think they have left, which at this point is more of a filled-in wet circle of dirt.

https://www.youtube.com/watch?v=KIZt9YPAPZo&ra=m

sph

3 hours ago

[-]

From the excited comments and fanboyism, I have to say KRAZAM predicted the cult of personality that has infected the AI space.

ur-whale

3 hours ago

[-]

Looks like Google is leaking both AI talent and know-how something fierce ... and since the very day the transformer paper was written.

As an outsider, I'd be really curious to understand why, given how well positioned they seem to be in the AI battle:

- huge, quasi unmatched data war chest

- huge, quasi unmatched, planet-scale infrastructure

- native AI chip design and production (TPU)

- the core ideas for what we now know as "AI" were invented there

- deepmind, enough said

- pretty much the deepest pocket of all the AI players with the possible exception of MSFT

- a massively large user base and reach to deploy AI to (Android, YT, Cloud, Search, Email, ...)

- supposedly one of the best engineering cultures in the valley

Why do the best people leave?

Why do their AI products always come in 3rd place?

Why can't they seem to take the lead, either in terms of product design or in terms of raw LLM performance?

The only answer I can think of is:

- culture is completely broken

- management sucks something fierce

- company is so fat and rich no one is actually interested in winning anymore

1: https://ln.hixie.ch/?start=1700627373&count=1

dwrodri

2 hours ago

[-]

Google has muddied the waters on their Gemini usage statistics as it now powers a big chunk of Search. Depending on how you cut it, Gemini (and Gemini powered products) are probably producing the most output tokens seen by the most human eyeballs by a large margin.

Google at its core is not a dev tools company and it has become evident that is where the money is given the verifiable nature of software. Hixie's reflections on his tenure at Google still ring in my head to this day, though I have never worked there[1].

The people at the helm of Google no longer see the company's identity as something which must be channeled through a product or an experience. Some will point to the DoubleClick acquisition, others will point to Google Reader, or Pichai's ascension. Despite his very short tenure, MBA/McKinsey-brain is a very real phenomenon and it's no mistake that it shaped the "promotion packaged as a product launch" culture that steered Google away from seriously betting on anything that wasn't ads. To quote the signull tweet linked elsewhere in this thread, you can have everything at Google, except for permission.

Most importantly--I don't think there's a single tech product where I can point and say "Google wouldn't do that". You can contrast this with say, other Alphabet companies which don't suffer from this remotely as much. It is VERY clear what Waymo and YouTube are trying to accomplish, and while it frequently makes a ton of sense for the companies to share infrastructure and product knowledge, YouTube does an exceptional job on the product side of making it very clear what they would and wouldn't do. They have experimented and shut down experimental features before (is their MOOC functionality still around?), but since it's fairly clear Google specifically is no longer working in service to the mission of providing the world's best digital portal for accessing information, I think it would behoove them to figure out what their mission is.

https://www.youtube.com/watch?v=3t6L-FlfeaI

speed_spread

15 minutes ago

[-]

Sounds like Noam just wanted to serve 5 terabytes.

xyst

2 hours ago

[-]

This is what you call a PR hire.

nothrowaways

2 hours ago

[-]

Good luck Noam, Gemini is a great piece of work.

ur-whale

3 hours ago

[-]

Silver lining: given the leaked financials of OpenAI, he might very well be joining a sinking ship.

Also, why didn't they nail him down contractually when they bought character.ai ... isn't that pretty standard with this type of superstar (re)hire?

mrandish

3 hours ago

[-]

You can't force someone to keep taking your money (that's indentured servitude), you can only incentivize them to stay with increasing amounts of money. Google almost certainly did do that. Probably by vesting his hiring bonus over 2-3 years.

OpenAI is in a unique position right now to grant pre-IPO options (probably in the form of RSUs). And they wanted him badly enough to grant the extra options necessary to effectively 'buy out' whatever unvested Google bonus he's walking away from.

ur-whale

3 hours ago

[-]

Yah, I guess Cali doesn't allow non-competes or something like that.

LOL.

tcp_handshaker

4 hours ago

[-]

I guess this means Google is nowhere close to even discerning a hint of AGI? So when Demis Hassabis says AGI... could arrive in just 3 years, he has learned from the best, Larry Ellison?

dboreham

4 hours ago

[-]

I would guess it means Sam Altman gave him more money.

overfeed

25 minutes ago

[-]

And threw in a sweetener when he seemed hesitant: he can say whatever he likes about trans people in the workplace and not worry about being PC.

pixelneon

2 hours ago

[-]

Niceee

HardCodedBias

3 hours ago

[-]

Huge blow to Google.

I doubt that the money had anything to do with it.

I also doubt that the state of the technology at OAI vs. Google had much to do with it. Google is behind, no doubt, but the gap is not, as far as we know, insurmountable.

I suspect that this is a leadership clash. Noam was working in GDM. GDM somehow went away from coding and RSI into "world models" and that has played out very poorly. Who made that call? Who was still playing politics?

Given this is Noam the list of people that could be pissing him off is very small: Demis, Sergey (?!), a couple of VPs in GDM.

What the hell happened?

dekhn

56 minutes ago

[-]

It's entirely possible he got in a flame war over political issues with other Googlers and Google asked him to leave.