Uber’s COO says it’s getting harder to justify money spent on tokenmaxxing
137 points
1 hour ago
| 35 comments
| businessinsider.com
| HN
delichon
22 minutes ago
[-]
There is little new under the big fusion reactor in the sky. I just read a chapter in James Gleick's "The Information" about tokenmaxxing in the telegraphy industry. There used to be a big market for code books to reduce the per-character charges for sending telegrams. Compression was cash in the pocket. The telegraph companies discouraged the practice but were forced to accept it. The telegraph code industry started with the initial commercialization of telegraphy and didn't end until the 1920s.

There was a cost to it though. Codes greatly reduced redundancy, and caused large miscommunications from very small errors. As Gleick explains it, this was the opposite of the African drumming practice of adding redundancy to strengthen the relationship between the rhythm and the language that the drums mimic.
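
To make the trade-off concrete, here is a minimal sketch of how a code book saves characters and why a one-letter error becomes a different message. The code words and phrases are made up for illustration and are not taken from any real telegraph code book.

```python
# Hypothetical code book: each short code word stands for a whole phrase.
# These entries are invented for illustration only.
CODEBOOK = {
    "GOLAD": "market is falling, sell at once",
    "GOLAT": "market is rising, buy at once",
}

def decode(word: str) -> str:
    """Look up a received code word; unknown words cannot be decoded."""
    return CODEBOOK.get(word, "<unknown code word>")

def chars_saved(word: str) -> int:
    """Characters saved by sending the code word instead of the phrase."""
    return len(CODEBOOK[word]) - len(word)

sent = "GOLAT"
received = "GOLAD"  # one letter garbled in transmission

print(chars_saved(sent))   # characters not paid for
print(decode(sent))        # what the sender meant
print(decode(received))    # what the receiver reads
```

In plain English, one wrong letter usually leaves the sentence readable. Here it silently swaps one valid message for another, because the redundancy that would have exposed the error was compressed away.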

reply
izanton
1 hour ago
[-]
What if... we stop for a moment, and then, after thinking for a moment, we stop hammering nails with a microscope, and stop using token usage as a metric of productivity?

I know it sounds stupid, but what if

reply
symfoniq
58 minutes ago
[-]
There is a complete lack of courage in the leadership of tech companies today, and top-down AI mandates are just another manifestation.

True visionaries think outside the box, but most tech executives are forcing their employees into black boxes, out of fear of not doing exactly what their competitors are doing.

We have lemmings for leaders, and that means that—much like the LLMs that are being shoehorned into everything—there isn’t room for original thinking. Everyone’s strategy looks exactly the same.

reply
duxup
1 minute ago
[-]
There was an amusing post about judging developers based on token usage where some user on HN here was pushing this idea “ICs don’t like it but this is the best way to evaluate” (something like that).

They have a whole management team and can’t seem to find a way to judge developers…

reply
CharlieDigital
4 minutes ago
[-]
I'm going to offer a contrarian view here:

First is that despite a lot of waste, some innovation will arise from an enterprising employee finding some interesting use case. A lot of the tokenmaxxing is just waste, but out of that waste may arise a small number of genuinely powerful use cases.

Second is that many workers will be entrenched in their ways. If your executive goal is to achieve the above (find innovative ways of using AI), then you need to move everyone to use it. Most will just waste tokens, but someone may find a novel and useful way of using it that benefits the organization. It is difficult to achieve these without forcing people to act since their default is to follow the well-worn grooves.

So mandates like these are a top-down forcing function like a slime mold feeling out different paths to find resources.

Some devs in my org have fully embraced AI; some would not even use AI if not for leadership mandates and linking usage to performance reviews (I know, I think this is stupid, too). I can see why mandates could be useful since some folks definitely won't be inclined to use AI.

reply
justinparus
5 minutes ago
[-]
Is keeping your company private the easiest way to get around this?
reply
overfeed
33 minutes ago
[-]
> Everyone’s strategy looks exactly the same.

If one is a CxO who's looking out for one's job security, herd-like behavior is the safest option, due to the (near universal) structure of "performance"-based executive remuneration.

reply
AdrianB1
17 minutes ago
[-]
Lacking not just courage, but also character. Wasting company money on buzzwords and dubious outcomes is lack of character.
reply
Lalabadie
1 hour ago
[-]
You're now in the last frame of the comic, getting thrown out the window.
reply
swed420
49 minutes ago
[-]
Maybe it's time we adopt/design an economic system that isn't so easily co-opted by counterproductive prisoner's dilemmas.
reply
nradov
38 minutes ago
[-]
What would such an economic system look like?
reply
tekno45
1 hour ago
[-]
Not very Billion Dollar Valuation of you.
reply
blitzar
50 minutes ago
[-]
If there are any tech CEOs out there reading, I can offer my services. I will pointlessly burn unfathomable amounts of tokens, in parallel, 24 hours a day, 7 days a week, all for you. Think big big big numbers of tokens. You know what's cooler than a trillion tokens? A quadrillion tokens.

Let's talk about my bonus. I will open the bidding at $1 per token.

reply
99954bb63ccc
57 minutes ago
[-]
I feel like individually, if you sat down with literally any reasonable person on the planet they would arrive at and/or agree with the tenor here.

I'd be curious to hear from people well versed in group psychology/dynamics and/or just a lot of leadership/people experience: what leads people to this type of thinking once they get in a group setting? It just... seems endemic at this point.

Obviously nobody here is going to know what I do or don't know, but I'm just increasingly curious what I am not understanding about this type of thing. It seems so obvious, yet that makes me ever more suspect that I'm oversimplifying it, or just totally ignorant about the problem in general.

reply
mike_hearn
40 minutes ago
[-]
It's because the average organization has lots of people who don't care about their own productivity and won't adopt new tools or processes unless forced to. This is true of most new tech - lots of workers had to be forced into using computers - but AI also has some other bumps to cross like lots of people who tried early models and then wrote them off, not realizing how fast they'd improve. And most orgs have no infrastructure or processes for allocating individuals token budgets, and most employees have no experience of properly deploying budgets.

Roll it all together and saying "just use it dammit" has some obvious advantages:

1. It's clear.

2. It's simple.

3. It eliminates all excuses employees might come up with for not using it.

The people at the top of these companies aren't stupid. They might have miscalculated how many tokens people can actually use, but that's very hard to calculate because usage is opaque and tools/processes change on a nearly weekly basis. They will eventually build out processes, tools, social conventions and performance metrics that take into account efficiency of token usage. But this is hard! Most managers aren't really assessed on the precise productivity of their teams, for instance, because productivity is often poorly defined.

reply
overfeed
15 minutes ago
[-]
> what leads people to this type of thinking once they get in a group setting

Game theory! The downside of being brave vastly outweighs the upside. For the C-suite, there is no cost to herd-like behavior, regardless of the outcome. However, there is a very high personal downside to being a maverick if your board later discovers you went against the grain and made the wrong choice. The upside of being a maverick and right is very limited.

Once a behavior has become mainstream, hopping on the bandwagon is no longer individually attributable to decision-makers, but is seen (and reported) as a macro-economic phenomenon: Nadella, Zuckerberg and Bezos didn't overhire - the American tech industry overhired.

reply
turzmo
48 minutes ago
[-]
Won’t be canned for going with the herd. I think it’s that simple, even if the herd is running off a cliff.
reply
stusmall
34 minutes ago
[-]
That was a fun thought experiment while I waited for my ralph wiggum to finish running. Now thinking is over and back to the vibe
reply
devin
1 hour ago
[-]
The people who have ascended to leadership positions are deeply divorced from reality.

"It is difficult to get a man to understand something, when his salary depends on his not understanding it." -Upton Sinclair

reply
lorecore
1 hour ago
[-]
The crazy thing is their salary does not actually benefit from riding these trends. Unless it's equally/even more clueless board level pressure with ulterior motives (i.e., lifting their other AI investments or the sector as a whole).
reply
repeekad
1 hour ago
[-]
Every c suite in the country is panicking about being left behind, from their perspective it’s either token max or fade into obscurity, or at least that’s what they were sold
reply
sandeepkd
25 minutes ago
[-]
It's a herd mentality. It's a lot easier to follow the louder voices than to spend time understanding how it impacts your own particular business. "Because Google does it this way, or Apple does it this way" is a common argument in a lot of feature/business decisions.
reply
lorecore
1 hour ago
[-]
I don't think that's accurate. I think every C suite in the country is looking to do away with labor's leverage as much as possible. I think this is a cultural thing more than anything else, C suite + investors looking to get rid of those pesky humans required to prop up their lifestyles. AI is the most credible path toward that. Short, medium or long term returns be damned, this is a reconfiguration of society and they want to shed what they consider to be baggage.
reply
devin
57 minutes ago
[-]
Like anything it's a mixed bag. I am certainly working with people who I think truly believe the "max out on AI usage or become irrelevant" line. There are people who will privately let you know they're just working with the current meta the best way they can, but others who are drunk on kool aid.

Trying to operate as a rational, thinking person in a lot of environments right now feels impossible. Rational thought is being treated like AI skepticism.

reply
treis
1 hour ago
[-]
Please. These are the same people that force their employees to use Microsoft Teams because Slack is $5 an employee a month. They're not going to sit idly by while employees burn thousands a month in tokens.
reply
devin
55 minutes ago
[-]
It depends on which people you're referring to. The allocation toward AI budget has been so massive that I think a lot of businesses are way behind on trying to assess value for dollar for the AI-related crud they're shelling out for.
reply
treis
34 minutes ago
[-]
Everyone is feeling it out but the vast majority of spend has been subscription based. Some outliers may have used a massive amount of tokens but companies didn't pay for that.

That VC funded gravy train is likely coming to an end. But fortunately there are also reasonably efficient models now so that the tokenmaxxers can still make the (much cheaper) tokens go brrrr.

reply
transitorykris
31 minutes ago
[-]
I deeply believe this but have no strong evidence. Revenue has always been a cure-all remedy. This will keep model providers alive along with the very wide range of companies that are experiencing growth with them (from chips to backhoes), for a time anyway. If/when that house of cards starts going in the other direction there’s going to be widespread pain. By analogy, the nonsense of the dotcoms and that crash had a very direct impact on their suppliers (e.g. telecoms). My only advice is to let the Microsofts and Metas do the tokenmaxxing, and don’t get suckered into the idea you (startup, individual, etc) should be playing that game.
reply
pera
54 minutes ago
[-]
They get paid for saying whatever VCs want to hear and now that thing is "we have now become an AI-native company". The thing I'm still trying to understand is who is scamming whom
reply
nradov
21 minutes ago
[-]
Uber is publicly traded. They're not beholden to VCs any more.
reply
zeroonetwothree
1 hour ago
[-]
Come on, don’t be crazy
reply
FartyMcFarter
1 hour ago
[-]
If any company announces that they use token consumption as an employee performance signal, for me that's close to a red flag to stay away from that company.

No company with good engineering leadership should act like this is remotely a good idea.

reply
LaurensBER
1 hour ago
[-]
Tokens are the new "lines of code per engineer". Easy to graph, easy to "manage".
reply
mig39
49 minutes ago
[-]
The new TPS reports!
reply
KellyCriterion
1 hour ago
[-]
...and easier to bill! Back then, nobody had the idea to charge per "lines of code", but today it seems accepted to charge per word processed?
reply
abvdasker
1 hour ago
[-]
Meta does this. Guess what one of the criteria for their recent layoffs was.
reply
loeg
34 minutes ago
[-]
Meta tracks token consumption, but has explicitly stated that it is not a performance metric.
reply
KaiserPro
21 minutes ago
[-]
Indeed, they also said that previous time off for ill health wasn't a reason either.

but looking at the number of people who had taken leave, it suggests otherwise.

reply
redwall_hp
9 minutes ago
[-]
Anything a trash company says they're not doing is an admission of guilt.
reply
hcnews
20 minutes ago
[-]
Sure, and I have a bridge to sell you. Or alternately refer you to the inevitability of Goodhart's law.
reply
an0malous
1 hour ago
[-]
I worked at a YC company that was doing this and left last month. I wonder where this all started from, VCs and tech execs are such a monoculture
reply
mrkeen
1 hour ago
[-]
I always used to wonder this about software stacks even prior to LLMs, but it seems more relevant now somehow:

When will Uber (or your favourite company) be 'done'? They've been writing software for 16 years.

They match drivers to passengers. More software isn't going to increase the chance that I seek them out instead of taking a bus or train.

Will their software be finished in 20 years? 80?

reply
goldenarm
1 hour ago
[-]
Most of the codebase is custom integrations for local markets. You can systematize some of it but most of the complexity comes from there.
reply
AlotOfReading
25 minutes ago
[-]
Sure, but custom integrations seem unlikely to explain the majority of Uber's technical headcount. Let's say they average a dedicated engineer for each of their 1000 largest markets/locations. Let's assume another 200 across the countless smaller markets. Let's assume 50% overhead atop this for things like infra, tools, and management. These all seem like exceedingly generous estimates to me.

They actually had 5,000 engineers in the tokenmaxxing blog post. That's a lot of engineers for the rest of Uber's business activities.
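
Written out as a quick sketch, using only the figures from this comment (the variable names are just illustrative):

```python
# Back-of-envelope headcount estimate using the figures in the comment above.
largest_market_engineers = 1000  # one dedicated engineer per large market
smaller_market_engineers = 200   # spread across the smaller markets
overhead_rate = 0.50             # infra, tools, and management on top

integration_engineers = largest_market_engineers + smaller_market_engineers
estimated_headcount = int(integration_engineers * (1 + overhead_rate))

reported_headcount = 5000        # figure cited from the tokenmaxxing blog post
unexplained = reported_headcount - estimated_headcount

print(estimated_headcount)  # 1800
print(unexplained)          # 3200
```

Even with the generous assumptions, local integrations account for about 1,800 of the 5,000 engineers, leaving roughly 3,200 for everything else.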

reply
SoftTalker
1 hour ago
[-]
Can you provide an example? What is different about running Uber services in Chicago vs. Indianapolis?
reply
tpolm
44 minutes ago
[-]
Vegas: ordering a taxi "to a hotel" - hotels have different entrances, and pickup / dropoff there during crazy times is hard. The Uber UI for Vegas is unique / some features are designed to make it easier for driver and passenger to find each other.

Airports: different regulations, different rules for pickup/dropoff. Also scammers who pretend to be in a car, walk around the airport pick-up areas with their phones and do bait-and-switch (saw that in Istanbul SAW and in Dubai Al Maktoum).

reply
iLoveOncall
56 minutes ago
[-]
For example in Seattle you pay county fees, and then state fees, and then maybe special fees if you were picked up in the airport.

I took a ride from SEATAC to my hotel in downtown Seattle and besides the ride itself, there were 5 other items on the bill, 4 of which are specific to the place I used Uber.

Then I had the return trip from my hotel to SEATAC, on this one I got EIGHT items on the bill, on top of the ride fare. Some specific to Seattle itself, some specific to the road that the Uber took (a tunnel fee - which is different based on the direction you take it in), etc.

So the real question is what is NOT different between two locations. Less than 15% of the bill.

I also took Uber in India, where you have to share a one-time password with the driver for example, which I've never seen in any other country.

In some other countries the Uber app exists but Uber drivers are actually taxis, so you're actually ordering a taxi via the app.

reply
Groxx
35 minutes ago
[-]
Uber has also been public transit: https://www.theguardian.com/cities/2019/jul/16/the-innisfil-... (like actually public transit, not "lol they reinvented busses again" (though, lol, yes that too))

Essentially every single airport in the world is custom UI and custom walking path guides and pickup instructions, and rules for where pickups/dropoffs/etc can occur can change multiple times in a day, much to everyone's enjoyment. They're almost all private property, and are so valuable that whatever they want is what they get.

And food. Most/~all? major brands get custom integrations.

Hundreds (iirc) of identity verification providers, most or all custom, and constantly weighed against cost and accuracy because it ain't cheap and it ain't good but it is far better than none (both legally and ethically).

No idea how many payment sources they accept, but it's definitely a lot more than anyone who hasn't lived on 5+ continents thinks.

And remember that this is all international. So scale is huge and law changes are constant and frequently conflicting. Darn near every useful feature is illegal somewhere, at some time, for both good and bad reasons.

---

This is not at all to say I think Uber is efficient, clearly it is not. Not by an enormous margin. But there is a legitimate need for truly absurd complexity, because the world is not consistent. You see similar things happen anywhere [thing] tightly interacts with humans.

reply
SoftTalker
54 minutes ago
[-]
Ah local regulations and fees. Not so much the core service algorithms. That makes sense.
reply
zeroonetwothree
30 minutes ago
[-]
Use Link next time. Only $3
reply
MajorBee
35 minutes ago
[-]
There's an excellent HN thread that talks about this very question (that comes up on HN every now and then - what _does_ company X do that needs so many engineering resources?): https://news.ycombinator.com/item?id=25375921

TL;DR: Managing a taxi service (that's what Uber is in my mind, not whatever "ride share" means) that spans cities and states, never mind countries, is extremely complicated. To their credit, Uber manages to make it look simple to the end user, prompting such comments as "meh it's just a few screens how hard could it be", which is triumph of product engineering as far as I am concerned.

Related: this blog from Uber talks about the problem of serving market-specific configuration data at scale: https://www.uber.com/us/en/blog/how-we-unified-configuration...

reply
great_psy
1 hour ago
[-]
I think you’re missing how complex international operations and optimization are.

Each country has its own laws around what Uber is and isn’t allowed to do. This needs to be formalized in code. For example, you actually call a taxi through the Uber app, and the amount you pay is per mile, not a fixed fare decided ahead of time. To add to this complexity, some cities will have their own laws. What happens if you take an Uber from town A to town B, where each one has different laws? A lawyer probably has an answer, but the app needs to adhere to that. On top of that, laws change all the time.

Optimization, well you can always optimize something. speed, costs, paths etc. In a way this never ends.

I think the part we interact with as consumers is a tiny sliver of the complexity those services have to build and operate.

reply
bee_rider
1 hour ago
[-]
Weren’t they trying to do their own self-driving thing?

I think this is partly a problem with companies that have had heavy investment. Uber’s value isn’t based on what they are doing, it is based on the idea that they are going to render ideas like owning your own car or taking public transit obsolete (I mean that’s an exaggeration but less of one than it ought to be).

reply
redwall_hp
3 minutes ago
[-]
They shut it down after they killed a pedestrian. (They also got sued by Waymo for illegally acquiring trade secrets, and settled.)

https://en.wikipedia.org/wiki/Death_of_Elaine_Herzberg

reply
SoftTalker
59 minutes ago
[-]
AFAIK they gave up on doing self-driving themselves a while ago. I'm sure they are still hoping to be able to get rid of human drivers somehow.
reply
trollbridge
32 minutes ago
[-]
If they didn’t have human drivers, they’d have one less human to exploit per ride.
reply
zeroonetwothree
33 minutes ago
[-]
Well, there is a lot of ongoing maintenance cost. There are probably still some marginal gains possible on the matching side. There are new products to launch. So while one specific piece of software can mostly be finished, the total software of a company is always changing.
reply
dag100
1 hour ago
[-]
There are always newer technologies and techniques to be implemented. Better algorithms. Larger deployments. Better reliability. There are also almost always bugs to fix. So, so many bugs.
reply
darepublic
51 minutes ago
[-]
shiny new tools but people only want to use them on the same old problems. how can we innovate the development of crud apps even more?! that was what plagued the web dev landscape for some time. Constantly seeking newer lazier means of producing the same old product. I admit it has an allure but if companies are no longer constrained by dev effort / labour then they can only ponder their own reflection as the source of their failures.
reply
SpicyLemonZest
51 minutes ago
[-]
Uber is at a large enough scale that this analysis doesn't work. You and I do not care even a tiny bit about "Eats for the Way", one of their planned features this year (https://www.uber.com/us/en/newsroom/go-get-2026/) that lets Uber Black passengers specify that their car should arrive with their Starbucks coffee order. But if 0.01% of users order 1 additional ride a month because of this, that's about 200k rides a year, which may well be sufficient to justify the development costs.
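
As a sanity check on those numbers, this sketch backs out the user base they imply. The user count is derived from the comment's figures, not stated in it, so treat it as an assumption.

```python
# Back out the user base implied by the figures in the comment above.
share_of_users = 0.0001        # 0.01% of users
extra_rides_per_month = 1      # one additional ride a month each
rides_per_year = 200_000       # "about 200k rides a year"

# rides_per_year = users * share_of_users * extra_rides_per_month * 12
implied_users = rides_per_year / (12 * extra_rides_per_month * share_of_users)

print(round(implied_users / 1e6))  # implied user base, in millions
```

So the 200k figure assumes a user base in the range of 170 million.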
reply
BonoboIO
40 minutes ago
[-]
There is always a rewriting around the corner
reply
avidiax
10 minutes ago
[-]
AI for engineering productivity seems to be widely misunderstood to be a magic button that produces the same result, but faster and more cheaply. And based on that reasoning, you should want to force employees to tokenmax, because, why wouldn't you want to get more results but faster and cheaper?

A more nuanced view would be something like:

* AI lets you achieve your roadmap somewhat faster, but:

  * You incur tech debt that's similar to if you hired a dev temporarily for the features. You don't necessarily have someone on the team that understands the new code.

  * Similarly, you aren't upskilling your junior team members. So you aren't getting skill/wage arbitrage as much as before.

  * You will complicate the product. P2 features are P2 for a reason, but AI can cause them to be included and complicate the product for lower marginal gain.
reply
crorella
1 hour ago
[-]
Tokenmaxxing makes no sense. It is akin to writing extremely inefficient SQL / Spark jobs, full of cartesian joins, ultra-skewed datasets, etc., just for the sake of using as much compute / memory / IO as possible.

This always happens when the metric becomes the goal, companies should nurture and foster an environment where AI is used in the most efficient way possible, first asking "do we really need an agent for this" and if so, what kind of agent is needed, what model, reasoning level, etc.

They should also promote projects that aim at saving tokens, increasing cache hits, and codifying the information in such a way that it uses as little context as possible (graphs of knowledge are pretty good for this!)

reply
InsideOutSanta
1 hour ago
[-]
It's toddler-level logic. "You can achieve positive outcomes by using X. Therefore, we need to use as much X as possible to maximize positive outcomes."

It's like trying to win a race by setting a gas station on fire.

reply
HDThoreaun
43 minutes ago
[-]
Tokenmaxxing exists because executives think employees are resistant to change. That's it, a way to incentivize/force every employee to experiment with a new technology. Obviously once they think everyone is utilizing AI the tokenmaxxing stuff will end.
reply
loeg
32 minutes ago
[-]
Yes. Executives think, correctly, that employees are resistant to change.
reply
SpicyLemonZest
1 hour ago
[-]
The argument in favor of "tokenmaxxing" has always been that it's creating space for employees to freely explore the broad and novel space of AI-enabled workflows. I've seen a number of use cases where I'm skeptical any value is being produced, but a number of others where some team or another has finally solved a long-standing problem of theirs with an agentic workflow that would have been hard to justify to a cost review committee.

> They should also promote projects that aim at saving tokens, increasing cache hits, and codifying the information in such a way that it uses as little context as possible (graphs of knowledge are pretty good for this!)

My understanding is that most big "tokenmaxxing" companies do have teams who are working on this in the background.

reply
mchusma
43 minutes ago
[-]
I actually do think token maxing is good, but they should have limited it per user. I find it really hard to get people to max out the Claude $100 plan, let alone the $200 plan. I understand the enterprise plans are different and more expensive, which is how you get these kinds of issues. But encouraging people to try things with AI is very important, and some amount of token maxing is important.
reply
trollbridge
32 minutes ago
[-]
Man, it sure isn’t hard for me to max it out.
reply
SpicyLemonZest
24 minutes ago
[-]
It's not hard for most people now. 6 months ago when agents first started getting big, I genuinely didn't know enough about AI tools to understand how it was possible to use so many tokens, and I don't think I would have bothered to find time to learn without a kick.
reply
tquinn35
42 minutes ago
[-]
Who’s it important for?
reply
loeg
35 minutes ago
[-]
The business. Employees are hesitant to learn new tools that are very different from what they are used to, so if your business believes that AI is a productivity multiplier, it behooves it to incentivize individual employees to learn to use the tool.
reply
tquinn35
23 minutes ago
[-]
I think the key word is “believes”. There is no proof that AI usage improves productivity. Token maxing is essentially customers paying to try and prove a business’s unsubstantiated claim. The AI companies should be proving their claims themselves not the other way around.

I do think AI has value and is useful but the idea of token maxing is ridiculous.

reply
rr808
54 minutes ago
[-]
I have Opus 4.7 at work at 15x. Burns through tokens like water. It feels like one of these new mega datacenters is just for me. I'd love to know what the bill is, but we're just encouraged to do as much AI as possible.
reply
bachmeier
47 minutes ago
[-]
> Burns through tokens like water.

Pretty sure I know what you're saying, but the visual on this one doesn't match the point you're making.

reply
rr808
11 minutes ago
[-]
lol yeah I'm not a poet.
reply
3eb7988a1663
5 minutes ago
[-]
Just append a reactive metal. "Like water through sodium"
reply
loeg
30 minutes ago
[-]
2^30 tokens costs something like 2^10 dollars, order of magnitude, if that helps ballpark.
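
Spelled out as a rough sketch: the only input is the order-of-magnitude ratio above, real prices vary by model and by input vs. output tokens, and `estimate_cost` is just an illustrative helper name.

```python
# Order-of-magnitude token pricing from the ratio in the comment above.
TOKENS = 2 ** 30    # about 1.07 billion tokens
DOLLARS = 2 ** 10   # about $1,024

COST_PER_TOKEN = DOLLARS / TOKENS           # 2**-20 dollars per token
cost_per_million = COST_PER_TOKEN * 1_000_000

def estimate_cost(n_tokens: int) -> float:
    """Rough dollar cost for n_tokens at the ballpark rate above."""
    return n_tokens * COST_PER_TOKEN

print(f"{cost_per_million:.2f}")   # 0.95 dollars per million tokens
print(estimate_cost(2 ** 30))      # 1024.0
```

That is a bit under a dollar per million tokens, so a monthly bill in the thousands of dollars implies several billion tokens at this rate.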
reply
jhack
1 hour ago
[-]
Maybe don't use the most expensive models on the planet? Maybe use AI like a tool and not this black box that grants wishes?
reply
onlyrealcuzzo
1 hour ago
[-]
I think companies are reluctantly realizing that AI is not a magic genie in a bottle, and is instead a tool.

Still very valuable. They just need to have strategies that match what the tools are capable of - not strategies that involve "rub the magic lamp and increase profits 80%".

If the market is rewarding companies going after the "rub the lamp" strategy, they're going to say they're doing that to juice stock prices.

Maybe the market is finally realizing blindly spending billions on LLMs with almost no strategy is not a good strategy.

Who knows.

reply
dgellow
1 hour ago
[-]
Sounds like you want to be in the next round of layoffs?
reply
simonw
1 hour ago
[-]
I'd be interested to know if this is about individual employee AI usage, or use of AI tokens in production features, or both - and assuming both, what the split is.

I can see how Uber could burn unbelievable amounts of tokens if they start running internal features that run a bunch of prompts against every completed ride, or every customer profile, for example.

Or maybe this is about employee usage, but they introduced some stupid "you get evaluated on how many tokens you used" thing a couple of months ago when that was trendy and are just beginning to notice how much that cost?

reply
devin
1 hour ago
[-]
IMO, it's undoubtedly both.

The number of product teams who have shipped expensive-to-operate AI features is wayyyy up there, and for many of the scenarios I've seen, customers simply don't care or are unwilling to pay significant premium for access to it.

At the same time I'm starting to see some direction from people in leadership that I should "use the right model for the job" and things along those lines, which is a very, very different line from what I was hearing 12 months ago.

My continued prediction is that we are going to see a tweak on the SaaS model where the sweet spot moves to metered usage pricing of really fine-grained API-based access for apps which traditionally have been operated solely via the UI. Long term the trend is going to be "we'll house the data, enrich it, maintain it, provide fine-grained API access over it tailored to model usage, and you bring the model" with some services opting to give you the model interaction layer/harness. IOW I don't think SaaS is dead. Far from it. However, I do think that a lot of people are going to be looking to interact with SaaS apps via their own models with APIs that support those use cases better than a lot of those APIs do today.

reply
bilater
1 hour ago
[-]
The black pill that is coming that nobody is prepared for is that the value of a token varies greatly depending on the human. Companies will quickly find out it's much better to give your top 10% engineers a lot more tokens and lay off your average engineers. The 10x engineer will become the 1000x engineer.

Wrote about this and the impact on jobs here: https://x.com/deepwhitman/status/2058324179506831372

reply
cryo32
1 hour ago
[-]
Waiting for tokenedging next.
reply
postsantum
1 hour ago
[-]
^ Philip K. Dick's unreleased book title
reply
SecretDreams
1 hour ago
[-]
Is this when you type the prompt into the text window, but don't hit enter? Make the GPU see the message "x is typing"? Lol.
reply
FartyMcFarter
1 hour ago
[-]
As long as there's an RPC connection established and a partially sent request, I think it would count.
reply
dominotw
2 minutes ago
[-]
Tangent: does anyone have a Business Insider subscription? I feel like they've really stepped up their game in the last few years.
reply
InsideOutSanta
1 hour ago
[-]
"He said that, based on talks with Uber's senior engineering leaders, he realized higher token usage did not translate into a proportional increase in useful consumer features."

He's saying that like it's some grand epiphany and not the most self-evident, obvious thing I've heard this month. Some of the literal dumbest people on earth are in charge of these major companies.

reply
mmastrac
36 minutes ago
[-]
I am certain that the max sustainable boost from AI use -- with code review and otherwise all-in -- is approximately 20% with the appropriately skilled senior engineering talent, and the token budget for any engineer should not exceed that.

I do not believe that engineers who are tokenmaxxing are truly productive and I have not seen any evidence whatsoever (perhaps the opposite).

I've personally found that with the right flow and codebase knowledge, that's achievable with sustainable levels of effort.

reply
victor9000
42 minutes ago
[-]
Clearly they need more layoffs, and for that matter why keep anyone around? After all, AI will be writing 100% of code in 2026.
reply
mustaphah
54 minutes ago
[-]
Feels like they are debating internally whether to cut people or AI spending. Very healthy debate. Let's hope they spare people.
reply
chihuahua
1 hour ago
[-]
It's amazing that it took months to figure this out. "Well we thought that if engineers are told to maximize costs through AI use, to consume as much as possible of a resource that costs us money, then obviously good things will happen. Imagine my surprise when it didn't turn out that way."

Imagine if engineers were ranked based on their AWS spend. People allocate VMs and fill databases with terabytes of random bits, to get to the top of the AWS leaderboard. If you don't do this, you're ranked at the bottom, and good luck at the next review cycle. Who could have expected that this is not the road to success?

reply
this_user
1 hour ago
[-]
The point of this was always to explore what is possible with AI as quickly as possible. Obviously, there is going to be a lot of waste, but the 5-10% of employees who are truly thinking about it and discovering novel applications are what you are truly after. Because right now, you effectively have a giant, as of yet poorly explored space of potential uses.

Anyone who can find the actually valuable portions of the space early has a potentially huge competitive advantage. Even if the result of the experiment is the negative that AI is actually mostly not that useful, that is still extremely useful information in a time of great uncertainty regarding outcomes.

The bottom line is that this approach may be expensive, but if you have the money to burn, it's far from the worst strategy if you are trying to position yourself correctly for the future.

reply
adrianN
1 hour ago
[-]
What’s the huge advantage though? Adopting workflows that give big productivity gains is relatively easy even for big corporations. It’s only an advantage if you can keep it secret.

OTOH maybe we’re in for a future of patenting prompts.

reply
uejfiweun
1 hour ago
[-]
The thing I don't get though, is that most people just don't have that much work they need to do. I can use AI to pretty easily get my work done just via the regular chat interfaces. But because of the tokenmaxxing metrics that leadership tracks, I end up just having the AI deliberate for hours on random things just so that I can boost my token numbers. I think tokenmaxxing for the end goal you described is only realistic when the engineers are truly buried under a backlog of work.
reply
davnicwil
1 hour ago
[-]
I think unfortunately it's not about what seems obvious, or even what seems more likely, but about what seems retrospectively justifiable regardless of outcome.

The incentive structure of this type of decision is 'absolutely under no circumstances existentially mess up'. Ostensibly with respect to the organisation, but in actual reality much more so with respect to the individual(s) involved in the decision.

If everyone else is doing something that kind of obviously makes no sense, and you decide to break from the crowd by instead doing what does make sense, then there's a pretty solid chance of gaining a temporary edge while reality resolves the truth. But those gains probably won't matter all that much for the organisation, or indeed your position within it. It's a solid chance of an unimportant gain.

However on the other hand, there's a tail risk that something very unexpected happens and the thing everyone's doing that makes no sense actually turns out to make sense - sometimes even for entirely unpredictable incidental reasons - and then, well, you're in trouble. Not necessarily 'you' the organisation.. they'll likely be able to catch up and it won't matter that much. But for 'you' personally, the decision maker, it's very much not good.

As a bonus, in the much more likely scenario that the thing that makes no sense turns out to indeed make no sense, you're in the same boat as everyone else, there's no relative loss, and most importantly you don't stick out as someone who did something as risky as to go against the prevailing, albeit pretty clearly nonsensical, sentiment.

So basically, game theory tells you pretty quickly to just go with the thing that makes no sense if you're optimising for some (weighted) cross of what's best for the organisation and yourself as the decision maker.

reply
saghm
1 hour ago
[-]
Someday maybe Goodhart's Law will be intuitive to people making decisions like this, but not any time soon I guess
reply
dgellow
1 hour ago
[-]
> It's amazing that it took months to figure this out

We aren’t there yet, so far it is just a COO questioning the investment

reply
roxolotl
1 hour ago
[-]
The inability of leaders to understand Goodhart’s Law is always a sight to behold. They see a number go up and pat themselves on the back for how well their employees are making it go up without ever wondering if the thing they care about is happening.
reply
solenoid0937
1 hour ago
[-]
You say "amazing that it took months to figure this out" as if the answer to the question is obvious.

But it's not. Some FAANGs are doing amazing things with unlimited tokens. Other companies have no clue what to do with tokens, they've just told their engineers to max them.

It really depends on how you're using the tokens. If you're just using them for Codex and Claude Code - yeah, tokenmaxxing is incredibly dumb.

reply
saghm
1 hour ago
[-]
In other words, people who are productive get more done when you scale up what they're already doing, and people who aren't productive will not magically become productive when you scale up what they're already doing. That's incredibly obvious, because we've seen how this plays out repeatedly in so many different ways (lines of code, commits, tickets closed, etc.), and it has nothing to do with tokens or even programming, but just how trying to manage people works.
reply
steveBK123
1 hour ago
[-]
> Some FAANGs are doing amazing things with unlimited tokens. Others have no clue what to do with tokens.

Unlimited tokens is different from “use AI a lot or we will fire you, and we are counting token consumption as usage”. Obviously the latter is stupid and yet it was done in many places.

reply
SpicyLemonZest
21 minutes ago
[-]
I'm not convinced it actually was done in many places, although I understand why in a bad job market people don't trust that it isn't happening in secret. Every time I've heard of a token leaderboard or such it's come with a denial that the company is using it as an employee performance metric.
reply
morpheuskafka
1 hour ago
[-]
> But it's not. Some FAANGs are doing amazing things with unlimited tokens

Giving someone unlimited access to a resource is not the same as directing or incentivizing them to use it for the sake of using it, which is what the parent comment criticized.

As for the other FAANGs, Meta and Google have (not good but still) frontier models of their own, so they are very different from a company paying API costs per token.

reply
dgellow
1 hour ago
[-]
Where can I see those amazing things done by FAANGs?
reply
fsloth
1 hour ago
[-]
> Some FAANGs are doing amazing things with unlimited tokens.

Would love to know what things!

reply
SecretDreams
1 hour ago
[-]
Show me some FAANG that has made nice outward-facing products through a fully embraced AI workflow?

AI is an accelerator that engineers should know and have access to, but it's not something that should have mandated usage and quotas around it. It's also absolutely dangerous for young engineers and the like - it fundamentally denies you the "learning" aspect. I'm now seeing in interviews young graduates being given AI tasks to complete and they come back with a correct solution and no concept of how it is working.

You learn and reinforce learning by DOING and reading in depth. High level summaries don't teach anything and are the kinds of things only VPs care about. So, unless the intention in the future is for everyone to be a VP using AI to do the work, we need some middle ground here and some real thought around implementation of these tools or there's going to be a generational canyon gap of knowledge between being able to "say" and being able to "do".

reply
JackDanMeier
1 hour ago
[-]
At what point is there a difference between a burn rate and tokenmaxxing? Isn't it the same as during the dotcom bubble?
reply
mustaphah
52 minutes ago
[-]
Tokenmaxxing is so dumb. You should never show your team how exactly you're measuring their performance; people will optimize for the metric, not the actual performance.

Classic Goodhart’s Law: when a measure becomes a target, it ceases to be a good measure.

reply
rcvassallo83
1 hour ago
[-]
Oof, leaders of the bubble are starting to take a step back?
reply
matheusmoreira
23 minutes ago
[-]
LLMs are great, I can understand using them in general. I can even understand chasing 100% weekly usage if you're using the gacha-like subscriptions since that's how you get the most value out of what you paid for.

The way these corporations are going about it is completely insane though. They're essentially ordering their employees to set money on fire or be fired themselves. The more money you burn on tokens at insane API rates, the better an employee you are. Absolutely mind boggling.

reply
illithid0
1 hour ago
[-]
>"He said that, based on talks with Uber's senior engineering leaders, he realized higher token usage did not translate into a proportional increase in useful consumer features."

Goodhart's law strikes again at someone with enough power to be both ignorant of it and make others suffer their ignorance. You cannot simply measure productivity by tokens spent just like you can't measure it by hours spent in a chair at a desk.

reply
colechristensen
1 hour ago
[-]
You can measure productivity by hours spent at a desk?
reply
batch12
1 hour ago
[-]
You can measure attendance by hours spent at a desk
reply
devttyeu
1 hour ago
[-]
Well if you're a devshop just billing hours of mostly low impact work then hours are very much equal to productivity.
reply
saghm
1 hour ago
[-]
Next time you're going to work for an hour, ping me, and I bet I can surprise you with how much less productive I am than you
reply
epolanski
1 hour ago
[-]
Productivity is measured by economists in $/hour.

Which is why two identical jobs with the same real life output have drastically different productivity.

A nursing home in Luxembourg has 5 times the productivity of one in Romania despite the services being identical and tech-unrelated.

reply
phendrenad2
1 hour ago
[-]
AI productivity hasn't been well studied yet, but I'm betting that we'll end up with some variation on Price's Law, I.E. some small subset of workers get most of the benefit, while most just burn tokens with little to show for it.

I also want to call out the false productivity opportunities AI offers. There are whole teams building their own "gas town" and not shipping features.

reply
lorecore
1 hour ago
[-]
Not all tokens are created equal. It's easy to use a ton of tokens by having agents work together in parallel. That's basically the equivalent of people spending time in meetings, hardly a productivity win. As with everything in development, results matter, how you get there doesn't (unless you're a bad manager).
reply
irishcoffee
1 hour ago
[-]
I just realized my company is months behind this curve. About to blow my token allocation. Before I do, anyone have requests? Sincerely.
reply
kibwen
1 hour ago
[-]
I hereby suggest you take the fragmentary excerpts of the infamous erotic stage play The Lusty Argonian Maid shown in The Elder Scrolls series of games and extrapolate them to 100,000 additional full-length acts.
reply
paulpauper
1 hour ago
[-]
many of these leading AI companies are operating at large losses and subsidizing users with VC money. Profitability will entail having to impose greater limits and raising prices, so this will reduce to some degree the value proposition of AI compared to humans.
reply
7777777phil
1 hour ago
[-]
As soon as tokens stop being subsidized, heavy agentic use will become at least as expensive as paying an (entry level) employee. When this happens many companies will trade off heavy token usage for (maybe a bit slower, bit less accurate) employees again.
reply
Wowfunhappy
1 hour ago
[-]
DeepSeek is an open weights model. It's possible the hosted versions are subsidized, but we know what it costs to run locally. And it's expensive, but it's also pretty clearly cheaper than an employee.

Of course, the latest DeepSeek models are not as good as Claude, but they're not super far off either.

reply
amluto
16 minutes ago
[-]
When you use DeepSeek’s first-party API, you are giving them your token stream. This has some training value, but it also has incredible amounts of, well, business intelligence value. When you tell AWS your secrets or your customer data, you can be fairly confident they won’t abuse that knowledge. When you give this data to, say, OpenAI, they more or less promise not to abuse it if you’re on an appropriate business plan. If you give it to DeepSeek, even incidentally as something your agent reads, I would be quite surprised if DeepSeek doesn’t mine it for whatever purpose they or the government feel is appropriate.

The risk of letting your agent read .env goes far beyond the risk that the agent itself does something you don’t like with the contents.

reply
Wowfunhappy
14 minutes ago
[-]
But this shouldn't be a risk if you host the model locally.
reply
irishcoffee
1 hour ago
[-]
They're not far off, but getting the same seamless integration as hosted models is a full-time job. I think what just happened is that devops is about to explode. What will naturally follow is local hosting of all the things when people realize subscription costs for cloud-whatever are absurd.

Gitlab is going to take off? This is not investment advice.

reply
Wowfunhappy
1 hour ago
[-]
> What will naturally follow is local hosting of all the things when people realize subscription costs for cloud-whatever are absurd.

Even acknowledging we don't know exactly what costs would look like in a world without VC money, wouldn't hosting models logically be cheaper to do at scale in a data center?

When I compared to the cost of running DeepSeek locally, I meant that we can treat that cost as a price ceiling, not the floor.

reply
Groxx
46 minutes ago
[-]
Like how server hosting at scale in a datacenter is cheaper than running your own datacenter? Despite ~every company consistently concluding that hosting their own stuff is several multiples cheaper?

No, I think local stuff using also-useful-for-other-things hardware will vastly undercut cloud hosting when the free money pipeline shuts down, and will stay that way for roughly forever. That doesn't mean cloud stuff isn't useful, clearly it is, but adding another company in the middle is rarely the solution for reducing costs.

reply
stult
1 hour ago
[-]
You're assuming the price won't come down as the tech matures. That seems like a big assumption, considering how quickly open weights models are catching up to frontier models, and how little effort has been invested so far in optimizing inference costs.

It's especially a crazy assumption to make relative to the costs of employing a human. The costs of paying an entry level employee are unlikely to go down at all, and even if those costs do decline, there's a floor they can't drop below (minimum wage at the extreme end), whereas companies are free to optimize agentic costs as close to zero as possible.

So you are assuming that a cost which is extremely susceptible to optimization but which no one has yet seriously attempted to minimize will remain perpetually above a cost which is much less susceptible to optimization, is already subject to enormous efforts to minimize, and has a legally mandated floor. That seems like a bad bet.

reply
skybrian
1 hour ago
[-]
Maybe this just counts as “light use” since I’m a hobbyist programmer and I only run one coding agent session at a time, but I get about as much done as I did back when I was working while spending a lot of time browsing the Internet, etc.

I’ve spent $10-$20 a day using Claude to write code and closer to $5 a day now that I mostly use Deepseek and GLM, using API pricing (no subscriptions) since I don’t use Claude Code.

This is a rounding error for a company. So I think there’s plenty of room to use AI extensively while being more cost-conscious.

reply
kingstnap
59 minutes ago
[-]
A significant caveat is that there is a pricing mismatch that lets first parties subsidize quite heavily.

Agents are expensive in large part because tool calls require round trips. It's because these APIs are stateless and not streaming, so you have to resend the whole context each time. This means you have roughly #tool calls x 1/2 context size cached input tokens over any given session. Most API providers overcharge you by a huge amount for cached tokens. An exception being Deepseek. Paying OpenAI $0.05 for 100k cached GPT5.5 tokens during a possibly 2 second round trip agent tool call is like paying $100/hr for what is likely to be ~10 to 20 GB of VRAM residence (holding the KV cache).

Or it got offloaded to NVME and you are paying $0.05 for that much PCIe bandwidth.
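
A rough sketch of the arithmetic above, in Python. The $0.05 per call and 2 second round trip are the figures quoted in this comment; the session size (50 tool calls, 200k final context) and the function names are hypothetical, picked only for illustration.

```python
def cached_tokens_per_session(tool_calls: int, final_context_tokens: int) -> float:
    """Approximate cached input tokens resent over one session.

    Assumes the context grows roughly linearly, so the average resend
    is about half of the final context size.
    """
    return tool_calls * final_context_tokens / 2


def implied_hourly_rate(price_per_call: float, round_trip_seconds: float) -> float:
    """Convert a per-call cached-token charge into an equivalent $/hour rate."""
    return price_per_call * 3600 / round_trip_seconds


# Hypothetical session: 50 tool calls, ending at a 200k-token context.
tokens = cached_tokens_per_session(tool_calls=50, final_context_tokens=200_000)

# Figures from the comment: $0.05 per call, ~2 s round trip.
rate = implied_hourly_rate(price_per_call=0.05, round_trip_seconds=2.0)

print(f"cached tokens resent: {tokens:,.0f}")
print(f"implied rate: ${rate:.0f}/hr")
```

With those inputs the implied rate comes out around $90/hr, which is in the same ballpark as the $100/hr figure.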

reply
fredley
38 minutes ago
[-]
I think if local models catch up with current SOTA then that might not happen. Either way, I don't think the long-term outlook for OAI, Anthropic etc. really holds up.
reply
helloplanets
1 hour ago
[-]
More straightforward to talk about the hardware directly. Full Kimi K2.6 needs an 8x H200 node to run and serve around 20 heavy users. You can rent an 8x H200 node for around $30/hr.

I'd imagine GPT-5.5 and Claude Opus 4.7 could run just fine on a 16x H200 node and serve at least 10 heavy users without the token output getting choppy.
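
As a back-of-the-envelope sketch in Python: the $30/hr node price and 20 heavy users are the figures from this comment, while the 160 working hours per month and the function name are assumptions made only for illustration.

```python
def cost_per_user_hour(node_price_per_hour: float, heavy_users: int) -> float:
    """Split the hourly rental price of one node evenly across its users."""
    if heavy_users <= 0:
        raise ValueError("heavy_users must be positive")
    return node_price_per_hour / heavy_users


# Figures from the comment: 8x H200 node at ~$30/hr serving ~20 heavy users.
per_user_hour = cost_per_user_hour(node_price_per_hour=30.0, heavy_users=20)

# Assumption: ~160 working hours per month, node rented only while in use.
HOURS_PER_MONTH = 160
per_user_month = per_user_hour * HOURS_PER_MONTH

print(f"${per_user_hour:.2f} per user-hour, ${per_user_month:.0f} per user-month")
```

This ignores idle time, batching efficiency, and ops overhead, so treat it as an order-of-magnitude estimate only.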

reply
saghm
1 hour ago
[-]
What's funny is that this apparently wasn't something that the Uber COO seemed to think about when their company is arguably one of the most successful ever at the "subsidize to drive down costs until you capture nearly the entire market" strategy.
reply
cryo32
1 hour ago
[-]
This is what I’m betting on.

The financials don’t make sense now. Based on the expenditure the finances won’t ever make sense.

reply
BadBadJellyBean
1 hour ago
[-]
I have been saying the same for a while. Someone always says "but Anthropic is making money on their API" or "But inference will get cheaper". But I don't believe it. First of all, the investments have to be paid off at some point, and second of all, there are other things that cost money. I don't believe that any of them have a positive balance sheet.

I also don't think that blitz scaling will work like with Uber. The engineers are still there. We can work without the LLM tools.

reply
solenoid0937
1 hour ago
[-]
If by "investments will pay off" you mean major profits, that's never going to happen as long as scaling laws hold. All revenue will just go to financing more compute, and either we hit AGI or have the greatest economic collapse in modern history.

The world will look drastically different 5 years from now; for the better or worse, so save every penny (especially if you work in tech).

reply
Rohunyyy
1 hour ago
[-]
Now we are going to get a new profession. Token Engineer! They will be experts on tokenmaxxing! The job growth that the billionaire CEOs promised us from AI is finally here!
reply
fsloth
1 hour ago
[-]
Well there are already offerings like githits (https://news.ycombinator.com/item?id=46105112) that sort of promise to optimize the bang-per-buck of inference
reply
yapyap
1 hour ago
[-]
wtv
reply
pocksuppet
1 hour ago
[-]
what the fuck is this timeline I am stuck living in
reply
nekzn
1 hour ago
[-]
It’s funny that “maxxing” entered the common vocabulary.
reply
chihuahua
1 hour ago
[-]
If you're not tokenmaxxing, you're getting tokenmogged on the AI leaderboard, and your next review ain't gonna be pretty.
reply
internet2000
1 hour ago
[-]
A good 80% by volume of the modern vernacular is 4chan language that got sanded down.
reply
nekzn
1 hour ago
[-]
Sanding down is how we got goyslop turned into slop.
reply
harvey9
1 hour ago
[-]
Slop is a word in its own right which got the goy prefix later in life.
reply
amirhirsch
1 hour ago
[-]
I like this too. I have been intentionally -maxxingmaxxing to get the meme out there. It's a good canary to sort out who gets the spicy takes from the pedestrians who probably still copy-paste into the ChatGPT web app like a psychopath.
reply
gigatexal
1 hour ago
[-]
I find it useful enough that if they cut the use altogether I will pay for it out of pocket.
reply
dghlsakjg
1 hour ago
[-]
Would you decide its usefulness based on how high the bill is, or how many things you get done while using it?

The former is the issue, and how many companies have been operating. It's like a trucking company ranking driver effectiveness by fuel used instead of by cargo moved.

reply
sottol
1 hour ago
[-]
Maybe that's the plan :)

But on a more serious note, do we know how much Uber spent per technical employee/month? I assume it is far more than even any of those $200 "max ai" plans.

And the other question is how much the public would be willing to spend, in my estimation this is as "cheap" as it will ever get (main-stream at least).

reply
KronisLV
1 hour ago
[-]
> I assume it is far more than even any of those $200 "max ai" plans.

Am in a random small company, colleague spent 100 EUR a day on Sonnet through AWS Bedrock (needed to use an EU region). Paying for tokens will get you in a deep hole financially compared to any of the subscriptions, unless it's like DeepSeek or one of the other models that are priced a bit better, though that's also a tradeoff in what they can/cannot do and also where the data goes. Ended up trying out the Mistral subscription for the US stuff btw, it was fine.

reply
Marciplan
1 hour ago
[-]
bigCo’s don’t get to do the $200 Max plans, they have unlimited plans but get charged like API
reply
sottol
1 hour ago
[-]
Exactly. But I did find an article ([1]) and spend doesn't seem that high per engineer ($150 to $250 per eng) - at least on average, I assume the costs were skyrocketing towards the end.

> Adoption climbed from 32 percent of engineers in February to 84 percent classified as agentic coding users by March. By spring, 95 percent of Uber engineers used artificial intelligence tools monthly, and roughly 70 percent of committed code originated from those tools. About 11 percent of live backend updates were written by agents with no human in the loop, according to Uber's own disclosures.

> The numbers behind the spend are what make the story instructive rather than anecdotal. Monthly cost per engineer ranged from $150 to $250 on average, with power users running between $500 and $2,000.

My guess is that the reason to rethink AI-spend was probably the exponential growth in cost over time, and tokenmaxxing payoff not being immediately obvious as mentioned in the article.

[1] https://www.forbes.com/sites/janakirammsv/2026/05/17/uber-bu...

reply
mattlondon
1 hour ago
[-]
Probably long term each dev gets their own GPU and runs a model locally I expect. Seems like a more sustainable approach, even if a local model is not absolute SOTA.
reply
ianm218
28 minutes ago
[-]
GPUs are much more efficient at parallelizing requests for LLMs so it's going to be much more efficient to centrally host. Maybe for big companies it would make sense to get their own though.
reply
iwontberude
1 hour ago
[-]
Except you won’t because they will threaten to fire you and force you to route all of your AI through a data protection proxy to stop exfiltration by filtering and tracking prompts/response tokens.
reply
egypturnash
1 hour ago
[-]
Uber COO says he just decided to short a bunch of AI company stock.
reply
epolanski
1 hour ago
[-]
Slightly ot, but I really dislike this reddit WSBization of HN.

Adds nothing insightful to these discussions.

reply
cwillu
1 hour ago
[-]
“Please don't post comments saying that HN is turning into Reddit. It's a semi-noob illusion, as old as the hills.” --hn guidelines (there are links to examples in the original)
reply
noman-land
1 hour ago
[-]
It's unfortunately the WSBification of the entire society.
reply
hmokiguess
1 hour ago
[-]
Why do we keep doing this? It's the same as measuring by LoC, we know it's not gonna work. Also, see Goodhart's Law[1]

- https://en.wikipedia.org/wiki/Goodhart%27s_law

reply