Uber's $1,500/month AI limit is a useful signal for AI tool pricing
124 points
by pdyc
6 hours ago
| 20 comments
| simonwillison.net
| HN
ValentineC
35 minutes ago
[-]
> I noted that my own token usage comes to about $1,000/month against each of Anthropic and OpenAI - which currently costs me just $100 per provider thanks to their generous subsidized plans for individual subscribers.

Do we know that AI providers are going to keep these per-token prices, or eventually lower them because of competition from China?

Many lower-budget individuals are now moving to Chinese open-weight models like DeepSeek. I wonder if China's really subsidising the providers, or if inferencing costs are actually much lower, and Anthropic/OpenAI are just making sure no money's left on the table for their eventual IPOs.

reply
freediddy
4 minutes ago
[-]
Most sane US companies will disallow use of cloud-based Chinese AI providers, because everything including code, data, PII, etc is being sent to them.
reply
testdelacc1
30 minutes ago
[-]
Per token costs will fall, but the harnesses will get more token hungry. Instead of just centering the div it’ll spin up a battery of agents to architect, critique, advise, code, review, refactor and so on.
reply
sevenzero
23 minutes ago
[-]
I wish I could disable most of these. I already hate all the "oh you're actually right, let me fix that" nonsense. Then it proceeds to burn 50k tokens on the git history instead of copying logic A from a different part of the codebase to logic B, where I want that exact logic without having to write the boilerplate myself...
reply
apsurd
15 minutes ago
[-]
Makes me think of how my Claude.md file specifies using the built-in framework code generators (Rails). Those generators are deterministically right every time.

I wonder how often the Agent actually follows the guidance. I do see them follow it when I look. But it doesn't seem so every time.

reply
thefunnyman
3 minutes ago
[-]
This is tricky since it can and will ignore your md directions. When possible I try to lean on tool call hooks or skills that invoke deterministic scripts. The more you can remove the "choice" the better, though there's still a lot of randomness in how reliably it invokes skills ime.
reply
sfn42
9 minutes ago
[-]
A lot of the time if you're copying code from one place to another what you actually want to do is abstract it so you can reuse it in both places.

The LLM can easily do this type of stuff, just tell it and it'll happily do it. This is exactly what I mean when I tell people they need to work closer with the AI, tell it how to do things. Don't just tell it what to do and get frustrated when it does it differently than you would.

A good way to achieve this without writing huge prompts is tell it to plan the change first. Just give it some vague low-effort directions. It'll usually get most things right, you tell it what you want different and once you're happy you tell it to go ahead.

reply
sevenzero
24 seconds ago
[-]
Nah the codebase is legacy fucked and I can't be bothered to try and optimize business flows without the fear of other stuff breaking.

Claude 100% of the time even thinks we use Laravel despite the project being some old Lumen codebase, so most of Laravel's features are not available. It also gets the PHP version we are using wrong 100% of the time.

reply
cyanydeez
19 minutes ago
[-]
I'd be amazed if any American business sends data to China
reply
linkregister
12 minutes ago
[-]
HuggingFace offers DeepSeek as one of its models— it's pretty simple to spin up instances under your control.

I'm not sure about OpenRouter but I wouldn't be surprised if they offer a US-based provider of DeepSeek.

For reference, Cursor has their first own light fork of Kimi that they use as their baseline coding and review model.

reply
alpinisme
11 minutes ago
[-]
“Any” is a very high bar. Unless laws prevent it, I don’t see why a substantial minority wouldn’t buy services from where they can get them at a similar quality and much lower price.
reply
SecretDreams
25 minutes ago
[-]
> Do we know that AI providers are going to keep these per-token prices, or eventually lower them because of competition from China?

I genuinely do not know how prices can get lower from the current major providers in NA without the whole market collapsing. Everyone is spending copious amounts of money to presumably make more money back.

reply
tuesdaynight
12 minutes ago
[-]
Why are there so many people who still believe that AI coding is a fad? It's something that started less than two years ago and companies are already paying thousands per seat. I know one that gives you 5k per month. Which other tool went from nothing to this level of acceptance so quickly?
reply
anthonypasq
2 minutes ago
[-]
perhaps the personal computer? Companies were spending 3-5k (10-15k inflation adjusted) on every employee for just hardware.

everyone making comparisons to the dotcom bubble seems misguided. this is clearly computing 2.0 imo

reply
f311a
5 hours ago
[-]
How many more months do we need to wait, until big companies realize that flash models work just fine if you:

1) Don't ask LLMs for big changes

2) Review everything and point them in the right direction

Large models still suck at big changes, they produce questionable architecture and you still have to review the code, if your project is serious enough.

The codebase quickly becomes a mess if you don't pay enough attention. It does not matter which model.

So why bother with big models, when flash models are 10x cheaper and much faster to iterate under guidance? Large models can be used for security and bug audits. Flash models work almost the same for changes under 300 LOC when you dictate how you want your code to look.

reply
econ
2 minutes ago
[-]
I wonder to what extent models should figure out which model to forward a query to. Or perhaps the big models could learn the difference between an easy and a hard question and charge accordingly? Perhaps, if it can measure complexity, even generate a quote?

Small models are fine for small coding tasks but I don't see why big ones can't be broken down most of the time.

reply
warmwaffles
27 minutes ago
[-]
> Don't ask LLMs for big changes

> Review everything and point them in the right direction

Sorry upper management doesn't care. That's an engineering problem that you need to solve.

reply
eikenberry
17 minutes ago
[-]
He was proposing a solution. To use flash models and use them in a way that best amplifies your work.
reply
CharlieDigital
5 hours ago
[-]
$1500/mo is $18,000/seat/annum.

Maybe Microsoft and Nvidia are on to something.

128 GB machines that can run local LLMs are a bargain even if priced $5-8k. Yes, tok/s is not quite there, but that's probably OK since the bottleneck really isn't the code; it's WTF did Uber build with all of that spend? How did it meaningfully impact their revenue in a positive direction?
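For a rough sense of that comparison, here is a back-of-the-envelope sketch using only the figures above (the $1,500/month cap and a $5-8k machine). It ignores power, maintenance, and any quality gap between local and hosted models, and the function name is made up for illustration.

```python
MONTHLY_CAP = 1500  # USD per seat per month, the cap reported in the article


def breakeven_months(hardware_cost: float, monthly_spend: float) -> float:
    """Months of API spend that add up to a one-time hardware purchase."""
    if monthly_spend <= 0:
        raise ValueError("monthly_spend must be positive")
    return hardware_cost / monthly_spend


annual_per_seat = MONTHLY_CAP * 12  # 18000 USD per seat per year

for price in (5000, 8000):
    months = breakeven_months(price, MONTHLY_CAP)
    print(f"${price} machine = {months:.1f} months of capped spend")
```

On these numbers alone, the hardware pays for itself in roughly three to five and a half months of capped spend, which is only meaningful if the local model can actually do the same work.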

reply
pqtyw
41 minutes ago
[-]
How is tok/s not a bottleneck? I assume most people still use AI agents interactively rather than leaving them to do their own thing during the night.

I find anything below 50 tps or so entirely unusable...

Regardless, it's apples to oranges anyway: inference is quite cheap for open-weight models; it's just that Claude and OpenAI can charge very high margins compared to e.g. DeepSeek or the various providers on OpenRouter, since open models are a commodity.

reply
brianwawok
31 minutes ago
[-]
I start up 4 or so projects then go do other things for 4 hours. I don’t have enough energy to steer overnight, but I’m at least “semi afk” for daytime steering. So throughput is king for me, tokens per hour. Not latency or actual tokens per second.
reply
smallerize
14 minutes ago
[-]
Running locally is even worse for this, because if you're running 4 jobs at once they just run at 1/4 speed. Not literally, you can make up some of the difference with batching, but you have limited resources instead of spreading your requests out on an API provider's nodes.
reply
zozbot234
3 hours ago
[-]
I agree on the basic point, but running $1500/mo's worth of SOTA local AI is non-trivial already, and that's a figure for a single seat. That's equivalent to generating at least 20 tok/s on a 24/7 basis, in fact probably quite a bit more than that (because open-weight models are vastly cheaper than proprietary ones even when served from reputable Western providers - reaching the same spend would take around 100 tok/s or more, which is well within datacenter hardware territory).

You could probably reach the former figure on a prosumer platform but only for very special workloads. If you spend a lot of time on prefill (which is common for agentic workloads) the outlook is even worse since that's a significant constraint for any on-prem AI.
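The comment does not state the per-token price behind those figures, so this sketch only inverts the arithmetic: given a sustained generation rate, what blended price per million tokens would add up to $1,500/month? It assumes a 30-day month, continuous 24/7 generation, and a single flat price for all tokens, none of which comes from the thread.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: a 30-day month


def tokens_per_month(tok_per_s: float) -> float:
    """Tokens produced by generating continuously at tok_per_s."""
    return tok_per_s * SECONDS_PER_MONTH


def implied_price_per_million(monthly_spend: float, tok_per_s: float) -> float:
    """Flat USD price per million tokens that makes the month cost monthly_spend."""
    return monthly_spend / (tokens_per_month(tok_per_s) / 1_000_000)


for rate in (20, 100):
    price = implied_price_per_million(1500, rate)
    print(f"{rate} tok/s for a month = {tokens_per_month(rate):,.0f} tokens, "
          f"about ${price:.2f} per million tokens")
```

Under those assumptions, 20 tok/s corresponds to roughly $29 per million tokens and 100 tok/s to roughly $5.80 per million.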

reply
Buttons840
42 minutes ago
[-]
I think companies will eventually just buy a local AI server.

Using local hardware is expensive when it's running a complicated software stack that can break in 10,000 different ways.

These eventual local AI servers will just talk some protocol for AI and sit in the corner and nobody will think about them.

I guess they still might need access to various systems, so idk. Eventually I think someone will offer "AI in a box" though, running the latest open model or whatever.

reply
pm90
5 minutes ago
[-]
Yep, it’s already quite easy to do so with tools like opencode/openrouter. I’ve used some open source models and they seem … ok? I’m not doing foundational math, just refactoring code, understanding existing code etc. I don’t see a future where companies blow 11% of employee compensation on a single tool; the hosted AI server + oss models will 99% win out.
reply
dangus
9 minutes ago
[-]
I don’t think companies will do that. Why don’t they just buy local on-premise infrastructure even though it’s cheaper than AWS?

“AI in a box” sounds a heck of a lot like “the box” from the Silicon Valley TV show. Or the Google search appliance. Or name any other on-premise thing that is equally dinosauric.

The real finding of this article is that AI tokens are direct competitors with offshoring. $1,500/month buys you a whole employee in India.

And this is before AI companies inevitably increase pricing after the conclusion of the growth phase.

reply
pm90
2 minutes ago
[-]
> I don’t think companies will do that. Why don’t they just buy local on-premise infrastructure even though it’s cheaper than AWS?

For customer-facing production software, it’s worth paying a cloud tax to get the reliability guarantee. For tools that are used by engineers for code development, there is no need for such bulletproof guarantees.

reply
empath75
8 minutes ago
[-]
I think probably the correct spend is something closer to 10x that if people can figure agent coordination problems out. It's not even really about capability at this point, it's about keeping track of what agents are doing.
reply
dkdcdev
5 hours ago
[-]
at their scale they could also just run a large on-premise or rented (basically still cloud, but cheaper) GPU cluster and run through that. fixed costs, even license a SOTA model’s weights if you’d like
reply
embedding-shape
5 hours ago
[-]
> even license a SOTA model’s weights if you’d like

Yeah, I bet all labs releasing SOTA models are more than happy to remove the main way they make money and let you run it locally, especially if you're a big spender like Uber who seems very willing to throw money into the sea as an experiment.

reply
throwway120385
5 hours ago
[-]
That's going to stop eventually, and I think at that point we're going to see business models more like the major CAD providers.
reply
idiotsecant
5 hours ago
[-]
I don't think they'll have a choice, open weights models are not far behind. At some point it's essentially a commodity game
reply
dkdcdev
5 hours ago
[-]
they also already do this…

Anthropic and OpenAI license to the public clouds. Google reportedly licenses to Apple. licensing to Fortune 100 companies running on their own infra is an obvious next step

it is a race to the bottom and I’m not sure the labs win that race. we’ll see!

reply
mrweasel
2 hours ago
[-]
The problem isn't really Uber, Microsoft or Nvidia, it's all the smaller non-IT companies that also have developers on staff. They are screwed. $1500 per seat per month is just way too expensive, but they also can't afford to build and maintain their own on-premise solution. If Microsoft can't afford to run CoPilot for their own developers, what chance do any of their customers stand?

If the large, well-funded IT companies in the world believe the current AI cost is too high, then Anthropic, OpenAI and CoPilot have no actual customer base. AI is then relegated to a very profitable niche business, but that can't fund the R&D for the models.

reply
treis
55 minutes ago
[-]
There's models for every price point. What was SOTA and stupid expensive to run a year ago is a cheap flash model today.
reply
skybrian
59 minutes ago
[-]
It's an extra 18k a year for developer tools when they're paying how much a year per developer? Having software developers at all isn't cheap.

Also, I don't believe you need to spend $1500 a month on a coding agent if you optimize usage at all.

reply
mrweasel
8 minutes ago
[-]
That depends on where you are. $18K is the equivalent of paying around 15% more for your developer.
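To make the percentage concrete, this sketch backs out the compensation implied by treating $18k/year as a given share of pay. The 15% comes from this comment and the 11% from another comment in the thread; the function name is made up for illustration, and nothing here says what any particular company actually pays.

```python
ANNUAL_AI_SPEND = 1500 * 12  # USD per seat per year, from the article's cap


def implied_compensation(annual_spend: float, share: float) -> float:
    """Annual compensation at which annual_spend equals `share` of pay."""
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")
    return annual_spend / share


for share in (0.15, 0.11):
    pay = implied_compensation(ANNUAL_AI_SPEND, share)
    print(f"{share:.0%} of pay -> about ${pay:,.0f} per year")
```

So the 15% figure corresponds to a developer costing about $120k a year, and the 11% figure to about $164k.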
reply
ecshafer
35 minutes ago
[-]
$18k a year is a non-starter in most companies. I've seen companies balk at IntelliJ.
reply
mvdtnz
44 minutes ago
[-]
Why are smaller non-IT companies "screwed" because they can't pay out the nose for their developers' AI usage? They're non-IT companies, developers are presumably not on their critical path, or not their bottleneck. Developers can keep on writing code the old way, or doing it with a more reasonable AI spend. I don't see how this "screws" any company.
reply
mrweasel
18 minutes ago
[-]
That was badly worded on my part; my intent was to indicate that there is no way they can or will pay $1500 per month per seat.
reply
jvanderbot
5 hours ago
[-]
Right - the future of LLMs is like ol' windows XP+Dell. Commercialized "things" you run locally offline, co-designed with hardware, with a known productivity suite, and large businesses building the next generation thing and suite with 18mo release cycles (ish).
reply
gedy
3 minutes ago
[-]
There's waayyyy too much money betting on that not happening, to the point I feel there'll be regulations popping up for "safety reasons" etc to ensure the big players control this.
reply
treis
1 hour ago
[-]
I don't see it. Leasing equipment and paying per seat license fees makes a lot of accounting and cash flow sense. Maybe when it gets to the point where you can run SOTA LLMs on consumer hardware. But that seems a solid decade and probably much more away.

Even then it makes more sense to rent the bigger GPU and get your answer faster.

reply
nonethewiser
5 hours ago
[-]
XP? I can see the argument for enterprise support but in that case the latest Windows OS is going to be virtually free and I don't know if MS and Dell etc. would even support an XP machine. Might even be required for hardware. If no enterprise support, wouldn't Linux make a lot more sense?

I get that if it's offline the security downside of XP doesn't matter, and I assume XP is free, but being free doesn't really seem that valuable compared to alternatives (free Linux and virtually free OS if buying wholesale).

reply
jvanderbot
5 hours ago
[-]
"Windows XP+Dell" should have been in quotes. It's similar to the way enterprise productivity software was developed, packaged, co-designed with hardware, and sold on an 18mo upgrade cycle assumption. It's not literally Windows XP.
reply
nonethewiser
1 hour ago
[-]
Oh gotcha. Yeah that's an interesting idea.
reply
darkwater
5 hours ago
[-]
> it's WTF did Uber build with all of that spend?

You can ask the same for the median 330k salary in the US for Uber Engineering... and being a bit snarky, attending Uber engineers' talks here and there at a few conferences, it looks like they love to (re)invent internal tooling/platforms. That's pretty expensive on its own.

EDIT: I'm not saying that Uber's engineers didn't add value to the company, they absolutely did and handling the scale up they had to handle is not an easy feat. But I do challenge the notion of "what features did they create with that (LLM) spending?" of GP.

reply
SlinkyOnStairs
5 hours ago
[-]
> You can ask the same for the median 330k salary in the US for Uber Engineering

People DO.

It's well known that most tech companies are run incompetently. As you say, it's not the engineers' fault.

But most projects and hiring in these companies exist to juice promotion criteria. And, depending on perspective, these companies are either massively overstaffed or massively underproductive.

The comparison to AI spending being wasteful holds up pretty well, these are companies that readily piss away billions in pointless spending.

reply
FergusArgyll
50 minutes ago
[-]
This is a very good answer but there's a flip side too.

The idea of "if you add intelligence you make more money" is contradicted by the fact that companies don't just always hire more people. Why doesn't Google just hire everyone?

reply
CharlieDigital
5 hours ago
[-]
This is what all "platform engineers" have to do once things are working nicely: you have to keep inventing work.

I don't know; I'm a Ron Popeil "set it and forget it" kind of guy. Make the dumbest, simplest thing that's going to work with some clear path for scaling. Then go do valuable things instead.

reply
darkwater
5 hours ago
[-]
But most Platform Engineering teams in smaller companies (and especially non-US) add a layer on top of existing technologies. A layer that usually maps to the specific culture and idiosyncrasies of that company; a bit like the deployment flow which is usually very specifically shaped on how a company is.

But in Uber's case, they tend to reinvent lower level pieces of platform/infra.

reply
throwaw12
5 hours ago
[-]
you don't get promotion for supporting existing things, but for "inventing" you can get promoted. also for large migrations
reply
ungreased0675
5 hours ago
[-]
Your last question is really important. What did they accomplish with all that spend?

I suspect there’s some mass delusion with respect to actual accomplishments as a result of LLM use. Sure, things are moving faster, but does it matter?

reply
infecto
5 hours ago
[-]
I am wondering more and more if this becomes true as these smaller models take off. I might be old fashioned but I have yet to crack the workflows some of the hype people spout, like Claude Code's Boris, where he and others talk about running hundreds of agents overnight.

I have still found the sweet spot for me is using LLMs but I am still in the driver's seat.

reply
CharlieDigital
1 hour ago
[-]
That's because for some of these folks, the cost of the tokens doesn't have to match the value of the output; the hype from the story is all they need.

Normal people have to produce something of value from that spend. So starting 100 agents and then waking up to something cool but useless just means you spent a few thousand dollars and created nothing of value...

reply
ofjcihen
4 hours ago
[-]
Running hundreds of agents overnight is almost certainly 99 percent waste.
reply
devttyeu
5 hours ago
[-]
If you believe a 128gb machine that is essentially DGX Spark in a laptop chassis can run models comparable to SOTA you either never ran open models on hard tasks, or you aren't scratching the surface of SOTA closed LLM capability in how you're using them.
reply
f311a
5 hours ago
[-]
Can you show me an example of a hard task that can't be achieved using light models? When we don't want the model to work on autopilot without reviewing the code at all. Even SOTA models will produce garbage code, if you don't guide them all the time.

Hard tasks require a lot of guidance and code reviewing, unless you are creating another throwaway project where correctness, maintainability and code understanding do not matter.

reply
sourcecodeplz
5 hours ago
[-]
$1.5kpm for SOTA. 128gb you run DSV4 Flash.
reply
pqtyw
35 minutes ago
[-]
What's the point of running it locally though? Inference for open models is quite cheap already. They could just selfhost, anyway. The experience of running LLMs locally will be excruciatingly bad in comparison at least for the near future.
reply
jcgrillo
5 hours ago
[-]
> WTF did Uber build with all of that spend?

WTF did anyone build with all that spend? Despite all the feel-good anecdotes about how productive folks feel using ai coding tools there's a deafening silence when it comes to actual, demonstrated efficacy. How can we be this far entrenched in these workflows and still not know whether they actually do anything useful?

reply
ftkftk
46 minutes ago
[-]
~70 FTE Engineering team. We are shipping more features, especially features that previously would not have survived the cut to make it on the roadmap. Even though we are shipping more, our total amount of escaped bugs has not increased, so our escape rate has actually lowered. On top of that we are able to triage and fix escaped bugs more quickly now. And then of course there has been an uptick in internal tooling that makes the rest of the company more efficient, and we have been able to address tech debt at a higher rate than before.

I don't think this would have been possible without having solid engineering culture and processes in place before bringing in ai coding tools.

And I don't want to sugarcoat it, this hasn't been easy, requires continued discipline, and took well over a year to get good at. And we still have to continuously learn, experiment and adapt our training, tooling, and processes.

reply
awesan
5 hours ago
[-]
I can say at least for me at a small-ish company (~40 FTE) there has been a surge in internal productivity tools. Nothing to improve the end user product directly but a lot of tools to make processes easier and less error prone.

What would previously be janky internal dashboards or excel sheets are now actually nice to use tools. That said of course the maintenance cost of all that has yet to be discovered, and the ROI is questionable.

reply
CharlieDigital
5 hours ago
[-]
About the same ~40 FTE team. We're doing the same thing. Smattering of internal tools, but no net gain in external revenue. Who knows which of those tools will have any value or ppl are just doing it because it's cool now to make fancy dashboards.

OK. I guess that's good, too.

reply
jcgrillo
5 hours ago
[-]
Yeah this seems to be a pretty widespread story, from what I've heard as well. The thing about those janky dashboards and spreadsheets though is that somebody understood them and built them with intent to solve a particular problem. Despite the rickety appearance, they're trustworthy tools. A polished single page app might look nicer but it's harder to debug than an excel sheet, and much less transparent in its internal workings--especially if nobody actually wrote it...
reply
izacus
58 minutes ago
[-]
More importantly, it's questionable how much extra revenue improving a design of internal tool brings.
reply
nonethewiser
5 hours ago
[-]
The real answer?

Software engineer quality of life.

There can be an increase in productivity without a corresponding increase in total output. The gains could be captured by software engineers doing a days work in an hour then fucking off in a variety of ways.

reply
pqtyw
32 minutes ago
[-]
> doing a days work in an hour then fucking off in a variety of ways

Until companies start hiring 5x fewer engineers than they did before and well.. we are clearly moving in that direction

reply
nonethewiser
19 minutes ago
[-]
Quite possibly. Doubtful it will happen all at once. If you can get 8 hours of work done in 1 they'd need to ramp up demand 8x. Would be interesting to see that happen overnight. Happy Monday. Here, take these 30 tickets.
reply
slopinthebag
1 hour ago
[-]
Yeah I think this is probably most accurate.
reply
RugnirViking
5 hours ago
[-]
Imo it's pretty clear that anyone who is taking the issue at least somewhat seriously knows the amount of value they provide is non-zero. However, the problems are manifold: firstly, toolchains vary wildly, from fancy autocomplete, to engineers chatting with codebases they're unfamiliar with, to people integrating them into devops and infra, to people doing spec-driven development, with a thousand philosophies in between. Many people suspect that those above them in the ladder are on the cusp of massive failure due to losing track of the code, and many people higher on the ladder think those below them are overly cautious. I hate to be the guy saying "oh it must be somewhere in the middle", but I will say at the very least I like being able to use it to read docs for me, and to synthesize syntax and simple scripts (give me a join that works across these tables and gives me column x, y and z - give me a python script that parses a file like this example and extracts abc data - given this api spec figure out how I can get this data from this endpoint, go)

as for building actually complex software, the art of that is not in simply chaining together such scripts. It's the art of using architecture and testing to shape uncertainty, and developing requirements (and extrapolating sensibly from incomplete requirements). I don't think LLMs are great at this, but they aren't terrible either. A lot of the more active users in the space are doing stuff where they've realised they need more detailed specs, which like, yeah, we knew this already - better defined problems lead to better software.

reply
jcgrillo
5 hours ago
[-]
I agree the most interesting use cases I've heard of are about increasing the rigor of software development practices, but there's definitely a lack of coherence in methodology. I believe that some users and companies are successful in this effort, but the odd (and interesting!) thing is that so far we don't seem to know how to communicate how to do it successfully.
reply
m3kw9
5 hours ago
[-]
You can't get an edge using local models, these guys may have competitors that will spend on SOTA models. They won't likely ever consider local machines even for some offloading scenarios, the complexity and costs will be even higher.
reply
CharlieDigital
5 hours ago
[-]
Consider rewiring your perspective: getting an edge doesn't really matter; the only thing that matters is will customers pay for this? Is this a useful, valuable problem to solve?

Coding faster doesn't really solve that.

Uber makes more money if people buy more rides, order more food, have some breakthrough in autonomous driving. They can save money if they can optimize some ops or spend somewhere. Is there any evidence that with the spend on AI that they achieved any of this? If they did, I'm sure we'd hear about it in some engineering blog.

reply
analognoise
4 hours ago
[-]
18k/yr? None of the LLMs generate anything like that in value!
reply
simonw
4 hours ago
[-]
I'm definitely getting that much value out of Claude Code and Copilot.
reply
CharlieDigital
4 hours ago
[-]
You're a content creator; you define your revenue stream.

Uber engineers do not define their revenue stream; the product leadership team does.

$1500/mo of AI spend by engineers does not equate to revenue. They need to figure out revenue first before zeroing in on AI spend.

reply
Daishiman
4 minutes ago
[-]
$18K a year is a fraction of the salary of a junior engineer.

Claude has allowed me to do refactors that would have taken weeks to instead take a couple of days. It has, objectively, increased the velocity of the engineering component of greenfield features by 40% in my org. You can put a number value on that and decide if it gives you favorable ROI.

reply
ofjcihen
4 hours ago
[-]
Can you share some examples that you would say justify that price? Not a gotcha, I’m genuinely curious where you’re seeing a return at that level.
reply
simonw
3 hours ago
[-]
I've written tens of thousands of lines of tested, working code that I would not have written otherwise, and that code is useful to me.

I effectively get to operate at the rate of a small team of engineers - I know that because I've managed small teams of engineers in the past.

reply
ofjcihen
2 hours ago
[-]
> that I would not have written otherwise

I think this is the part I struggle with. The code I write makes me money or is a way of teaching me something, both of which are reasons that I would write the code regardless.

I don’t think I have any projects in mind that I’d be willing to spend half of a car on that I also wouldn’t have written myself.

Obviously just a personal take though. I’m glad you get the usage you want out of it.

reply
simonw
1 hour ago
[-]
My "job" is building open source software for data journalism (and anyone else who needs the tools data journalists need, which is pretty much everyone else). I can build more of those tools, and better, in exchange for a fraction of the cost it would take to hire a team to help.
reply
newobj
1 hour ago
[-]
It's also a useful signal for AI value. Looks like it's a max value add of $18,000 per engineer per year.
reply
Anon1096
19 minutes ago
[-]
No, that's not what it means at all even if just doing it purely in math terms. Really it is just a reasonable amount to cap at to stop the long tail of super spenders (tokenmaxxers). You could also call it "the amount of AI spend after which Uber has decided there are diminishing returns for the average engineer".
reply
csallen
45 minutes ago
[-]
It's not so simple to determine and generalize how much value AI adds. It's going to be different on a per-company basis and a per-engineer basis. It's also affected by the competitive market place and how many other companies are using AI for their engineers.

For example, what if you're a tiny startup and you're considering whether to hire an extra engineer or do all the coding yourself. I would estimate that AI is worth far more than $18,000 a year in that situation where you might reasonably decide to put off hiring an engineer.

reply
pqtyw
43 minutes ago
[-]
I find it really doubtful anyone has managed to quantify that in any meaningful way. Seems like mostly an arbitrary number. Also the article does claim that it's actually several times more than 18k if you are fine with using Codex, Cursor, etc. when your Claude tokens run out.
reply
alasano
36 minutes ago
[-]
Their initial budget for determining how much value AI adds is $18,000 per engineer.
reply
tfehring
21 minutes ago
[-]
Not really. There are clearly diminishing marginal returns, so it's likely that the first $2,400/engineer/year adds >>$2,400 of value, even if the 18,001st $/engineer/year adds <$1 of value.
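As an illustration of that argument only, here is a sketch with a made-up concave value curve. The logarithmic shape and both constants are hypothetical choices picked to reproduce the pattern described, not estimates of real engineering value.

```python
import math

A = 15_000  # hypothetical scale of the value curve, in USD
K = 1_000   # hypothetical spend at which returns start to flatten, in USD


def value(spend: float) -> float:
    """Hypothetical total annual value from `spend` dollars of AI usage."""
    return A * math.log(1 + spend / K)


def marginal_value(spend: float) -> float:
    """Derivative of value(): value added by the next dollar at `spend`."""
    return A / (K + spend)


print(f"value of first $2,400: ${value(2400):,.0f}")
print(f"value of the next dollar at $18,000: ${marginal_value(18000):.2f}")
print(f"spend where the next dollar returns exactly $1: ${A - K:,}")
```

On this toy curve the first $2,400 returns several times its cost while the dollar after $18,000 returns less than a dollar, so a cap near the break-even point says little about the total value delivered below it.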
reply
eqvinox
47 minutes ago
[-]
It's among a wave of fresh "non-insane" takes on AI in the enterprise. Maybe we can reel things in to a sustainable level before a giant bubble bursts.
reply
galaxyLogic
44 minutes ago
[-]
It's probably a good thing that Uber developers are now forced to do some coding on their own. Only use AI where it absolutely helps
reply
sva_
34 minutes ago
[-]
Or be smarter about their usage. $50 on tokens per day can get you a long way.
reply
estomagordo
28 minutes ago
[-]
Some people also take weekends off.
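The two daily figures can be checked quickly. This sketch assumes a 30-day month for the calendar-day number and an average of 52 × 5 / 12 working days per month (no holidays or leave) for the weekday-only number.

```python
MONTHLY_CAP = 1500  # USD per seat per month, from the article

# Spread over every calendar day (assumption: 30-day month).
per_calendar_day = MONTHLY_CAP / 30

# Spread over weekdays only (assumption: 52 weeks * 5 days / 12 months).
workdays_per_month = 52 * 5 / 12
per_workday = MONTHLY_CAP / workdays_per_month

print(f"per calendar day: ${per_calendar_day:.2f}")
print(f"per working day:  ${per_workday:.2f}")
```

That gives $50 per calendar day, or about $69 per working day if weekends are excluded.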
reply
pmontra
22 minutes ago
[-]
I wonder what they are doing with $1500 per month. I'm on Claude Pro $20 plan and I'm doing well. That's 3 days per week. On the other 2 days I'm using a customer's Claude Max, I don't know if it's the $100 or the $200 plan, but I'm sharing it with some of its other developers.
reply
hrpnk
20 minutes ago
[-]
$1500/mth is token pricing.

Your other plans are fixed price with rate limits, where you get more tokens than the dollar equivalent you pay monthly. These plans are economical only if the majority of users spend less on tokens, in $ terms, than the plan costs. This subsidizes the gap vs. power users who spend multiple k$ monthly in API tokens.
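A minimal sketch of that cross-subsidy condition, using invented usage numbers. It treats API list price as a stand-in for what usage costs the provider, which is an assumption; the provider's real serving cost is not public and is presumably lower than list price.

```python
from statistics import mean


def plan_is_economical(plan_price: float, api_equivalent_usage: list[float]) -> bool:
    """True if average usage, valued at API prices, fits within the flat fee."""
    return mean(api_equivalent_usage) <= plan_price


# Hypothetical monthly usage per subscriber, in USD at API prices.
mostly_light = [40] * 9 + [1000]       # one power user among ten
many_heavy = [40] * 7 + [1000] * 3     # three power users among ten

for name, usage in (("mostly light", mostly_light), ("many heavy", many_heavy)):
    ok = plan_is_economical(200, usage)
    print(f"{name}: mean ${mean(usage):.0f} vs $200 plan -> economical={ok}")
```

With one heavy user in ten the average stays under the $200 fee; with three in ten it does not, which is the situation rate limits are meant to prevent.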

reply
flyinglizard
11 minutes ago
[-]
Yea, I’m sure the personal plans are subsidized. I have $200 Claude Max at home and straight API pricing at work and equivalent work would easily cost me 5x if not more on the API.
reply
idiliv
19 minutes ago
[-]
Uber is likely on an enterprise plan - these charge tokens at API cost, which can be much more expensive than the $20 flat rate.
reply
jkwang
5 hours ago
[-]
The $1500 number is less interesting than the fact that they hit a ceiling at all. Most engineering teams I've talked to have no idea what their AI spend is per developer because it's buried in a consolidated cloud bill. Having a hard cap forces two useful conversations: what workflows actually justify API calls vs local inference, and whether the output is being measured against any real productivity metric. Without that feedback loop it's just a race to see who can burn tokens fastest.
reply
simonw
5 hours ago
[-]
Both the Anthropic and OpenAI "Enterprise" plans include per-developer analytics:

Anthropic: https://support.claude.com/en/articles/12883420-view-usage-a...

OpenAI: https://help.openai.com/en/articles/10875114-workspace-analy...

reply
Igrom
55 minutes ago
[-]
I believe you might be replying to a bot account.
reply
lazyasciiart
33 minutes ago
[-]
What makes it look like one? All their dead comments read pretty normal to me.
reply
etothet
46 minutes ago
[-]
In my experience, this is far below the cost the average dev will incur per month so this seems very reasonable to me. And, no doubt there are exceptions for heavy users so they can get some extra token usage when they need it.
reply
waffuldrop
30 minutes ago
[-]
unless they changed something in the like 2 months (edit: besides implementing a cap for claude code specifically, since other tools already had caps) since ive left my job there im pretty sure 1500$ is the very max you can use after maxing out free calls, initial budget, then 2 extensions individually reviewed by your manager

higher ups pushed for these last 2 years to be AI focused so I don't think this restriction is a measure of "don't use too much AI" as much as it is a measure of "don't use only 'manual' AI tooling" since we had a dozen more specialized tools in-house running locally or otherwise that didn't count towards the budget

reply
hrpnk
19 minutes ago
[-]
If budgeted at $1,500/month per user, power users still can get 5-10x of that allocation if the user pool is large enough.
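
To put rough numbers on that, here is a small sketch. The pool size and the "typical" spend are invented for illustration, and the function name `power_users_supported` is hypothetical rather than anything from a real budgeting tool.

```python
def power_users_supported(n_users, budget_per_user, typical_spend, multiple):
    """Largest number of users who can spend `multiple` x the per-user
    budget while everyone else spends `typical_spend`, without the pool
    exceeding n_users * budget_per_user. Assumes integer dollar amounts
    and multiple * budget_per_user > typical_spend."""
    pool = n_users * budget_per_user
    power_spend = multiple * budget_per_user
    # Solve k * power_spend + (n_users - k) * typical_spend <= pool for k.
    k = (pool - n_users * typical_spend) // (power_spend - typical_spend)
    return max(0, min(n_users, k))


# Hypothetical pool: 1,000 users budgeted at $1,500, most spending $500.
print(power_users_supported(1000, 1500, 500, 10))
print(power_users_supported(1000, 1500, 500, 5))
```

Under these assumed numbers the pool supports 68 users at 10x the allocation, or 142 users at 5x.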
reply
rasbmn
44 minutes ago
[-]
Uber is in the business of experimenting with robotaxis and automated food delivery.

They can't say that $0 per employee is the appropriate amount for AI spending. So they capped it, perhaps in order to "send a signal" that is eagerly picked up by the AI boosters.

There is no signal. Uber does not work any better since AI. They still want to promote AI, so they chose the highest number that doesn't bankrupt them so the press and AI promoters pick it up as the new price anchor.

Probably they'll quietly reduce the number more soon.

reply
lazyasciiart
34 minutes ago
[-]
Is this inside knowledge, or speculation?
reply
PessimalDecimal
5 hours ago
[-]
These are still at currently subsidized prices. We'll see if they think they're getting $1500/month of value when that buys significantly fewer tokens.
reply
square_usual
5 hours ago
[-]
There is no evidence that per-token inference prices (which is what Uber is setting a cap on) are subsidized.
reply
pier25
5 hours ago
[-]
AI companies have more expenses than inference.
reply
RugnirViking
5 hours ago
[-]
yes, and there's no evidence that they aren't (or can't) use profitable inference to subsidise those other expenses. Some companies will keep spending massively to train better models, and some other companies will not, and offer good API prices. Which will end up being used? That depends on whether the spending turns into better value models
reply
pier25
26 minutes ago
[-]
> theres no evidence that they arent (or can't) use profitable inference to subsidise those other expenses

as far as we know there's no evidence that they can produce any profits at all

reply
lelanthran
5 hours ago
[-]
Is there any evidence that it's not?
reply
Topfi
5 hours ago
[-]
The fact that Anthropic models are offered at the same API pricing not just by Anthropic itself but also by AWS, Azure and Vertex, despite Anthropic taking a major slice on licensing, along with what an open-weight 1T-parameter model like K2.6 costs to run on any third-party provider, makes it unlikely that API inference costs are subsidized by the labs.
reply
pqtyw
27 minutes ago
[-]
OpenRouter? I.e., even excluding DeepSeek, inference for very large open models is way cheaper. Maybe these providers are not very profitable, but it's highly unlikely that they are losing $4 for every $1 they make, since selling inference is their only product...
reply
thejazzman
5 hours ago
[-]
Yes; they ban various uses of their subscriptions but say you can do whatever if you’re paying for the API without limits
reply
pqtyw
26 minutes ago
[-]
That's just market segmentation and them trying to maximize revenue; it doesn't really say anything about their costs.
reply
simonw
5 hours ago
[-]
This story isn't about those subscriptions - enterprise customers like Uber are paying the full API prices.
reply
lelanthran
5 hours ago
[-]
That's not evidence. Very likely though, but the only evidence we get one way or another is when they IPO.
reply
pqtyw
31 minutes ago
[-]
The inference prices for very large open models would indicate that Anthropic's and OpenAI's margins are quite large.
reply
pdyc
5 hours ago
[-]
afaik, enterprise plans are not subsidized. It's $20/seat + API pricing. Unless you are saying API pricing itself is subsidized.
reply
LurkandComment
5 hours ago
[-]
This is market introductory pricing that hasn't factored in cost recovery. Most of it has been run on early investment with the assumption they will recover costs in the long run. The prices are subsidized across the board and they will need to go up significantly to recover them.
reply
pqtyw
24 minutes ago
[-]
Yeah, that's not going to work if you can get e.g. 80% of value by using 10-20x or more cheaper open models. At some point it would just make sense for large companies to rent compute and deploy their version of DeepSeek or whatever (if they don't trust Chinese providers)
reply
swiftcoder
5 hours ago
[-]
Assuming this were accurate, then presumably the AI companies would be betting that inference costs come down before the bill is due - I don't see enterprises being willing to absorb another ~10x price increase for tokens (as they've just done going from subscription prices to per-token pricing)
reply
LurkandComment
4 hours ago
[-]
For Claude shops this was a huge hit. But let's back this up. There are some companies that haven't even built a break-even model at this price because they are funded by investment. As soon as those investors lose patience the first dominoes will fall. For those who have somewhat of a business model, will it survive a price increase? The bigger question is whether the base model providers have enough runway and a way to keep going, as they need to recover costs.
reply
pqtyw
23 minutes ago
[-]
It's mostly R&D though, not inference. If LLMs effectively become a commodity then they are screwed anyway.
reply
logancbrown
5 hours ago
[-]
None of what you said is true
reply
rimliu
5 hours ago
[-]
And you know this how?
reply
boringg
5 hours ago
[-]
True but they will raise prices slowly so people will optimize their workflow so they aren't just throwing as much inference as fast as possible like the current state. Right now you should do everything you wanted to try out because it is cheap (as long as you don't become dependent ... the risk).
reply
sourcecodeplz
5 hours ago
[-]
I understand current Codex $20 sub is worth about $480 GPT5 api credits.
reply
esafak
39 minutes ago
[-]
reply
MagicMoonlight
5 hours ago
[-]
It's not. They recently forced enterprise customers onto API billing instead of the cheap consumer pricing. Now the pricing is brutal.
reply
epsteingpt
5 hours ago
[-]
Uber engineers reported that loading their workspace and pulling recent commits exhausted that AI limit for Claude Code (4.8 x-high) immediately.
reply
wmf
24 minutes ago
[-]
I don't think loading up a single context window costs $1,500. Which limit are you talking about?
reply
LurkandComment
5 hours ago
[-]
1) This happened because they fundamentally misunderstand how to use AI and how AI is priced. 2) Most organizations are throwing everything in for analysis and not limiting the answer they want. You need to be specific about what you analyze and what answers you want. 3) People undervalue prompting or templated responses. I will have written, validated and sanity-checked a prompt several times and run it across several models before I say it's ready for use. But when it is, I know what it will give me and that the scope of its research and answer is as close to what I want as it can be, with as little excess as possible. This all saves tokens.
reply
jwpapi
5 hours ago
[-]
If you estimate a 10k monthly salary per engineer, $1,500 is 15% of that, i.e. the point at which it becomes cheaper for them to hire another engineer. That doesn't mean it's improving productivity by 15%, but if 15% is the point where it stops being better than another human, can we assume 7.5%?

Probably even less, because you would also spend that extra 1,500 per employee if it just saved 10%, so 150 per employee, which is 1.5% of salary.

This is imho one of the best ranges we can assume for now. How much would that be across the whole SWE market?

reply
ilia-a
5 hours ago
[-]
Seems like an odd limit, especially since it's highly dependent on the token provider used. With Opus this is not much and could easily be burnt in a week or less, but with something like DeepSeek the 1500 can literally be an annual budget.

That being said, I do have to wonder why someone as big as, say, Uber doesn't simply roll out an OSS model in the cloud for their team. I'd imagine that would be the cheapest and most flexible option, while also keeping all the data shared with the LLM private.

reply
iceman28
5 hours ago
[-]
It’s not just about the model but also setting up the system to create and share compute (GPUs), which is quite complicated on its own. Uber’s primary business focus isn’t infrastructure.
reply
ChrisArchitect
6 hours ago
[-]
Related:

Uber’s COO says it’s getting harder to justify money spent on tokenmaxxing

https://news.ycombinator.com/item?id=48268871

Uber torches 2026 AI budget on Claude Code in four months

https://news.ycombinator.com/item?id=47976415

Corporate America Is Starting to Ration AI as Cost Skyrockets

https://news.ycombinator.com/item?id=48335388

reply
cloudking
5 hours ago
[-]
They are also beholden to enterprise pricing and can't use the subsidized consumer max plans.
reply
sremani
4 hours ago
[-]
I have a strong conviction that companies will now choose tech stacks and programming languages based on 'tokenomics'. I am vibe coding using Clojure, a language I can read but cannot write, and I never hit the usage limits even when using the latest model on Claude. I have a similar experience with F#, which is a bit more verbose than Clojure but absolutely beats every OOP language, Python, TypeScript etc.

The reason I use F# & Clojure is that they target the CLR and the JVM, two popular enterprise stacks.

In my not so humble opinion Lisp (Clojure) still remains the language of AI.

reply
jedisct1
5 hours ago
[-]
A lot of things can be done with local models.
reply
rimliu
5 hours ago
[-]
Even more things can be done without any models just as well.
reply
dude250711
5 hours ago
[-]
Single developers seeking local models.
reply