I doubt it.
And what if the technology to run these systems locally, without relying on the cloud, becomes commonplace, as it already is with open-source models? The expensive part is training these models, more than running inference.
I agree. Right now a lot of AI tools are underpriced to get customers hooked, and then the prices will get jacked up later. The flaw is that AI does not have the ubiquitous utility internet access has, and a lot of people are not happy with the performance per dollar TODAY, much less after an 80% price hike. We already see companies like Google raising prices citing "AI," and we customers can't opt out of the AI to avoid the fee.
At my company we've already decided to leave Google Workspace in the spring. GW is a terrible product with no advanced features, garbage admin tools, uncompetitive pricing, and now AI shoved in everywhere and no way to granularly opt out of a lot of it. Want spell check? Guess what, you need to leave Gemini enabled! Shove off, Google.
Absolutely. Not only are most AI services free, but much of the paid usage comes from executives mandating that their employees use AI. It's a heavily distorted market.
And a majority of those workers do not reveal their AI usage, so they either take credit for the faster work or use the extra time for other activities, which further confounds any measurement of AI's impact.
This is also distorting the market, but in other ways.
Past successes like Google encourage hope in this strategy. Sure, it mostly doesn't work. Most of everything that VCs do doesn't work. Returns follow a power law, and a handful of successes in the tail drive the whole portfolio.
The key problem here isn't that this strategy is being pursued. The key problem is that first-mover advantages rarely last with new technologies. That's why Netscape and Yahoo! aren't among the FAANGs today. The long-term wins go to whoever successfully creates a moat sufficient to protect lasting excess returns. And each generation of AI leapfrogs the capabilities of the last so thoroughly that nobody has figured out how to build such a moat.
Today, 3 years after launching the first LLM chatbot, OpenAI is nowhere near as dominant as Netscape was in late 1997, 3 years after launching Netscape Navigator. I see no reason to expect that 30 years from now OpenAI will be any more dominant than Netscape is today.
Right now companies are pouring money into their candidates to win the AI race. But if the history of browsers repeats itself, the company that wins in the long term would launch about a year from now, focused on applications on top of AI. And its entrant into the AI wars wouldn't launch until a decade after that! (Yes, that is the right timeline for the launch of Google, and for Google's launch of Chrome.)
Investing in Silicon Valley is like buying a positive-EV lottery ticket. An awful lot of people are going to be reminded the hard way that it is wiser to buy a lot of lottery tickets than to sink a fortune into a single big one.
Incorrect. There were about 150 million Internet users in 1998, or 3.5% of the world population, and the number grew tenfold by 2008 [0]. Netscape had about 50% of the browser market at the time [1]. In other words, Netscape dominated a small base and couldn't keep it up.
ChatGPT has about 800 million monthly users, already 10% of the current world population. Granted, not exclusively. ChatGPT is already a household name. Outside of early internet adopters, very few people knew who Netscape was or what Navigator was.
[0] https://archive.globalpolicy.org/component/content/article/1...
[1] https://www.wired.com/1999/06/microsoft-leading-browser-war/...
People are missing the forest for the trees here. Being the go-to consumer gen-AI provider is a trillion-plus-dollar business. How many tens of billions you waste on building unnecessary data centers is a rounding error. The important number is your odds of becoming that default provider in the minds of consumers.
I used ChatGPT for everyday stuff, but in my experience its responses got worse and I had to wait much longer for a poorer answer. I switched to Gemini, and its answers were better and came back much faster.
I don’t have any loyalty to Gemini though. If it gets slow or another provider gives better answers, I’ll change. They all have the same UI and they all work the same (from a user’s perspective).
There is no moat for consumer genAI. And did I mention I’m not paying for any of it?
It’s like quick commerce: sure, it’s easy to get users by offering them something expensive on VC money. The second they raise prices or degrade the experience to make the service profitable, the users will leave for an alternative.
I haven't seen any evidence that any Gen AI provider will be able to build a moat that allows for this.
Some are better than others at certain things over certain time periods, but they are all relatively interchangeable for most practical uses and the small differences are becoming less pronounced, not more.
I use LLMs fairly frequently now and I just bounce around between them to stay within their free tiers. Short of some actual large breakthrough, I never need to commit to one. I can take advantage of their massive spending and wait it out a couple of years, until I'm running a local model self-hosted behind a Cloudflare tunnel if I need to access it from my phone.
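For reference, that self-hosted setup is already pretty trivial. Here's a minimal sketch, assuming a local Ollama server on its default port reached through a Cloudflare tunnel; the tunnel hostname and model name are placeholders, not anything specific I'm running:

```python
# Minimal sketch of the setup described above: a local model served by
# Ollama (default port 11434), optionally reached through a Cloudflare
# tunnel from a phone. Hostname and model name are placeholders.
import requests

BASE_URL = "http://localhost:11434"  # or the https hostname cloudflared gives you

resp = requests.post(
    f"{BASE_URL}/api/generate",
    json={"model": "llama3", "prompt": "Summarize this email: ...", "stream": False},
    timeout=120,
)
print(resp.json()["response"])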
And yes, most people won't do that, but there will be plenty of opportunity for cheap providers to offer it as a service with some data center spend, though nowhere near the massive amounts OpenAI, Google, Meta, et al. are burning now.
LLMs complete text. Every query they answer is giving away the secret ingredient in the shape of tokens.
So voice assistants backed by very large LLMs over the network are going to win even if we solve the (substantial) battery usage issue.
The local open source argument doesn't hold water for me -- why does anyone buy Windows, Dropbox, etc. when there are free alternatives?
Installing an OS is seen as a hard/technical task still. Installing a local program, not so much. I suspect people install LLM programs from app stores without knowing if they are calling out to the internet or running locally.
See also how all (?) Brits pronounce Gen Z in the American way (ie zee, not zed).
You sometimes see this with real live humans who have lived in multiple countries.
Pay no attention to those fopheads from Kent. We speak proper British English here in Essex
Some people are not from the USA or England.
Bullet-point hell, and a table that feels like it came straight out of Grok.
IMHO the investors are betting on a winner-takes-all market and on some magic AGI coming out of OpenAI or Anthropic.
The questions are:
How much money can they make by integrating advertising and/or selling user profiles?
What is the model competition going to be?
What is the future AI hardware going to be - TPUs, ASICs?
Will more people have powerful laptops/desktops to run mid-sized models locally, and be happy with it?
The internet didn't stop after the dotcom crash, and AI won't stop either should there be a market correction.
By itself, this doesn't tell us much.
The more interesting metric would be token use comparison across free users, paid users, API use, and Azure/Bedrock.
I'm not sure if these numbers are available anywhere. It's very possible B2B use could be a much bigger market than direct B2C (and the free users are currently providing value in terms of training data).
But the AI providers are betting, correctly in my opinion, that many companies will find uses for LLMs that add up to trillions of tokens per day.
Think less of “a bunch of people want to get recipe ideas.”
Think more of “a pharma lab wants to explore all possible interactions for a particular drug” or “an airline wants its front-line customer service fully managed by LLM.”
It’s unusual that individuals and industry get access to basically similar tools at the same time, but we should think of tools like ChatGPT as "foot in the door" products which create appetite and room to explore exponentially larger token use in industry.
Let's estimate 200 million office workers globally as the TAM, each running an average of 250k tokens a day. That's 50 trillion tokens DAILY. Not sure what model-provider profit per token is, but let's say it's .001 cents.
That's $500M per day in profit.
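A quick back-of-the-envelope in Python, using only the assumptions above (200M workers, 250k tokens each per day, .001 cents of profit per token); none of these inputs are measured figures:

```python
# Back-of-the-envelope for the figures above. All inputs are the
# comment's assumptions, not measured data.
office_workers = 200_000_000          # assumed global TAM of office workers
tokens_per_worker_per_day = 250_000   # assumed average daily token use
profit_per_token = 0.001 / 100        # ".001 cents" expressed in dollars

daily_tokens = office_workers * tokens_per_worker_per_day
daily_profit = daily_tokens * profit_per_token

print(f"{daily_tokens:.2e} tokens/day")     # 5.00e+13, i.e. 50 trillion
print(f"${daily_profit:,.0f} per day")      # $500,000,000 per day
```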
Pharma does not trust OpenAI with their data, and they don't work on tokens for any of the protein or chemical modeling.
There will undoubtedly be tons of deep nets used by pharma, with many $1-10k buys replacing more expensive physical assays, but it won't be through OpenAI, and it won't be as big as a consumer business.
Of course there may be other new markets opened up but current pharma is not big enough to move the needle in a major way for a company with an OpenAI valuation.
Many companies have already experimented with this in recent years, most notably Klarna, which was among the earliest guinea pigs and later had to backtrack on the "novel" idea when the results came out.
- OverUtilized/UnderCharged: doesn't matter because...
- Lead Time vs. TCO vs. IRS Asset Depreciation: The moment you get it fully built, it's already obsolete. So from a CapEx point of view, if you can lease your compute (including GPUs) and similarly optimize the rest of the inputs, your overall CapEx is much lower and tied to the real estate, not the technology. The rest is cost of doing business and deductible in and of itself.
- The "X" factor: Someone mentioned TPU/ASIC but then there is the DeepSeek factor - what if we figure out a better way of doing the work that can shortcut the workflow?
- AGI partnerships: Right now you see a lot of Mega X giving billions to Mega Y, because all of them are trying to get their version of Linux or Apache or whatever to parity with the rest. Once AGI is settled and confirmed, most of these partnerships will be severed, because it then becomes a question of which company gets its AI model into that high-prestige Montessori school and into the right Ivy League schools - like any other rich parent would for their "bot" offspring.
So what will it look like when it crashes? A bunch of bland, empty "warehouses," the mobile PDUs that once filled their parking lots gone. Whatever "paradise" was there may come back... once you bulldoze all that concrete and steel. The money will go do something else, like in a Don McLean song.
On Amazon, buying a 5090 costs $3000 [2].
That's a payback time of 212 days. And Runpod is one of the cheaper cloud providers; for the GPUs I compared, EC2 was twice the price for an on-demand instance.
Rental prices for GPUs are pretty darn high.
[1] https://www.runpod.io/pricing [2] https://www.amazon.com/GIGABYTE-Graphics-WINDFORCE-GV-N5090G...
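Rough sketch of the payback arithmetic; the hourly rental rate here is an assumption backed out of the 212-day figure, not a quoted Runpod price, and it ignores electricity, cooling, and anything less than 100% utilization:

```python
# Payback arithmetic for buying vs. renting a GPU. The $3000 purchase
# price is from the comment; the hourly rate is an assumption backed
# out of the stated ~212-day payback, not a quoted Runpod price.
purchase_price = 3000.0   # RTX 5090 on Amazon, per the comment
hourly_rate = 0.59        # assumed rental $/hr; yields ~212 days

payback_days = purchase_price / (hourly_rate * 24)
print(f"{payback_days:.0f} days")  # ~212 days at 100% utilization,
                                   # ignoring power and idle time
```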
Many of the legacy systems still running today are IBM or Solaris servers that are 20 or 30 years old. There's no reason to believe GPUs won't still be in use in some capacity (e.g. inference) a decade from now.
Even if all of the GPUs inside burn out and you want to put something else entirely inside of the building, that's all still ready to go.
Although there is the possibility they all become dilapidated buildings, like abandoned factories.
If the most valuable part is quickly depreciating and goes unused within the first few years, it won't have a chance at long-term value like fiber did. If data centers become, I don't know, battery grid storage, it will be very, very expensive grid storage.
Which is to say that while the early rush to lay fiber produced something eventually useful, overallocation of capital to GPUs goes to pure waste.
Giant telecoms bought big regional telecoms which came about from local telecoms merging and acquiring other local telecoms. A whole bunch of them were construction companies that rode the wave, put in resources to run dark fiber all over the place. Local energy companies and the like sometimes participated.
There were no standard ways of documenting runs, and it was beneficial to keep things relatively secret: if you could provide fiber capabilities in a key region while your competition was rolling out DSL and investing lots of money, you could pounce and make them waste resources, and so on. This led to enormous waste and fraud, and we're now at the outer edge of usability for most of the fiber that was laid - 29-30 years after it was run, most of it never has been and never will be used.
The 90s and early 2000's were nuts.
At the local level, there is generally a cable provider with existing rights of way. To get a fiber provider, there are four possible outcomes: universal service with subsidy (funded by direct subsidy), cherry-picked service (they install where convenient), universal service (capitalized by the telco), and "fuck you," where they refuse to operate (i.e. Verizon in urban areas).
The privately capitalized card was played out by cable operators in the 80s (they were innovators then, and AT&T had just been broken up and was in chaos). They have franchise agreements whose exclusivity was used as loan collateral.
Forget about San Diego, there are neighborhoods in Manhattan with the highest population density in the country where Verizon claims it’s unprofitable to operate.
I served on a city commission where the mayor and county were very interested in getting our city wired, especially as legacy telco services are on the way out and cable costs are escalating and will accelerate as the merger agreement that formed Spectrum expires. The idea was to capitalize last mile with public funds and create an authority that operated both the urban network and the rural broadband in the county funded by the Federal legislation. With the capital raised with grants and low cost bonding (public authority bonds are cheap and backed by revenue and other assets), it would raise a moderate amount of income in <10 years.
We had the ability to get the financing in place, but we would have needed legislation passed to get access to rights of way. Utilities have lots of ancient rights and laws that make disruption difficult. The politicians behind it turned over before that could be changed.
If there's ever a glut in GPUs that formula might change, but it sure hasn't happened yet. Also, people deeply underestimate how long it would take a competing technology to displace them. It took GPUs nearly a decade, and the fortunate occurrence of the AI boom, to displace CPUs in the first place, despite bountiful evidence in HPC that they were already a big deal.
* The GPUs in use in data centers typically aren’t built for consumer workloads, power systems, or enclosures.
* Data Centers often shred their hardware for security purposes, to ensure any residual data is definitively destroyed
* Tax incentives and corporate structures make it cheaper/more profitable to write off the kit entirely via disposal than to attempt to sell it after the fact or run it at a discount to recoup some costs
* The Hyperscalers will have use for the kit inside even if AI goes bust, especially the CPUs, memory, and storage for added capacity
That’s my read, anyway. They learned a lot from the telecoms crash and adjusted business models accordingly to protect themselves in the event of a bubble crash.
We will not benefit from this failure, but they will benefit regardless of its success.
If someone can afford an 8 GPU server, they should be able to afford some #6 wire, a 50A 2P breaker, and a 50A receptacle. It has the same exact power requirements as an L2 EV charger.
In reality, if you have a dryer outlet, you have a good fraction of 10 kW available.
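Rough circuit math behind those numbers, assuming North American 240 V split-phase service and the usual 80% continuous-load derating:

```python
# Circuit math behind the comment above. Assumes North American 240 V
# split-phase service and an 80% continuous-load derating on the breaker.
def continuous_kw(volts: float, breaker_amps: float, derate: float = 0.8) -> float:
    """Continuous power available on a circuit, in kW."""
    return volts * breaker_amps * derate / 1000

print(continuous_kw(240, 50))  # 50 A circuit (#6 wire / L2 EV charger): 9.6 kW
print(continuous_kw(240, 30))  # typical 30 A dryer circuit: 5.76 kW
```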
> You can already use Claude Code for non engineering tasks in professional services and get very impressive results without any industry specific modifications
After clicking on the link, and finding that Claude Code failed to accurately answer the single example tax question given, very impressive results! After all, why pay a professional to get something right when you can use Claude Code to get it wrong?
What do you think LLM tuned GPUs or TPUs are going to be used for that is completely different and not AI related?
The key dynamic: X were Y while A was merely B. While C needed to be built, there was enormous overbuilding that D ...
Why Forecasting Is Nearly Impossible
Here's where I think the comparison to telecoms becomes both interesting and concerning.
[lists exactly three difficulties with forecasting, the first two of which consist of exactly three bullet points]
...
What About a Short-Term Correction?
Could there still be a short-term crash? Absolutely.
Scenarios that could trigger a correction:
1. Agent adoption hits a wall ...
[continues to list exactly three "scenarios"]
The Key Difference From S:
Even if there's a correction, the underlying dynamics are different. E did F, then watched G. The result: H.
If we do I and only get J, that's not K - that's just L.
A correction might mean M, N, and O as P. But that's fundamentally different from Q while R. ...
The key insight people miss ...
If it's not AI slop, it's a human who doesn't know what they're talking about: "enormous strides were made on the optical transceivers, allowing the same fibre to carry 100,000x more traffic over the following decade. Just one example is WDM multiplexing..." when in fact wavelength division multiplexing is the entirety of those enormous strides.
Although it constantly uses the "rule of three" and the "negative parallelisms" I've quoted above, it completely avoids most of the overused AI words (other than "key", which occurs six times in only 2257 words, all six times as adjectival puffery), and it substitutes single hyphens for em dashes even when em dashes were obviously meant (in 20 separate places—more often than even I use em dashes), so I think it's been run through a simple filter to conceal its origin.
Other than that, I'd rather read a comprehensive article than a summary.
On topic: It is always quite easy to be the cynical skeptic, but a better question in my view is: is the current AI boom closer to telecoms in 2000 or to video hosting in 2005? The parallels to both are strong, and the outcomes vastly different (Cisco has only barely recovered to its 1999 level, while YouTube is printing money).
What about the possibility of improvements in training and inference algorithms? Or do we know we won't get any better than gradient descent/Hessians/etc.?
This is a kind of risk that finance people are completely blind to. OpenAI won't tell them, because it keeps capital cheap. Startups whose bets depend on hardware capability remaining centralized won't even bother analyzing the possibility. With so many actors incentivized not to know, or not to bother asking the question, that's where the biggest systemic risk lies.
The real whiplash will come from extrapolation. If an algorithm advance shows up promising to halve hardware requirements, finance heads will reason that we haven't hit the floor yet. A lot of capital will eventually re-deploy, but in the meantime, a great deal of it will slow down, stop, or reverse gears and get un-deployed.