FilterHN

mystraline

1 day ago

[-]

From the article:

> consumers hate metered billing. they'd rather overpay for unlimited than get surprised by a bill.

Yes and no.

Take Amazon. You think your costs are known and WHAMMO surprise bill. Why do you get a surprise bill? Because you cannot say 'Turn shit off at X money per month'. Can't do it. Not an option.

All of these 'Surprise Net 30' offerings are the same. You think you're getting a stable price until GOTAHCA.

Now, metered billing can actually be good, when the user knows exactly where they stand on the metering AND can set maximums so their budget doesn't go over.

Taken realistically, as an AI company, you provide a 'used tokens/total tokens' bar graph, tokens per response, and estimated amount of responses before exceeding.

Again, don't surprise the user. But that's an anathema to companies who want to hide tokens to dollars, the same way gambling companies obfuscate 'corporate bux' to USD.

pembrook

1 day ago

[-]

Metered billing makes sense for B2B infrastructure-as-a-service type products (AWS), where as your company grows both you and the infra provider know the bill will grow manageably over time. Infra is set it and forget it.

But for AI in the context of point-solutions and on-the-job use cases, metered billing is a death blow.

In this context, metered is a massive incentive to not use the product and requires the huge friction of having to do a cost/benefit analysis before every task. And if you're using it at work you may even need management sign-off before you can use it again.

For a tool that's intended to amplify productivity, very few humans want to make a cost/benefit analysis 250 times a day whether it's worth $3 to code up a boilerplate or not. On metered billing, they just wont use it.

armchairhacker

1 day ago

[-]

People don't have issues paying for metered electricity, heat, and water. AFAIK most people are aware of when they're wasting power/heat/water but don't go to extremes, e.g. don't run the shower constantly but don't shower with a bucket.

marcus_holmes

1 day ago

[-]

There is no "pay $25/month for unlimited electricity" option. And if there was, everyone would use that and all metered options would go broke overnight. Followed fairly shortly by the company offering unlimited flat-rate deals.

Which is what TFA describes, and is why there is no unlimited flat-rate deal for utilities. But that's a mature market that is not relying on growth for valuations, and isn't appealing to VCs trained on net/mobile/crypto bubbles.

portaouflop

1 day ago

[-]

I don’t know about you but where I live the electricity companies definitely offer a “flat rate” for electricity.

But those contracts are so much more expensive that virtually no one gets them.

stavros

22 hours ago

[-]

Where do you live that you can get a flat rate for unlimited electricity? I've never heard of this (I live in Greece).

mathgeek

20 hours ago

[-]

It existed when I lived in Pennsylvania, but it always had a “if you exceed this threshold consistently we will move you to a metered plan” conditional. More a way to plan your budget than truly unlimited.

marcus_holmes

1 hour ago

[-]

Yeah, that's not unlimited. Useful data point, though. Potentially a solution for the AI companies?

sensanaty

20 hours ago

[-]

Dunno if it counts because it's not unlimited usage for a flat rate, but in the Netherlands a lot of utilities like water, gas and electricity are billed at the same flat rate every month, then once a year they do a calculation and you get money back (or have to pay more if your usage was higher) based on actual usage.

I can't see any situation where true unlimited usage for a flat rate, even a wildly expensive and uneconomical one, would make any sense because as often happens you'd basically be incentivizing people to waste as much gas, electricity and water as they possibly can to "get their money's worth" or whatever, and we should be encouraging the exact opposite for those 3 things.

physicsguy

19 hours ago

[-]

Same in the UK, it's mostly to spread out the seasonal cost.

Although here, if people get into debt they can be forced onto pre-payment meters where that's not an option, and the unit price is also higher. There's a lot of controversy about this.

stavros

20 hours ago

[-]

I agree with you, that's why I can't believe it exists anywhere. What you mention is basically to reduce the costs of counting, and is still usage-based metering, just with a reasonable guess to make payment more frequent.

Power in Greece works the same way, except you (are supposed to) get the guess one month and the count the next (and pay the difference). In practice, they count less often than that.

ncruces

19 hours ago

[-]

> What you mention is basically to reduce the costs of counting, and is still usage-based metering, just with a reasonable guess to make payment more frequent.

That's not the motive where I live. There's basically 2 reasons: (1) customers (esp. those, often less well off, that resort to “money earned, money spent”) like the predictability of it; and (2) depending on the season spending changes significantly (heating, cooling), whereas salaries don't.

Usually the corrective amount is also paid out over a period of months if it exceeds X% of one month's spend.

semi-extrinsic

19 hours ago

[-]

Here in Norway, especially 10-20 years ago but still not uncommon today, you will find flats for rent with electricity included in the monthly price. Student housing especially.

We own an old 170 sq.m. semi-detached house, and the electricity bill for June was 20 euros. That's with all heating & hot water coming from electricity (heat pumps) and owning an EV that we charge at home.

physicsguy

19 hours ago

[-]

That's the same for the UK but ultimately the cost is absorbed by someone, just not the tenant.

FuriouslyAdrift

17 hours ago

[-]

I work in heavy industrial manufacturing and we have flat rate contracts. In fact, we MUST use a minimum amount or we get billed a penalty.

In manufacturing, everything is about consistency.

marcus_holmes

1 hour ago

[-]

Interesting. Is there an upper limit too?

Nevermark

1 day ago

[-]

People don't have stable or seasonal "token needs" or wants.

Nor do they have stable don't-give-me-any-better "answer quality" expectations from models. At least not yet.

AI model services don't have any version of the typical utility's geographical market protection or monopoly.

Worse, they have all been funded for massive loss-leader hyper growth, so to survive in the short term, they have to compete on who can raise and lose more.

Then looking into the long term, unlike the Uber's of the world, the point where a market will reach saturation, survivors finally "win", and margins can transition from negative to positive, keeps moving away.

I don't see any way in which utility bills are economically comparable to this situation.

The only "hope" the general AI service business has, is that at some point one of them pulls far enough ahead of the others that everyone else's economics collapse first.

Like a very high speed version of the centralization in advanced chip fabs. But with no profitability until the centralization happens. Brutal.

latexr

22 hours ago

[-]

Those utilities are fairly easy to track in a way that digital services aren’t. Lights in my home consume energy at a consistent rate. Ditto for water coming out of the faucet; if I open it more, I can see it moving more water and costing more.

On the web, it’s not clear that one website is being massively more wasteful than another. From your perspective you’re engaging in the same behaviour (opening a website) but for reasons outside your control you may be spending more than intended.

I remember not having unlimited internet traffic at home and always having to worry. These days, I pay a flat monthly fee for internet access.

great_psy

1 day ago

[-]

But leaving a light on 2x the time will equal very close to 2x the price.

Asking “what day is today” vs “create this api endpoint to adjust the inventory” will cost vastly different. And honestly I have no clue where to start to even estimate the cost unless I run the query.

<https://www.reuters.com/business/energy/texas-freeze-raises-...>

1 day ago

[-]

Except of course that they do:

"Texas freeze raises concerns about ‘ridiculous’ variable rate bills" (2021)

HOUSTON, Feb 23 (Reuters) - In Spring, Texas, about 20 miles (32 km) north of Houston, Akilah Scott-Amos is staring down a more than $11,000 electric bill for this month, a far cry from her $34 bill at this time last year.

"What am I going to do?" Scott-Amos, 43, said. She was among the millions of Texas residents who lost power during several days of bitter cold that caused the state's electrical grid, operated by the Electric Reliability Council of Texas, to break down. "I guess the option is, what, I'll pay it? I just don't feel like we should have to."

This instance involves variable rate billing systems, where the cost per unit use can vary tremendously. There are also stories you'll find where equipment issues (broken water or gas mains, malfunctioning electrical equipment, failure to terminate per-minute-billed telephone calls, etc.). I can remember the latter featuring in some compendium of records (possibly Guiness) for a family who hadn't realised that they needed to hang up the phone between calls and got hit with a staggering (at least for the time) bill. More recently, data and roaming charges tend to be culprits:

<https://worthly.com/most-expensive/expensive-phone-bills-tim...>

In all of these cases, a significant problem is that usage is all but entirely divorced from metering, and people have no idea of what their actual usage patterns are until "invoice therapy" arrives.

(A term I'm borrowing from a friend, though I'll note that monthly feedback cycles on what are typically hourly/daily decisionmaking patterns proves a poor systems control theory match.)

pembrook

22 hours ago

[-]

Utilities are infra-as-a-service.

The point is that infrastructure is non-optional, you're not going to decide not to use it, and you rarely switch (in the case of utilities you often CAN'T switch). You set it and forget it.

AI in the context of an end-user product or tool is not that (unless you're running the AI product company, where it becomes infra for you).

It's a blender or a kitchenaid mixer. Let's say companies gave those away for free but only charged metered fees to use them. If you had to pay $4 to blend something or run your mixer (and weren't sure if the tool would get it right the first time or you'd have to re-do it 5 times) you'd use them much less. They are optional.

You'd treat the decision to bake like a financial one. Nobody asks their spouse if it's okay to take a shower.

latentsea

1 day ago

[-]

LLMs are not essential for survival.

1 day ago

[-]

Yet.

latentsea

12 hours ago

[-]

Ever.

<https://en.wikisource.org/wiki/The_Wealth_of_Nations/Book_V/...>

4 hours ago

[-]

My point is that many technologies which were once thought of as luxuries eventually come necessities. Socially if not physically.

Adam Smith of all people writes of this:

By necessaries I understand not only the commodities which are indispensably necessary for the support of life, but whatever the custom of the country renders it indecent for creditable people, even of the lowest order, to be without. A linen shirt, for example, is, strictly speaking, not a necessary of life. The Greeks and Romans lived, I suppose, very comfortably though they had no linen. But in the present times, through the greater part of Europe, a creditable day-labourer would be ashamed to appear in public without a linen shirt, the want of which would be supposed to denote that disgraceful degree of poverty which, it is presumed, nobody can well fall into without extreme bad conduct. Custom, in the same manner, has rendered leather shoes a necessary of life in England. The poorest creditable person of either sex would be ashamed to appear in public without them.

Running water, sewerage, gas service, home postal service, electricity, automobiles, telephone service, Internet service, mobile phones, universal healthcare, and many other utilities were once considered luxuries but came to be recognised as essentials.

latentsea

1 hour ago

[-]

I don't need my linen shirt either. I'll be just fine, naked in the forest.

Jokes aside, I already don't live without my LLM. Rely on it too much. Though more for comfort than survival.

azinman2

1 day ago

[-]

But what you're missing is imagine you could pay a flat fee for all your utilities... and that fee is quite low. And imagine a very competitive marketplace where the switching costs are basically zero. What do you think will happen?

<https://www.pge.com/en/account/billing-and-assistance/financ...>

1 day ago

[-]

There are utilities which offer a uniform-monthly-billing option. PG&E in California does so as "Budget Billing":

Budget Billing is a free program that helps you easily manage your monthly energy costs . We calculate your monthly payment amount based on your average energy costs over the last 12 months and adjust your payment amount each month, so that you don't have big spikes on your bill.

While it's not a savings program, Budget Billing helps you stay in control of your bill by avoiding seasonal bill spikes.

azinman2

23 hours ago

[-]

That’s not the same at all. That’s a personalized plan based on your usage but smoothed out over the year for predictable bills, in contrast with Claude/openai’s uniform $20/$200/mo plans. You can’t use as much electricity as you want in that PGE plan. You still pay for what you use.

jpc0

23 hours ago

[-]

If everyone’s usage was above the set rate they would include rate limits which is literally a usage limit, which they have proven to do.

So no in fact you do get X tokens for 20 dollars and off your usage is too high they would absolutely put a stop to that. It wouldn’t even surprise me if they have per customer rate limits.

Where is the distinction here, it isn’t an unlimited plan, the limit just isn’t communicated.

namibj

1 day ago

[-]

That's the default in Germany due to yearly meter reading.

It's not at all flat rate, though, as you'll settle the balance after the year like a tax filing (refund or back pay).

4 hours ago

[-]

PG&E's programme also works with a re-balancing of payments to account for over- or under-utilisation. My understanding of recent changes to the system are that those adjustments themselves are also spread out over a number of months, again to increase overall predictability and avoid surprise billing.

nkrisc

20 hours ago

[-]

Imagine all your home appliances ran on magic instead of electricity, and you paid the magic company for each unit of magic used, except you had no idea how much magic each appliance needed to operate.

redhale

1 day ago

[-]

100,% this. PAYG bill anxiety is such a barrier that I will not use it AT ALL if I'm paying the bill. And even when my company is paying, I feel a need to be too careful as to negate much of the benefit. Raise the prices of the all-inclusive plans if need be. But don't make me use (or NOT use) the API for my day-to-day.

inetknght

1 day ago

[-]

> and requires the huge friction of having to do a cost/benefit analysis before every task

That's what we're supposed to do, right?

So let's see if we can spend a few tokens to ask the LLM for a cost/benefit analysis of using an LLM to solve the problem. I'd bet we can trust the result...

drak0n1c

1 day ago

[-]

Venice AI has a solution where I stake their Base ERC20 token, and in exchange I get a proportional amount of free daily-refreshing API access to a variety of high-powered open source models, besides the staking yield and free Pro subscription level access to their webapp.

It's great for free experimentation when coding apps that use API inference, it's actually motivating to build uses for it because it feels akin to a benefit that otherwise goes unused. Of course there is some price risk in holding their crypto token. After a large spike and drop in the first couple days, it's held steady for 6 months with a $2-$3 floor, possibly due to it having that baseline utility. Mentally, it's far more comfortable to me to stake a principal sum that I can wholly unstake in the future, than to spend on per-request fees.

https://www.binance.com/en/square/post/19617661216041

DonHopkins

21 hours ago

[-]

Sigh, yet another greater fool shilling a get-rich-quick pyramid scheme run by a serial crypto scammer who's already been charged by and settled with the SEC for offering unregistered securities, and is still running pump-and-dump insider trading schemes.

You're just going to get the rug pulled out from under you, and shilling on Hacker News isn't going to prevent that, and is totally inappropriate and uncalled for in a serious discussion about the economics of AI.

VVV Token of Venice AI Plummets 50% Due to Insider Trading Suspicions:

Did Venice Team Dump $5.7M Tokens After Coinbase Listing? Venice AI platform faces a new token-issuing and dumping allegation for $5.7M VVV tokens at the recent price after the Coinbase listing:

https://coingape.com/trending/did-venice-team-dump-5-7m-toke...

https://x.com/AmirOrmu/status/1886505621026984107

@AmirOrmu: Venice team issued themselves an additional $5.7M worth of new tokens RIGHT AFTER the Coinbase listing!

@ErikVoorhees, do you care to explain why?

They immediately sold $450K worth of $VVV using only this fresh address...

Wallet Address + Proof

SEC Charges Bitcoin Entrepreneur With Offering Unregistered Securities:

https://www.sec.gov/newsroom/press-releases/2014-111

A 2018 investigation by the Wall Street Journal alleged that Erik T. Voorhees's previous company ShapeShift had facilitated money laundering of $90 million in funds from criminal activities over a two-year period. Yet here you are shilling his latest get-rich-quick crypto scheme Venice AI.

How Dirty Money Disappears Into the Black Hole of Cryptocurrency. The Wall Street Journal investigation documents suspicious trades through venture capital-backed ShapeShift:

https://www.wsj.com/articles/how-dirty-money-disappears-into...

mhitza

1 day ago

[-]

> Again, don't surprise the user. But that's an anathema to companies who want to hide tokens to dollars, the same way gambling companies obfuscate 'corporate bux' to USD.

This is the exact same thing that frustrates me with GitHub's AI rollout. Been trialing the new Copilot agent, and it's cost is fully opaque. Multiple references to "premium requests" that don't show up real-time in my dashboard, not clear how many I have in total/left, and when these premium requests are referenced in the UI they link to the documentation that also doesn't talk about limits (instead of linking to the associated billing dashboard).

saratogacx

1 day ago

[-]

They don't make it easy to figure out but after researching it for my Co. this is what I came to.

    * One chat message -> one premium credit (most at 1 credit but some are less and some, like opus, are 10x)
    * Edit mode is the same as Ask/chat
    * One agent session (meaning you start a new agent chat) is one "request" so you can have multiple messages and they cost the credit cost of one chat message.

Microsoft's Copilot offerings are essentially a masterclass in cost opaqueness. Nothing in any offering is spelled out and they always seem to be just short of the expectation they are selling.

9dev

1 day ago

[-]

But how much is one premium request in real currency, and how many do I have per month?

https://docs.github.com/en/copilot/how-tos/manage-and-track-...

CraigJPerry

1 day ago

[-]

300 or 1500 per month depending on plan. $0.04 per premium request i believe.

datadrivenangel

1 day ago

[-]

You should see how Microsoft does the PowerBI/ Fabric billing. Gotta get premium capacity and licenses and regular capacity and it's so bad.

CraigJPerry

1 day ago

[-]

There's two costs to copilot coding agent, there's the 1 premium request plus there's the minutes the agent runs for comes out of your runner limits for the month.

This is coding agent, the asynchronous copilot, not the agent chatmode in copilot plugins for vscode etc

llbbdd

1 day ago

[-]

Highly recommend getting the $20/month OpenAI sub and letting copilot use that. Quality-wise I feel like I'm getting the same results but OAIs limits are a little more sane.

mhitza

1 day ago

[-]

I'm talking about this new agent mode https://github.blog/news-insights/product-news/github-copilo... for which as far as I'm aware there's no option to switch the underlying model used.

debian3

1 day ago

[-]

How do you link the openai sub to Gh copilot? I thought you needed to use OpenAI api

nijave

6 hours ago

[-]

>Why do you get a surprise bill?

You get a surprise bill because of surprise usage of services billed based on usage.

If you ask anyone how much water or electricity they use per month, the first thing they're going to do is look at last month's usage.

Estimating what you need ahead of time is a hard problem.

In fairness, AWS doesn't give you a lot of tools to help you measure and predict how many "units" you'll use outside of running your thing and measuring. On the other hand, running your thing and measuring is the defacto way to figure out how much of something you'll use.

Finally, there are AWS services like EC2 and RDS you can run at a fixed cost to help you stay within budget. Traffic/bandwidth is the only thing that comes to mind that you're pretty much required to use without a way to fix the cost (although you can get pretty close with bandwidth limits on EC2 interfaces)

ikari_pl

1 day ago

[-]

I often find Amazon pricing to be vague and cryptic, sometimes there's literally no way to tell ehy, for example, your database cost is fluctuating all the time

crinkly

1 day ago

[-]

Yeah that. We moved to AWS using their best practices and enterprise cost estimation stuff and got a 6x cost increment on something that was supposed to be cheaper and now we’re fucked because we can’t get out.

It’s nearly impossible to tell what the hell is going where and we are mostly surviving on enterprise discounts from negotiations.

The worst thing is they worked out you can blend costs in using AWS marketplace without having to raise due diligence on a new vendor or PO. So up it goes even more.

Not my department or funeral fortunately. Our AWS account is about $15 a month.

ajb

1 day ago

[-]

Are you using separate accounts per use case? That's the only real way to get a cost breakdown, otherwise you have no idea what piece of infrastructure is for what. They provide a tagging system but it's only informative if someone spends several hours a month tracking down the stuff that didn't get tagged properly.

https://docs.aws.amazon.com/organizations/latest/userguide/o...

bumblehean

1 day ago

[-]

> They provide a tagging system but it's only informative if someone spends several hours a month tracking down the stuff that didn't get tagged properly.

The way to deal with this is with an org-level Service Control Policy that enforces the tagging standards.

A resource doesn't have the right tags associated with it? It can't be created.

crinkly

1 day ago

[-]

Yeah we have many accounts. Hence why I know ours cheap. Difficult to break it down within the account as you say without tag maintenance.

AtheistOfFail

1 day ago

[-]

> The worst thing is they worked out you can blend costs in using AWS marketplace without having to raise due diligence on a new vendor or PO. So up it goes even more.

Not a bug, a feature.

joseda-hg

1 day ago

[-]

Amazon pricing is nice if you compare it to Azure...

1 day ago

[-]

If your AWS costs are too complex for you to understand you need to employ a finops person or AWS specialist to handle it for you.

I am not saying this is desirable, but it is necessary IFF you chose to use these services. They are complex by design, and intended primarily for large scale users who do have the expertise to handle the complexity.

Shank

1 day ago

[-]

> If your AWS costs are too complex for you to understand you need to employ a finops person or AWS specialist to handle it for you.

The point where you get sticker shock from AWS is often significantly lower than the point where you have enough money to hire in either of those roles. AWS is obviously the infrastructure of choice if you plan to scale. The problem is that scaling on expertise isn’t instant and that’s where you’re more likely to make a careless mistake and deploy something relatively costly.

wredcoll

1 day ago

[-]

It's a good thing aws saves us so much money that we can afford to hire aws specialists.

1 day ago

[-]

If you plan to scale to that extent, then why do you not have the money to hire the people who can use AWS? At least part time or as temporary consultants.

This:

> The point where you get sticker shock from AWS is often significantly lower than the point where you have enough money to hire in either of those roles

makes me doubt this:

> AWS is obviously the infrastructure of choice if you plan to scale.

22 hours ago

[-]

I'm wondering what the point of all of this is.

If you can afford the large fixed cost of vertical integration, it's always cheaper to do things yourself, so the sweet spot for using providers like AWS is scaling down, not up. A managed DB lets you hire a fraction of a sysadmin or devops person from AWS.

The moment you end up paying enough to AWS to hire a sysadmin, you basically are getting an antagonistic sysadmin from AWS, whose primary goal is to make as much money off you as possible. The incentives are not aligned.

motorest

1 day ago

[-]

> If your AWS costs are too complex for you to understand you need to employ a finops person or AWS specialist to handle it for you.

What a baffling comment. Is it normal to even consider hiring someone to figure out how you are being billed by a service? You started with one problem and now you have at least two? And what kind of perverse incentive are you creating? Don't you think your "finops" person has a vested interest in preserving their job by ensuring billing complexity will always be there?

SoftTalker

1 day ago

[-]

> Is it normal to even consider hiring someone to figure out how you are being billed by a service?

Absolutely. This was common for complicated services like telecom/long distance even in the pre-cloud days. Big companies would have a staff or hire a service to review telecom bills and make sure they weren’t overpaying.

dvfjsdhgfv

1 day ago

[-]

Paradoxically you are both right. Yes, the situation seems dystopian. Yes, hiring a finops person is a sound advice once your cloud bill gets big enough.

motorest

1 day ago

[-]

> Yes, hiring a finops person is a sound advice once your cloud bill gets big enough.

Is it, though? At best someone wearing that hat will explain the bill you're getting. What value do you get from that?

To cut costs, either you microoptimize things, of you redesign systems to shed expenses. The former gets you nothing, the latter is not something a "finops" (whatever that is supposed to mean) brings to the table.

1 day ago

[-]

You need to know what to optimise which means you need to know what you are spending on.

I did say it applies IFF and only IFF you choose to use these services, and if you have chosen to use these services you have presumably decided they are good value for money. If not, why are they using AWS.

Of course the complexity and extra cost of managing the billing is something that someone who has chosen to use AWS has already factored in, right?

The alternative is to not use AWS.

quesera

1 day ago

[-]

> IFF and only IFF

If and only if and only if and only if? :)

(also, while on the topic, I think a simple "if" covers it here, since the relationship is not bidirectional)

mulmen

1 day ago

[-]

If the cost of hiring the finops person is less than the savings over operating without one then you hire one, if it isn't then you don't.

bufferoverflow

1 day ago

[-]

It's not baffling. They know what they are getting billed for, that's transparent. They don't understand WHY they are getting billed 6x of what they expected. The problem here isn't with AWS, the problem is they don't understand why their usage is at 6x.

lelanthran

1 day ago

[-]

> If your AWS costs are too complex for you to understand you need to employ a finops person or AWS specialist to handle it for you

At that point wouldn't it simply be cheaper to do VMs?

1 day ago

[-]

Yes, very likely, but then why are you using AWS at all?

I think a lot of people are missing a key part of the wording of my comment, that capitalised for emphasis "IFF" (which means "if and only if").

I am absolutely certain a lot of people would save money using VMs - or at scale bare metal.

IMO a lot of people are using AWS because it is a "safe" choice management buy into that is not expensive in context (its not a big proportion of costs).

ajsnigrutin

1 day ago

[-]

But they're also simple and cheap if you're a "one man band" trying out some personal idea that might or might not take off. Those people have no budgets for specialists.

Pricing schemes like these just make them move back to virtual machines with "unlimited" shared cpu usage and setting up services (db,...) manually.

mort96

1 day ago

[-]

I'm 100% on team "just rent VMs and run the software on there". It's not that hard, it has predictable price and performance, and you don't lock yourself into one provider. If you build your whole service on top of some weird Amazons -specific thing, and Amazon jacks up their prices, you don't have any recourse. With VMs, you can just spin up new VMs with another provider.

You could also have potential customers who would be interested in your solution, but don't want it hosted by an American company. Spinning up a few Hetzner VMs is easy. Finding European alternatives to all the different "serverless" services Amazon offers is hard.

1 day ago

[-]

> You could also have potential customers who would be interested in your solution, but don't want it hosted by an American company.

Not happened yet. The nearest I have come to it was a requirement that certain medical information stays in the UK, and that is satisfied by using AWS (or other American suppliers) as long as its hosted in the UK.

mort96

1 day ago

[-]

I've worked in places where customers (especially municipalities in Germany) have questioned the use of American hosting providers. I don't know whether it has actually prevented a deal from going through (I wasn't close enough to sales to know), but it was consistently an obstacle in some markets. This is despite everything being hosted in EU datacenters.

kaffekaka

23 hours ago

[-]

Isn't the reason that even if it is hosted physically in the EU, if it is an american company the data is still not safe from american spy agencies?

mort96

18 hours ago

[-]

Yeah, something like that.

1 day ago

[-]

Yes, definitely.

Most small business I have dealt with use AWS do just need a VPS. If they are willing to move to a scary unknown supplier I suggest (unknown to them, very often one that would be well known to people on HN) then I suggest AWS Lightsail which is pretty much a normal VPS with VPS pricing - it significantly cheaper than an instance plus storage, just from buying them bundled (which, to be fair to Amazon, is common practice).

My own stuff goes on VPSs.

selcuka

1 day ago

[-]

> AWS Lightsail which is pretty much a normal VPS with VPS pricing

Except it is still Amazon and subject to the same weird billing practices. I once terminated a Lightsail instance and they kept charging me, claiming that I didn't terminate the static IP address associated with it. The IP address itself cost the same as the instance + IP address did.

Now, that would make sense in "real" AWS, but you'd expect it to be more straightforward with a simplified service like Lightsail.

UltraSane

1 day ago

[-]

AWS pricing is actually extremely clearly specified but it is hard to predict your costs unless you have a good understanding of your expected usage.

IX-103

16 hours ago

[-]

They have clear numbers for things, but it's not obvious how those numbers would map to what you're trying to run.

If I charged compute based on the number of micro-ops executed then that would be a clear definition, but the actual cost would not be something you could predict, as it would depend on what architecture of CPU you ended up running it on.

AWS is even more complicated and variable than that as for cloud storage you have to deal with not only the costs of the different storage classes, but also early deletion fees, access charges, etc. Combined it makes it impractical to work out how much deleting a file from cloud storage will save (or cost). Sure you could probably calculate it if you knew the entire billing history of the file and the bucket it is in, but do you really want to do that every time you delete a file?

While I don't know enough to say if this is intentional, as it could result from simply blindly optimizing for profit, this sort of pricing model is anti-capitalistic as it prevents consumers from truly making informed decisions. We see the same thing is the US healthcare system, where no one can actually tell you how much an operation will cost ahead of time. That creates a very inefficient (but very profitable) market.

Spooky23

1 day ago

[-]

Metering is great for defined processes. I love AWS because I can align cost with business. In the old days it was often hard and an internal political process. Some saleschick would shake the assets at a director and now I’m eating the cost for some network gear i don’t need.

But for users, that fine grained cost is not good, because you’re forcing a user to be accountable with metrics that aren’t tied to their productivity. When I was an intern in the 90s, I was at a company that required approval to make long distance phone calls. Some bureaucrat would assess whether my 20 minute phone call was justified and could charge me if my monthly expense was over some limit. Not fun.

Flat rate is the way to go for user ai, until you understand the value in the business and the providers start looking for margin. If I make a $40/hr analyst 20% more productive, that’s worth $16k of value - the $200/mo ChatGPT Pro is a steal.

ajb

1 day ago

[-]

Amazon is worse than this, though the AWS bait and switch is that you are supposed to save over the alternatives. So it should be worth switching if you would save more than the dev time you would invest in doing so right? But your company isn't going to do that. Because of opportunity cost. Your company expects to get some multiple of a the cost of dev time back, that they invest in their own business. And because of various uncertainties - in return, in the time taken to develop, in competition, etc - they will only invest dev time when that multiple is not small. I'm not a business manager, but I'd guess a factor of 5.

But that means that if you were conned into using infrastructure that actually costs more than the alternative, making your cost structure worse, you're still going to eat the loss because it's not worth taking your devs time to switch back.

But tokens don't quite have this problem -yet. Most of us can still do development the old way, and it's not a project to turn it off. Expect this to change though.

siva7

1 day ago

[-]

it's surprising that YC has a gazillion companies doing some ai infrastructure observability product yet i have to see a product that really presents me and the user easily token usage and pricing estimations which for me is the #1 criteria to use that. make billing and pricing for me and the user easier. instead they run their heads into evals and niche features.

selcuka

1 day ago

[-]

> consumers hate metered billing. they'd rather overpay for unlimited than get surprised by a bill.

Standard packages are like insurance. Everyone pays more or less the same premium, but some claim more than others. On average people always overpay for insurance.

The upside is that it's a predictable cost for the users, and also means predictable cash flow for the provider.

sublinear

1 day ago

[-]

IT IS GAMBLING

Imustaskforhelp

13 hours ago

[-]

Caps lock are mostly unwarranted on hackernews but I am not the PC principal.

Like, lets have a real talk, shall we? Lets just assume that the topic we are discussing on that I am right and you are wrong, How can I even convince ya when you are showing so less of maturity...

And lets say that you are right and I am wrong, but the fact that you are being so bullish on the fact that you can't be wrong and bringing the "this is worse than reddit" and etc. can't make me take you serious and can't make me think your opinion is valid.

If you really want, I'd like to logically disect this stuff as adults using pure logic & not mere opinions.

Waiting for your response.

chrisweekly

1 day ago

[-]

GOTAHCA?

arcanemachiner

1 day ago

[-]

Maybe GOTCHA?

scoreandmore

1 day ago

[-]

You can set billing alerts and write a lambda function to respond and disable resources. Of course they don’t make it easy but if you don’t learn how to use limits what do you expect? This argument amazes me. Cloud services require some degree of responsibility on the users side.

mystraline

1 day ago

[-]

This is complete utter hogwash.

Up until recently, you could hit somebody else's S3 endpoint, no auth, and get 403's that would charge them 10s of thousands of dollars. Coudnt even firewall it. And no way to see, or anything. Number go up every 15-30 minutes in cost dashboard.

Real responsibility is 'I have 100$ a month for cloud compute'. Give me a easy way to view it, and shut down if I exceed that. That's real responsibility, that Scamazon, Azure, Google - none of them 'permit'.

They (and well, you) instead say "you can build some shitty clone of the functionality we should have provided, but we would make less money".

Oh, and your lambda job? That too costs money. It should not cost more money to detect and stop stuff on 'too much cost' report.

This should be a default feature of cloud: uncapped costs, or stop services

mbac32768

17 hours ago

[-]

I low key live in fear that if I die, my personal AWS bill will get out of control and consume my entire estate before probate court can award my assets.

leoedin

15 hours ago

[-]

I just don't use AWS for personal projects. I got stung once with a $100 bill. Never again. I just can't accept unlimited liability like that in my personal life.

HelloImSteven

1 day ago

[-]

Lambda has 1mil free requests per month, so there’s a chance it would be free depending on your usage. But still, it’s not straightforward at all, so I get it.

Perhaps requiring support for bill capping is the right way to go, but honestly I don’t see why providers don’t compete at all here. Customers would flock to any platform with something like “You set a budget and uptime requirements, we’ll figure out what needs to be done”, with some sort of managed auto-adjustment and a guarantee of no overage charges.

Ah well, one can only dream.

RussianCow

1 day ago

[-]

> but honestly I don’t see why providers don’t compete at all here

Because the types of customers that make them the most money don't care about any of this stuff. They'll happily pay whatever AWS (or other cloud provider) charges them, either because "scale" or because the decision makers don't realize there are better options for them. (And depending on the use case, sometimes there aren't.)

anothernewdude

17 hours ago

[-]

I do my test infrastructure with prepaid credit cards. If billing goes over, I just drop the account and start again.

22 hours ago

[-]

I don't know why every single person here insists on budget based limits. What you want is resource based limits with throttling and a calculator that takes your resource limits to determine the averaged monthly bill and a traffic spike bill.

Then the goal would be to set the resource limits to something you are happy with.

Yes, this is a pain in the ass to set up and AWS will probably never implement this, but it is the correct solution.

anothernewdude

17 hours ago

[-]

Because infinite downside makes the EV of using AWS infinitely negative. The biggest risk of any business using AWS is that AWS bankrupts you.

gray_-_wolf

1 day ago

[-]

Last time I was looking into this, is there not up to an hour of delay for the billing alerts? It did not seem possible to ensure you do not run over your budget.

esafak

1 day ago

[-]

So you're okay with turning your site off...

mystraline

1 day ago

[-]

This a logical fallacy of false dilemma.

I made it clear that you ask the user to choose between 'accept risk of overrun and keep running stuff', 'shut down all stuff on exceeding $ number', or even a 'shut down these services on exceeding number', or other possible ways to limit and control costs.

The cloud companies do not want to permit this because they would lose money over surprise billing.

verbify

1 day ago

[-]

Isn't that the definition of metered billing?

dd36

1 day ago

[-]

Cats doing tricks has a limited budget.

anothernewdude

17 hours ago

[-]

YES!

ej88

1 day ago

[-]

The article just isn't that coherent for me.

> when a new model is released as the SOTA, 99% of the demand immediately shifts over to it

99% is in the wrong ballpark. Lots of users use Sonnet 4 over Opus 4, despite Opus being 'more' SOTA. Lots of users use 4o over o3 or Gemini over Claude. In fact it's never been a closer race on who is the 'best': https://openrouter.ai/rankings

>switch from opus ($75/m tokens) to sonnet ($15/m) when things get heavy. optimize with haiku for reading. like aws autoscaling, but for brains.

they almost certainly built this behavior directly into the model weights

???

Overall the article seems to argue that companies are running into issues with usage-based pricing due to consumers not accepting or being used to usage based pricing and it's difficult to be the first person to crack and switch to usage based.

I don't think it's as big of an issue as the author makes it out to be. We've seen this play out before in cloud hosting.

- Lots of consumers are OK with a flat fee per month and using an inferior model. 4o is objectively inferior to o3 but millions of people use it (or don't know any better). The free ChatGPT is even worse than 4o and the vast majority of chatgpt visitors use it!

- Heavy users or businesses consume via API and usage based pricing (see cloud). This is almost certainly profitable.

- Fundamentally most of these startups are B2B, not B2C

margalabargala

1 day ago

[-]

> Lots of users use 4o over o3

How much of that is the naming?

Personally I just avoid OpenAIs models entirely because I have absolutely no way of telling how their products stack up against one another or which to use for what. In what world does o3 sort higher than 4o?

If I have to research your products by name to determine what to use for something that is already a commodity, you've already lost and are ruled out.

22 hours ago

[-]

It's the naming. He is confusing 4o/4o-mini with o4-mini, the latter is a pretty strong model and it's also one of the newest. Oh and it's cheaper than the non-mini 4o.

ej88

13 hours ago

[-]

No, I meant 4o over o3. For a ton of people a reasoning model's latency is overkill for them asking for inspiration on what to make for dinner.

margalabargala

17 hours ago

[-]

There's both a 4o and an o4? And they're different?

4 hours ago

[-]

Yes. 4o is a non-CoT model that is the continuation of the GPT-4 line, itself superseded by 4.1. o4 is the continuation of the CoT model line.

There's also 4o-mini and o4-mini...

stingraycharles

18 hours ago

[-]

o4-mini isn’t really that great in comparison to o3, and I still use o3 as my daily driver for reasoning tasks. I don’t really have a purpose for o4-mini, not even for coding tasks.

motorest

1 day ago

[-]

> In fact it's never been a closer race on who is the 'best'

Thank you for pointing out that fact. Sometimes it's very hard to keep perspective.

Sometimes I use Mistral as my main LLM. I know it's not lauded as the top performing LLM but the truth of the matter is that it's results are just as useful as the best models that ChatGPT/Gemini/Claude outputs, and it is way faster.

There is indeed diminished returns on the current blend of commercial LLMs. Deep seek already proved that cost can be a major factor and quality can even improve. I think we're very close to see competition based on price, which might be the reason there is so much talk about mixture of experts approaches and how specialized models can drive down cost while improving targeted output.

Alex-Programs

20 hours ago

[-]

If you're after speed, Groq is excellent. They've recently added Kimi K2.

torginus

1 day ago

[-]

Yeah, my biggest problem with CC is that it's slow, prone to generating tons of bullshit exposition, and often goes down paths that I can tell almost immediately will yield no useful result.

It's great if you can leave it unattended, but personally, coding's an active thing for me, and watching it go is really frustrating.

Alex-Programs

20 hours ago

[-]

I can't deal with any of the in editor tools. I'd love something that handled inputting changes (with manual review!) while still giving me 100% control over the context and actually doing as its told.

furyofantares

1 day ago

[-]

> claude code has had to roll back their original unlimited $200/mo tier this week

The article repeats this throughout but isn't it a straight lie? The plan was named 20x because it's 20x usage limits, it always had enforced 5 hour session limits, it always had (unenforced? soft?) 50 session per month limits.

It was limited, but not enough and very very probably still isn't, judging by my own usage. So I don't think the argument would even suffer from telling the truth.

Aurornis

1 day ago

[-]

You’re right, the Max plan was never advertised as unlimited.

I can’t believe how many comments and articles I’ve read that assume it was unlimited.

It’s like it has been repeated so many times that it’s assumed to be true.

adastra22

1 day ago

[-]

Also the “walkback” of the unlimited Max plan was anything but. The plan is still exactly the same for 95% of the subscribers. It just turns out that less than 5% were doing things like scripting claude to run multiple sessions 24/7 with the same login credentials. The Max plan was supposed to be a single user plan, not an infinite-parallel-sessions plan.

t14000

22 hours ago

[-]

Probably they asked AI if it was unlimited and it responded something like "oh wise and sagacious user what an amazingly insightful question! Yes, yes Max is unlimited! Would you like me to help you use an infinite amount of tokens?"

michaelbuckbee

1 day ago

[-]

A major current problem is that we're smashing gnats with sledgehammers via undifferentiated model use.

Not every problem needs a SOTA generalist model, and as we get systems/services that are more "bundles" of different models with specific purposes I think we will see better usage graphs.

benreesman

1 day ago

[-]

Because none of them are good enough yet to trust completely with any task. Even the absolute best ones still fart out at surprising times, and for most stuff I have an AI that's always on, it requires no cognitive overhead to delegate to my own brain. So to delegate, it has to be a reliable win: I'm not here to make AI look good, I'm here to make my own performance be good, only a sure thing is a candidate for reflexive delegation.

AI companies advertise peak AI performance, users select AI tools on worst case AI fuckups: hence, only SOTA is ever in demand. TFA illustrates this well.

AI will be judged on it's worst performance, just like people are fired for their worst showing, not their best. No one cares about AI performance in ideal (read: carefully contrived) settings. We care how bad it fucks up when we take our eyes off it for 2 seconds.

nijave

6 hours ago

[-]

This is a place testing and benchmarking can definitely save you money.

It's the same as compute--you can skip testing and throw money at the problem but you're going to end up paying more.

We have some pretty basic guidelines at work and I think that's a decent starting point. They amount to a few example prompts/problem types and which OpenAI model to try using first for best bang for your buck.

I think some of it also comes down to scale. Buying a 5 pack of sledgehammers isn't a terrible value when everything comes in a "5 pack" and you only need <= 5 tools total. Or more practically, on the small end it's more economical to run general purpose models than tailor more specific models. Once you start invoking them enough, there's a break even and flip point where spending more time on the tailored or custom model is cheaper.

empiko

1 day ago

[-]

Yeah, but the juiciest tasks are still far from solved. The amount of tasks where people are willing to accept low accuracy answers is not that high. It is maybe true for some text processing pipelines, but all the user facing use cases require good performance.

mustyoshi

1 day ago

[-]

Yeah this is the thing people miss a lot. 7,32b models work perfectly fine for a lot of things, and run on previously high end consumer hardware.

But we're still in the hype phase, people will come to their senses once the large model performance starts to plateau

_heimdall

1 day ago

[-]

I expect people to come to their senses when LLM companies stop subsidizing cost and start charging customers what it actually costs them to train and run these models.

gunalx

20 hours ago

[-]

I mean, there is no reason for a inference provider og open models to subsidice you. And costs there is usually cheaper than Claude API pricing.

_heimdall

18 hours ago

[-]

Its still a market though, there is always the incentive to subsidize if all the competition is keeping prices artificially low.

zamadatix

1 day ago

[-]

People don't want to guess which sized model is right for a task and current systems are neither good or efficient at trying to estimate that automatically. I see only the power users tweaking more and more as performance plateaus and the average user only changing when it's automatic.

bakugo

1 day ago

[-]

> 7,32b models work perfectly fine for a lot of things

Like what? People always talk about how amazing it is that they can run models on their own devices, but rarely mention what they actually use them for. For most use cases, small local models will always perform significantly worse than even the most inexpensive cloud models like Gemini Flash.

totaa

19 hours ago

[-]

Gemma 3n E4B has been crazy good for me - fine tune running on Google Cloud Run via Ollama, completely avoiding token based pricing at the cost of throughput limitations

pigeonhole123

18 hours ago

[-]

What kind of applications are you using it for?

simonjgreen

1 day ago

[-]

Completely agree. It’s worth spending time to experiment too. A reasonably simple chat support system I build recently uses 5 different models dependent on the function it it’s in. Swapping out different models for different things makes a huge difference to cost, user experience, and quality.

alecco

1 day ago

[-]

If there was an option to have Claude Opus guide Sonnet I'd use it for most interactions. Doing it manually is a hassle and breaks the flow, so I end up using Opus too often.

This shouldn't be that expensive even for large prompts since input is cheaper due to parallel processing.

isoprophlex

1 day ago

[-]

You can define subagents that are forced to run on eg. Sonnet, and call these from your main Opus backed agent. /agent in CC for more info...

danielbln

1 day ago

[-]

That's what I do. I used to use Opus for the dumbest stuff, writing commits and such, but now that' all subagent business that run on Sonnet (or even Haiku sometimes). Same for running tests, executing services, docker etc. All Sonnet subagents. Positive side effect: my Opus allotment lasts a lot longer.

illusive4080

1 day ago

[-]

I’m just sitting here on my $20 subscription hoping one day we will get to use Opus

edg5000

1 day ago

[-]

You can just get your own account right? Just pay out of pocket.

nateburke

1 day ago

[-]

generalist = fungible?

In the food industry is it more profitable to sell whole cakes or just the sweetener?

The article makes a great point about replit and legacy ERP systems. The generative in generative AI will not replace storage, storage is where the margins live.

Unless the C in CRUD can eventually replace the R and U, with the D a no-op.

marcosdumay

1 day ago

[-]

> In the food industry is it more profitable to sell whole cakes or just the sweetener?

I really don't understand where you are trying to get. But on that example, cakes have a higher profit margin, and sweeteners have larger scale.

jdietrich

17 hours ago

[-]

Isn't this just MoE?

djhworld

1 day ago

[-]

Over the past year or two I've just been paying for the API access and using open source frontends like LibreChat to access these models.

This has been working great for the occasional use, I'd probably top up my account by $10 every few months. I figured the amount of tokens I use is vastly smaller than the packaged plans so it made sense to go with the cheaper, pay-as-you-go approach.

But since I've started dabbling in tooling like Claude Code, hoo-boy those tokens burn _fast_, like really fast. Yesterday I somehow burned through $5 of tokens in the space of about 15 minutes. I mean, sure, the Code tool is vastly different to asking an LLM about a certain topic, but I wasn't expecting such a huge leap, a lot of the token usage is masked from you I guess wrapped up in the ever increasing context + back/forth tool orchestration, but still

zurfer

1 day ago

[-]

The simple reason for this is that Claude Code uses way more context and repetitions than what you would use in a typical chat.

TechDebtDevin

1 day ago

[-]

$20.00 via Deepseek's api (Yes China, can have my code idc), has lasted me almost a year. Its slow, but better quality output than any of the independently hosted Deepseek models (ime). I don't really use agents or anything tho.

vitaflo

1 day ago

[-]

Agreed, I'm still trying to use up my first $5 on Deepseek. The best thing is the off-peak rate is during the US work day and is only 55 cents per million tokens. Great for use with agents cuz you never have to worry about cost or throttling.

Everyone complains about the prices of other models but there are much cheaper alternatives out there and DS is no slouch either.

drudolph914

12 minutes ago

[-]

wow is the uber moment for AI already over? that was fast

johnfn

1 day ago

[-]

Some AI company needs to create a model that can delegate simple tasks to 'stupider' models. I often encounter a task which is complicated enough to require a strong model like Opus, but which subdivides into a number of tasks - the vast majority of which are enough that 3.5 Sonnet could pick it up. All Opus would need to do is subdivide the task into easy and hard bits, then spin up a bunch of 3.5 Sonnets for the easy stuff.

This seems like such an obvious idea that I'm sure everyone is already working on it!

kgilpin

1 day ago

[-]

Claude code does utilize both the full Sonnet model and the lighter Haiku model in an automatic way. When you exit a Claude code session, it gives you the stats (tokens, cost, etc). I expect there’s a way to get this in-session as well.

mhlakhani

1 day ago

[-]

I believe you can just hit /cost within a session for this

subarctic

1 day ago

[-]

Only if you're using it without a subscription, and without one it doesn't take much usage to get to $20 in a month

https://aider.chat/docs/leaderboards/

andai

22 hours ago

[-]

I think Auto-GPT or a similar project did something like this in 2023.

Aider has an option where you can combine different models, so one does the thinking and one implements the changes.

e.g. o3 (high) + 4.1

There used to be more such combinations in this list but it seems they were removed. A while ago I think the best thing on the list was such a hybrid.

throwmeaway222

1 day ago

[-]

Maybe have the prompt tell the model emit the model "level" needed between 1 and 10 for each subtask it generates

torginus

1 day ago

[-]

First of all, do they shoot you in San Francisco, if you use capital letters and punctuation?

Second, why are SV people obsessed with fake exponentials? It's very clear that AI progress has only been exponential in the sense that people are throwing a lot more resources at AI then they did a couple years ago.

369548684892826

1 day ago

[-]

> First of all, do they shoot you in San Francisco, if you use capital letters and punctuation?

Is it done like this just to show it wasn't written by a LLM?

handfuloflight

1 day ago

[-]

The Substack tagline for the account is "all content here is generated by ai"

SweetSoftPillow

22 hours ago

[-]

It doesn't show that. You can prompt AI to omit capital letters and punctuation.

luqtas

1 day ago

[-]

oh no! i can't deal with the natural morphing a lingua-franca has! /j

Thou needst to live in the archaic.

Lucasoato

1 day ago

[-]

If you go in Tenderloin or Mission Street yes, but even if you don't use capital letters and punctuation.

jsnell

1 day ago

[-]

> now look at the actual pricing history of frontier models, the ones that 99% of the demand is for at any given time:

The meaningful frontier isn't scalar on just the capability, it's on capability for a given cost. The highest capability models are not where 99% of the demand is on. Actually the opposite.

To get an idea of what point on the frontier people prefer, have a look at the OpenRouter statistics (https://openrouter.ai/rankings). Claude Opus 4 has about 1% of their total usage, not 99%. Claude Sonnet 4 is the single most popular model at about 18%. The runners up in volume are Gemini Flash 2.0 and 2.5, which are in turn significantly cheaper than Sonnet 4.

bakugo

1 day ago

[-]

This is true. I agree with the overall premise of the article, but claiming that Opus is more used than Sonnet is just wrong.

One of the graphs even lists a "Claude 3.5 Opus", which does not exist. After 3.5 Sonnet was released, 3 Opus largely fell into irrelevance until they decided to finally release another big, expensive model with Opus 4, which still isn't anywhere near as popular as Sonnet 4 with users who pay API prices.

GiorgioG

1 day ago

[-]

I tried Gemini CLI and in 2 hours somehow spent $22 just messing around with a very small codebase. I didn’t find out until the next day from Google’s billing system. That was enough for me - I won’t touch it again.

adrianbooth17

1 day ago

[-]

Isn't Gemini CLI free? Or did you BYOK?

squirrellous

1 day ago

[-]

The free usage gets exhausted really fast. I had a similar experience as the parent where I easily burned through $70+ in two days with two requests in a medium size codebase. Then I switched to free, and it couldn’t finish a single request end to end.

On top of this Gemini CLI still doesn’t support paying through the Google AI subscription. I assume it’s some sort of bureaucratic reason that’s preventing them from moving quickly.

GiorgioG

1 day ago

[-]

BYOK

JofArnold

17 hours ago

[-]

Same. 30m tokens in a few hours (of which many were cached it seems and only a few output tokens). I use Gemini to solve only specific problems and in my case $20+ was worth it.

farkin88

1 day ago

[-]

Even though tokens are getting cheaper, I think the real killer of "unlimited" LLM plans isn't token costs themselves, it's the shape of the usage curve that's unsustainable. These products see a Zipf-like distribution: thousands of casual users nibble a few-hundred tokens a day while a tiny group of power automations devour tens of millions. Flat pricing works fine until one of those whales drops a repo-wide refactor or a 100 MB PDF into chat and instantly torpedoes the margin. Unless vendors turn those extreme loops into cheaper, purpose-built primitives (search, static analyzers, local quantized models, etc.), every "all-you-can-eat" AI subscription is just a slow-motion implosion waiting for its next whale.

andyferris

1 day ago

[-]

I actually think Anthropic's plans with capped usage per 5-hour period and per week is good, for exactly this problem.

I'd prefer it just specify a number of tokens rather than be variable on demand - I see that lets them be more generous during low periods but the opacity of it all sucks. I have 5-minute time-of-use pricing on my electricty and can look up the current rate on my phone in an insant - why not simply provide an API to look up the current "demand factor" for Claude (along with the rules for how the demand factor can change - min and max values for example) and let it be fully transparent?

farkin88

1 day ago

[-]

Anthropic still relies on quotas (5-hour rolling + weekly coming at the end of the month) so the next step is dynamic per-token pricing. But, even with transparent off-peak rates only batch jobs will shift and history suggests variable pricing usually smooths rather than sharpens peaks. The long-term fix stays the same: route 100 MB PDFs, repo refactors and other whale jobs through retrieval or analysis pipelines and keep the flagship chat model for real-time conversation.

andai

22 hours ago

[-]

>People will always use the heaviest model

Actually when doing my first attempt at vibe coding a few months ago, I found that Gemini Flash was fine for my tasks, and way faster than the heavier models. So I found the smaller model a vastly superior user experience.

The speed really adds up when you're using the autonomous coding agents, since they tend to require many LLM calls for a few simple changes.

dcre

1 day ago

[-]

Vibes-based analysis. We have no idea how much these models cost to serve.

ankit219

1 day ago

[-]

Interesting article, full of speculation and some logical follows, but feels like it feels short of admitting what the true conclusion is. Model building companies can build thinner wrapper / harness and can offer better prices than third party companies (the article assumes it costs anthropic same price per token as it does for their customers) because their costs per token is lower than app layer companies. Anthropic has a decent margin (likely higher than openai) on sale of every token, and with more scale, they can sell at a lower cost (or some unlimited plans with limits that keeps out 1%-5% of the power users).

I don't agree with the Cognition conclusion either. Enterprises are fighting super hard to not have a long term buying contract when they know SOTA (app or model) is different every 6 months. They are keeping their switching costs low and making sure they own the workflow, not the tool. This is even more prominent after Slack restricted API usage for enterprise customers.

Making money on the infra is possible, but that again misunderstands the pricing power of Anthropic. Lovable, Replit etc. work because of Claude. Openai had codex, google had jules, both aren't as good in terms of taste compared to Claude. It's not the cli form factor which people love, it's the outcome they like. When Anthropic sees the money being left on the table in infra play, they will offer the same (at presumably better rates given Amazon is an investor) and likely repeat this strategy. Abstraction is a good play, only if you abstract it to the maximum possible levels.

mark_l_watson

1 day ago

[-]

I have already thought a lot about the large packaged inference companies hitting a financial brick wall, but I was surprised by material near the end of the article: the discussions of lock in for companies that can’t switch and about Replit making money on the whole stack. Really interesting.

I managed a deep learning team at Capital One and the lock-in thing is real. Replit is an interesting case study for me because after a one week free agent trial I signed up for a one year subscription, had fun the their agent LLM-based coding assistant for a few weeks, and almost never used their coding agent after that, but I still have fun with Replit as an easy way to spin up Nix based coding environments. Replit seems to offer something for everyone.

Waterluvian

1 day ago

[-]

On the topic of cost per token, is it accurate to represent a token as, ideally, a composable atomic unit of information. But because we’re (often) using English as the encoding format, it can only be as efficient as English can encode the data.

Does this mean that other languages might offer better information density per token? And does this mean that we could invent a language that’s more efficient for these purposes, and something humans (perhaps only those who want a job as a prompt engineer) could be taught?

Kevin speak good? https://youtu.be/_K-L9uhsBLM?si=t3zuEAmspuvmefwz

4 hours ago

[-]

But also, arguably, Lojban is the language you want to use for LLMs. Especially for the chain of thought.

And the interesting property of Lojban is that it has unambiguous grammar that can be syntax-checked by tools and enforced by schemas, and machine-translated back to English. I experimented with it a bit and found that large SOTA models can generate reasonably accurate translations if you give them tools like dictionary and parser and tell them to iterate until they get a syntactically valid translation that parses into what they meant to say. So perhaps there is a way to generate a large enough dataset to train a model on; I wish I had enough $$$ to try this on a lark.

4 hours ago

[-]

In principle, yes. And you could do the same for programming languages.

In practice, the problem is that any such constructed language wouldn't have a corpus large enough to train on.

It's really unfortunate that we ended up with English as the global lingua franca right at the time generative AI came about, because it is effectively cementing that dominance. Even Chinese models are trained mostly on English AFAIK.

https://www.science.org/content/article/human-speech-may-hav...

deegles

1 day ago

[-]

Human speech has a bit rate of around 39 bits per second, no matter how quickly you speak. assuming reading is similar, I guess more "dense" tokens would just take longer for humans to read.

__s

1 day ago

[-]

Sure, but that link has Japanese at 5 bits per syllable & Vietnamese at 8 bits per syllable, so if billing was based on syllables per prompt you'd want Vietnamese prompts

Granted English is probably going to have better quality output based on training data size

fy20

1 day ago

[-]

English often has a lot of redundancy, you could rewrite your comment to this and still have it convey the original meaning:

Regarding cost per token: is a token ideally a composable, atomic unit of information? Since English is often used as an encoding format, efficiency is limited by English's encoding capacity.

Could other languages offer higher information density per token? Could a more efficient language be invented for this purpose, one teachable to humans, especially aspiring prompt engineers?

67 tokens vs 106 for the original.

Many languages don't have articles, you could probably strip them from this and still understand what it's saying.

joseda-hg

1 day ago

[-]

IIRC, in linguistics there's a hypothesis for "Uniform Information density" languages seem to follow on a human level (Denser languages slow down, sparse languages speed up) so you might have to go for an Artificial encoding, that maps effectively to english

English (And any of the dominant languages that you could use in it's place) work significantly better than other languages purely by having significantly larger bodies of work for the LLM to work from

Waterluvian

1 day ago

[-]

Yeah I was wondering about it basically being a dialect or the CoffeeScript of English.

Maybe even something anyone can read and maybe write… so… Kevin English.

Job applications will ask for how well one can read and write Kevin.

r_lee

1 day ago

[-]

Sure, for example Korean is unicode heavy, e.g. 경찰 = police, but its just 2 unicode chars. Not too familiar with how things are encoded but it could be more efficient

ktzar

20 hours ago

[-]

Also, the way models are evolving (thinking process, llms waiting for interactions with external entities via MCP, mixture of experts, ...) are making "useful chatbot responses" way way way more expensive than they used to be when you were pretty much hitting an autocomplete. To a level where these are starting to be prohibitive to run locally at a decent tokens/s speed, and we're being tied to using their models.

strangescript

1 day ago

[-]

We haven't reached a peak on scaling/performance, so even if an old model can be commoditized, a new one will be created to take advantage of the newly freed infra. Until we hit a ceiling on scaling, tokens are going to remain expensive relative to what people are trying to do with them because the underlying compute is expensive.

fennecfoxy

17 hours ago

[-]

Because models aren't becoming more evolved, just bigger - that's the commercial incentive. NOW NOW NOW PROFITS NOW, aka VC & all what YC is about.

We need another attention is all you need, or three, imo. I hope the gold rush isn't impacting research, but I bet it is.

xrd

1 day ago

[-]

This is the moment an open source solution could pop in and say just "uv add aider" and then make sure you have a 24gb card for Qwen3 for each dev, and you are future proofed for at least the next year. It seems like the only way out.

cadamsdotcom

1 day ago

[-]

The article misses the “musical chairs” game that gets played in land-grabs.

Look at Uber. Lost money for over a decade buying market share with venture capital. Now post IPO they have settled in to a position in the minds of users that’s hard to shake even as cheaper competition arrives. They have a durable business and a steady (even if not amazing) stock price.

jstummbillig

1 day ago

[-]

This is silly? The important metric is value per token, which is obviously increasing, and thus the relative token is getting cheaper because you need far less of them to produce anything of value.

Which then might lead to you using a lot more, because it offsets some other thing that costs even more still, like your time.

acedTrex

1 day ago

[-]

"Which is obviously increasing"

With the primary advancements over the past two years being Chain Of Thought which absolutely obliterates token counts in what world would the "per token" value of a model be going up...

jstummbillig

1 day ago

[-]

If you are able to cogently explain how you would instruct GPT 3.5 with ANY amount of tokens to do what Sonnet 4 is able to do, I am sure there's a lot of wealthy people that would be very interested in having a talk with you.

22 hours ago

[-]

You build a reasoning variant of GPT 3.5. Please deposit 1 billion dollars in my bank account for this advice.

esafak

1 day ago

[-]

This is wrong. People are not dropping old models when new ones come out. I'm always on the lookout for cost effective models. The logical thing is to use the cheapest model that gets the job done, and you get a sense for that once you the model for a while.

It is standard practice with some coding agents to have different models for different tasks, like building and planning.

comrade1234

1 day ago

[-]

I'm kind of curious what IntelliJ's deal is with the different providers. I usually just keep it set to Claude but there are others that you can pick. I don't pay extra for the AI assistant - it's part of my regular subscription. I don't think I use the AI features as heavily as many others, but it does feed my code base to whoever I'm set to...

louthy

1 day ago

[-]

Are you sure you don’t pay extra? I’m on Rider and it’s an additional cost. Unless us C# and F# devs are subsidising everyone else :D

Edit: It says on the Jetbrains website:

“The AI Assistant plugin is not bundled and is not enabled in IntelliJ IDEA by default. AI Assistant will not be active and will not have access to your code unless you install the plugin, acquire a JetBrains AI Service license and give your explicit consent to JetBrains AI Terms of Service and JetBrains AI Acceptable Use Policy while installing the plugin.”

double051

1 day ago

[-]

If you pay for the all products subscription, their AI features are now bundled in. I believe that may be a relatively recent change, and I would not have known about it if I hadn't been curious and checked.

louthy

1 day ago

[-]

I’ve just checked. I have the ‘dotUltimate’ bundle, which now appears to include ‘AI Pro’.

They didn’t cancel my existing ‘AI Pro’ subscription though, and have just let it keep running with no refunds.

Thanks, Jetbrains. You get worse every day.

comrade1234

1 day ago

[-]

When they first added the assistant it was $100/yr to enable it. However, it's now part of the subscription and they even reimbursed me a portion of the $100 that I paid.

terminalbraid

1 day ago

[-]

You're one of the lucky ones. They just outright stole from many of the people who did pay for it.

terminalbraid

1 day ago

[-]

Considering they didn't significantly change their pricing when they bundled the equivalent of a ~$10-20/mo subscription to their Ultimate pack (which I pay something around $180/year for), I'm guessing they're eating a lot of the cost out of desperation for an imagined problem. That or they were fleecing everyone from the beginning.

raincole

1 day ago

[-]

First of all the title is click-bait. Tokens are getting cheaper and cheaper. People just use more and more tokens.

And everything, I mean everything after the title is only a downhill:

> saying "this car is so much cheaper now!" while pointing at a 1995 honda civic misses the point. sure, that specific car is cheaper. but the 2025 toyota camry MSRPs at $30K.

Cars got cheaper. The only reason you don't feel it is trade barrier that stops BYD from flooding your local dealers.

> charge 10x the price point > $200/month when cursor charges $20. start with more buffer before the bleeding begins.

What does this even mean? The cheapest Cursor plan is $20, just like Claude Code. And the most expensive Cursor plan is $200, just like Claude Code. So clearly they're at the exact same price point.

> switch from opus ($75/m tokens) to sonnet ($15/m) when things get heavy. optimize with haiku for reading. like aws autoscaling, but for brains.

> they almost certainly built this behavior directly into the model weights, which is a paradigm shift we’ll probably see a lot more of

"I don't know how Claude built their models and I have no insider knowledge, but I have very strong opinions."

> 3. offload processing to user machines

What?

> ten. billion. tokens. that's 12,500 copies of war and peace. in a month.

Unironically quoting data from viberank leaderboard, which is just user-submitted number...

> it's that there is no flat subscription price that works in this new world.

The author doesn't know what throttling is...?

I've stopped reading here. I should've just closed the tab when I saw the first letter in each sentence isn't capitalized. This is so far the most glaring signal of slop. More than the overuse of em-dash and lists.

4 hours ago

[-]

> the first letter in each sentence isn't capitalized. This is so far the most glaring signal of slop.

just fyi, it's a very common manner of writing for younger folk online. more so in informal contexts, but as with everything else, once it's widely adopted it starts to creep into the more formal communication. it's not about "slop", it's just a cultural convention.

i should also note that many languages that got their orthographies defined relatively recently (e.g. various native american languages) use all-lowercase as well, by design. so there's no inherent reason why english can't do that either.

1 day ago

[-]

All good points, but:

> when I saw the first letter in each sentence isn't capitalized. This is so far the most glaring signal of slop.

How so? It's the exact opposite imho. Lowercase everything with a staccato writing style to differentiate from AI slop, because LLMs usually don't write lowercase.

lelanthran

1 day ago

[-]

I think GP is drawing a distinction between "slop" and "AI slop".

This comes across as sloppily written, but not sloppily generated.

SweetSoftPillow

22 hours ago

[-]

LLM will write lowercase if you prompt it to do that

ankit219

1 day ago

[-]

Likely op does not mean ai slop, but more a signal of human carelessness that they could not write it in a proper manner.

Semaphor

1 day ago

[-]

Human slop instead of AI. Our race is catching up to the machines again.

gdiamos

1 day ago

[-]

The NVIDIA stock price keeps going up because accuracy/intelligence is more valuable than efficiency

Scaling laws let you spend more transistors and watts on intelligence

Do you want more tokens or smarter tokens?

totaa

18 hours ago

[-]

more can lead to smarter in the short term.

brute force

sshine

1 day ago

[-]

> nobody opens claude and thinks, "you know what? let me use the shitty version"

Sure I do!

I will consistently pick the fastest and cheapest model that will do the job.

Sonnet > Opus when coding

Haiku > Sonnet when fusing kitchen recipes, or answering questions where search results deliver the bulk of the value, and the LLM part is really just for summarizing.

Havoc

1 day ago

[-]

The combination of "thinking models" plus the blind focus on incremental benchmarking gains was a mistake for practical use.

You definitely want that for some tasks, but for the majority of tasks there is a lot of space for cheap & cheerful (and non-thinking)

mensetmanusman

1 day ago

[-]

They will soon be subsidized by ads or people will run their own.

nojs

1 day ago

[-]

The article implies that nobody pays usage-based pricing, but aren’t all API customers (ie essentially all business users) doing this now?

codr7

1 day ago

[-]

This is such a nice setup!

They can deliver pretty much whatever they feel like. Who can tell a trash token from an hallucination? And tracking token usage is a pita.

Sum it up and it translates to: sell whatever you feel like at whatever price you feel like.

Nice!

_0ffh

1 day ago

[-]

Dunno, I'm happy to pay for API access based on token usage. I'd never so much as look at flat pricing, but maybe that's just me.

robertclaus

1 day ago

[-]

My team is debating this exact question for a new product we have in early access. Ultimately we realized the issue early on, so even our plans option would include at-cost usage limits.

arnon

21 hours ago

[-]

that's a good idea but the best idea is to decouple the product you're selling from the tokens it consumes.

you shouldn't be pricing compute directly (by charging for tokens yourself)

blotfaba

1 day ago

[-]

We're not going to be using tokens forever, and inevitably specialized hardware will solve this bottleneck. Underestimating how much proprietary advancements are loaded into Google TPUs is sort of like thinking the best we've got are Acura TSXs when somebody's driving around in a Ferrari.

abtinf

1 day ago

[-]

Lack of proper capitalization makes the text unreadable for me.

machomaster

1 day ago

[-]

You are not the only one. I really don't understand this trend of wanting to share opinions, but purposefully making it harder/impossible for other to read it. Might as well write with dark-grey font over black background, just to make readers struggle extra hard.

If you don't care to trivially make your text readable, then we for sure don't care to spend time to struggle through your text to see if there is any useful substance there.

https://convertcase.net/browser-extension/

blamestross

1 day ago

[-]

This extension might make the internet more accessible for you!

machomaster

1 day ago

[-]

If the writer is that lazy to press Shift and do it manually, then it is him who should have used autocapitalization software.

reilly3000

1 day ago

[-]

I’m really liking the experience of using Cline with OpenRouter. You get detailed token count and context size for every Task as it goes, along with the actual price in realtime. That combined with the ability to use virtually any model with a single key makes it easier to determine what work to give to what model, and if it’s worth it to you. Use the best models for the most important work, cheap/free ones for the rest, and do the rest yourself. That way you drive your model, not when the model decides enough is enough.

d4rkn0d3z

17 hours ago

[-]

Wow, comments read like a usage anxiety help forum.

jongjong

1 day ago

[-]

There are a few forces at play and there is a much bigger overarching problem.

Even before AI, it felt like the value of intelligence and knowledge had been dropping over time. This makes sense as the internet has democratized access to information and promoted intellectual self-improvement. The supply of intelligence increased dramatically but demand for it struggled to keep up (in spite of the tech boom). Now demand for intelligence has plateaued; This is one way to look at current tech layoffs.

It got to a point that intelligence is almost worthless now. 'Earning' money is mostly about social connections, not intelligence. So all these use cases which people are using AI for are pointless in terms of earning money in the current system. The current system rewards money-acquisition, it does not reward value-creation. You don't need intelligence to acquire money in this system; you need social connections. AI does not give you social connections; if anything, it takes away social connections. The people using AI to build themselves an amazing second internet will have nobody to share it with; no users, no investment.

The oversupply of intelligence means that it cannot find any serious avenues to earn a financial return, so instead it turns to political manipulations because system reform (or manipulation) is the shortest path to earning monetary returns... Though often this manipulation only further decouples value creation from money-acquisition.

flyinglizard

1 day ago

[-]

The truth is we're brute forcing some problems via tremendous amount of compute. Especially for apps that use AI backends (rather than chats where you interface with the LLM directly), there needs to be hybridization. I haven't used Claude Code myself but I did a screenshare session with someone who does and I think I saw it running old fashioned keyword search on the codebase. That's much more effective than just pushing more and more raw data into the chat context.

On one of the systems I'm developing I'm using LLMs to compile user intents to a DSL, without every looking at the real data to be examined. There are ways; increased context length is bad for speed, cost and scalability.

KennyBlanken

1 day ago

[-]

I see all these people in the comments talking about how they have no idea how much their services are going to cost and my only thought is "how is this legal? how has the US FTC not stepped in, or EU regulators?"

ninetyninenine

1 day ago

[-]

Solution? Move processing to the edge. I want a local model on my desktop. I want training to be some peer to peer open source processing network. We did it for SPA's time to do it for LLMs. I'd pay up to 3000 for a compute module that can do this stuff locally.

happytoexplain

1 day ago

[-]

While reading this, every time I started a paragraph and saw a lowercase, my brain and eyes were stalling or jumping up, to reflexively look for the text that got cut off. My brain has been trained for decades that, when reading full prose, a paragraph starting with lowercase means I'm starting in the middle of a sentence, and something happened in the layout or HTML to interrupt it.

And, I know this seems dramatic, but besides being cognitively distracting, it also makes me feel sad. Chatroom formatting in published writings is clearly a developing trend at this point, and I love my language so much. Not in a linguistic capacity - I'm not an English expert or anything, nor do I follow every rule - I mean in an emotional capacity.

I'm not trying to be condescending. This is a style choice, not "bad writing" in the typical sense. I realize there is often a lot of low-quality bitterness on both sides about this kind of thing.

Edit:

I also fear that this is exactly the kind of thing where any opinion in opposition to this style will feel like the kind of attack that makes a writer want to push back in a "oh yeah? fuck you" kind of way. I.e. even just my writing this opinion may give an author using the style in question the desire to "double down". Though this conundrum is appropriate (ironic?) - the intensely personal nature of language is part of why I love it.

https://news.ycombinator.com/newsguidelines.html

dang

1 day ago

[-]

"Please don't complain about tangential annoyances—e.g. article or website formats, name collisions, or back-button breakage. They're too common to be interesting."

happytoexplain

1 day ago

[-]

Yeah, sorry. That was probably my last comment on this trend, since I think I've said all I have to say. However, I do think "too common" implicitly narrows the definition of "tangential annoyances" - I believe this is a new phenomenon (though I understand the spirit of the rule is to not have comment threads about things other than the content of the submission).

scoofy

1 day ago

[-]

My degrees all were in philosophy, focused on philosophy of language.

Descriptive language is how language evolves, and the internet is the first real regional conflict area that Americans have really ever encountered without traveling.

History, you would have just been in your linguistic local, with your own rules, and differences could easily been attributed to outsiders being outsiders. The internet flattens physical distance.

Thus we have a real parallel to the different regions of Italy, where no one came understand each other, or at least the UK, where different cities have extreme pronunciation differences.

The same exists for written language, and it will continue to diverge culturally. The way I look at it is that language isn’t a thing, trapped in amber, but a river we are all wading through. Different people enter at different times, and we all subtly affect the flow.

I distinctly remember thinking “email” was the dumbest sounding word ever. Now I don’t even hear it.

It’s still fine to nitpick, we’re all battling in the descriptive war for correctness. My own personal hobbyhorse is how stupid American quotations syntax is, when learning at graduate school in the UK that you use single quotes and leave the punctuation outside of the quoted sections, which is entirely sensible!

benhurmarcel

1 day ago

[-]

Weirdly I’m not really bothered by the absence of capitals.

egypturnash

1 day ago

[-]

IT COULD BE WORSE, YOU COULD BE READING A LENGTHY ESSAY PRESENTED ENTIRELY IN ALL CAPS WITH MINIMAL PUNCTUATION TO BREAK IT UP

SEARCH FOR “FILM CRIT HULK” FOR SOME EXAMPLES

braebo

1 day ago

[-]

POTUS, is that you?

majewsky

1 day ago

[-]

HULK SMASH INFERENCE PRICES

simianwords

1 day ago

[-]

It’s to draw contrast against extremely polished and sterile looking slop content. Think of it like avoiding em dash but going a bit far.

rafram

1 day ago

[-]

> all content here is generated by ai

tanseydavid

1 day ago

[-]

It is lazy.

ath3nd

1 day ago

[-]

Mathematics are not relevant when we have hype and vibes. We can't have facts and projections and no path to profitability distract us from our final goal.

Which, of course, is to donate money to Sama so he can create AGI and be less lonely with his robotic girlfriend, I mean...change the world for the better somehow. /s

NitpickLawyer

1 day ago

[-]

I get your point but I think it's debatable. As long as the capabilities increase (and they have, IMO) cost isn't really relevant. If you can reasonably solve problems of a given difficulty (and we're starting to see that), then suddenly you can do stuff that you simply can't with humans. You can "hire" 100 agents / servers / API bundles, whatever and "solve" all tasks with difficulty x in your business. Then you cancel and your bottom is suddenly raised. You can't do that with humans. You can't suddenly hire 100 entry-level SWEs and fire them after 3 months.

Then you can think about automated labs. If things pan out, we can have the same thing in chemistry/bio/physics. Having automated labs definitely seems closer now than 2.5 years ago. Is cost relevant when you can have a lab test formulas 24/7/365? Is cost a blocker when you can have a cure to cancer_type_a? And then _b_c...etc?

Also, remember that costs go down within a few generations. There's no reason to think this will stop.

https://www.rand.org/news/press/2024/02/01/index1.html

ath3nd

1 day ago

[-]

> You can "hire" 100 agents / servers / API bundles, whatever and "solve" all tasks with difficulty x in your business.

In that bright AGI future, who does my business serve, like who actually are my actual paying clients? Like, the robots are farming, the robots are driving, the robots are "creating" and robots are "thinking", right? In that awesome future, what paid jobs do us humans have, so my clients can afford my amazing entrepreneurial business that I just bootstrapped with the help of 100s of agents? And how did I get the money to hire those 100s of agents in the first place?

> Is cost a blocker when you can have a cure to cancer_type_a? And then _b_c...etc?

Yes, it very much is. The fact that even known and long discovered solutions like insulin for diabetes management are being sold to people at 9x its actual price should speak to you volumes that while it's great to have cures for X, Y and Z, it's the control over the production and development of these cures that is equally, if not much more important for the cure to actually reach people. In this rosy world of yours, do you think Zuck will give you his LLAMAGI-generated cancer cure out of the goodness of his heart? We are talking about the same dude that helped a couple of genocides and added ads in Whatsapp to squeeze the last cent of the people who are trapped with an app that gets progressively worse and more invasive.

https://systemicjustice.org/article/facebook-and-genocide-ho...

> Also, remember that costs go down within a few generations. There's no reason to think this will stop.

The destruction of the natural world, the fires all around us, the rise of fascism and nationalism, the wars that are spawning all over the place and the fact that white and blue collar jobs are being automated out while soil erosion and PFAS make our land infertile point to a different future. But yeah, I am simply ecstatic at the possibility that the costs of generating a funny picture Ghibli style with a witty caption could go down by 10 to 30%.

ysofunny

1 day ago

[-]

and the AIs stupider!

I am seeing problems with formatting that seemed 'solved' already.

I mean, I have seen "the same" model get better and worse already.

clearly somebody is calibrating the stupidity level relative to energy cost and monetary gain

senko

1 day ago

[-]

Insisting on flaunting English spelling rules (by not starting a sentence with a capital letter) in a think piece is a dead giveaway that the author thinks too highly of themselves, and results in me automatically discounting whatever they're saying.

If I (and billions others) can be bothered to learn your damn language so we can all communicate, do us a service and actually use it properly, FFS.

4 hours ago

[-]

there's no official authority on what constitutes "proper english". it is a cultural convention, and conventions change over time.

berdario

1 day ago

[-]

> Insisting on flaunting English spelling rules...

To flaunt:

> display (something) ostentatiously, especially in order to provoke envy or admiration or to show defiance.

To flout:

> openly disregard (a rule, law, or convention).

(I'm also a non-native speaker)

feintruled

20 hours ago

[-]

I was wondering if it was a way to 'flaunt' his avoidance of LLMs. Could be in future genuine human conversation will be so rare that we will seize upon mistakes and typos like finding a shiny jewel in mud.

(But then I saw he used the formation - 'Honestly?' which made me think he WAS using LLMs!)

dankwizard

1 day ago

[-]

Way to flaunt that in his face!