> Why this is happening. Two forces are slowing agentic commerce, according to Leigh McKenzie, director of online visibility at Semrush: infrastructure and trust. Real-time catalog normalization across tens of millions of SKUs is a decade-scale problem Google already solved with Merchant Center, and consumers still default to checkout flows they trust — Apple Pay, Google Wallet, and Amazon one-click.
It turns out when you step outside of “hard tech” problems like building GPT6 there are all of these details others have solved already. E-commerce has been optimized to the last decimal point for the last 30 years.
OpenAI is new to it, and if I had to guess, not that interested in getting good at it.
I think they're interested in getting good at it. They just don't want to put in the human time and effort to do so. They expect their many failures and shortcomings to be shored up by continuous model training.
But that, of course, means that in the meantime it will suck and nobody will use it.
In most circles, that is "not that interested in getting good at it".
Someone can want a thing, even very badly, without wanting to put in the work for it.
Conversely, someone can work very hard for something they do not want.
The linkage between wanting a thing and wanting to do the work to get it is not absolute, or necessarily there at all.
Pretty much the impetus behind a lot of theft. Sure, there's thieving because people can't afford food, but that's not all theft. There's theft because they are addicts and don't want to sober up long enough to earn money, so they steal things. There are others who can't afford something, so rather than saving for it, they just take it.
But a dreamer in me entertains another idea: perhaps they're just holding back, because they realize that actually succeeding at this will instantly kill (or at least mortally wound) e-commerce as we know it.
(This is a more narrow version of my belief that general AI tools like LLMs fundamentally don't fit as additions to products, but rather subsume products, and this makes them an existential threat to the software industry. Not to software or computing, just to all the software vendors, whose job is to slice off pieces of computational universe, put them in boxes to prevent interoperability, and give each a name so it's a "product" that can be sold or rented).
Sam Altman doesn’t give a shit about anyone but himself and has time and again shown he has no restraint for trampling over others to further his own goals. Why would e-commerce be where he draws the line?
Whether or not they want, or will want, to do it at some point, is unknown; the reasons to not do it now are obvious:
1) it's more profitable to keep renting intelligence per token to everyone, preserving the status quo and milking it indefinitely (i.e. while the models aren't yet good enough to reliably single-shot complex software products from half-baked prompts, because once they get there, disruption will happen organically)
2) trying to compete with ~every other software product today is not likely to succeed in the end; a serious attempt would still burn down the software industry, but the major players don't have the capacity to handle it all at once, and doing it gradually will give enough time for regulatory agencies to try and stop it; either way, no one wins
I find their software to be of subpar quality and resilience anyways.
There's lots of easy but drudge work to enable this that needs to be done at the fringes. For example, LLMs today could easily replace most people's smartphone homescreen experience, or travel/commute experience, as the data is there and LLMs have the capability, even prices are acceptable - what's missing is explicit first-party support to wire it up, keep it wired up.
One step up, what's missing is accepting this explicitly as a goal: to replace software, to make existing products (whether whole or in pieces) the tools AI uses to do work for you. All the vendors seem to carefully walk around the idea, but avoid engaging with it directly, because once they do, they'll be competing with everyone instead of milking them.
These are also the same companies allowing their AI to make decisions in war, have no qualms about the mental issues they’re causing in people, and have abused workers in 3rd world countries for years.
But you think they’re holding out on “destroying the software industry” out of the goodness of their hearts? Come on
I would add there are more reasons why this wouldn't work: costs due to orders of magnitude more usage, adoption/AI backlash, adversarial environment, players with big head starts (Google).
You don't need to personally win in order to mortally wound someone. It can be informative to speculate about whether or not something is possible regardless of it being strategically advisable in the current context.
They definitely would if they could. They desperately need money. They already told the whole world they want to replace them, they just can’t.
That seems reasonable, it's just yet to be seen if LLMs are a form of artificial intelligence in any meaningful sense of the word.
They're impressive ML for sure, but that is in fact different from AI despite how companies building them have tried to merge the terms together.
A software product (whether bought or rented as a service) is defined by its boundaries - there's a narrow set of specific problems, and specific ways it can be used to solve those problems, and beyond those, it's not capable (or not allowed) to be used for anything else. The specific choices of what, how, and on what terms, are what companies stick a name on to create a "software product", and those same choices also determine how (and how much) money it will make for them.
Those boundaries are what LLMs, as general-purpose problem solvers, break naturally, and trying to force-fit them within those limits means removing most of the value they offer.
Consider a word processor (like MS Word). It's solving the problem of creating richly-formatted, nice-looking documents. By default it's not going to pick the formatting for you, nor is it going to write your text for you. Now, consider two scenarios of adding LLMs to it:
- On the inside: the LLM will be able to write you a poem or rewrite a piece of document. It could be made to also edit formatting, chat with you about the contents, etc.
- From the outside: all the above, but also the LLM will be able to write you an itinerary based on information collected from maps/planning tool, airline site, hotel site, a list of personal preferences of your partner, etc. It will be able to edit formatting to match your website and presentation made in the competitor's office tools and projected weather for tomorrow.
Most importantly, it will be able to do both of those automatically, just because you set up a recurring daily task of "hey, look at my next week's worth of calendar events and figure out which ones you can do some useful pre-work for me, and then do that".
That's the distinction I'm talking about, that's the threat to the software industry, and it doesn't take "true AI" - the LLMs as we have today are enough already. It's about generality that allows them to erase the boundaries that define what products are - which (this is the "mortal wound to software industry" part) devalues software products themselves, reducing them to mere tool calls for "software agents", and destroying all the main ways software companies make money today - i.e. setting up and exploiting tactics like captive audience, taking data hostage, bundled offers, UI as the best marketing/upsell platform, etc.
(To be clear - personally, I'm in favor of this happening, though I worry about consequences of it happening all at once.)
The US stock market has priced this in already. Many software-only companies are perceived to be under threat by AI. It represents a wonderful arbitrage opportunity for AI skeptics, in fact.
Considering the money they need, they overpromise and underdeliver.
I am behind schedule on developing a "summer phase" [1] for my foxographer costume and was chatting with Gemini about a crash priority "spring phase" [2] and asked it for suggestions, and it gave me a 10-pack of results that had one good thing in it at rank #8; a similar query run against a normal search engine actually got something better at #1. Now, sure, I am talking w/ Gemini with big words like "supergraphic" whereas a normal search would be heavy on 3-letter and 5-letter words used in the product descriptions.
It makes me think, though, of expert-system-based product configurators back in the 1980s
https://en.wikipedia.org/wiki/Xcon
Thing is, that kind of product configurator is based on an ontology, constraints and rules, as opposed to embeddings, which might capture the "feel" of things like clothing.
[1] Busytown meets Arknights
[2] supergraphic shirt + camera gets resonance with my promotional system and people keep approaching me (e.g. laugh but every KPI in the system has an extra zero on the left)
I believe that it can still work, and I won't claim to be unsurprised by this failure. But this is a great opportunity to execute on this problem really well if OpenAI and others are not interested in getting good at it.
Perplexity also attempted this, got sued by Amazon and it appears semi-abandoned.
The only problem is that it must be quicker or just as quick as a Google search, and also compatible with the existing checkout flows.
Any details on that? I feel the answer is more likely there than in "friction".
Hardly any purchase of consequence is so sensitive to friction that the difference between Google Search and an LLM response matters (especially since in reality, we're talking 20+ manual searches per one LLM response). I.e. I'm not going to use LLMs to advise on some random $0-100 purchase anyway, and losing $# on a $## purchase due to a suboptimal choice is not that big of a deal - but I absolutely am going to consult it (and have it compile tables and verify sources) on a $500+ purchase, and for those I can afford spending a few more minutes on research (or rather a few hours less, compared to doing it the usual way).
Already your favourite e-commerce site has all your data. You can switch on the "buy this automatically" feature.
Today, ads are based on user information you can reasonably collect from the user's historical actions on your website, and then whatever search term they enter.
But soon, ads can be based on your current chat context + (derived interests of yours from your entire chat history across all chats. Shhhh.) passed in full to the e-commerce website that will use it to choose ads, generate creatives on the fly, all that crap, hyper-specific to you.
I'm so excited. Aren't you?
Now, as a side effect, searching through these can become better experience wise as well. They can use all that context and genuinely surface fewer, better results. But that's not the motivation of the e-commerce player anyways. If the ads work they'll be happy.
It certainly hasn't been optimized to anything in 1996. In 1996 it was people clumsily scanning print catalogs, spending 5 hours to upload 10 images on dialup and making a simple HTML page (no DB or any kind of backend) and putting their landline phone on it with a message to "call to checkout"
I know you were exaggerating for effect, but E-commerce and catalog normalization are definitely not "solved" everywhere.
McMaster-Carr is a good example of a company that has 90%+ of their stuff ironed out, but most websites, and especially small e-commerce sites, aren't like that.
Right now, by comparison, it sounds like AI based shopping is still in the very early stages. Maybe further along than the early e-commerce, but still with a long way to go in its evolution. That'll probably happen quicker than with e-commerce, because a lot of the knowledge about what does or doesn't work has already been learned, but it sounds like it's still a long way behind. Caveat - I've never used it myself, so I don't know how far it is along that path, I'm just basing that on the article.
FWIW OpenAI is desperately trying to monetize and they think e-commerce is a "simple" problem to solve. I mean they do need to convert their funnel without alienating their users. I assume they are going to have some big payouts for agentic purchases gone awry or leave merchants on the hook.
As an aside, fall of '96 is when I started college. There was an elementary school on my drive to class where I would routinely get caught in drop-off traffic. All those kids I remember crossing the street are at least in their mid 30s now. ...I think I need to lie down and it's not even 9AM my local time.
Also remember a teacher telling us that story about a company figuring out a woman was pregnant from her shopping behavior and pushing relevant recommendations, which led people around her, like her dad or something, to find out she was pregnant.
Why would anyone want an extra layer of friction where things could go wrong, handing over payment details to yet another link in the chain?
Just let me buy my stuff in peace. Shopping is not the 'killer app' for GenAI.
Personally I am an inductivist, I imagine you may be too.
Think of it this way: top-down decision-making is deduction, bottom-up is induction.
You might think induction is amazing, but suppose you ask yourself "Are there any black swans?" and your answer is "No, I've never seen any, so there can't be any black swans." The issue is you've never actually seen every swan, and there are in fact black swans in Australia.
Point being, we don't know if this is a good thing until it's tested.
This is an age-old problem.
There are people today learning to use an LLM instead of an actual search engine. For these types of people, whatever happens outside of the LLM app is invisible to them. The social media apps did something similar when they started letting people purchase directly within the app. People started looking at these shops rather than searching elsewhere.
Buying things on a social media app is crazy to me, but I don't use the social media apps. Buying things from an LLM app seems crazy to you (because it's new and it's borked, which is fixable), but to people who first turn to their LLM app of choice, that decision isn't so crazy.
"This isn't what I wanted to buy, I said my feet are a size 10 and these aren't even shoes!"
"You're absolutely right! Shoes go on feet, and each of your 10 feet could wear shoes. Would you like me to research shoes and purchase them for you?"
A chat interface is just fundamentally incompatible with this. The agent makes it too easy to ask questions and comparison shop.
It’s like corporations are angry that they need to go through us to get our money.
This is why I think the "you're the product" saying is wrong. You're just some annoyance to managers (whether they're trying to use you just for user numbers and ad views or they're trying to get your money), whose product is the company (shares or just outright selling the company).
The same way I think shopping at Amazon is better than a place like Nike due to objectivity and comparison, I think a chat interface has the potential to take this to another level since places like Amazon have degraded considerably in terms of things like fake third party products and fake reviews.
Retailers do not want you to make better choices. They want you to buy the widget.
A lot of evidence suggests that shoppers aren't that interested in making the best choice either. They want to make a tolerable choice with as little effort as possible. There is basically no consumer market for "power shopping" outside of weird niches like pcpartpicker.com etc.
All you say is true for an aggregator like Amazon. But Amazon is better than Nike.com because as an aggregator they go from 1 to many retailers. LLMs will go from 1 aggregator (Amazon) to many so it will be better. And they don't have to invest a lot in UI/UX as chat is the interface.
Shoppers do not want to pay to shop. Retailers pay thousands to encourage you to shop with them. They are the economic buyers of this feature.
> for a lower price
Catalog is impartial, chatbot is ads pretending to be advice.
Am I the only one that thinks Amazon has gotten pretty awful in the last 5 years?
Not that a chat interface would be an improvement.
The only e-commerce site that fits this standard is that old one for buying (IIRC) nuts and bolts or such, that pops up on HN every other year, and whose name sadly escapes me now. Everyone else is ruthlessly optimizing their experience to fuck shoppers over and get them to products the vendor wants them to buy, not the products the shoppers actually want (or need).
> A chat interface is just fundamentally incompatible with this. The agent makes it too easy to ask questions and comparison shop.
That is precisely the point.
Chats may suck as an interface, but the majority of the value and promise of end-user automation (and more than half the point of the term "User Agent" (as in, e.g., a web browser)) is in enabling comparison shopping in spite of the merchants, and more generally, helping people reduce the information asymmetry that's intertwined with wealth and power asymmetry.
But it's not something you can generally sell to the vendors, who benefit from that asymmetry relative to their clients (in fact, I was dumbfounded to see so much interest on the sales/vendor side for such ideas, but I blame it on general AI hype).
Adversarial interoperability is the name of the game.
Sadly Sigma-Aldrich, the hyphenated retailer for chemistry, appears to have been covered in JavaScript sludge.
You realize what shoppers and vendors each consider to be "good" e-commerce sites are fundamentally opposed concepts?
I don't think so. I know for a fact that search terms are a minefield of gotchas and hacks caused by product decisions that reflect ad-hoc negotiations with partners and sellers. It's an unstable equilibrium of partners trying to shift attention to their products in a certain way. I think that calling this fragile equilibrium optimized has no bearing on reality.
You think a crude, unoptimised "minefield" is the route that leads to something as delicate as a "fragile equilibrium?" I don't see something as carefully balanced as your unstable equilibrium even being something that could exist without the processes involved having been refined down to a science. The only real alternative that meets your narrative would be that this is an industry that runs entirely on hope and luck (and enough human sacrifices to keep ample supplies of both on hand).
If we are talking custom products or complex appliances that need a lot of guidance, then maybe a chat interface is appropriate.
I dread the day when ads inevitably make their way into the main AI models. One of the things it's currently good at will be destroyed.
Grandma wants to buy a good bike, but doesn't know about types of wheels or how many gears they need, or what type of frame is appropriate for their body type.
LLMs are already very good for shopping, but only as long as they sit on the outside.
I found the Jonsbo D41 without the help of an LLM, despite trying one. (There might be a few smaller ones, but they are 3x the price.)
LLMs don’t weigh and survey the options well. They find some text, like from Reddit in this case, that mentions a subset of cases, and that text will heavily shape the answer. Which is not what you want a commerce agent to do; you don’t want text prediction. I doubt that gives the obscure but optimal option in most cases.
That doesn't follow. In fact, having this capacity and information creates a moral dilemma, as giving customers objectively correct advice is, especially in highly competitive markets, bad for business. Ignorance is bliss for businesses, because this lets them bullshit people through marketing with less guilt, and if there's one thing any business knows, it's that marketing has better ROI than product/service quality anyway.
https://www.bbc.co.uk/travel/article/20240222-air-canada-cha...
"I need mayo, ketchup, mustard and ground beef"
"Here is a list of products with prices ... proceed to pay $25 (yes/no)" Yes
"Your card has been charged. Delivery will knock on your door in 7 minutes"
I'll code that app in one month, what's there to lose?
Also, it would rather be in the faceless AI shop's interest to arbitrage orders: always show the "middle" price but use the cheapest one for orders.
The food delivery apps (Zomato, Swiggy) support "agentic shopping" through an MCP server.
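For what it's worth, that arbitrage is only a few lines. A rough sketch, assuming a made-up list of merchant offers for the same basket (the `Offer` type, merchant names and prices are all hypothetical, not any real shop's API): show the customer the median price, place the order with the cheapest merchant, and keep the spread.

```python
from dataclasses import dataclass
from statistics import median_low


@dataclass(frozen=True)
class Offer:
    merchant: str      # hypothetical merchant name
    price_cents: int   # price of the whole basket at that merchant


def quote_and_route(offers: list[Offer]) -> tuple[int, Offer, int]:
    """Return (price shown to the customer, offer actually used, margin kept)."""
    if not offers:
        raise ValueError("need at least one offer")
    shown = median_low(o.price_cents for o in offers)    # the "middle" price
    cheapest = min(offers, key=lambda o: o.price_cents)  # where the order really goes
    return shown, cheapest, shown - cheapest.price_cents


# Made-up offers for the same basket of groceries.
offers = [
    Offer("shop_a", 2500),
    Offer("shop_b", 2350),
    Offer("shop_c", 2700),
]
shown, used, margin = quote_and_route(offers)
print(f"show ${shown / 100:.2f}, order from {used.merchant}, keep ${margin / 100:.2f}")
```

`median_low` is used so the shown price is always one of the real offers rather than an average of two.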
You can reserve a small amount of money in UPI apps that doesn't need any approval to be debited (only by a specific merchant).
Razorpay, which is a payment gateway, has an MCP server.
So you can add Razorpay in your app, and Claude can pay.
Claude can access the Swiggy MCP server and search using natural language there.
The incentive for the food delivery app to participate in this is better targeted ads.
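The reserved-amount idea is basically a capped, merchant-locked allowance. A toy model of it follows; this is not UPI's or Razorpay's actual API, and every name in it is made up for illustration:

```python
from dataclasses import dataclass


@dataclass
class Reserve:
    """Toy model of a pre-approved, merchant-locked spending reserve."""
    merchant: str        # the only merchant allowed to debit (hypothetical id)
    balance_paise: int   # amount the user set aside up front

    def debit(self, merchant: str, amount_paise: int) -> bool:
        """Debit without asking the user, but only for the locked merchant and within the reserve."""
        if merchant != self.merchant:
            return False   # any other merchant would need normal approval
        if amount_paise <= 0 or amount_paise > self.balance_paise:
            return False   # invalid or over the reserve: fall back to normal approval
        self.balance_paise -= amount_paise
        return True


reserve = Reserve(merchant="food_app", balance_paise=50_000)
print(reserve.debit("food_app", 32_000))    # True: within the reserve
print(reserve.debit("other_shop", 1_000))   # False: wrong merchant
print(reserve.debit("food_app", 25_000))    # False: only 18,000 paise left
```

The point of the cap is that an agent making a bad order can only burn what was set aside, and only at that one merchant.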
You won't get it to push your products when users ask what's the best XYZ - either because it'll be too honest to lie or because it'll be too expensive for you.
I’m currently using Gemini to research components for a remote controlled plane. I have the frame of the plane and now need to buy correctly specced servo motors, an engine, battery, etc etc. It has saved me so much time and educated me tremendously on how the different components interact and the options available.
If I could just press “buy” from within Gemini and pay via Google Pay (or better still, Apple Pay) I’d do it in a heartbeat.
If ChatGPT can do this today, I need to try it.
> If I could just press “buy” from within Gemini and pay via Google Pay (or better still, Apple Pay) I’d do it in a heartbeat
Yeah, until that becomes enshittified and you don’t notice because you no longer do research on components.
Insert that "always has been" meme.
Microsoft was brought to court in the 90's for shoehorning in Internet Explorer.
Regular consumer products? Good fucking luck. Anywhere an LLM pulls from is probably going to be mostly SEO'd listicles.
There’s a lot of this going on in AI at the moment. New folks come in thinking they have a magic solution and then produce a total train wreck as it turns out domain expertise is still a thing.
Was anyone suggesting AI would help with it? It seems from the article that Walmart (presumably experts in e-commerce) themselves willingly collaborated with OpenAI. Especially at Walmart's level, what even was the theory?
In any case, it seems that despite this poor result, Walmart decided to essentially go ahead anyway and partner with OpenAI to put their "own chatbot" inside the OpenAI app?
The better comparison might be conversion rate for those who searched on Walmart.com vs those who searched within ChatGPT. Or maybe that is what they're comparing and I misunderstood?
And given the past few decades there is no reason to not try to do that.
This probably means that OpenAI et al are fine-tuning salesman-like LLMs to "fix" that problem.
Can't wait for the future.
> AI accounts for 90% of accidents while only accounting for 1% of traffic
https://abit.ee/en/cars/waymo-robotaxi-autonomous-driving-sa...
The next generation will shop in a different way, if it's better, and the change will be gradual as well.
Adoption takes time.
The only thing holding back "Agentic Purchasing" is convenience. It's much easier to do a conventional search and click "Buy" than it is to have a conversation with some chatbot. If I walk into a store, most of the time I don't want to talk to a salesperson, I just want to grab the thing I came for and leave; this is also true for online shopping. The chatbot is another barrier to the purchase.
I sort of trust them to make product recommendations, but at best I will only open a link they suggest and buy the product there.
Does it actually need direct access to your wallet? I haven't tried it yet, but assumed it would work with a separate wallet, fed through by top-ups.
Never, ever.
Amazon reviews are paid influence. Reddit posts are paid influence. Everything everywhere you read online is paid influence. I'd rank LLMs between "people I personally trust" and "random people online."
Operative word is "for now" - LLMs caught entrepreneurs unprepared, but they'll catch up and poison this too, same thing that happened with search giving rise to SEO.
Do you have a better alternative?
I suppose that's a long-winded way of saying that nearly every category of item requires its own strategy. For a brief period consumers were winning the information war and you could just go to Amazon, read the reviews, and get a superior product for cheap. We're now in a modern-but-old-fashioned situation. It's quite difficult to know if you're going to get ripped off, and you're forced to rely on blunter heuristics (e.g. trust specific brands, buy things in person, etc.). None of these are perfect, but they are quickly becoming the best of some bad options.
Most people using AI chat are exploring ideas and solutions. They’re doodling, not shopping. Or in old timey parlance, they’re looky-loos or tire kickers at best.
Anyone who’s had to justify ad spend in e-commerce can tell you that some sources produce huge traffic with absolutely terrible conversion. Reddit and Pinterest pretty much blow for this reason, with limited exceptions. It’s also why TikTok and other influencer platforms really work.
Conversion requires a mental shift from discovery to demand.
Also, really hate summaries like this without the actual source so here are the main points from the actual source (WIRED https://archive.is/7DuEV):
1. Instant Checkout inside ChatGPT performed poorly, with conversion about one-third of Walmart’s normal site.
2. The experience failed largely because it forced single-item purchases instead of letting users build a cart.
3. Walmart is shifting to embedding its own assistant, Sparky, inside ChatGPT and keeping checkout on its own system.
4. ChatGPT is still valuable because it’s driving significantly more new customer traffic than search.
5. Purchases that did work were mostly practical, problem-solving items like supplements and tools.
6. Fully automated “agentic shopping” is still unlikely in the near term because people want control over purchases.
7. OpenAI is moving away from in-chat checkout and focusing on helping users research while merchants handle transactions.
In short, AI is useful for discovery, but traditional e-commerce flows still outperform it at closing sales.
I'm confused by the comment that it failed because it forced single item purchases. Most of my "ecommerce" use is researching and buying one item at a time.
Your product has to be a 10x improvement over the incumbent to be competitive.
In AI speak it would be the “extra-bitter” lesson I guess?
You need to add 10x resources to beat a product that’s already solved with mature tech.
If you want to buy a Walmart product, the easiest way is to go to Walmart. Why add an imprecise middle man in between?
The latest AI is trained on the average citizen's social media output. IQ 90.
That’s why AI seemed smart. The bar will not be raised again. We’re cooked.
> You ask ChatGPT to setup a website and it instantly purchases web hosting and sets up the website
Multiple comments deflecting from the original shopping conversion failure to recommend ... building a whole new website (with hosting for some reason?). W/o bothering to look through commenter history, one has to assume there are a lot of chatbots on this site or else the people using this stuff have been lobotomized.
I'm sure it'll start happening too, and when that fails, the bots will, I don't know, invent a new macarena. We are definitely headed for an irredeemably stupid future.
Perhaps clickthrough is worse because there are fewer dark patterns involved and people are mostly just browsing and occasionally buying only what they need.
They didn't really seem to specify the "why" of it with any research. And weird that OAI wasn't supporting them to see what the issue was.
The enshittification is upon us.