I don't know about prospecting, but "answer support tickets accurately"? Seriously, this must be ironic, right?
Vertical integration.
Horizontal integration.
Cross- and/or mass-relationship integration.
Individual relationship investment/artifacts.
Reputation for reliability, stability, or any other desired dimension.
Constant visibility in the news (good, neutral, sometimes even bad!)
A consistent attractive story or narrative around the brand.
A consistent selective story or narrative around the brand. People prefer products designed for "them".
On the dark side: intimidation. Ruthless competition, acquisitions, lawsuits, reputation for dominance, famously deep pockets.
Keeping someone is easier. Tiny things hold onto people: an underlying model that delivers results with less irritation and fewer glitches and hoops. Low- to no-configuration installs and operation. Windows that open, and other actions that happen, instantly. Simple attention to good design can create fierce loyalty in those for whom design downgrades or added friction feel like torture.
Obviously, many more moats in the physical world.
Or maybe the honest-to-God non-dull tool that has nothing to do with AI. Like a Photoshop clone that does everything in linear light, makes gorgeous images, and doesn't crash when you open the font chooser.
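For anyone who hasn't run into the linear-light point: sRGB values are gamma-encoded, so averaging them directly gives a mix that's visibly too dark. A minimal sketch of the correct round trip, using the standard sRGB transfer functions in plain Python (no imaging library assumed):

```python
# Averaging gamma-encoded sRGB values gives the wrong (too dark) result;
# convert to linear light, blend there, then convert back.

def srgb_to_linear(c: float) -> float:  # c in [0, 1]
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def linear_to_srgb(c: float) -> float:
    return c * 12.92 if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055

black, white = 0.0, 1.0
naive = (black + white) / 2  # 0.5: the muddy result of blending sRGB directly
correct = linear_to_srgb((srgb_to_linear(black) + srgb_to_linear(white)) / 2)
print(round(naive, 3), round(correct, 3))  # 0.5 vs ~0.735
```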
Also, as foundation models improve, today's "hard to solve" problems become tomorrow's "easy to solve" problems.
And guess what: all those mashup companies didn't last even a couple of years, because they didn't have direct access to the data.
- Which brands do people trust?
- Which people do people of power trust?
You can have all the information in the world but if no one listens to you then it’s worthless.
These are often at odds with each other. Many times, engineers (people) prefer the tool that actually does the job, while the PMs (people of power) prefer shiny tools that are considered "best practice" in the industry.
Example: Claude Code is great and I use it with Codex models, but people of power would rather use "Codex with ChatGPT Pro subscription" or "CC with Claude subscription" because those are what their colleagues have chosen.
The biggest data hoarders now compress their data into oracles whose job is to say whatever to whoever - leaking an ever-improving approximation of the data back out.
DeepSeek was a big early example of adversarial distillation, but it seems inevitable to me that frontier models can and will always be siphoned off to produce reasonably strong, fast-follow grey-market competition.
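For concreteness, here's a minimal sketch of what that siphoning looks like: black-box distillation, where you only ever see the teacher's outputs. The teacher endpoint and seed prompts below are placeholders, not any vendor's actual API:

```python
# Black-box distillation sketch: harvest a frontier model's answers and
# turn them into ordinary supervised fine-tuning data for a student model.
# query_teacher is a hypothetical stand-in for a hosted model API.
import json

def query_teacher(prompt: str) -> str:
    raise NotImplementedError("wire this to whichever frontier model you're copying")

def harvest(prompts: list[str], out_path: str) -> None:
    # Each (prompt, teacher answer) pair becomes one training example.
    with open(out_path, "w") as f:
        for p in prompts:
            pair = {"prompt": p, "completion": query_teacher(p)}
            f.write(json.dumps(pair) + "\n")

# harvest(seed_prompts, "distill.jsonl")
# Fine-tune any open-weights student on the JSONL and you get a lossy,
# ever-improving approximation of the teacher, with no access to its data.
```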
With code generation, you don't see what's wrong right away; it's only later in the project lifecycle that you pay for it. Writing looks good on a skim, but is embarrassingly bad once you start actually reading it.
With some things (slides, apparently), you notice right away how crappy they are.
I don't think it's just better training data; I think LLMs apply largely the same kind of zeal to different tasks. The difference is in which places coherent nonsense ends up being acceptable.
I'm actually a big LLM proponent and see a bright future, but I believe a critical assessment of how they work and what they do is important.
This feels like telling a story after the fact to make it fit.
inb4 "then why do Meta's models still suck?"
And all the new research around self-learning architectures has nothing to do with the datasets.
Companies always try to make it seem like data is valuable. Attention is valuable. With attention, you get the data for free. What they monetize is attention. Data plays a small part in optimizing the sale of ads, but attention is the important commodity.
Why else are celebrities so well paid?
I feel like the data to drive the really interesting capabilities (biological, chemical, material, etc, etc, etc) is not, in large part, going to come from end users.
Just commit fraud repeatedly while owning the people who run DoJ, easy peasy, no amount of attention or cash flow can displace that.