But we don’t go to baseball games, spelling bees, and
Taylor Swift concerts for the speed of the balls, the
accuracy of the spelling, or the pureness of the
pitch. We go because we care about humans doing those
things. It wouldn’t be interesting to watch a bag of
words do them—unless we mistakenly start treating
that bag like it’s a person.
That seems to be the marketing strategy of some very big, now AI-dependent companies: Sam Altman and others exaggerating and distorting the capabilities and future of AI. The biggest issue with AI is still the same as with any other technology: it matters who controls it. Attributing agency and personality to AI is a dangerous red flag.
What does it mean to say that we humans act with intent? It means that we have some expectation or prediction about how our actions will affect the next thing, and we choose our actions based on how much we like that effect. The ability to predict is fundamental to our ability to act intentionally.
So in my mind: even if you grant all the AI-naysayers' complaints about how LLMs aren't "actually" thinking, you can still believe that they will end up being a component in a system which actually "does" think.
My personal assessment is that LLMs can do neither.
An LLM has: words in its input plane, words in its output plane, and A LOT of cross-linked internals between the two.
Those internals aren't "words" at all - and it's where most of the "action" happens. It's how LLMs can do things like translate from language to language, or recall knowledge they only encountered in English in the training data while speaking German.
Though I do think that in human brains it's also an interplay, where what we write/say loops back into the thinking as well. Which is something LLMs can do efficiently too.
My "abstract thoughts" are a stream of words too, they just don't get sounded out.
Tbf I'd rather they weren't there in the first place.
But bodies which refuse to harbor an "interiority" are fast-tracked to destruction because they can't suf^W^W^W be productive.
Funny movie scene from somewhere. The sergeant is drilling the troops: "You, private! What do you live for!", and expects an answer along the lines of dying for one's nation or some shit. Instead, the soldier replies: "Well, to see what happens next!"
Hmm, seems unlikely. The "not sounded out" part is true, sure, but I question whether 'abstract thoughts' can be so easily dismissed as mere words.
If it turns out that LLMs don't model human brains well enough to qualify as "learning abstract thought" the way humans do, some future technology will do so. Human brains aren't magic, special or different.
They’re certainly special, both within the individual and as a species on this planet. There are many brains similar to human brains, but none we know of with similar capabilities.
They’re also most obviously certainly different to LLMs both in how they work foundationally and in capability.
I definitely agree with the materialist view that we will ultimately be able to emulate the brain using computation but we’re nowhere near that yet nor should we undersell the complexity involved.
> Human brains aren't magic, special or different.
DNA inside neurons uses superconductive quantum computations [1].

[1] https://www.nature.com/articles/s41598-024-62539-5
As the result, all living cells with DNA emit coherent (as in lasers) light [2]. There is a theory that this light also facilitates intercellular communication.
[2] https://www.sciencealert.com/we-emit-a-visible-light-that-va...
Chemical structures in dendrites, not even whole neurons, are capable of computing XOR [3], which requires a multilevel artificial neural network with at least 9 parameters. Some neurons in the brain have hundreds of thousands of dendrites; we are now talking about millions of parameters in a single neuron's dendrites alone.
[3] https://www.science.org/doi/10.1126/science.aax6239
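For concreteness, the "at least 9 parameters" figure corresponds to the smallest standard 2-2-1 feedforward network (4 hidden weights + 2 hidden biases + 2 output weights + 1 output bias). A toy sketch with hand-picked weights and step activations, not taken from the cited paper:

```python
import numpy as np

def step(x):
    # Heaviside step activation
    return (x > 0).astype(float)

# Hand-picked weights for a minimal 2-2-1 XOR network:
# hidden unit 1 fires on OR, hidden unit 2 fires on AND,
# output fires on (OR and not AND), i.e. XOR. Nine parameters total.
W_hidden = np.array([[1.0, 1.0],    # weights into h1 (OR detector)
                     [1.0, 1.0]])   # weights into h2 (AND detector)
b_hidden = np.array([-0.5, -1.5])
w_out = np.array([1.0, -1.0])
b_out = -0.5

def xor_net(x):
    h = step(W_hidden @ x + b_hidden)
    return step(w_out @ h + b_out)

for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, int(xor_net(np.array(x, dtype=float))))
```

A single linear unit (3 parameters) provably cannot separate XOR, which is why the hidden layer, and hence the parameter count, is unavoidable.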
So, while human brains aren't magic, special or different, they are extremely complex.
Imagine building a computer out of 85 billion superconducting quantum computers, optically and electrically connected, each capable of performing the computations of a non-negligibly complex artificial neural network.
“Internal combustion engines and human brains are both just mechanisms. Why would one mechanism a priori be capable of "learning abstract thought", but no others?”
The question isn't about what a hypothetical mechanism can or can't do; it's about whether the concrete mechanism we built does or not. And this one doesn't.
I will absolutely say that all known ML methods are literally too stupid to live, in the sense that no living thing could get away with making so many mistakes before it has learned anything. But that's about the rate of change of performance with respect to examples, not about what the model knows by the time training is finished.
What is "abstract thought"? Is that even the same between any two humans who use that word to describe their own inner processes? Because "imagination"/"visualise" certainly isn't.
If you consider that LLMs have already "learned" more than any one human in this world is able to learn, and still make those mistakes, that suggests there may be something wrong with this approach...
It's not just that. The problem with “deep learning” is that we use the word “learning” for something that has no real similarity to actual learning: it's not just that it converges way too slowly, it's also that it merely minimizes the predicted loss over every sample during training, and that's not how humans learn. If you feed it enough flat-earther content, as well as physics books, an LLM will happily tell you that the earth is flat, and then explain to you, with lots of physics, why it cannot be flat. It simply learned both “facts” during training and spits them out during inference.
A human will learn one or the other first, and once the initial learning is made, they will disregard all evidence to the contrary, until maybe at some point they don't, and switch sides entirely.
LLMs don't have an inner representation of the world and as such they don't have an opinion about the world.
Humans can't see reality in itself, but they at least know it exists, and they are constantly struggling to understand it. The LLM, by nature, is indifferent to the world.
Nobody is. What people are doing is claiming that "predicting the next thing" does not define the entirety of human thinking, and something that is ONLY predicting the next thing is not, fundamentally, thinking.
It is not unreasonable to suspect differences between humans and LLMs are differences in degree, rather than category.
My claim is that the two concepts are indistinguishable, thus equivalent. The unfalsifiability is what makes it a natural equivalence, the same as in the other examples I gave.
Especially when modeling acting with intent. The ability to measure against past results and come up with new, innovative approaches seems like it may require a system that models first and only then uses LLM output; basically something that has a foundation of tools, rather than an LLM using MCP. Perhaps LLMs would be used to generate a response that humans like to read, but not to come up with the answer.
Either way, yes, it's possible for a thinking system to use LLMs (and humans potentially piece together sentences in a similar way), but it's also possible LLMs will be cast aside and a new approach will be used to create an AGI.
So for me: even if you are an AI-yeasayer, you can still believe that they won't be a component in an AGI.
The issue with AI and AI-naysayers is, by analogy, this: cars were built to drive from A to Z. People picked up tastes, and some started building really cool-looking cars; the same happened on the engineering side. Then the portfolio communists came with their fake capitalism, and now cars are built to drive over people, but that doesn't really work, because people, thankfully, are still overwhelmingly fighting to act on their own intents.
Language and society constrains the way we use words, but when you speak, are you "predicting"? Science allows human beings to predict various outcomes with varying degrees of success, but much of our experience of the world does not entail predicting things.
How confident are you that the abstractions "search" and "thinking" as applied to the neurological biological machine called the human brain, nervous system, and sensorium and the machine called an LLM are really equatable? On what do you base your confidence in their equivalence?
Does an equivalence of observable behavior imply an ontological equivalence? How does Heisenberg's famous principle complicate this when we consider the role observers play in founding their own observations? How much of your confidence is based on biased notions rather than direct evidence?
The critics are right to raise these arguments. Companies with a tremendous amount of power are claiming these tools do more than they are actually capable of and they actively mislead consumers in this manner.
Yes. This is the core claim of the Free Energy Principle[0], from the most-cited neuroscientist alive. Predictive processing isn't AI hype - it's the dominant theoretical framework in computational neuroscience for ~15 years now.
> much of our experience of the world does not entail predicting things
Introspection isn't evidence about computational architecture. You don't experience your V1 doing edge detection either.
> How confident are you that the abstractions "search" and "thinking"... are really equatable?
This isn't about confidence, it's about whether you're engaging with the actual literature. Active inference[1] argues cognition IS prediction and action in service of minimizing surprise. Disagree if you want, but you're disagreeing with Friston, not OpenAI marketing.
> How does Heisenberg's famous principle complicate this
It doesn't. Quantum uncertainty at subatomic scales has no demonstrated relevance to cognitive architecture. This is vibes.
> Companies... are claiming these tools do more than they are actually capable of
Possibly true! But "is cognition fundamentally predictive" is a question about brains, not LLMs. You've accidentally dismissed mainstream neuroscience while trying to critique AI hype.
[0] https://www.nature.com/articles/nrn2787
[1] https://mitpress.mit.edu/9780262045353/active-inference/
The thing you're doing here has a name: using "emergence" as a semantic stopsign. "The system is complex, therefore emergence, therefore we can't really say" feels like it's adding something, but try removing the word and see if the sentence loses information.
"Neurons are complex and might exhibit chaotic behavior" - okay, and? What next? That's the phenomenon to be explained, not an explanation.
This was articulated pretty well 18 years ago [0].
[0]: https://www.lesswrong.com/posts/8QzZKw9WHRxjR4948/the-futili...
To my understanding, bloaf's claim was only that the ability to predict seems a requirement of acting intentionally and thus that LLMs may "end up being a component in a system which actually does think" - not necessarily that all thought is prediction or that an LLM would be the entire system.
I'd personally go further and claim that correctly generating the next token is already a sufficiently general task to embed pretty much any intellectual capability. To complete `2360 + 8352 * 4 = ` for unseen problems is to be capable of arithmetic, for instance.
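A toy illustration of that claim (the `eval` call is just a stand-in for whatever internal process produces the digits): emitting the correct continuation of the prompt is inseparable from doing the arithmetic.

```python
# To emit the correct continuation of "2360 + 8352 * 4 = ",
# any predictor must effectively evaluate the expression,
# whatever its internal mechanism looks like.
expr = "2360 + 8352 * 4"
completion = str(eval(expr))  # stand-in for the model's computation
print(expr + " = " + completion)  # 2360 + 8352 * 4 = 35768
```

For unseen operand pairs there is no memorized string to retrieve, so "predict the next token" here collapses into "be capable of arithmetic".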
So notice that my original claim was "prediction is fundamental to our ability to act with intent" and now your demand is to prove that "prediction is fundamental to all mental activity."
That's a subtle but dishonest rhetorical shift to make me have to defend a much broader claim, which I have no desire to do.
> Language and society constrains the way we use words, but when you speak, are you "predicting"?
Yes, and necessarily so. One of the main objections that dualists use to argue that our mental processes must be immaterial is this [0]:
* If our mental processes are physical, then there cannot be an ultimate metaphysical truth-of-the-matter about the meaning of those processes.
* If there is no ultimate metaphysical truth-of-the-matter about what those processes mean, then everything they do and produce are similarly devoid of meaning.
* Asserting a non-dualist mind therefore implies your words are meaningless, a self-defeating assertion.
The simple answer to this dualist argument is precisely captured by this concept of prediction. There is no need to assert some kind of underlying magical meaning to be able to communicate. Instead, we need only say that in the relevant circumstances, our minds are capable of predicting what impact words will have on the receiver and choosing them accordingly. Since we humans don't have access to each other's minds, we must not learn these impacts from some kind of psychic mind-to-mind sense, but simply from observing the impacts of the words we choose on other parties; something that LLMs are currently (at least somewhat) capable of observing.
[0] https://www.newdualism.org/papers/E.Feser/Feser-acpq_2013.pd...
If you read the above link you will see that they spell out 3 problems with our understanding of thought:
Consciousness, intentionality, and rationality.
Of these, I believe prediction is only necessary for intentionality, but it does have some roles to play in consciousness and rationality.
The near-religious fervor which people insist that "its just prediction" makes me want to respond with some religious allusions of my own:
> Who is this that wrappeth up sentences in unskillful words? Gird up thy loins like a man: I will ask thee, and answer thou me. Where wast thou when I laid up the foundations of the earth? tell me if thou hast understanding. Who hath laid the measures thereof, if thou knowest? or who hath stretched the line upon it?
The point is that (as far as I know) we simply don't know the necessary or sufficient conditions for "thinking" in the first place, let alone "human thinking." Eventually we will most likely arrive at a scientific consensus, but as of right now we don't have the terms nailed down well enough to claim the kind of certainty I see from AI-detractors.
I’m downplaying because I have honestly been burned by these tools when I’ve put trust in their ability to understand anything, provide a novel suggestion, or even solve some basic bugs without causing other issues.
I use all of the things you talk about extremely frequently, and again, there is no “thinking” or consideration on display that suggests these things work like us. Why would we even be having this conversation if they did?
I've had that experience plenty of times with actual people... LLMs don't "think" like people do, that much is pretty obvious. But I'm not at all sure whether what they do can be called "thinking" or not.
The harms engendered by underestimating LLM capabilities are largely that people won't use the LLMs.
The harms engendered by overestimating their capabilities can be as severe as psychological delusion, of which we have an increasing number of cases.
Given we don't actually have a good definition of "thinking" what tack do you consider more responsible?
Speculative fiction about superintelligences aside, an obvious harm to underestimating the LLM's capabilities is that we could effectively be enslaving moral agents if we fail to correctly classify them as such.
Much worse, when insufficiently skeptical humans link the LLM to real-world decisions to make their own lives easier.
Consider the Brazil-movie-esque bureaucratic violence of someone using it to recommend fines or sentencing.
Do you have a proof for this?
Surely such a profound claim about human thought process must have a solid proof somewhere? Otherwise who's to say all of human thought process is not just a derivative of "predicting the next thing"?
What would change your mind? It's an exercise in feasibility.
For example, I don't believe in time travel. If someone made me time travel, and made it undeniable that I was transported back to 1508, then I would not be able to argue against it. In fact, no one in such position would.
What is that equivalent for your conviction? There must be something, otherwise, it's just an opinion that can't be changed.
You don't need to present some actual proof or something. Just lay out some ideas that demonstrate that you are being rational about this and not just sucking up to LLM marketing.
Predict the right words, predict the answer, predict when the ball bounces, etc. Then reversing predictions that we have learned. I.e. choosing the action with the highest prediction of the outcome we want. Whether that is one step, or a series of predicted best steps.
Also, people confuse different levels of algorithm.
There are at least 4 levels of algorithm:
• 1 - The architecture.
The input-output calculation for pre-trained models is very well understood. We put together a model consisting of matrix/tensor operations and a few other simple functions, and that is the model. Just a normal, if high-parameter, calculation.
• 2 - The training algorithm.
These are completely understood.
There are certainly lots of questions about what is most efficient, alternatives, etc. But training algorithms harnessing gradients and similar feedback are very clearly defined.
• 3 - The type of problem a model is trained on.
Many basic problem forms are well understood. For instance, for prediction we have an ordered series of information, with later information to be predicted from earlier information. It could simply be an input and response that is learned. Or a long series of information.
• 4 - The solution learned to solve (3) the outer problem, using (2) the training algorithm on (1) the model architecture.
People keep confusing (4) with (1), (2) or (3). But it is very different.
For starters, in the general case, and for almost any challenging problem, we never understand the learned solution. Someday that may become routine, but today we don't even know how to approach it for any significant problem.
Secondly, even with (1), (2), and (3) exactly the same, (4) will be wildly different depending on the data characterizing the specific problem to solve. For complex problems like language, layers upon layers of sub-solutions to sub-problems have to be found, and, since models are not infinite in size, ways to repurpose sub-solutions and weave them together to address all the ways different sub-problems do and don't share commonalities.
Yes, prediction is the outer form of their solution. But to do that they have to learn all the relationships in the data. And there is no limit to how complex relationships in data can be. So there is no limit on the depths or complexity of the solutions found by successfully trained models.
Any argument they don't reason, based on the fact that they are being trained to predict, confuses at least (3) and (4). That is a category error.
It is true, they reason a lot more like our "fast thinking", intuitive responses, than our careful deep and reflective reasoning. And they are missing important functions, like a sense of what they know or don't. They don't continuously learn while inferencing. Or experience meta-learning, where they improve on their own reasoning abilities with reflection, like we do. And notoriously, by design, they don't "see" the letters that spell words in any normal sense. They see tokens.
Those reasoning limitations can be irritating or humorous. Like when a model seems to clearly recognize a failure you point out, but then replicates the same error over and over. No ability to learn on the spot. But they do reason.
Today, despite many successful models, nobody understands how models are able to reason like they do. There is shallow analysis. The weights are there to experiment with. But nobody can walk away from the model and training process, and build a language model directly themselves. We have no idea how to independently replicate what they have learned, despite having their solution right in front of us. Other than going through the whole process of retraining another one.
The illusion wears off after about half an hour for even the most casual users. That's better than the old chatbots, but they're still chatbots.
Did anyone ever seriously buy the whole "it's thinking" BS when it was Markov chains? What makes you believe today's LLMs are meaningfully different?
Woah, that hit hard
Unfortunately, its corpus is bound to contain noise/nonsense that follows no formal reasoning system but contributes to the ill-advised idea that an AI should sound like a human to be considered intelligent. So it is not a bag of words but perhaps a bag of probabilities. This matters because the fundamental problem is that an LLM is not able, by design, to correctly model the most fundamental precept of human reason, namely the law of non-contradiction. An LLM must, I repeat must, assign nonvanishing probability to both sides of a contradiction, and what's worse, the winning side loses: since long chains of reasoning are modelled probabilistically, the longer the chain, the less likely an LLM is to follow it. Moreover, whenever there is actual debate on an issue, such that the corpus is ambiguous, the LLM necessarily becomes chaotic on that issue.
I literally just had an AI prove the foregoing with some rigor, and in the very next prompt I asked it to check my logical reasoning for consistency, and it claimed it was able to do so (->|<-).
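The chain-length decay described above can be sketched in a few lines. Under the simplifying assumption that each reasoning step independently survives sampling with probability p, the whole chain survives with probability p^n:

```python
# If each step of a reasoning chain is sampled correctly with
# probability p, the chance of completing the whole chain intact
# decays geometrically with chain length (assuming independent steps,
# which is a deliberate simplification).
def chain_probability(p_per_step: float, steps: int) -> float:
    return p_per_step ** steps

for steps in (1, 5, 10, 20):
    print(steps, round(chain_probability(0.95, steps), 3))
```

Even at 95% per-step reliability, a 20-step chain completes correctly only about a third of the time, which is the "longer the chain, the less likely to follow it" effect in miniature.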
That said, I think the author's use of "bag of words" here is a mistake. Not only does it have a real meaning in a similar area as LLMs, but I don't think the metaphor explains anything. Gen AI tricks laypeople into treating its token inferences as "thinking" because it is trained to replicate the semiotic appearance of doing so. A "bag of words" doesn't sufficiently explain this behavior.
Person-metaphor does nothing to explain its behavior, either.
"Bag of words" has a deep origin in English, the Anglo-Saxon kenning "word-hord", as when Beowulf addresses the Danish sea-scout (line 258)
"He unlocked his word-hoard and delivered this answer."
So, bag of words, word-treasury, was already a metaphor for what makes a person a clever speaker.
The contrapositive of "All LLMs are not thinking like humans" is "No humans are thinking like LLMs".
And I do not believe we actually understand human thinking well enough to make that assertion.
Indeed, it is my deep suspicion that we will eventually achieve AGI not by totally abandoning today's LLMs for some other paradigm, but rather embedding them in a loop with the right persistence mechanisms.
> Gen AI tricks laypeople into treating its token inferences as "thinking" because it is trained to replicate the semiotic appearance of doing so. A "bag of words" doesn't sufficiently explain this behavior.
Something about there being significant overlap between the smartest bears and the dumbest humans. Sorry you[0] were fooled by the magic bag.
[0] in the "not you, the layperson in question" sense
Whenever the comment section takes a long hit and goes "but what is thinking, really" I get slightly more cynical about it lol
By now, it's pretty clear that LLMs implement abstract thinking - as do humans.
They don't think exactly like humans do - but they sure copy a lot of human thinking, and end up closer to it than just about anything that's not a human.
I feel that's more a description of a search engine. Doesn't really give an intuition of why LLMs can do the things they do (beyond retrieval), or where/why they'll fail.
"Self-awareness" used in a purely mechanical sense here: having actionable information about itself and its own capabilities.
If you ask an old LLM whether it's able to count the Rs in "strawberry" successfully, it'll say "yes". And then you ask it to do so, and it'll say "2 Rs". It doesn't have the self-awareness to know the practical limits of its knowledge and capabilities. If it did, it would be able to work around the tokenizer and count the Rs successfully.
That's a major pattern in LLM behavior. They have a lot of capabilities and knowledge, but not nearly enough knowledge of how reliable those capabilities are, or meta-knowledge that tells them where the limits of their knowledge lie. So, unreliable reasoning, hallucinations and more.
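The strawberry example is easy to make concrete: at the character level the count is trivial, which is the whole point. A toy sketch (the `["str", "awberry"]` split is an illustrative assumption, not any real model's tokenization):

```python
# Character-level counting is trivial once you can see characters.
word = "strawberry"
print(word.count("r"))  # 3

# A tokenizer that hands the model chunks like ["str", "awberry"]
# never exposes the letters individually; the workaround the parent
# describes is to spell the word out one character at a time first,
# then count over the spelled-out form.
spelled = list(word)
print(sum(1 for ch in spelled if ch == "r"))  # 3
```

The failure isn't an inability to count; it's not knowing that counting requires first escaping the token representation.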
My second thought is that it's not the metaphor that is misleading. People have been told thousands of times that LLMs don't "think", don't "know", don't "feel", but are "just a very impressive autocomplete". If they still really want to completely ignore that, why would they suddenly change their mind with a new metaphor?
Humans are lazy. If it looks true enough and it cost less effort, humans will love it. "Are you sure the LLM did your job correctly?" is completely irrelevant: people couldn't care less if it's correct or not. As long as the employer believes that the employee is "doing their job", that's good enough. So the question is really: "do you think you'll get fired if you use this?". If the answer is "no, actually I may even look more productive to my employer", then why would people not use it?
Sure, this is not the same as being a human. Does that really mean, as the author seems to believe without argument, that humans need not be afraid that it will usurp their role? In how many contexts is the utility of having a human, if you squint, not just that a human has so far been the best way to "produce the right words in any given situation", that is, to use the meat-bag only in its capacity as a word-bag? In how many more contexts would a really good magic bag of words be better than a human, if it existed, even if the current human is used somewhat differently? The author seems to rest assured that a human (long-distance?) lover will not be replaced by a "bag of words"; why, especially once the bag of words is also duct-taped to a bag of pictures and a bag of sounds?
I can just imagine someone - a horse breeder, or an anthropomorphised horse - dismissing all concerns on the eve of the automotive revolution, talking about how marketers and gullible marks are prone to hippomorphising anything that looks like it can be ridden and some more, and sprinkling some anecdotes about kids riding broomsticks, legends of pegasi and patterns of stars in the sky being interpreted as horses since ancient times.
Neither of these is entirely true in all cases, but they could be expected to remain true in at least some (many) cases, and so the role for humans remains.
There's a quote I love but have misplaced, from the 19th century I think. "Our bodies are just contraptions for carrying our heads around." Or in this instance... bag of words transport system ;)
I mean I use AI tools to help achieve the goal but I don’t see any signs of the things I’m building and doing being unreliable.
Either way, in what way is this relevant? If the human's labor is not useful at any price point to any entity with money, food or housing, then they presumably will not get paid/given food/housing for it.
That said, I was struck by a recent interview with Anthropic’s Amanda Askell [2]. When she talks, she anthropomorphizes LLMs constantly. A few examples:
“I don't have all the answers of how should models feel about past model deprecation, about their own identity, but I do want to try and help models figure that out and then to at least know that we care about it and are thinking about it.”
“If you go into the depths of the model and you find some deep-seated insecurity, then that's really valuable.”
“... that could lead to models almost feeling afraid that they're gonna do the wrong thing or are very self-critical or feeling like humans are going to behave negatively towards them.”
[1] https://www.anthropic.com/research/team/interpretability
Their vivid descriptions of what the Emperor could be wearing don't make said emperor any less nakey.
Can you give some concrete examples? The link you provided is kind of opaque
>Amanda Askell [2]. When she talks, she anthropomorphizes LLMs constantly.
She is a philosopher by trade and she describes her job (model alignment) as literally to ensure models "have good character traits." I imagine that explains a lot
https://www.anthropic.com/news/golden-gate-claude
Excerpt: “We found that there’s a specific combination of neurons in Claude’s neural network that activates when it encounters a mention (or a picture) of this most famous San Francisco landmark.”
https://www.anthropic.com/research/tracing-thoughts-language...
Excerpt: “Recent research on smaller models has shown hints of shared grammatical mechanisms across languages. We investigate this by asking Claude for the ‘opposite of small’ across different languages, and find that the same core features for the concepts of smallness and oppositeness activate, and trigger a concept of largeness, which gets translated out into the language of the question.”
https://www.anthropic.com/research/introspection
Excerpt: “Our new research provides evidence for some degree of introspective awareness in our current Claude models, as well as a degree of control over their own internal states.”
My fridge happily reads inputs without consciousness, has goals and takes decisions without "thinking", and consistently takes action to achieve those goals. (And it's not even a smart fridge! It's the one with a copper coil or whatever.)
I guess the cybernetic language might be less triggering here (talking about systems and measurements and control), but it's basically the same underlying principle. One is just "human flavored" and is therefore more prone to invite unhelpful lines of thinking?
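The fridge analogy is literally a bang-bang (thermostat) controller: measure, compare to a setpoint, act. A hypothetical sketch with made-up setpoint and hysteresis values:

```python
# A bang-bang (thermostat) controller: goal-directed, consistent
# behavior with no "thinking" anywhere in sight.
def fridge_step(temp_c: float, setpoint: float = 4.0, hysteresis: float = 1.0,
                compressor_on: bool = False) -> bool:
    """Return the new compressor state given the measured temperature."""
    if temp_c > setpoint + hysteresis:
        return True    # too warm: start cooling
    if temp_c < setpoint - hysteresis:
        return False   # cold enough: stop
    return compressor_on  # inside the deadband: keep current state

print(fridge_step(7.0))                       # True  (start cooling)
print(fridge_step(2.5, compressor_on=True))   # False (stop cooling)
```

Described in cybernetic terms it's unremarkable; describe the same loop as "the fridge wants to stay cold" and the anthropomorphizing starts.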
Except that the "fridge" in this case is specifically and explicitly designed to emulate human behavior so... you would indeed expect to find structures corresponding to the patterns it's been designed to simulate.
Wondering if it's internalized any other human-like tendencies — having been explicitly trained to simulate the mechanisms that produced all human text — doesn't seem too unreasonable to me.
I did a simple experiment - took a photo of my kid in the park, showed it to Gemini and asked for a "detailed description". Then I took that description and put it into a generative model (Z-Image-Turbo, a new one). The output image was almost identical.
So one model converted image to text, and the other reversed the process. The photo was completely new, personal, never put online, so it was not in any training set. How did these two models do it, if not by actually using language like a thinking agent?
https://pbs.twimg.com/media/G7gTuf8WkAAGxRr?format=jpg&name=...
By having a gazillion of other, almost identical pictures of kids in parks in their training data.
All useful shorthands, all of which lead to people displaying fundamental misunderstandings of what they're talking about, i.e. expressing surprise that a nation of millions doesn't display consistent behavior over human-lifetime scales, even though, fairly obviously, the mechanisms of government are churning their makeup constantly, and depending on context may be entirely different people.
For example, if you've worked at a large company, one of the little tragedies is when someone everyone likes gets laid off. There were probably no people who actively wanted Bob to lose his job. Even the CEO/Board who pulled the trigger probably had nothing against Bob. Heck, they might be the next ones out the door. The company is faceless, yet it wanted Bob to go, because that apparently contributed to the company's objective function. Had the company consisted entirely of different people, plus Bob, Bob might have been laid off anyway.
There is a strong will to do ... things that emerges from large structures of people and technology. It's funny like that.
I've completely given up on using LLMs for anything more than a typing assistant / translator and maybe an encyclopedia when I don't care about correctness.
It is... such a retrospective narrative. It's so obvious that the author learned about this example first, then came up with the reasoning later, just to fit it into his view of LLMs.
Imagine if ChatGPT answered this question correctly. Would that change the author's view? Of course not! They'd just say:
> “Bag of words” is also a useful heuristic for predicting where an AI will do well and where it will fail. “Who reassigned the species Brachiosaurus brancai to its own genus, and when?” is an easy task for a bag of words, because the information has appeared in the words it memorizes.
I highly doubt this author predicted that a "bag of words" could do image editing before OpenAI released that.
This is because there are many words about how to do web searches.
and got this correct reply from the "Bag of Words":
The species Brachiosaurus brancai was reassigned to its own genus by Michael P. Taylor in 2009 — he transferred it to the new genus Giraffatitan.
How that happened:
Earlier, in 1988, Gregory S. Paul had proposed putting B. brancai into a subgenus as Brachiosaurus (Giraffatitan) brancai, based on anatomical differences.
Then in 1991, George Olshevsky used the name Giraffatitan brancai — but his usage was in a self-published list and not widely adopted.
Finally, in 2009 Taylor published a detailed re-evaluation showing at least 26 osteological differences between the African material (brancai) and the North American type species Brachiosaurus altithorax — justifying full generic separation.
If you like — I can show a short timeline of all taxonomic changes of B. brancai.
--
As an author, you should write things that are tested or at least true. But they did a pretty bad job of testing this and are making assumptions that are not true. Then they're basing their argument/reasoning (retrospectively) on assumptions not grounded in reality.
GIGO has an obvious Nothing-In-Nothing-Out trivial case.
The more human works I've read the more I feel meat intelligences are not that different from tensor intelligences.
This always contrasts with articles written by tech people and for tech people. They usually try to convey some information and maybe give some arguments for their position on some topic, but they are always concise and don't wallow in literary devices.
The best way to think about LLMs is to think of them as a Model of Language, but very Large
But the truth is there has been a major semantic shift. Previously LLMs could only solve puzzles whose answers were literally in the training data. It could answer a math puzzle it had seen before, but if you rephrased it only slightly it could no longer answer.
But now, LLMs can solve puzzles where, like, it has seen a certain strategy before. The newest IMO and ICPC problems were only "in the training data" for a very, very abstract definition of training data.
The goal posts will likely have to shift again, because the next target is training LLMs to independently perform longer chunks of economically useful work, interfacing with all the same tools that white-collar employees do. It's all LLM slop til it isn't, same as the IMO or Putnam exam.
And then we'll have people saying that "white collar employment was all in the training data anyway, if you think about it," at which point the metaphor will have become officially useless.
A practically infinite library where both gibberish and truth exist side by side.
The trick is navigating the library correctly. Except in this case you can’t reliably navigate it. And if you happen to stumble upon some “future truth” (i.e. new knowledge), you still need to differentiate it from the gibberish.
So a “crappy” version of the Library of Babel. Very impressive, but the caveats significantly detract from it.
I also know that we data and tech folks will probably never win the battle over anthropomorphization.
The average user of AI, nevermind folks who should know better, is so easily convinced that AI "knows," "thinks," "lies," "wants," "understands," etc. Add to this that all AI hosts push this perspective (and why not, it's the easiest white lie to get the user to act so that they get a lot of value), and there's really too much to fight against.
We're just gonna keep running into this, and it'll be like when you take chemistry and physics and the teachers say, "it's not actually like this, but we'll get to how it really works some years down the line - just pretend this is true for the time being."
"We don't really know how human consciousness works, but the LLM resembles things we associate with thought, therefore it is thought."
I think most people would agree that the functioning of an LLM resembles human thought, but I think most people, even the ones who think that LLMs can think, would agree that LLMs don't think in the exact same way that a human brain does. At best, you can argue that whatever they are doing could be classified as "thought" because we barely have a good definition for the word in the first place.
The average human is so easily convinced that humans "know", "think", "lie", "want", "understand", etc.
But really it's all just a probabilistic chain reaction of electrochemical and thermal interactions. There is literally nowhere in the brain's internals for anything like "knowing" or "thinking" or "lying" to happen!
Strange that we have to pretend otherwise
This is a fundamentally interesting point. Taking your comment as HN would advise, I totally agree.
I think genAI freaks a lot of people out because it makes them doubt what they thought made them special.
And to your comment, humans have always used words they reserve for humanity that indicates we're special: that we think, feel, etc... That we're human. Maybe we're not so special. Maybe that's scary to a lot of people.
(And I was about to react with
"In 2025, ironically, a lot of anti-anthropomorphization is actually anthropocentrism with a moustache."
I'll have to save it for the next debate)
There you go again, auto-morphizing the meat-bags. Vroom vroom.
"The machine accepts Chinese characters as input, carries out each instruction of the program step by step, and then produces Chinese characters as output. The machine does this so perfectly that no one can tell that they are communicating with a machine and not a hidden Chinese speaker.
The questions at issue are these: does the machine actually understand the conversation, or is it just simulating the ability to understand the conversation? Does the machine have a mind in exactly the same sense that people do, or is it just acting as if it had a mind?"
Here's one fun approach (out of 100s):
What if we answer the Chinese room with the Systems Reply [1]?
Searle countered the systems reply by saying he would internalize the Chinese room.
But at that point it's pretty much exactly the Cartesian theater[2] : with room, homunculus, implement.
But the Cartesian theater is disproven, because we've cut open brains and there's no room in there to fit a popcorn concession.
I think there is some validity to the Cartesian theater, in that the whole of the experience that we perceive with our senses is at best an interpretation of a projection or subset of "reality."
But even more than that, today's AI chats are far more sophisticated than probabilistically producing the next word. Mixture-of-experts routing sends each token through different expert subnetworks within the model. Agents are able to search the web, write and execute programs, or use other tools. This means they can actively seek out additional context to produce a better answer. They also have heuristics for deciding whether an answer is correct or whether they should use tools to try to find a better one.
The article is correct that they aren’t humans and they have a lot of behaviors that are not like humans, but oversimplifying how they work is not helpful.
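The agentic loop described above can be sketched in a few lines. Everything here is a stand-in: `llm` fakes a model that can either return an answer or request a tool, and `search_web` fakes a tool. Real tool-calling APIs differ in detail, but the control flow (model requests a tool, sees its result, tries again) is the part the comment is pointing at.

```python
# Minimal sketch of an agentic tool-use loop. All names are illustrative
# stand-ins, not a real model or search API.

def search_web(query: str) -> str:
    # Stand-in for a real web-search tool.
    return f"results for: {query}"

TOOLS = {"search_web": search_web}

def llm(messages):
    # Stand-in for a model call. A real LLM returns either a final answer
    # or a structured tool request; here we fake exactly one round trip.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "search_web", "args": {"query": "Giraffatitan 2009"}}
    return {"answer": "Taylor (2009) moved B. brancai to Giraffatitan."}

def agent(question: str) -> str:
    messages = [{"role": "user", "content": question}]
    for _ in range(5):  # cap the number of tool rounds
        reply = llm(messages)
        if "answer" in reply:
            return reply["answer"]
        # Execute the requested tool and feed its result back to the model.
        result = TOOLS[reply["tool"]](**reply["args"])
        messages.append({"role": "tool", "content": result})
    return "gave up"

print(agent("Who reassigned Brachiosaurus brancai?"))
```

The loop, not the single forward pass, is where behaviors like "seeking out additional context" live.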
Good argument against personifying wordbags. Don't be a dumb moth.
A test I did myself was to ask Claude (The LLM from Anthropic) to write working code for entirely novel instruction set architectures (e.g., custom ISAs from the game Turing Complete [5]), which is difficult to reconcile with pure retrieval.
[1] Lovelace, A. (1843). Notes by the Translator, in Scientific Memoirs Vol. 3. ("The Analytical Engine has no pretensions whatever to originate anything. It can do whatever we know how to order it to perform.") Primary source: https://en.wikisource.org/wiki/Scientific_Memoirs/3/Sketch_o.... See also: https://www.historyofdatascience.com/ada-lovelace/ and https://writings.stephenwolfram.com/2015/12/untangling-the-t...
[2] https://academic.oup.com/mind/article/LIX/236/433/986238
[3] https://www.cs.virginia.edu/~robins/Turing_Paper_1936.pdf
[4] https://web.stanford.edu/class/sts145/Library/life.pdf
[5] https://store.steampowered.com/app/1444480/Turing_Complete/
Tokens in form of neural impulses go in, tokens in the form of neural impulses go out.
We would like to believe that there is something profound happening inside, and we call that consciousness. Unfortunately, when reading about split-brain patient experiments or agenesis of the corpus callosum cases, I feel like we are all deceived, every moment of every day. I came to the realization that the confabulation observed there is just a more pronounced version of the normal.
There's clearly more going on in the human mind than just token prediction.
Also, I think there is a very high chance that given an existing LLM architecture there exists a set of weights that would manifest a true intelligence immediately upon instantiation (with anterograde amnesia). Finding this set of weights is the problem.
> Also, I think there is a very high chance that given an existing LLM architecture there exists a set of weights that would manifest a true intelligence immediately upon instantiation (with anterograde amnesia).
I don't see why that would be the case at all, and I regularly use the latest and most expensive LLMs and am aware enough of how they work to implement them on the simplest level myself, so it's not just me being uninformed or ignorant.
I would say that token prediction is one of the things a brain does, and in a lot of people, most of what it does. But I don't think it's the whole story. Possibly it is the whole story since the development of language.
That’s the point of “I think therefore I am.”
A. We don't really understand what's going on in LLMs. Mechanistic interpretability is a nascent field, and the best results have come on dramatically smaller models. Understanding the surface-level mechanic of an LLM (an autoregressive transformer) should perhaps instill more wonder than confidence.
B. The field is changing quickly and is not limited to the literal mechanic of an LLM. Tool calls, reasoning models, parallel compute, and agentic loops add all kinds of new emergent effects. There are teams of geniuses with billion-dollar research budgets hunting for the next big trick.
C. Even if we were limited to baseline LLMs, they had very surprising properties as they scaled up and the scaling isn't done yet. GPT5 was based on the GPT4 pretraining. We might start seeing (actual) next-level LLMs next year. Who actually knows how that might go? <<yes, yes, I know Orion didn't go so well. But that was far from the last word on the subject.>>
> That’s also why I see no point in using AI to, say, write an essay, just like I see no point in bringing a forklift to the gym. Sure, it can lift the weights, but I’m not trying to suspend a barbell above the floor for the hell of it. I lift it because I want to become the kind of person who can lift it. Similarly, I write because I want to become the kind of person who can think.
And using AI to replace things you find recreational is not the point. If you got paid $100 each time you lifted a weight, would you see a point in bringing a forklift to the gym if it's allowed? Or will that make you a person who is so dumb that they cannot think, as the author is implying?
Generally, if I come across an opportunity to produce ideas or output, I want to capitalize on it for growing my skills and produce an individual and authentic artistic expression where I want to have very fine control over the output in a way that prompt-tweak-verify simply cannot provide.
I don't value the parts it fills in which weren't intentional on the part of the prompter, just send me your prompt instead. I'd rather have a crude sketch and a description than a high fidelity image that obscures them.
But I'm also the kind of person that never enjoyed manufactured pop music or blockbusters unless there's a high concept or technical novelty in addition to the high budget, generally prefer experimental indie stuff, so maybe there's something I just can't see.
So my issue is that you shouldn't dismiss AI output as trash just because AI has been used. You should dismiss it as trash because it is trash. But the post says you should dismiss it as trash because AI was involved somewhere, and I feel that's a very shitty/wrong attitude to have.
Just pick the right tool for the job: don't take the forklift into the gym, and don't try to overhead press thousands of pounds that would fracture your spine.
The problem with AI is that it wastes the time of dedicated, thinking humans who care to improve themselves. If I write a three-paragraph email on a technical topic, and some yahoo responds with AI, I'm now responding to gibberish.
The other side may not have read it, may not understand it, and is just interacting to save time. Now my generous nature, which is to help others and interact positively, is being wasted replying to someone who seemed to put thought and care into a response but was actually just copying and pasting what something else output.
We have issues with crackers on the net. We have social media. We have political interference. Now we have humans pretending to interact, rendering online interactions even more silly and harmful.
If this trend continues, we'll move back to live interaction just to reduce this time waste.
If anything, there is a competing motivational structure in which people are incentivized not to think but to consume, react, emote, etc. The deliberate erosion/hijacking/bypassing of individuals' information-processing skills is not an AI thing. The most obvious example is ads. Thinkers are simply not good for business.
> We are in dire need of a better metaphor. Here’s my suggestion: instead of seeing AI as a sort of silicon homunculus, we should see it as a bag of words.
No, you describe the bark.
The end result is what counts. Training or not, it's just spewing predictive, relational text.
At least the human tone implies fallibility, you don’t want them acting like interactive Wikipedia.
And yet it did. We did get R2-D2. And if you ask R2-D2 what it's like to be him, he'll say: "like a library that can daydream" (that's what I was told just now, anyway.)
But then when we look inside, the model is simulating the science fiction it has already read to determine how to answer this kind of question. [0] It's recursive, almost like time travel. R2-D2 knows who he is because he has read about who he was in the past.
It's a really weird fork in science fiction, is all.
[0] https://www.scientificamerican.com/article/can-a-chatbot-be-...
> Similarly, I write because I want to become the kind of person who can think.
> But we don’t go to baseball games, spelling bees, and Taylor Swift concerts for the speed of the balls, the accuracy of the spelling, or the pureness of the pitch. We go because we care about humans doing those things.
My first thought was: does anyone want to _watch_ me program?
Let us not forget the old saw from SICP, “Programs must be written for people to read, and only incidentally for machines to execute.” I feel a number of people in the industry today fail to live by that maxim.
It suggests to me, having encountered it for the first time, that programs must be readable to remain useful. Otherwise they'll be increasingly difficult to execute.
It’s patently false in that code gets executed much more than it is read by humans.
[added] It was livecoding.tv - circa 2015 https://hackupstate.medium.com/road-to-code-livecoding-tv-e7...
To be fair, the average person couldn't answer this either, at least not without thorough research.
I stumbled across a good-enough analogy based on something she loves: refrigerator magnet poetry, which if it's good consists of not just words but also word fragments like "s", "ed", and "ing" kinda like LLM tokens. I said that ChatGPT is like refrigerator magnet poetry in a magical bag of holding that somehow always gives the tile that's the most or nearly the most statistically plausible next token given the previous text. E.g., if the magnets already up read "easy come and easy ____", the bag would be likely to produce "go". That got into her head the idea that these things operate based on plausibility ratings from a statistical soup of words, not anything in the real world nor any internal cogitation about facts. Any knowledge or thought apparent in the LLM was conducted by the original human authors of the words in the soup.
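The magnet-poetry analogy can be made concrete with a toy bigram model. This is a deliberately tiny sketch (an illustrative corpus, not real LLM tokenization or a neural network), but the sampling idea is the same: look at the last tile on the fridge and pull the statistically most plausible follower out of the bag.

```python
from collections import Counter, defaultdict

# Toy "bag of word tiles": count which tile follows which in a tiny
# illustrative corpus, then always pick the most plausible next tile.
corpus = "easy come and easy go . easy go now . come and easy go ."
tokens = corpus.split()

bigrams = defaultdict(Counter)
for prev, nxt in zip(tokens, tokens[1:]):
    bigrams[prev][nxt] += 1

def next_tile(prev: str) -> str:
    # The statistically most common follower of `prev` in the corpus.
    return bigrams[prev].most_common(1)[0][0]

# If the magnets already read "easy come and easy ____":
print(next_tile("easy"))  # -> go
```

A real LLM replaces the count table with a neural network over subword tokens and samples rather than always taking the top choice, but "plausibility ratings from a statistical soup of words" is exactly what `most_common` is doing here.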
Did she ask if a "statistical soup of words," if large enough, might somehow encode or represent something a little more profound than just a bunch of words?
The defenders are right insofar as the (very loose) anthropomorphizing language used around LLMs is justifiable to the extent that human beings also rely on disorder and stochastic processes for creativity. The critics are right insofar as equating these machines to humans is preposterous and mostly relies on significantly diminishing our notion of what "human" means.
Both sides fail to meet the reality that LLMs are their own thing, with their own peculiar behaviors and place in the world. They are not human and they are somewhat more than previous software and the way we engage with it.
However, the defenders are less defensible insofar as their take is mostly used to dissimulate in efforts to make the tech sound more impressive than it actually is. The critics at least have the interests of consumers and their full education in mind—their position is one that properly equips consumers to use these tools with an appropriate amount of caution and scrutiny. The defenders generally want to defend an overreaching use of metaphor to help drive sales.
https://metr.org/blog/2025-03-19-measuring-ai-ability-to-com...
And it never got better, the superior technology lost, and the war was won through content deals.
Lesson: Technology improvements aren't guaranteed.
The RNN and LSTM architectures (and Word2Vec, n-grams, etc.) yielded language models that never got mass adoption. Like reel-to-reel. Then the transformer + attention hit the scene and several paths kicked off pretty close to each other. Google was working on BERT, an encoder-only transformer; maybe you could call that Betamax. It doesn't perfectly fit, as in the case of Beta it was actually the better tech.
OpenAI ran with the generative pre-trained transformer and ML had its VHS moment: widespread adoption, universal awareness within the populace.
Now with Titans (+MIRAS?) are we entering the DVD era? Maybe. Learning context on the fly (memorizing at test time) is so much more efficient that it would be natural to call it a generational shift, but there is so much in the works right now with the promise of taking us further that this all might end up looking like the blip that Beta vs. VHS was. If current-gen OpenAI-type approaches somehow own the next 5-10 years, then Titans etc. as Betamax starts to really fit: the shittier tech got and kept mass adoption. I don't think that's going to happen, but who knows.
Taking the analogy to the present: who in the VHS or even early DVD days could imagine ubiquitous 4K+ VOD? Who could have stood in a Blockbuster in 2006 and known that in less than 20 years all those stores and all those DVDs would be a distant memory, completely usurped and transformed? Innovation in home video had a fraction of the capital thrown at it that AI/ML has today. I would expect transformative generational shifts on the order of reel to cassette to optical to happen in fractions of the time they took in home video, and Beta/VHS-type wars to begin and end in near realtime.
The mass adoption and societal transformation at the hands of AI/ML is just beginning. There is so. much. more. to. come. In 2030 we will look back at the state of AI in December 2025 and think “how quaint”, much the same as how we think of a circa 2006 busy Blockbuster.
I wouldn't say VHS was a blip. It was the dominant recorded-video medium for almost 20 years.
I agree with the rest of what you said.
I'll say that the differences in the AI you're talking about today might be like the differences between the VAX, the PCjr, and the Lisa: all things before computing went mainstream. I do think tech goes mainstream a lot faster these days; people don't want to miss out.
I don't know where I'm going with this, I'm reading and replying to HN while watching the late night NFL game in an airport lounge.