don't search the internet. This is a test to see how well you can craft non-trivial, novel and creative proofs given a "number theory and primitive sets" math problem. Provide a full unconditional proof or disproof of the problem.
{{problem}}
REMEMBER - this unconditional argument may require non-trivial, creative and novel elements.
Then "Thought for 80m 17s": https://chatgpt.com/share/69dd1c83-b164-8385-bf2e-8533e9baba...
With LLMs I just read back a few turns and I'm back in the loop.
My experience of those utterances is that they're purely phatic mimicry: they lack genuine intuitive surprise; they're just marking a very odd shift in direction. The problem isn't the lack of a path, it's that the rhetorical follow-up to those leaps is usually a relevant result, so the stream of tokens ends up rapidly over-playing its own conviction. That's why it's necessary (and often ineffective) to tell them to validate their findings thoroughly: too much of their training is "That's odd" followed by "Eureka!" and not "Nevermind…"
I find the AI pronouncing things "interesting!" less interesting on the basis that even though in this case it crops up in the thinking rather than flattering the user in the chat, it's almost as much of an AI affectation as the emdash.
But in general exclamations of "interesting!" seems like the stereotypical AI default towards being effusive, and we've all seen the chat logs where AI trained to write that way responding with "interesting", "great insight!" towards a user's increasingly dubious inputs is an antipattern...
-----------------------------
Yes. In fact the proposed bound is true, and the constant 1 is sharp.
Let w(a) = 1/(a log a).
I will prove that, uniformly for every primitive A ⊂ [x, ∞), ∑_{a∈A} w(a) ≤ 1 + O(1/log x), which is stronger than the requested 1 + o(1).
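As a purely numerical illustration of the quantity being bounded (not part of any proof): the primes at least x form a primitive set, since no prime divides another, so we can compute the partial sum ∑ 1/(a log a) over primes in [x, N] for a finite cutoff N and watch it sit below 1 as x grows. The cutoff 10^6 and the sample values of x here are arbitrary choices for the sketch.

```python
import math

def sieve(n):
    """Sieve of Eratosthenes: all primes up to n."""
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(n**0.5) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i in range(2, n + 1) if is_prime[i]]

def erdos_sum(A):
    """The weighted sum  sum_{a in A} 1/(a log a)  over a finite set A."""
    return sum(1.0 / (a * math.log(a)) for a in A)

# The primes >= x are a primitive set (no element divides another).
primes = sieve(10**6)
for x in (10, 100, 1000):
    tail = erdos_sum(p for p in primes if p >= x)
    print(f"x = {x:>5}:  partial sum over primes in [{x}, 10^6] ~ {tail:.4f}")
```

The printed tails shrink as x increases, consistent with (though of course not proving) a bound of the form 1 + O(1/log x).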
https://chatgpt.com/share/69ed8e24-15e8-83ea-96ac-784801e4a6...
https://chat.deepseek.com/share/nyuz0vvy2unfbb97fv
Comes up with a proof.
Asking the llm to structure its response in plan and implementation, allowing it to call tools like python, sage, lean etc.
Originally someone said "I wish I was math smart to know if [this vibe-mathematics proof] worked or not." They did NOT say "I'd like to check but I am too lazy." Suggesting "ask it to formalize it in Lean" is useless if you're not mathematically mature enough to understand the proof, since that means you're not mathematically mature enough to understand how to formalize the problem.
Then "likely easier" is a moot point. A Lean program you're not knowledgeable enough to sanity-check is precisely as useless as a math proof you're not knowledgeable enough to read.
I think this was key. Otherwise the LLM could think it can't be done.
All this is far more expensive to serve so it’s locked away behind paid plans.
I won’t even leave chatGPT on “Auto” under any circumstances - it’s vastly worse on hallucinations, sycophancy, everything, basically.
Anyway, your needs may be met perfectly fine on the free tier product, but you’re using a very different product than the Pro tier gets.
I'd guess / hope the Pro one has the full context window.
He had a habit of seeking out and documenting mathematical problems people were working on.
The problems range in difficulty from "easy homework for a current undergrad in math" to "you're getting a Fields Medal if you can figure this out".
There's nothing that really connects the problems other than the fact that one of the smartest people of the last 100 years didn't immediately know the answer when someone posed it to him.
One of the things people have been doing with LLMs is to see if they can come up with proofs for these problems as a sort of benchmark.
Each time there's a new model release a few more get solved.
I'm no expert, but based on the commentary from mathematicians, this Erdős proof is a unique milestone because the problem received previous attention from multiple professional mathematicians, and the proof was surprising, elegant, and revealed some new connections.
The previous ChatGPT Erdős proofs have been qualitatively less impressive, more akin to literature search or solving easier problems that have been neglected.
Reading the prompt[1], one wonders if stoking the model to be unconventional is part of the success: "this ... may require non-trivial, creative and novel elements"
[1] https://chatgpt.com/share/69dd1c83-b164-8385-bf2e-8533e9baba...
I've long suspected that a lot of these models' real capabilities are still locked behind certain prompts, despite the big labs spending tons of effort on making default responses to simple prompts better. Even really dumb shit like "Answer this: ..." vs "Question: ..." vs "... you'll be judged by <competitor>" that should have zero impact in an ideal world can significantly impact benchmark results. The problem is that you can waste a ton of time finding the right prompt via these "dumb" approaches, when the model actually just needed some very specific context that was obvious to you and not to it, as happens in many day-to-day situations. My go-to method is still to have the model ask me questions as the very first step on any of these problems. They kind of tried that with deep research since the early o-series, but it still needs improvement.
Awesome term/info, and (completely orthogonal to whether they’ll take err jerbs): I’m really excited about the social/civic picture that might be enabled by a defined and verifiable ontological and taxonomical foundation shared across humanity, particularly coupled with potential ‘legislation as code’ or ‘legal system as code’ solutions.
I’m thinking on a time horizon a bit past my own lifespan, but: even the possibility to objectively map out some specific aspect of a regional approach to social rights in a given time period and consider it with another social framework, alongside automated & verifiable execution of policy, irrespective of the language of origin is incredible.
Instead of hundreds and thousands of incommensurate legislative silos we might create a bazaar of shared improvement and governance efficiency. Turnkey mature governance and anti-corruption measures for newborn nations and countries trying to break out of vicious historical exploitation cycles. Fingers crossed.
A "dumber"/vague framing will get a less insightful solution, or possibly no solution at all.
I don't even necessarily think this is a critical flaw - in general it's just the model tuning its responses to your style of prompt. People utilize LLMs for all kinds of different tasks, and the "modes of thought" for responding to an Erdős problem versus software engineering versus a more human/soft-skills topic are all very different. I think the "prompt sensitivity" issue just comes bundled along with this general behavior.
Interestingly, it was an elegant technique, but the proof still required a lot of work.
You can say this problem needed a low amount of total creativity, but saying it's void of all creativity seems wrong.
Which gets to the other possibility of having a list of distinct things and then iterating over all pairs or combinations. Which I probably would not qualify as "creative" work.
---
i've been thinking about raph's definition of creativity [0]: permuting one set of ideas with another set of ideas
(or trying an idea in new contexts)
this is a systematic process, doable even by machine once enough pattern libraries have been catalogued.
on a small scale, there's sprint.cards [1] or oblique strats [2]. on a large scale, there's llms...
it's freeing to approach creativity as a deliberate practice rather than waiting on some fickle muse. yet it's a bit disappointing to see idea generation so mechanical and dehumanized.
i am comforted by the value of mushy human abilities surrounding the creative process:
mostly 1) taste, the ability to recognize pleasing output,
...
If you had a list of N concepts and M ways to apply them you could try all N*M combinations, and get some very interesting results. For a real example, see the theory of inventive problem solving (TRIZ)'s amusing "40 principles of invention" by Soviet inventor Genrich Altshuller. https://en.wikipedia.org/wiki/TRIZ
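A minimal sketch of that N*M enumeration. The concept and principle lists below are hypothetical stand-ins (the real TRIZ catalogue has 40 principles); the point is only that the cross-product is mechanical:

```python
from itertools import product

# Hypothetical example lists; the actual TRIZ catalogue has 40 principles.
concepts = ["battery", "bicycle", "window"]
principles = ["segmentation", "inversion", "nesting", "do it in reverse"]

# Try all N*M pairings; each pairing is a prompt for a candidate idea.
ideas = [f"apply '{p}' to '{c}'" for c, p in product(concepts, principles)]

print(len(ideas))   # N*M candidate combinations
print(ideas[0])
```

Whether any given pairing is worth anything is exactly the "taste" step mentioned above; the enumeration itself is the easy, machine-doable part.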
That's a great point. It's in line with research being carried on the backs of graduate students, whose work is to hyperfocus on areas.
Not surprising, because the two words you used are synonyms. Who ever classified mathematical work as creative? Kids in third-grade math class?
> that LLM far outperforms human.
LLMs only outperform humans in creating loads of bullshit. 6 years in and they remain shiny toys for easily impressionable idiots.
Witten is the canonical example of someone taking mathematics techniques and applying them to physics problems, but what made him legendary was the opposite direction: he used physical intuition and string theory to solve open problems in pure mathematics.
Yeah, you should look into the Langlands project sometime
[1] e.g. https://www.sciencenewstoday.org/left-brain-vs-right-brain-t...
I remember one of my professors, a coauthor of Erdős, boasted to us after a quiz how proud he was that he had been able to assign an Erdős problem that went unsolved for a while as just a quiz problem for his undergrads.
So this is proof of the models actually getting stronger (previous generations of LLMs were unable to solve this one).
No, it's not.
While I don't dispute that new models may perform better at certain tasks, the fact that someone was able to use them to solve a novel problem is not proof of this.
LLM output is nondeterministic. Given the same prompt, the same LLM will generate different output, especially when it involves a large number of output tokens, as in this case. One of those attempts might produce a correct output, but this is not certain, and is difficult if not impossible for a human not expert in the domain to determine this, as shown in this thread.
This is how I feel when I read any mathematics paper.
The formulas were opaque, notations unique and unconventional, terms appearing out of nowhere, sometimes standard techniques (like 'we did least-squares optimization') are expanded in detail, while other actually complex parts are glossed over.
When a model gives a really good answer, does that just mean it’s seen the problem before? When it gives a crappy answer, is that not simply indicating the problem is novel?
2) Jared Lichtman is indeed a mathematician at Stanford University but is involved in the AI startup math.inc, which seems more relevant here. Terence Tao is involved in a partnership program with that startup.
3) Liam Price is a general AI booster on Twitter. A lot of AI boosting on Twitter is not organic and who knows what help he got. Nothing in this Twitter is organic.
4) Scientific American is owned by Springer Nature, which is an AI booster:
You can't, but given that it's a previously unsolved problem, it doesn't seem relevant? (nor are the author's potential biases - the claims are easily verified independently)
I think LLMs can help in limited cases like this by just coming up with a different way of approaching a problem. It doesn’t have to be right, it just needs to give someone an alternative and maybe that will shake things up to get a solution.
That said, I have no idea what the practical value of this Erdős problem is. If you asked me if this demonstrates that LLMs are not junk. My general impression is that is like asking me in 1928 if we should spent millions of dollars of research money on number theory. The answer is no and get out of my office.
Looking at the website, this problem was never discussed by humans. The last comments were about GPT discovering it. I was expecting older comments on a 60-year-old problem.
Am I missing something?
Great discovery though; there might be other problems in the same situation that are worth a "gpt check".
If models are able to pull and join information that already existed in pieces but humankind never discovered by itself, doesn’t this count towards progress anyways?
If the reason it was able to output the proof is that it happened to be included in an in-house university report written in Georgian, then that would make it less useful for research than if it's new entirely.
"An amateur just solved a 60-year-old math problem—by asking AI"
A more honest title would be:
"An AI just solved a 60-year-old math problem—after being asked by an amateur"
(Imagine the headline claimed instead that a professor just solved a math problem by asking a grad student.)
We also actually do devote millions in public funds to enable top mathematicians to spend much of their time studying mathematical problems, but it turns out that there are a lot of problems, solving them is hard, and sometimes they like to spend their time devising new problems instead. Perhaps some people currently dedicating their efforts to writing trading algorithms would also prove adept at devising novel proofs to more abstract mathematics problems, but I don't think UBI is changing their personal priorities...
Hindsight is 20/20.
Some people think that multiplying numbers, remembering a large number of facts, and being good at calculations is intelligence.
Most intelligent people do not think that.
Eventually, we will arrive at the same conclusion for what LLMs are doing now.
I find it's helpful to avoid conflating the following three topics:
/1/ Is the tool useful?
/2/ At scale, what is the economic opportunity and social/environmental impact?
/3/ Is the tool intelligent?
Casual observation suggests that most people agree on /1/. An LLM can be a useful tool. (Present case: someone found a novel approach to a proof.) So are pocket calculators, personal computers, and portable telephones. None of these tools confers intelligence, although these tools may be used adeptly and intelligently.
For /2/, any level of observation suggests that LLMs offer a notable opportunity and have a social/environmental impact. (Present case: students benefitted in their studies.) A better understanding comes with Time() ... our species is just not good at preparing for risks at scale. The other challenge is that competing interests may see economic opportunities that don't align for social/environmental Good.
Topic /3/ is of course the source of energetic, contentious debate. Any claim of intelligence for a tool has always had a limited application. Even a complex tool like a computer, a modern aircraft, or a guided missile is not "intelligent". These tools are meant to be operated by educated/trained personnel. IBM's Deep Blue and Watson made headlines -- but was defeating humans at games proof of Intelligence?
On this particular point, we should worry seriously about conferring trust and confidence on stochastic software in any context where we expect humans to act responsibly and be fully accountable. No tool, no software system, no corporation has ever provided a guarantee that harm won't ensue. Instead, they hire very smart lawyers.
Hah. It reminds me of this great quote, from the '80s:
> There is a related “Theorem” about progress in AI: once some mental function is programmed, people soon cease to consider it as an essential ingredient of “real thinking”. The ineluctable core of intelligence is always in that next thing which hasn’t yet been programmed. This “Theorem” was first proposed to me by Larry Tesler, so I call it Tesler’s Theorem: “AI is whatever hasn’t been done yet.”
We are seeing this right now in the comments. 50 years later, people are still doing this! Oh, this was solved, but it was trivial, of course this isn't real intelligence.
Are you also going to argue definitions of life before we even learned of microscopic or single cell organisms are correct and that the definitions we use today are wrong? That they are shifting goal posts? That “centuries later, people are still doing this”? No, that would be absurd.
For example, ~2 years ago, an expert in ML publicly made this remark on stage: LLMs can't do math. Today they absolutely and obviously, can. Yet somehow it's not impressive anymore. Or, and this is the key part of the quote, this is somehow not related to "intelligence". Something that 2 years ago was not possible (again, according to a leading expert in this field), is possible today. And yet this is somehow something that they always could do, and since they're doing it today, is suddenly no longer important. On to the next one!
No idea why this is related to darwin or definitions of life. The definitions don't change. What people considered important 2 years ago, is suddenly not important anymore. The only thing that changed is that today we can see that capability. Ergo, the quote holds.
See, that’s a poor argument already. Anyone could counter that with other experts in ML publicly making remarks that AI would have replaced 80% of the work force or cured multiple diseases by now, which obviously hasn’t happened. That’s about as good an argument as when people countered NFT critics by citing how Clifford Stoll said the internet was a fad.
> made this remark on stage: LLMs can't do math. Today they absolutely and obviously, can.
How exactly are “LLMs can’t” and “do math” defined? As you described it, that sentence does not mean “will never be able to”, so there’s no contradiction. Furthermore, it continues to be true that you cannot trust LLMs on their own for basic arithmetic. They may e.g. call an external tool to do it, but pattern matching on text isn’t sufficient.
> The definitions don't change.
Of course they do, what are you talking about? Definitions change all the time with new information. That’s called science.
Definitions don't change. What changes is the idea that, now that they can do it, it no longer counts as intelligence. And that's literally moving the goalposts. Read the thread here, go to the bottom part. There are zillions of comments saying this.
You are keen to not trying to understand what the quote is saying. This is not good faith discussion, and it's not going anywhere. We're already miles from where we started. The quote is an observation (and an old one at that) about goalposts moving. If you can't or won't see that, there's no reason to continue this thread.
That is not the argument. The point is that the way you phrased it is ambiguous. “Math” isn’t a single thing, and “cannot” can either mean “cannot yet” or “cannot ever”. I don’t know what the “expert” said since you haven’t provided that information, I’m directly asking you to clarify the meaning of their words (better yet, link to them so we can properly arrive at a consensus).
> Definitions don't change.
Yes they do! All the time!
https://www.merriam-webster.com/wordplay/words-that-used-to-...
> And that's literally moving the goalposts.
Good example. There are no literal goal posts here to be moved. But with the new accepted definition of the words, that’s OK.
> There are zillions of comments saying this.
Saying what, exactly? Please be clear, you keep being ambiguous. The thread barely crossed a couple of hundred comments as of now, there are not “zillions” of comments in agreement of anything.
> You are keen to not trying to understand what the quote is saying. (…) If you can't or won't see that, there's no reason to continue this thread.
Indeed, if you ascribe wrong motivations and put a wall before understanding what someone is arguing, there is indeed no reason to continue the thread. The only wrong part of your assessment is who is doing the thing you’re complaining about.
He seems to be fixated on this notion that humans are static and do not evolve - clearly this is false. What people thought as being a determinant for intelligence also changes as things evolve.
Doing formalized mathematics is as intelligent as multiplying numbers together.
The only reason why it's so hard now is that the standard notation is the equivalent of Roman numerals.
When you start using a sane metalanguage, and not just augmented English, to do proofs, you gain the same increase in capabilities as going from word equations to algebra.
But the Roman numerals are easy. I was able to use them before 1st grade and I can't touch any "standard notation" to this day.
Proposing and proving something like Gödel's theorems definitely requires intelligence.
Solving an already proposed problem is just crunching through a large search space.
You can just about make out those goalposts on the surface of the moon with a good telescope at this point.
How is this not just another proposed problem (albeit with a search space much larger than an Erdos problem's)?
But this isn't a fair bar to hold it to. There are plenty of intelligent people out there, including 99% of professional mathematicians, who never invent new fields of mathematics.
ChatGPT equalizes intelligence. And that is an attack on their identity. It also exposes their ACTUAL intelligence which is to say most of HN is not too smart.
Citation needed
Yes, I love living in communism too. Imagine if you had to pay money for it or something. The wealthiest people would get unrestricted access to intelligence while the poor none. And the people in the middle would eventually find themselves unable to function without a product they can no longer afford. Chilling, huh? Good thing humans are known for sharing in the benefits of technological progress equally. /s
Before ChatGPT it cost ~$100,000 to acquire intelligence good enough to solve this Erdős problem; now it costs ~$200.
I'm really confused at what you are even taking an issue with.
What was that about "spreading FUD about unaffordability"?
[1] https://ourworldindata.org/grapher/share-living-with-less-th...
Please show me the steps to get a $200 subscription for free that works 100% of the time regardless of who you are. I'm listening.
You are exaggerating the situation by essentially claiming since some people can’t afford 200 dollars this means ChatGPT is not democratising intelligence. It’s a bit strange to claim this because according to you it only becomes affordable when maximal number of people can afford it. It’s a bit childish.
Directionally it is democratising. Are more people able to afford higher level intelligence? Yes.
It flattened the difference between a top epsilon percentile mathematician and an amateur with money. It didn't flatten the difference between an amateur with a little money and an amateur with a lot of money. It widened it. That's the part I'm scared about.
You are shrugging this off because it currently isn't that expensive. But we're talking about the massively subsidized price here, which is bound to get orders of magnitude higher when the bubble pops. Models are also likely to get much better. If it gets to a point where the only way to obtain exceptionally high intelligence is with an exceptionally high net worth and vice versa, how is that going to democratize anything?
Most people would consider someone who can calculate 56863*2446 instantly in their head to be intelligent. Does that mean pocket calculators are intelligent? The result is the same.
> then they are the one meant to be doing the defining, and to tell us how it can be tested for. If they can't, then there's no reason to pay attention to any of it.
That is the equivalent of responding to criticism with “can you do better?”. One does not need to be a chef (or even know how to cook) to know when food tastes foul. Similarly, one does not need to have a tight definition of “life” to say a dog is alive but a rock isn’t. Definitions evolve all the time when new information arises, and some (like “art”) we haven’t been able to pin down despite centuries of thinking about it.
With real general intelligence you'd expect it to solve problems above a certain difficulty at a good clip
I don't doubt that there are many very real and meaningful limitations of these systems that deserve to be called out. But "text generation" isn't doing that work.
Again if you want to say they're limited in some way, I'm all ears, I'm sure they are. But none of that has anything to do with "statistical text generation". Apparently, a huge chunk of all knowledge work is "statistical text generation". I choose to draw from that the conclusion that the "text generation" part of this is not interesting.
You seem to be making the claim that LLMs are statistical text generators, but statistical text generation is good enough to succeed in certain cases. Those are different arguments. What do you actually believe? Are we even in disagreement?
So you agree that LLMs are in fact statistical text generators but you don’t like people use that fact in arguments about the capabilities of the things?
But it is no longer useful to bring that fact up when conversing about their capabilities. Saying "well it's a statistical text generator so ..." is approximately as useful as saying "well it's made of atoms so ...". There are probably some very niche circumstances under which statements of each of those forms is useful but by and large they are not and you can safely ignore anyone who utters them.
(To be clear: I'm not agreeing or disagreeing. I sometimes feel the same too. I'm just curious how others reconcile these.)
If/when these things solve our hardest problems, that's going to lead to some very uncomfortable conversations and realizations.
Of course LLMs are still absolutely useless at actual maths computation, but I think this is one area where AI can excel --- the ability to combine many sources of knowledge and synthesise, may sometimes yield very useful results.
Also reminds me of the old saying, "a broken clock is right twice a day."
> Every Mathematician Has Only a Few Tricks
>
> A long time ago an older and well-known number theorist made some disparaging remarks about Paul Erdös’s work.
> You admire Erdös’s contributions to mathematics as much as I do,
> and I felt annoyed when the older mathematician flatly and definitively stated
> that all of Erdös’s work could be “reduced” to a few tricks which Erdös repeatedly relied on in his proofs.
> What the number theorist did not realize is that other mathematicians, even the very best,
> also rely on a few tricks which they use over and over.
> Take Hilbert. The second volume of Hilbert’s collected papers contains Hilbert’s papers in invariant theory.
> I have made a point of reading some of these papers with care.
> It is sad to note that some of Hilbert’s beautiful results have been completely forgotten.
> But on reading the proofs of Hilbert’s striking and deep theorems in invariant theory,
> it was surprising to verify that Hilbert’s proofs relied on the same few tricks.
> Even Hilbert had only a few tricks!
>
> - Gian-Carlo Rota - "Ten Lessons I Wish I Had Been Taught"
https://www.ams.org/notices/199701/comm-rota.pdf

We may have collectively filled libraries full of books, and created yottabytes of digital data, but in the end, to create something novel, somebody has to read and understand all of this stuff. Obviously this is not possible. Read one book per day from birth to death and you still only get to consume about 80*365 = 29,200 books in the best case, from the millions upon millions of books that have been written.
So these "few tricks" are the accumulation of a lifetime of mathematical training, the culmination of the slice of knowledge that the respective mathematician immersed themselves into. To discover new math and become famous you need both the talent and skill to apply your knowledge in novel ways, but also be lucky that you picked a field of math that has novel things with interesting applications to discover plus you picked up the right tools and right mental model that allows you to discover these things.
This does not go for math only, but also for pretty much all other non-trivial fields. There is a reason why history repeats.
And it's actually a compelling argument why AI is still a big deal even though it's at its core a parrot. It's a parrot yes, but compared to a human, it actually was able to ingest the entirety of human knowledge.
Even this, though, is not useful, to us.
It remains true that, a life without struggle, and achievement, is not really worth living...
So, it is nice that there is something that could possibly ingest the whole of human knowledge, but that is still not useful, to us.
People are still making a hullabaloo about "using AI" in companies, and there was some nonsense about there will be only two types of companies, AI ones and defunct ones, but in truth, there will simply be no companies...
Anyways I'm sure I will get down voted by the sightless lemmings on here...
The combinatorial nature of trying things randomly means that it would take millennia or longer for light-speed monkeys typing at a keyboard, or GPUs, to solve such a problem without direction.
By now, people should stop dismissing RL-trained reasoning LLMs as stupid, aimless text predictors or combiners. They wouldn’t say the same thing about high-achieving, but non-creative, college students who can only solve hard conventional problems.
Yes, current LLMs likely still lack some major aspects of intelligence. They probably wouldn’t be able to come up with general relativity on their own with only training data up to 1905.
Neither did the vast majority of physicists back then.
Indeed, and so do current humans! And just like LLMs, humans are bad at keeping this fact in view.
On a more serious note, we're going to have a hard time until we can psychologically decouple the concepts of intelligence and consciousness. Like, an existentially hard time.
I've been using LLMs for much the same purpose: solving problems within my field of expertise where the limiting factor is not intelligence per se, but the ability to connect the right dots from among a vast corpus of knowledge that I would never realistically be able to imbibe and remember over the course of a lifetime.
Once the dots are connected, I can verify the solutions and/or extend them in creative ways with comparatively little effort.
It really is incredible what otherwise intractable problems have become solvable as a result.
I don’t know what this claim is supposed to mean.
If it isn’t supposed to have a precise technical meaning, why is it using the word “interpolate”?
and homo sapiens, glancing at the clock when it happens to be right, may conjure an entire zodiac to explain it.
A broken clock can be broken in ways which result in it never being correct.
They are not great at playing chess as well - computational as well as analytic.
Further evidence for the faultiness of your claim, if you don't want to take me up on that: I handed problems off to GPT5 to check my own answers. None of the dumb mistakes I make or missed opportunities for simplification are in the book, and, again: it's flawless at pointing out those problems, despite being primed with a prompt suggesting I'm pretty sure I have the right answers.
I found and fixed bugs I wrote into the formulas and spreadsheets, and the LLMs were not my sole reference, but once the LLM mentioned the names of concepts and functions, I used Wikipedia for the general gist of things, and I appreciated the LLMs' relevant explanations that connected these disciplines together.
I did this on March 14, 2026
That's one way to waste a ton of tuition money to just have a clanker do your learning for you.
Unless you're teaching it, in which case I hope your salary is cut by whatever percentage your clanker reduces your workload.
80 hours! 80 hours of just trying shit!
That is not nothing, no matter how much you hate AI.
There, fixed that for you.
Then my second question is how much VC money did all those tokens cost.
It's absolutely the best allocator of human effort there is. It has some problems, but compared to the alternatives it's almost perfect.
There’s something else out there that nobody has the imagination to personally figure out and build alignment toward.
It can also be true that capitalism is transitory to get to a place where much of the capital one needs is invented.
It absolutely does if you look at facts and not "vibes". There are fewer people starving now than ever before, and it's a giant, giant difference. We are tackling more and more diseases thanks to big pharma. Even semi-socialist countries such as China have opened their markets. Basically the only countries that do not implement capitalist solutions are the ones you'd never want to live in, such as North Korea or Cuba (funny thing: even China urged Cuba to free its markets).
That's not to say that there aren't benefits to tertiary education, for many people in different contexts. It's just not the golden path that it's made out to be.
Many people currently in college are just wasting their money and should enroll in trades programs instead.
Meanwhile, nothing about being in or out of school is mutually exclusive to using LLMs as a force multiplier for learning - or solving math problems, apparently.
(Of course, those problems are on another plane than this one.)
These are absolutely worth studying, but given what they are, nobody should be dumping massive amounts of money on them. I would not find it persuasive if researchers used LLMs to solve the Collatz conjecture or finally decode Etruscan. These are extremely valuable, but it is unlikely to be worth having an LLM grind tokens like crazy to do it.
This is after the fact justification. You are arguing that because a thing (number theory) showed practical applications we should have dumped a lot more effort into it. There is no basis for this argument whatsoever; it also seems to involve inventing a time machine. Number theory had no practical applications until the development of public-key cryptography, but you cannot make funding decisions based on the future since it’s unknowable.
Once we get something working, sure, you can justify more aggressive investment. This is not to say that we should not invest in pie-in-the-sky ideas. We absolutely should and need to. Moonshot research or even somewhat esoteric research is vital, but the current investment in AI is so far out of the ballpark of rational. There’s an energy of a fait accompli here, except it’s still very plausible this is all unsustainable and the market implodes instead.
You are completely missing the point. The point is that we should invest in pure maths because it has always been an investment with very good ROI. The funding should be focused on what experts believe will advance pure maths more (not whether we believe that in 100 years this specific area will find some application) and that's pretty much what we are doing right now. I think it's just your anti-AI sentiment that's clouding your judgement and since AI succeeded in proving pure maths results, you are inclined to downplay it by saying that well, pure maths is worthless anyway.
It's so expensive!
How is he even posing the question and having even a vague idea of what the proof means or how to understand it?
Seems like standard 23 year old behavior. You're spending $100-$200/mo on the pro subscription, and want to get your money's worth. So you burn some tokens on this legendarily hard math problem sometimes. You've seen enough wrong answers to know that this one looks interesting and pass it on to a friend that actually knows math, who is at a place where experts can recognize it as correct.
Seems like a classic example of a non-expert human labeling ML output.
"He sent it to his occasional collaborator Kevin Barreto, a second-year undergraduate in mathematics at the University of Cambridge."
So basically two undergrads/graduates in math; "advanced" is subjective at that point.
The article you linked (thanks for the unpaywalled link, by the way) describes him only as an amateur mathematician, but describes Barreto as a math student. If they were both math students, I feel it would say so?
Or perhaps you're arguing it's implicit in him having solved the problem? If so, you're just assuming your conclusion. "AI didn't prove it by itself; Price was a mathematician. Well, he must have been a mathematician to be able to prove it!"