How scientists are using Claude to accelerate research and discovery
80 points | 5 hours ago | 5 comments | anthropic.com
jadenpeterson
4 hours ago
[-]
Not to be a luddite, but large language models are fundamentally not meant for tasks of this nature. And listen to this:

> Most notably, it provides confidence levels in its findings, which Cheeseman emphasizes is crucial.

These 'confidence levels' are suspect. You can ask Claude today, "What is your confidence in __" and it will, unsurprisingly, give a 'confidence interval'. I'd like to better understand the system implemented by Cheeseman. Otherwise I find the whole thing, heh, cheesy!
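
To make that concrete, here is a minimal sketch of the naive elicitation using the Anthropic Python SDK. The model id, the helper, and the example question are placeholders of my own, not anything from the article, and certainly not whatever system Cheeseman actually uses:

    # Naive confidence elicitation: the "confidence" is just more sampled
    # text, produced by the same next-token process as the answer itself.
    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    MODEL = "claude-3-5-sonnet-latest"  # placeholder model id

    def ask(prompt: str) -> str:
        msg = client.messages.create(
            model=MODEL,
            max_tokens=256,
            messages=[{"role": "user", "content": prompt}],
        )
        return msg.content[0].text

    answer = ask("Does gene X regulate pathway Y? Answer briefly.")
    confidence = ask(f"What is your confidence in this answer?\n\n{answer}")
    print(answer)
    print(confidence)  # a number or an interval comes back, but nothing grounds it

Whatever figure comes back is generated the same way as the answer; nothing ties it to an actual error rate.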

reply
isoprophlex
1 hour ago
[-]
I've spent the last ~9 months building a system that, amongst other things, uses a VLM to classify and describe >40 million images of house number signs across all of Italy. I wish I were joking, but that aside.

When asked about their confidence, these things are almost entirely useless. If the Magic Disruption Box is incapable of knowing whether or not it read "42/A" correctly, I'm not convinced it's gonna revolutionize science by doing autonomous research.

reply
bob1029
1 hour ago
[-]
How exactly are we asking for the confidence level?

If you give the model the image and a prior prediction, what can it tell you? Asking for it to produce a 1-10 figure in the same token stream as the actual task seems like a flawed strategy.
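
Roughly what I have in mind, as a sketch: a second pass that sees only the image plus the prior prediction and answers one narrow question, rather than emitting a score in the same stream as the extraction. The model id, prompts, and AGREE/DISAGREE scheme here are assumptions of mine, not anything from the article or the parent's pipeline:

    # Second-pass check: the verifier gets the image and the prior reading,
    # nothing else, and is asked a narrow question. Sample it a few times
    # and treat the agreement rate as a rough confidence score.
    import base64
    import anthropic

    client = anthropic.Anthropic()
    MODEL = "claude-3-5-sonnet-latest"  # placeholder model id

    def verify(image_path: str, prior_prediction: str) -> str:
        with open(image_path, "rb") as f:
            data = base64.standard_b64encode(f.read()).decode()
        msg = client.messages.create(
            model=MODEL,
            max_tokens=16,
            messages=[{
                "role": "user",
                "content": [
                    {"type": "image",
                     "source": {"type": "base64",
                                "media_type": "image/jpeg",
                                "data": data}},
                    {"type": "text",
                     "text": f"A previous pass read this house number sign as "
                             f"'{prior_prediction}'. Answer only AGREE or DISAGREE."},
                ],
            }],
        )
        return msg.content[0].text.strip()

    # e.g. verify("sign.jpg", "42/A") -> "AGREE" or "DISAGREE"

It's still the same model doing the judging, but at least the score isn't riding along in the token stream that produced the reading in the first place.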

reply
Yajirobe
42 minutes ago
[-]
A blind mathematician can do revolutionary work despite not being able to see
reply
red75prime
2 hours ago
[-]
> large language models are fundamentally not meant for tasks of this nature

There should be research results showing their fundamental limitations, as opposed to empirical observations. Can you point to them?

What about VLMs, VLAs, LMMs?

reply
utopiah
2 hours ago
[-]
Old "Jagged Technological Frontier" study, but it explains a bit of the challenge: https://www.hbs.edu/faculty/Pages/item.aspx?num=64700 Namely, it's hard, and the lack of reproducibility (models becoming inaccessible to researchers quickly) makes these kinds of studies very challenging.
reply
red75prime
2 hours ago
[-]
That is an old empirical study. jadenpeterson was talking about some fundamental limitations of LLMs.
reply
eurekin
1 hour ago
[-]
I made a toy order item cost extractor out of my pile of emails. Claude added confidence percentage tracking and it couldn't be more useless.
reply
post_below
2 hours ago
[-]
Finding patterns in large datasets is one of the things LLMs are really good at. Genetics is an area where scientists have already done impressive things with LLMs.

However you feel about LLMs (and I'm guessing you're not a fan, since you don't have to use them for very long before you see how useful they can be with large datasets), they are undeniably incredible tools in some areas of science.

https://news.stanford.edu/stories/2025/02/generative-ai-tool...

https://www.nature.com/articles/s41562-024-02046-9

reply
catlifeonmars
1 hour ago
[-]
In reference to the second article: who cares? What we care about is experimental verification. I could see maybe accurate prediction being helpful in focusing funding, but you still gotta do the experimentation.

Not disagreeing with your initial statement about LLMs being good at finding patterns in datasets, btw.

reply
refurb
2 hours ago
[-]
As a scientist, the two links you provided are severely lacking in utility.

The first developed a model to calculate protein function based on DNA sequence - yet it provides no results from testing the model. Until it does, it's no better than the hundreds of predictive models thrown on the trash heap of science.

The second tested a model's “ability to predict neuroscience results” (which reads really oddly). How did they test it? They pitted humans against LLMs in determining which published abstracts were correct.

Well yeah? That’s exactly what LLMs are good at - predicting language. But science is not advanced by predicting which abstracts of known science are correct.

It reminds me of my days working with computational chemists - we had an x-ray structure of the molecule bound to the target. You can't get much better than that for hard, objective data.

“Oh yeah, if you just add a methyl group here you’ll improve binding by an order of magnitude”.

So we went back to the lab, spent a week synthesizing the molecule, sent it to the biologists for a binding study. And the new molecule was 50% worse at binding.

And that's not to blame the computational chemist. Biology is really damn hard. Scientists are constantly surprised by results that contradict current knowledge.

Could LLMs be used in the future to help come up with broad hypotheses in new areas? Sure! Are the hypotheses going to prove fruitless most of the time? Yes! But that’s science.

But any claim of a massive leap in scientific productivity (whether LLMs or something else) should be taken with a grain of salt.

reply
troupo
2 hours ago
[-]
> Finding patterns in large datasets is one of the things LLMs are really good at.

Where by "good at" you mean "are totally shit at"?

They routinely hallucinate things even on tiny datasets like codebases.

reply
post_below
1 hour ago
[-]
I don't follow the logic that "it hallucinates so it's useless". In the context of codebases I know for sure that they can be useful. Large datasets too. Are they also really bad at some aspects of dealing with both? Absolutely. Dangerously, humorously bad sometimes.

But the latter doesn't invalidate the former.

reply
djtango
3 hours ago
[-]
Can't LLMs be fed the entire corpus of literature to synthesise useful intersections (if not "insight")? Not to mention much better search than what was available when I was a lowly grad student...
reply
vimda
3 hours ago
[-]
This is what Yann LeCun means when he talks about how research is at a dead end at the moment, with everyone all-in on LLMs to a fault.
reply
agumonkey
1 hour ago
[-]
I'm just a noob, but LeCun seems obsessed with the idea of world models, which I assume means a more rigorous physical approach, and I don't understand (again, confused noob here) how that would help with precise abstract thinking.
reply
alsetmusic
4 hours ago
[-]
Call me when a disinterested third-party says so. PR announcements by the very people who have a large stake in our belief in their product are unreliable.
reply
joshribakoff
3 hours ago
[-]
This company predicts software development is a dead occupation, yet it ships a mobile chat UI that appears to be perpetually full of bugs and has had a number of high-profile incidents.
reply
simonw
3 hours ago
[-]
"This company predicts software development is a dead occupation"

Citation needed?

Closest I've seen to that was Dario saying AI would write 90% of the code, but that's very different from declaring the death of software development as an occupation.

reply
NewsaHackO
4 hours ago
[-]
Is your argument that the quotes by the researchers in the article are not real?
reply
catlifeonmars
1 hour ago
[-]
The point is to look at who is making a claim and ask what they hope to gain from it. This is orthogonal to what the claim actually is, really. It's just basic skepticism.

Even if the article is accurate, it still makes sense to question the motives of the publisher. Especially if they’re selling a product.

reply
taormina
3 hours ago
[-]
What quotes? This is an AI summary that may or may not have summarized actual quotes from the researchers, but I don't see a single quote in this article, or a source.
reply
famouswaffles
3 hours ago
[-]
Why are you commenting if you can't even take a few minutes to read this? It's quite bizarre. There's a quote and repo for Cheeseman, and a paper for Biomni.
reply
WD-42
3 hours ago
[-]
There is only one quote in the entire article, though:

> Cheeseman finds Claude consistently catches things he missed. “Every time I go through I’m like, I didn’t notice that one! And in each case, these are discoveries that we can understand and verify,” he says.

Pretty vague and not really quantifiable. You would think an article making a bold claim would contain more than a single, hand-wavy quote from an actual scientist.

reply
famouswaffles
3 hours ago
[-]
>Pretty vague and not really quantifiable. You would think an article making a bold claim would contain more than a single, hand-wavy quote from an actual scientist.

Why? What purpose would quotes serve better than a paper with numbers and code? Just seems like nitpicking here. The article could have gone without a single quote (or had several more) and it wouldn't really change anything. And that quote is not really vague in the context of the article.

reply
inferiorhuman
2 hours ago
[-]
Credibility. Why would I bother reading AI slop put out by a company that makes money by convincing people to pay for AI slop?
reply
inferiorhuman
3 hours ago
[-]
Conflict of interest is a thing. The researchers could be AI hallucinations. The quotes could be too. Or the researchers could be real and intentionally saying things that are untrue. Who knows.

What is interesting is that HN seems to have reached a crescendo of AI fanboi posts. Yet if you step outside the bubble, the Microsoft and Nvidia CEOs are begging people to actually like AI, Dell has come out and said that people don't want AI, and forums are littered with people complaining about the negative consequences of AI. Go figure.

reply
simonw
3 hours ago
[-]
Most people aren't software developers. The HN audience can benefit from LLMs in ways that many people don't value.
reply
bpodgursky
3 hours ago
[-]
Are you accusing Anthropic of hallucinating an MIT lab under the MIT domain? I mean they literally link to it https://cheesemanlab.wi.mit.edu/
reply
NewsaHackO
3 hours ago
[-]
Honestly, it doesn't even seem like they read the article; they just came in, saw it was pro-AI, and commented.
reply
famouswaffles
3 hours ago
[-]
You know things have shifted a gear when people just start flat out denying reality.
reply
inferiorhuman
3 hours ago
[-]
Anthropic puts out plenty of AI slop. I'll wait for a human who doesn't have a financial interest in propping up Anthropic to review the slop before passing judgement.
reply
username223
4 hours ago
[-]
Pairs well with this: https://hegemon.substack.com/p/the-age-of-academic-slop-is-u...

Taking CV-filler from 80% to 95% of published academic work is yet another revolutionary breakthrough on the road to superintelligence.

reply
subdavis
3 hours ago
[-]
> scholarly dark matter that exists to pad CVs and satisfy bureaucratic metrics, but which no one actually reads or relies upon.

Is it cynical to believe this is already true and has been forever?

Is it naive to hope that when AI can do this work, we will all admit that much of the work was never worth doing in the first place, our academic institutions are broken, and new incentives are sorely needed?

I’m reminded of a chapter in Abundance where Ezra Klein notes how successful (NIH?) grant awardees are getting older over time, nobody will take risks on young scientists, and everyone is spending more of their time churning out bureaucratic compliance than doing science.

reply
LegitShady
2 hours ago
[-]
oh look another advertisement for anthropic
reply
Ronsenshi
19 minutes ago
[-]
Steady stream of these, very regularly. Lately it feels like this place is a marketing board for AI-anything companies.
reply
desireco42
3 hours ago
[-]
By paying Anthropic large sums of money?!?

Funny you say that.

reply