Major AI conference flooded with peer reviews written by AI
65 points
1 hour ago
| 14 comments
| nature.com
| HN
jampa
1 hour ago
[-]
While I think there's significant AI "offloading" in writing, the article's methodology relies on "AI detectors," which reads like PR for Pangram. I don't need to explain why AI detectors are mostly bullshit and harmful to people who have never used LLMs. [1]

1: https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...

reply
nkrisc
14 minutes ago
[-]
> Pangram’s analysis revealed that around 21% of the ICLR peer reviews were fully AI-generated, and more than half contained signs of AI use. The findings were posted online by Pangram Labs. “People were suspicious, but they didn’t have any concrete proof,” says Spero. “Over the course of 12 hours, we wrote some code to parse out all of the text content from these paper submissions,” he adds.

But what's the proof? How do you prove (with any rigor) a given text is AI-generated?

reply
getnormality
24 minutes ago
[-]
I wouldn't be surprised if the headline is accurate, but AI detectors are widely understood to be unreliable, and I see no evidence that this AI detector has overcome the well-deserved stigma.
reply
SoftTalker
17 minutes ago
[-]
In particular, conference papers are already extremely formulaic, organized in a particular way and using a lot of the same stock phrasings and terms of art. AI or not, it's hard to tell them apart.
reply
JohnCClarke
20 minutes ago
[-]
The question is not whether the reviews are AI-generated. The question is whether the reviews are accurate.
reply
stanfordkid
11 minutes ago
[-]
Exactly this. What matters is whether the research is actually useful and correct. And if the reviews are accurate, shouldn't that elicit extreme applause instead of schadenfreude? It's feeling a bit like a click-bait rage-fantasy fueled by Pangram, capitalizing on the idea that AI promotes plagiarism / replaces jobs and now the creators of AI are oh-too human... and somehow this AI-detection product is above it all.
reply
raincole
1 hour ago
[-]
> Controversy has erupted after 21% of manuscript reviews for an international AI conference were found to be generated by artificial intelligence.

21%...? Am I reading that right? I bet no one expected it to be so low when they clicked this title.

reply
conartist6
41 minutes ago
[-]
21% fully AI generated. In other words, 21% blatant fraud.

In accident investigation we often refer to "holes in the Swiss cheese lining up." Dereliction of duty is commonly one of the holes that lines up with all the others, and it is apparently rampant in this field.

reply
tmule
36 minutes ago
[-]
Why? I often feed an entire document I hastily wrote into an AI and prompt it to restructure and rewrite it. I think that’s a common pattern.
reply
conartist6
33 minutes ago
[-]
It might be, but I really doubt those were the documents flagged as fully AI generated. If it erased all the originality you had put into that work and made it completely bland and regressed-to-the-mean, I would hope that you would notice.
reply
exe34
18 minutes ago
[-]
> I would hope that you would notice.

He didn't say he read it carefully after running it through the slop machine.

reply
hnaccount_rng
1 hour ago
[-]
My initial reaction was: oh no, who would have thought? But then... 21% is almost shockingly low. Especially since there are almost certainly some false positives, given that this number originates with a company selling AI-generated-text detection.
reply
cratermoon
22 minutes ago
[-]
Headline should be "AI vendor’s AI-generated analysis claims AI generated reviews for AI-generated papers at AI conference".

h/t to Paul Cantrell https://hachyderm.io/@inthehands/115633840133507279

reply
hiddencost
1 hour ago
[-]
Automated AI detection tools do not work. This whole article is premised on an analysis by someone trying to sell their garbage product.
reply
AznHisoka
54 minutes ago
[-]
Yeah, that's the premise all of these articles/tools just conveniently brush off. "We detected that x%..." OK, and how do I know your detection algorithm is right?
reply
conartist6
37 minutes ago
[-]
Usually the detectors are only called in once a basic "smell test" has failed. Those tests are imperfect, yes, but Bayesian probability tells us how to work out the rest. I have 0 trouble believing that the prior probability of an unscrupulous individual offloading an unpleasant and perceived-as-just-ceremonial duty to the "thinking machine" is around 20%. See: https://www.youtube.com/watch?v=lG4VkPoG3ko&pp=ygUZdmVyaXRhc...
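A back-of-the-envelope version of that Bayesian update, with made-up numbers: the 20% prior is the figure suggested above, but the 95% true-positive and 5% false-positive rates are pure assumptions for illustration, not Pangram's published figures.

```python
# Hypothetical detector characteristics -- illustrative only.
prior = 0.20   # P(AI): prior probability a review is AI-generated
tpr = 0.95     # P(flagged | AI): assumed true-positive rate
fpr = 0.05     # P(flagged | human): assumed false-positive rate

# Bayes' rule: P(AI | flagged) = P(flagged | AI) * P(AI) / P(flagged)
p_flag = tpr * prior + fpr * (1 - prior)
posterior = tpr * prior / p_flag
print(round(posterior, 3))  # → 0.826
```

Under those assumed rates, a flag raises the 20% prior to roughly 83%; with a worse false-positive rate the posterior drops quickly, which is the whole dispute about these tools.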
reply
exe34
16 minutes ago
[-]
Could the big names make a ton of money here by selling AI detectors? They would need to store everything they generate, and then provide a % match to something they produced.
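A minimal sketch of what that "% match" could look like, assuming a provider logged its generated text (no provider exposes such an API; the function names and shingle size here are made up for illustration): compare character shingles with Jaccard similarity.

```python
def shingles(text, n=8):
    # Normalize whitespace, then take overlapping n-character windows.
    text = " ".join(text.lower().split())
    return {text[i:i + n] for i in range(max(1, len(text) - n + 1))}

def percent_match(candidate, stored_outputs, n=8):
    # Best Jaccard similarity between the candidate and any logged generation.
    cand = shingles(candidate, n)
    best = 0.0
    for stored in stored_outputs:
        s = shingles(stored, n)
        if cand or s:
            best = max(best, len(cand & s) / len(cand | s))
    return 100 * best

log = ["The findings suggest a significant improvement over the baseline."]
print(percent_match("The findings suggest a significant improvement over the baseline.", log))  # → 100.0
```

Even this toy version hints at the catch: it only matches verbatim or near-verbatim text, so any paraphrasing pass would defeat it.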
reply
NitpickLawyer
32 minutes ago
[-]
This is the kind of situation where everything sucks. You'd think that one of the biggest AI conferences out there would have seen this coming.

On the one hand (and the most important thing, IMO), it's really bad to judge people on the basis of "AI detectors", especially when this can have an impact on their career. It's also used in education, and that sucks even more. AI detectors have poor accuracy, can't catch concerted evasion efforts (i.e. finetunes will trick every detector out there, I've tried), can have insane false positives (the first ones that got to "market" rated the Declaration of Independence as 100% AI-written), and at best they'll only catch the most vanilla outputs.

On the other hand, working with these things and just being online, it's impossible to say that I don't see the signs everywhere. Vanilla LLMs fixate on certain language patterns, and once you notice them, you see them everywhere. It's not just x; it was truly y. Followed by one supportive point, the second supportive point, and the third supportive point. And so on. Coupled with that vague overview style and not much depth, it's really easy to call out blatant generations as you see them. It's like everyone writes in LinkedIn-infused mania episodes now. It's getting old fast.

So I feel for the people who got slop reviews. I'd be furious. Especially when it's a faux pas to call it out.

I also feel for the reviewers that maybe got caught in this mess for merely "spell checking" their (hopefully) human written reviews.

I don't know how we'll fix it. The only reasonable thing for the moment seems to be drilling into everyone that, at the end of the day, they own their stuff. Be it homework, a PR, or a comment on a blog. Some are obviously more important than others, but still. Don't submit something you can't defend, especially when your education/career/reputation depends on it.

reply
ZeroConcerns
1 hour ago
[-]
The claim "written by AI" is not really substantiated here, and as someone who's repeatedly been accused recently of submitting AI-generated content that was all honestly stuff I wrote myself (hey, what can I say? I just like em-dashes...), I sort of sympathize?

Yes, AI slop is an issue. But throwing more AI at detecting this, and most importantly, not weighing that detection properly, is an even bigger problem.

And, HN-wise, "this seems like AI" seems like a very good inclusion in the "things not to complain about" FAQ. Address the idea, not the form of the message, and if it's obviously slop (or SEO, or self-promotion), just downvote (or ignore) and move on...

reply
stevemk14ebr
54 minutes ago
[-]
Banning calling out AI slop hardly seems like an improvement
reply
ZeroConcerns
38 minutes ago
[-]
What I'm advocating is a "downvote (or ignore) and move on" attitude, as opposed to an "I'm going to post about this" stance. Because, similar to "your color scheme is not a11y-friendly" or "you're posting affiliate links" or "this is effectively a paywall", there is zero chance of a productive conversation sprouting from that.
reply
JohnCClarke
18 minutes ago
[-]
What percentage of the papers were written by AI?

And, if your AI can't write a paper, are you even any good as an AI researcher? :^)

reply
p1esk
14 minutes ago
[-]
Did you mean: “if your AI can’t write a paper that passes an AI detector, are you any good as an AI researcher?”
reply
heresie-dabord
1 hour ago
[-]
AI research is interesting, but AI Slop is the monetising factor.

It's inevitable that faces will be devoured by AI Leopards.

reply
xhkkffbf
1 hour ago
[-]
Shouldn't AIs be able to participate in deciding their future?

If they had a conference on, say, the Americans, wouldn't it be fair for Americans to have a seat at the table?

reply
subscribed
11 minutes ago
[-]
I hope it's tongue-in-cheek.
reply