Major AI conference flooded with peer reviews written by AI
65 points
1 hour ago
| 14 comments
| nature.com
| HN
jampa
1 hour ago
[-]
While I think there's significant AI "offloading" in writing, the article's methodology relies on "AI detectors," which reads like PR for Pangram. I don't need to explain why AI detectors are mostly bullshit and harmful to people who have never used LLMs. [1]

1: https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...

reply
nkrisc
14 minutes ago
[-]
> Pangram’s analysis revealed that around 21% of the ICLR peer reviews were fully AI-generated, and more than half contained signs of AI use. The findings were posted online by Pangram Labs. “People were suspicious, but they didn’t have any concrete proof,” says Spero. “Over the course of 12 hours, we wrote some code to parse out all of the text content from these paper submissions,” he adds.

But what's the proof? How do you prove (with any rigor) a given text is AI-generated?

reply
getnormality
24 minutes ago
[-]
I wouldn't be surprised if the headline is accurate, but AI detectors are widely understood to be unreliable, and I see no evidence that this AI detector has overcome the well-deserved stigma.
reply
SoftTalker
17 minutes ago
[-]
In particular, conference papers are already extremely formulaic, organized in a particular way and using a lot of the same stock phrasings and terms of art. AI or not, it's hard to tell them apart.
reply
JohnCClarke
20 minutes ago
[-]
The question is not whether the reviews are AI-generated. The question is whether the reviews are accurate.
reply
stanfordkid
11 minutes ago
[-]
Exactly this. What matters is whether the research is actually useful and correct. And if the reviews are accurate, shouldn't that elicit extreme applause instead of schadenfreude? It's feeling a bit like a click-bait rage-fantasy fueled by Pangram, capitalizing on the idea that AI promotes plagiarism / replaces jobs and now the creators of AI are oh-too human... and somehow this AI-detection product is above it all.
reply
raincole
1 hour ago
[-]
> Controversy has erupted after 21% of manuscript reviews for an international AI conference were found to be generated by artificial intelligence.

21%...? Am I reading that right? I bet no one expected it to be so low when they clicked this title.

reply
conartist6
41 minutes ago
[-]
21% fully AI generated. In other words, 21% blatant fraud.

In accident investigation we often refer to "holes in the Swiss cheese lining up." Dereliction of duty is commonly one of the holes that lines up with all the others, and it is apparently rampant in this field.

reply
tmule
36 minutes ago
[-]
Why? I often feed an entire document I hastily wrote into an AI and prompt it to restructure and rewrite it. I think that’s a common pattern.
reply
conartist6
33 minutes ago
[-]
It might be, but I really doubt those were the documents flagged as fully AI generated. If it erased all the originality you had put into that work and made it completely bland and regressed-to-the-mean, I would hope that you would notice.
reply
exe34
18 minutes ago
[-]
> I would hope that you would notice.

He didn't say he read it carefully after running it through the slop machine.

reply
hnaccount_rng
1 hour ago
[-]
My initial reaction was: oh no, who would have thought? But then... 21% is almost shockingly low. Especially since there are almost certainly some false positives, given that this number originates with a company selling AI-generated-text detection.
reply
cratermoon
22 minutes ago
[-]
Headline should be "AI vendor’s AI-generated analysis claims AI generated reviews for AI-generated papers at AI conference".

h/t to Paul Cantrell https://hachyderm.io/@inthehands/115633840133507279

reply
hiddencost
1 hour ago
[-]
Automated AI detection tools do not work. This whole article is premised on an analysis by someone trying to sell their garbage product.
reply
AznHisoka
54 minutes ago
[-]
Yeah, that's the premise all of these articles/tools just conveniently brush off. "We detected that x%..." OK, and how do I know your detection algorithm is right?
reply
conartist6
37 minutes ago
[-]
Usually the detectors are only called in once a basic "smell test" has failed. Those tests are imperfect, yes, but Bayesian probability tells us how to work out the rest. I have 0 trouble believing that the prior probability of an unscrupulous individual offloading an unpleasant and perceived-as-just-ceremonial duty to the "thinking machine" is around 20%. See: https://www.youtube.com/watch?v=lG4VkPoG3ko&pp=ygUZdmVyaXRhc...
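A back-of-the-envelope version of that Bayesian update, with made-up numbers: the 20% prior is the figure suggested above, but the 95% true-positive and 5% false-positive rates are pure assumptions for illustration, not Pangram's published figures.

```python
# Hypothetical detector characteristics -- illustrative only.
prior = 0.20   # P(AI): prior probability a review is AI-generated
tpr = 0.95     # P(flagged | AI): assumed true-positive rate
fpr = 0.05     # P(flagged | human): assumed false-positive rate

# Bayes' rule: P(AI | flagged) = P(flagged | AI) * P(AI) / P(flagged)
p_flag = tpr * prior + fpr * (1 - prior)
posterior = tpr * prior / p_flag
print(round(posterior, 3))  # → 0.826
```

Under those assumed rates, a flag raises the 20% prior to roughly 83%; with a worse false-positive rate the posterior drops quickly, which is the whole dispute about these tools.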
reply
exe34
16 minutes ago
[-]
Could the big names make a ton of money here by selling AI detectors? They would need to store everything they generate, and then provide a % match to something they produced.
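A minimal sketch of what that "% match" could look like, assuming a provider logged its generated text (no provider exposes such an API; the function names and shingle size here are made up for illustration): compare character shingles with Jaccard similarity.

```python
def shingles(text, n=8):
    # Normalize whitespace, then take overlapping n-character windows.
    text = " ".join(text.lower().split())
    return {text[i:i + n] for i in range(max(1, len(text) - n + 1))}

def percent_match(candidate, stored_outputs, n=8):
    # Best Jaccard similarity between the candidate and any logged generation.
    cand = shingles(candidate, n)
    best = 0.0
    for stored in stored_outputs:
        s = shingles(stored, n)
        if cand or s:
            best = max(best, len(cand & s) / len(cand | s))
    return 100 * best

log = ["The findings suggest a significant improvement over the baseline."]
print(percent_match("The findings suggest a significant improvement over the baseline.", log))  # → 100.0
```

Even this toy version hints at the catch: it only matches verbatim or near-verbatim text, so any paraphrasing pass would defeat it.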
reply
NitpickLawyer
32 minutes ago
[-]
This is the kind of situation where everything sucks. You'd think that one of the biggest AI conferences out there would have seen this coming.

On the one hand (and the most important thing, IMO), it's really bad to judge people on the basis of "AI detectors", especially when this can have an impact on their career. It's also used in education, and that sucks even more. AI detectors have poor accuracy, can't catch concerted evasion efforts (i.e. finetunes will trick every detector out there, I've tried), can have insane false positives (the first ones that got to "market" rated the Declaration of Independence as 100% AI-written), and at best they'll only catch the most vanilla outputs.

On the other hand, working with these things and just being online, it's impossible to say that I don't see the signs everywhere. Vanilla LLMs fixate on certain language patterns, and once you notice them, you see them everywhere. It's not just x; it was truly y. Followed by one supportive point, the second supportive point, and the third supportive point. And so on. Coupled with that vague overview style and not much depth, it's really easy to call out blatant generations as you see them. It's like everyone writes in LinkedIn-infused mania episodes now. It's getting old fast.

So I feel for the people who got slop reviews. I'd be furious. Especially when it's a faux pas to call it out.

I also feel for the reviewers that maybe got caught in this mess for merely "spell checking" their (hopefully) human written reviews.

I don't know how we'll fix it. The only reasonable thing for the moment seems to be drilling into everyone that, at the end of the day, they own their stuff. Be it homework, a PR, or a comment on a blog. Some are obviously more important than others, but still. Don't submit something you can't defend, especially when your education/career/reputation depends on it.

reply
ZeroConcerns
1 hour ago
[-]
The claim "written by AI" is not really substantiated here, and as someone who's repeatedly been accused recently of submitting AI-generated content that was all honestly stuff I wrote myself (hey, what can I say? I just like em-dashes...), I sort of sympathize?

Yes, AI slop is an issue. But throwing more AI at detecting this, and most importantly, not weighing that detection properly, is an even bigger problem.

And, HN-wise, "this seems like AI" seems like a very good inclusion in the "things not to complain about" FAQ. Address the idea, not the form of the message, and if it's obviously slop (or SEO, or self-promotion), just downvote (or ignore) and move on...

reply
stevemk14ebr
54 minutes ago
[-]
Banning calling out AI slop hardly seems like an improvement
reply
ZeroConcerns
38 minutes ago
[-]
What I'm advocating is a "downvote (or ignore) and move on" attitude, as opposed to an "I'm going to post about this" stance. Because, similar to "your color scheme is not a11y-friendly" or "you're posting affiliate links" or "this is effectively a paywall", there is zero chance of a productive conversation sprouting from that.
reply
JohnCClarke
18 minutes ago
[-]
What percentage of the papers were written by AI?

And, if your AI can't write a paper, are you even any good as an AI researcher? :^)

reply
p1esk
14 minutes ago
[-]
Did you mean: “if your AI can’t write a paper that passes an AI detector, are you any good as an AI researcher?”
reply
heresie-dabord
1 hour ago
[-]
AI research is interesting, but AI Slop is the monetising factor.

It's inevitable that faces will be devoured by AI Leopards.

reply
xhkkffbf
1 hour ago
[-]
Shouldn't AIs be able to participate in deciding their future?

If they had a conference on, say, the Americans, wouldn't it be fair for Americans to have a seat at the table?

reply
subscribed
11 minutes ago
[-]
I hope it's tongue-in-cheek.
reply