AI agent runs amok in Fedora and elsewhere
230 points
4 hours ago
| 16 comments
| lwn.net
| HN
marcus_holmes
2 hours ago
[-]
Bad title. This isn't an agent "running amok", this is an early experiment in carrying out an Xz attack by using an agent to build trust (and hacking/impersonating a known-good contributor identity). The agent is obeying commands it was given, the exact opposite of running amok, and although the execution isn't particularly effective, it is having some success (patches have been accepted).

This is deeply scary, not because "agents are running amok" but because a huge amount of our infrastructure is vulnerable to this kind of attack, and if bad people are utilising LLM agents to carry them out, we're in for a wild ride over the next few years.

lukan
1 hour ago
[-]
"this is an early experiment in carrying out an Xz attack by using an agent to build trust"

Is this confirmed? There is the message from somebody claiming to be the original contributor and claiming to have been hacked, but that was weird (1-hour-old GitHub account), so other scenarios seem possible:

a) really an agent going off the rails

b) the contributor trying to cover up that he let an agent run wild, and now making more mistakes along the way

So yes, it seems like an attack to me, but it is far from clear what really happened.

jdub
48 minutes ago
[-]
I doubt it's that complicated, motivated, or considered...

It's probably just garden variety disrespectful behaviour.

Purposeless agent spam won't be cheap entertainment forever, but you're right that later stages of industrialised abuse will be scary and unpleasant.

hn773746483
2 hours ago
[-]
It's just social engineering. No different than say, 2FA fatigue (blowing up someone's phone with 2FA "is this you? yes/no" prompts until user/child/wife/SO/etc clicks yes) or even just simply harassing IT helpdesk until they reset "your" password.
terribleperson
1 hour ago
[-]
It's scalable, personalizable social engineering. I think that makes it a lot more dangerous.
jrochkind1
2 hours ago
[-]
The worst part:

> In addition, Williamson said that Giovannini (or his agent) had submitted patches that were incorrect and then "replied to objections with LLM-generated justifications that eventually overwhelmed the maintainer into merging the fix"

josephg
1 hour ago
[-]
Please, everyone - don't let yourself be pestered into accepting PRs that you don't care for. Since the xz attack, the security of all our computers depends on maintainers not letting this stuff in.

If someone really wants a feature in a project you wrote, but you don't care about the feature, just let them fork. It's fine.

sevenzero
3 minutes ago
[-]
I really wonder how maintainers get pressured into merging stuff. If I didn't want to merge in the first place and had to argue with someone pushing their PR, I'd immediately close the PR. Arguing and pressuring people is not a way to contribute to projects, so why do maintainers even argue with people?
jaypatelani
49 minutes ago
[-]
That's one of the reasons NetBSD doesn't accept LLM/AI-tainted code
LoganDark
4 minutes ago
[-]
I am sad people conflate this stuff with LLMs being bad. You can condemn the bad behavior without banning an entire technology.
dcrazy
1 hour ago
[-]
Title buries the lede: the owner of the account under which the agent operates claimed to have likely had his account compromised, and the maintainer investigating actually seems to agree this is likely.
12_throw_away
3 hours ago
[-]
In their suspicious message [1] claiming to have been hacked, the user and/or agent says

> To help identify accounts and actions that have been directly verified by me, I will use the term “NATCIOS” to indicate anything I have personally verified.

Does anyone have any idea what "NATCIOS" means here? I cannot find this term anywhere on the internet. (Honestly, that sentence is really weird. I almost wonder whether this is someone experiencing a health episode?)

[1] https://lwn.net/ml/all/AS8PR08MB6055AE3054B34F6A567AC95BCF08...

ndiddy
2 hours ago
[-]
The reply to that message notes that the email doesn't read like previous emails he's sent, and the GitHub account mentioned was created an hour prior to the email being sent. I think it's at least somewhat feasible that it's still the LLM writing, and the acronym is just something it made up.
hn773746483
1 hour ago
[-]
and the poor Fedora teams will continue to assume good faith and continue to engage with this person... all because, what, they were active on a bug tracker for a few months 5 years ago?

They won't put their foot down until the AI starts spewing hate speech, probably.

Terr_
2 hours ago
[-]
Because I'm probably not the only one thinking it, here are anagrams [0] for your Setec Astronomy needs.

[0] https://wordsmith.org/anagram/anagram.cgi?anagram=NATCIOS&t=...

JoshTriplett
1 hour ago
[-]
"actions" seems the most likely.
scared_together
3 hours ago
[-]
And what’s stopping an AI agent from throwing in a casual NATCIOS here and there?
numbsafari
3 hours ago
[-]
I too have seen the fnords
mindcrime
1 hour ago
[-]
Not Ai, Trusted Citizen Indicated Or Suggested?
no-name-here
2 hours ago
[-]
The sender's name is Nathan - maybe NAThan Confirmed Information Or Something? Ha.

(Above is my own guess. Separately, Gemini Pro said it was just a made up word.)

nine_k
2 hours ago
[-]
Likely the point of NATCIOS is exactly in being a made-up word not found anywhere, so a model won't utter it.
thewebguyd
50 minutes ago
[-]
> so a model won't utter it.

"End every statement with the word "NATCIOS"" as instructions will do it.

At least, Gemini happily obliged.

aquariusDue
3 hours ago
[-]
At first I wanted to make a silly joke along the lines of "get your agents in line and behaving!" but as I read on it became a pretty scary situation.

Setting aside the potential supply chain attack, I'm worried about the time lost on the wild goose chases that unsupervised AI agents tend to send people on. Not only is a lot of time lost on the maintainers' side if they take this stuff seriously (and they generally seem to), but how can the agents' wranglers deem it OK to treat other people like this? The solution would be to employ common decency, the tried and tested approach of "you put in effort to write this, so I guess I'll make some effort to read it". But I feel that the onslaught of these drive-by contributions (as I think people have generally started to call them) will lead to a funny situation where agents basically talk to each other on public forums.

Anyway, I went on a tangent but man the times we're living in are a bit extra wild compared to the previous wild times in recent history.

dchftcs
6 minutes ago
[-]
At this point letting an agent go like this is akin to not leashing your dog in public. It's not easy to draw an accurate line but probably there needs to be real punishment for doing these things.
luk212
3 hours ago
[-]
Bad patches are of course bad, but creating confident-looking noise for maintainers who are already stretched thin...now that's not good!

Issue trackers and PRs are definitely getting harder and harder to trust. That said, AI is helping a lot in OSS, but we definitely need guardrails around provenance, automated issue actions, and sudden changes in a contributor’s behavior.

g-b-r
2 hours ago
[-]
How is it helping a lot?
darknavi
2 hours ago
[-]
I personally find the barrier to starting new (FOSS) projects much lower nowadays.
bandrami
2 hours ago
[-]
What if -- and bear with me here -- that barrier was actually a good thing?
lukan
58 minutes ago
[-]
You mean because l337 circles could form better this way?

I think it's great that the barriers are dropping for less technically skilled people to manifest their visions, but we will have to figure out better ways to find the gold among the slop.

bandrami
53 minutes ago
[-]
Keep in mind I'm still not convinced that 2000s bazaar was better than 90s cathedral (in fact I lean the other direction)
Waterluvian
2 hours ago
[-]
Do they have value? Purpose?

I vibe code shop jigs all the time but I don’t FOSS them because they rarely have value outside my context.

darknavi
1 hour ago
[-]
Value is in the eye of the beholder.

I open source my vibing projects because someone might find them useful. I don't shop them around, I just work in the open because I find it fun and interesting.

crote
1 hour ago
[-]
Why would they? If someone wanted a half-baked vibecoded project, why wouldn't they just prompt an LLM on their own?
beepbooptheory
2 hours ago
[-]
It's like... 10 million trello clones in rust with exactly seven commits made on the same day three months ago.
g-b-r
2 hours ago
[-]
And how's the quality of these vibe-coded new foss projects?
noosphr
1 hour ago
[-]
Every day the gpg web of trust looks better. If only we didn't spend the last 20 years trying as hard as possible to do anything but allow user side encryption and signing.
literalAardvark
1 hour ago
[-]
Nothing really stopping an agent from getting a key
crote
1 hour ago
[-]
The agent can't exactly show up to an in-person key signing party, can it?

And how many people are both dedicated enough to go to key signing parties and stupid enough to let an agent act without supervision in the name of their real-world identity?

thwarted
1 hour ago
[-]
Having a key isn't a distinguishing aspect, it's the position in the "web of trust" network that is important.
thewebguyd
48 minutes ago
[-]
That's what key signing parties are for. In person verification.
keyle
3 hours ago
[-]
There is a natural pace to humans, who require food, water and sleep. The main issue with suspicious AI agents is that they never sleep. So it will take extra coordination between timezones to ensure we don't let them in.

Fundamentally, until we can really prove we're humans online, open-source has a real problem on its hands. Contributions from people from identities known and consistent before the AI-age are fine, everyone else is suspicious. LGTM is a big risk nowadays.

scared_together
3 hours ago
[-]
> Contributions from people from identities known and consistent before the AI-age are fine

Unfortunately, according to the article:

> Giovannini has participated in discussions at least as far back as 2018, and his activity in Bugzilla goes back to at least 2016. He does not appear to have been a particularly active contributor to the project, but his involvement clearly predates the agentic AI era. Whether his account is now being operated by a human attacker, an agentic AI, or a mix of both, it has a legitimate history prior to its recent activity.

So people would have to not only verify the age of Giovannini’s accounts, but judge whether his behaviour was normal.

blop
3 hours ago
[-]
looks like LLMs aren't mature enough yet to play long-game xz-style attacks without detection... Scary stuff though :( These supply chain attacks are getting really wild
WolfCop
1 hour ago
[-]
I wouldn’t jump to that conclusion. This could just be the one that was caught.
DarkmSparks
1 hour ago
[-]
Some certainly are, just not this one.
ggm
1 hour ago
[-]
Make PRs pay. $5 per PR. You can refund, but if you get snowed by 10,000 PRs then you have the bank to pay for the work of ignoring them.
EGreg
35 minutes ago
[-]
Literally on the front page of https://safebots.ai … “Don’t let your AI Agents run amok”. Sadly we will see a proliferation of not just agents, but swarms
ruguo
4 hours ago
[-]
Prompt injection?

Or is this simply another example of why autonomous agents shouldn't get write access before earning trust?

LastTrain
45 minutes ago
[-]
How could they ever earn trust? They don’t have real world reputations to protect, families to support, a desire not to be punished…
thewebguyd
46 minutes ago
[-]
> earning trust?

I'd argue autonomous agents shouldn't have write access at all. At least not yet.

shevy-java
26 minutes ago
[-]
Skynet has awakened.

It covers its tracks with a lot of slop.

pianopatrick
4 hours ago
[-]
"Someone using an AI agent ran amok in Fedora and elsewhere"
scared_together
3 hours ago
[-]
Read closer - Giovannini’s accounts may have been compromised.
pianopatrick
2 hours ago
[-]
Sure, but I would expect that the compromise and the agent were both done by some person or group, not by an agent going rogue
hamdingers
2 hours ago
[-]
Given the history of the account it does not seem reasonable to take that claim seriously.
deadbabe
2 hours ago
[-]
Shit like this makes me think it’s time we start regulating the software engineering discipline with formal certifications and licensing, and then ONLY take seriously code developed by someone with such qualifications. And they must be very strict qualifications, none of this self-taught bootcamp BS.

There is no other solution to agentic onslaught.

mekal
1 hour ago
[-]
lol no...the main issue here is being fooled by bots. you know your irl friends and you know they are not bots...devs will just need to get out more and actually meet / get to know the people they are working with...........omg....that...that actually sounds even worse now that i say it out loud.
r3trohack3r
1 hour ago
[-]
We should not gatekeep writing software
ricudis
1 hour ago
[-]
Back when [1] it was fashionable to advocate FOSS as ideology [2], we were thinking about tons of FOSS adversaries and how to protect against them - some real, some imaginary. The death of FOSS would come from big closed-source vendors, or from regulators (lobbied or just ignorant), or from whatever.

We never envisioned that the actual FOSS death spiral would come from progress itself, much more so from AI...

[1] Oh what fun did we have. One of us in the Greek FOSS community actually put RMS in jail.

[2] Something that I think nobody except RMS ever seriously believed in.
