Welcome to the Strip Mining Era of OSS Security
58 points | 2 hours ago | 14 comments | metabase.com | HN
ahlCVA
45 minutes ago
Whenever one of these vulnerability apocalypse posts comes along I cannot help but think of the Litany of Gendlin:

  What is true is already so.
  Owning up to it doesn't make it worse.
  Not being open about it doesn't make it go away.
  And because it's true, it is what is there to be interacted with.
  Anything untrue isn't there to be lived.
  People can stand what is true,
  for they are already enduring it.
I cannot wrap my mind around why people think finding vulnerabilities is bad. The code already was broken before somebody published the vulnerability. The difference now only is that you know about this.

Imagine somebody finding a flaw in a mathematical proof and everybody being sad because a beautiful proof got invalidated rather than being glad future work won't build on flawed assumptions.

I get that the rate of vulnerability discovery can be a burden, especially for people doing FOSS in their spare time. But that sustainability problem has always existed; the flood of vulnerability reports only exacerbates it and isn't the root cause you need to make go away.

bell-cot
28 minutes ago
> I cannot wrap my mind around why people think finding vulnerabilities is bad. The code already was broken before somebody published the vulnerability. The difference now only is that you know about this.

Try binge-watching old Star Trek episodes, to see how Spock deals with the illogical 99.9% of humanity?

_alternator_
1 hour ago
The article focuses on OSS, but closed-source software is at major risk too. Perhaps more.

It's gotten much easier to reverse engineer binaries in general, and security patches in particular. Basically, an LLM can turn binaries into 'readable' code, and then reason about said code.

salsakran
1 hour ago
Perhaps -- but I think for most people, the vast majority of proprietary software they consume is over the network.

But yeah, if you're distributing binaries publicly, then you're going to have very similar problems.

redanddead
1 hour ago
That happens a lot though. Even OpenAI is attempting to lock functionality (like computer-use, 2 weeks ago) behind a binary: Mac only, they said, no EU. I saw a guy crack it the same day and port it to Windows. There are many, many things like Rive that use binaries; obfuscation and uglification have been the name of the proprietary game for a long ass time, with the only protection being an assumption that "nobody would go through that trouble". Yeah, an LLM would ralph loop through it all day long and make what you paid good money for pretty much free for anyone to use whenever they feel like it. We're back to the "you wouldn't download a car, would you?" argument.
twism
1 hour ago
Does it even need to turn it into readable code?
_alternator_
58 seconds ago
My understanding is that decompilation into more readable code is an important step in building the path to an exploit.

This understanding may be incomplete or outdated (things are moving very fast right now). I'd love to hear from someone with more experience using LLMs to do binary analysis about the level of 'binary annotation' needed for LLMs relative to humans.

edrobap
1 hour ago
I did a fair bit of reverse engineering of jar files in the pre-LLM era, for various reasons. The biggest problem with decompiled Java files was naming. The original variable names, class names, etc. were not retained, and the decompiler would use some alphanumeric series instead. That made reading the code very hard. Curious how the current LLMs are able to address this. Maybe they're able to figure out how a class, variable, etc. is used and name it accordingly. (All this is assuming the original code itself was readable, because there are enough bad programmers.)
roenxi
56 minutes ago
I expect Java would be easy mode for the AI; they already do quite well reconstructing C++ from Ghidra output, in my experience from when I wanted to know what damage formula some game was using.

As a reminder, your account has been shadow-banned; it looks like you got a little unlucky in 2016.

mtlynch
23 minutes ago
> Most are not serious, and we’ve quietly fixed them, thanked the researcher, and went our merry way... These come from a wide variety of locations and people, and sometimes, but not always, are looking for bug bounties.

I take it that Metabase is both not paying bug bounties and not using these tools internally?

If that's the case, Metabase is not going to get meaningful investment from researchers who want to fix issues, but they'll get increased attention from malicious attackers who have no qualms exploiting the vulnerabilities for profit.

LLMs have made it a lot easier for people to find vulnerabilities in software. Open-source makes it easier, but we already have non-AI tooling (IDA Pro, Ghidra) that's good at binary reverse engineering, and LLMs can use that output to find vulnerabilities as well.

This year, as I select products to use for sensitive data, I've been paying a lot more attention to whether they offer bug bounties and for how much. For example, I like Kagi for search and thought about trying Orion, their web browser. Then, I saw that Kagi's been paying $100 for UXSS vulnerabilities.[0] For comparison, Firefox pays $8-10k,[1] and Chrome pays up to $10k for the same class of bug.[2]

[0] https://help.kagi.com/kagi/privacy/bug-bounty-program.html

[1] https://www.mozilla.org/en-US/security/client-bug-bounty/

[2] https://bughunters.google.com/about/rules/chrome-friends/chr...

devinabox
31 minutes ago
This is something I struggle with as someone building a tool for debugging and security.

I have dog-fooded it heavily on my own projects, client projects and friends' projects. It finds things that are really quite clever and not obvious. It really helps me.

But when I try to do the obvious thing for sales, using an OSS project to get hype, show off, etc., I find it becomes really hard to know that I am helping and not just spamming.

To be clear: I think for an AI tool like mine to actually give you clever results that find non-obvious issues and security flaws, it needs to have some level of false positives.

I find myself struggling to justify the approach of firing off defects to an OSS maintainer without verifying them, which takes considerable time if I am going to do a good job. Even with tools to help pull apart the code, the core problem is always that you don't know what you don't know.

Working through the same process on my own projects, I can eat through a ton of defects and find some really great stuff. But that's only possible because I can tell at a glance what is real, what is fake, and also what is an oh ** issue.

So I think this is true, but the risk is that people who don't understand the projects just point scanners at OSS blindly and ruin the good work maintainers are doing.

This stuff is more complicated than people give it credit for, and it's so easy to kid yourself into thinking any bug report is helpful.

parliament32
7 minutes ago
So you slop-coded a tool, you're slop-generating reports, you know it has hallucinations ("false positives")... and you're complaining it's too much work to even verify the output?

And you're surprised OSS projects are pivoting towards "open source does not mean open contributions"?

marginalx
2 hours ago
Clearly, for commercially oriented open source software, security through obscurity is one way to keep pace in the short term. It's not an option for proper open source software. Will people who use easily detectable open source software also start to shy away from it for fear of zero-days?

One of the benefits of open source has been that there are more eyeballs on the source, leading to more secure, better-quality code. I think given enough time the bug reports will plateau; once the tsunami is over, hopefully things will settle at a more manageable cadence.

salsakran
1 hour ago
I'm not sure that the benefit of many eyes helps here. So much of this bulk scanning is low-effort, and if you're a smart person developing closed source software you get the benefits of bulk scanning, but _at the time of your choosing_.

OSS has always had tradeoffs and I sadly think this one is going straight to the "Cons" column. We still think the Pros outweigh the Cons, but this is NotGreat.

dynawicki
1 hour ago
This benefit you speak of is actually just a meme.

Source that is unmaintained is dead. Nobody is looking at it, even the maintainer has something better to do.

Do you know what's even more powerful than "eyeballs"? Money.

Joel_Mckay
1 hour ago
Let's be honest, LLMs with fuzzers are going to pound any LLVM-generated binary right in the hubris.

Won't matter if it is closed source, signed, and/or obfuscated. =3

aetherspawn
1 hour ago
Say I had $1000, how do I get the best value for money to discover vulnerabilities? Are there any worthwhile LLM powered services that are turnkey and ready to go?
ben_w
1 hour ago
From what I've heard, every LLM before Mythos (which you can't get, they'll call you if you're big enough) will have far too many false positives to be helpful, so I guess the best option would be to use an agent to help you (not lights-off vibe coding!*) take advantage of all the older tools like valgrind and closing all the compiler warnings?

* I presume I'm not the only one to find the agents tasked with adding unit tests will sometimes try to sneak through "open source code and apply regex to confirm presence or absence of specific string literal".

They can speed you up significantly, but you absolutely do need to pay attention to what they produce.
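
To make the footnote about source-grepping "tests" concrete, here is a minimal sketch of the difference between that kind of check and one that exercises behaviour. Everything in it (`clamp`, `broken_clamp`, `BROKEN_SOURCE`, both check functions) is made up for illustration and not taken from any real project or agent output.

```python
import re
from typing import Callable


def clamp(value: int, low: int, high: int) -> int:
    """Hypothetical function under test: restrict value to [low, high]."""
    return max(low, min(value, high))


def broken_clamp(value: int, low: int, high: int) -> int:
    """Hypothetical buggy variant: forgets the upper bound."""
    return max(low, value)


# Source text of the buggy variant, as a grep-style "test" would see it.
BROKEN_SOURCE = "def clamp(value, low, high):\n    return max(low, value)\n"


def source_grep_check(source_text: str) -> bool:
    # The pattern to watch for: this only confirms that the literal "max("
    # appears somewhere in the file. It never runs the code.
    return re.search(r"max\(", source_text) is not None


def behaviour_check(fn: Callable[[int, int, int], int]) -> bool:
    # A real check: call the function and compare results.
    return fn(5, 0, 3) == 3 and fn(-1, 0, 3) == 0 and fn(2, 0, 3) == 2
```

The grep-style check accepts the buggy source because the string it looks for is still present, while the behavioural check rejects `broken_clamp` on the first case. That gap is why the generated tests need a human read.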

salsakran
1 hour ago
With all respect to the Anthropic folks, that's just marketing. (If they're reading this: let us into the program so I can be proven wrong here.)

I'm sure what they have is awesome, but it's clear that there are people out there with some decent prompts that are getting results out of widely available models as well.

The big thing we're sharing is: bulk scanning by random people in random geographies got a _lot_ better around January, it's widely distributed, and it's going to get a lot better regardless of whether that specific version of Mythos becomes widely available or not.

embedding-shape
1 hour ago
> prompts that are getting results out of widely available models as well.

Absolutely, and the "false-positive" issue people keep citing as why Mythos is so good is easily solved in the harness. The simplest solution is starting a fresh context with another prompt to evaluate whether it's a false positive or not; just adding that drastically cuts down the rate.
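
A rough sketch of that second-pass idea, under stated assumptions: `ask_model` is a hypothetical callable (not any real vendor API) that sends one prompt in a brand-new context and returns the reply text, and `Finding`, `verify_finding` and `triage` are names invented here. The prompt wording and the REAL / FALSE_POSITIVE convention are also just illustrative choices.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical model interface: one prompt in, reply text out.
# Each call is assumed to start from an empty context (no shared history).
AskModel = Callable[[str], str]


@dataclass
class Finding:
    file: str
    description: str


def verify_finding(finding: Finding, source: str, ask_model: AskModel) -> bool:
    """Second pass: ask a fresh context whether the reported issue is real."""
    prompt = (
        "You are reviewing a reported vulnerability. Reply with exactly "
        "REAL or FALSE_POSITIVE on the first line.\n\n"
        f"File: {finding.file}\n"
        f"Report: {finding.description}\n\n"
        f"Source:\n{source}\n"
    )
    lines = ask_model(prompt).strip().splitlines()
    first_line = lines[0].strip().upper() if lines else ""
    # Anything other than an explicit REAL verdict is dropped.
    return first_line == "REAL"


def triage(
    findings: List[Finding], sources: Dict[str, str], ask_model: AskModel
) -> List[Finding]:
    """Keep only the findings that survive the independent second pass."""
    return [
        f for f in findings
        if verify_finding(f, sources.get(f.file, ""), ask_model)
    ]
```

Treating an empty or malformed reply as "not real" is a design choice that favours fewer bogus reports over recall; a harness that cares more about missing nothing would queue those for a human instead.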

bluGill
1 hour ago
That is false. A year ago every LLM generated report was slop - more likely a false positive than correct. However in the past few months nearly every LLM generated report is real.
embedding-shape
1 hour ago
Not sure about turnkey solutions for finding vulnerabilities that don't involve having to hand over a bunch of identity proofs for them to store on their insecure infra and also enrolling in programs.

Besides that, there's renting a beefy GPU instance at Vast.ai or similar places and running your own uncensored models on it. I've had great success with AEON-7/Qwen3.6-27B-AEON-Ultimate-Uncensored-NVFP4, which is smart and uncensored, but there are lots of options; probably some are already tailored for security research.

hrjriritifif
1 hour ago
I do not think the author understands how open source works. You have a problem on your computer, in __your__ software, and somehow some random dude is responsible for fixing it? Sure, if you gimme a few kilo USDs I will drop everything and come to rescue you. But for free it is a volunteer gig I do once a month....
salsakran
57 minutes ago
I dunno man ... I produced a few things that got a few github stars over the years.

At the risk of repeating myself -- this is targeted at other OSS maintainers, not random people who might have done a git pull of some random project a couple years ago.

Macha
42 minutes ago
> Did you have other plans for the weekend? Or a long term project you’re prioritizing? That’s nice, you have a new plan — fix every vulnerability that comes in NOW.

Or you know, provide the security companies and businesses using your software for free with all the fix timelines and out of hours support they’ve paid for (none).

adamtaylor_13
1 hour ago
> Did you have other plans for the weekend? Or a long term project you’re prioritizing? That’s nice, you have a new plan — fix every vulnerability that comes in NOW.

Umm... no? It's called OPEN source. Expecting people to cancel their plans to make your free software more secure is pretty audacious. Luckily, many WILL, but the expectation is just foolish.

salsakran
1 hour ago
That line was aimed at other OSS maintainers.

These alerts are absolutely not being shared publicly before we have a fix for them.

le-mark
1 hour ago
So what does this mean for the open source ecosystem? Unmaintained or “finished” projects will be labeled as too unsafe to use?
salsakran
59 minutes ago
If you're using unmaintained OSS projects in this day and age, I'm sorry to say you might deserve what happens next.
gmuslera
1 hour ago
The problem on the side of closed source software is that if there have been leaks of source code, the vulnerabilities and exploits may remain unknown for a long time.
pixl97
1 hour ago
I would go so far as to say that most closed source software code gets leaked. Most companies hold that info close and don't disclose it, even if legally required, unless it's made public.
salsakran
1 hour ago
Side conversation -- This is all stuff we're seeing in white/grey hat land. What's going on in blackhat land?
bluGill
1 hour ago
Nobody really knows of course. However it is safe to assume they are not so stupid as to ignore what is happening in the other areas (at least some of them), and so they are running their own targeted scans and then trying to figure out how to make money (or whatever their goal is) by exploiting them. They are also using LLMs to try things on closed source that are more than a brute force attack, though I have no idea what those would be.
dynawicki
1 hour ago
Good luck getting anyone who values their time to even triage the results. I would rather lick the bottom of a NYC dumpster that a rat had just died in.
salsakran
1 hour ago
That was true last year -- things change.

Ignore (admittedly low-effort LLM generated) reports at your own peril.

dynawicki
1 hour ago
Software will eventually become "unmaintainable due to lack of interest", because of this very thing. People not invested in this are not "in peril" in any way.
bluGill
1 hour ago
A lot of people are invested without realizing it. I'm typing this on a computer running Linux, with all the standard services/software. I maintain one OSS project (icecc; we have always said to only run it on trusted networks. I'm sure there are a lot of issues in our code, but nobody has bothered to run a scan yet to my knowledge), but I don't pay attention to everything. I'm sure there are known, easy-to-exploit (with an LLM) issues on this computer just because my distro hasn't updated yet. (I need a better distro, but even the most up-to-date one will constantly have these issues.)
dynawicki
1 hour ago
What you just described may be accurate. But it also is the essence of a "trap". My comment about investment was more to that point.

If software "is a trap", even my ever-computing-loving self, who wrote my first programs on an Apple II in the 80s, will only be, as you sort of describe, invested by reference (minimal usage).

But no-one will sign up for a "trap" as a career, and only those who do will deal with its problems. The first thing that comes to mind is "Johns", "Hotels", and the trappings of the sex trade.

as3qkaH
1 hour ago
Apparently the AI company Metabase has a very poor code base. Like so many others, instead of questioning their own (or AI) output, they help their AI overlords by promoting security scans.

Fact is that Mythos found only one issue in curl and nothing at all in most code bases. It is getting quiet around Mythos, and the AI companies will move on to the next scam.

bluGill
1 hour ago
Mythos found only one issue in curl - but it didn't start until many other LLMs had been run and found a lot of issues that were fixed. If Mythos was run a year ago it would have found over 100 issues (of course it didn't exist a year ago, nor did the other tools).
4ladf1
1 hour ago
Curl had many old protocols and code from the 1990s that no one used. Besides, Mythos was claimed to be better than existing tools.

In most open source projects, Mythos or similar tools have found nothing. The AI people only contact the projects where they find something, because it would be bad for marketing otherwise.

dynawicki
58 minutes ago
This is now the open source problem. And why my personal opus of work has been removed from online repositories.

Who gave them "the right to scan"? You did by hosting your open source in public. But scanning a public service prior to AI was still covered by "Unauthorized System Access".

But what if they are wrong, and given the self-serving nature of these scans, now your repo is just OJ Simpson? And your software is banned due to an external scan you did not ask for?

Is there no one in this world who will be accountable for anything at all? Can we sue the scanners for defamation if they are wrong and publish their results, even in a public PR?

These things will happen. If I had source in the open and a scan nobody asked for produced incorrect results with false positives, I would sue Anthropic for defamation and I would win.
