Malpractice/I can't believe they're just rolling forward
There are no "leaked" keys if Google hasn't been calling them a secret.
They should ideally prevent all keys created before Gemini from accessing Gemini. It would be funny (though not surprising) if their leaked-key "discovery" has false positives and starts blocking keys from Gemini.
This is going to break so many applications. No wonder they don't want to admit this is a problem. This is, like, a whole-number-percentage-of-Gemini-traffic level of fuck-up.
Jesus, and the keys leak cached context and Gemini uploads. This might be the worst security vulnerability Google has ever pushed to prod.
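For anyone who wants to check their own project: a minimal probe sketch. It assumes the public Generative Language endpoint; the key is a placeholder, and the actual request is left commented out so you can run it deliberately.

```shell
# Hedged sketch: probe whether an existing key in your project has quietly
# gained Gemini access. API_KEY is a placeholder; substitute a real key.
API_KEY="AIzaSy_PLACEHOLDER_replace_with_yours"
PROBE_URL="https://generativelanguage.googleapis.com/v1beta/models?key=${API_KEY}"

# An HTTP 200 with a model list means the key can reach the Gemini API;
# a 403 means it cannot. Uncomment to actually probe:
# curl -s -o /dev/null -w '%{http_code}\n' "$PROBE_URL"
echo "$PROBE_URL"
```

If the probe returns 200 for a key you thought was Maps-only, that key is exactly the kind of exposure the thread is describing.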
The problem here is that people create an API key for use X, then enable Gemini on the same project to do something else, not realizing that the old key now allows access to Gemini as well.
Takeaway: GCP projects are free and provide strong security boundaries, so use them liberally and never reuse them for anything public-facing.
When Gemini came around, rather than that service being disabled by default for those keys, Gemini was enabled, allowing exploiters to easily use those keys (e.g. a "public" key stored in an APK file).
The problem described here is that developer X creates an API key intended for Maps or something, developer Y turns on Gemini, and now X's key can access Gemini without either X or Y realizing that this is the case.
The solution is to not reuse GCP projects for multiple purposes, especially in prod.
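A complementary mitigation is to restrict each key to the APIs it was created for, so enabling a new service on the project can't widen an old key's scope. A hedged `gcloud` sketch (the `KEY_ID` and the Maps service name are placeholders; check your SDK's `gcloud services api-keys` docs for the exact resource names):

```
# Hedged sketch: pin an existing key to the one API it was created for, so
# enabling Gemini on the same project later does not widen its scope.
# KEY_ID is a placeholder; list real IDs with: gcloud services api-keys list
gcloud services api-keys update KEY_ID \
  --api-target=service=maps-backend.googleapis.com   # only Maps, nothing else
```

A key restricted this way returns a permission error against the Generative Language API even after Gemini is enabled on the project.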
That said, I’d actually argue there’s an evolutionary explanation here: at a certain size, and more importantly complexity, an oversight like this becomes more likely, not less.
It would be more interesting if they scanned GitHub code instead. The number terrified me, though I'm not sure how many of those keys are live.
When you enable the Gemini API (Generative Language API) on a Google Cloud project, existing API keys in that project (including the ones sitting in public JavaScript on your website) can silently gain access to sensitive Gemini endpoints. No warning. No confirmation dialog. No email notification.
Specifically, the last bit, “No warning. No confirmation dialog. No email notification.”, immediately smells like LLM-generated text to me. Punchy repetition in a set of 3. If you scroll through TikTok or Instagram you can see the exact same pattern in a lot of LLM-generated descriptions.
It’s not uncommon, as basic writing advice, to use sets of three for emphasis. That isn’t a signifier of LLM generation, in my opinion.
“The rule of three is a writing principle which suggests that a trio of entities such as events or characters is more satisfying, effective, or humorous than other numbers, hence also more memorable, because it combines both brevity and rhythm with the smallest amount of information needed to create a pattern.”
It’s how I was taught to write, but I understand that my personal experience can’t be generalized to make sweeping statements.
Do you have data that suggests it’s uncommon in human-authored blog posts and more common in LLM-generated text?
I use groupings of 3 and try to make things punchy myself sometimes, especially when I'm writing something intended to sway others. I think the problem with this article is the way it feels like the perfect average of corporate writing. It's sort of like the "written by committee" feel that incredibly generic pop music often has.
When I write things, I often go back and edit and reword parts. Like the brushstrokes in an oil painting, the flow of thought varies between paragraphs and even sentences. LLMs only generate things from left to right (or vice versa in RTL languages, I presume). I think that gives LLM generated text a "smooth" texture that really stands out to anyone who reads a lot.
HN Note: Yes the rule of threes is broader than just this particular pattern here, but in my opinion this common writing and communication pattern is a specific example of the rule of threes.
Punchy repetition in a set of 3. Yes. LLMs are able to capably mimic the common patterns that how-to-write books have suggested for the last 100 years as ways to make your writing more "impactful" and attention-grabbing. So are humans. They learned it from watching us.
I am a little worked up about this, as I have felt insulted a couple of times at having something I've written accused of being by an LLM. In one case it was because I had written something from the viewpoint of a depressed and tired character, and someone thought it had to be an LLM because it seemed detached from humanity! Success!
I too would like to be able to reliably detect when something has been written by an LLM so I can discount it out of hand, but frankly many of the attempts I see people make to detect these things seem poorly reasoned and actively detrimental.
People have learned in classes and from reading how to improve their writing. LLMs have learned from ingesting our output. If something matches a common writing 101 tip it is just as likely to be reasonably competent as it is to be non-human. The solution to escape being labelled an LLM is not to become less competent as a writer.
I have been overly verbose here, as I am somewhat worked up and angry and it is too late in the morning to go back to sleep but really too early to be awake. I know verbosity is also a symptom of being an LLM, but not giving a damn is a symptom of humanity.
>LLMs are able to capably mimic the common patterns that how to write books have suggested for the last 100 years as ways to make your writing more "impactful" and attention-grabbing. So are humans. They learned it from watching us.
Don't forget that LLMs (at least the "instruct" versions) undergo substantial post-training to align them with the authors' objectives, so they are not a 100% pure reflection of the distribution seen on the internet. For example, it's common for LLMs to respond with "You're absolutely right!" to every second message, which isn't what humans usually do. It's a result of some kind of RLHF: human labelers liked to hear that they're right, so they preferred answers containing such phrases, and those responses became amplified. People recognize LLM-generated writing because LLMs' pattern distribution is different from the actual pattern distribution found in articles written by humans.
No, I'm not being sarcastic. People have given up the em-dash, which is a legitimate punctuation mark used in proper writing. And it's all downhill from there.
Someone is complaining that
> it's all just crisp and clean structured and actionable in a way that a meandering human would not distill it down to.
but this is a security report... people intentionally write such things carefully and crisply, with multiple edits and reviews.
> What You Should Do Right Now
> Bonus: Scan with TruffleHog.
> TruffleHog will verify whether discovered keys are live and have Gemini access, so you'll know exactly which keys are exposed and active, not just which ones match a regular expression.
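The pattern-matching half of that scan is easy to sketch yourself: every Google API key starts with `AIza` followed by 35 key characters. A minimal demo below plants a fake key in a scratch directory and greps it back out; the TruffleHog step (which actually verifies liveness) is left as a comment, assuming TruffleHog v3 is installed.

```shell
# Quick client-side sweep for Google API keys (they all start with "AIza").
# This only finds pattern matches; TruffleHog can additionally verify which
# hits are live, e.g.:  trufflehog filesystem . --only-verified
KEY_REGEX='AIza[0-9A-Za-z_-]{35}'

# Demo: plant a fake key in a scratch dir, then show the scan finding it.
scan_dir=$(mktemp -d)
printf 'const key = "AIzaSyA1234567890abcdefghijklmnopqrstuv";\n' > "$scan_dir/app.js"

grep -rEoh "$KEY_REGEX" "$scan_dir" | sort -u
```

Point the same `grep` at your repo or web root; anything it prints is at minimum worth rotating, and worth feeding to TruffleHog to see whether it is still live.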
I don't know exactly, but I'm sure. The cadence, the clarity, the bolding, the italics: it's all just crisp, clean, structured, and actionable in a way that a meandering human would not distill it down to.
Like what happens to YouTube videos that go through the compression algorithm 20 times.
With the AI feedback loop being so fast and tight for some tasks, the focus moves to delivery rather than learning. There is no incentive, space, or time for learning.