Bucketsquatting is (finally) dead
111 points
3 hours ago
| 11 comments
| onecloudplease.com
| HN
josephg
1 hour ago
[-]
Sometimes I wonder if package names, bucket names, GitHub account names and so on should use a naming scheme like Discord's. E.g., @sometag-xxxx where xxxx is a random 4-digit code. It's sort of a middle ground between UUID account names and completely human-generated names.

This approach goes a long way toward democratizing the namespace, since nobody can "own" the tag prefix (10,000 people can all share it). It can also be used to prevent squatting and reuse attacks: just burn the full account name if the corresponding user account is ever shut down. And it prevents early users from snapping up all the good names.
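A minimal sketch of the scheme described above, assuming an in-memory registry. The class and method names (`TagRegistry`, `register`, `close_account`) are hypothetical and only illustrate random 4-digit discriminators plus burning a name when its account closes.

```python
import secrets


class TagRegistry:
    """Toy in-memory registry for names of the form tag-1234 (hypothetical)."""

    def __init__(self):
        self.active = {}     # full name -> owner id
        self.burned = set()  # full names that may never be issued again

    def register(self, tag, owner):
        # Draw random 4-digit discriminators (up to 10,000 attempts) until
        # one is found that is neither in use nor burned.
        for _ in range(10_000):
            name = f"{tag}-{secrets.randbelow(10_000):04d}"
            if name not in self.active and name not in self.burned:
                self.active[name] = owner
                return name
        raise RuntimeError(f"no free discriminator found for {tag!r}")

    def close_account(self, name):
        # Burn the full name so that nobody can claim it after the account closes.
        del self.active[name]
        self.burned.add(name)
```

Because burned names are never reissued, a later `register("sometag", ...)` call can share the tag but cannot recreate the exact name a closed account used.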

reply
jorams
1 hour ago
[-]
Notably Discord stopped using that format two years ago, moving to globally unique usernames.

Their stated reason[1] for doing so being:

> This lets you have the same username as someone else as long as you have different discriminators or different case letters. However, this also means you have to remember a set of 4-digit numbers and account for case sensitivity to connect with your friends.

[1]: https://support.discord.com/hc/en-us/articles/12620128861463...

reply
thaumasiotes
17 minutes ago
[-]
The stated reason is obviously not able to justify the change; either they have an internal reason they're not willing to admit to, or someone at Discord just went crazy.

Imagine trying to connect with your friends... by telephone.

reply
juliangmp
31 minutes ago
[-]
It was honestly a downgrade. I ended up just putting the 4 digits I had before at the end of my username because, surprise, the name was taken immediately.
reply
ffsm8
13 minutes ago
[-]
I haven't logged in since. I wonder if they'll delete my account eventually - as I essentially don't have a username because of that
reply
rithdmc
1 hour ago
[-]
I like it for buckets, but adding a four digit code won't help with the package hijacking side of things - in fact might just introduce more typo/hijack potential. It'll just be four more characters for people to typo.
reply
fc417fc802
30 minutes ago
[-]
IMO a better general solution is UUIDs and a petname system, at least as far as chat apps are concerned.

For buckets I thought easy to use names was a key feature in most cases. Otherwise why not assign randomly generated single use names? But now that they're adding a namespace that incorporates the account name - an unwieldy numeric ID - I don't understand.

In the case of buckets isn't it better to use your own domain anyway?

reply
donmcronald
1 hour ago
[-]
I just want to be able to use a verified domain; @example.com everywhere.
reply
Cthulhu_
1 hour ago
[-]
That still has "squatting" risks as described in the original article though, domains expire and / or can be taken over.
reply
fc417fc802
22 minutes ago
[-]
But you already have a domain for whatever you're doing so presumably that's going to be a threat either way.

For particularly high-risk activities, if circumstances permit, you can sidestep the entire issue by adding a layer of verification using a preshared public key. As an arbitrary example, on Android, installing an app with the same name but a different signing key won't work. It essentially implements a TOFU (trust on first use) model to verify the developer.
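A rough sketch of the trust-on-first-use idea, assuming keys are available as raw bytes. `KeyPinStore` and `verify` are hypothetical names, and this is not how Android implements signature checks; it only illustrates pinning the first key seen for a name and rejecting any different key afterwards.

```python
import hashlib


class KeyPinStore:
    """Hypothetical trust-on-first-use store: name -> pinned key fingerprint."""

    def __init__(self):
        self.pins = {}

    def verify(self, name: str, public_key: bytes) -> bool:
        fingerprint = hashlib.sha256(public_key).hexdigest()
        # First use: trust the key and pin it. Later uses: compare against the pin.
        pinned = self.pins.setdefault(name, fingerprint)
        return pinned == fingerprint
```

With this in place, someone who re-registers a lapsed name still fails verification unless they also hold the original key.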

reply
vhab
2 hours ago
[-]
> For Azure Blob Storage, storage accounts are scoped with an account name and container name, so this is far less of a concern.

The author probably misunderstood what "account name" is in Azure Storage's context, as it's pretty much the equivalent of S3's bucket name, and is definitely still a large concern.

A single pool of unique names for storage accounts across all customers has been a very large source of frustration, especially with the really short name limit of only 24 characters.

I hope Microsoft follows suit and introduces a unique namespace per customer as well.

reply
iann0036
1 hour ago
[-]
Author here. Thanks for the call out! I've updated the article with attribution.
reply
ryanjshaw
2 hours ago
[-]
I recall being shocked the first time I used Azure and realizing so many resources aren’t namespaced to account level. Bizarre to me this wasn’t a v1 concern.
reply
mwalser
5 minutes ago
[-]
And the naming restrictions and maximum name lengths are all over the place: https://learn.microsoft.com/en-us/azure/azure-resource-manag...

Storage accounts are one of the worst offenders here. I would really like to know what kind of internal shenanigans are going on there that prevent dashes from being used within storage account names.
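For illustration, a small checker for the storage account naming rule discussed here, assuming the documented constraint of 3 to 24 characters drawn from lowercase letters and digits only. The function name is hypothetical.

```python
import re

# Assumed rule: 3-24 characters, lowercase letters and digits only (no dashes).
_STORAGE_ACCOUNT_NAME = re.compile(r"[a-z0-9]{3,24}")


def is_valid_storage_account_name(name: str) -> bool:
    """Return True if `name` satisfies the assumed storage account naming rule."""
    return _STORAGE_ACCOUNT_NAME.fullmatch(name) is not None
```

With no separator character and only 24 characters, there is little room to embed a tenant or subscription identifier, which is part of why the single global pool hurts.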

reply
iknownothow
1 hour ago
[-]
Thank you author Ian Mckay! This is one of those good hygiene conventions that save time by not having to think/worry each time buckets are named. As pointed out in the article, AWS seems to have made this part of their official naming conventions [1].

I'm excited for IaC code libraries like Terraform to incorporate this as their default behavior soon! The default behavior of Terraform and co is already to add a random hash suffix to the end of the bucket name to prevent such errors. This becoming standard practice in itself has saved me days in not having to convince others to use such strategies prior to automation.

[1] https://aws.amazon.com/blogs/aws/introducing-account-regiona...

reply
bulbar
12 minutes ago
[-]
A name shouldn't be the same as the thing it names.

When a name becomes free and somebody else uses it, it points to another thing. What that means for consumers of the name depends on the context, most likely it means not to use it. If you yourself reassign the name you can decide that the new thing will be considered to be identical to the old thing.

reply
ClaudeFixer
13 minutes ago
[-]
Good riddance. The number of production deploys I've seen pointing at bucket names that could've been claimed by anyone was wild. Glad this is finally getting closed off at the platform level instead of relying on everyone to not make the mistake.
reply
calmworm
2 hours ago
[-]
That took a decade to resolve? Surprising, but hindsight is 20/20 I guess.
reply
INTPenis
2 hours ago
[-]
I started treating long random bucketnames as secrets years ago. Ever since I noticed hackers were discovering buckets online with secrets and healthcare info.

This is where IaC shines.

reply
XorNot
2 hours ago
[-]
I just started using hashes for names. The deployment tooling knows the "real" name. The actual deployment hash registers a salt+hash of that name to produce a pseudo-random string name.
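One way this could look, as a sketch only: the helper name, the `salt:name` layout, and the 40-character truncation are assumptions, not the commenter's actual tooling.

```python
import hashlib


def physical_bucket_name(logical_name: str, salt: str, prefix: str = "b") -> str:
    """Derive a pseudo-random bucket name from a human-readable logical name."""
    digest = hashlib.sha256(f"{salt}:{logical_name}".encode("utf-8")).hexdigest()
    # S3 bucket names are limited to 63 characters (lowercase letters, digits,
    # dots, hyphens), so keep a short prefix plus 40 hex characters.
    return f"{prefix}-{digest[:40]}"
```

The deployment tooling keeps the mapping from logical name to derived name; the same salt and logical name always produce the same bucket name, so redeploys stay stable.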
reply
Galanwe
1 hour ago
[-]
This is all well and good on the IaC side, yes. But at the end of the day, buckets are also user-facing resources, and nobody likes random directory / bucket names.
reply
INTPenis
7 minutes ago
[-]
That's a contradiction, a bucket name being treated as a secret in IaC, while being a user facing resource. So no, they're not user facing resources.

If anyone wants them to be user-facing resources, then treat them as such, ensure they're secure, and don't store sensitive info on them. Otherwise, put a service in front of them, and have the user go through it.

The S3 protocol was meant to make the lives of programmers easier, not end users.

reply
amluto
1 hour ago
[-]
It would be nice if the other end of this could be addressed: a configurable policy to limit resolution of bucket names within an account namespace. Ideally, if someone doesn’t have permission to resolve a bucket name, they shouldn’t even be able to detect whether it exists.
reply
alemwjsl
1 hour ago
[-]
I take it advertising your account id isn't a security risk?
reply
Cthulhu_
1 hour ago
[-]
Armchair opinion, but shouldn't be too bad - it's identification, not authentication, just like your e-mail address is.

But probably best to not advertise it too much.

reply
aduwah
1 hour ago
[-]
It is not hygienic, but with only the account-id you are fine. In the IAM rules the attacker can always just use a * on their end, so it does not make a difference. You have to be conscious to set proper rules for your (owner) end tho.
reply
Aardwolf
2 hours ago
[-]
Why all that stuff with namespaces when they could just not allow name reuse?
reply
hrmtst93837
1 minute ago
[-]
If you block name reuse globally, you introduce a new attack surface: permanent denial by squatting on retired names. Companies mess up names all the time from typos, failed rollouts, or legal issues. A one-shot policy locks everyone into their worst error or creates a regulatory mess over who can undo registrations.

Namespaces are annoying but at least let you reorganize or fix mistakes. If you want to prevent squatting, rate limiting creation and deletion or using a quarantine window is more practical. No recovery path just rewards trolls and messes with anyone whose processes aren't perfect.

reply
JoBrad
1 minute ago
[-]
I think a better policy would be to disallow bucket names that follow the account regional namespace convention, but don’t match the account id indicated in the name.
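A sketch of such a check. The name format is an assumption here (`<prefix>-<12-digit account id>-<region>-an`); the exact convention is not quoted in this thread, and `may_create` is a hypothetical helper, not an AWS API.

```python
import re

# ASSUMED convention: <prefix>-<12-digit account id>-<region>-an
_NAMESPACED = re.compile(
    r"^.+-(?P<account>\d{12})-(?P<region>[a-z]{2}(?:-[a-z]+)+-\d+)-an$"
)


def may_create(bucket_name: str, caller_account_id: str) -> bool:
    """Reject names that look namespaced but embed someone else's account id."""
    match = _NAMESPACED.match(bucket_name)
    if match is None:
        return True  # not in the namespaced form; ordinary global rules apply
    return match.group("account") == caller_account_id
```

Under this rule, a name that merely looks like another account's namespaced bucket can never be registered in the shared global pool.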
reply
iknownothow
1 hour ago
[-]
Potential reasons I can think of for why they don't disallow name reuse:

a) AWS will need to maintain a database of all historical bucket names to know what to disallow. This is hard per region and even harder globally. It's easier to know what is currently in use than to know what has been used historically.

b) Even if they maintained a database of all historically used bucket names, then the latency to query if something exists in it may be large enough to be annoying during bucket creation process. Knowing AWS, they'll charge you for every 1000 requests for "checking if bucket name exists" :p

c) AWS builds many of its own services on S3 (as indicated in the article) and I can imagine there may be many of their internal services that just rely on existing behaviour i.e. allowing for re-creating the same bucket name.

reply
dwedge
1 hour ago
[-]
I can't accept a) or b). They already need to keep a database of all existing bucket names globally, and they already need to check this on bucket creation. Adding a flag on deleted doesn't seem like a big loss.

As for c), I assume it's not just AWS relying on this behaviour. https://xkcd.com/1172/

reply
orf
1 hour ago
[-]
That would be a huge breaking change. Any workload that relies on re-using a bucket name would be broken, and at the scale of S3 that would have a non-trivial customer impact.

Not to mention the ergonomics would suck - suddenly your terraform destroy/apply loop breaks if there’s a bucket involved

reply
afandian
1 hour ago
[-]
Any workload that relies on re-using a bucket name is broken by design. If someone else can get it, then it's Undefined Behaviour. So it's in keeping with the contract for AWS to prevent re-use. Surely?
reply
orf
1 hour ago
[-]
Think terraform tests, temporary environments, etc. Or anything else: it’s Hyrum's Law.
reply
CodesInChaos
2 hours ago
[-]
I'd allow re-use, but only by the original account. Not being able to re-create a bucket after deleting it would be annoying.

I think that's an important defense that AWS should implement for existing buckets, to complement account-scoped buckets.
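A toy model of that rule, assuming a single in-memory registry. `BucketNameRegistry` and its methods are hypothetical; this is not a description of how S3 tracks names.

```python
class BucketNameRegistry:
    """Hypothetical registry where a deleted name stays reserved for its last owner."""

    def __init__(self):
        self.owners = {}      # name -> account that currently holds it
        self.tombstones = {}  # deleted name -> account that last held it

    def create(self, name: str, account: str) -> bool:
        if name in self.owners:
            return False  # name is currently in use
        last_owner = self.tombstones.get(name)
        if last_owner is not None and last_owner != account:
            return False  # released name is reserved for its previous owner
        self.tombstones.pop(name, None)
        self.owners[name] = account
        return True

    def delete(self, name: str, account: str) -> bool:
        if self.owners.get(name) != account:
            return False  # only the current owner may delete
        del self.owners[name]
        self.tombstones[name] = account  # remember who held the name
        return True
```

A destroy/apply loop in the same account keeps working, because the original owner can always re-create the name, while another account can never pick it up.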

reply
wiether
49 minutes ago
[-]
Then they should allow bucket ownership transfer...
reply
thih9
2 hours ago
[-]
> If you wish to protect your existing buckets, you’ll need to create new buckets with the namespace pattern and migrate your data to those buckets.

My pet conspiracy theory: this article was written by bucket squatters who want to claim old bucket names after AI agents read this and blindly follow.

reply
lijok
2 hours ago
[-]
Huh? Hash your bucket names
reply
why_only_15
2 hours ago
[-]
if your bucket name is ever exposed and you later delete it, then this doesn't help you.
reply
lijok
1 hour ago
[-]
The entire article talks about “guessing” the bucket name as being the attack enabler, not the leaking of it. What does the landscape look like once you start doing the basics like hashing your bucket names? Is this still a problem worth engineering for?
reply
Maxion
2 hours ago
[-]
I don't think that'd prevent this attack vector.
reply
alemwjsl
1 hour ago
[-]
Ok; salt, and then hash your bucket names
reply
xxs
25 minutes ago
[-]
that doesn't help either. 'Salt' is public and usually different/unique per entry/name.

If you mean to use a "secret" prefix (i.e. pepper), then that would generate effectively globally unique names each time (and unpredictable too), but you can't change the pepper and it's only a matter of time before it leaks.

reply