>My data is back. Not because of viral pressure. Not because of bad PR. [...]
>“I am devastated to read on your blog about the deletion of your AWS data. I did want to reach out to let you know that people in positions of leadership, such as my boss, are aware of your blog post and I’ve been tasked with finding out what I can, and to at least prevent this from happening in the future.”
So, yes, because of bad PR. Or at least the possibility of the blog blowing up into a bad PR storm. I'm guessing that if there were no blog, the outcome would have been different.
But here’s what I learned from this experience: If you are stuck in a room full of deaf people, stop screaming, just open the door and go find someone who can hear you.
The 20 days of pain I went through weren't because AWS couldn't fix it.
It was because I believed that one of the 9 support agents would eventually break script and act like a human, or that they were being monitored by another team.
Turns out, that never happened.
It took someone from outside the ticketing system to actually listen and say: Wait. This makes no sense.
At my small business, we proactively monitor blogs and forums for mentions of our company name so that we can head off problems before they become big. I'm extremely confident that is what happened here.
It was PR-driven in the proactive sense. Which is still PR-driven. (which, by the way, I have no problem with! the problem is the shitty support when it isn't PR-driven)
Regardless, I 100% feel your pain with dealing with support agents that won't break script, and I am legitimately happy that you both got to reach someone that was high enough up the ladder to act human and that they were able to restore your data.
Yes, it is totally possible that AWS monitors blogs and forums for early damage control, like your company does.
But we shouldn’t paint it like I was bailed out by some algorithmic PR radar and nothing else.
Let’s not fall into the “Fuk the police” style of thinking where every action is assumed to be manipulation. Tarus didn’t reach out like a Scientology agent demanding I take the post down or warning me of consequences.
He came with empathy, internal leverage, and actually made things move.
Before I read Tarus's email, I had written in Slack to Nate Berkopec (Puma maintainer): `Hi. AWS destroyed me, i'm going to take a big break.`
Then his email reset my cortisol levels to an acceptable level.
Most importantly, this incident triggered a CoE (Correction of Error) process inside AWS.
That means internal systems and defaults are being reviewed, and that's more than I expected. We're getting a real update that will affect cases like mine in the future.
So yeah, it may have started in the visibility layer, but what matters is that someone human got involved, and actual change is now happening.
>[...] assumed to be manipulation
I think you're reading way more negativity into "PR" than I'm intending (which is no negativity).
It's very clear Tarus is a caring person who really did empathize with your situation and did their best to rectify the situation. It's not a bad thing that your issue may (most likely) have been brought to his attention because of "PR radar" or whatever.
The bad part, on Amazon and other similar companies, is how they typically respond when a potential PR hit isn't on the line. Which, as I'm sure you know because you experienced it prior to posting your blog, is often a brick wall.
The overwhelming issue is that you often require some sort of threat of damage to their PR to be assisted. That doesn't make the PR itself a bad thing. And that fact implies nothing about the individuals like Tarus who care. Often the lowly tier 1 support empathizes, they just aren't allowed to do anything or say anything.
“SEC01-BP01 Separate workloads using accounts.” - https://docs.aws.amazon.com/wellarchitected/latest/security-...
Keep resources out of the payer/management accounts. Consolidated billing is fine, but the management account should stay empty.
"Best practices for the management account" - https://docs.aws.amazon.com/organizations/latest/userguide/o...
Enable cross-account backups. Copy snapshots or AWS Backup vaults to a second account so Support lockouts don't equal data loss (a rough sketch follows this list).
"Creating backup copies across AWS accounts" - https://docs.aws.amazon.com/aws-backup/latest/devguide/creat...
Populate Billing, Security, and Ops alternate contacts. AWS Support escalates to those addresses when the primary inbox is dead. "Update the alternate contacts for your AWS account" - https://docs.aws.amazon.com/accounts/latest/reference/manage...
Follow the multi-account white-paper for long-term org design. It is not optional reading. "Organizing Your AWS Environment Using Multiple Accounts" - https://docs.aws.amazon.com/whitepapers/latest/organizing-yo...
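To make the cross-account backup item above concrete, here is a minimal boto3 sketch of sharing an EBS snapshot with a second, independent account and copying it there, so the copy is owned by an account that a Support lockout can't touch. The account ID, profile name, snapshot ID, and region are placeholders, and encrypted snapshots additionally need a KMS key shared with the destination account:

```python
import boto3

# Placeholders -- substitute your own values.
SOURCE_REGION = "eu-west-1"
SNAPSHOT_ID = "snap-0123456789abcdef0"   # snapshot in the primary account
BACKUP_ACCOUNT_ID = "222222222222"       # the second, independent account

# In the primary account: grant the backup account permission to use the snapshot.
src_ec2 = boto3.client("ec2", region_name=SOURCE_REGION)
src_ec2.modify_snapshot_attribute(
    SnapshotId=SNAPSHOT_ID,
    Attribute="createVolumePermission",
    OperationType="add",
    UserIds=[BACKUP_ACCOUNT_ID],
)

# In the backup account (separate credentials/profile): make a copy that the
# backup account owns, so it survives whatever happens to the primary account.
backup_session = boto3.Session(profile_name="backup-account")
dst_ec2 = backup_session.client("ec2", region_name=SOURCE_REGION)
copy = dst_ec2.copy_snapshot(
    SourceRegion=SOURCE_REGION,
    SourceSnapshotId=SNAPSHOT_ID,
    Description="Cross-account copy owned by the backup account",
)
print("Copy started:", copy["SnapshotId"])
```

AWS Backup's cross-account copy jobs do the same thing with vaults and lifecycle rules (see the link above); the essential point is simply that the copy lives in a different account.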
Maybe get some training?
Which only happened because of your blog post. In other words, the effort to prevent bad PR led to them fixing your problem immediately, while 20 days of doing things the "right" way yielded absolutely no results.
This actually makes the problem you've described even worse: it indicates that AWS has absolutely no qualms about failing to properly support the majority of its customers.
The proper thing for them to do was not to have a human "outside the system" fix your problem. It was for them to fix the system so that the system could have fixed your problem.
That being said: Azure is so much worse than AWS. Even bad PR won't push them to fix things.
Customer service was great and refunded my money without me blogging about it. We messaged back and forth about what I was trying to do and what I thought I was signing up for. I think it helped to have a long history of tiny AWS instances, because they mentioned reviewing my customer history.
I want to hate Amazon, but they provided surprisingly pleasant and personable service to a small fry like me. That exchange alone probably cost Amazon more money than I've spent in AWS. Won my probably misguided customer loyalty.
Being a PR move isn't inherently a bad thing.
The bad thing is the lack of support when PR isn't at risk.
>It's fair to say that without the blog post this issue wouldn't have been noticed or fixed, but anything past that is really just speculating about people's motives.
My only (minor) issue with the blog post is starting by saying "Not because of PR" when the opening email from the human at amazon was "saw your blog". I think it is evident that Tarus Balog did indeed actually care!
"If you want your paperwork processed in Morocco, make sure you know someone at the commune, and ideally have tea with their cousin."
Yes, it works, but it shouldn’t be the system.
What happened with AWS isn’t a clever survival tip, it’s proof that without an account manager, you are just noise in a ticket queue, unless you bring social proof or online visibility.
This should have never come down to 'who you know' or 'how loud you can go online'.
It's sheer luck that I speak English and have an online presence. What if I had been ranting in French, Arabic, or even Darija on Facebook? Tarus would never have noticed.
I recently opened a DigitalOcean account and it was locked for a few days after I had moved workloads in. They took four days to unlock the account, and for my trouble they continued to charge me for my resources during the time the account was locked when I couldn't log in to delete them. I didn't have any recourse at all. They did issue a credit because I asked nicely, but if they said no, that would have been it.
https://developer.visa.com/capabilities/vau
https://developer.mastercard.com/product/automatic-billing-u...
https://stripe.com/resources/more/what-is-a-card-account-upd...
I'm super annoyed by any hosting or telco provider that has failed to implement it, I think it amounts to negligence, especially if they stop and completely erase your entire portfolio only a couple of days after payment failure, which some officially do.
I lost my Online.net Scaleway test servers this way, because they have failed to implement the updater. Of course, their support will always blame the customer.
BTW, these Account Updaters are also the reason why changing a card number will not prevent any recurring charges, either; it's by design.
I can confirm that Amazon Prime, Visible and 123-Reg have no issues with VAU and equivalents. Your monthly Amazon Prime in the US will continue getting billed to an expired credit card without any issues, for example. Same with Visible, and even the annual renewals on 123-Reg. None of them show the new card in the interface, and they still DO issue warnings that the card is expired, yet on the billing date it all just bills as if nothing has happened.
T-Mobile US may lack support, and Callcentric, Hetzner, and Online.net don't seem to implement an Account Updater either; not sure about OVH.
Translation: "someone noticed it trending on HN, decided it was bad publicity, and that they should do something about it"
Implication: what mattered was the bad publicity, not the poor support infrastructure. The latter won't change, and the next person with similar problems will get the same runaround, and probably lose their data.
/c (cynic, but I suspect realist)
The various teams (anti-fraud and support) are investigating how we failed this customer so we can improve and hopefully keep this from happening again. (This is the ‘Correction of Error’ process that’s being worked on. And CoE’s aren’t a punitive ‘blame session’ - it’s figuring out how a problem happened and how we can fix or avoid it systemically going forward).
To be fair, the publicity did mean that multiple people were flagging this and driving escalations around it.
I'm concerned that you're being very unspecific talking about "our services" and "it" going wrong.
What went wrong here is AWS not spending enough money on humans in the support teams. And of course this is a neverending balancing act between profitability and usability. Like any other profit vs. usability consideration, the curve probably has a knee somewhere when the service becomes too unusable and too many people flee to the competition.
And it seems current economic wisdom is that that knee in the curve is pretty far on the "bad support" side of the scale.
Which is to say, the cynic in me doesn't believe you'll be making any changes, mostly because that knee in the curve is in fact pretty far on the "bad support" side, and economics compels you to exploit that.
Disabling a legitimate in-use account is one of our absolute nightmares, and I don't care if it was an account paying $3/month we would be having a review of that with our top level management (including our CEO - Matt Garman) no matter how we found out about it. For us, there is not some acceptable rate of this as a cost of doing business.
It might be your nightmare, but at the same time there is no way for your customers to report it, or for your own support agents to escalate that something might have gone wrong and that someone should look again ...
At least one layer of human support needs to have the ability -- not just the ability, but the obligation! -- to escalate to your team when a customer service problem occurs that doesn't fit a pattern of known/active scams and they are unable to help the customer themselves. Sounds like that's not currently the case.
In these cases, it's also really important that customer support stick to a script and can't be abused as part of social engineering, hijacking, or fraud check bypass. "No we can't reset your account" is a very important protection too. I agree that there is an obligation to escalation, but I suspect the focus of the COE will be on how we could have detected this without human judgement. There's got to be a way.
One obvious approach would be to charge for access to human support. I'll bet the OP would happily have paid $50 to talk to someone with both the ability and inclination to escalate the issue. In rare instances such as this one where the problem really is on your end, the $50 would be refunded.
And disabling an in-use account was not the issue here. The issue is that there was no way to get the account re-enabled.
"The post-closure period is 90 days—during this time, an account can be reopened and data is retained."
If they don’t “know you” financially, you’ll need to jump through compliance questions before your new payment method is accepted. If you don’t respond to the verification email within the 5 days, I imagine you have to jump through more hoops, which this person didn’t do in time.
I am inspired now to dump my databases and rsync the content on a schedule.
1. A hard drive in a fire safe.
2. An S3 bucket, mediated by Wasabi.
3. My friend's server that lives at his house half a continent away.
It would be nice to have a fourth location: a physical hard drive that lives outside of my house but close enough to drive to for pick-up. That would mean either paying for a safety deposit box, as you mentioned, or hassling a friend once a week when I come by to pick it up and drop it off.
I figure that if a disaster takes out my house and someone across town at the same time, I probably won't be worrying about restoring data. Cross-continent would only be viable with a server like you mentioned, essentially buddyCloud.
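For what it's worth, the kind of scheduled dump-and-rsync routine I mean looks roughly like this (a sketch only; the paths, hosts, and database names are placeholders, and you'd run it from cron or a systemd timer):

```python
#!/usr/bin/env python3
"""Nightly database dump plus fan-out to several independent destinations (sketch)."""
import datetime
import pathlib
import subprocess

STAGING = pathlib.Path("/var/backups/staging")   # local staging area (placeholder)
DATABASES = ["app_production"]                   # placeholder database names

# Each destination is just an rsync target string (all placeholders).
DESTINATIONS = [
    "/mnt/firesafe-drive/backups/",               # 1. drive kept in the fire safe
    "wasabi-gateway:/backups/",                   # 2. host that syncs on to the S3/Wasabi bucket
    "friend@half-a-continent-away:/srv/backups/"  # 3. friend's server over SSH
]

def dump_databases() -> None:
    STAGING.mkdir(parents=True, exist_ok=True)
    stamp = datetime.date.today().isoformat()
    for db in DATABASES:
        outfile = STAGING / f"{db}-{stamp}.sql.gz"
        # pg_dump piped through gzip; swap in mysqldump etc. as needed.
        with open(outfile, "wb") as fh:
            dump = subprocess.Popen(["pg_dump", db], stdout=subprocess.PIPE)
            subprocess.run(["gzip", "-c"], stdin=dump.stdout, stdout=fh, check=True)
            dump.wait()
            if dump.returncode != 0:
                raise RuntimeError(f"pg_dump failed for {db}")

def sync_everywhere() -> None:
    for dest in DESTINATIONS:
        subprocess.run(["rsync", "-az", "--delete", f"{STAGING}/", dest], check=True)

if __name__ == "__main__":
    dump_databases()
    sync_everywhere()
```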
I urge you to reconsider this belief if you have the financial means. Life continues after natural disasters. I lived in a place that was devastated by a hurricane and many people's homes were destroyed. Those people still live and work today, and they still needed their files. They especially needed their insurance documents after the hurricane.
Re: "The Support Agents Who Became LLMs"; yes, institutionalized support is terrible almost everywhere. Partly because it costs real money to pay real humans to do it properly, so it ends up as a squeezed cost centre.
I have read both of his articles in full. As I see it, the issue is that he gave control of his account out of his own hands. But I guess he did not understand the consequences of this.
He put his own AWS account as a member into another company's AWS Organizations (https://docs.aws.amazon.com/organizations/) so they could pay his bills. As I understand it, if the so-called management account gets deleted (the reason may not matter), this also deletes all the member accounts, probably because they no longer have any active payment setup. A good overview is given in "Terminology and concepts for AWS Organizations" (https://docs.aws.amazon.com/organizations/latest/userguide/o...).
I had done this setup in the company I worked for. We "merged" two independent accounts (one from the company we acquired) as members into an AWS Organization with consolidated billing below a newly created management account. We let our AWS account manager take care of transferring the payment settings (we already paid by invoice and bank transfer) from our own account (now only a member) into the new management account. With this we had only one invoice per month, and everything is legally owned by the owner company. This also lets discounts and savings plans be shared across all accounts. Over the years we also created several new member accounts out of the management account, which is easy and does not require setting up any payment information at all.
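For the record, creating a member account from the management account really is just one Organizations API call with no payment setup of its own; roughly something like this (the email and account name are placeholders, and it has to run with management-account credentials):

```python
import time
import boto3

# Run with credentials for the organization's management account.
org = boto3.client("organizations")

# Placeholder email and name for the new member account.
resp = org.create_account(
    Email="aws-newteam@example.com",
    AccountName="newteam-sandbox",
    IamUserAccessToBilling="DENY",   # billing stays consolidated in the management account
)
request_id = resp["CreateAccountStatus"]["Id"]

# Account creation is asynchronous; poll until it finishes.
while True:
    status = org.describe_create_account_status(
        CreateAccountRequestId=request_id
    )["CreateAccountStatus"]
    if status["State"] in ("SUCCEEDED", "FAILED"):
        break
    time.sleep(5)

print(status["State"], status.get("AccountId"), status.get("FailureReason"))
```

The flip side, as noted above, is that the member account's billing (and apparently its fate) is tied to that management account.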
I was getting constant fake SMS messages about a failed login attempt, turned on 2FA thinking it was a good idea, 2 minutes later banned completely for 'abuse'. My case will not be heard or reviewed.
Well, not normally, no. But it does happen. Not often enough to be a meaningful statistical issue, but if it were to happen to you, a little forethought can turn a complete disaster into a survivable event. If you store all your data 'in the cloud', realize that your account could be compromised, used to store illegal data, be subject to social engineering, and lots of other things that could result in a cloud services provider protecting their brand rather than your data. If, like the author, you are lucky, you'll only be down for a couple of days. But for most businesses that's the end of the line, especially if you run a multi-tenant SaaS or something like that. So plan for the worst and hope for the best.
Doesn't everyone have this?
Surprising. In my time, things always got pretty serious if your service could not recover from loss due to regrettable events.
TFA alluded to a possible but "undocumented" way to restore terminated infrastructure. I don't think all AWS services nuke everything on deletion, but if it is not in writing ...
https://www.tomshardware.com/software/cloud-storage/aws-accu...
> Update: August 5 7:30am (ET): In a statement, an AWS spokesperson told Tom's Hardware "We always strive to work with customers to resolve account issues and provided an advance warning of the potential account suspension. The account was suspended as part of AWS’s standard security protocols for accounts that fail the required verification, and it is incorrect to claim this was because of a system error or accident."
This shows a bigger part of this problem.
When these mistakes do happen, they're invariably treated as standard operating procedures.
They're NEVER treated as errors.
It would appear that the entire support personnel chain and PR literally have no escalation path to treat any of these things as errors.
Instead, they simply double down on the claim that it was NOT an error that the account was terminated on insufficient notice over bogus claims and broken policies.
Yes, I had backups everywhere. Across providers, in different countries. But I built a system tied to my AWS account number, my instances, my IDs, my workflows.
When that account went down, all those “other” backups were just dead noise, encrypted forever. Bringing them up in the story only invites the 'just use your other backups' fallback, and ignores the real fragility of centralized dependencies.
It is like this: the UK still maintains BBC Radio 4's analogue emergency broadcast, a signal so vital that if it's cut, UK nuclear submarines and missile silos automatically trigger retaliation. No questions asked. That's how much weight they place on a reliable signal.
If your primary analogue link fails, the world ends. That's precisely how I felt when AWS pulled my account, because I'd tied my critical system to a single point of failure. If the account had just been made read-only, I would have waited, because I could have accessed my data and rotated keys.
AWS is the apex cloud provider on the planet. This isn't about redundancy or best practices; it's about how much trust and infrastructure we willingly lend to one system.
Remember that if the BBC Radio 4 signal fails for some reason, the world gets nuked, only cockroaches will survive… and your RDS and EC2 billing fees.
Sorry, but it is absolutely and undeniably not!
I don't think it's accurate to portray AWS as the underlying infrastructure of the internet on the planet at all.
Speaking of the "planet", here are some sample statistics for the .ru ccTLD on web hosting usage across the 1 million domain names in the zone, one of the top-10 ccTLD zones on the planet:
https://statonline.ru/metrics/hosting_shared?month=2021-12&t...
Amazon is dead last at #30 with 0.46% of the market; the data is for December 2021, so it's before any payment issues, which would only originate in March 2022 and would take weeks/months/years to propagate.
Hetzner is far more popular, at 4.11%; OVH is also bigger than Amazon, at 1.38%; and even Wix had 2.16%. Even Google managed to get 1.15%, probably because of their Google Sites service. Most of the rest of the providers are local, which is how it should be if digital sovereignty is of the essence. The real reason for the local providers, however, is most likely affordability, reliability, price, better service, and far better support, not digital sovereignty considerations at all. This is exactly why Hetzner is at the top: the market is obviously price-sensitive when unlimited venture capital was never available for the local startups, and Hetzner and OVH provide the best value for money, whereas AWS does not.
The earliest data this service currently has is for March 2020, and there Amazon didn't even make it into the top 30 at all!
https://statonline.ru/metrics/hosting_shared?month=2020-03&t...
They have a separate tab for VPS and dedicated servers, which covers about 0.1 million domain names compared to 1 million in the prior view; there, Hetzner, DigitalOcean, and OVH are likewise ahead of Amazon, with Hetzner at #2 at over 10% compared to Amazon's 2%, leaving Amazon far behind:
https://statonline.ru/metrics/hosting_vps_dedic?tld=ru&month...
Numbers for 2025 don't look as good for any foreign provider, likely due to a combination of factors, including much stronger data sovereignty laws that may preclude most startups from using foreign services like Hetzner. Still, Hetzner remains the most popular foreign option in both categories, with more market share in 2025 than AWS had in 2021, and even DigitalOcean is still more popular than AWS.
BTW, I've tried looking up the BBC Radio 4 apocalypse claims, and I'm glad to find out that this information is false.
I'm glad you got your account restored, and I thank you for bringing the much needed attention to these verification issues, but I don't think you're making the correct conclusions here. Not your keys, not your coins. Evidently, this applies to backups, too.
Many of the providers on those lists are older than AWS, and/or have been in business for many years since before 2010, many for 20 years or more, long before the data sovereignty concerns went mainstream all around the world in the last 10 years.
> An AWS spokesperson told The Register: "We always strive to work with customers to resolve account issues and provided advance warning of the potential account suspension. The account was suspended as part of AWS's standard security protocols for accounts that fail the required verification, and it is incorrect to claim this was because of a system error or accident."
Note that this deletion, without the supposedly standard 90-day grace period, was NOT an error (up until the time the account was reinstated, apparently).
As in: if all their code was geared to AWS services and they were banned, they couldn't easily deploy elsewhere without a rewrite.
I stick to VMs, Kubernetes, and containers. No own-brand services.
If wrong data gets deleted, and that gets replicated, now you simply have two copies of bad data.
Is this a great situation? No. It's also not "I did everything right and boo hoo AWS did a boo boo". AWS is not your friend, but you also weren't the customer, that was the middleman you gave ownership to.
[1] https://www.seuros.com/blog/aws-deleted-my-10-year-account-w...
> AWS blamed the termination on a “third-party payer” issue. An AWS consultant who’d been covering my bills disappeared, citing losses from the FTX collapse. The arrangement had worked fine for almost a year—about $200/month for my testing infrastructure.
He essentially got screwed over by the consultant. Everything else is a side-effect but not material to the cause.
Get some training. It's in the first few slides... Read the docs. Look at the Well-Architected Framework...
See this comment for what he should have done: https://news.ycombinator.com/item?id=44828910
> July 14: Form expired. I contact support. Simple question: “What do you need from me?”
> July 16-20: Four days of silence. Then: “We’re escalating to the appropriate team.”
> July 20: New form finally arrives.
> July 21: I submit ID and utility bill (clear PDF). Response time: 10 hours.
> July 22: AWS: “Document unreadable.” The same PDF my bank accepts without question.
> July 23: Account terminated. My birthday gift from AWS.
If I understand the timeline correctly, how were they supposed to get verified properly?
"The post-closure period is 90 days—during this time, an account can be reopened and data is retained."
https://www.seuros.com/blog/aws-deleted-my-10-year-account-w...
Root cause: the author had some wacky service-end setup with their AWS account (presumably to save a few cents per month) and the third party screwed them.