On the one hand, this is obviously the right decision. The number of giant data breaches caused by incorrectly configured S3 buckets is enormous.
But... every year or so I find myself wanting to create an S3 bucket with public read access so I can serve files out of it. And every time I need to do that I find something has changed and my old recipe doesn't work any more and I have to figure it out again from scratch!
Even if you have a terrible and permissive bucket policy or ACLs (legacy but still around) configured for the S3 bucket, if you have Block Public Access turned on - it won't matter. It still won't allow public access to the objects within.
If you turn it off but you have a well-scoped and ironclad bucket policy - you're still good! The bucket policy will dictate who, if anyone, has access. Of course, you have to make sure nobody inadvertently modifies that bucket policy over time, or adds an IAM role with access, or modifies the trust policy for an existing IAM role that has access, and so on.
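For what it's worth, if I have the current public-bucket recipe right, it comes down to two boto3 calls: relax Block Public Access just enough for a public bucket policy to take effect, then attach a tightly scoped read-only policy. A sketch (the bucket name is a placeholder, and worth double-checking against the current docs before trusting it):

```python
import json
import boto3

s3 = boto3.client("s3")
bucket = "my-public-assets-example"  # placeholder

# Block Public Access would otherwise override any policy, so relax only the
# policy-related toggles; keep the ACL-related ones on.
s3.put_public_access_block(
    Bucket=bucket,
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": False,
        "RestrictPublicBuckets": False,
    },
)

# Scope the grant as tightly as possible: anonymous read of objects, nothing else.
s3.put_bucket_policy(
    Bucket=bucket,
    Policy=json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "PublicReadObjects",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{bucket}/*",
        }],
    }),
)
```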
That being said, there's certainly a lot more that could go into making a system like that easier for developers. One thing that springs to mind is tooling that can describe what rules are currently in effect that limit (or grant, depending on the model) permissions for something. That would make it clearer when there are overlapping rules that affect the permissions of something, which in turn would make it much clearer why something is still not accessible from a given context despite one of the rules being removed.
After thinking about this sort of thing a lot when designing a system for something sort of similar to this (at a much smaller scale, but with the intent to define it in a way that could be extended to define new types of rules for a given set of resources), I feel pretty strongly that security, ease of implementation, and intuitiveness for users are all best served by the same design: require every rule to be explicitly defined as a permission rather than representing any of them as restrictions (both in how they're presented to the user and how they're modeled under the hood).

With this model, verifying whether an action is allowed can be implemented by mapping the action to the set of accesses (or mutations, as the case may be) it would perform, and then checking that each of them has a rule present that allows it. This makes it much easier to figure out whether something is allowed or not, and there's plenty of room for quality-of-life features to help users understand the system (e.g. being able to easily show a user what rules pertain to a given resource with essentially the same lookup that you'd need to do when verifying an action on it).

My sense is that this is actually not far from how AWS permissions are implemented under the hood, but they completely fail at the user-facing side of this by making it much harder than it needs to be to discover where to define the rules for something (and by extension, where to find the rules currently in effect for it).
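A minimal Python sketch of that allow-only evaluation model, with entirely made-up action and rule names (this is not any real permission system's API):

```python
# Every rule is a grant; nothing is expressed as a restriction.
ACTION_ACCESSES = {
    # action -> the accesses it would perform, as (resource, access-type) pairs
    "publish_report": [("reports-bucket", "write"), ("audit-log", "append")],
}

def is_allowed(principal, action, rules):
    """rules: a set of (principal, resource, access) grants."""
    return all(
        (principal, resource, access) in rules
        for resource, access in ACTION_ACCESSES[action]
    )

def rules_for_resource(resource, rules):
    # The same lookup doubles as a "what rules apply here?" view for users.
    return sorted(r for r in rules if r[1] == resource)

grants = {("alice", "reports-bucket", "write"), ("alice", "audit-log", "append")}
print(is_allowed("alice", "publish_report", grants))  # True: every required access has a grant
print(is_allowed("bob", "publish_report", grants))    # False: nothing grants bob anything
print(rules_for_resource("audit-log", grants))
```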
Yeah, what month?
It does compile down to Azure Resource Manager's JSON DSL, so in that way it's close to Troposphere I guess, only both sides are official and not just some rando project that happens to emit YAML/JSON
The implementation, of course, is ... very Azure, so I don't mean to praise using it, merely that it's a better idea than rawdogging json
The syntax does look nicer but sadly that’s just a superficial improvement.
Also, they have CDK, which is a framework for writing IaC in Java/TypeScript, Go, Python, etc.
https://dev.to/kelvinskell/getting-started-with-aws-cdk-in-p...
1. I need a goddamn CLI to run it (versus giving someone a URL they can load in their tenant and have running resources afterward)
1. the goddamn CLI mandates live cloud credentials, but then straight-up never uses them to check a goddamn thing it intends to do to my cloud control plane
You may say "running 'plan' does" and I can offer 50+ examples clearly demonstrating that it does not catch the most facepalm of bugs
1. related to that, having a state file that believes it knows what exists in the world is just ludicrous and pain made manifest
1. a tool that thinks nuking things is an appropriate fix ... whew. Although I guess in our new LLM world, saying such things makes me the old person who should get onboard the "nothing matters" train
and the language is a dumpster, imho
> 1. I need a goddamn CLI to run it (versus giving someone a URL they can load in their tenant and have running resources afterward)
CloudFormation is the only IaC that supports "running as a URL" and that's only because it's an AWS native solution. And CloudFormation is a hell of a lot more painful to write and slower to iterate on. So you're not any better off for using CF.
What usually happens with TF is you'd build a deploy pipeline. Thus you can test via the CLI then deploy via CI/CD. So you're not limited to just the CLI. But personally, I don't see the CLI as a limitation.
> the goddamn CLI mandates live cloud credentials, but then straight-up never uses them to check a goddamn thing it intends to do to my cloud control plane
All IaC requires live cloud credentials. It would be impossible for them to work without live credentials ;)
Terraform does do a lot of checking. I do agree there is a lot that the plan misses though. That's definitely frustrating. But it's a side effect of cloud vendors having arbitrary conditions that are hard to define and forever changing. You run into the same problem with any tool you'd use to provision. Heck, even manually deploying stuff from the web console sometimes takes a couple of tweaks to get right.
> 1. related to that, having a state file that believes it knows what exists in the world is just ludicrous and pain made manifest
This is a very strange complaint. Having a state file is the bare minimum any IaC NEEDS for it to be considered a viable option. If you don't like IaC tracking state then you're really little better off than managing resources manually.
> a tool that thinks nuking things is an appropriate fix ... whew.
This is grossly unfair. Terraform only destroys resources when:
1. you remove those resources from the source. Which is sensible because you're telling Terraform you no longer want those resources
2. when you make a change that AWS doesn't support doing on live resources. Thus the limitation isn't Terraform, it is AWS
In either scenario, the destroy is explicit in the plan and expected behaviour.
Incorrect, ARM does too, they even have a much nicer icon for one click "Deploy to Azure" <https://learn.microsoft.com/en-us/azure/azure-resource-manag...> and as a concrete example (or whole repo of them): <https://github.com/Azure/azure-quickstart-templates/tree/2db...>
> All IaC requires live cloud credentials. It would be impossible for them to work without live credentials ;)
Did you read the rest of the sentence? I said it's the worst of both worlds: I can't run "plan" without live creds, but then it doesn't use them to check jack shit. Also, to circle back to our CF and Bicep discussion, no, I don't need cloud creds to write code for those stacks - I need only creds to apply them
I don't need a state file for CF nor Bicep. Mysterious about that, huh?
That’s Azure, not AWS. My point was to have “one click” HTTP installs you need native integration with the cloud vendor. For Azure it’s the clusterfuck that is Bicep. For AWS it’s the clusterfuck that is CF
> I don't need a state file for CF nor Bicep.
CF does have a state file, it’s just hidden from view.
And bicep is shit precisely because it doesn’t track state. In fact the lack of a state file is the main complaint against bicep and thus the biggest thing holding it back from wider adoption — despite being endorsed by Microsoft Azure.
Regarding HCL, I respect their decision to keep the language minimal, and for all it's worth you can go very, very far with the language expressions and using modules to abstract some logic, but I think it's a fair criticism for the language not to support custom functions and higher level abstractions.
This isn’t a limitation of TF, it’s an intended consequence of cloud vendor lock in
Which is to say both of you are correct, but OP was highlighting the improper expectations of "if we write in TF, sure it sucks balls but we can then just pivot to $other_cloud" not realizing it's untrue and now you've used a rusty paintbrush as a screwdriver
But maybe I’ve just been blessed to work with people who aren’t complete idiots?
I'm not sure if that's changed recently, I've stopped using it.
eksctl just really impressed me with its EKS management, specifically managed node groups & cluster add-ons, over Terraform.
It uses CloudFormation under the hood, so I gave it a try, and it's awesome. Combine it with GitHub Actions and you have your IaC automation.
Nice web interface for others to check stack status, events for debugging, and the associated resources that were created.
Oh, and ever destroy some legacy complex (or not that complex) AWS shit in Terraform? It's not going to be smooth. Site-to-site connections, network interfaces, subnets, peering connections, associated resources… oh, my.
So far CloudFormation has been good at destroying, but I haven't tested that with massive legacy infra yet.
But I am happily converted, TF > CF.
And will happily use both alongside each other as needed.
I can't confirm it, but I suspect that it was always meant to be a sales tool.
Every AWS announcement blog has a "just copy this JSON blob, and paste it $here to get your own copy of the toy demo we used to demonstrate in this announcement blog" vibe to it.
If you know what you're doing, as it sounds like you and I do, then all of this is very easy to get set up (but then aren't most things easy when you already know how? hehe). However we are talking about people who aren't comfortable with vanilla S3, so throwing another service into the mix isn't going to make things easier for them.
For small scale stuff, S3's storage and egress charges are unlikely to be impactful. But that doesn't mean they're cheap relative to the competition.
There are also ways you can reduce S3 costs, but then you're trading the costs received from AWS with the costs of hiring competent DevOps. Either way, you pay.
It's so simple for storing and serving a static website.
Are there good and cheap alternatives?
BTW: Is GitHub Pages still free for custom domains? (I don't know the EULA)
- signed URLs, in case you want session-based file downloads (see the sketch just after this list)
- default public files, e.g. for a static site.
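A small boto3 sketch of the signed-URL option: generate a time-limited GET link for a single object (bucket and key are placeholders):

```python
import boto3

s3 = boto3.client("s3")
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "my-files-example", "Key": "reports/2024.pdf"},  # placeholders
    ExpiresIn=3600,  # the link stops working after an hour
)
print(url)
```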
You can also map a domain (or sub-domain) to CloudFront with a CNAME record and serve the files via your own domain.
CloudFront distributions are also a CDN. This way you serve files from a location close to the user, which speeds up your site.
For low to mid-range traffic, CloudFront with S3 is cheaper, as CloudFront's network egress is cheaper. But for large amounts of traffic, CloudFront costs can balloon very fast. But in those scenarios S3 costs are prohibitive too!
Once I have that I can also ask it for the custom tweaks I need.
You're braver than me if you're willing to trust the LLM here - fine if you're ready to properly review all the relevant docs once you have code in hand, but there are some very expensive risks otherwise.
I take code from stack overflow all the time and there’s like a 90% chance it can work. What’s the difference here?
But for low-stakes stuff the LLM works just fine - not everything is going to blow up into a $30,000 bill.
In fact I'll take the complete opposite stance - verifying your design with an LLM will help you _save_ money more often than not. It knows things you don't and has awareness of concepts that you might have not even read about.
If you don't do that will you necessarily notice that you accidentally leaked customer data to the world?
The problem isn't the LLM, it's assuming its output is correct, just the same as assuming Stack Overflow answers are correct without verifying/understanding them.
If you are comparing with stackoverflow then I guess we are on the same page - most people are fine with taking stuff from stackoverflow and it doesn't count as "brave".
> I'm willing to accept the risk of occasionally making S3 public
This is definitely where we diverge. I'm generally working with stuff that legally cannot be exposed - with hefty compliance fines on the horizon if we fuck up.
I feel like this workflow is still less time, easier and less error prone than digging out the exact right syntax from the AWS docs.
I now have the daunting challenge of deploying an Azure Kubernetes cluster with... shudder... Windows Server containers on top. There's a mile-long list of deprecations and missing features that were fixed just "last week" (or whatever). That is just too much work to keep up with for mere humans.
I'm thinking of doing the same kind of customised chatbot but with a scheduled daily script that pulls the latest doco commits, and the Azure blogs, and the open GitHub issue tickets in the relevant projects and dumps all of that directly into the chat context.
I'm going to roll up my sleeves next week and actually do that.
Then, then, I'm going to ask the wizard in the machine how to make this madness work.
Pray for me.
I'm still not sure I know how to do it if I need to again.
Hasn't it been this way for many years?
>Spot instances used to be much more of a bidding war / marketplace.
Yeah, because there's no bidding any more at all, which is great: you no longer get those super-high price spikes as availability drops, where only the ones who bid super high to make sure they wouldn't be priced out could still get instances.
>You don’t have to randomize the first part of your object keys to ensure they get spread around and avoid hotspots.
This one was a nightmare and it took ages to convince some of my more pig-headed coworkers in the past that they didn't need to do it any more. The funniest part is that they were storing their data as millions and millions of 10-100 KB files, so the S3 backend scaling wasn't the thing bottlenecking performance anyway!
>Originally Lambda had a 5 minute timeout and didn’t support container images. Now you can run them for up to 15 minutes, use Docker images, use shared storage with EFS, give them up to 10GB of RAM (for which CPU scales accordingly and invisibly), and give /tmp up to 10GB of storage instead of just half a gig.
This was/is killer. It used to be such a pain to have to manage pyarrow's package size if I wanted a Python Lambda function that used it. One thing I'll add that took me an embarrassingly long time to realize is that your Python global scope is actually persisted, not just the /tmp directory.
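To illustrate that last point: anything created at module level in a Python Lambda survives across warm invocations of the same execution environment, so expensive clients and downloads are only paid for on cold start. A rough sketch (the bucket name and the pyarrow dependency are assumptions about the deployment):

```python
import boto3
import pyarrow.parquet as pq  # assumes the function image/layer ships pyarrow

s3 = boto3.client("s3")   # created once per execution environment, not per invocation
_cache = {}               # module-level dict also persists between warm invocations

def handler(event, context):
    key = event["key"]
    if key not in _cache:                        # only hit S3 on a cold start / new key
        local = "/tmp/" + key.replace("/", "_")  # /tmp also persists while the env is warm
        s3.download_file("my-data-bucket-example", key, local)
        _cache[key] = pq.read_table(local)
    return {"rows": _cache[key].num_rows}
```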
I had a theory (based on no evidence I'm aware of except knowing how Amazon operates) that the original Glacier service operated out of an Amazon fulfillment center somewhere. When you put in a request for your data, a picker would go to a shelf, pick up some removable media, take it back, and slot it into a drive in a rack.
This, BTW, is how tape backups on timesharing machines used to work once upon a time. You'd put in a request for a tape and the operator in the machine room would have to go get it from a shelf and mount it on the tape drive.
https://www.reddit.com/r/DataHoarder/comments/12um0ga/the_ro...
Which is basically exactly what you described but the picker is a robot.
Data requests go into a queue; when your request comes up, the robot looks up the data you requested, finds the tape and the offset, fetches the tape and inserts it into the drive, fast-forwards it to the offset, reads the file to temporary storage, rewinds the tape, ejects it, and puts it back. The latency of offline storage is in fetching/replacing the cassette and in forwarding/rewinding the tape, plus waiting for an available drive.
Realistically, the systems probably fetch the next request from the queue, look up the tape it's on, and then process every request from that tape so they're not swapping the same tape in and out twenty times for twenty requests.
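A toy Python sketch of that batching idea: group pending restore requests by tape so one mount serves every request on that tape, reading in offset order so the tape only winds forward once. mount/read/unmount are imaginary stand-ins for whatever the real library controller exposes:

```python
from collections import defaultdict

def mount(tape_id):
    print(f"robot: load {tape_id}")

def read(tape_id, offset, object_id):
    print(f"  read {object_id} from {tape_id}@{offset} into staging")

def unmount(tape_id):
    print(f"robot: rewind, eject and reshelve {tape_id}")

def drain_queue(pending):
    """pending: list of (tape_id, offset, object_id) restore requests."""
    by_tape = defaultdict(list)
    for tape_id, offset, object_id in pending:
        by_tape[tape_id].append((offset, object_id))
    for tape_id, reads in by_tape.items():
        mount(tape_id)
        for offset, object_id in sorted(reads):  # one forward pass over the tape
            read(tape_id, offset, object_id)
        unmount(tape_id)

drain_queue([("T042", 900, "video.mp4"), ("T042", 120, "photos.zip"), ("T007", 5, "tax.pdf")])
```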
1. Do not encrypt your tapes if you want the data back in 30/50 years. We have had so many companies lose encryption keys and turn their tapes into paperweights because the company they bought out 17 years ago had poor key management.
2. The typical failure case on tape is physical damage not bit errors. This can be via blunt force trauma (i.e. dropping, or sometimes crushing) or via poor storage (i.e. mould/mildew).
3. Not all tape formats are created equal. I have seen far higher failure rates on tape formats that are repeatedly accessed, updated, ejected, than your old style write once, read none pattern.
That said, I don’t miss any of that stuff, gimme S3 any day :)
It's been a long time, and features launched since I left make clear some changes have happened, but I'll still tread a little carefully (though no one probably cares there anymore):
One of the most crucial things to do in all walks of engineering and product management is to learn how to manage the customer expectations. If you say customers can only upload 10 images, and then allow them to upload 12, they will come to expect that you will always let them upload 12. Sometimes it's really valuable to manage expectations so that you give yourself space for future changes that you may want to make. It's a lot easier to go from supporting 10 images to 20, than the reverse.
It feels odd that this is some sort of secret. Why can't you talk about it?
(And if Amazon is making unreasonable length NDAs I hope they lose a lot of money over it.)
Then the existing pickers would get special instructions on their handhelds: Go get item number NNNN from Row/shelf/bin X/Y/Z and take it to [machine-M] and slot it in, etc.
Even worse, if you run self-hosted NAT instance(s), don't use an EIP attached to them. Just use an auto-assigned public IP (no EIP).
NAT instance with EIP
- AWS routes it through the public AWS network infrastructure (hairpinning).
- You get charged $0.01/GB regional data transfer, even if in the same AZ.
NAT instance with auto-assigned public IP (no EIP)
- Traffic routes through the NAT instance’s private IP, not its public IP.
- No regional data transfer fee — because all traffic stays within the private VPC network.
- auto-assigned public IP may change if the instance is shut down or re-created, so have automation to handle that (see the sketch below). Though you should be using the network interface ID reference in your VPC route tables anyway.
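A hedged boto3 sketch of that route-table note, with placeholder IDs: point the private route table's default route at the NAT instance's network interface, so the route keeps working even when the auto-assigned public IP changes:

```python
import boto3

ec2 = boto3.client("ec2")
ec2.replace_route(
    RouteTableId="rtb-0123456789abcdef0",        # the private subnets' route table
    DestinationCidrBlock="0.0.0.0/0",
    NetworkInterfaceId="eni-0123456789abcdef0",  # the NAT instance's ENI, not an IP
)
```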
My understanding is that transfer gets charged on both sides as well. So if you own both sides you'll pay $0.02/GB.
I have the opposite philosophy for what it’s worth: if we are going to pay for AWS I want to use it correctly, but also maximally. So for instance if I can offload N thing to Amazon and it’s appropriate to do so, it’s preferable. Step Functions, Lambda, DynamoDB, etc., over time, have come to supplant their alternatives and it’s overall more efficient and cost effective.
That said, I strongly believe developers don’t do enough consideration as to how to maximize vendor usage in an optimal way
Because it's not straightforward. 1) You need to have general knowledge of AWS services and their strong and weak points to be able to choose the optimal one for the task, 2) you need to have good knowledge of the chosen service (like DynamoDB or Step Functions) to be able to use it optimally; being mediocre at it is often not enough, 3) local testing is often a challenge or plain impossible, you often have to do all testing on a dev account on AWS infra.
Truly a marketing success.
AWS can be used in a different, cost effective, way.
It can be used as a middle-ground capable of serving the existing business, while building towards a cloud agnostic future.
The good AWS services (s3, ec2, acm, ssm, r53, RDS, metadata, IAM, and E/A/NLBs) are actually good, even if they are a concern in terms of tracking their billing changes.
If you architect with these primitives, you are not beholden to any cloud provider, and can cut over traffic to a non AWS provider as soon as you’re done with your work.
Let me explain why we’re not talking about an 80/20 split.
There’s no reason to treat something like a route53 record, or security group rule, in the same way that you treat the creation of IAM Policies/Roles and their associated attachments.
If you create a common interface for your engineers/auditors, using real primitives like the idea of a firewall rule, you’ve made it easy for everyone to avoid learning the idiosyncrasies of each deployment target, and feel empowered to write their own merge requests, or review the intended state of a given deployment target.
If you need to do something provider-specific, make a provider-specific module.
If you don't use the E(lasticity) of EC2, you're burning cash.
For prod workloads, if you can go from 1 to 10 instances during an average day, that's interesting. If you have 3 instances running 24/7/365, go somewhere else.
For dev workloads, being able to spin up instances in a matter of seconds is bliss. I installed the wrong version of a package on my instance? I just terminate it, wait for the auto-scaling group to pop a fresh new one, and start again. No need to waste my time trying to clean up my mess on the previous instance.
You speak about Step Functions as an efficient and cost effective service from AWS, and I must admit that it's one that I avoid as much as I can... Given the absolute mess that it is to set up/maintain, and that you completely lock yourself into AWS with it, I never pick it to do anything. I'd rather have a containerized workflow engine running on ECS, even though I miss out on the few nice features that SF offers within AWS.
The approach I try to have is:
- business logic should be cloud agnostic
- infra should swallow all the provider's pills it needs to be as efficient as possible
In practice I found this to be more burden than it’s worth. I have yet to work somewhere that is on Azure, GCP or AWS and actually switch between clouds. I am sure it happens, but is it really that common?
I instead think of these platforms as a marriage, you’re going to settle in one and do your best to never divorce
Using all the bells and whistles of a provider and being locked in is one thing. But the other big issue is that, as service providers, they can (and some of them did, more often than not) stop providing some services or change them in a way that forces you to make big changes in your app to keep it running on that service.
Whereas, if you build your app in an agnostic way, they can stop or change what they want; you either don't rely on those services heavily enough for the changes required to be huge, or you can just deploy elsewhere, with another flavor of the same service.
Let's say you have a legacy Java app that works only with a library that is not maintained. If you don't want to bear the cost of rewriting with a new and maintained library, you can keep the app running, knowing the risks and taking the necessary steps to protect you against it.
Whereas if your app relies heavily on DynamoDB's API and they decide to drop the service completely, the only way to keep the app running is to rewrite everything for a similar service, or to find a service relying on the same API elsewhere.
There's a sweet spot somewhere in between raw VPSes and insanely detailed least-privilege serverless setups that I'm trying to revert to. Fargate isn't unmanageable as a candidate, not sure it's The One yet but I'm going to try moving more workloads to it to find out.
Want to set up MFA ... login required to request device.
Yes, I know, they warned us far ahead of time. But not being able to request one of their MFA devices without a login is ... sucky.
> I understand your situation is a bit unique, where you are unable to log in to your AWS account without an MFA device, but you also can't order an MFA device without being able to log in. This is a scenario that is not directly covered in our standard operating procedures.
The best course of action would be for you to contact AWS Support directly. They will be able to review your specific case and provide guidance on how to obtain an MFA device to regain access to your account. The support team may have alternative options or processes they can walk you through to resolve this issue.
Please submit a support request, and one of our agents will be happy to assist you further. You can access the support request form here: https://console.aws.amazon.com/support/home
That last URL? You need to login to use it ...
NAT gateways are not purely hands-off: you can attach additional IP addresses to a NAT gateway to help it scale to more instances behind it, which is a fundamental part of how NAT gateways work, because of the limit on the number of ports that can be opened through a single IP address. When you use a VPC Gateway Endpoint, it doesn't use up ports or IP addresses attached to a NAT gateway at all. And what about metering? You pay per GB for traffic passing through the NAT gateway, but presumably not for traffic to an implicit built-in S3 gateway, so do you expect AWS to show you different meters for billed and not-billed traffic, while performance still depends on the sum total of the traffic (S3 and Internet egress) passing through it? How is that not confusing?
Separate from that point, not all NAT gateways are used for Internet egress; there are many enterprise networks with nested layers of private networks where NAT gateways help deal with overlapping private IP CIDR ranges. In such cases, having some kind of implicit built-in S3 gateway violates assumptions about how network traffic is controlled and routed, since the assumption is that the traffic stays completely private. So even if it were supported, it would need to be disabled by default (for secure defaults), and you're right back at the equivalent situation you have today, where the VPC Gateway Endpoint is a separate resource to be configured.
Not to mention that VPC Gateway Endpoints allow you to define policy on the gateway describing what may pass through, e.g. permitting read-only traffic through the endpoint but not writes. Not sure how you expect that to work with NAT gateways. This is something that AWS and Azure have very similar implementations for that work really well, whereas GCP only permits configuring such controls at the Organization level (!)
They are just completely different networking tools for completely different purposes. I expect closed-by-default secure defaults. I expect AWS to expose the power of different networking implements to me because these are low-level building blocks. Because they are low-level building blocks, I expect for there to be footguns and for the user to be held responsible for correct configuration.
And the traffic never even reaches the public internet. There's a mismatch between what the billing is supposedly for and what it's actually applied to.
> do you expect AWS to show you different meters for billed and not-billed traffic, but performance still depends on the sum total of the traffic (S3 and Internet egress) passing through it?
Yes.
> How is that not confusing?
That's how network ports work. They only go so fast, and you can be charged based on destination. I don't see the issue.
> It's also besides the point that not all NAT gateways are used for Internet egress
Okay, if two NAT gateways talk to each other it also should not have egress fees.
> some kind of implicit built-in S3 gateway violates assumptions
So don't do that. Checking if the traffic will leave the datacenter doesn't need such a thing.
https://docs.aws.amazon.com/vpc/latest/privatelink/vpc-endpo...
(Disclaimer: I work for AWS, opinions are my own.)
There are many IaC libraries, including the standard CloudFormation VPC template and CDK VPC class, that can create them automatically if you so choose. I suspect the same is also true of commonly-used Terraform templates.
It's a convenience VS security argument, though the documentation could be better (including via AWS recommended settings if it sees you using S3).
This has been a common gotcha for over a decade now: https://www.lastweekinaws.com/blog/the-aws-managed-nat-gatew...
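For anyone hitting that gotcha, the fix is a one-shot gateway endpoint so S3 traffic stops flowing through the NAT gateway (and its per-GB charge). A minimal boto3 sketch with placeholder IDs and the region-specific service name:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
ec2.create_vpc_endpoint(
    VpcEndpointType="Gateway",
    VpcId="vpc-0123456789abcdef0",             # placeholder VPC
    ServiceName="com.amazonaws.us-east-1.s3",  # S3 service name for the region
    RouteTableIds=["rtb-0123456789abcdef0"],   # route tables that should use the endpoint
)
```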
They spent the effort of branding private VPC endpoints "PrivateLink". Maybe it took some engineering effort on their part, but it should be the default out of the box, and an entirely unremarkable feature.
In fact, I think if you have private subnets, the only way to use S3 etc is Private Link (correct me if I'm wrong).
It's just baffling.
And, as an added benefit, they distinguish between "just pull" and "pull and push", which is nice
S3 can use either, and we recommend establishing VPC Gateway endpoints by default whenever you need S3 access.
(Disclaimer: I work for AWS, opinions are my own.)
People who probably shouldn't be on AWS - but they usually have to be for unrelated reasons, and they will work to reduce their bill.
This just sounds like a polite way of saying "we're taking peoples' money in exchange for nothing of value, and we can get away with it because they don't know any better".
Hideous.
They should be, of course, at least when the destination is an AWS service in the same region.
[edit: I'm speaking about interface endpoints, but S3 and DynamoDB can use gateway endpoints, which are free to the same region]
S3 can use either, and we recommend establishing VPC Gateway endpoints by default whenever you need S3 access.
(Disclaimer: I work for AWS, opinions are my own.)
Would you recommend using VPC Gateway even on a public VPC that has an Internet gateway (note: not a NAT gateway)? Or only on a private VPC or one with a NAT gateway?
Fascinating. What's the advantage of doing that?
Gateway endpoints only work for some things.
Other AWS services, though, don't support gateway endpoints.
~~I get the impression there are several others, too, but that one is of especial interest to me~~ Wowzers, they really are much better now:
aws --region us-east-1 ec2 describe-vpc-endpoint-services | jq '.ServiceNames|length'
459
If you're saying "other services should offer VPC Endpoints," I am 100% on-board. One should never have to traverse the Internet to contact any AWS control plane.

The other problem with (interface) VPC endpoints is that they eat up IP addresses. Every service/region permutation needs a separate IP address drawn from your subnets. Immaterial if you're using IPv6, but can be quite limiting if you're using IPv4.
Just some internet services that haven’t upgraded. (But fixed by NAT.)
Ultimately AWS doesn’t have the right leadership or talent to be good at GenAI, but they do (or at least used to) have decent core engineers. I’d like to see them get back to basics and focus there. Right now leadership seems panicked about GenAI and is just throwing random stuff at the wall desperately trying to get something to stick. That's really annoying to customers.
WTH?
The canonical AZ naming was provided because, I bet, they realized that the users who needed canonical AZ identifiers were rarely the same users that were causing hot spots via always picking the same AZ.
It was fine, until there started to be ways of wiring up networks between accounts (eg PrivateLink endpoint services) and you had to figure out which AZ was which so you could be sure you were mapping to the same AZs in each account.
I built a whole methodology for mapping this out across dozens of AWS accounts, and built lookup tables for our internal infrastructure… and then AWS added the zone ID to AZ metadata so that we could just look it up directly instead.
TGW is... twice as expensive as vpc peering?
But unlike peering TGW traffic flows through an additional compute layer so it has additional cost.
They actually used to have the upstream docs in GitHub, and that was super nice for giving permalinks but also building the docs locally in a non-pdf-single-file setup. Pour one out, I guess
As of when? According to internal support, this is still required as of 1.5 years ago.
The auto partitioning is different. It can isolate hot prefixes on its own and can intelligently pick the partition points. Problem is the process is slow and you can be throttled for more than a day before it kicks in.
Everything you know is wrong.
Weird Al. https://www.youtube.com/watch?v=W8tRDv9fZ_c
Firesign Theatre. https://www.youtube.com/watch?v=dAcHfymgh4Y
Ideally it should be a stream of important updates that can be interactively filtered by time-range. For example, if I have not been actively consuming AWS updates firehose for last 18 months, I should be able to "summarize" that length of updates.
Why this is not already a feature of the "What's New" section of AWS and other platforms -- I don't know. Waiting to be built -- either by the OEM or by the community.
From my understanding, I don't think this is completely accurate. But, to be fair, AWS doesn't really document this very well.
From my (informal) conversations with AWS engineers a few months ago, it works approximately like this (modulo some details I'm sure the engineers didn't really want to share):
S3 requests scale based on something called a 'partition'. Partitions form automatically based on the smallest common prefixes among objects in your bucket, and how many requests objects with that prefix receive. And the bucket starts out with a single partition.
So as an example, if you have a bucket with objects "2025-08-20/foo.txt" and "2025-08-19/foo.txt", the smallest common prefix is "2" (or maybe it considers the root as the generator partition, I don't actually know). (As a reminder, a / in an object key has no special significance in S3 -- it's just another character. There are no "sub-directories"). Therefore a partition forms based on that prefix. You start with a single partition.
Now if the object "2025-08-20/foo.txt" suddenly receives a ton of requests, what you'll see happen is S3 throttle those requests for approximately 30-60 minutes. That's the amount of time it takes for a new partition to form. In this case, the smallest common prefix for "2025-08-20/foo.txt" is "2025-08-2". So a 2nd partition forms for that prefix. (Again, the details here may not be fully accurate, but this is the example conveyed to me). Once the partition forms, you're good to go.
But the key issue here with the above situation is you have to wait for that warm up time. So if you have some workload generating or reading a ton of small objects, that workload may get throttled for a non-trivial amount of time until partitions can form. If the workload is sensitive to multi-minute latency, then that's basically an outage condition.
The way around this is that you can submit an AWS support ticket and have them pre-generate partitions for you before your workload actually goes live. Or you could simulate load to generate the partitions. But obviously, neither of these is ideal. Ideally, you should just really not try and store billions of tiny objects and expect unlimited scalability and no latency. For example, you could use some kind of caching layer in front of S3.
Hit it when building an iceberg Lakehouse using pre existing data. Using object prefixes fixed the issue.
Ex: your prefix is /id=12345. S3, under the hood, generates partitions named `/id=` and `/id=1`. Now, your id rolls over to `/id=20000`. All read/write activity on `/id=2xxxx` falls back to the original partition. Now, on rollover, you end up with read contention.
For any high-throughput workloads with unevenly distributed reads, you are best off using some element of randomness, or some evenly distributed partition key, at the root of your path.
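A small Python sketch of that idea: derive a short hash from the tenant/object ID and put it at the front of the key, so load spreads across many prefixes instead of piling onto one. The exact scheme (number of shards, hash choice) is illustrative, not a recommendation:

```python
import hashlib

def shard_key(customer_id: str, filename: str, shards: int = 16) -> str:
    # Hash a stable part of the key and keep a small shard number in front,
    # so objects for different customers land under different prefixes.
    shard = int(hashlib.md5(customer_id.encode()).hexdigest(), 16) % shards
    return f"{shard:02x}/{customer_id}/{filename}"

print(shard_key("customer1", "invoice.pdf"))  # something like "07/customer1/invoice.pdf"
print(shard_key("customer2", "invoice.pdf"))  # most likely a different shard prefix
```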
Turns out there're many incorrect implementations of Happy Eyeballs that cancel the ipv4 connection attempts after the timeout, and then switch to trying the AAAA records and subsequently throwing a "Cannot reach host" error. For context, in Happy Eyeballs you're supposed to continue trying both network families in parallel.
This only impacts our customers who live far away from the region they're accessing, however, and there's usually a workaround - in Node you can force the network family to be v4 for instance
No. They break existing customer expectations.
There are heaps of dualstack API endpoints https://docs.aws.amazon.com/general/latest/gr/rande.html#dua... if that's what the client wants.
The reason the amazonaws.com domain endpoints did not introduce IPv6/AAAA directly is (mostly) access control. For better or worse there are a lot of "v4 centric" IAM statements, like aws:SourceIp, in identity/resource/bucket policies. Introducing a new v6 value is going to break all of those existing policies with either unexpected DENYs or, worse, ALLOWs. That's a pretty poor customer experience: unexpectedly breaking your existing infrastructure or compromising your access control intentions.
AWS _could_ have audited every potential IAM policy and run a MASSIVE outreach campaign, but something as simple as increasing (opaque!) instance ID length was a multi year effort. And introducing backwards compatibility on a _per policy_ basis is its own infinite security & UX yak shaving exercise as well.
So that's why you have opt-in usage of v6/dualstack in the client/SDK/endpoint name.
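To make the aws:SourceIp point concrete, here is a made-up statement of the kind that would break. The Deny matches any caller whose source address falls outside an IPv4 CIDR, so if the same endpoint suddenly answered over IPv6, legitimate callers arriving with v6 addresses would stop matching the range and get denied:

```python
# A hypothetical v4-only bucket policy statement (names and CIDR are placeholders).
# An IPv6 source address never matches the v4 CIDR, so NotIpAddress evaluates
# true and the Deny fires: the "unexpected DENY" case described above.
statement = {
    "Sid": "DenyOutsideOfficeRange",
    "Effect": "Deny",
    "Principal": "*",
    "Action": "s3:*",
    "Resource": "arn:aws:s3:::example-bucket/*",
    "Condition": {
        "NotIpAddress": {"aws:SourceIp": "203.0.113.0/24"}
    },
}
```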
Does anyone have experience running Spot in 2025? If you were to start over, would you keep using Spot?
- I observe with pricing that Spot is cheaper
- I am running on three different architectures, which should limit Spot unavailability
- I've been running about 50 Spot EC2 instances for a month without issue. I'm debating turning it on for many more instances
1. Spot with autoscaling to adjust to demand and a savings plan that covers the ~75th percentile scale
2. On-demand with RIs (RIs will definitely die some day)
3. On-demand with savings-plans (More flexible but more expensive than RIs)
4. Spot
5. On-demand
I definitely recommend spot instances. If you're greenfielding a new service and you're not tied to AWS, some other providers have hilariously cheap spot markets - see http://spot.rackspace.com/. If you're using AWS, auto-scaling spot with savings plans is definitely the way to go. If you're using Kubernetes, the AWS Karpenter project (https://karpenter.sh/) has mechanisms for determining the cheapest spot price among a set of requirements.
Overall tho, in my experience, ec2 is always pretty far down the list of AWS costs. S3, RDS, Redshift, etc wind up being a bigger bill in almost all past-early-stage startups.
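For reference, requesting spot capacity directly looks roughly like this with boto3 (the auto-scaling and Karpenter routes mentioned above wrap the same market options; AMI and instance type are placeholders):

```python
import boto3

ec2 = boto3.client("ec2")
ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder AMI
    InstanceType="m5.large",
    MinCount=1,
    MaxCount=1,
    InstanceMarketOptions={
        "MarketType": "spot",
        "SpotOptions": {"SpotInstanceType": "one-time"},  # no bid; you pay the current spot price
    },
)
```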
Oh yeah, we were in the same row!
Wouldn't this always depend on the length of the queue to access the robotic tape library? Once your tape is loaded it should move really quickly:
https://www.ibm.com/docs/en/ts4500-tape-library?topic=perfor...
Your assumption holds if they still use tape. But this paragraph hints at it not being tape anymore. The eternal battle between tape and drive backup takes another turn.
The disks for Glacier cost $0 because you already have them.
Last I was aware, flash/NVMe storage didn't have quite the same problem, due to orders of magnitude improved access times and parallelism. But you can combine the two in a kind of distributed reimplementation of access tiering (behind a single consistent API or block interface).
But then what do you do with the other half of the disk? If you access it when the machine isn’t dormant you lose most of these benefits.
For deep storage you have two problems. Time to access the files, and resources to locate the files. In a distributed file store there’s the potential for chatty access or large memory footprints for directory structures. You might need an elaborate system to locate file 54325 if you’re doing some consistent hashing thing, but the customer has no clue what 54325 is. They want the birthday party video. So they still need a directory structure even if you can avoid it.
[1] https://www.aboutamazon.com/news/aws/aws-data-center-inside
For storage especially we now build enough redundancy into systems that we don't have to jump on every fault. That reduces the chance of human error when trying to address it, and of pushing the hardware harder during recovery (resilvering, catching up in a distributed consensus system, etc.).
When the entire box gets taken out of the rack due to hitting max faults, then you can piece out the machine and recycle parts that are still good.
You could in theory ship them all off to the backend of nowhere, but it seems that Glacier is in all the places where AWS data centers are, so it's not that. But Glacier being durable storage, with a low expectation of data out versus data in, they could and probably are cutting the aggregate bandwidth to the bone.
How good do your power backups have to be to power a pure Glacier server room? Can you use much cheaper in-rack switches? Can you use old in-rack switches from the m5i era?
Also most of the use cases they mention involve linear reads, which has its own recipe book for optimization. Including caching just enough of each file on fast media to hide the slow lookup time for the rest of the stream.
Little's Law would absolutely kill you in any other context but we are linear write, orders of magnitude fewer reads here. You have hardware sitting around waiting for a request. "Orders of magnitude" is the space where interesting solutions can live.
Is tape even cost competitive anymore? The market would be tiny.
I just started working with a vendor who has a service behind API Gateway. It is a bit slow(!) and times out at 30 seconds. I've since modified my requests to chunk subsets of the whole dataset, to keep things under the timeout.
Has this changed? Is 30 secs the new or the old timeout?
Not strictly true.
Please see the documentation: https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimi...
This 2024 re:Invent session "Optimizing storage performance with Amazon S3 (STG328)" which goes very deep on the subject: https://www.youtube.com/watch?v=2DSVjJTRsz8
And this blog that discusses Iceberg's new base-2 hash file layout which helps optimize request scaling performance of large-scale Iceberg workloads running on S3: https://aws.amazon.com/blogs/storage/how-amazon-ads-uses-ice...
"If you want to partition your data even better, you can introduce some randomness in your key names": https://youtu.be/2DSVjJTRsz8?t=2206
FWIW, the optimal way we were told to partition our data was this: 010111/some/file.jpg.
Where `010111/` is a random binary string, which will please both the automatic partitioning (503s => partition) and any manual partitioning you could ask AWS for. "Please" as in the cardinality of partitions grows more slowly with each character than with prefixes like `az9trm/`.
We were told that the latter version makes manual partitioning a challenge because as soon as you reach two characters you've already created 36x36 partitions (1,296).
The issue with that: your keys are no longer meaningful if you're relying on S3 to have "folders" by tenant, for example (customer1/..).
If key prefixes don’t matter much any more, then it’s a very recent change that I’ve missed.
S3 will automatically do this over time now, but I think there are/were edge cases still. I definitely hit one and experienced throttling at peak load until we made the change.
But I don’t know what conversations did or did not happen behind the scenes.
Not true for GPU instances: they're stuck for 5 minutes in a stopping state because they run some GPU health checks.
When was this changed? I think this is still an issue; I've had some such errors quite recently.
> DynamoDB now supports empty values for non-key String and Binary attributes in DynamoDB tables. Empty value support gives you greater flexibility to use attributes for a broader set of use cases without having to transform such attributes before sending them to DynamoDB. List, Map, and Set data types also support empty String and Binary values.
https://docs.aws.amazon.com/amazondynamodb/latest/developerg...
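A tiny boto3 sketch of what that change allows (table and attribute names are placeholders):

```python
import boto3

# Empty strings are now accepted for non-key attributes, so no sentinel value
# or attribute-stripping is needed before the write.
table = boto3.resource("dynamodb").Table("example-table")
table.put_item(Item={"pk": "user#123", "nickname": ""})  # would have been rejected before
```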
My recent interactions with them would probably have been better if they were an LLM.
Perhaps they trained the LLM using that data though.
(Small customer though: yearly AWS spend around 80k. Support is 10% of that.)
Then you can self-cloud. Several startups are in this space. It gets you the best of both worlds: scaling, freedom, cost control.
And no marketing jargon that you need to learn, and then unlearn!
That's a reasonable approach, but the fact this post exists shows that this practice is a reputational risk. By all means do this if you think it's the right thing to do, but be aware that first impressions matter and will stick for a long time.