On the client side, you just have to pin to an older version of the AWS SDK till whatever compatible service you're using updates, right?
Also, this is the first I've heard of OpenDAL. Looks interesting. It's had barely any discussion on HN.
Back when Microsoft started offering CosmosDB, I was working at MongoDB on the drivers team (which develops a slew of first-party client libraries for using MongoDB), and several of the more popular drivers got a huge influx of "bug" reports from users having issues connecting to CosmosDB. Our official policy was that if a user reported a bug against a third-party database, we'd make a basic attempt to reproduce it with MongoDB; if it actually was a bug in our code, it would show up there, and we'd fix it. Otherwise, we didn't spend any time trying to figure out what the issue with CosmosDB was. In terms of backwards compatibility, we already spent enough time worrying about compatibility across arbitrary versions of our own client and server software without also thinking about how changes might impact third-party databases.
In the immediate week or two after CosmosDB came out, a few people tried out the drivers they worked on to see if we could spot any differences. At least for basic stuff it seemed to work fine, but there were a couple of small oddities with specific fields in the responses during the connection handshake and stuff like that. As a joke, someone made a silly patch to their driver that checked those fields and logged something cheeky, but management was pretty clear that they had zero interest in any sort of proactive approach like that; the stance was basically that the drivers were intentionally licensed permissively, users were free to do anything they wanted with them, and it only became our business if they actually reached out to us in some way.
Just because you can get something to work doesn’t mean it’s supported. Using an Amazon S3 library on a non-Amazon service is “works but isn’t supported.”
Stick to only supported things if you want reliability.
I stopped giving amazon the benefit of the doubt about any aspect of their operations about 8 years ago.
Do these 3rd parties get veto power over a feature they can't support?
Can they delay a launch if they need more time to make their reverse-engineered effort compatible again?
It seems like a hard position to defend that this is at all Amazon's problem. The OP even links to the blog post announcing this change months ago. If users pay you for your service to remain S3-compatible, it's on you to make sure you live up to that promise, not Amazon.
Clicking through to the actual GitHub issues, it definitely seems like the maintainers of Iceberg have the right mental model here too: this is their problem to fix. After re-reading this post, it mostly feels like a click-baity way to advertise OpenDAL, which the author appears to be heavily involved in.
If your service no longer works with the AWS SDK because you crash on `headers["content-md5"]`, just because "it seemed a good way to make things more correct", it is on you to fix it, IMO.
Like, this changeset https://github.com/minio/minio/pull/20855/files#diff-be83836...
Why does MinIO mandate the presence of Content-MD5? Is it in the docs somewhere for the S3 "protocol"? No, it's not. It's someone wanting to "be extra correct with validating user input" and thus creating a subtle extra restriction on an interface they do not control.
https://aws.amazon.com/blogs/aws/introducing-default-data-in...
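To put the Postel's-law point in code: here's a minimal sketch of a lenient server-side check. The handler and its names are hypothetical (this is not MinIO's actual code); it verifies Content-MD5 when the client sends one and never fails the PUT just because the header is absent.

    import base64
    import hashlib

    def check_put_payload(headers: dict, body: bytes) -> tuple[int, str]:
        # Hypothetical helper, not MinIO's real code: returns an HTTP status
        # and an S3-style error code for an incoming PUT.
        claimed = headers.get("content-md5")
        if claimed is None:
            # Header absent: S3 treats Content-MD5 as optional, so accept.
            return 200, "OK"
        # Header present: verify it, rejecting mismatches the way S3 does.
        actual = base64.b64encode(hashlib.md5(body).digest()).decode("ascii")
        return (200, "OK") if claimed == actual else (400, "BadDigest")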
Amazon may not be actively hostile to using their SDK with third party services, but they never promised to support that use case.
(disclaimer: I work for AWS but not on the S3 team, I have no non-public knowledge of this and am speaking personally)
I do think some of the vendors did themselves an active disservice by encouraging use of the aws sdk in their documentation/examples, but again that's on the vendor, not on Amazon who is an unrelated third party in that arrangement.
I would guess that Amazon didn't have hostile intentions here, but truthfully their intentions are irrelevant, because Amazon shouldn't be part of the equation. For example, if I use Backblaze, the business relationship is between me and Backblaze. My choice to use the AWS SDK for that doesn't make Amazon part of it any more than it would if I found some random chunk of code on GitHub and used that instead.
Many customers don't like to upgrade unless they need to; it can be significant toil for them. So you do see some tail traffic in the wild that comes from SDKs released years ago. For a service as big as S3, I bet they get traffic from SDKs even older than that.
export AWS_REQUEST_CHECKSUM_CALCULATION=when_required
export AWS_RESPONSE_CHECKSUM_CALCULATION=when_required
or adding the following 2 lines to a profile in ~/.aws/config:

request_checksum_calculation=when_required
response_checksum_validation=when_required
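The same opt-out is also available programmatically. A minimal boto3 sketch, assuming botocore >= 1.36, with a placeholder endpoint URL standing in for whatever third-party service you use:

    import boto3
    from botocore.config import Config

    # Compute/validate checksums only when an operation strictly requires
    # them, instead of the new "when_supported" default.
    cfg = Config(
        request_checksum_calculation="when_required",
        response_checksum_validation="when_required",
    )

    # endpoint_url is a placeholder for your S3-compatible service.
    s3 = boto3.client("s3", endpoint_url="https://storage.example.com", config=cfg)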
Or just pin your AWS SDK to a version before the following releases:
<https://github.com/aws/aws-sdk-go-v2/blob/release-2025-01-15...>
<https://github.com/boto/boto3/issues/4392>
<https://github.com/aws/aws-cli/blob/1.37.0/CHANGELOG.rst#L19>
<https://github.com/aws/aws-cli/blob/2.23.0/CHANGELOG.rst#223...>
<https://github.com/aws/aws-sdk-java-v2/releases/tag/2.30.0>
<https://github.com/aws/aws-sdk-net/releases/tag/3.7.963.0>
<https://github.com/aws/aws-sdk-php/releases/tag/3.337.0>
and wait for your S3-compatible object store to ship a fix that supports this.
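For the Python SDK, for instance, the pin is a one-line constraint (a sketch assuming a pip-style requirements.txt; 1.36.0 is the release that changed the default):

    # requirements.txt: stay below the release that flipped the default
    boto3<1.36.0
    botocore<1.36.0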
This is wholly predictable; AWS isn't in the business of letting other companies OpenSearch them.
AWS is beholden first and foremost to their paying customers, and this is the best option for most S3 customers.
I wouldn't be happy to find out they did it /just/ to break third-party S3 providers, but it seems like it's an easy enough thing to turn off, right?
I'm just not sure how comfortable I am with the phrasing here (or maybe I'm reading too much into it).
My guess is the client has options you pass in; they added a new default (or changed one, I'm not clear on that), and the new default sends something up to the server (header/param/etc.) asking the server to send back the new checksum header. The server doesn't respond with the header, so the client errors out.
In reality, AWS are the reference S3 implementation. Every other implementation I've seen has a compatibility page somewhere stating which features they don't support. This is just another to add to the list.
This is whiny and just wrong. Best behavior by default is always the right choice for an SDK. Libraries/tools/clients/SDKs break backwards compatibility all the time. That's exactly what semver version pinning is for, and that's a fundamental feature of every dependency management system.
AWS handled this exactly right IMO. The change was introduced in Python SDK version 1.36.0, which clearly indicates breaking API changes, and their changelog also explicitly mentions the new default:
api-change:``s3``: [``botocore``] This change enhances integrity protections for new SDK requests to S3. S3 SDKs now support the CRC64NVME checksum algorithm, full object checksums for multipart S3 objects, and new default integrity protections for S3 requests.
https://github.com/boto/boto3/blob/2e2eac05ba9c67f0ab285efe5...

Not entirely sure that's how things work?
Any consumer of this software using it for its intended purpose (S3) didn't need to make any changes to their code when upgrading to this version. As an AWS customer, knowing that when I upgrade to this version my app will continue working without any changes is exactly what this semver bump communicates to me.
I believe calling this a feature release is correct.
It is also on the implementors of the "compatible" services to, for example, not require a header that can be assumed optional. If it's not "basic HTTP" (things like those checksums), don't fail the PUT when the header is missing unless you absolutely require it. Postel's law and all.
The mention in the Tigris article is strange: is boto now doing a PUT without a request Content-Length? That's not valid HTTP/1.1 unless it's using chunked transfer encoding.
Setting up a business so that all your customers fail at the same moment is a poor business practice: nobody can support all their customers breaking at once. I'm guessing competitors compete on price, not reliability.
Amazon has the incentive to break third parties, since their customers are likely to switch to Amazon. Why else use the Amazon code unless you're ready to migrate or the service is low importance?
But if your customer remains on the S3 SDK, the same reduced switching cost you enjoyed is now enjoyed by your competitors - and you have to eat the support cost when you stop being compatible with the S3 SDK (regardless of why you are no longer compatible).
Edit: I forgot: since full object checksums are now the default, the SDK can now upload the parts of a multipart upload in parallel, which was not possible before.
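Roughly, with boto3's transfer manager that looks like the sketch below; bucket and file names are made up, and it assumes boto3 >= 1.36 so the full-object checksum default applies:

    import boto3
    from boto3.s3.transfer import TransferConfig

    s3 = boto3.client("s3")

    # CRC-based full-object checksums can be combined from per-part CRCs,
    # so the transfer manager can checksum and upload parts concurrently.
    s3.upload_file(
        "backup.tar",  # hypothetical local file
        "my-bucket",   # hypothetical bucket
        "backup.tar",
        Config=TransferConfig(
            multipart_threshold=8 * 1024 * 1024,  # go multipart above 8 MiB
            max_concurrency=8,                    # parallel part uploads
        ),
    )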
Having a way for vendors to publish some level of compatibility would be a great help. E.g. Tier 1 might mean basic upload/download, Tier 2 might add storage classes and lifecycle policies, etc. Right now it's just a confusing mess of self-attestation.
I might be wrong, but I'm betting all these 3rd party clients (including open source projects) choose to be S3 compatible because a majority of their addressable market is currently using the S3 API. "Switching over to our thing doesn't require any code refactoring, just update the url in your existing project and you're good to go."
Any standard that isn't the S3 compatible API would require adopters migrate their code off the S3 API.
I get what everyone (all three sides) is saying, and I've got no love for Amazon, but this does not affect me in any way: I don't use AWS APIs for anything except the AWS web UI to bounce something or edit Route53. We mostly self-host everything[1]: Mastodon, Matrix, Nextcloud, Subsonic, LibrePhotos, bot services, PBX, VPN.
I'm a t1.micro guy and I can't stand managed services.
[1] I have some $5 VPS that is a canary, and I use Amazon Lightsail for one public website (512MB RAM, 0.5 vCPU or whatever), Glacier, and Route53. My goal for 2025 is to become proficient enough with BIND or whatever to stop paying that $5 a month to AWS for Route53 request handling. A website is one thing, but services tend to chew money on Route53 with constant requests. I don't see a need to drop Glacier (it's static, ~100GB of family photo backups for my aunt and whatnot) or Lightsail just yet.
Been really happy with aws4fetch in TypeScript (for Cloudflare R2, generating presigned URLs & sending mails via SES) after getting much frustration out of the official JS SDK.
Implying that the SDKs don't communicate directly with the APIs? This "problem" could have happened in OpenDAL just as it did in AWS SDKs.
It’s certainly not for all use cases, though.
The nuance is that Amazon updated their sdk so that the default is that the sdk no longer works for third party services that do not use the new checksum. There is a new configuration option in the sdk that either: (1) requires the checksum (and breaks s3 usage for unsupported services); or (2) only uses the checksum if the s3 service supports it.
The sdk default is (1), when this issue could have been avoided entirely had the default been (2).
Agree with all the comments that Amazon has never said or even implied that updates to their sdk would work with other s3-compatible services. However, the reality is that s3 has become a de facto standard, and (unless this is a deliberate Amazon strategy) it would be relatively easy for Amazon to pick the default that allows for, but does not require, changes to s3-compatible services, or, failing that, to loudly flag these types of changes well in advance so that they don't surprise users.
Python and cloudflare generally don't see each other much
MinIO, at least, was updated to always emit the header, so it's simply an upgrade away.
It's not on Amazon to support every edge case for every customer.
I think the issue is more that people started thinking of the AWS SDK as a generic open source library, rather than what it should be thought of as: an open source project run by a particular vendor that not only doesn't care about helping you use competitors, but actively wants to make that difficult. I would guess the truth is somewhere in the middle, but I think the healthy thing to do is to treat it like the extreme end.