I found that Node 22 had ~50ms slower cold starts than Node 20. This is because the AWS JavaScript SDK v3 loads the HTTP request library, which became much heavier in Node 22. This happens on the newly released Node 24 as well.
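A rough way to see how much of that gap is just module load is to time the SDK import in isolation. A minimal sketch, assuming "@aws-sdk/client-dynamodb" is installed and the file runs as an ES module (file name is hypothetical):

```ts
// cold-import.ts: time the SDK import on its own under each Node version.
const t0 = performance.now();
const { DynamoDBClient } = await import("@aws-sdk/client-dynamodb");
const t1 = performance.now();
console.log(`import @aws-sdk/client-dynamodb: ${(t1 - t0).toFixed(1)} ms`);
// Reference the import so nothing gets tree-shaken away when bundling.
console.log(typeof DynamoDBClient);
// Re-run under Node 20, 22, and 24 to compare module load cost.
```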
I recommend that if you are trying to benchmark cold starts on Lambda, you measure latency from a client as well. The Init Duration in the logs doesn't include things like decrypting environment variables (which adds ~20ms) or other overhead like pulling the function code from S3. The impact of this shows up when comparing runtimes like LLRT to Node: the Init Duration is faster than Node's, but the E2E time from the client is actually closer, because the LLRT bundle size is 4-5MB larger than Node's.
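A minimal client-side sketch of that measurement, assuming a Lambda function URL (the URL below is a placeholder, not a real endpoint):

```ts
// Point this at your own Lambda function URL.
const FUNCTION_URL = "https://example.lambda-url.us-east-1.on.aws/";

const t0 = performance.now();
const res = await fetch(FUNCTION_URL);
await res.text(); // read the full body so the timing covers the whole response
const t1 = performance.now();
console.log(`status ${res.status}, E2E ${(t1 - t0).toFixed(1)} ms`);
// Compare this against Init Duration + Duration reported in CloudWatch.
```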
For example, the "light" test will do calls to "dynamodb". Maybe you benchmark python, or you benchmark the aws sdk implementation with different versions of python, or just you benchmark average dynamodb latencies at different times of the day, ...
And, at least for Python, his CPU and memory test code looks especially narrow. For example, if you use sha256 or anything else from hashlib, you are not really benchmarking Python but the C implementation behind hashlib, which may depend on the crypto or GCC libraries, or on the optimizations AWS used when building the Python binaries.
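The same caveat applies on Node, where node:crypto hands the hashing to OpenSSL. A toy illustration (numbers are machine-dependent; this is a sketch, not a proper benchmark):

```ts
// Both loops run "in Node", but the first spends almost all of its time in
// OpenSSL's native SHA-256 via node:crypto, while the second exercises the
// JS engine itself. They measure very different things.
import { createHash } from "node:crypto";

const payload = Buffer.alloc(1 << 20); // 1 MiB of zeros

let t0 = performance.now();
for (let i = 0; i < 100; i++) createHash("sha256").update(payload).digest();
console.log(`native SHA-256 : ${(performance.now() - t0).toFixed(1)} ms`);

t0 = performance.now();
let acc = 0;
for (let i = 0; i < 10_000_000; i++) acc = (acc * 31 + i) | 0; // pure-JS arithmetic
console.log(`pure-JS loop   : ${(performance.now() - t0).toFixed(1)} ms (acc=${acc})`);
```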
And for that use case, Python seems almost as good as Rust, which is surprising to me, as is the fact that Node runs so slow - I have a ton of Node-based lambdas in prod, and while they're no speed demons, I'm quite surprised how bad it is compared to even interpreted things like Python.
AWS should really look into this and fix it, or at least tell us why it's so slow.
- How is Rust only one order of magnitude faster than Python?
- How is Python that much faster than Node.js?
So I looked at the benchmark repo.
These benchmarks mean nothing folks!
Each of these benchmarks is just a SHA256 hash. This is NOT a valid way to compare CPUs, unless the only thing you will ever do with the CPU is execute SHA256 hashes.
Hash functions are not representative of the performance of:
- Compression or decompression (of text, video, or anything really)
- Parsing
- Business logic (which is often dominated by pointer chasing; see the sketch below)
So, you can safely ignore the claims of this post. They mean nothing.
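To make the pointer-chasing point concrete, here is a toy workload (a linked list traversed in shuffled memory order, purely illustrative). It stresses cache misses and object access rather than a native hash routine, so it can rank runtimes very differently from a SHA256 loop:

```ts
// Toy pointer-chasing workload: nodes are linked in shuffled order, so
// traversal jumps around the heap instead of streaming through one buffer
// the way a hash does.
interface ListNode { next: ListNode | null; value: number }

const N = 1_000_000;
const nodes: ListNode[] = Array.from({ length: N }, (_, i) => ({ next: null, value: i }));
// Fisher-Yates shuffle, then link the nodes in the shuffled order.
for (let i = N - 1; i > 0; i--) {
  const j = Math.floor(Math.random() * (i + 1));
  [nodes[i], nodes[j]] = [nodes[j], nodes[i]];
}
for (let i = 0; i < N - 1; i++) nodes[i].next = nodes[i + 1];

const t0 = performance.now();
let sum = 0;
for (let n: ListNode | null = nodes[0]; n; n = n.next) sum += n.value;
console.log(`pointer chase: ${(performance.now() - t0).toFixed(1)} ms (sum=${sum})`);
```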
In terms of perf, Node has a pretty snappy JIT and seems to perform OK in the browser, so something doesn't seem right here.
~200ms requests, even in the light case, are barely acceptable.
On the other hand, Python performs very well, but it's alarming to see it get significantly slower with every release.
This is in line with my experience using anything written in Node.js.
In my experience, Node perf is 'okay' - not stellar, but a simple Express.js handler certainly doesn't take 100ms. This sounds 10x-100x slower than running something similar on a dedicated instance.
I dunno why a Python impl would be particularly heavy.
Source: I'm a former AWS employee.
https://aws.amazon.com/blogs/compute/aws-lambda-standardizes...
[0]: https://neosmart.net/blog/will-amds-ryzen-finally-bring-sha-...
AES-NI is x86-specific terminology. It was proposed in 2008. SHA acceleration came later, announced in 2013. The original version covers only SHA-1 and SHA-256 acceleration, but a later extension adds SHA-512 acceleration. At least for x86, AES-NI does not imply SHA support. For example, Westmere, Sandy Bridge, and Ivy Bridge chips from Intel have AES-NI but not SHA.
The equivalent in Arm land is called "Cryptographic Extensions" and was a non-mandatory part of ARMv8 announced in 2011. Both AES and SHA acceleration were announced at the same time. While part of the same extensions, there are separate feature flags for each of AES, SHA-1, and SHA-256.
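If you want to see which of these extensions the host you actually land on advertises, a Linux-only sketch that checks /proc/cpuinfo (flag names are the kernel's: "aes" and "sha_ni" on x86; "aes", "sha1", and "sha2" on aarch64):

```ts
// Linux-only: report which crypto-related CPU flags the kernel exposes.
import { readFileSync } from "node:fs";

const cpuinfo = readFileSync("/proc/cpuinfo", "utf8");
for (const flag of ["aes", "sha_ni", "sha1", "sha2"]) {
  const present = new RegExp(`\\b${flag}\\b`).test(cpuinfo);
  console.log(`${flag.padEnd(7)} ${present ? "yes" : "no"}`);
}
```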
Then I deployed an ECS task with an ALB and got something like <5ms.
Has anybody gotten sub-10ms latencies with Lambda HTTPS functions?
Tangent: when you have a Lambda returning binary data, it’s pretty painful to have to encode it as base64 just so it can be serialized to JSON for the runtime. To add insult to injury, the base64 encoding is much more likely to put you over the response size limits (6MB normally, 1MB via ALB). The C++ Lambda runtime (and maybe Rust?) lets you return non-JSON and do whatever you want, as it’s just POSTing to an endpoint within the Lambda. So you can return a binary payload and just have your client handle the blob.
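On the Node runtime, the usual workaround is the proxy-integration response shape below: base64-encode the body and set isBase64Encoded so API Gateway or the ALB decodes it on the way out. Sketch only; types come from @types/aws-lambda and the bytes are a stand-in for a real image.

```ts
import type { APIGatewayProxyResult } from "aws-lambda";

export const handler = async (): Promise<APIGatewayProxyResult> => {
  const binary = Buffer.from([0x89, 0x50, 0x4e, 0x47]); // pretend PNG bytes
  return {
    statusCode: 200,
    headers: { "Content-Type": "image/png" },
    body: binary.toString("base64"),
    isBase64Encoded: true, // the ~33% inflation counts against the 6MB / 1MB limits
  };
};
```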
If I want to use TypeScript for functions, I should target the V8-based runtimes (Deno, Cloudflare, Supabase, etc.), which are much faster because they are much more lightweight.
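For comparison, a handler on one of those runtimes is about this small (Deno shown; Cloudflare Workers and Supabase Edge Functions use a similar fetch-handler shape; sketch only, not a benchmark):

```ts
// Deno.serve is Deno's built-in HTTP server.
Deno.serve((req: Request): Response => {
  const path = new URL(req.url).pathname;
  return new Response(JSON.stringify({ path }), {
    headers: { "content-type": "application/json" },
  });
});
```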
You can scale up by swapping in a bigger instance - it's surprising how far you can get with this strategy.
AFAIK ARM64 is around 20% cheaper, not sure where you got the 50%.
However, if you do manage to boot, hardware with open-source drivers should just work; for example, Jeff Geerling has a couple of videos on YouTube about running his RPi with external AMD graphics cards connected via PCIe, and it works.
(My work laptop is one of the few ARM laptops: a ThinkPad T14s with a Qualcomm Snapdragon X Elite.)
Battery is good, hardware is really rock solid (though I dislike the new plastic for the keyboard).
Really can’t complain, it’s nearly as good as my MacBook.
It runs Windows 11 today, and everything I need runs fine (JetBrains, rustc, clang, MSVC, Terraform, and of course Python).
I’m a technical CTO with infrastructure background, most of my time is spent in spreadsheets these days unfortunately.
Not sure why this is phrased like that in the TL;DR, when ARM64 is just strictly faster for the same Node.js workload and version.