Every Byte Matters (fzakaria.com)
61 points | 1 hour ago | 8 comments
noelwelsh
56 minutes ago
The JVM is currently pretty bad for memory allocation. Every object (i.e. not a primitive) has a header that IIRC is 12 bytes. But there is good news in JVM land: this will be reduced to 8 bytes in the next JVM release, and Project Valhalla will give the tools to do away with headers entirely in some cases. Project Valhalla also has tools to manage off-heap memory, which is important in many cases.

The JVM is an odd place where it requires too much heap to compete with the AOT compiled languages, but its startup time is too slow compared to interpreted languages. I think these enhancements are essential to keep the platform relevant.

pron
28 minutes ago
> Every object (i.e. not a primitive) has a header that IIRC is 12 bytes. But there is good news in JVM land: this will be reduced to 8 bytes in the next JVM release

Since JDK 25 it's already 64 bits with the `-XX:+UseCompactObjectHeaders` flag [1], but in JDK 27 it will be the default [2].
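
To make the 12-byte vs 8-byte difference concrete, here is a rough back-of-the-envelope sketch. It assumes HotSpot's default 8-byte object alignment and ignores field packing details; `HeaderMath` and `paddedSize` are made-up names for illustration, not a JDK API.

```java
// Back-of-the-envelope object size estimate (hypothetical helper, not a JDK API).
// Assumes 8-byte object alignment, which is HotSpot's default.
final class HeaderMath {
    static final int ALIGNMENT = 8;

    // Header plus field bytes, rounded up to the next multiple of ALIGNMENT.
    static long paddedSize(int headerBytes, int fieldBytes) {
        long raw = (long) headerBytes + fieldBytes;
        return (raw + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    public static void main(String[] args) {
        int[] fieldSizes = {4, 8, 12}; // e.g. one int, one long, three ints
        for (int fieldBytes : fieldSizes) {
            System.out.printf("fields=%2d B: 12 B header -> %d B, 8 B header -> %d B%n",
                    fieldBytes, paddedSize(12, fieldBytes), paddedSize(8, fieldBytes));
        }
    }
}
```

Under these assumptions the saving depends on where the fields land relative to the alignment boundary: an object holding a single long drops from 24 to 16 bytes, while one holding a single int stays at 16 bytes either way.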

> where it requires too much heap to compete with the AOT compiled languages

Not to compete but to beat, and not too much, but the right amount. Low level languages are optimised for control, not performance (that control translates to better performance in smaller programs, and to worse performance in larger programs), and their particular constraints prevent them from enjoying certain important optimisations, especially those offered by JIT compilation and moving collectors, which remove some overheads that AOT compilers and free-list allocators incur. Their memory management is forced (by their constraints) to optimise for footprint rather than speed.

There are common misunderstandings about memory management and about why moving collectors were created: to reduce the CPU overheads of malloc/free, especially in large programs, in exchange for what is effectively free RAM. This is why moving collectors are chosen by the languages that are unconstrained enough to use them and have the resources to implement them (Java, .NET, V8). With the exception of Zig (and even there it requires some effort), it's hard for low level languages to use the basic optimisation that's behind moving collectors. I gave a talk about how moving collectors optimise memory management at the last JavaOne, and it should be available on YouTube soonish [3].
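
A common textbook simplification of why extra RAM buys back CPU with a moving collector (not necessarily the argument made in the talk): each collection copies only the live data, and the program then gets to allocate into the rest of the heap before the next collection is needed. The sketch below computes that ratio. `GcCostModel` and its method are hypothetical names, and the model ignores generations, barriers, and everything else a real collector does.

```java
// Toy cost model for a copying collector (illustrative assumption, not a JVM API).
final class GcCostModel {
    // Bytes copied by the collector per byte allocated by the program, in a model
    // where each cycle copies liveBytes and frees up heapBytes - liveBytes.
    static double copiedPerAllocatedByte(double liveBytes, double heapBytes) {
        if (heapBytes <= liveBytes) {
            throw new IllegalArgumentException("heap must be larger than the live set");
        }
        return liveBytes / (heapBytes - liveBytes);
    }

    public static void main(String[] args) {
        double liveMb = 100;                    // assumed live set
        double[] heapsMb = {200, 400, 1100};    // assumed heap sizes
        for (double heapMb : heapsMb) {
            System.out.printf("heap=%.0f MB -> %.2f bytes copied per byte allocated%n",
                    heapMb, copiedPerAllocatedByte(liveMb, heapMb));
        }
    }
}
```

In this model, growing the heap from 2x to 11x the live set cuts the copying work per allocated byte from 1.0 to 0.1, which is the RAM-for-CPU trade in its simplest form.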

> but its startup time is too slow compared to interpreted languages

That hasn't been the case for some time. You are right, though, that startup/warmup time is worse than in AOT compiled languages, and that is the tradeoff of optimising JITs: reduce the overheads associated with AOT compilation in large programs in exchange for warmup.

Both startup and warmup have already been improved thanks to Project Leyden's "AOT cache" [4], but it will never be as low as C.

In general, the tradeoff is between optimisations that help large programs vs optimisations that help small programs.

[1]: https://openjdk.org/jeps/519

[2]: https://openjdk.org/jeps/534

[3]: I can't reproduce the full talk (which goes into the maths of memory management) here, but what happened with moving collectors was that until very recently (open source low-latency moving collectors are newer than ChatGPT), they required pauses and so weren't suitable for programs requiring low latencies. As a result, many developers either forgot or never learnt just how incredibly efficient moving collectors are. But the key is that because accessing RAM by necessity requires CPU, using CPU effectively captures RAM even if it's not used by the program. Bringing the CPU and RAM usage into a good balance is more efficient than trying to minimise one or the other. This is also the reason why hardware (physical or virtual) is packaged within a very narrow band of RAM/core ratio.

[4]: https://www.youtube.com/watch

kakacik
39 minutes ago
Most real-world use of the Java platform has next to zero concerns like those. Some more niche use cases may benefit, which is good, but the overall success map isn't changing anytime soon. The reasons for its long-term success lie elsewhere.
FartyMcFarter
10 minutes ago
Android Java apps' memory consumption is definitely a relevant concern.
forinti
1 hour ago
So if you need speed, you just have to swallow your OO programmer's pride and put your data in arrays.
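
For readers who haven't seen the pattern, a minimal sketch of the "data in arrays" layout in Java follows. `Particle` and `ParticlesSoA` are hypothetical names chosen for illustration; the point is only that a loop over one field reads a single contiguous primitive array instead of chasing a reference (and paying a header) per element.

```java
// Array-of-structs: each element is a separate heap object with its own header.
final class Particle {
    double x, y, mass;
}

// Struct-of-arrays: one primitive array per field, no per-element objects.
final class ParticlesSoA {
    final double[] x;
    final double[] y;
    final double[] mass;

    ParticlesSoA(int n) {
        x = new double[n];
        y = new double[n];
        mass = new double[n];
    }

    // Touches only the mass array, so the loop streams through contiguous memory.
    double totalMass() {
        double sum = 0;
        for (double m : mass) {
            sum += m;
        }
        return sum;
    }

    public static void main(String[] args) {
        ParticlesSoA p = new ParticlesSoA(3);
        p.mass[0] = 1.0;
        p.mass[1] = 2.0;
        p.mass[2] = 3.5;
        System.out.println(p.totalMass());
    }
}
```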
theandrewbailey
1 hour ago
Maybe someone can write an OO language where arrays of structs are automatically stored as structs of arrays.

mild /s

tlb
6 minutes ago
There's a package to do this in Julia: https://juliaarrays.github.io/StructArrays.jl/stable/
fp64
15 minutes ago
Odin has some helpers for this. It was one of the more interesting features I found, though I never tried it. Not sure if you want to consider Odin OO, but well: https://odin-lang.org/docs/overview/#soa-struct-arrays
Mizza
52 minutes ago
Are you talking about Zig's MultiArrayList?
alex7o
16 minutes ago
He is talking about Jai, the programming language from Jonathan Blow, which is quite cool, but there is no way to access it.
pron
46 minutes ago
> The cost of each new field is rarely considered

Most developers, in Java and in most other languages, do not consider the cost of every field, but I can tell you that people who need micro-optimisations certainly do care, and in Java's standard library, layout is very much a concern (except, as always, you want to optimise what really matters; there's no point in optimising something that is unlikely to be a hot spot in a real program). Sometimes, though, you want to intentionally spread out the layout to avoid cache line sharing when concurrency is involved. You will find such examples in the standard library, too.
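
As an illustration of intentionally spreading out a layout, here is a sketch of the classic manual-padding trick. The class names are hypothetical, and a 64-byte cache line is assumed. Field layout is ultimately up to the JVM, so this is best effort rather than a guarantee; the JDK itself uses the internal `@Contended` annotation for this purpose.

```java
// Padding before and after the hot field, using a class hierarchy so the
// padding fields are less likely to be reordered around it by the JVM.
class PadBefore { long p1, p2, p3, p4, p5, p6, p7; }

class CounterValue extends PadBefore { volatile long value; }

final class PaddedCounter extends CounterValue {
    long q1, q2, q3, q4, q5, q6, q7;

    // Single-writer only: ++ on a volatile field is not atomic.
    void increment() { value++; }

    long get() { return value; }
}

final class FalseSharingDemo {
    // Each thread writes only its own counter; the padding aims to keep the
    // two hot fields off the same (assumed 64-byte) cache line.
    static long[] run(int iterations) {
        PaddedCounter a = new PaddedCounter();
        PaddedCounter b = new PaddedCounter();
        Thread ta = new Thread(() -> { for (int i = 0; i < iterations; i++) a.increment(); });
        Thread tb = new Thread(() -> { for (int i = 0; i < iterations; i++) b.increment(); });
        ta.start();
        tb.start();
        try {
            ta.join();
            tb.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] { a.get(), b.get() };
    }

    public static void main(String[] args) {
        long[] totals = run(1_000_000);
        System.out.println(totals[0] + " " + totals[1]);
    }
}
```

The trade is deliberate: each counter costs over a hundred bytes instead of a couple of dozen, in exchange for two writer threads not invalidating each other's cache line.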

ssiddharth
54 minutes ago
Slight tangent, but every ms, μs, and ns counts too. We've gotten awfully carefree with response times and wasted compute cycles.
coldcity_again
1 hour ago
I love to see stuff like this. As an active Vectrex gamedev and PC/Amiga sizecoder, I strongly agree with the sentiment!
RickJWagner
22 minutes ago
That’s a great read. I wish more people wrote like that.
yas_hmaheshwari
47 minutes ago
Off topic: when I read fzakaria, I thought I'd be reading an article about the Iran war or some geopolitical news :-)
coolThingsFirst
23 minutes ago
Why doesn't the machine fill up the other cache lines as well? Why is it only 64 bytes and then a miss?