FilterHN

Docker 29 has changed its default image store for new installs

82 points

by neitsab

3 days ago

| past

| 7 comments

| docs.docker.com

| HN

▲

fabian2k

3 hours ago

[-]

> This difference is particularly noticeable with multiple images sharing the same base layers. With legacy storage drivers, shared base layers were stored once locally, and reused images that depended on them. With containerd, each image stores its own compressed version of shared layers, even though the uncompressed layers are still de-duplicated through snapshotters.

This seems like a really weird decision. If base images are duplicated for every image you have, that will add up quickly.

▲

epistasis

2 hours ago

[-]

This is hell for a lot of ML containers, that have gigabytes of CUDA and PyTorch. Before at least you could keep your code contained to a layer. But if I understand this correctly every code revision duplicates gigabytes of the same damn bloated crap.

▲

spwa4

2 hours ago

[-]

If you have problems with 13 (I believe) GB of docker layers ... how do you deal with terabytes or petabytes of AI training data?

▲

epistasis

2 hours ago

[-]

Petabytes of training data is only one application of PyTorch, which is going to use tens of thousands of containers, but...

Inference, development cycles, any of the application domains of PyTorch that don't involve training frontier models... all of those are complicated by excessive container layers.

But mostly dev really sucks with writing out an extra 10GB for a small code change.

▲

StableAlkyne

2 hours ago

[-]

You don't even need MB of training data for some ML applications. AI is the sexy thing nowadays, but neural networks (Torch is a NN library) are generally useful for even small regression and clarification problems.

For some problems you might even be able to get away with single digit numbers of training points (classic example of this regime being Physics-Informed Neural Networks)

▲

0cf8612b2e1e

44 minutes ago

[-]

You don’t train petabytes on your laptop.

▲

Normal_gaussian

2 hours ago

[-]

the training data is on a separate drive; or the training data isn't that large for this use case; or they aren't training.

▲

kodama-lens

2 hours ago

[-]

I think there is an Issue/PR right now to change this. See: https://github.com/containerd/containerd/issues/13307

▲

epistasis

1 hour ago

[-]

Oh, very glad to see this, ML applications that were mentioned in it are exactly why I was thinking this was such a disastrous change.

However, the tedium of the reply chain reminds me why I tend to focus most energy on internal projects rather than external open source...

Docker may have been built for a specific type of use case that most developers are familiar with (e.g. web apps backed by a DB container) but containerization is useful across so much of computing that are very different. Something that seems trivial in the python/DB space, having one or two different small duplicates of OS layers, is very different once you have 30 containers for different models+code, and then ~100 more dev containers lying around as build artifacts from building and pushing, and pulling, each at ~10GB, that the inefficient new system is just painful.

The smallest PyTorch container I ever built was 1.8GB, and that was just for some CPU-only inference endpoints, and that took several hours of yak shaving to achieve, and after a month or two of development it had ballooned back to 8GB. Containers with CUDA, or using significant other AI/ML libraries, get really big. YAGNI is a great principle for your own code when writing from scratch, but YAGNI is a bit dangerous when there's been an entire ecosystem built on your product and things are getting rewritten from scratch, because the "you" is far larger than the developer making the change. Docker's core feature has always been reusable and composable layers, so seeing it abandoned seems that somebody took YAGNI far too extreme on their own corner of the computing world.

▲

IsTom

3 hours ago

[-]

Docker is already hogging a lot of disk space and needs to be pruned regularly. I can't imagine what's it's going to be like now.

▲

Oxodao

4 hours ago

[-]

Docker already fills up my dev machines yet they decided for this insane solution:

> The containerd image store uses more disk space than the legacy storage drivers for the same images. This is because containerd stores images in both compressed and uncompressed formats, while the legacy drivers stored only the uncompressed layers.

Why ?

▲

ElevenLathe

4 hours ago

[-]

Sounds like a straightforward time-space tradeoff: if you have the compressed layers sitting around when you need them, you can avoid the expense and time of compressing them.

▲

Filligree

3 hours ago

[-]

Why would I need the compressed layers?

▲

NewJazz

2 hours ago

[-]

Pushing

▲

cryptonym

2 hours ago

[-]

To save disk space /s

▲

colechristensen

3 hours ago

[-]

I'm not sure about the fastest macbook disk access, but even with NVMe storage I've found lz4 to be faster than the disk. That is (it's hard to say this exactly correct) compressed content gets read/written FASTER than uncompressed content because fewer bytes need to transit the disk interface and the CPU is able to compress/decompress significantly faster than data is able to go through whatever disk bus you've got.

▲

fpoling

2 hours ago

[-]

On my 2 years old ThinkPad laptop SSD is faster than lz4. On a fat EC2 server lz4 is faster. So one really has to test a particular config.

▲

colechristensen

1 hour ago

[-]

Yeah, I'm not surprised the PCIe 5.0 transfer speeds matched with top tier SSD chips win that race.

It still bothers me that the fastest most performant computer I have access to is almost always my laptop, and that by a considerable margin.

Someone should do some lz4 vs. ssd benchmarks across hardware to make my argument more solid and the boundaries clear.

▲

awestroke

55 minutes ago

[-]

But if it stores the uncompressed layers, why store the compressed ones too? Why both at the same time?

▲

freedomben

3 hours ago

[-]

did you mean the first "compressed" to be "uncompressed" ?

▲

giobox

2 hours ago

[-]

> https://docs.docker.com/reference/cli/docker/system/prune/

Just in case - I'm always amazed how many Docker users don't know about the prune command for cleaning up the caches and deleting unused container images and just slowly let their docker image cache eat their disk.

▲

johannes1234321

1 hour ago

[-]

Prune is nice, but if you have a bunch of containers which run shirt time for a build step or similar prune would collect those, too. A filter "last used a few months ago" would be useful.

▲

giobox

1 hour ago

[-]

I think you can filter on last created, but agree last used would be helpful:

  docker image prune -a --filter "until=24h"

> https://docs.docker.com/reference/cli/docker/image/prune/#fi...

▲

neitsab

3 days ago

[-]

Docker v29 (released 2025-11) switched to using containerd for its image store for new installs.

This means `/var/lib/docker` is no longer "hermetic": images and container snapshots are located in `/var/lib/containerd` now.

More info about the switch: https://www.docker.com/blog/docker-engine-version-29/

To configure this directory, see https://docs.docker.com/engine/storage/containerd/.

▲

neitsab

3 days ago

[-]

I noticed the change because I wanted to persist Docker-related data between container instantiations on IncusOS. I couldn't understand why the custom volume I had mounted on /var/lib/docker didn't contain the downloaded images.

To keep both /var/lib/{containerd,docker} in sync, I use a single ZFS dataset ("custom filesystem volume" in Incus parlance) and mount subpaths inside the container:

  incus storage volume create local docker-data
  incus config device add docker docker disk pool=local source=docker-data/docker path=/var/lib/docker
  incus config device add docker containerd disk pool=local source=docker-data/containerd path=/var/lib/containerd

There are other ways to achieve the same of course.

▲

newsoftheday

2 hours ago

[-]

The article says to regularly run prune, how regularly? Currently I run the following once per day from cron:

    docker system prune -a -f
    docker volume prune -a -f

▲

wolttam

49 minutes ago

[-]

This would depend entirely on how much churn your system is doing on containers/volumes/images. Once a day sounds really often for most situations.

"Regularly" = when you're running out of space because of a bunch of built up old stuff.

▲

arnitdo

2 hours ago

[-]

From the docs, you can just run `docker system prune -a --volumes`

Ref: https://docs.docker.com/reference/cli/docker/system/prune/

▲

0xbadcafebee

1 hour ago

[-]

It sounds like this breaks all Docker installs that use userns-remap? Are they really shipping a breaking change with no fix? In addition to bloating the disk? In addition to breaking all old systems that relied on mapping /var/lib/docker?

I can't believe Docker finally shit the bed. Time to replace Docker with Podman.... sigh

▲

DeathArrow

2 hours ago

[-]

I should start looking into Podman.

▲

mrichman

3 hours ago

[-]

Why not just use podman at this point?

▲

nitinreddy88

3 hours ago

[-]

They are adopting to containerd standard, not sure why negative sentiment

▲

mgrandl

1 hour ago

[-]

Where did you see that? I just did a deep dive into podman/quadlets/bootc/composefs and never once seen a mention of that. A google search also didn’t bring anything like that up.