Sample what you can't compress; image auto-encoders without GANs
17 points
1 day ago
| 1 comment
| arxiv.org
| HN
vighneshb
1 day ago
Hi

In our latest paper we show that the GAN loss used by almost all latent diffusion models to train their autoencoders is not required, and can instead be replaced with a diffusion loss. Our auto-encoder is trained end-to-end and achieves higher compression and better generation quality.
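To make the idea concrete, here is a toy sketch (my own illustration, not the paper's implementation) of what swapping the adversarial term for a diffusion loss can look like: instead of scoring reconstructions with a discriminator, a small denoiser conditioned on the latent code is trained to predict the noise added to the image (epsilon-prediction). All names, shapes, and the linear encoder/denoiser are hypothetical simplifications.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "images" as flat vectors; tiny linear encoder and denoiser (hypothetical).
D, LATENT = 16, 4
W_enc = rng.normal(scale=0.1, size=(D, LATENT))       # encoder: image -> latent
W_den = rng.normal(scale=0.1, size=(LATENT + D, D))   # denoiser: [latent, noisy image] -> noise estimate

def encode(x):
    return x @ W_enc

def predict_noise(z, x_noisy):
    # Denoiser conditioned on the latent code from the encoder.
    return np.concatenate([z, x_noisy], axis=-1) @ W_den

def diffusion_loss(x, t=0.5):
    # Corrupt the image with Gaussian noise at level t, then train the
    # denoiser to predict that noise. This MSE term stands in for the
    # GAN discriminator loss in the toy setup.
    eps = rng.normal(size=x.shape)
    x_noisy = np.sqrt(1.0 - t) * x + np.sqrt(t) * eps
    eps_hat = predict_noise(encode(x), x_noisy)
    return float(np.mean((eps_hat - eps) ** 2))

x = rng.normal(size=(8, D))       # a batch of 8 toy images
loss = diffusion_loss(x)
print(loss)
```

In a real model the denoiser would be a conditional diffusion decoder and the whole thing trained end-to-end by gradient descent; the sketch only shows the shape of the objective.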

I am excited to share it with you. Let me know what you think.

Cheers

billconan
1 day ago
I just saw https://hanlab.mit.edu/projects/hart

It seems to be another autoencoder (autoregressive) + diffusion.

vighneshb
1 day ago
This is very interesting. Unlike us (who focus on the decoder) they focus on changing the representation itself so that they can achieve better generation. Thanks for the link.
billconan
1 day ago
They use an autoencoder/autoregressive model to predict the big picture, and diffusion for the details, similar to yours.

The difference is they use discrete tokens.
