DjVu and its connection to Deep Learning (2023)
43 points by tosh 8 hours ago | 5 comments | scottlocklin.wordpress.com | HN
qingcharles
21 minutes ago
Ironically, because of poor software support and lack of knowledge about the format, most DjVus are slowly being converted to PDFs.
joecool1029
5 minutes ago
Really hate that archive abandoned it. DjVu files are much smaller, faster, and higher quality than PDF. Real reason for abandoning it was probably to allow for the DRM needed for controlled access lending, because it’s a garbage choice otherwise.
stared
1 hour ago
Oh, my favourite format during my undergraduate time! Most books in mathematics and physics (some old and niche) were available in the "Russian library".

At the same time, I haven't yet seen DjVu used in a legit way.

cxr
1 hour ago
Licensing concerns resulted in DjVu being originally preferred over PDF by archive.org and WMF projects like Wikipedia. With baseline PDF now being unencumbered and the widespread existence of FOSS readers, PDF is both the de jure and de facto standard across even those sites.
qdotme
3 hours ago
Another reason why I think it failed (TIL Yann LeCun was a coauthor) is its association with the pirate books/articles community.

When I came across this format in my college days, while handling lots of scanned material, it always triggered the mental “don’t install suspicious software” block. Which is a shame since, as the article points out, it was the superior format.

nico_h
3 hours ago
I don’t know how relevant the samples are, but while the details are lost, the essence seems well preserved. It seems like it would be really useful as input for OCR.