FilterHN

Show HN: Veil – Dark mode PDFs without destroying images, runs in the browser

35 points

13 hours ago

| 2 comments

Hi HN! here's a tool I just deployed that renders PDFs in dark mode without destroying the images. Internal and external links stay intact, and I decided to implement export since I'm not a fan of platform lock-in: you can view your dark PDF in your preferred reader, on any device. It's a side project born from a personal need first and foremost. When I was reading in the factory the books that eventually helped me get out of it, I had the problem that many study materials and books contained images and charts that forced me, with the dark readers available at the time, to always keep the original file in multitasking since the images became, to put it mildly, strange. I hope it can help some of you who have this same need. I think it could be very useful for researchers, but only future adoption will tell.

With that premise, I'd like to share the choices that made all of this possible. To do so, I'll walk through the three layers that veil creates from the original PDF:

- Layer 1: CSS filter. I use invert(0.86) hue rotate(180deg) on the main canvas. I use 0.86 instead of 1.0 because I found that full inversion produces a pure black and pure white that are too aggressive for prolonged reading. 0.86 yields a soft dark grey (around #242424, though it depends on the document's white) and a muted white (around #DBDBDB) for the text, which I found to be the most comfortable value for hours of reading.

- Layer 2: image protection. A second canvas is positioned on top of the first, this time with no filters. Through PDF.js's public API getOperatorList(), I walk the PDF's operator list and reconstruct the CTM stack, that is the save, restore and transform operations the PDF uses to position every object on the page. When I encounter a paintImageXObject (opcode 85 in PDF.js v5), the current transformation matrix gives me the exact bounds of the image. At that point I copy those pixels from a clean render onto the overlay. I didn't fork PDF.js because It would have become a maintenance nightmare given the length of the codebase and the frequent updates. Images also receive OCR treatment: text contained in charts and images becomes selectable, just like any other text on the page. At this point we have the text inverted and the images intact. But what if the page is already dark? Maybe the chapter title pages are black with white text? The next layer takes care of that.

- Layer 3: already-dark page detection. After rendering, the background brightness is measured by sampling the edges and corners of the page (where you're most likely to find pure background, without text or images in the way). The BT.601 formula is used to calculate perceived brightness by weighting the three color channels as the human eye sees them: green at 58.7%, red at 29.9%, blue at 11.4%. These weights reflect biology: the eye evolved in natural environments where distinguishing shades of green (vegetation, predators in the grass) was a matter of survival, while blue (sky, water) was less critical. If the average luminance falls below 40%, the page is flagged as already dark and the inversion is skipped, returning the original page. Presentation slides with dark backgrounds stay exactly as they are, instead of being inverted into something blinding.

Scanned documents are detected automatically and receive OCR via Tesseract.js, making text selectable and copyable even on PDFs that are essentially images. Everything runs locally, no framework was used, just vanilla JS, which is why it's an installable PWA that works offline too.

Here's the link to the app along with the repository: https://veil.simoneamico.com | https://github.com/simoneamico-ux-dev/veil

I hope veil can make your reading more pleasant. I'm open to any feedback. Thanks everyone

▲

gwern

34 minutes ago

[-]

Have you considered, since you can extract the images via the mask, selectively inverting them?

One can fairly reliably use a small NN to classify images by whether they should be inverted or just dimmed, and I've used it with great success for years now on my site: https://invertornot.com/ https://gwern.net/invertornot

---

On a side note, it'd be nice to have an API or something to let one 'compile' a PDF to dark-mode version PDF. Ephemeral browser-based is a drawback as often as a benefit.

▲

simoneamico

16 minutes ago

[-]

That's actually exactly where I started. The initial idea involved a YOLO nano model to classify images, deciding what to invert and what not to. It worked as a concept, but during the feasibility analysis I realized that for native PDFs it wasn't necessary: the format already tells you where the images are. I walk the page's operator list via getOperatorList() (PDF.js public API, no fork) and reconstruct the CTM stack, that is the save, restore and transform operations, until I hit a paintImageXObject. The current transformation matrix gives me the exact bounds. I copy those pixels from a clean render onto an overlay canvas with no filters, and the images stay intact. It's just arithmetic on transformation matrices, on a typical page it takes a few milliseconds.

Your approach with a classifier makes a lot more sense for the generic web, where you're dealing with arbitrary <img> tags with no structural metadata, and there you have no choice but to look at what's inside. PDFs are a more favorable problem. A case where a classifier like yours would be an interesting complement is purely vector diagrams, drawn with PDF path operators, not raster images. Veil inverts those along with the text because from the format's perspective they're indistinguishable. In practice they're rare enough that the per-page toggle handles them, but it's the honest limitation of the approach.

▲

ainch

32 minutes ago

[-]

As a PhD student doing my fair share of midnight paper-reading I think I'm the exact target market - thank you for sharing!

▲

simoneamico

8 minutes ago

[-]

That really means a lot, ainch. I hope it makes your late-night sessions a little more bearable. If you find anything that doesn't work well with the papers you read, keep me posted