Sure, if you ignore latency. In reality it's an unnecessary ~0.001% increase in load time, because that size reduction isn't enough to matter against the round-trip time. And the time you save transmitting 55 fewer KiB is probably less than the time lost to decompression. :p
While fun, I would expect this specific scenario to actually be worse for the user experience, not better. Speed will be a complete wash and compatibility will be worse.
The tragedy here is that while some people, such as the author of TFA, go to great lengths to get from about 100 to 50 kB, others don't think twice about sending me literally tens of megabytes of images when I just want to know when a restaurant is open – on roaming data.
Resource awareness exists, but it's unfortunately very unevenly distributed.
I wish there was a bit of an opposite option - a "don't lazy/partially load anything" for those of us on fiber watching images pop up as we scroll past them in the page that's been open for a minute.
A little OT, and I'm not sure if iOS has this ability, but I found that while I'm traveling, if I enable Data Saver on my (Android) phone, I can easily go a couple weeks using under 500MB of cellular data. (I also hop onto public wifi whenever it's available, so being in a place with lots of that is helpful.)
My partner, who has an iPhone and couldn't find an option like that (maybe it exists; I don't think she tried very hard to find it), blew through her 5GB of free high-speed roaming data (T-Mobile; after that you get 256kbps, essentially unusable) in 5 or 6 days on that same trip.
It turns out there's so much crap going on in the background, and it's all so unnecessary for the general user experience. And I bet it saves battery too. Anything that uses Google's push notification system still works fine and gets timely notifications, as IIRC that connection is exempt from the data-saving feature.
I've thought about leaving Data Saver on all the time, even when on my home cellular network. Should probably try it and see how it goes.
But overall, yes, it would be great if website designers didn't design as if everyone is on an unmetered gigabit link with 5ms latency...
She might have kept tapping Sync at the bottom of Photos even though iOS itself pauses it when in Low Data mode. iCloud Photos and video syncing is a data killer if you're on holiday abroad; my wife takes hundreds of photos and videos a day, so imagine what that does to data.
So, for any reasonable connection the difference doesn’t matter; for actually gruesomely slow/unreliable connections where 50KB matters this is markedly worse. While a fun experiment, please don’t do it on your site.
1: browsers choose when to download files and run JavaScript. It is not as easy as one might think to force JavaScript to run immediately at high priority (which it needs to be when it is on the critical path to painting).
2: you lose certain browser optimisations where normally many things are done in parallel. Instead you are introducing delays into the critical path, and those delays might not be worth the "gain".
3: Browsers do great things to start requesting files in parallel as they are discovered in the HTML/CSS. Removing that feature can be a poor tradeoff.
There are a few other unobvious downsides. I would never deploy anything like that to a production site without serious engineering effort to measure the costs and benefits.
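For what it's worth, some of the parallelism from point 3 can be clawed back by declaring the hidden payload up front with a preload hint. A minimal sketch, with a hypothetical payload URL standing in for whatever the inline loader actually fetches (my suggestion, not something from the article):

    // Sketch only: give the browser's preload scanner something to work with.
    // "/payload.webp" is a hypothetical stand-in for the real payload URL.
    const hint = document.createElement("link");
    hint.rel = "preload";
    hint.as = "image"; // the payload is fetched and decoded as a WebP image
    hint.href = "/payload.webp";
    document.head.appendChild(hint);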
I'm not up to date on web/font development – does anybody know what that does?
So the purpose is effectively to have human-readable CSS class names to refer to given glyphs in the font, rather than having stray private use Unicode characters in the HTML?
This is a reasonable approach if you have a large number of icons across large parts of the site, but you should always compile the CSS/icon set down to only those used.
If only a few icons, and the icons are small, then inlining the SVG is a better option. But if you have too many SVGs directly embedded on the site, the page size itself will suffer.
As always with website optimization, whether something is a good option always “depends”.
Pictures are pictures, text is text. <img> tag exists for a reason.
Icon fonts are used all over the place - look at the terminal nowadays. Most TUIs require an icon font to be installed.
So it's a skill issue.
Just removing the inline duplicated SVGs used for light/dark mode and using styles instead would bring it down 30%. Replacing the paths used to write text (!) with <text> and making most of the overly complicated paths into simple rect patterns would take care of the rest.
For who? Not for 99.9% of the people clicking the link here on HN. Even regular visitors of a blog will likely no longer have things in cache for their next visit.
Definitely a fascinating post though, there were things I’ve not encountered before.
Won't be a 2.5x difference, but also not 0.001%.
Also when you get to the end, you then see
> The actual savings here are moderate: the original is 88 KiB with gzip, and the WebP one is 83 KiB with gzip. In contrast, Brotli would provide 69 KiB.
At 69 KiB you're still over the default TCP packet max, which means both cases transmit the same number of packets, one just has a bunch of extra overhead added for the extra JavaScript fetch, load, and execute.
The time saved here is going to be negligible at best anyway, but it actually looks to be negative because we're burning time without reducing the number of needed packets at all.
> At 69 KiB you're still over the default TCP packet max, which means both cases transmit the same number of packets,
What? No, they absolutely don't transmit the same number of packets. Did you mean some other word?
However, Ethernet has an MTU (Maximum Transmission Unit) of 1500 bytes, unless jumbo frames are used.
And so I agree with you, the number of packets that will be sent for 69 KiB vs 92 KiB will likely be different.
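For a rough sense of scale, a back-of-the-envelope segment count (my own numbers, assuming a 1500-byte MTU and roughly 1460 bytes of TCP payload per segment):

    // Approximate TCP segments needed for each payload size, assuming a
    // 1500-byte Ethernet MTU and ~1460 bytes of payload per segment.
    const MSS = 1460;
    for (const kib of [69, 92]) {
      const segments = Math.ceil((kib * 1024) / MSS);
      console.log(`${kib} KiB -> about ${segments} segments`);
    }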
- client requests X
- client gets X, which contains a reference to Y
- therefore client requests Y
So you're starting a new request that depends on the client having received the first one. (Although upon closer inspection I think the technique described in the blog post manages to fit everything into the first response, so I'm not sure how relevant this is.)

If you want to learn more, pretty much any resource on TCP should explain this stuff. Here's something I wrote years ago; the background section should be pretty applicable: https://www.snellman.net/blog/archive/2017-08-19-slow-ps4-do...
- client requests X
- server sends bytes 0-2k of X
- client acknowledges bytes 0-2k of X
- server sends bytes 2k-6k of X
- client acknowledges bytes 2k-6k of X
- server sends bytes 6k-14k of X
- client acknowledges bytes 6k-14k of X
- server sends bytes 14k-30k of X
- client acknowledges bytes 14k-30k of X
- server sends bytes 30k-62k of X
- client acknowledges bytes 30k-62k of X
- server sends bytes 62k-83k of X
- client acknowledges bytes 62k-83k of X
- client has received X, which contains a reference to Y
- therefore client requests Y
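A toy model of that buildup, assuming the ~2 KiB initial window shown above that doubles every round trip (real stacks usually start with around 10 segments, and the details depend on the congestion-control algorithm):

    // Toy slow-start model: the server sends one congestion window per round trip,
    // and the window doubles each time, mirroring the 2k/4k/8k/... schedule above.
    function roundTrips(totalBytes: number, initialWindow = 2 * 1024): number {
      let delivered = 0;
      let cwnd = initialWindow;
      let trips = 0;
      while (delivered < totalBytes) {
        delivered += cwnd;
        cwnd *= 2;
        trips += 1;
      }
      return trips;
    }

    console.log(roundTrips(69 * 1024)); // 6 round trips in this model
    console.log(roundTrips(92 * 1024)); // also 6; the extra cost discussed here is the
                                        // dependent request for Y, which adds its own RTTs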
It's all about TCP congestion control here. There are dozens of algorithms used to handle it, but in pretty much all cases you want some kind of slow buildup in order to avoid completely swamping a slower connection and having all but the first few of your packets dropped.

Doesn't the client see the reference to Y at this point? Modern browsers start parsing HTML even before they receive the whole document.
I suppose it could make a difference on lossy networks, but I'm not sure.
These are similar conversations people have around hydration, by the by.
For the uninitiated: https://en.m.wikipedia.org/wiki/Hydration_(web_development)
Edit: Ah, I see OP's code requests the webp separately. You can avoid the extra request if you write a self-extracting html/webp polyglot file, as is typically done in the demoscene.
Even if you transmit the JS stuff inline, the OP's notion of time still ignores the fact that it takes the client time to even ask the server for the data in the first place, and at such small sizes that round trip swallows the transmission time from the user's perspective.
It is technically 2 requests, but the second one is a cache hit, in my testing.
OP is only looking at transmit size differences, which is both not the same as transmit time differences and also not what the user actually experiences when requesting the page.
> This code minifies to about 550 bytes. Together with the WebP itself, this amounts to 44 KiB. In comparison, gzip was 92 KiB, and Brotli would be 37 KiB.
But regarding the current one:
> The actual savings here are moderate: the original is 88 KiB with gzip, and the WebP one is 83 KiB with gzip. In contrast, Brotli would provide 69 KiB. Better than nothing, though.
Most of the other examples don't show dramatic (like more than factor-of-2) differences between the compression methods either. In my own local testing (on Python wheel data, which should be mostly Python source code, thus text that's full of common identifiers and keywords) I find that XZ typically outperforms gzip by about 25%, while Brotli doesn't do any better than XZ.
Also, XZ (or LZMA/LZMA2 in general) produces smaller output than Brotli given lots of free time, but is much slower than Brotli when targeting the same compression ratio. This is because LZMA/LZMA2 uses an adaptive range coder and multiple code distribution contexts, both of which contribute heavily to the slowness when higher compression ratios are requested. Brotli only has the latter, and its coding is just a bitwise Huffman coder.
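If you want to reproduce that kind of comparison locally, Node ships both gzip and Brotli (XZ/LZMA is not built in, so it's left out of this sketch; the file name is a placeholder):

    // Compare gzip vs. Brotli output sizes on a local file using Node's built-in zlib.
    import { readFileSync } from "node:fs";
    import { gzipSync, brotliCompressSync, constants } from "node:zlib";

    const data = readFileSync("sample.html"); // placeholder: any text-heavy file
    const gz = gzipSync(data, { level: 9 });
    const br = brotliCompressSync(data, {
      params: { [constants.BROTLI_PARAM_QUALITY]: 11 }, // max quality, as for precompressed assets
    });
    console.log(`raw ${data.length} B, gzip ${gz.length} B, brotli ${br.length} B`);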
> keep the styling and the top of the page (about 8 KiB uncompressed) in the gzipped HTML and only compress the content below the viewport with WebP
Ah, that explains why the article suddenly cut off after a random sentence, with an empty page following. I'm using LibreWolf, which disables WebGL, and I use Chromium for random web games that need WebGL. The article worked just fine with WebGL enabled; neat technique, to be honest.
An author might reasonably prefer 90% of people visiting his site to 100% of people consuming the content indirectly.
[1] https://js1024.fun/demos/2022/18/readme
[2] https://gist.github.com/lifthrasiir/1c7f9c5a421ad39c1af19a9c...
> The only possibility is to use the WOFF2 font file format which Brotli was originally designed for, but you need to make a whole font file to leverage this. This got more complicated recently by the fact that modern browsers sanitize font files, typically by the OpenType Sanitizer (OTS), as it is very insecure to put untrusted font files directly to the system. Therefore we need to make an WOFF2 file that is sane enough to be accepted by OTS _and_ has a desired byte sequence inside which can be somehow extracted. After lots of failed experiments, I settled on the glyph widths ("advance") which get encoded in a sequence of two-byte signed integers with almost no other restrictions.
Fantastic idea!
Unfortunately we live in a world where Google decides to rip JPEG-XL support out of Chrome for seemingly no reason other than spite. If the reason was a lack of maturity in the underlying library, fine, but that wasn’t the reason they offered.
Of course there is, and it's really boring: prioritisation and maintenance.
It's a big pain to add, say, 100 compression formats and support them indefinitely, especially with little differentiation between them. Once we agree on what the upper bound of useless formats is, we can start to negotiate what the lower limit is.
And I qualified it with mature implementation because I agree that if there is no implementation which has a clear specification, is well written, actively maintained, and free of jank, then it ought not qualify.
Relative to the current status quo, I would only imagine the number of data compression, image compression, and media compression options to increase by a handful. Single digits. But the sooner we add them, the sooner they can become sufficiently widely deployed as to be useful.
As far as I know, it was already making the smallest JPEGs out of any of the web compression tools, but WebP was coming out only ~50% of the size of the JPEGs. It was an easy decision to make WebP the default not too long after adding support for it.
Quite a lot of people use the site, so I was anticipating some complaints after making WebP the default, but it's been about a month and so far there has been only one complaint/enquiry about WebP. It seems that almost all tools & browsers now support WebP. I've only encountered one website recently where uploading a WebP image wasn't handled correctly and blocked the next step. Almost everything supports it well these days.
You can always reduce file size of a JPEG by making a WebP that looks almost the same, but you can also do that by recompressing a JPEG to a JPEG that looks almost the same. That's just a property of all lossy codecs, and the fact that file size grows exponentially with quality, so people are always surprised how even tiny almost invisible quality degradation can change the file sizes substantially.
Based on which quality comparison metric? WebP has a history of atrocious defaults that murder detail in dark areas.
It really depends on what you're after, right? If preserving every detail matters to you, lossless is what you want. That's not going to create a good web experience for most users, though.
It’s, strictly speaking, invalid HTML, but it still successfully triggers standards mode.
See https://GitHub.com/kangax/html-minifier/pull/970 / https://HTML.spec.WHATWG.org/multipage/parsing.html#parse-er...
(I too use that trick on https://FreeSolitaire.win)
That’s for the whole game: graphics are inline SVGs, JS & CSS are embedded in <script> and <style> elements.
0: https://github.com/KTibow/KTibow/issues/3#issuecomment-23367...
Edit: I found my prototype from way back, I guess I was just testing heh: https://retr0.id/stuff/bee_movie.webp.html
Something I wanted to do but clearly never got around to was figuring out how to put an open-comment sequence (<!--) in a header somewhere, so that most of the garbage gets commented out.
You likely have dozens of copies of Google Fonts, each in a separate silo, with absolutely zero reuse between websites.
This is because a global cache used to work like a cookie, and has been used for tracking.
Well, at least you don't have to download it more than once per site, but first impressions matter, yeah.
> Alright, so we’re dealing with 92 KiB for gzip vs 37 + 71 KiB for Brotli. Umm…
That said, the overhead of gzip vs Brotli HTML compression is nothing compared with the amount of JS/images/video current websites use.
.webm can go away, though.
WebM on the other hand still has a reason to exist unfortunately: patents on H.264.
So there is a small chance Google will reverse the removal from Chrome.
[1] https://github.com/gildas-lormeau/SingleFile?tab=readme-ov-f...
[2] https://github.com/gildas-lormeau/Polyglot-HTML-ZIP-PNG
[3] https://github.com/gildas-lormeau/Polyglot-HTML-ZIP-PNG/raw/...
Note that "way slower" applies to compression speed, not decompression. So Brotli is a good bet if you can precompress.
> Annoyingly, I host my blog on GitHub pages, which doesn’t support Brotli.
If your users all use modern browsers and you host static pages through a service like Cloudflare or CloudFront that supports custom HTTP headers, you can implement your own Brotli support by precompressing the static files with Brotli and adding a Content-Encoding: br HTTP header. This is kind of cheating because you are ignoring proper content negotiation with Accept-Encoding, but I’ve done it successfully for sites with targeted user bases.
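Here's a sketch of the same idea without a CDN in front, with a plain Node server standing in for the custom-header configuration (file names are hypothetical):

    // Precompress at build time, then always serve the .br variant with an explicit
    // Content-Encoding header. This skips Accept-Encoding negotiation, as noted above.
    import { readFileSync, writeFileSync } from "node:fs";
    import { brotliCompressSync, constants } from "node:zlib";
    import { createServer } from "node:http";

    // Build step: write index.html.br next to index.html.
    writeFileSync(
      "index.html.br",
      brotliCompressSync(readFileSync("index.html"), {
        params: { [constants.BROTLI_PARAM_QUALITY]: 11 },
      })
    );

    // Serve step: every modern browser advertises "br", so just send it.
    createServer((_req, res) => {
      res.writeHead(200, {
        "Content-Type": "text/html; charset=utf-8",
        "Content-Encoding": "br",
      });
      res.end(readFileSync("index.html.br"));
    }).listen(8080);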
Well, it didn't work in Materialistic (I guess their webview disables JS), and the failure mode is really not graceful.
[1] https://en.wikipedia.org/wiki/Sloot_Digital_Coding_System
8 kilobytes? Rookie numbers. I'll do it in 256 bytes, as long as you're fine with a somewhat limited selection of available digital movie files ;)
I call it the High Amplitude Shrinkage Heuristic, or H.A.S.H.
It is also reversible, but only safely to the last encoded file due to quantum hyperspace entanglement of ionic bonds. H.A.S.H.ing a different file will disrupt them preventing recovery of the original data.
You'd also want "seed" and "engine" attributes to ensure all visitors see the same result.
One of the best uses of responsive design I've ever seen was a site that looked completely different at different breakpoints - different theme, font, images, and content. It was beautiful, and creative, and fun. Lots of users saw different things and had no idea other versions were there.
Not so much for us on earth however.
That idea is something that is only cool in theory.
Currently, we're definitely not there in terms of space/time tradeoffs for images, but I could imagine at least parameterized ML-based upscaling (i.e. ship a low-resolution image and possibly a textual description, have a local model upscale it to display resolution) at some point.
It’s utterly impractical, but fun to muse about how neat it would be if it weren’t.
By comparison, you could easily define a number that goes 0.123456789101112131415… and use indexes into that number. However, the index would probably be larger than what you're trying to encode.
I am curious what the compression ratios would be. I suspect the opposite, but the numbers are at a scale where my mind falters so I wouldn’t say that with any confidence. Just 64 bits can get you roughly 10^20 digits into the number, and the “reach” grows exponentially with bits. I would expect that the smaller the file, the more common its sequence is.
Let's do it for a similar number but in binary… a 1 MB file has 2²³ binary digits (2²⁰ bytes). Even if we optimize the indexing to point to the "number" instead of the "digit", so that the index is smaller… magically the index is still about as long as the file!
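For fun, a quick empirical check of both intuitions (the digit count and the needles are arbitrary choices of mine):

    // Build the first two million digits of 0.123456789101112... and see where a few
    // digit strings first appear. Short strings show up early, but the index quickly
    // needs about as many digits as the string it points to.
    function champernowneDigits(count: number): string {
      const parts: string[] = [];
      let length = 0;
      for (let n = 1; length < count; n++) {
        const s = n.toString();
        parts.push(s);
        length += s.length;
      }
      return parts.join("").slice(0, count);
    }

    const haystack = champernowneDigits(2_000_000);
    for (const needle of ["42", "1999", "271828"]) {
      const index = haystack.indexOf(needle);
      console.log(`"${needle}" (${needle.length} digits) is first found at index ${index},`,
        `which takes ${String(index).length} digits to write down`);
    }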
https://m.youtube.com/watch?v=TQy3EU8BCmo
You can really feel the "compute has massively outpaced networking speed" where this kind of thing is actually practical. Maybe I'll see 10G residential in my lifetime.
The future is already here – it's just not very evenly distributed.
Of course, the downsides became apparent once the euphoria had faded.
Arguably the cruelest implication of the pigeonhole principle.
But...
> Annoyingly, I host my blog on GitHub pages, which doesn’t support Brotli.
Is the glaringly obvious solution to this not as obvious as I think it is?
TFA went through a lot of roundabout work to get (some) Brotli compression. Very impressive yak shave!
If you're married to the idea of a Git-based automatically published web site, you could at least replicate your code and site to Gitlab Pages, which has supported precompressed Brotli since 2019. Or use one of Cloudflare's free tier services. There's a variety of ways to solve this problem before the first byte is sent to the client.
Far too much of the world's source code already depends exclusively on GitHub. I find it distasteful to also have the small web do the same while blindly accepting an inferior experience and worse technology.
I'll probably switch to Cloudflare Pages someday when I have time to do that.
> As far as I know, browsers are only shipping the decompression dictionary. Brotli has a separate dictionary needed for compression, which would significantly increase the size of the browser.
How can the decompression dictionary be smaller than the compression one? Does the latter contain something like a space-time tradeoff in the form of precalculated most efficient representations of given input substrings or something similar?
[1] https://github.com/google/brotli/blob/master/c/enc/dictionar...
It's a bit disappointing you can't use Brotli in the DecompressionStream() interface just because it may or may not be available in the CompressionStream() interface though.
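For reference, this is roughly what native streaming decompression looks like today; "br" just isn't an accepted format, so the sketch falls back to gzip (the URL is hypothetical):

    // DecompressionStream currently accepts only "gzip", "deflate", and "deflate-raw".
    // If "br" were ever added, a Brotli-in-WASM shim could be replaced with this.
    const response = await fetch("/payload.bin.gz"); // hypothetical pre-gzipped asset
    const decompressed = response.body!.pipeThrough(new DecompressionStream("gzip"));
    const text = await new Response(decompressed).text();
    console.log(`decompressed to ${text.length} characters`);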
My suspicion is that this is a confusion of the (runtime) sliding window, which limits maximum required memory on the decoder's side to 16 MB, with the actual shared static dictionary (which needs to be present in the decoder only, as far as I can tell; the encoder can use it, and if it does, it would be the same one the decoder has as well).
On one hand it seems a bit silly to worry about ~100 KB in the browser for what will probably, on average, save more than that in upload/download the first time it is used. On the other hand, "it's just a few hundred KB" each release for a few hundred releases ends up being a lot of cruft you can't remove without breaking old stuff. On the third hand coming out of our head... it's not like Chrome has been shy about shipping much more than that for features it wants to impose on users whether they want them or not, so what's one small addition users would actually benefit from against that?
So, just use the anti-fingerprint noise as a cookie, I guess?
I opened the page in Firefox like the article suggests and I get a different pattern per site and session. That prevents using the noise as a supercookie, I think, if its pattern changes every time cookies are deleted.
Also, it's a long shot, but could the combo of FEC (+size) and lossy compression (-size) be a net win?
(1) compatibility
(2) features
WebP still seems far behind on (1) to me so I don't care about the rest. I hope it gets there, though, because folks like this seem pretty enthusiastic about (2).
Lossy WebP comes out a lot smaller than JPEG. It's definitely worth taking the saving.
That may apply to old "LTS" Linuxes, but not any relatively recent one. Xviewer and GIMP immediately come to mind as supporting it, and I haven't had a graphics viewer on Linux _not_ be able to view WebP in at least 3 or 4 years.
Under normal circumstances you're probably very right
¹ Now I wonder if the makers foresaw how their protocol name might sound to us now
>manually decompress it in JavaScript
>Brotli decompressor in WASM
the irony seems lost
zstd is a general-purpose compressor. By and large (and I'm unaware of any exceptions), specialized/format-specific compression (like PNG, WebP, etc.) will compress better than a general-purpose compressor, because format-specific compressors can take advantage of quirks of the format which a general-purpose solution cannot. Also, format-specific ones are often lossy (or conditionally so), enabling them to trade lower fidelity for better compression, something a general-purpose compressor cannot do.
<img src="data:image/jpeg;base64,abc123..." />
(Double-check the exact syntax and the MIME type before you use it; it's been a few years since I have, and this example is from perhaps imperfect memory.)

I love reading blog posts like these.
What is the point of doing this sort of thing if you don't even test how much faster or slower it made the page load?