I knew the details behind this because Windows 10 didn't include a font with the trans flag by default, so it always rendered as flag + trans symbol. I eventually installed the emoji font from the Windows 11 betas and suddenly found much of what I read a lot nicer looking.
P.S. I love the effects on this website :3
> Flags for countries with Unicode region codes [ie. recognized by ISO] are automatically recommended, with no proposals necessary! [...] the Emoji Subcommittee is no longer taking in any proposals for flags of any kind.
They have a section addressing new pride flags specifically near the end of the FAQ.
And it isn't flamebait to point out the f*cked up power dynamics in highly-government-influenced standards orgs. Especially the Unicode Consortium, since you can fit the alphabets of the official language of every country but one into a 16-bit space (no I'm not advocating Han Unification -- in fact precisely the opposite). The whole rest of the world has to deal with variable-length encodings and "grapheme cluster" nonsense just to keep one country happy.
dang, you have impugned my honor. I demand satisfaction in the form of a duel! Nerf guns at twenty paces.
These tangents always go in the same direction, too: they bump the thread off the back road (obscure Unicode details!) and onto the well-paved multi-lane highway (to hell!).
None of that matters though, if you're going to be as good-humored as this!
> Although they can be displayed as Roman letters, it is intended that implementations may choose to display them in other ways, such as by using national flags. The Unicode FAQ indicates that this mechanism should be used and that symbols for national flags will not be directly encoded. This allows the Unicode consortium to avoid any issues surrounding which countries to include (and, de facto, recognize), instead leaving it entirely to the system implementation as to which flags to include (see: partially recognized state).
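For anyone curious how that mechanism looks in practice, here is a minimal sketch. It assumes the usual mapping where each ASCII letter of a two-letter region code is shifted into the Regional Indicator Symbol block starting at U+1F1E6; the helper name `flag_for_region` is made up for illustration. Whether the result shows up as a flag or as two boxed letters is entirely up to the font and platform.

```python
# Distance from 'A' (U+0041) to REGIONAL INDICATOR SYMBOL LETTER A (U+1F1E6).
REGIONAL_INDICATOR_OFFSET = 0x1F1E6 - ord("A")


def flag_for_region(code: str) -> str:
    """Map a two-letter region code (e.g. 'TW') to its pair of regional indicators.

    Hypothetical helper for illustration; it does not check that the code
    is actually assigned to any region.
    """
    if len(code) != 2 or not (code.isascii() and code.isalpha()):
        raise ValueError(f"expected two ASCII letters, got {code!r}")
    return "".join(chr(ord(c) + REGIONAL_INDICATOR_OFFSET) for c in code.upper())


flag = flag_for_region("TW")
# Print the code points rather than the glyph, since rendering varies.
print([f"U+{ord(c):04X}" for c in flag])
```

Note that nothing here encodes a picture of a flag: the text only carries the two letters, and the implementation decides what to draw.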
Both the People's Republic of China and the Republic of China agree that there exists an entity called "Taiwan, Province of China" (TW). They have different views about what that entity's flag is (and many other things about that entity), but Unicode doesn't offer any opinions on that.
In the meantime this arrangement works out, since the ROC constitution still legally asserts that it is but part of a One China polity, i.e. it doesn't matter what TWers think or what the DPP claims or tries to legally engineer (additional articles / legal fiction limiting ROC political jurisdiction to the "free area" of TW + islands). That holds until TW voters and politicians actually formally separate / declare independence, as in changing the ROC constitution to renounce claims on the mainland. At that point they'll lose the 3166-1 designation because the PRC gets to remove them, and they won't get a new one because of the PRC veto. They'll lose their emojis (maybe ISO codes, maybe the domain, depending on US/ICANN drama)... which TBH will be the least of their worries.
It’s kind of like how New Zealand is included as a province of Australia, technically, in their constitution.
Having earned thousands of dollars fixing old systems to deal with new character sets, I can’t really complain.
Also, 8 bit codepages, for all their problems (a different kind of hell), didn't break the assumption that each character is encoded as one byte. In that way, they didn't break software in interesting ways like UTF-encoded and possibly decomposed Unicode can. Back then, it was something of a blessing at surface level, but the proliferation of string handling code and concepts that assume this one-to-one mapping just doesn't fit well with Unicode. And UTF-8 specifically gives the illusion to English speakers that using naive 8 bit string handling works.
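A rough sketch of that illusion, using Python's standard `unicodedata` module. The helper `sizes` is just a name picked for this example. Plain ASCII keeps the one-byte-per-character property in UTF-8, a precomposed "é" does not, and the decomposed form isn't even one code point any more.

```python
import unicodedata


def sizes(s: str) -> tuple[int, int]:
    """Return (number of code points, number of UTF-8 bytes) for s."""
    return len(s), len(s.encode("utf-8"))


composed = "\u00e9"                                    # é as one code point
decomposed = unicodedata.normalize("NFD", composed)    # 'e' + U+0301 combining acute

print("ascii e   :", sizes("e"))
print("NFC  é    :", sizes(composed))
print("NFD  é    :", sizes(decomposed))
# In an 8-bit codepage such as Latin-1 the composed form is still one byte.
print("latin-1 é :", len(composed.encode("latin-1")))
```

All three strings would normally be called "one character" by a reader, which is exactly where byte-counting code goes wrong.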
Color modifiers are just ZWJ sequences. Those existed before. The color modifiers themselves are not the most complicated things that get attached to ZWJ sequences among languages that Unicode supports.
OpenType today supports color tables that mean most emoji modified by colors aren't "handcrafted" but algorithmically constructed. (As many ligatures and other ZWJ sequences often are.)
> Also, 8 bit codepages, for all their problems (a different kind of hell), didn't break the assumption that each character is encoded as one byte.
That is broken in other 8-bit codepages as well, it was just seen as an exception/edge case rather than the rule. The big obvious exception has always been \r\n (carriage return then newline), but there's also ^H (control-H) and ^W (control-W) sequences (effectively backspace and delete word), and the entire gamut of things done with ANSI and/or VT100 escape sequences starting with Escape, often stylized as ^[.
> And UTF-8 specifically gives the illusion to English speakers that using naive 8 bit string handling works.
Unless emoji are present, which is one of the great things about emoji and emoji becoming a very common form of punctuation in English text. Naive 8-bit string handling was always wrong. Emoji help make it visible how wrong it was. (In part by doing things other languages do such as ZWJ sequences and having code points out in the Astral Plane and other such features.)
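To make that concrete, here is a small sketch of what happens when byte-oriented code meets a single astral-plane emoji. The string and the cut-off of 5 bytes are arbitrary choices for the example.

```python
text = "ok \U0001F600"          # three ASCII characters plus U+1F600

utf8 = text.encode("utf-8")
utf16_units = len(text.encode("utf-16-le")) // 2   # each UTF-16 code unit is 2 bytes

print("code points :", len(text))
print("UTF-8 bytes :", len(utf8))
print("UTF-16 units:", utf16_units)   # the emoji needs a surrogate pair

# Naive "first 5 bytes" truncation cuts the 4-byte emoji in half.
truncated = utf8[:5]
try:
    truncated.decode("utf-8")
    truncation_broke_text = False
except UnicodeDecodeError:
    truncation_broke_text = True

print("truncation produced invalid UTF-8:", truncation_broke_text)
```

With pure ASCII input the same truncation would have "worked", which is why this class of bug can hide for years in English-only testing.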
A bunch of control codes are historically part of character encodings, and their encoding is very consistent within codepages of the same family (ASCII/ANSI and EBCDIC). You don't have to have any awareness about the active codepage/language to handle them correctly.
Terminal escape sequences are a poor form of in-band signaling between devices (now virtualized), not text. I consider that out of scope.
Anyway, as we get into the weeds here, I do not want to dispute the enormous practical utility of Unicode and I am glad that it exists and covers so many of the world's writing systems and alphabets. It is one of the central standards that connects people today. But from the purely technical perspective, the steady complexity creep is undeniable and brings somewhat hidden costs to software systems.
Written like this, emojis look a bit like stargate addresses [1F3F3 FE0F 200D 26A7 FE0F].
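If you want to decode an "address" like that yourself, something along these lines should do it. This is a sketch that assumes the full sequence ends in FE0F (variation selector-16), and it relies on the character names in the Unicode database bundled with your Python, so a very old interpreter may not know the newer ones.

```python
import unicodedata

# Assumed full sequence for the trans flag, written as hex code points.
TRANS_FLAG_HEX = "1F3F3 FE0F 200D 26A7 FE0F"

trans_flag = "".join(chr(int(h, 16)) for h in TRANS_FLAG_HEX.split())
names = [unicodedata.name(c) for c in trans_flag]

for c, name in zip(trans_flag, names):
    print(f"U+{ord(c):04X}  {name}")

print("code points:", len(trans_flag))
print("UTF-8 bytes:", len(trans_flag.encode("utf-8")))
```

So the single glyph on screen is a flag, a joiner, and a symbol, with two variation selectors asking for emoji-style presentation.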
I knew that emoji symbols have a way of using modifiers but this is the first time I’m getting a glimpse into the process of iterating on a proposal. Thanks for sharing!
Blimey! After I engaged reader mode, all was revealed 8)
The "spider" and the wandering particles are funky but everything else in the presentation conspires to exclude granddad (who has rather shite eyesight these days). On the bright side, you didn't go for a dark theme. I'm happy to sort out my very minor accessibility "problems" but it might be nice to cater for all, as much as you can. I love how you have considered so many ways to ensure that it will degrade gracefully, as far as is possible for certain glyph handling capabilities.
Good skills ... how on earth does this work? I pasted your glyph quite a lot and found that backspace changes it into the other flag:
EDIT: Oh dear, HN strips out funky glyphs so this post looks a bit odd.
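A guess at what is going on with backspace, as a sketch: it assumes the text field deletes one code point per keypress (many editors delete a whole grapheme cluster instead, so this will not match every platform). The function name `backspace_by_code_point` is invented for the example.

```python
TRANS_FLAG = "\U0001F3F3\uFE0F\u200D\u26A7\uFE0F"
WHITE_FLAG = "\U0001F3F3\uFE0F"


def backspace_by_code_point(s: str) -> str:
    """Simulate a backspace that removes exactly one code point (an assumption)."""
    return s[:-1]


# Keep pressing backspace until only the white flag prefix is left.
steps = [TRANS_FLAG]
while len(steps[-1]) > len(WHITE_FLAG):
    steps.append(backspace_by_code_point(steps[-1]))

for s in steps:
    print(" ".join(f"{ord(c):04X}" for c in s))
```

Under that assumption the white flag is simply a prefix of the trans flag sequence, so stripping the tail leaves a different but still valid emoji.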
I hope they add the bi flag one day.
Also, this website is delightfully hilarious. It's got music. I haven't seen that since the old days. Very nostalgic. I read in Reader Mode, but enjoyed the expression of self.
Really, it should have a full implementation of heraldry.
Unicode can only be complete when it is a full renderer. If I can't render my video game using one complete "glyph", what are we even doing in this world?
Usually it sounds like weird advice and we'd want site owners to aim for readability... but TBH, blaming a site on a .pink domain, on a page explaining the codepoints of an emoji flag, for not being universally accessible seems beside the point.
document.querySelector('#particles').remove()

I realise it's a stylistic choice, but there have been a few posts where I felt tired after reading their articles. It also feels like one of those YouTube Shorts sketches where one person pretends to be multiple people, and it starts feeling a bit cheap/meh.
I think it would be fine if they toned down the interjections/interruptions.
(didn't read the article because the website is deliberately unreadable. zero guilt)
Emojis then blew up with the rest of the world once people worked out how to enable them on the iPhone. And since Unicode has practically unlimited space for new emoji, there is little reason to deny any widely used symbol an emoji representation.
This could have a very different meaning. E.g. the flag of Afghanistan before vs. after the Taliban took over.
The classic IBM PC text encoding ("codepage 437") already contains the card suits, gender symbols, and box drawing characters which are not text glyphs, so any "non-text symbols" battle was lost before it even started.
With emoji specifically, they were popular in Japan dating all the way back to the 90s, via carrier-specific encoding standards. The lack of emoji support in messaging was a reason that the iPhone and Android were slower than expected to take off in Japan, and so Apple and Google asked the Unicode Consortium to add emoji support, so they could have this feature on their phones while sticking to a universal encoding standard. IIRC, the Unicode Consortium was actually hesitant to do this and didn't want to be involved with standardizing pictograms into Unicode, but eventually relented.
Neither is poop, and yet someone decided that was important enough to include.