I have to say I am surprised about that. Does anyone have any context or guesses as to why this is the case?
EDIT: Go's Unicode tables were actually updated to Unicode 17 yesterday:
https://github.com/golang/go/commit/dd39dfb534d2badf1bb2d72d...
Go is developed almost entirely in public. There are some Google-internal customizations, but none of them are particularly exciting, and almost all changes start in the open source repo and are imported from there.
I would consider splitting this task into two:
- extracting the next Unicode code point
- determining whether it’s in the character class
For the second, instead of using an automaton, one could use a perfect hash (https://en.wikipedia.org/wiki/Perfect_hash_function). That could make the membership test branch-free.
Is that a good idea?