I once got drunk with my elderly unix supernerd friend, and he was talking about TTYs and how his passwords contained embedded ^S and ^Q characters. He traced the login process and learned they were just stalling the tty, not actually being used to construct the hash. No one else at the bar got the drift. He patched his system to use 'raw' instead of 'cooked' mode for login passwords. He also used backspaces (^? ^H) as part of his passwords. He was a real security tiger. I miss him.
Look at the Teletype ASR-33, introduced in 1963.
The idea that SOH/1 is "Ctrl-A" or ESC/27 is "Ctrl-[" is not part of ASCII; that idea comes from the way terminals provided access to the control characters, through a Ctrl key that just masked out a few bits.
(I assume everybody knows that on mechanical typewriters and teletypes the "shift" key physically shifted the carriage position upwards, so that a different glyph would be printed when hit by a typebar.)
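For anyone who wants to poke at the Ctrl masking described above, here is a minimal sketch. It assumes Ctrl can be modelled as keeping only the low five bits of the key's code (mask 0x1F); real terminals differed in the details, and `ctrl` is just a name made up for this example.

```python
def ctrl(key: str) -> int:
    """Return the code a Ctrl+key press would send.

    Assumption: Ctrl is modelled as keeping only the low five bits
    (mask 0x1F). This is a simplification, not a hardware description.
    """
    return ord(key) & 0x1F


for key in "A[SQ":
    print(f"Ctrl-{key} -> {ctrl(key):#04x}")

# Shift only changes bit 0x20 for letters, and the mask discards that bit.
print(ctrl("x") == ctrl("X"))
```

Under that model Ctrl-A gives 1 (SOH), Ctrl-[ gives 27 (ESC), and Ctrl-S / Ctrl-Q give 0x13 / 0x11, the two characters that stall and resume a tty.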
https://blog.glyphdrawing.club/the-origins-of-del-0x7f-and-i...
It really helps understand the logic of ASCII.
https://github.com/jez/bin/blob/master/ascii-4col.txt
It's neat because it's the only command I have that uses `tail` for the shebang line.
It makes sense, but it didn’t really hit me until recently. Now, I’m wondering what other hidden cleverness is there that used to be common knowledge, but is now lost in the abstractions.
I believe the layout of the shifted symbols on the numeric row was based on an early IBM Selectric typewriter for the US market. Then IBM went and changed it, and the latter is the origin of the ANSI keyboard layout we have now.
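Whatever the keyboard history, the ASCII side of the numeric-row pairing is easy to check: flipping bit 0x10 of a digit's code lands on the symbol ASCII places in the same row of the neighbouring column, which is the pairing that bit-paired keyboards put on the shifted digit keys. A small sketch (`bit_paired_shift` is a hypothetical helper name, not anything standard):

```python
def bit_paired_shift(digit: str) -> str:
    """Flip bit 0x10 of a digit's ASCII code (hypothetical helper)."""
    return chr(ord(digit) ^ 0x10)


for d in "123456789":
    print(d, bit_paired_shift(d))
```

Comparing the printed pairs with the modern US number row shows where the two agree (1, 3, 4, 5) and where they diverge.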
I honestly wouldn’t have thought anything of it if I hadn’t seen it written as `b ^ 0x20`.
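Assuming the `b ^ 0x20` above refers to flipping the case of an ASCII letter, a minimal sketch looks like this. The range check is my addition; the bare XOR would also scramble digits and punctuation.

```python
def toggle_case(b: int) -> int:
    """Flip the case of an ASCII letter by XOR-ing bit 0x20.

    Non-letters are returned unchanged, since the trick is only
    meaningful for A-Z (0x41-0x5A) and a-z (0x61-0x7A).
    """
    if 0x41 <= b <= 0x5A or 0x61 <= b <= 0x7A:
        return b ^ 0x20
    return b


print(bytes(toggle_case(b) for b in b"Hello, ASCII!").decode("ascii"))
```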
https://www.farah.cl/Keyboardery/A-Visual-Comparison-of-Diff...
ESC [ { 11011
FS \ | 11100
GS ] } 11101
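The rows above can be regenerated from the shared low five bits. This is only a sketch: `four_column_row` is a made-up helper, and it prints the decimal control code rather than names like ESC, since the names are not derivable from the bits.

```python
def four_column_row(low5: int) -> str:
    """Format one four-column-style row for a 5-bit value.

    Shows the control code (decimal), the character in the 0x40 column,
    the character in the 0x60 column, and the shared low five bits.
    """
    upper = chr(0x40 | low5)  # column with high bits 10
    lower = chr(0x60 | low5)  # column with high bits 11
    return f"{low5:3d} {upper} {lower} {low5:05b}"


for low5 in (0b11011, 0b11100, 0b11101):
    print(four_column_row(low5))
```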
Also curious why the keys open and close braces, but ... the single and double curly quotes don't open and close, but are stacked. Seems nuts every time I type Option-{ and Option-Shift-{ …
Note on your Mac that Option-{ and Option-}, with and without Shift, produce quotes which are all distinct from the characters produced by your '/" key! They are Unicode characters not in ASCII.
In the ASCII standard (1977 version here: https://nvlpubs.nist.gov/nistpubs/Legacy/FIPS/fipspub1-2-197...) the example table shows a glyph for the double quote which is vertical: it is neither an opening nor closing quote.
The apostrophe is shown as a closing quote, by slanting to the right; approximately a mirror image of the backtick. So it looks as though those two are intended to form an opening and closing pair. Except, in many terminal fonts, the apostrophe is just a vertical tick, like half of a double quote.
The ' being vertical helps programming language '...' literals not look weird.
There's also these:
| ASCII | US keyboard |
|------------+-------------|
| 041/0x21 ! | 1 ! |
| 042/0x22 " | 2 @ |
| 043/0x23 # | 3 # |
| 044/0x24 $ | 4 $ |
| 045/0x25 % | 5 % |
| | 6 ^ |
| 046/0x26 & | 7 & |
Four Column ASCII (2017) - https://news.ycombinator.com/item?id=21073463 - Sept 2019 (40 comments)
Four Column ASCII - https://news.ycombinator.com/item?id=13539552 - Feb 2017 (68 comments)
https://dl.acm.org/doi/epdf/10.1145/365628.365652
also defined 6-bit ASCII subset
- https://en.wikipedia.org/wiki/ASCII#History
- https://en.wikipedia.org/wiki/Hexadecimal#Cultural_history
If you have to prominently represent 10 things in binary, then it's neat to allocate a slot of size 16 and pad out the remaining 6 entries. Which is to say, it's neat to proceed from all zeroes:
x x x x 0 0 0 0
x x x x 0 0 0 1
x x x x 0 0 1 0
....
x x x x 1 1 1 1
It's more of a cause for hexadecimal notation than an effect of it.
EDIT: it would need to predate the 6-bit teletype codes that preceded ASCII.
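To make the slot-of-16 point concrete for ASCII: the digits occupy 0x30 to 0x39, so the low four bits are the digit's value and six slots are left over. A quick sketch (`digit_value` is a name invented for the example):

```python
def digit_value(d: str) -> int:
    """Recover a decimal digit's value from the low nibble of its ASCII code."""
    return ord(d) & 0x0F


for d in "0123456789":
    print(f"{d}: {ord(d):08b} -> {digit_value(d)}")

# The six slots left over in the 16-wide block, 0x3A through 0x3F.
padding = "".join(chr(c) for c in range(0x3A, 0x40))
print(padding)
```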
(I'm almost reluctant to spoil the fun for the kids these days, but https://en.wikipedia.org/wiki/%C2%A3sd )
Though the 01 column is a bit unsatisfying because it doesn’t seem to have any connection to its siblings.
Also explains why there is no difference between Ctrl-x and Ctrl-Shift-x.
for x in range(0x0,0x20): print(chr(x),end=" ")
for x in range(0x0,0x20): print(f'({x}|{chr(x)})', end=' ')
(0|) (1|) (2|) (3|) (4|) (5|) (6|) (7|) (8) (9| ) (10|
) (11|
) (12|
) (14|) (15|) (16|) (17|) (18|) (19|) (20|) (21|) (22|) (23|) (24|) (25|) (26|␦) (27|8|) (29|) (30|) (31|)
If you want to use the symbols for Mars and Venus, for example, they are not in range(0,0x20). They are in the Miscellaneous Symbols block.
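Printing the raw control characters lets the terminal act on them (backspace, line feed, carriage return, escape), which is why the output above comes out mangled. One way around that, sketched here, is to substitute the visible stand-ins from the Unicode Control Pictures block, which starts at U+2400; `control_picture` is a hypothetical helper name.

```python
def control_picture(code: int) -> str:
    """Map a control code (0-31) to its Unicode Control Pictures glyph."""
    if not 0 <= code < 0x20:
        raise ValueError("expected a code in range(0, 0x20)")
    return chr(0x2400 + code)


print(" ".join(f"({x}|{control_picture(x)})" for x in range(0x20)))
```

Whether the glyphs render legibly depends on the font in use.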
The even smaller 5-bit Baudot code already had special characters to shift between two sets and to discard the previous character. Murray code, used for typewriter-based devices, introduced CR and LF, so they were needed quite frequently, and for far more than a few years.
ASCII did us all the favor of hitting a good stopping point and leaving the “infinity” solution to the future.
You probably mean 28-31 (∟↔▲▼, or ␜␝␞␟)
Unless this is octal notation? But 0o60-0o63 in octal is 0123
I don't fault the creators of ASCII - those control characters were probably needed at the time. The fault is ours for not moving on from the legacy technology. I think some non-ASCII/Unicode encodings did reuse the control character bytes. Why didn't Unicode implement that? I assume they were trying to be compatible with some existing encodings, but couldn't they have chosen the encodings that made use of the control character code points?
If Unicode were to change it now (probably not happening, but imagine ...), what would they do with those 32 code points? We couldn't move other common characters over to them - those already have well-known, heavily used code points in Unicode and also iirc Unicode promises backward compatibility with prior versions.
There still are scripts and glyphs not in Unicode, but those are mostly quite rare and effectively would continue to waste the space. Is there some set of characters that would be used and be a good fit? Duplicate the most commonly used codepoints above 8 bits, as a form of compression? Duplicate combining characters? Have a contest? Make it a private area - I imagine we could do that anyway, because I doubt most systems interpret those bytes now.
Also, how much old data, which legitimately uses the ASCII control characters, would become unreadable?