I'm sure you already know this one, but for anyone else reading this I can share my favourite StackOverflow answer of all time: https://stackoverflow.com/a/1732454
> For some reason or other, people have been posting a lot of excerpts from old emails on Twitter over the last few days.
At the risk of having missed the latest meme or social media drama: does anyone know what this "some reason or other" is?
Edit: Question answered.
Why do mail servers care about how long a line is? Why don't they just let the client reading the mail worry about wrapping the lines?
The server needs to parse the message headers, so it can't be an opaque blob. If the client uses IMAP, the server needs to fully parse the message. The only alternative is POP3, where the client downloads all messages as blobs and you can only read your email from one location, which made sense in the year 2000 but not now when everyone has several devices.
Given a mechanism for soft line breaks, breaking lines before they reach 80 characters would increase compatibility with older mail software and be more convenient when viewing the raw email in a terminal.
This is also why MIME Base64 typically inserts line breaks after 76 characters.
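For anyone who wants to see the 76-character wrapping for themselves, here is a minimal sketch using Python's standard `base64` module. The payload is made-up filler bytes standing in for an attachment, not anything from a real message.

```python
import base64

# Arbitrary filler bytes standing in for a binary attachment (an assumption).
payload = bytes(range(256)) * 2

# encodebytes() produces MIME-style Base64: a newline after every 76 output characters.
encoded = base64.encodebytes(payload)
lines = encoded.decode("ascii").splitlines()

longest = max(len(line) for line in lines)
print(longest)  # 76

# Decoding ignores the inserted line breaks, so the payload survives intact.
assert base64.decodebytes(encoded) == payload
```

The line breaks are part of the encoding, not of the data, which is why the round trip gives back the original bytes.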
I suspect this is relevant because Quoted-Printable was only a useful encoding for MIME types like text and HTML (the human-readable email body), not binary (e.g. attachments, images, videos). Mail servers (if they want) can effectively treat the binary types as an opaque blob, while the text types can be read for more efficient transfer of message listings to the client.
telnet smtp.mailserver.com 25
HELO foo.com
MAIL FROM:<me@foo.com>
RCPT TO:<you@bar.com>
DATA
blah blah blah
how's it going?
talk to you later!
.
QUIT
I think there is a second possible conclusion, which is that the transformation happened historically. Everyone assumes these emails are an exact dump from Gmail, but isn't it possible that Epstein was syncing emails from Gmail to a third party mail server?
Since the Stack Overflow post details the exact situation in 2011, I think we should be open to the idea that we're seeing data collected from a secondary mail server, not Gmail directly.
Do we have anything to discount this?
(If I'm not mistaken, I think you can also see the "=" issue simply by applying the Quoted-Printable encoding twice, not just by mishandling the line-endings, which also makes me think two mail servers. It also explains why the "=" symbol is retained.)
I wonder why even have a max line length limit in the first place? I.e. is this for a technical reason or just display related?
I wonder if the person who had the idea of virtualizing the typewriter carriage knew how much trouble they would cause over time.
Edit: yes I think that's most likely what it is (and it's SHOULD 78ch; MUST 998ch) - I was forgetting that it also specifies the CRLF usage, it's not (necessarily) related to Windows at all here as described in TFA.
Here it is in my 'notmuch-more' email lib: https://github.com/OJFord/amail/blob/8904c91de6dfb5cba2b279f...
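For reference, the two limits (78 characters as the SHOULD, 998 as the MUST, both counted without the trailing CRLF) can be checked with a few lines of Python. This is only a rough sketch and not the code from the linked library; `classify_line` and `check_message` are hypothetical names made up for illustration.

```python
from collections import Counter

SHOULD_LIMIT = 78   # recommended maximum line length, excluding the CRLF
MUST_LIMIT = 998    # hard maximum line length, excluding the CRLF

def classify_line(line: str) -> str:
    """Classify one line (without its CRLF) against the two limits."""
    if len(line) > MUST_LIMIT:
        return "over MUST"
    if len(line) > SHOULD_LIMIT:
        return "over SHOULD"
    return "ok"

def check_message(raw: str) -> Counter:
    """Count how many CRLF-delimited lines fall into each class."""
    return Counter(classify_line(line) for line in raw.split("\r\n"))

# Made-up sample: one short line, one 100-character line, one 1000-character line.
sample = "\r\n".join(["short line", "x" * 100, "y" * 1000])
counts = check_message(sample)
print(dict(counts))
```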
The article doesn't claim that it's Windows related. The article is very clear in explaining that the spec requires =CRLF (3 characters), then mentions (in passing) that CRLF is the typical line ending on Windows, then speculates that someone replaced the two characters CRLF with a one character new line, as on Unix or other OSs.
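To make the =CRLF point concrete, here is a small Python sketch. The sample body is hand-written, and `naive_soft_break_removal` is a hypothetical buggy cleanup step invented for illustration; it is not claimed to be the exact processing that was applied to the published emails. It only shows how a line-ending mismatch can leave stray "=" characters behind.

```python
import quopri

# Hand-written Quoted-Printable body: "=\r\n" is a soft line break,
# "=C3=B6" is the UTF-8 encoding of "ö".
wire = b"This sentence was wrapped by the sen=\r\nder's mail client, rock d=C3=B6ts and all."

# Correct decoding: the soft break vanishes and the escapes are resolved.
decoded = quopri.decodestring(wire).decode("utf-8")
print(decoded)

def naive_soft_break_removal(data: bytes) -> bytes:
    """Hypothetical buggy cleanup that only knows the three-byte form '=CRLF'."""
    return data.replace(b"=\r\n", b"")

# Assumed earlier pipeline step: CRLF line endings normalized to a bare LF.
normalized = wire.replace(b"\r\n", b"\n")

# The cleanup no longer finds "=CRLF", so the "=" and the line break both survive.
mangled = naive_soft_break_removal(normalized).decode("ascii")
print(mangled)
```

A decoder that follows the actual decoding rules, like `quopri.decodestring`, copes with either line ending; the damage only appears when someone pattern-matches on the raw text.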
It's just so hacky, I can't believe it's a real-life solution.
Consider converting the original text (maintaining the author’s original line wrapping and indentation) to base64. Has anything been “inserted” into the text? I would suggest not. It has been encoded.
Now consider an encoding that leaves most of the text readable, translates some things based on a line length limit, and some other things based on transport limitations (e.g. passing through 7-bit systems.) As long as one follows the correct decoding rules, the original will remain intact - nothing “inserted.” The problem is someone just knowledgeable enough to be aware that email is human readable but not aware of the proper decoding has attempted to “clean up” the email for sharing.
Infinite line length = infinite buffer. Even worse, QP is 7-bit (because SMTP started out ASCII-only), so every byte above 127 gets encoded as three bytes (an equals sign, then two hex digits), and 500 bytes of non-ASCII UTF-8 become 1500 bytes on the wire.
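The three-for-one expansion is easy to check with Python's standard `quopri` module. A quick sketch, with a made-up sample string kept short enough that no soft line break is needed:

```python
import quopri

text = "ö" * 10                  # made-up sample: ten non-ASCII characters
utf8 = text.encode("utf-8")      # each "ö" is two bytes in UTF-8 (0xC3 0xB6)
qp = quopri.encodestring(utf8)   # each of those bytes becomes "=XX"

print(len(text), len(utf8), len(qp))  # 10 20 60
print(qp.decode("ascii"))

# Decoding restores the original text exactly.
assert quopri.decodestring(qp).decode("utf-8") == text
```

Note that the count is per byte, not per character: in UTF-8 every non-ASCII character is at least two bytes, so it costs at least six bytes in Quoted-Printable.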
It all made sense at the time. Not so much these days when 7-bit pipes only exist because they always have.
But I agree with the sibling comment: it makes more sense when it's called "encoding" instead of "inserting chars into original stream".
Did the site get the HN kiss of death?
I, too, was reading about the new Epstein files, wondering what text artifact was causing things to look like that.
https://nitter.net/AFpost/status/2017415163763429779?s=201
Something clearly went wrong in the process.
I'm glad to know the real reason!
cat title | sed 's/anyway/in email/'
would save a click for those already familiar with =20 etc. On a side note: There are actually products marketed as kosher bacon (it's usually beef or turkey). And secular Jews frequently make jokes like this about our kosher bros who aren't allowed to eat the real stuff for some dumb reason like it has too many toes.
The writer presumably knows that umlauts and other non-ASCII characters are functional in many languages. "rock döts" is poking fun at the trend in a certain tranche of anglophone rock/metal to use them in a purely aesthetic way in band names etc.
Back in those days optical scanners were still used.