A human postmortem of the 1996 AOL outage
26 points
2 days ago
| 3 comments
| ngrok.com
| HN
dobermanz
1 hour ago
It's 1996 - AOHell is loading, Aphex Twin blasts over 28.8, cordless phones hidden, a pizza is on the way…
devin
34 minutes ago
AOHell didn't take any time to load, and no one was streaming music on 28.8.
romanhn
3 minutes ago
I was definitely listening to RealAudio radio stations over a 14.4 connection.
stigz
17 minutes ago
> We, ngrok, have sponsored Mac to write this post because we think it’s an underexplored perspective on the topic of reliability.

Uh, okay. Were there any reliability perspectives gained from this 30-year-old postmortem that would help us in the modern age? After reading the article, I feel the answer is "none". Not that I'm complaining; I love this era of the internet. But I fail to see any importance here.

knuckleheads
1 hour ago
Something that I have started doing lately is asking ChatGPT et al. to check Usenet for reactions from users about events (if it is the right '80s/'90s time period). Sure enough, alt.aol-sucks on Usenet had some choice words about the outage:

>What does Cisco stand for?? Case's Internet System Crapped Out. That's right, Steve Case and his AOL pig fell victim to some mickey mouse networking equipment. Unfortunatly for AOL, they were the first ISP to feel real pain from using equipment made by Cisco Systems.

https://groups.google.com/g/alt.aol-sucks/c/iqjd7crtPs4 https://groups.google.com/g/alt.aol-sucks/c/K75nltM31Bw https://groups.google.com/g/alt.aol-sucks/c/vVup-HvlPWM

Here's a reporter asking for comments and getting laughed at and trolled: https://groups.google.com/g/alt.aol-sucks/c/mStonlu_H8E

Some more serious reactions over on comp.risks: https://catless.ncl.ac.uk/Risks/18/30#subj2 https://catless.ncl.ac.uk/Risks/18/31#subj3 https://catless.ncl.ac.uk/Risks/18/41#subj3

>Yesterday morning, I got a call because their mail system was backing up heavily. It took a while to discover the cause, but it turned out to be AOL. Because AOL's incoming mail from the Internet runs on relatively slow systems, and because they receive hundreds of thousands of Internet messages a day, they have 30 systems to receive incoming mail, all pointed at from the AOL.COM name. That means that any mail system trying to send mail to AOL would have to individually try all 30 addresses before giving up. Translate that to a 60 second (typical) wait for a connection timeout, and you've got a 30 minute time-in-queue for an AOL message.
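
The arithmetic in that quote, as a quick sketch. It assumes the sender tries each address strictly one after another and every attempt runs into the full connection timeout; the function name is made up for illustration, and real mailers may behave differently.

```python
def worst_case_attempt_seconds(num_hosts: int, connect_timeout_s: float) -> float:
    """Time one delivery attempt takes if every host is tried in
    sequence and each connection attempt times out (assumed model)."""
    return num_hosts * connect_timeout_s


# Figures taken from the quoted comp.risks post: 30 receiving systems
# behind AOL.COM and a "typical" 60 second connection timeout.
total_s = worst_case_attempt_seconds(30, 60)
print(f"{total_s / 60:.0f} minutes in queue per AOL message")
```

Multiply that by every AOL-bound message sitting in the queue and it is easy to see how a mail system would back up.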

NANOG on seclists was an interesting read too: https://seclists.org/nanog/1996/Aug/51

Flamewar over sendmail not handling outage well:

>Remember the AOL outage? One host built up a backlog of 2000 messages for AOL---but, because it was running qmail, it didn't even slow down. Meanwhile, sendmail users were choking on much smaller queues.

https://groups.google.com/g/comp.mail.sendmail/c/TeNdv2laT94
