The entire New Yorker archive is now digitized
449 points
by thm
6 days ago
| 16 comments
| newyorker.com
| HN
habosa
17 hours ago
[-]
With every passing year the New Yorker stands out even more. High quality long-form journalism and short fiction with minimal advertising (in the print issue it’s just a few at the front and one at the back) is very hard to find. I love getting my issue in the mail every week and I’ve never once thought that reading it was a waste of my time.

I’d highly encourage anyone who loves great writing to subscribe.

reply
whistle650
15 hours ago
[-]
I’m a longtime New Yorker lover myself. I think there is some truth to this though: https://open.substack.com/pub/persuasion1/p/how-the-new-york...
reply
jbaber
10 hours ago
[-]
I subscribe, but stare right through ads, unnoticing. Do they really not have that margin ad for berets anymore?
reply
yujzgzc
16 hours ago
[-]
Did this change? I stopped reading the print version for lack of time a few years back, and there was definitely some full-page and margin advertising throughout the paper. I recall some of it being clearly directed at much wealthier customers than I was.
reply
waldothedog
16 hours ago
[-]
The placement and count of ads tend to vary from issue to issue, but in general the volume is much lower than in many publications. But agreed, the ads do tend to be almost comically high end (for me).
reply
smelendez
23 hours ago
[-]
I’ve long thought about trying to map how the locations of music and maybe theater events listed in the magazine have changed over time.

There are performances of some kind in pretty much every corner of NYC but it’s interesting to see which neighborhoods have had events deemed relevant to The New Yorker readership in different eras.

reply
bufordsharkley
19 hours ago
[-]
It also speaks to what we lose when we lose magazine listings of events (New Yorker effectively gutted this section within the past decade), movie showtime listings via newspaper, etc

We have a very strong archive going back a century until about 2015, but now wading through linkrot circa 2017 is miserable

reply
smelendez
18 hours ago
[-]
And in the current era, listings for less-than-major-venue music in many places live exclusively on the Instagram and Facebook pages of venues and bands.
reply
gregsadetsky
17 hours ago
[-]
in addition to making a map, it would also be a fascinating timeline: you could show venues (as they appear/disappear through time) and artists, and filter/search those

imagine seeing listings for John Coltrane or Miles Davis or Benny Goodman...

let me know if I can help - it's a beautiful & great project idea!
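
To make the timeline idea concrete, here is a minimal sketch, assuming the listings have already been extracted into (year, venue, artist) records. The sample data, venue and artist names, and both function names are hypothetical placeholders, not anything taken from the archive.

```python
from collections import defaultdict

# Hypothetical, hand-made sample records: (issue_year, venue, artist).
# Real data would have to be extracted from the archive's listings pages.
listings = [
    (1957, "Venue A", "Artist X"),
    (1961, "Venue A", "Artist Y"),
    (1961, "Venue B", "Artist X"),
    (1975, "Venue B", "Artist Z"),
]

def venue_spans(records):
    """Return {venue: (first_year_listed, last_year_listed)}."""
    years = defaultdict(list)
    for year, venue, _artist in records:
        years[venue].append(year)
    return {venue: (min(ys), max(ys)) for venue, ys in years.items()}

def search_artist(records, name):
    """Return (year, venue) pairs for one artist, sorted by year."""
    return sorted(
        (year, venue) for year, venue, artist in records if artist == name
    )
```

`venue_spans(listings)` gives the first and last year each venue shows up, which is enough to draw venues appearing and disappearing on a timeline; `search_artist` is the filter/search half.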

reply
Q6T46nT668w6i3m
14 hours ago
[-]
That’s an incredible idea and I hope you do this! If you do, you should consider adding restaurants too.
reply
paganel
21 hours ago
[-]
That's a very neat idea! If you ever have the time, you should try it out. In fact, you've given me the idea of trying to do the same for my city, Bucharest; I just need to find some relevant data sources.
reply
smelendez
18 hours ago
[-]
Travel guides are interesting too although obviously not quite the same.
reply
krelian
22 hours ago
[-]
I hope this gets incorporated into the existing website. I'm not an active subscriber but I used to be, and I always thought there was a very fertile "other articles you might like" ground that the New Yorker never took advantage of, given its reputation and legacy.
reply
tclancy
20 hours ago
[-]
I’ve happily lost hours to following links at the bottom of one story to the next. The new archive still feels a little clunky (search needs a fair bit of work and the OCR clearly struggled in places), but it’s fun to chase down old classics and they’ve done a great job of highlighting greatest hits from the past 100 years.

Plus the (really high-quality) crossword puzzles often have an Easter egg where the big revealer is linked to an essay from the past.

reply
gregsadetsky
20 hours ago
[-]
I think that a better link (even though it lacks the context) is this new archive (which is mostly good as it lets you quickly see all cover pages) - https://www.newyorker.com/archive

But yeah, without a subscription, this still mostly just leads to walled off pages.

Accessing the actual archived version of every issue at https://archives.newyorker.com/ is truly wonderful as they are fully digitized back to back.

reply
toofy
18 hours ago
[-]
hopefully a lot of local libraries will have access. i could spend hours sifting through this.
reply
jjaaammmmy
17 hours ago
[-]
Unfortunately, it's not likely. The full text back to 1925 (of articles, with no images) has been available on ProQuest for a while, and many libraries subscribe to that which is ok, but lacking all the great photos, cartoons, ephemera etc.

Many libraries also subscribe to Libby/Overdrive which does include the full images of all the pages, but Libby only provides coverage for the past year. Unfortunately publishers of newspapers and magazines often offer great archival content of this sort on their websites, but don't allow libraries to license it for their patrons.

reply
qingcharles
11 hours ago
[-]
I saw them all on the High Seas recently, but each year is ~20GB of PDFs.
reply
robin_reala
22 hours ago
[-]
Slightly different question, but does anyone have any info about Google’s digitisation of Mainichi Shimbun’s pre-war articles? The work was announced 3 years ago, but it’s been radio silence since: https://mainichi.jp/english/articles/20221110/p2a/00m/0bu/00...
reply
donohoe
18 hours ago
[-]
About 10 years ago, when I was at The New Yorker, I worked on launching the redesign, paywall, and the move to WordPress. We actually had most of the archive technically ready to go. The data wasn’t the hard part.

The real blocker was permissions and rights. Contracts going back a century obviously never contemplated digital publication, domains, or the internet at all. Untangling who owned what, and securing the right to republish everything online, was a massive legal and logistical undertaking.

That’s what held us back then, not so much the technology. Really glad to see that chapter finally closed.

reply
donohoe
17 hours ago
[-]
Fun (unrelated) fact:

My favorite product that I got to build there was “Cartoons at Random”. You’ll never guess what it did/was!

I miss it terribly, just swiping images off a stack to reveal a new random cartoon underneath.

The developer (Justin?) built an amazing interaction in the iOS app (seamless, no jank) and the web version was decent too.

They broke it when they migrated from WordPress to their own Condé Nast CMS.

https://www.newyorker.com/cartoons/random/share/1544311

Such delight. Sigh.

reply
taveras
9 hours ago
[-]
I'm bummed that we never kept that link working - it was a fun start page.
reply
donohoe
3 hours ago
[-]
It happens. I felt like the Copilot/Autopilot CMS team had a lot going on, so I understood. Oddly enough, it was also a good play for a decent native ad experience (example: we ran a decently funny set of Bill Murray cartoons, and that was good), and I assumed that would ensure its survival.
reply
rconti
16 hours ago
[-]
Any idea what changed, if anything? Court decisions made in the meantime simplifying things?

Hopefully the content fits in a few buckets (cartoons, fiction, non-fiction) as far as different terms for rights might go. And then from there, you can lop off anything that's past its copyright term (?). Then maybe the next step is grouping works by the agent/publisher, if any? Or maybe all the contracts with the New Yorker are signed by individuals, with the New Yorker as a publisher. I don't know.

reply
donohoe
14 hours ago
[-]
I assume it was a matter of time - ten years of digging into contracts or chasing people/agencies down (speculative on my part)? Bear in mind, if you are unsure if you have rights to a piece then you cannot use it until you know for sure - I am sure that was part of it too.
reply
subpixel
23 hours ago
[-]
Here’s a place to start, a list of 250 “best” articles from the New Yorker. I guess this is from previously available articles.

https://www.reddit.com/r/longform/s/zRJgAEdagi

reply
detourdog
17 hours ago
[-]
My personal favorite is Louis Menand’s piece on how bad Microsoft Word is.

https://www.newyorker.com/magazine/2003/10/06/the-end-matter

reply
msla
23 hours ago
[-]
Possibly friendlier link:

https://old.reddit.com/r/longform/comments/1e8m5s1/the_250_b...

(old.reddit.com takes you to the old UI)

reply
JKCalhoun
22 hours ago
[-]
I saw no way to pull down a PDF. That's unfortunate as I prefer to browse offline.
reply
ez_mmk
22 hours ago
[-]
I think you can download the entire issue from the archive
reply
boh
21 hours ago
[-]
Honestly this got me to subscribe. The back catalog is pretty stellar with pretty much every major writer of the twentieth century making a contribution. Zooming in on PDFs just wasn't how you wanted to read them.
reply
TrevorFSmith
20 hours ago
[-]
I am a subscriber but still would love a tarball of PDFs of each issue.
reply
gavmor
22 hours ago
[-]
How soon can we chat with it via RAG?
reply
visarga
21 hours ago
[-]
Haha, I can't read long articles anymore because I want to reply, a habit I picked up chatting with LLMs.
reply
xnx
1 day ago
[-]
Nice! 100 years worth.
reply
fnord77
14 hours ago
[-]
cynical me thinks they did this to sell to AI companies
reply
NoMoreNicksLeft
1 day ago
[-]
Could have sworn they did this years ago. I even have the first 80 years or whatever on DVD in the closet.
reply
throwup238
19 hours ago
[-]
Normally when laymen say "digitized" they mean one of two things: scanned images in a PDF, or fully transcribed (and possibly formatted) text extracted from the scan. The Complete New Yorker you're thinking of was mostly the former, with a bit of indexing (a table of contents pointing to the PDFs, if I remember correctly).

This latest digitization project does the latter, transcribing the text into their existing content management system and as far as I can tell, preserving much of the formatting. This comes with full text search, allows cross linking between articles, and all that good stuff.

I suspect that since they include an LLM summary and started this digitization project in early 2024, this was enabled by LLMs.

reply
smelendez
23 hours ago
[-]
If I’m reading this correctly, they now have all their historic articles loaded into their CMS. I think they previously just had a system where you could page (and maybe search?) through scans of old issues, which is also cool but not as versatile.
reply
ghaff
1 day ago
[-]
When a lot of content was being put out on CD/DVD, a number of publications did this, but the discs are not straightforwardly accessible these days because they usually target an old version of Windows. (Yes, if you want to make a project of it, you can probably get into them, but it has never been worth it for me.)
reply
haunter
23 hours ago
[-]
Usually Windows/Wine is the much better case than the old Mac apps (32-bit, PPC, etc.) in the age of Apple Silicon

https://old.reddit.com/r/thenewyorker/comments/1jlhrve/instr...

Breaking the DjVu DRM would be the perfect solution though

reply
qingcharles
21 hours ago
[-]
It has been broken. I actually have the set on my desk ready to rip, I just couldn't find my USB DVD drive.

Here's a link to the guy that broke it:

https://github.com/reconSuave/PlayboyPDF/

reply
mekael
21 hours ago
[-]
Surprisingly, this has been a project I’ve been tinkering with for years. There is an easy way to get the raw PNG/JPEG files out, but it does require a Windows box. I'm planning on working on it more over the long holiday.
reply
zorked
23 hours ago
[-]
I think the disc release GP is talking about had files in DjVu format.
reply
Tomte
21 hours ago
[-]
Encrypted DjVu, and the viewer doesn't run on modern Windows.
reply
medler
15 hours ago
[-]
It runs great on Windows 11. The install took a long time but I didn’t have to do anything special to make it work.
reply
Tomte
10 hours ago
[-]
Maybe we have different editions? I never got mine to work.
reply
fsckboy
23 hours ago
[-]
doesn't wine have old versions of mswindows pretty much nailed?
reply
kopirgan
23 hours ago
[-]
I have the MAD archives, bought in the 90s on CDs, but can't use them.
reply
haunter
23 hours ago
[-]
The issues on the Absolutely MAD DVD (1952-2005) are just plain PDF files, no DRM, they work perfectly

https://files.catbox.moe/x4np6u.png

reply
kopirgan
5 hours ago
[-]
No, mine were from the pre-DVD era, on CD. Older. They had a surprisingly good UI with its own funny stuff. You install that and insert disc 1-7 based on which issue you select. It would even scold you for inserting the wrong disc, with comments like 'you can insert a CD of Yanni if you prefer screeching' or something like that. Lol, I don't know what MAD has against him, but their comments are always funny.
reply
ghaff
22 hours ago
[-]
The CDs I have seem to be proprietary for Windows from the late 90s. But I also have PDFs through 2005 on my computer which I must have "acquired" at some point.
reply
kopirgan
5 hours ago
[-]
Yes, the file names are in some unknown format. It comes with its own software to access them. They did a damn good job.

For instance, in Disk 1, there is a big binary file mad.m1 492MB. That seems to hold content, but not sure what file type or which program can open it. Rest of the files are very small.
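
One cheap first step for a file like that is to look at its first few bytes and compare them against well-known file signatures. The following is a minimal sketch, not a real format detector: the signature list is a small illustrative sample, the function names are made up for this example, and a proprietary container may well match nothing.

```python
# A few well-known file signatures ("magic bytes"). Illustrative, not exhaustive.
SIGNATURES = [
    (b"%PDF-", "PDF document"),
    (b"AT&TFORM", "DjVu document"),
    (b"PK\x03\x04", "ZIP archive"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"MSCF", "Microsoft Cabinet archive"),
    (b"RIFF", "RIFF container (e.g. AVI or WAV)"),
]

def sniff(header: bytes) -> str:
    """Guess a format from the leading bytes of a file."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    return "unknown"

def sniff_file(path: str, n: int = 16) -> str:
    """Read the first n bytes of the file at path and guess its format."""
    with open(path, "rb") as f:
        return sniff(f.read(n))

# Example (the path is hypothetical): print(sniff_file("mad.m1"))
```

If the result is "unknown", the file may be a custom archive that packs many images together, in which case scanning the whole file for embedded JPEG or PNG signatures would be the next thing to try.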

reply
haunter
21 hours ago
[-]
The browser app might be some outdated Windows application, that's the case with the MAD DVD too, but you can find the actual issue files in some folders
reply
ghaff
23 hours ago
[-]
I have MAD archives somewhere. I thought they were in some standard format but maybe not.

A lot of the gen 1 or so CD content isn't easily accessible although a more industrious person could probably get to it in some manner.

reply
kopirgan
5 hours ago
[-]
I have the CDs backed up as ISO files, which I can mount, since laptops these days don't have CD drives.

I need to try on the latest Windows 11; I gave up earlier. For a while I had a Windows 2000 virtual machine that worked.

reply