Archive.today is directing a DDoS attack against my blog - https://news.ycombinator.com/item?id=46843805 - Feb 2026 (168 comments)
Ask HN: Weird archive.today behavior? - https://news.ycombinator.com/item?id=46624740 - Jan 2026 (69 comments)
Every Reddit archived page used to have a Reddit username in the top right, but then it disappeared. "Fair enough," I thought. "They want to hide their Reddit username now."
The problem is, they did it retroactively too, removing the username from past captures.
You can see on old Reddit captures where the normal archived page has no username, but when you switch the tab to the Screenshot of the archive it is still there. The screenshot is the original capture and the username has now been removed for the normal webpage version.
When I noticed it, it seemed like such a minor change, but with these latest revelations, it doesn't seem so minor anymore.
With this said, I also disagree with turning everyone that uses archive[.]today into a botnet that DDoSes sites. Changing the content of archived pages also raises questions about the authenticity of what we're reading.
The site behaves as if it was infected by some malware and the archived pages can't be trusted. I can see why Wikipedia made this decision.
It still is. uBlock's default lists are killing the script now, but if it's allowed to load it still tries to hammer the other blog.
"You found the smoking gun!"
This is absolutely the buried lede of this whole saga, and needs to be the focus of conversation in the coming age.
With all of this context shared, the Internet Archive is likely meeting this need without issue, to the best of my knowledge.
[1] https://meta.wikimedia.org/wiki/Wikimedia_Endowment
[2] https://perma.cc/about ("Perma.cc was built by Harvard’s Library Innovation Lab and is backed by the power of libraries. We’re both in the forever business: libraries already look after physical and digital materials — now we can do the same for links.")
[3] https://community.crossref.org/t/how-to-get-doi-for-our-jour...
[4] https://www.crossref.org/fees/#annual-membership-fees
[5] https://www.crossref.org/fees/#content-registration-fees
(no affiliation with any entity in scope for this thread)
Also the oldest of that kind, rarely mentioned and free: https://www.freezepage.com
The URLs proved to be less permanent than expected, and so the issue of "linkrot" was addressed, mostly at the Internet Archive, and then through wherever else could bypass paywalls and stash the content.
All content hosted by the WMF project wikis is licensed under Creative Commons or compatible licenses, with narrow exceptions for limited, well-documented Fair Use content.
Shortcut is to consume the Wikimedia changelog firehose and make these http requests yourself, performing a CDX lookup request to see if a recent snapshot was already taken before issuing a capture request (to be polite to the capture worker queue).
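A rough sketch of the "check CDX first" step, in case it helps. It assumes the Wayback CDX server endpoint at web.archive.org/cdx/search/cdx with its `url`, `output=json`, `fl` and `limit` parameters as I understand them. The function names and the 30-day freshness window are made up for illustration, and the changelog consumer and the capture request itself are left out.

```python
import json
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Wayback CDX server.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(target_url, limit=-1):
    """Build a CDX query; a negative limit asks for the last N captures."""
    params = {
        "url": target_url,
        "output": "json",
        "fl": "timestamp,statuscode",
        "limit": str(limit),
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def latest_snapshot_time(cdx_rows):
    """cdx_rows is the parsed JSON: a header row followed by capture rows."""
    if len(cdx_rows) < 2:
        return None
    header, *rows = cdx_rows
    idx = header.index("timestamp")
    newest = max(row[idx] for row in rows)  # 14-digit timestamps sort as text
    return datetime.strptime(newest, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def needs_capture(cdx_rows, now, max_age=timedelta(days=30)):
    """True if there is no snapshot, or the newest one is older than max_age."""
    latest = latest_snapshot_time(cdx_rows)
    return latest is None or now - latest > max_age


def fetch_cdx_rows(target_url):
    """Network call, kept separate so the decision logic stays testable."""
    with urlopen(cdx_query_url(target_url), timeout=30) as resp:
        body = resp.read().decode("utf-8").strip()
    return json.loads(body) if body else []
```

You would call `needs_capture(fetch_cdx_rows(url), datetime.now(timezone.utc))` for each external link seen in the change feed and only queue a capture when it returns True.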
It's not possible to imitate Googlebot well enough to fool a site (or WAF) which knows what it's doing, because the canonical way to verify Googlebot is a DNS lookup dance which will only ever succeed if the request comes from one of Googlebot's dedicated IP addresses. Same with Bingbot and all the others.
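For anyone unfamiliar with the DNS dance, here is a minimal sketch of it: reverse-resolve the client IP, check that the hostname sits under a Google-owned domain, then forward-resolve that hostname and confirm it maps back to the same IP. The function names are mine, and the suffix list only covers the two domains I'm sure of, so treat it as illustrative rather than complete.

```python
import socket

# Assumption: only the two hostname suffixes I'm certain about are listed.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip):
    """PTR lookup; returns the hostname or None if there is no record."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def forward_lookup(hostname):
    """A/AAAA lookup; returns the set of addresses the name resolves to."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except OSError:
        return set()
    return {info[4][0] for info in infos}


def is_verified_googlebot(ip, reverse=reverse_lookup, forward=forward_lookup):
    """Reverse lookup, domain check, then forward-confirm the same IP."""
    host = reverse(ip)
    if not host:
        return False
    host = host.rstrip(".").lower()
    if not host.endswith(GOOGLE_SUFFIXES):
        return False
    # Anyone can set a PTR record claiming googlebot.com for their own IP,
    # but only Google controls what names under googlebot.com resolve to.
    return ip in forward(host)
```

The forward-confirm step is what makes spoofing fail: a scraper can fake the User-Agent and even its own PTR record, but it cannot make a googlebot.com hostname resolve to its address. The resolvers are passed in as parameters only so the logic can be exercised without touching real DNS.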
The curious part is that they allow web scraping of arbitrary pages on demand. So a publisher could put in a lot of arbitrary requests to archive their own pages and see whether they all come from a single account or a small subset of accounts.
I hope they haven't been stealing cookies from actual users through a botnet or something.
Why? In the world of web scraping this is pretty common.
Maybe they use accounts for some special sites. But there is definitely some automated generic magic happening that manages to bypass paywalls of news outlets. Probably something Googlebot related, because those websites usually give Google their news pages without a paywall, probably for SEO reasons.
That effort appears to have gone nowhere, so now suddenly archive.today commits reputational suicide? I don't suppose someone could look deeper into this please?
Archive.today is directing a DDoS attack against my blog?
Oh? Do tell!
I personally just don't use websites that paywall important information.
>Oh? Do tell!
They do. In the very next paragraph in fact:
The guidance says editors can remove Archive.today links when the original
source is still online and has identical content; replace the archive link so
it points to a different archive site, like the Internet Archive,
Ghostarchive, or Megalodon; or “change the original source to something that
doesn’t need an archive (e.g., a source that was printed on paper)”
> editors can remove Archive.today links when the original source is still online and has identical content
Hopeless. Just begs for alteration.
> a different archive site, like the Internet Archive,
Hopeless. It allows archive tampering by the page's own JS and archive deletion by the domain owner.
> Ghostarchive, or Megalodon
Hopeless. Coverage is insignificant.
Hopeless. Caught tampering with the archive.
The whole situation is not great.
I did so. You're welcome.
As for the rest, take it up with Jimmy Wales, not me.
Oh good. That's definitely a reasonable thing to do or think.
The raw sociopathy of some people. Getting doxxed isn't good, but this response is unhinged.
We live at a moment where it's trivially easy to frame possession of an unsavory (or even illegal) number on another person's storage media, without that person even realizing (and possibly, with some WebRTC craftiness and social engineering, even get them to pass on the taboo payload to others).
In response to J.P.'s blog, which had already framed AT as a project grown from a carding forum and pushed his speculations onto Ars Technica, whose parent company just destroyed 12ft and is on to a new victim. The story is full of untold conflicts of interest covered with a soap opera around the DDoS.
It’s still a threat, isn’t it?
The article about the FBI subpoena that pulled J.P.'s speculations out of the closet was also in Ars Technica and by the same author, and that same article explicitly mentioned how happy they are with 12ft being down.
From hero to a Kremlin troll in five seconds.
I see WP is not proposing to run its own.
Like Wikipedia?
> Internet archives wayback machine works as alternative to it.
It is appallingly insecure. It lets archives be altered by page JS and deleted by the page domain owner.
Yes, they are essential, and that was the main reason for not blacklisting Archive.today. But Archive.today has shown they do not actually provide such a service:
> “If this is true it essentially forces our hand, archive.today would have to go,” another editor replied. “The argument for allowing it has been verifiability, but that of course rests upon the fact the archives are accurate, and the counter to people saying the website cannot be trusted for that has been that there is no record of archived websites themselves being tampered with. If that is no longer the case then the stated reason for the website being reliable for accurate snapshots of sources would no longer be valid.”
How can you trust that the page that Archive.today serves you is an actual archive at this point?
Oh dear.
> How can you trust that the page that Archive.today serves you is an actual archive at this point?
Because no one has shown evidence that it isn't.