Show HN: A search engine for deleted YouTube videos (1.5B+ indexed since 2005)
14 points
1 day ago
| 2 comments
| tube.archivarix.net
| HN
archivarix
1 day ago
Search engine for YouTube content that's no longer on YouTube: deleted, removed, region-blocked, DMCA'd. ~1.5B videos indexed from 2005 onwards by aggregating archive sources: the Internet Archive Wayback Machine (CDX + HEAD-spread discovery) and Common Crawl. What you get for any video ID: metadata (title, description, channel, upload date, duration, view counts, tags), thumbnails, original captions when the archive captured them, and reconstructed URLs to play the archived video file when available. Channel discovery reconciles legacy username/handle eras to a single canonical identity (lots of channels renamed themselves a dozen times; that part was painful).
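For anyone curious what the CDX side of this can look like, here is a minimal sketch, not the site's actual pipeline. It assumes the public Wayback Machine CDX server endpoint and its `url`, `output`, `fl`, `filter`, and `limit` query parameters. The helper names (`build_cdx_url`, `parse_cdx_rows`, `replay_url`) are hypothetical, and the "HEAD-spread discovery" step mentioned above is not covered. No network call is made; the parser expects the JSON rows that the endpoint returns with `output=json`, where the first row is the header.

```python
from urllib.parse import urlencode

# Public Wayback Machine CDX server endpoint (assumed to be the one queried).
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_url(video_id: str, limit: int = 50) -> str:
    """Build a CDX query URL listing captures of a YouTube watch page."""
    params = {
        "url": f"youtube.com/watch?v={video_id}",
        "output": "json",                        # rows as JSON arrays
        "fl": "timestamp,original,statuscode",   # fields to return
        "filter": "statuscode:200",              # keep successful captures only
        "limit": str(limit),
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def parse_cdx_rows(rows):
    """Turn CDX JSON output (header row + data rows) into a list of dicts."""
    if not rows:
        return []
    header, *data = rows
    return [dict(zip(header, row)) for row in data]


def replay_url(capture):
    """Reconstruct the Wayback replay URL for one parsed capture."""
    return f"https://web.archive.org/web/{capture['timestamp']}/{capture['original']}"
```

Fetch the URL from `build_cdx_url` with any HTTP client, pass the decoded JSON to `parse_cdx_rows`, and `replay_url` gives a link to each archived page. Finding the archived video file itself would need additional lookups that this sketch does not attempt.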
n1xis10t
1 day ago
Seems pretty cool. So this is a recent project, and you haven’t been working on this since 2005, right?

Have you considered also indexing videos that haven’t been deleted?

n1xis10t
21 hours ago
Update: I mustered the courage to try the search engine, since it didn't look much like a scam, and it becomes apparent as soon as you use it that non-deleted videos are also indexed.
archivarix
17 hours ago
Yes, the database contains all the videos, both deleted and active ones. Or rather, not the videos themselves, but the metadata and links to the video files in the web archive. I don't have servers large enough to store the videos themselves.