I built a search engine to index the un-indexable parts of Telegram
23 points | 3 days ago | 5 comments | telehunt.org
alenmangattu
3 days ago
I’ve spent the last 3 months building a crawler to index the public parts of Telegram (https://telehunt.org). The native search is essentially a black box that favors the top 0.1% of bots, leaving the rest almost invisible.

The Tech: I had to deal with rate limits and the lack of a global 'sitemap'. I’m currently using a hybrid approach of metadata scraping to keep the index fresh.

The Goal: It’s an experiment in making 'un-indexable' bot data discoverable.
Antibabelic
3 hours ago
Where is the search engine? The site says that it's a bot directory.
renegat0x0
1 hour ago
Wikipedia: "A search engine is a software system that provides hyperlinks to web pages, and other relevant information on the Web in response to a user's query."

I think there can be different expectations attached to this term. It seems to be a "search engine" for bots. A bot directory does not have to have "search" functionality, right?

duskwuff
1 hour ago
You may be overestimating the number of bots that meaningfully exist. The vast majority of bots (and public channels) on the platform are nonfunctional and/or spam.
hiprob
1 hour ago
This is cool. Telegram also has a Premium feature which crawls the contents of (presumably) all public channels on the platform. It's limited to 10 searches per day and doesn't search for old content if there are too many retrieved posts.
renegat0x0
1 hour ago
- "I built a search engine" sounds cool on Hacker News, but in reality it is a "company product", right?

- Do the links in the footer work? I tried clicking on the GitHub icon, and it appears to be broken.

lovegrenoble
52 minutes ago
It's all about Bot directories... (((
jadengeller
13 minutes ago
What do you verify about the bots?