Copyright-ignoring AI scraper bots laugh at robots.txt
6 points | 7 days ago | 2 comments | theregister.com | HN
maniacwhat
7 days ago
The AI companies have shown they don't care at all about site owners' preferences: they simply ignore them.

I don't see why a new language to express preferences would make any difference here.

PeterStuer
7 days ago
Honestly, some sites are so ridiculously misconfigured in their anti-bot zeal that it becomes a Heisenberg-like dilemma.

E.g. I want to pull in the RSS feed. It is there specifically for machine-to-machine (m2m) use. If I dare fetch robots.txt, I'm flagged as a bot and denied the whole site, including not just the RSS feed but even the parts that robots.txt does not disallow.
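
For illustration, here is a minimal sketch of what a well-behaved feed reader does in that scenario: read robots.txt, then check whether the feed URL is allowed before requesting it. It assumes Python's standard `urllib.robotparser`; the robots.txt content, the user agent name `my-feed-reader`, the `example.com` URLs and the `allowed` helper are all made up for the example, and the rules are parsed from a string so that nothing touches the network.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no HTTP request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The feed is not disallowed, so a polite client may fetch it.
feed_ok = allowed(ROBOTS_TXT, "my-feed-reader", "https://example.com/feed.xml")

# The private section is disallowed, so a polite client skips it.
private_ok = allowed(ROBOTS_TXT, "my-feed-reader", "https://example.com/private/page")
```

The irony is that only a client that bothers with this check ever requests robots.txt in the first place, which is exactly the request that gets it flagged.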
