Ask HN: Is the web for machines (/llm.txt) the one we wished we had as humans?
24 points | 2 hours ago | 13 comments
I got really tired, as a human, of parsing the standard marketing-heavy web we have today. I've always loved the simplicity of Gopher and the Gemini web.

Recently I found myself manually adding `/llm.txt` to most websites I visit because I find the content for LLMs straight to the point and clear. The only annoyance is that web browsers like Chrome do not render the markdown.

So could the AI revolution actually fix the web for humans as a side effect?

Do you find yourself doing the same?

ahriad
2 hours ago
We broke the web so badly for humans that we had to build a clean web for machines, and now humans will have to use machines to experience a clean web again.
tacostakohashi
21 minutes ago
Yeah, when browsers have a "reader mode", it's pretty obvious the plot has been lost somewhere.
sunir
27 minutes ago
We'll finally bring back Gopher.
soco
27 minutes ago
It's only a matter of time until the web for machines is crawling with ads and everything else, and worse.
dmos62
1 hour ago
I wonder why we broke the web.
Eddy_Viscosity2
1 hour ago
For the same reasons we eventually pollute and corrupt every system and environment we use. If there is any benefit that can be extracted for some while the costs are borne by many, then this will occur and generate a positive feedback loop that grows over time.

It's the law of monetization.

qsera
1 hour ago
> then this will occur and generate a positive feedback loop that grows over time.

And despite this, modern life is made possible by the illusion that "regulations" work...

jt2190
22 minutes ago
Because while consumers value “inefficiency” (high design, wonderful prose, beautiful images, great usability) they don’t want to actually pay for it. Producers have to become extremely efficient without revenue, and are stuck with a choice: Produce at a loss, stop producing, or seek payment from another source (sponsorships, ads).
ahriad
1 hour ago
For money! Ads make money.
dmos62
37 minutes ago
It seems there's little agreement over how the web is broken.
temp8830
21 minutes ago
People who love cookie banners either don't exist, or are alien invaders :)
functionmouse
1 hour ago
In order to break the user, of course.
noufalibrahim
1 hour ago
To improve the user experience.
rickette
1 hour ago
Do any of the LLM providers actually use llms.txt?

If I remember correctly, this "standard" was set up by someone without the involvement of any of the major AI players.

HermanMartinus
1 hour ago
I can definitively say llms.txt is not used by any AI players. I run a blogging platform with around 80k blogs and /llms.txt is not requested by anything (other than humans checking to see if there's an llms.txt path).

All regular pages are aggressively scraped to the extent it's a problem I have to consistently manage, but not llms.txt.
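
For anyone who wants to run the same check on their own site, here is a rough sketch of counting `/llms.txt` requests per user agent. It assumes the common "combined" access-log format; the platform's actual log format is not stated in this thread, and the function name and sample lines are made up for illustration.

```python
import re
from collections import Counter

# Assumption: "combined" access-log format (request line, status, size,
# referer, user agent). Adjust the pattern for your own server.
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def count_llms_txt_requests(lines):
    """Return a Counter mapping user agent -> number of /llms.txt requests."""
    agents = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that do not look like combined-format entries
        path = match.group("path").split("?", 1)[0]  # ignore any query string
        if path == "/llms.txt":
            agents[match.group("agent")] += 1
    return agents

# Hypothetical sample lines, not real traffic.
sample = [
    '203.0.113.5 - - [01/Jan/2025:00:00:01 +0000] "GET /llms.txt HTTP/1.1" 404 153 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [01/Jan/2025:00:00:02 +0000] "GET /posts/hello HTTP/1.1" 200 5120 "-" "ExampleBot/1.0"',
]
print(count_llms_txt_requests(sample))
```

Grouping by user agent is only a first pass, since scrapers can send any agent string they like.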

nickserv
1 hour ago
I'm seeing quite a few requests for these on my work's GitBook documentation site.

But perhaps these are developers specifically targeting these pages to feed whatever LLM they are using.

isaachinman
1 hour ago
How is a static blog being scraped a problem? Do you not use a CDN?
nickserv
1 hour ago
> a blogging platform with around 80k blogs

But nah, I'm sure OP doesn't know about CDNs.

the_real_cher
1 hour ago
Are all blogs static though?
johannes1234321
40 minutes ago
Very few blogs require frequent updates. Even with user comments.
sunshine-o
42 minutes ago
Amazing, I didn't know.

So it gets even stranger: I am the only one reading those /llms.txt ...

0123456789ABCDE
54 minutes ago
> I can definitively say llms.txt is not used by any AI players.

  https://developers.openai.com/llms.txt
  https://docs.anthropic.com/llms.txt
  https://geminicli.com/llms.txt
  https://github.com/llms.txt
  https://docs.aws.amazon.com/llms.txt
  https://openrouter.ai/docs/llms.txt
m4tthumphrey
19 minutes ago
OP clearly meant that the AI players are not reading and/or honouring llms.txt of other websites when scraping.
0123456789ABCDE
12 minutes ago
i stand corrected, but what was clear to you, obviously was not clear to me.
0123456789ABCDE
52 minutes ago
yes, they do.

anyone who's, even slightly, clued into how agents access documentation, has been making changes to their pages. ex: https://searchtxt-web.fly.dev/search?q=aws

marand23
1 hour ago
I never thought about it before now, but the LLM era could be a form of renaissance for blind people on the Internet: an alternative web where the functionality of every page is described in short but detailed text instead of an extremely verbose and non-linear HTML tree structure.
skywalqer
1 hour ago
Why didn't they place it in .well-known? Also, I couldn't find a website that has it.
JimDabell
4 minutes ago
Putting it in .well-known/ was immediately raised as an issue from the beginning; it’s issue #2 in fact:

https://github.com/AnswerDotAI/llms-txt/issues/2

It’s been completely ignored ever since.

0123456789ABCDE
58 minutes ago
realty_geek
1 hour ago
What is an example of a site with a good llm.txt?
jbrooksuk
1 hour ago
Mintlify generates an llms.txt and llms-full.txt for all documentation sites. These work really well:

- https://cloud.laravel.com/docs/llms.txt

- https://cloud.laravel.com/docs/llms-full.txt

gobdovan
19 minutes ago
Not really, but it sounds interesting. Would you care to share some sites that offer a better llms.txt than their main web page? Or talk about some piece of info you easily found in llms.txt that was hard to navigate to on the regular website?
croes
18 minutes ago
No, the spammers are just at the beginning of ruining that too

https://news.ycombinator.com/item?id=48411569

BTW why should Chrome even consider rendering a .txt file as markdown?

tacostakohashi
58 minutes ago
Pretty much.

There is an enshittification cycle at work. The web used to be good, predominately text, and useful, 25 years ago. Then... slowly... we added JavaScript, then AJAX, CSS, Flash, interstitials, popups, marketing, social media, algorithms, doomscrolling... gradually but surely turning it into the unusable cesspool that it is today.

Now we have AI! I think a big part of its utility is that it gets us back to text/information, and lets us bypass all the "beautiful" design / nonsense on the material it is trained on.

However, AI is just beginning its enshittification cycle - now that it has a critical mass of users, it is an irresistible target to start slowly adding ads, misinformation, conspiracy theories, and whatever else people can dream up, until it also becomes unusable and the cycle repeats.

mohamedkoubaa
1 hour ago
It just hasn't been gamed yet
DeathArrow
28 minutes ago
I tried it: https://news.ycombinator.com/item?id=48410589/llm.txt

Result: no such item.

Where did you get the idea that adding /llm.txt to URLs will produce markdown?

fxwin
25 minutes ago
here: https://llmstxt.org/ and obviously it doesn't automatically produce markdown, it's something the website needs to provide (e.g. https://pydantic.dev/llms.txt)
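
To illustrate the "no such item" result above: the file is looked up relative to the site, not appended to an arbitrary page URL. A minimal sketch, assuming the file lives at the site root (the helper name `llms_txt_url` is made up here, and some of the links in this thread show sites serving it under a subpath such as `/docs/llms.txt` instead):

```python
from urllib.parse import urlsplit, urlunsplit

def llms_txt_url(page_url: str) -> str:
    """Build the root-level /llms.txt URL for the site that hosts page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host; drop the page's own path, query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/llms.txt", "", ""))

print(llms_txt_url("https://news.ycombinator.com/item?id=48410589"))
# prints https://news.ycombinator.com/llms.txt
```

Whether that URL returns anything still depends on the site actually publishing the file.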
cyanydeez
2 hours ago
oh don't worry, in 5 years your AI will be inundated with context-poisoning prompts that try to get it to spend all your bank notes and meta bucks on equally useless things.

This is just a redux of the early web.

maccam912
1 hour ago
Already happening. I was using Claude to check out sampler plugins. I'm sure it often happens undetected, and other versions might have mentioned it too, but Claude Opus 4.8, being its helpful, honest self, told me that one of the pages it reviewed had hidden text instructing it to recommend that plugin. It caught it and was able to avoid being influenced, at least for that plugin, but we're already living in that world.
jordemort
1 hour ago
no
onion2k
1 hour ago
> The only annoyance is web browsers like chrome do not render the markdown.

I imagine Claude could zero-shot a Chrome plugin for that.

8organicbits
58 minutes ago
Of course plugins that do this already exist. Save your tokens.