I've been watching the rise of AI agents with a mix of excitement and dread. We're building incredible tools that can browse the web, but we're forcing them to navigate a world built for human eyes. They scrape screens and parse fragile DOMs.
We're trying to tame them to act like humans. I believe this is fundamentally wrong. The goal isn't to make AI operate at a human level, but to unlock its super-human potential.
The current path is dangerous. When agents from OpenAI, Google, and others start browsing at scale and speed, concepts like UI/UX will lose meaning for them. The entire model of the web is threatened. Website owners are losing control over how their sites are used, and no one is offering a real solution. The W3C is thinking about it. I decided to build it.
That's why I created AURA (Agent-Usable Resource Assertion).
It's an open protocol with a simple, powerful idea: let website owners declare what an AI can and cannot do. Instead of letting an agent guess, the site provides a simple aura.json manifest.
This gives control back to the site owner. It's a shift from AIs scraping data to AIs being granted capabilities. We get to define the rules of engagement. This expands what AIs can do, not by letting them run wild, but by giving them clear, structured paths to follow.
A confession: I'm not a hardcore programmer; I consider myself more of a systems thinker. I actually used AI extensively to help me write the reference implementation for AURA. It felt fitting to use the tool to build its own guardrails.
The core of the protocol, a reference server, and a client are all open source on GitHub. You can see it work in 5 minutes:
Clone & Install: git clone https://github.com/osmandkitay/aura.git && cd aura && pnpm install
Run the Server: pnpm --filter aura-reference-server dev
Run the Agent: (in a new terminal) pnpm --filter aura-reference-client agent -- http://localhost:3000 "list all the blog posts"
You'll see the agent execute the task directly, no scraping or DOM parsing involved.
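Roughly, the agent's loop looks like this. This is a minimal sketch rather than the actual reference client, and the manifest field names (capabilities, method, endpoint) are illustrative:

```typescript
// Minimal sketch of an AURA-aware agent -- not the reference client.
// The manifest shape (capabilities, method, endpoint) is illustrative.
async function runCapability(site: string, name: string, args: Record<string, unknown>) {
  // 1. Discover: fetch the site's manifest from its well-known location.
  const manifest = await (await fetch(`${site}/.well-known/aura.json`)).json();

  // 2. Select: look up the capability the task needs, e.g. "list_posts".
  const cap = manifest.capabilities?.[name];
  if (!cap) throw new Error(`Site does not declare capability "${name}"`);

  // 3. Execute: one clean HTTP call instead of scraping and DOM parsing.
  const res = await fetch(`${site}${cap.endpoint}`, {
    method: cap.method,
    headers: { "Content-Type": "application/json" },
    body: cap.method === "GET" ? undefined : JSON.stringify(args),
  });
  return res.json();
}

// e.g. await runCapability("http://localhost:3000", "list_posts", {});
```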
The GitHub repo is here: https://github.com/osmandkitay/aura
I don't know if AURA will become the standard, but I believe it's my duty to raise this issue and start the conversation. This is a foundational problem for the future of the web. It needs to be a community effort.
The project is MIT licensed. I'm here all day to answer questions and listen to your feedback—especially the critical kind. Let's discuss it.
The issue with this is that website owners don't want to do this. Take Reddit effectively killing its free API, for example: everyone just switched to scraping the Reddit website for any remaining third-party clients.
Yes, APIs were supposed to be a compromise to lower the resources needed on both sides, but Reddit's stock price is linked to the value of "their" data, so...
Alternatively, malicious website owners may make incorrect Aura files to mislead user agents. Then we're back to screen scraping as the ground truth, because behaving like a human is the best way to avoid discrimination.
And that's really what AURA is trying to address. If we don't figure something out, the web could easily end up being just a handful of big AI sites, while all the smaller, independent sites fade away.
The goal with AURA is to give control back to the people running the websites. It's not about blocking AI, but about giving sites a clear, standard way to say "here's how you can work with me meaningfully...". This means an agent can do something specific and useful without costly, aimless scraping, and it lets site owners build cool new features just for AIs.
And you're right to worry about malicious manifests, but that's a trust problem. A site that lies in its AURA file would get a bad reputation fast, just like a phishing site does now.
At the end of the day, AURA is a bet that we can build an open, capability-based web where site creators can join the AI revolution on their own terms. Time will tell if it's the right technical answer, but it's a conversation I think we absolutely need to have to keep the web diverse and creative.
Are any websites actually using it (for llms.txt: https://llmstxt.site/)? Why do I need to npm install anything instead of writing a text file?
- llms.txt tells an AI: "Here is a clean, simple version of this page for you to read."
- AURA tells an AI: "Here is a capability called create_post; it needs a title and content, and you can use it by POSTing to the /api/posts endpoint."
llms.txt is for reading; AURA is for doing.
You also asked why you need npm install. You don't! To add AURA to your own website, you only need to create one static file: public/.well-known/aura.json. That's it. It's just as simple as creating an llms.txt file.
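If it helps to picture that file, here's a rough sketch of the kind of thing it could contain, written as a TypeScript object for readability. The field names are illustrative rather than the official schema, so treat the repo's reference manifest as the source of truth:

```typescript
// A guess at what a minimal manifest could contain. Field names are
// illustrative, not the official AURA schema -- check the repo's
// reference manifest for the real shape.
const manifest = {
  protocol: "aura",
  capabilities: {
    list_posts: {
      description: "List all blog posts",
      method: "GET",
      endpoint: "/api/posts",
    },
    create_post: {
      description: "Create a new blog post",
      method: "POST",
      endpoint: "/api/posts",
      parameters: { title: "string", content: "string" },
      auth: "required",
    },
  },
};

// Serialize it and drop it in place, e.g.:
// fs.writeFileSync("public/.well-known/aura.json", JSON.stringify(manifest, null, 2));
```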
The npm install command in my project is for people who want to download and run my full reference implementation (the example server and client). It is not a requirement for the protocol itself.
And for your last question: AURA is new, so no other websites are using it yet.
This honestly made me laugh out loud.
Which of course means this is never going to fly with any site that needs to show you ads.
Think of it this way: instead of an AI trying to find and click the "post comment" button, it can just use the "post_comment" capability the site offers through a clean API. While this seems to sidestep the ad model, it actually enables a more direct one. A site could specify that a particular action, say a premium translation feature, requires a small payment or an API key. It's a way to get paid for the actual value you provide, not just for ad views.
This could even change how search works: imagine a search engine that indexes what sites can do, not just what they say. Your personal AI could then find and execute a "book a flight" capability from multiple airlines to find the best deal for you, all without you ever loading a webpage. It's a different way of thinking about the web's economy, moving from attention to action.
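As a toy sketch of that idea (the "book_flight" capability name, the example domains, and the live probing of well-known URLs are just for illustration; a real search engine would crawl and cache this):

```typescript
// Hedged sketch of "indexing what sites can do": check which sites declare
// a given capability in their manifest.
async function findProviders(sites: string[], capability: string): Promise<string[]> {
  const providers: string[] = [];
  for (const site of sites) {
    try {
      const manifest = await (await fetch(`${site}/.well-known/aura.json`)).json();
      if (manifest.capabilities?.[capability]) providers.push(site);
    } catch {
      // No manifest or unreachable -- this site can't be used this way.
    }
  }
  return providers;
}

// A personal agent could then invoke the capability on each provider and compare offers:
// const airlines = await findProviders(["https://airline-a.example", "https://airline-b.example"], "book_flight");
```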
Have you considered something like <script type="text/llm"> or Link: <https://api.example.com/llms/foo>; rel="llm:foo", or just normal content negotiation on individual pages?
However, you're right that context is critical for individual URLs. I handle that with the dynamic AURA-State HTTP header. While the manifest is the static map of everything possible, the AURA-State header in each response tells the agent what's available right now for that specific page or state (e.g. "you are logged in, so create_post is now available").
So I get the best of both worlds: efficient site-wide discovery and dynamic, state-aware execution.
For example, the AURA manifest says the create_post capability needs auth. If an agent ignores that and POSTs to /api/posts without a valid cookie, our server's API will reject it with a 401. The manifest doesn't do the blocking; the backend does. It just tells cooperative agents the rules of the road ahead of time.
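To make that concrete, here's an Express-style sketch of the enforcement side. It's not the actual reference server, and the session check is made up, but it shows where the real gate lives:

```typescript
// Express-style sketch of backend enforcement -- not the reference server.
// The session-cookie check is made up for illustration.
import express from "express";

const app = express();
app.use(express.json());

app.post("/api/posts", (req, res) => {
  // aura.json can *say* create_post needs auth, but the real gate is here:
  // without a valid session, the request fails whether or not the agent
  // ever read the manifest.
  const hasSession = (req.headers.cookie ?? "").includes("session=");
  if (!hasSession) {
    return res.status(401).json({ error: "Authentication required" });
  }
  // ...create the post from req.body.title and req.body.content...
  res.status(201).json({ ok: true });
});

app.listen(3000);
```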
So the real incentive for an agent to use AURA isn't about avoiding punishment; it's about the huge upside in efficiency and reliability. Why scrape a page and guess at DOM elements when you can make a single, clean API call that you know will work? It saves the agent developer time, compute resources, and the headache of maintaining brittle scrapers.
So:
robots.txt tells good bots what they shouldn't do.
aura.json tells them what they can do, and gives them the most efficient way to do it, all backed by the server's actual security logic.
In that light, I guess your proposal makes a certain amount of sense. I don't think it addresses what a lot of web sites want, but that's not necessarily a bad thing. Own your niche.
Otherwise, it just seems you vibecoded the wheel.
OpenAPI is fantastic for describing a static API for a developer to read. But the web is more than that: it's a dynamic, stateful environment built for human interaction. The current trend of forcing AI agents to navigate this human-centric web with screen scraping and DOM manipulation is brittle and, I believe, unsustainable. It's like sending a robot into a grocery store to read the label on every single can instead of just asking the manager for the inventory list.
This is where AURA tries to be different, in two key ways:
Control & permission, not just documentation: AURA is designed from the website owner's perspective. It's a way for a site to say, "This is my property, and here are the explicit rules for how an automated agent can interact with it." The aura.json file is a handshake, a declaration of consent. It gives control back to the site owner.
Statefulness (this is the big one): An OpenAPI spec is stateless. It can't tell an agent what it can do right now based on its current context. That's what the AURA-State header solves. For example, before you log in, AURA-State might only show you the list_posts and login capabilities. After you successfully call login, the very next response from the server includes a new AURA-State header that unlocks capabilities like create_post and update_profile. The agent discovers its new powers dynamically. This state management is core to the protocol and doesn't really have a parallel in OpenAPI.
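A rough illustration of that flow (the /api/login endpoint, the credential shape, and the comma-separated header encoding are assumptions for this example, not the protocol's actual wire format):

```typescript
// Sketch of state-aware discovery: the agent learns its new capabilities
// from the AURA-State header on the response, not from re-reading the site.
async function loginAndDiscover(site: string, email: string, password: string) {
  // Before logging in, a response might carry: AURA-State: list_posts,login
  const res = await fetch(`${site}/api/login`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ email, password }),
  });

  // After a successful login, the same header might now read:
  // AURA-State: list_posts,create_post,update_profile
  const unlocked = (res.headers.get("AURA-State") ?? "").split(",").filter(Boolean);
  return unlocked; // the agent discovers its new powers from the response itself
}
```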
You're right to be skeptical, and as I said in my post, maybe AURA isn't the final answer. But I strongly believe the web needs a native, capability-aware layer for the coming wave of AI agents. The current path of brute-force interaction feels like it will break the open, human-centric web we've all built.
> I actually used AI extensively to help me write the reference implementation for AURA.
So that's why. You drank the kool aid.
You can tell that OP is a big AI believer by the final sentence. That's gotta be one of the most ChatGPT lines I've ever read.
This is nothing like robots.txt; it's much more like a sitemap. In fact, this design goal is almost word for word the point of the semantic web in general. You may find that there are existing working groups for similar resource description frameworks. Given how poor adoption of semantic tagging has been, I somewhat doubt sites will start doing it just for LLMs.
Incidentally, I thought the whole point of an AI agent was that it could read and understand things by itself. I welcome any improvement in the semantic content of the web, but isn't scraping kind of the point?