Show HN: We told OpenClaw to rm -rf and it failed successfully

1 point

by joshdevon

1 hour ago

| 1 comment

| securetrajectories.substack.com

As we all know, OpenClaw is awesome precisely because it gives us Simon Willison’s lethal trifecta: access to private data, exposure to untrusted content, and the ability to communicate externally.

While extremely risky, it gives us a glimpse of the future we could have if we could actually trust agents.

To date, sandboxing (or buying Mac minis) has been the approach to reducing risk. While necessary, sandboxes also make the agent less useful because they restrict the very capabilities that make it helpful.

To wrangle OpenClaw, we took a complementary approach. Instead of just a perimeter, we built an open source OpenClaw extension that creates deterministic lanes for the agent using Cedar (AWS's policy-as-code language).

For example, we created a policy that forbids OpenClaw from using rm. We aren't trying to stop the LLM from thinking about deleting a file or stop it from being prompt injected to delete a file. Instead, the extension catches the tool call and blocks it before execution.
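To make the idea concrete, here is a rough Python sketch of checking a tool call against a forbid rule before it executes. It is not the extension's actual code and it does not use Cedar; the names `FORBIDDEN_COMMANDS`, `evaluate`, and `guarded_call`, and the "shell" tool name, are all made up for illustration.

```python
import shlex
from dataclasses import dataclass

# Hypothetical rule set: command names the agent may never execute.
FORBIDDEN_COMMANDS = {"rm", "sudo"}


@dataclass
class Decision:
    allowed: bool
    reason: str


def evaluate(tool_name: str, command: str) -> Decision:
    """Deterministically allow or deny a tool call before it runs."""
    # Assumption: only a tool named "shell" runs commands.
    if tool_name != "shell":
        return Decision(True, "no rule matches this tool")
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unparseable input (e.g. unbalanced quotes): fail closed.
        return Decision(False, "command could not be parsed")
    for token in tokens:
        # Compare on the basename so /bin/rm is caught as well as rm.
        # Checking every token is deliberately conservative: it also
        # blocks harmless uses such as `echo rm`.
        name = token.rsplit("/", 1)[-1]
        if name in FORBIDDEN_COMMANDS:
            return Decision(False, f"policy forbids '{name}'")
    return Decision(True, "no forbid rule matched")


def guarded_call(tool_name: str, command: str, execute):
    """Run execute(command) only if the policy check allows it."""
    decision = evaluate(tool_name, command)
    if not decision.allowed:
        return f"BLOCKED: {decision.reason}"
    return execute(command)
```

The check never looks at the prompt or the model's reasoning, only at the concrete call, so the same input always gets the same decision. A token match like this is easy to sidestep (for example by wrapping the command in `sh -c "..."`), which is one reason a real policy language is a better fit than a hand-rolled filter.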

We are shipping with 3 policy packs (103 rules):

- Baseline pack: Protections for sudo, rm, credentials, etc.

- OpenClaw System Protection: Protects SOUL.md, identity files, etc.

- OWASP Agentic Pack: Based on the OWASP Top 10 for Agentic Applications.

Just like OpenClaw, this is experimental and hasn't been rigorously tested, so please don't use the extension to protect anything valuable or sensitive. We hope this project is a strong proof of concept for how we can put agents in risky situations and still trust them by enforcing deterministic rules.

For more details and the link to the repo, please check out our write-up. Would love to hear what others think of the approach and what policies you think would be useful to add.