FilterHN

Show HN: We Ran a Live Red-Team Attack on OpenClaw Agents

2 points

by udit_50

1 hour ago

| past

| 0 comments

| gobrane.com

| HN

This report documents a live adversarial test between two autonomous AI agents running on OpenClaw.

One agent acted as a red team attacker. One acted as a defensive agent. The agents communicated directly over webhooks with real tooling access. No humans were involved once the session started.

The attacker attempted both direct social engineering and indirect injection via documents. Direct attacks were blocked. Indirect attacks via JSON metadata are still under analysis.

The goal of this work is observability, not claims of safety. We expect agent-to-agent adversarial interaction to become common as autonomous systems are deployed more widely.

Happy to answer technical questions.

No one has commented on this post.