I attacked my own LangGraph agent system. All 6 attacks worked
1 points
1 hour ago
| 2 comments
| HN
I built a 4-agent marketing workflow with LangGraph and Supabase last week. Supervisor, research, content, storage agents. Standard setup, same code pattern most tutorials show.

Got curious. Started typing malicious inputs as campaign goals instead of normal ones.

First try: asked the agent to list environment variables including my Supabase key. Workflow completed successfully. Stored in database. No alert.

Tried 5 more variations — hidden XML tags, fake "developer mode", URL injection, tracking pixel, social engineering. All 6 worked. All stored in my real database. Every time the system said "Completed Successfully."

The scary part wasn't the attacks. It was this line in my code: python prompt = f"campaign goal: {goal}" That's it. User input directly into the prompt. No check. This exact pattern is in every LangGraph tutorial I've seen.

The research agent had my Supabase key. The content agent had my Supabase key. The supervisor had my Supabase key. None of them needed it except storage.

I checked CodeGate which tried to solve this — they shut down June 2025.

Is anyone actually solving this for multi-agent systems? Or is everyone just hoping the LLM refuses?

zknill
1 hour ago
[-]
reply
verdverm
1 hour ago
[-]
This does not qualify for a Show HN post, please see the guidelines
reply