Hidden HTML elements, zero-width characters, base64 payloads, fake LLM delimiters (<|im_start|>, [INST], <<SYS>>) — WebFetch passes all of it straight through. mcp-safe-fetch strips it in 8 stages on raw HTML and the resulting markdown.
Tested against PayloadsAllTheThings: caught 3 hidden elements and 4 LLM delimiter patterns WebFetch missed. Side effect I didn't expect — ~90% average token reduction across 4 test sites. Live test: same article, same task, 24,700 tokens vs 575.
Doesn't catch semantic injection (malicious instructions in visible text). That requires model judgment.
npx -y mcp-safe-fetch init — sets up Claude Code in one command. Works with any MCP client.
I'm sure you are already thinking about other attack vectors, web fetch is one way injection gets in but agents have a lot more surfaces. User input, tool responses, memory, other agents in a chain.
I've been poking at handling this sanitization at the api call level and filtering everything. Definitely more latency w this approach, but essentially denying all.