2025 GitHub Copilot vulnerabilities – technical overview
2 points
1 year ago
| 1 comment
| apexhq.ai
| HN
Terr_
1 year ago
[-]
IMO the critical concept to explain LLM prompt injection and manipulation like this is that almost all these "assistants" are fictional characters in a document that looks like a theater-play script, along with some tricks that "speaks out" the character so that we humans believe it's a real entity. (Meanwhile, our own inputs invisibly become words "spoken" by another "The User" character.)

So the true LLM is a nameless lump tasked with Make Any Document Longer. If for any reason the prior state is "Copilot Says: Sure, " then the LLM is probably going to try to make something that "fits" with that kind of intro.

This becomes extra-dangerous when when the generated play-script has stuff like "Copilot opens a terminal and runs the command X", and some human programmers decided to put in special code to recognize and "act out" that stage-direction.

> AI assistants like Copilot need strong context-awareness

That'll be hard. The LLM is just Making Document Longer, and the document is one undifferentiated string with no ownership. Without core algorithm changes, you're stuck trying to put in flimsy literary guardrails.

Really hardening it means getting closer to the "real" AI of sci-fi stories, where the machine (not just an assembled character named The Machine) recognizes multiple entities as existing, recognizes logical propositions, trakcs which entities are asserting those proposition (and not just referencing them), and assigning different trust-levels or authority.

reply