Built something that tackles this from the local side - VeilPiercer runs 6 agents that score each other's outputs every 35 seconds on your own hardware via Ollama. Score history gets written to memory so the swarm actually learns what good output looks like over time. One-time $197, no cloud dependencies. Happy to share more about how the peer-scoring works if useful.
The sandbox approach @arty_prof mentioned is essential, but there's also the other side of the coin: data leakage.
If an LLM agent has read access to your local filesystem to 'help' with code, it effectively has a map of your credentials: `.env` files, SSH keys, cloud CLI configs, cached auth tokens.
Aside from Dockerizing everything, are people using local, air-gapped LLMs for sensitive security logic to eliminate the 'Phone Home' risk entirely? Curious whether anyone has successfully integrated something like Ollama into their dev flow for this specific reason.