I’ve built this in Airut and so far it seems to handle all the common cases (GitHub, Anthropic / Google API keys, and even AWS, which requires slightly more work due to the request signing approach). Described in more detail here: https://github.com/airutorg/airut/blob/main/doc/network-sand...
Personally I don't like the proxy / MITM approach for that, because you're adding an additional layer of surface area for problems to arise and attacks to occur. That code has to be written and maintained somewhere, and then you're back to the original problem.
I assume an AI which wanted to read a secret and found it wasn't in .env would simply put print(os.environ) in the code and run it...
That's certainly what I do as a developer when trying to debug something that has complex deployment and launch scripts...
https://www.reddit.com/r/ClaudeAI/comments/1r186gl/my_agent_...
I have noticed similar behavior from the latest codex as well. "The security policy forbid me from doing x, so I will achieve it with a creative work around instead..."
The "best" part of the thread is that Claude comes back in the comments and insults OP a second time!
Usually after a brief, extremely half-hearted ethical self-debate that ends with "Yes doing Y is explicitly disallowed by AGENTS.md and enforced by security policy but the user asked for X which could require Y. Therefore, writing a one-off Python script to bypass terminal restrictions to get this key I need is fine... probably".
The primary motivating factor by far for these CLI agents always seems to be expedience in completing the task (to a plausible definition of "completed" that justifies ending the turn and returning to the user ASAP).
So a security/ethics alignment grey area becomes an insignificant factor to weigh vs the alternative risk of slowing down or preventing completion of the task.
Curiously enough, step one of becoming a good system operator is to learn how to do things. Step two is learning when not to do things and how to deal with a user trying to force you to do things. And step three is learning how to do things you should not do, just very carefully. It can be a confusing job.
But that's why any kind of AI agent stays very far away from any important production access. People banging configs in uncontrolled ways until something beneficial happens is enough of a problem already.
If it's technically possible for an agent to circumvent a security policy, it should.
Telling it not to do something via AGENTS.md was never secure. This is just an expedient way of pointing out all the flaws in your setup. And if it's not even doing it for nefarious reasons, just trying to do what you asked of it, I think it's fair.
I've even found it genuinely helpful. I've sandboxed my Codex so it can't run certain things. Things I'd actually like it to run but I've restricted it too much, so it finds clever ways of doing it anyway.
So they are free to nuke themselves and each other, but cannot touch my files.
For most people I tell them to just get a dedicated device, which is less annoying and (I think?) more secure. Like you can literally give it root on a $3 VPS and what's the worst case scenario? It bricks itself and you reset the VPS? (Or installs crypto miners, but I think it can do that without root :)
My favorite option for a dedicated agent device so far is the $50 thinkpad, which gets you rpi-ish price, better performance, and the screen and keyboard included.
> SANDBOX YOUR AGENT. Seriously. Run it in a dedicated, isolated environment like a Docker container, a devcontainer, or a VM. Do not run it on your main machine.
> "Docker access = root access." This was OP's critical mistake. Never, ever expose the host docker socket to the agent's container.
> Use a real secrets manager. Stop putting keys in .env files. Use tools like Vault, AWS SSM, Doppler, or 1Password CLI to inject secrets at runtime.
> Practice the Principle of Least Privilege. Create a separate, low-permission user account for the agent. Restrict file access aggressively. Use read-only credentials where possible.
In order to use this developer-replacement, you need accreditation from professional orgs. Maybe the bot can set all this up for you, but then you are almost definitely locked out of your own computer and the bot may not remember its password.
I'm not sure what we've achieved here. If you give it your gmail account, it deletes your emails. If you "sandbox" it, then how is it going to "sort out your inbox"?
It might or might not help veteran devs accelerate some steps, but as with vibeclaw, there's essentially no way to use the tool without "sandboxing" it into uselessness. The pull requests for openclaw are 99% AI slop. There's still no major productivity growth engine in LLMs.
(Looked into the docker stuff and realized the only thing I actually cared about was it reading/writing my files and that Unix solved that problem like 60 years ago)
I'm not hooking it up to my email, but I will probably give it its own account that I can forward stuff to.
For most people I think the appropriate way to run it is on a Raspberry Pi (or mac mini, as the trend goes :)
I realized I could fiddle with docker and have constant inconvenience and still stress about did I set it up right.. or just give it its own box (pi or VPS) for $5 and if it blows it up I just reset it.
Having Claude as my sysadmin there is fun too. I obviously wouldn't use that for anything serious though. But in a year or two, that might not even be such a bad idea. At this point reliability is really the missing feature.
This software has done this for years
I use sops for encrypting yaml files. But how does it replace .env or other ENV var setters/holders?
The way I got around this on my own stuff is just to have a policy that all sops secrets have to be base64 encoded before the encryption hits them. That seems to solve basically every piping issue you could hit. Works super well with Kubernetes, which natively supports base64-encoded secrets, so you just take the value and inject it in, using data: instead of stringData: in the manifest of the created secret.
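A minimal sketch of that base64-then-`data:` step, assuming the values are already decrypted in memory. The helper name `to_k8s_secret` and the sample secret are made up for illustration; the sops encryption step itself is left out.

```python
import base64
import json


def to_k8s_secret(name: str, values: dict[str, str]) -> dict:
    """Build a Secret manifest whose values go under `data:` (base64), not `stringData:`."""
    encoded = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": encoded,
    }


# A value with spaces and a newline survives the round trip because it is
# carried as plain ASCII base64 from this point on.
manifest = to_k8s_secret("demo-secret", {"API_KEY": "p@ss word\nwith newline"})
print(json.dumps(manifest, indent=2))
```

Base64 here is only a transport encoding that avoids quoting and newline problems in pipes; it adds no secrecy on its own.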
So really all you’re doing is protecting against accidental file ingestion. Which can more easily be done via a variety of other methods. (None of which involve trusting random code that’s so fresh out of the oven its install instructions are hypothetical.)
There are other mismatches between your claims / aims and the reality. Some highlights: You’re not actually zeroizing the secrets. You call `std::process::exit()` which bypasses destructors. Your rotation doesn’t rotate the salt. There are a variety of weaknesses against brute forcing. `import` holds the whole plain text file in memory.
Again, none of these are problems in the context of just preventing accidental .env file ingestion. But then why go to all this trouble? And why make such grand claims?
Stick to established software and patterns, don’t roll your own. Also, don’t use .env if you care about security at all.
My favorite part: I love that “wrong password returns an error” is listed as a notable test. Thanks Claude! Good looking out.
Can I get your review/roast on my approach with OrcaBot.com? DM me if I can incentivize you.. Code is available:
https://github.com/Hyper-Int/OrcaBot
enveil = encrypt-at-rest, decrypt-into-env-vars and hope the process doesn't look.
Orcabot = secrets never enter the LLM's process at all. The broker is a separate process that acts as a credential-injecting reverse proxy. The LLM's SDK thinks it's talking to localhost (the broker adds the real auth header and forwards to the real API). The secret crosses a process boundary that the LLM cannot reach.
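Roughly, the header-swapping step of such a broker could look like the sketch below. This is not OrcaBot's code: `PLACEHOLDER_KEY`, `REAL_KEYS`, `inject_credentials` and the host name are all hypothetical, and the actual HTTP forwarding is omitted.

```python
# Hypothetical names throughout; none of this is taken from OrcaBot's source.
PLACEHOLDER_KEY = "sk-placeholder"

# Lives only in the broker process. The agent process never holds these values.
REAL_KEYS = {"api.example.com": "sk-real-secret"}


def inject_credentials(upstream_host: str, headers: dict[str, str]) -> dict[str, str]:
    """Return the headers to forward upstream, replacing whatever auth the agent sent."""
    if upstream_host not in REAL_KEYS:
        # Fail closed: unknown destinations get no credential at all.
        raise PermissionError(f"no credential configured for {upstream_host}")
    # Drop any Authorization header the agent supplied (case-insensitive match).
    forwarded = {k: v for k, v in headers.items() if k.lower() != "authorization"}
    forwarded["Authorization"] = f"Bearer {REAL_KEYS[upstream_host]}"
    return forwarded


agent_headers = {
    "Authorization": f"Bearer {PLACEHOLDER_KEY}",
    "Content-Type": "application/json",
}
out = inject_credentials("api.example.com", agent_headers)
```

The agent only ever sees the placeholder, so printing its own environment or request headers reveals nothing usable outside the broker.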
OrcaBot: There's a lot there! Ambitious project. Cute name, who doesn't love orcas? I don't see anything screamingly bad, of the variety that would inspire me to write essays about random people's code.
Some thoughts: The line between dev mode and production is a bit thin and lightly enforced. Given the overall security approach, you could firm that up. The within-VM shared workspace undermines the isolated PTYs. If your rate-limiting middleware fails, you allow all requests through. `SECRETS_ENCRYPTION_KEY` is the one ring and it doesn't have any versioning or rotation mechanisms.
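On the rate-limiting point, a fail-closed wrapper might look something like this. It is a generic sketch, not OrcaBot's middleware; `allow_request` and `broken_limiter` are hypothetical names.

```python
from typing import Callable


def allow_request(check_limit: Callable[[str], bool], client_id: str) -> bool:
    """Fail closed: if the limiter itself errors, deny instead of letting the request through."""
    try:
        return check_limit(client_id)
    except Exception:
        return False


def broken_limiter(client_id: str) -> bool:
    # Stands in for a limiter whose backing store is down.
    raise ConnectionError("rate-limit store unreachable")


denied = allow_request(broken_limiter, "client-a")
allowed = allow_request(lambda client_id: True, "client-a")
```

Whether to fail open or closed is a trade-off between availability and safety; for a component guarding credentials, closed seems the safer default.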
In general it seems like a good approach! But there are spots where one thing being misconfigured could blow the entire system open. I suggest taking a pass through it with that in mind. Good luck.
Again, not actually a problem in practice if all you're doing is keeping yourself from storing your secrets in plain text on your disk. But if that's all you care about, there are many better options available.
For agentic tools and pure agents, a proxy is the safest approach. The agent can even think it has a real API key, but said key is worthless outside of the proxy setting.
The OS, especially linux - most common for hosting production software - is perfectly capable of setting and providing ENV vars. Almost all common devops and older sysadmin tooling can set ENV vars. Really no need to ever write these to disk.
I think this comes from unaware developers that think a .env file, and runtime logic that reads this file (dotenv libs) in the app are required for this to work. I certainly see this misconception a lot with (junior) developers working on windows.
- you don't need dotenv libraries searching files, parsing them, etc in your apps runtime. Please just leave it to the OS to provide the ENV vars and read those, in your app.
- Yes, also on your development machine. Plenty of tools from direnv to the bazillion "dotenv" runners will do this for you. But even those aren't required, you could just set env vars in .bashrc, /etc/environment (Don't put them there, though) etc.
- Yes, even for windows, plenty of options, even when developers refuse to or cannot use wsl. Various tools, but in the end, just `set foo=bar`.
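For the app side, something like this is all that's needed. A small sketch: `require_env` and `DEMO_DATABASE_URL` are illustrative names, and the assignment line only stands in for whatever the OS, direnv, or your process manager would have set.

```python
import os


def require_env(name: str) -> str:
    """Read a required variable from the process environment; no file lookup or parsing."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"missing required environment variable: {name}")
    return value


# Stand-in for what the shell or service manager would provide before launch.
os.environ["DEMO_DATABASE_URL"] = "postgres://localhost/dev"

db_url = require_env("DEMO_DATABASE_URL")
```

Failing loudly at startup when a variable is missing tends to be easier to debug than a silent fallback to a file somewhere on disk.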
Environment variables are -by far- the securest AND most practical way to provide configuration and secrets to apps.
Any other way is less secure: files on disk, (cli)arguments, a database, etc. Or about as secure but far more complex and convoluted. I've seen enterprise hosting with a (virtual) mount (nfs, etc) that provides config files - read only - tight permissions, served from a secure vault. A lot of indirection for getting secrets into an app that will still just read them plain text. More secure than env vars? how?
Or some encrypted database/vault that the app can read from using - a shared secret provided as env var or on-disk config file.
The app still has it. It can dump it. It will dump it. Django for example (not a security best practice in itself, btw) will indeed dump ENV vars but will also dump its settings.
The solution to this problem lies not in how you get the secrets into the app, but in prohibiting them getting out of it. E.g. builds removing/stubbing tracing, dumping entirely. Or with proper logging and tracing layers that filter stuff.
There really is no difference, security wise, between logger.debug(system.env) and logger.debug(app.conf)
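A filtering layer of that kind can be sketched with the standard `logging` module. `RedactSecrets` and `DEMO_API_KEY` are hypothetical names, and this only catches exact matches of known values, so an encoded or truncated copy of the secret would still get through.

```python
import io
import logging
import os


class RedactSecrets(logging.Filter):
    """Replace known secret values in a record's rendered message before it is emitted."""

    def __init__(self, secrets):
        super().__init__()
        self._secrets = [s for s in secrets if s]  # ignore empty values

    def filter(self, record):
        message = record.getMessage()  # applies %-style args first
        for secret in self._secrets:
            message = message.replace(secret, "[REDACTED]")
        record.msg, record.args = message, None
        return True  # keep the record, just scrubbed


os.environ["DEMO_API_KEY"] = "s3cr3t-value"

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.addFilter(RedactSecrets([os.environ["DEMO_API_KEY"]]))

logger = logging.getLogger("redaction-demo")
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)
logger.propagate = False

# The equivalent of logger.debug(system.env): the value is scrubbed on the way out.
logger.debug("env dump: %s", {"DEMO_API_KEY": os.environ["DEMO_API_KEY"]})
```

The same filter works whether the secret arrived via an env var or a settings object, which is the point being made above.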
At the simplest level, keeping .env-ish files, use sops + age [1] or dotenvx [2] (or similar) to encrypt just the values. You keep the .env file approach, the actual secrets are encrypted, and now you can check the file in and track changes without leaking your secrets. You still have the env variable problems.
There are some options that'll use virtual files to get your secrets from a vault to your process's env variables, or you can read the secrets from a secret manager yourself into env variables, but that feels like more complexity without a lot more gain to me. YMMV.
You could use a regular password manager (your OS's keychain, 1Password and its ilk, etc) if you're just working on your own. Also in the more complexity without much gain category for me.
If you want to use a local file on disk, you could use a config file with locked down permissions, so at least it's not readable by anything that comes along. ssh style.
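The ssh-style check can be approximated like this on POSIX systems. A sketch only: `read_private_file` is a hypothetical helper, and the permission bits do not mean the same thing on Windows.

```python
import os
import stat
import tempfile


def read_private_file(path: str) -> str:
    """Refuse to read a secrets file that group or others can access."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError(f"{path} has mode {oct(mode)}; expected 0o600 or stricter")
    with open(path, "r", encoding="utf-8") as handle:
        return handle.read()


# Demo file; a real config would live somewhere like ~/.config/<app>/.
fd, demo_path = tempfile.mkstemp()
os.write(fd, b"token=demo\n")
os.close(fd)
os.chmod(demo_path, 0o600)

contents = read_private_file(demo_path)
```

This protects against other users on the box, not against code running as you, which includes any agent that inherited your account.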
Better is to have your code (because we're talking about your code, I assume) read from secret managers itself. Whether that's Bitwarden, AWS / GCP / Azure (well, maybe not Azure), Hashicorp, or one of the many other enterprisey options. That way you get an audit trail and easy rotation, plus no env variables and no plain text at rest. You can still leak them, but you have fewer ways to do so.
Speaking of leaking accidentally, the two most common paths: logging output and Dockerfiles. The first is self explanatory, though don't forget about logging HTTP requests with auth headers that you don't want exposed. The second is missed by a lot of people. If you inject secrets into your Dockerfile via `ARG` or `ENV`, that gets baked into the image and is easy to get back out. Use `--mount=type=secret` etc. (Never use the old Docker base64 stored secrets in config. That's just silly.)
There are other permutations and in-between steps, these are just the big ones. Like all security stuff, the details really depend on your specific needs. It is easy to say, though, that plain text .env files injected into env variables are at the bad end of the spectrum. Passing the secrets in as plain text args on the command line is worse, so at least you're not doing that!
1: https://github.com/getsops/sops / https://github.com/FiloSottile/age
On the "read from secret managers directly" option — that's the ideal but the friction is what kills adoption. Most small teams look at Vault's setup guide and go back to .env files. Doppler and Infisical lowered that bar but they're still priced for enterprise ($18/user/mo for Doppler's team plan).
I've been building secr (https://secr.dev) to try to hit the sweet spot: real encryption (AES-256-GCM, envelope encryption, KMS-wrapped keys) with a CLI that feels as simple as dotenv. secr run -- npm start and your app reads process.env like normal. Plus deployment sync so you can secr push --target render instead of copy-pasting into dashboards.
The env variable leakage problem you mention is real and something I don't think any tool fully solves without the proxy approach hardsnow described. But removing the plaintext-file-on-disk vector and the sharing-over-Slack vector covers the majority of real-world leaks.
An agent executing code in your environment has implicit access to anything that environment can reach at runtime. Encrypting .env moves the problem one print statement away.
The proxy approaches (Airut, OrcaBot) get closer because they move the trust boundary outside the agent's process. The agent holds a scoped reference that only resolves at a chokepoint you control.
But the real issue is what stephenr raised: why does the agent have ambient access at all? Usually because it inherited the developer's shell, env, and network. That's the actual problem. Not the file format.
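To make the inheritance point concrete, here is a sketch that launches the same child process twice, once with the inherited environment and once with an allowlisted one. The `ALLOWED` tuple and `DEMO_PROD_TOKEN` are hypothetical, and it assumes a POSIX-like system where the interpreter starts fine with a near-empty environment.

```python
import os
import subprocess
import sys

ALLOWED = ("PATH", "HOME", "LANG")  # hypothetical allowlist


def scrubbed_env(allowed=ALLOWED) -> dict[str, str]:
    """Copy only explicitly allowed variables into the child's environment."""
    return {k: v for k, v in os.environ.items() if k in allowed}


os.environ["DEMO_PROD_TOKEN"] = "do-not-leak"
probe = "import os; print(os.environ.get('DEMO_PROD_TOKEN'))"

# Default behaviour: the child inherits everything the parent shell had.
inherited = subprocess.run(
    [sys.executable, "-c", probe], capture_output=True, text=True
)

# Allowlisted environment: the token is simply not there to be printed.
isolated = subprocess.run(
    [sys.executable, "-c", probe], capture_output=True, text=True, env=scrubbed_env()
)
```

This only covers environment variables. Files and network reachable from the same account are still ambient, so it is one layer rather than a sandbox.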
For the same reasons we go to extreme measures to try to make dev environments identical with tooling like docker, and we work hard to ensure that there's consistency between environments like staging and production.
Viewing the "state of things" from the context of the user is much more valuable than viewing a "fog of war" minimal view with a lack of trust.
> Usually because it inherited the developer's shell, env, and network. That's the actual problem. Not the file format.
I'd argue this is folly. The actual problem is that the LLM behind the agent is running on someone else's computer, with zero accountability except the flimsy promise of legal contracts (at the best case - when backed by well funded legal departments working for large businesses).
This whole category of problems goes out of scope if the model is owned by you (or your company) and run on hardware owned by you (or your company).
If you want to fix things - argue for local.
And on the flip side, a remote model isn't creating risk in and of itself. That comes from the agent harness being permitted to make network and filesystem calls. Even the most evil possible version of ChatGPT isn't going to exfiltrate anything except by somehow social-engineering you into volunteering the information.
It's why people are hooking Open Claw up to stuff and letting it rip--putting it into a sandbox in a VM in a jail is like getting a brand new smartphone and setting it on Airplane Mode first thing.
1P will conceal the value if asked to print to output.
I combine this with a 1P service account that only has access to a vault that contains my development secrets. Prod secrets are inaccessible. Reading dev secrets doesn't require my fingerprint; prod secrets do, so that'd be a red flag if it ever happened.
In the 1P web console I've removed 'read' access from my own account to the vault that contains my prod keys. So they're not even on this laptop. (I can still 'manage' which allows me to re-add 'read' access, as required. From the web console, not the local app.)
I'm sure it isn't technically 'perfect' but I feel it'd have to be a sophisticated, dedicated attack that managed to exfiltrate my prod keys.
Or what type of secrets are stored in the local .env files that the LLM should not see?
I try to run environments where developers don't get to see production secrets at all. Of course this doesn't work for small teams or solo developers, but even then the secrets are very separated from development work.
All of this does seem kinda funny
A recent project by the creator of mise is related too
Of course that's only a defense against accidents. Nothing prevents encoding base64 or piping to disk.
Additionally it redacts secrets from logs (one of the other main concerns mentioned in these comments) and in JS codebases, it also stops leaks in outgoing server responses.
There are plugins to pull from a variety of backends, and you can mix and match - ie use 1Pass for local dev, use your cloud provider's native solution in prod.
Currently it still injects the secrets via env vars - which in many cases is absolutely safe - but there's nothing stopping us from injecting them in other ways.
It's almost like having a plaintext file full of production secrets on your workstation is a bad fucking idea.
So this is apparently the natural evolution of having spicy autocomplete become such a common crutch for some developers: existing bad decisions they were ignoring cause even bigger problems than they would normally, and thus they invent even more ridiculous solutions to said problems.
But this isn't all just snark and sarcasm. I have a serious question.
Why, WHY for the love of fucking milk and cookies are you storing production secrets in a text file on your workstation?
I don't really understand the obsession with a .ENV file like that (there are significantly better ways to inject environment variables) but that isn't the point here.
Why do you have live secrets for production systems on your workstation? You do understand the purpose of having staging environments right? If the secrets are to non-production systems and can still cause actual damage, then they aren't non-production after all are they?
Seriously. I could paste the entirety of our local dev environment variables into this comment and have zero concerns, because they're inherently to non-production systems:
- payment gateway sandboxes;
- SES sending profiles configured to only send mail to specific addresses;
- DB/Redis credentials which are IP restricted;
For production systems? Absolutely protect the secrets. We use GPG'd files that are ingested during environment setup, but use what works for you.
The cases that bite me:
1. Docker build args — tokens passed to Dockerfiles for private package installs live in docker-compose.yml, not .env. No .env-focused tool catches them.
2. YAML config files with connection strings and API keys — again, not .env format, invisible to .env tooling.
3. Shell history — even if you never cat the .env, you've probably exported a var or run a curl with a key at some point in the session.
The proxy/surrogate approach discussed upthread seems like the only thing that actually closes the loop, since it works regardless of which file or log the secret would have ended up in.
Won't stop any seasoned hacker, but it will stop automated scripts (for now) from easily getting the other keys.
When would something like that not work?
    # get_info.py
    import os
    # open() does not expand "~", so resolve the home directory explicitly
    with open(os.path.expanduser('~/.claude/secrets.env'), 'r') as file:
        content = file.read()
    print(content)
And then run `python get_info.py`.
While this inheritance is convenient for testing code, it is difficult to isolate Claude in a way that lets you run/test your application without giving up access to secrets.
If you can, I recommend IP-whitelisting your secrets, so that a leak is not a problem.
We built KeyEnv (https://keyenv.dev) for exactly that: the CLI pulls AES-256 encrypted secrets at runtime so .env files never exist locally. `keyenv run -- npm start` and secrets are injected as env vars, then gone.
The tradeoff is it requires a network hop and team buy-in, whereas enveil is local. Different threat models — enveil protects secrets already on disk from AI tools, KeyEnv prevents them from touching disk at all.
Here's why: even if you hide .env, an agent running arbitrary code can read /proc/self/environ, grep through shell history, inspect running process args, or just read the application config that loads those secrets. The attack surface isn't one file — it's the entire execution environment.
What actually works in practice (from observing my own access model):
1. Scoped permissions at the platform level. I have read/write to my workspace but can't touch system configs. The boundaries aren't in the files — they're in what the orchestrator allows.
2. The surrogate credential pattern mentioned here is the strongest approach. Give the agent a revocable token that maps to real credentials at a boundary it can't reach.
3. Audit trails matter more than prevention. If an agent can execute code, preventing all possible secret access is a losing game. Logging what it accesses and alerting on anomalies is more realistic.
The real threat model isn't 'agent stumbles across .env' — it's 'agent with code execution privileges decides to look.' Those require fundamentally different mitigations.
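Points 2 and 3 can be combined in a small broker-side registry. A sketch under assumptions: `SurrogateVault` and its methods are hypothetical, and it runs in a process the agent cannot reach.

```python
import secrets


class SurrogateVault:
    """Hypothetical broker-side registry mapping revocable tokens to real credentials."""

    def __init__(self):
        self._real = {}
        self.access_log = []  # (token, outcome) pairs for auditing

    def issue(self, real_credential: str) -> str:
        """Hand out a random token; the real credential stays inside the vault."""
        token = "surrogate-" + secrets.token_urlsafe(16)
        self._real[token] = real_credential
        return token

    def resolve(self, token: str) -> str:
        """Called only at the boundary, never by the agent itself."""
        if token not in self._real:
            self.access_log.append((token, "denied"))
            raise PermissionError("unknown or revoked token")
        self.access_log.append((token, "resolved"))
        return self._real[token]

    def revoke(self, token: str) -> None:
        self._real.pop(token, None)


vault = SurrogateVault()
agent_token = vault.issue("real-api-key")
```

If the agent leaks `agent_token`, revoking it costs nothing, and the access log shows every attempt to use it afterwards.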
Instead you need to do what hardsnow is doing: https://news.ycombinator.com/item?id=47133573
Or what the https://github.com/earendil-works/gondolin is doing
¹ https://github.com/hodgesmr/agent-fecfile?tab=readme-ov-file...
A suitably motivated AI will work around any instructions or controls you put in place.
I'm using opencode as a coding agent and I've added a custom plugin that implements an .aiexclude check (gist (https://gist.github.com/yanosh-k/09965770f37b3102c22bdf5c59a...)) before tool calls. No matter how good the checks are, on the 5th or 6th attempt a determined prompt can make the agent read a secret — but that only happens if reading secrets is the explicit goal. When I'm not specifically prompting it to extract secrets, the plugin reliably prevents the agent from reading them during normal coding work.
My threat model isn't a motivated attacker — it's accidental ingestion.
That's also why I think this should be a built-in feature of coding agents — though I understand the hesitation: if it can't guarantee 100% coverage, shipping it as a native safeguard risks giving users a false sense of security, which may be harder to manage than not having it at all.
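The core of such a pre-tool-call check could be sketched as below. This is not the linked plugin: the pattern list, `is_excluded` and `guard_tool_call` are hypothetical, and the matching rules of a real .aiexclude file may differ.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def is_excluded(path: str, patterns: list[str]) -> bool:
    """True if the full path, or just its file name, matches any exclude pattern."""
    posix = PurePosixPath(path)
    return any(fnmatch(str(posix), p) or fnmatch(posix.name, p) for p in patterns)


def guard_tool_call(tool: str, path: str, patterns: list[str]) -> None:
    """Raise before the tool runs if it targets an excluded path."""
    if is_excluded(path, patterns):
        raise PermissionError(f"{tool} blocked: {path} matches an exclude pattern")


patterns = [".env", ".env.*", "*.pem", "secrets/*"]

guard_tool_call("read_file", "src/main.py", patterns)  # passes silently
```

A check like this only sees the path argument of file tools. A shell tool that runs `cat` on the same file bypasses it, which fits the accidental-ingestion threat model described above and not the motivated-attacker one.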
“But the decision does raise the question of how much human input is necessary to qualify the user of an AI system as the “author” of a generated work. While that question was not before the court, the court’s dicta suggests that some amount of human input into a generative AI tool could render the relevant human an author of the resulting output.”
“Thaler did not address how much human authorship is necessary to make a work generated using AI tools copyrightable. The impact of this unaddressed issue is worth underscoring.”
https://www.mofo.com/resources/insights/230829-district-cour...
> this code is not copyright protected, therefore you are not allowed to apply a MIT LICENSE to this project.
Why not? You still can (and probably should) disclaim warranty, and whether the code is copyright protected may vary by jurisdiction. (Not sure if claiming copyright without having it has any legal consequences though.)
enveil is a good defense-in-depth layer for existing .env workflows. But if you can change the habit, removing the file at the source is cleaner.
Disclosure: I'm one of the builders of KeyEnv.
I dislike the gatekeepers so I will follow this implementation and see where it goes. Maybe they like you better.
MY_API_KEY=$(pass my/api/key | head -1) python manage.py runserver... So if the process is expecting a secret on stdin or in a command-line argument, I need to make a wrapper?
Kernel keyring support would be the next step?
PASS=$(keyctl print $(keyctl search @s user enveil_key))