Running OpenClaw requires setting up a cloud VM or local container (a pain) or giving OpenClaw root access to your machine (insecure). Many basic integrations (eg Slack, Google Workspace) require you to create your own OAuth app.
We make running OpenClaw simple by giving each user their own EC2 instance, preconfigured with keys for OpenRouter, AgentMail, and Orthogonal. And we have OAuth apps to make it easy to integrate with Slack and Google Workspace.
We are both HN readers (Bailey has been on here for ~10 years) and we know OpenClaw has serious security concerns. We do a lot to make our users’ instances more secure: we run on a private subnet, automatically update the OpenClaw version our users run, and because you’re on our VM by default the only keys you leak if you get hacked belong to us. Connecting your email is still a risk. The best defense I know of is Opus 4.6 for resilience to prompt injection. If you have a better solution, we’d love to hear it!
We learned a lot about infrastructure management in the past month. Kimi K2.5 and Mimimax M2.5 are extremely good at hallucinating new ways to break openclaw.json and otherwise wreaking havoc on an EC2 instance. The week after our launch we spent 20+ hours fixing broken machines by hand.
We wrote a ton of best practices on using OpenClaw on AWS Linux into our users’ AGENTS.md, got really good at un-bricking EC2 machines over SSM, added a command-and-control server to every instance to facilitate hotfixes and migrations, and set up a Klaus instance to answer FAQs on discord.
In addition to all of this, we built ClawBert, our AI SRE for hotfixing OpenClaw instances automatically: https://www.youtube.com/watch?v=v65F6VBXqKY. Clawbert is a Claude Code instance that runs whenever a health check fails or the user triggers it in the UI. It can read that user’s entries in our database and execute commands on the user’s instance. We expose a log of Clawbert’s runs to the user.
We know that setting up OpenClaw is easy for most HN readers, but I promise it is not for most people. Klaus has a long way to go, but it’s still very rewarding to see people who’ve never used Claude Code get their first taste of AI agents.
We charge $19/m for a t4g.small, $49/m for a t4g.medium, and $200/m for a t4g.xlarge and priority support. You get $15 in tokens and $20 in Orthogonal credits one-time.
We want to know what you are building on OpenClaw so we can make sure we support it. We are already working with companies like Orthogonal and Openrouter that are building things to make agents more useful, and we’re sure there are more tools out there we don’t know about. If you’ve built something agents want, please let us know. Comments welcome!
Two questions as a potential user who knows the gist of OpenClaw but has been afraid to try it: 1. I don't understand how the two consumption credits play into the total cost of ownership. E.g. how long will $20 of Orthogonal credits last me? I have no idea what it will actually cost to use Klaus/OpenClaw for a month. 2. Batteries included sounds great, but what are those batteries? I've never heard of Apollo or Hunter.io so I don't know the value of them being included.
In general, a lot of your copy sounds like it's written for people already deep into OpenClaw. Since you're not targeting those folks, I would steer more towards e.g. articulating use cases that work ootb and a TCO estimate for less technical folks. Good luck, and I'm eager to try it!
I can give you an openclaw instruction that will burn over $20k worth of credits in a matter of hours.
You could also not talk to your claw at all for the entire month, setup no crons / reoccurring activities / webhooks / etc, and get a bill of under $1 for token usage.
My usage of OpenClaw ends up costing on the order of $200/mo in tokens with the claude code max plan (which you're technically not allowed to use with OpenClaw anymore), or over $2000 if I were using API credits I think (which Klause is I believe, based on their FAQ mentioning OpenRouter).
So yeah, what I consider fairly light and normal usage of OpenClaw can quite easily hit $2000/mo, but it's also very possible to hit only $5/mo.
Most of my tokens are eaten up by having it write small pieces of code, and doing a good amount of web browser orchestration. I've had 2 sentence prompts that result in it spinning up subagents to browse and summarize thousands of webpages, which really eats a lot of tokens.
I've also given my OpenClaw access to its own AWS account, and it's capable of spinning up lambdas, ec2 instances, writing to s3, etc, and so it also right now has an AWS bill of around $100/mo (which I only expect to go up).
I haven't given it access to my credit card directly yet, so it hasn't managed to buy gift cards for any of the friendly nigerian princes that email it to chat, but I assume that's only a matter of time.
Giving an agent access to AWS is effectively giving it your credit card.
At the max, I would give it ssh access to a Hetzner VM with its own user, capable of running rootles podman containers.
There's infamously no way to set a max bill amount for an account in AWS, so it indeed has unlimited spending, but I'm okay with a couple hundred bucks a month.
> Hetzner VM with its own user, capable of running rootles podman containers
Why not give it root on the full VM, and not use the VM for anything else? Giving it a user, and presumably also running your own stuff as a different user, sounds like a very weak security boundary to me compared to giving it a dedicated machine.
If you're not doing multi-tenancy, there's no reason to not give it root, and if you are doing multi-tenancy, then your security boundary is worse than mine is, so you can't call me a madman for it.
This is a problem for coding as smarter really has an impact there, but there are so so so many tasks that an 8b model that runs on a $200 gpu can handle nicely. Scrape this page and dump json? Yeah that’s gonna be fine.
This is my conclusion based on a week or so of using ollama + qwen3.5:3b self hosted on a ~10 year old dell optiplex with only the built-in gpu. You don’t need state of the art to do simple tasks.
Only gonna be fine on a trusted page, an 8b model can be prompt injected incredibly trivially compared to larger ones.
Obviously you should also take precautions, like never instructing it to invoke the browser tool on untrusted sites, avoiding feeding it untrusted inputs where possible in other places, giving it dedicated and locked-down credentials where possible....
But yeah, at this point it's inherent to LLMs that we cannot do something like SQL prepared statements where "tainted" strings are isolated. There is no perfect solution, but using the best model we can is at least a good precaution to stack on top of all our other half-measures.
Claude 4.6 is at least a bit resilient to prompt injection, but local models are much worse at that, so using a local model massively increases your chance of getting pwned via a prompt injection, in my estimation.
You're kinda forced to use one of the better proprietary models imo, unless you've constrained your claw usage down to a small trusted subset of inputs.
Orthogonal credits are used more frequently by power users. For everyday tasks they'll last a very long time, I don't think any of our users have run out.
Some example Orthogonal user cases:
* customers in sales uses Apollo to get contact info for leads
* I use Exa search to help me prepare for calls by getting background info on customers and businesses
* I used SearchAPI to help find AirBnbs.
Point taken on the copy! We made this writing more technical for the HackerNews audience and try to use less jargon on other platforms.
Maybe $50 a month is an underestimate because our average user has been live for less than a month.
Do you think it’s worth $500 a month? Also, maybe tough to answer, does it seem like the token usage ($500 a month) would be equivalent if you did the same things using Claude or GPT directly?
My reason for asking is because I tried OpenClaw and a quick one-line test question used 10,000 tokens. I immediately deleted the whole thing.
IMO I don't think the "OpenClaw has root access to your machine" angle is the thing you should worry that much about. You can put your OpenClaw on a VM, behind a firewall and three VPNs but if it's got your Google, AWS, GitHub, etc. credentials you've still got a lot to worry about. And honestly, I think malicious actors are much more interested in those credentials than wiping out your machine.
I'm honestly kind of surprised everyone neglects to think about that aspect and is instead more concerned with "what if it can delete my files."
Are there other tasks that people commonly want to run, that don't require this, that I'm not aware of? If so I'd love to hear about them.
The ClawBert thing makes a lot more sense to me, but implementing this with just a Claude Code instance again seems like a really easy way to get pwned. Without a human in the loop and heavy sandboxing, a agent can just get prompt injected by some user-controlled log or database entry and leak your entire database and whatever else it has access to.
So there isn't really a way to avoid this trade-off you can either have a useless agent with no info and no access. Or a useful agent that then is incredibly risky to use as it might go rogue any moment.
Sure you can slightly choose where on the scale you want to be but any usefulness inherently means it's also risky if you run LLMs async without supervision.
The only absolutely safe way to give access and info to an agent is with manual approvals for anything it does. Which gives you review fatigue in minutes.
A user could leave malicious instructions in their instance, but Clawbert only has access to that user's info in the database, so you only pwned yourself.
A user could leave malicious instructions in someone else's instance and then rely on Clawbert to execute them. But Clawbert seems like a worse attack vector than just getting OpenClaw itself to execute the malicious instructions. OpenClaw already has root access.
Re other use cases that don't rely on personal data: we have users doing research and sending reports from an AgentMail account to the personal account, maintaining sandboxing. Another user set up this diving conditions website, which requires no personal data: https://www.diveprosd.com/
Well the assumption was that you could secure OpenClaw or at least limit the damage it can do. I was also thinking more about the general usecase of a AI SRE, so not necessarily tied to OpenClaw, but for general self hosting. But yeah probably doesn't make much of a different in your case then.
> If you’ve built something agents want, please let us know. Comments welcome!
I'll bite! I've built a self-hosted open source tool that's intended to solve this problem specifically. It allows you to approve an agent purpose rather than specific scopes. An LLM then makes sure that all requests fit that purpose, and only inject the credentials if they're in line with the approved purpose. I (and my early users) have found substantially reduces the likelihood of agent drift or injection attacks.
Basically how do you make sure your "AI SRE" does not deviate from it's task and cause mayhem in the VM, or worse. Exfiltrates secrets, or other nasty things? :)
Complex abilities unlocked calling a FastAPI server with one skill for each endpoint
OpenClaw is interesting because it does a lot of things ok, but it was the first to do so. It will chat with you in Telegram/messages which is small but surprisingly interesting. It handles scheduled tasks. The open source community is huge, clawhub is very useful for out of the box skills. It's self building and self modifying.
There seem to be about 20 options, and new ones every day. Any consensus on the best few are, and their tradeoffs?
https://github.com/skorokithakis/stavrobot
It does indeed only need compose up -d.
I spent the past month hacking on openclaw to play nice in a docker container for my own VPS use.
This project has a lot of useful debugging tools for running multiple claws on a single VPS:
https://github.com/simple10/openclaw-stack
For average users, Klaus is a much better fit.
OpenClaw is capable of using ElevenLabs or other providers to make phone calls, but I personally haven't done this and as far as I know none of our customers have either. Is AI good enough at cold calling yet for this to work? I personally would never entertain such a call.
What is more important is making them do actual useful things that are net positive and right now the use-case are pretty limited.
1. There are many interactions I just could not get to work. I may have done something wrong, but in general, I have the perspective that most products should "just work" if it's as simple as clicking a button or directing something. In this case, I'm tangibly talking about the Browser feature, and the Canvas feature. In my account, I tried many times to have OpenClaw use the Browser to access a website and send me a screenshot, and it regularly reported the Browser was inaccessible, even though I had enabled it via Klaus UI. Secondly, I asked it to write certain reports to the Canvas as HTML pages that I could review - the entries would show up as files I could click on, but the files themselves were always empty. 2. OpenClaw with tokens is insanely expensive - I blew through the $15 tokens in a matter of a day.
For the first, my guess is I misconfigured something, but it's really difficult to identify what is wrong. My expectation was that I could prompt via Telegram to configure anything and everything, but some link was missing. Although I am a technical person, my expectation was that I would not need to muck around via `ssh` to figure out where my files ended up.
For the latter - and more broadly - OpenClaw is not well understood for most, and I think they will be caught off guard just how expensive it is. $15 in tokens is not a lot with how inefficient OpenClaw can be. My suggestion would be:
1. Pre-configure OpenClaw with already extremely memory-efficient rules and skills. 2. Provide clear guidance/documentation on ideal agent setup with different models as necessary. I think OpenRouter attempts to achieve this pretty well, but you are providing a layer on top of OpenRouter that may not be obvious to less-well-versed people. 3. Batteries-included options should "just work" - I felt I wasted a lot of tokens just figuring out how to get the thing to do simple tasks for me.
---
A lot of the notes I made are less about your product and what you've achieved, and more to do with OpenClaw. However, you've achieved one major milestone - which is the one-click setup of OpenClaw. But if your target demographic is the less technically inclined folks that want to be able to play with the bleeding edge of AI practices, I think your platform needs to guide users to how to actually use this thing, and become useful right away.
It may even be beneficial to showcase extremely clear workflows for users to get started and sell why they even want OpenClaw.
---
Anyway, kudos on the release! It is not easy to ship and you've done that hard bit! I bid you good luck on the next phase!
One of the fundamental problems is OpenClaw is tech for nerds. It's hard to use, it breaks all the time, it's built on LLMs, etc. We'd like to be the one to bridge the gap but that will take a ton of work. It's something we spend all day thinking about. Some issues like the one you hit with canvas are likely some mix of our problem and the model doing something unexpected like putting the file in the wrong directory which is constantly a problem.
Also agree on the cost being a huge issue. We give $15 up front and it just disappears so quickly for many users. Some users switch to smaller models but often this just ends up with people being more unhappy because the performance is bad. Opus is the least likely to make mistakes but also the most expensive.
Thanks for the advice, it's great to hear you believe in it too! At a personal level, it means a ton to me. Just got to keep writing code.
oh fuck yea, sounds great.
Hard pass on this (and OpenClaw) thanks.
Even in a locked-down VM the agent can still send emails, spin up infra, hit APIs, burn tokens.
A pattern we've been experimenting with is putting an authorization boundary between the runtime and the tools it calls. The runtime proposes an action, a policy evaluates it, and the action only runs if authorization verifies.
Curious if others building agent runtimes are exploring similar patterns.
mind if I write an article about this on ijustvibecodedthis.com ?