I’m interested in checking out Vibium - I’ve been a reluctant adopter of Playwright and hopeful for a new approach.
not yet. definitely on the roadmap, though. goal is to embrace what playwright has done well, then extend what's possible...
I’m a big fan of testing exactly like a user. Users don’t use network intercepts, timeouts, etc. All of my most reliable tests assert on DOM state. If the user doesn’t see it, don’t assert on it.
It's been an interesting journey. I do think Playwright is the de facto standard now, but Selenium was the original browser driver.
Anyway, how does Vibium compare to Playwright? Playwright's main advantage is it has official support for multiple languages.
i'll politely push back a little. i think it's safe (at this moment in time) to say: playwright wins the first derivative, but selenium wins the "area under the curve". selenium is very entrenched in many parts of the world, especially outside of SF/USA. part of the inbound interest i've been getting for vibium is from those selenium users who want some kind of bridge to the future, but didn't have an obvious path forward beyond "dump selenium, adopt playwright"...
part of my plan with vibium post-v1 is to give that massive (and it truly is massive, i'm not bragging) installed base of selenium users an upgrade path to more agentic coding options.
Playwright really simplifies getting set up. It won't work for everyone, but within 30 seconds Playwright will download its needed browsers along with a test runner.
I also find the documentation is much better/consolidated.
Definitely open to helping you out if I can be of assistance.
right now, code-wise -- for the code you see in github at the moment -- it's just me and my ai pal, claude. but there's a growing cast of (human!) characters also helping with all the other things we need to do to run a successful project. patches and tokens welcome!
Generally if you have a lot of legacy selenium scripts it's probably not worth it to switch everything over, but if you're creating a new UI automation framework I've just never seen selenium as a first choice for that.
Don't get me wrong, it's still solid technology though.
legacy selenium suites are a strong contender for vibium adoption. i think hugs has been surveying a ton of folks, he may have a better bird's eye view of the potential user base.
as for academic use of selenium, we have boni garcia - maker/popularizer of selenium webdriver manager teaching at a uni in spain. (maybe an isolated example, but he's rather well known in the community)
A custom sh script or something for whitelists would take ~5 min to set up.
For more robust governance (many policies), you can write Rego using https://github.com/eqtylab/cupcake
i did post a v2 roadmap on the github repo. might be time to start the draft for v3!
The solution I landed on recently was to locally modify the Chrome devtools MCP to launch the browser instance with strict network restrictions. I believe the implementation used `--host-resolver-rules`, blocking all URLs by default with an environment variable to control the allowlist (which, in hindsight, Claude can easily work around if it needs to -- I should probably just hard-code the allowlist).
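A minimal sketch of the hard-coded variant described above, assuming Chromium's `MAP * ~NOTFOUND` / `EXCLUDE host` rule syntax for `--host-resolver-rules`. The host names and the helper `hostResolverFlag` are hypothetical, and this is not the actual patch to the Chrome devtools MCP:

```typescript
// Hard-coded allowlist, so it cannot be changed through an environment variable.
// These hosts are placeholders chosen for illustration.
const ALLOWED_HOSTS: readonly string[] = ["localhost", "example.com"];

// Build a --host-resolver-rules flag that fails name resolution for every
// host except the ones explicitly excluded from the catch-all rule.
function hostResolverFlag(allowed: readonly string[]): string {
  const rules = ["MAP * ~NOTFOUND", ...allowed.map((host) => `EXCLUDE ${host}`)];
  return `--host-resolver-rules=${rules.join(", ")}`;
}

// Extra arguments to hand to whatever launches the browser process.
const chromeArgs: string[] = [hostResolverFlag(ALLOWED_HOSTS)];
console.log(chromeArgs.join(" "));
```

Because the rules act at the name-resolution level, this is a guard rail rather than a full network sandbox.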
This is Anthropic's recommended setup for devcontainers:
https://github.com/anthropics/claude-code/blob/main/.devcont...
You may want to adapt it, particularly to remove the GitHub and VS Code stuff.
There's a browser_find method, but that assumes you already know what type of element it is. But I can't always tell what type of element something is just by looking at a screenshot.
What have I missed or misunderstood?
I’ve added a browser_evaluate tool in my fork—though I haven’t committed or pushed a PR yet. With that, the agent can call JavaScript to get the accessibility tree and then use that to navigate via browser_find.
This and much more will be coming soon. See the V2 roadmap for more insight: https://github.com/VibiumDev/vibium/blob/main/V2-ROADMAP.md
What was the reason you went down this path instead of extending selenium with AI features?
but why a new thing vs extending selenium? it's a little complicated, but neither selenium nor playwright were designed with ai in mind from day 1. with vibium, i'm optimizing for "vibe coding" and ai-driven workflows first.
of course, i have a new host of problems by going all in with "vibium"... i'm making a huge bet that "vibe coding" is a trend, not a fad. (it could still be a fad! we'll see if this post ages well soon enough!)
We were looking at whether a model could look at the screenshot of the failure, some of the original website source code, and try to fix the failing test.
My question is with vibium, would it make more sense to port the legacy tests over to vibium, and if a test fails, use its capabilities to try to self-heal?
i want to build an island resort and a bridge from the mainland to get there. do i build the island resort first or the bridge first?
here's my thinking: if the resort is popular and a fun place to be, there will be a huge incentive to build the bridge next. but we might also find out that building the bridge will ultimately be economically impractical and we should just stick to using ferry boats. at least we'll have a cool island resort to go to, though!
so for now, i'm just focusing on building the island resort. but i really, really want to build that bridge, too, asap.
Any plans of exposing more of the browser? For instance playwright is able to store tracing files the agent may decide to read to understand some requests / payloads…
Any plans on allowing the agent to run an arbitrary js script?
also need to clarify: there are two apis exposed right now: the mcp server and a "plain old" js/ts api. the js/ts api does have the ability to run arbitrary js. theoretically, you could ask an agent to write a vibium script with the js/ts library, and have the ai run that... (which ironically? is also a way to deal with the issue of context bloat)
So glad to see you are still in this space!
to save a click, i'll post it here, too:
-----------
why vibium?
there are dozens of "ai-powered browser" tools now. so why this one?
the selenium ecosystem is massive: millions of tests, thousands of companies, decades of investment. but there's no obvious bridge to the ai future. many have moved to playwright — and for good reason: it's fast, easy to use, has popular features like auto-waiting, integrated video recording, and a ton of other batteries included.
vibium takes the same approach. batteries included. great dx. but built for where the industry is going: ai agents that need to drive browsers.
when i did those interviews in september, the response wasn't just "cool idea." it was relief. the community trusts us to build this bridge because we built the last two: selenium in 2004, appium in 2012.
community and ecosystem are the moat.
AFAIK Playwright also takes the approach of batteries included, great dx, and has a lot of good integration with AI agents.
Basically, what sets Vibium apart?
i was being a little too cute using "embrace" and "extend" in a previous comment (look up "embrace, extend, extinguish"). sorry about that.
the big idea with vibium in v2 and beyond is to bring to test automation something old and boring in robotics: the "sense - think - act" loop. sensors observe the world, a brain makes decisions, and actuators carry them out.
right now most browser tools extend what's possible at the "act" layer. they make it easier for an llm to click, type, and observe the browser.
that's useful, but it mostly enables one-off demos. every run starts from scratch. there's no accumulated understanding of the app, and long workflows are navigated by guessing and retries.
what vibium is trying to extend is not just action, but the loop.
vibium v1 is just the "act" part, which i'm calling clicker. it clicks buttons, types, and navigates the browser.
retina and cortex are coming in v2. retina turns real interaction into durable signal (manual exploration, existing tests, production usage). cortex builds on that signal to create a navigable model of workflows that an llm can plan through, instead of reasoning from raw html each time.
clicker is the execution layer. playwright mcp largely lives here. vibium clicker overlaps in scope, but is designed from the start to feed sensing and planning rather than being the whole system.
so yes, playwright mcp covers part of this. what's missing today is first-class sense and think. that's the gap vibium is exploring, even if v1 only ships the act layer.
tl;dr:
sense -> retina (v2)
think -> cortex (v2)
act -> clicker (v1)
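The loop above can be sketched generically. Everything in this block is hypothetical: the `Sensor`, `Planner`, and `Actuator` interfaces and the toy in-memory "app" are illustration only and are not Vibium's retina, cortex, or clicker APIs.

```typescript
// Hypothetical types for illustration; none of these names come from Vibium.
interface Observation { url: string; visibleText: string[]; }
type Action = { kind: "click"; target: string } | { kind: "done" };

interface Sensor { observe(): Observation; }                                     // sense
interface Planner { decide(obs: Observation, history: Observation[]): Action; } // think
interface Actuator { perform(action: Action): void; }                           // act

// Run sense -> think -> act until the planner says "done" or maxSteps is hit.
function runLoop(sensor: Sensor, planner: Planner, actuator: Actuator, maxSteps = 10): Action[] {
  const history: Observation[] = [];
  const taken: Action[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const obs = sensor.observe();
    const action = planner.decide(obs, history); // planner sees earlier observations
    history.push(obs);
    taken.push(action);
    if (action.kind === "done") break;
    actuator.perform(action);
  }
  return taken;
}

// Toy in-memory "app" standing in for a real browser.
let page = "home";
const sensor: Sensor = {
  observe: () => ({ url: `/${page}`, visibleText: page === "home" ? ["Checkout"] : ["Thanks!"] }),
};
const planner: Planner = {
  decide: (obs) =>
    obs.visibleText.includes("Checkout") ? { kind: "click", target: "Checkout" } : { kind: "done" },
};
const actuator: Actuator = {
  perform: (action) => {
    if (action.kind === "click" && action.target === "Checkout") page = "done";
  },
};

const actions = runLoop(sensor, planner, actuator);
console.log(actions.map((a) => a.kind).join(" -> "));
```

The point of keeping the three roles as separate interfaces is that the planner receives accumulated history rather than starting from a blank observation each step.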
i've spent the past few months talking about applying the "sense - think - act" loop to browser automation, but at some point i realized i needed to "talk less, ship more". :-) i'm looking forward to shipping retina and cortex so we can see whether the full loop is actually a step change beyond what playwright or playwright+mcp can do.
happy to dig deeper if helpful.
https://github.com/VibiumDev/vibium/blob/main/V2-ROADMAP.md
What do the next 5 years look like, given that you are very good at building long-term projects that last and evolve through time? And for a very specific example, what’s the plan for incorporating new standards like Agent Skills as they quickly evolve and launch?
as far as long term plans go, i like the tim o'reilly quote: "create more value than you capture".
with selenium, we created an entire ecosystem of tools, users, companies, and economic activity. (literally billions of usd -- it's a story frequently ignored by the tech press when looking for "open source success stories".) i hope to do the same with vibium. there will likely be a hosted "vibium.cloud" service. i also hope there will be lots of them. in a similar way, there weren't many "hosted selenium" services when i started sauce labs. now there's a bunch. browserstack, lambdatest, etc.
it was also not really an accident we did that with selenium. there is a lot of behind-the-scenes consensus building that happens to make things like a w3c webdriver standard happen. (fun fact: vibium relies on the new! w3c standard "webdriver bidi" protocol, heavily inspired by the chrome devtools protocol used by playwright. tl;dr: it's just json over websockets.)
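To make "json over websockets" concrete, here is a small sketch of what a single WebDriver BiDi command looks like before it is sent. The `browsingContext.navigate` method and its `context`, `url`, and `wait` parameters come from the W3C spec; the helper `navigateCommand` and the context id `ctx-1` are hypothetical, and this is not how Vibium itself builds messages.

```typescript
// Shape of a WebDriver BiDi command: a numeric id, a method name, and params.
interface BidiCommand {
  id: number;
  method: string;
  params: Record<string, unknown>;
}

let nextId = 1;

// Build a browsingContext.navigate command for a given browsing context.
function navigateCommand(context: string, url: string): BidiCommand {
  return {
    id: nextId++, // the browser echoes this id back in its response
    method: "browsingContext.navigate",
    params: { context, url, wait: "complete" },
  };
}

// "ctx-1" is a placeholder; real context ids are reported by the browser.
const wire = JSON.stringify(navigateCommand("ctx-1", "https://example.com"));
console.log(wire);
```

The resulting string is what would be written to the websocket; the response is another JSON object carrying the same `id`.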
i'm betting on industry cooperation, standards, and shared prosperity. that's my 5 year plan!
https://github.com/VibiumDev/vibium/commits/main/?after=ffc3...
1) test automation (my specialty)
2) data scraping / crawling
3) business/robotic process automation (e.g. back-office data entry, processing invoices, etc.)
when it comes to handling login sessions, cookies, etc. test automation is the easiest. (you create disposable test logins and use them in each test. it's mostly a solved problem.)
handling logins is a way gnarlier problem in data scraping and business process automation. i'm focused on test automation in v1. (i'm hoping experts in data scraping and process automation can help me improve vibium in this regard.)
no reason other than my #1 goal was "ship something". i only started the actual coding on dec 11. it's been a bit of a sprint the last two weeks!
though "image-based" vs "dom-based" testing approaches is a very big topic! (look forward to researching that more in the future.)
v1 announcement: https://github.com/VibiumDev/vibium/blob/main/docs/updates/2...
Thanks, from a very tiny human.
i try to say this often, but it never feels like enough: yes, i started the project, but it's a relay race. i ran the first few laps, but the project has been going for 21 years now. there's dozens (hundreds?) of people to thank at this point for the success and impact that the selenium project has achieved.
- MCP option (where tokens will eventually get burned) Getting Started with Vibium MCP: https://github.com/VibiumDev/vibium/blob/main/docs/tutorials...
v1 is about getting to a baseline of functionality.
things get interesting in v2: https://github.com/VibiumDev/vibium/blob/main/V2-ROADMAP.md
we also have a new discord server for the project that we just spun up and will be opening up more widely soon. discord could be a good place to share use cases and experiments until we set up a more formal website structure.
"vibium": {
  "command": "npx",
  "args": [
    "-y",
    "@vibium/mcp@latest"
  ]
}

"vibium": {
  "command": "npx",
  "args": [
    "-y",
    "vibium"
  ]
}
source: https://www.linkedin.com/posts/apzal-bahin_ai-mcp-browseraut...