TREX: An AI code reviewer that runs your code
32 points
4 hours ago
| 3 comments
| greptile.com
| HN
ygouzerh
47 minutes ago
How does this work when a project has many external dependencies, like an S3 bucket, a secret manager, a third-party API, etc.?
hartator
28 minutes ago
In my experience, you are still left with these annoying parts (i.e., figuring out how to give your agents appropriate access).
muzzkhan
20 minutes ago
We're working on a way for you to expose creds safely to our sandbox. But for now, it's limited to mocked API calls, clicking around the UI, and unit tests.
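
For readers wondering what "mocked API calls" can look like for the dependencies raised upthread, here is a minimal Python sketch. It is not how TREX does it. The function `archive_report`, the URL, and the bucket and key names are all hypothetical, and the mock only stands in for a boto3-style S3 client.

```python
from unittest.mock import MagicMock


def archive_report(s3_client, http_get, bucket, key):
    """Fetch a report from a third-party API and store it in a bucket.

    Hypothetical application code: both dependencies are passed in,
    so a test can replace them with mocks.
    """
    response = http_get("https://api.example.com/report")  # hypothetical URL
    s3_client.put_object(Bucket=bucket, Key=key, Body=response["body"])
    return len(response["body"])


def run_with_mocks():
    """Exercise archive_report without touching S3 or the network."""
    s3 = MagicMock()  # stands in for a boto3-style S3 client
    http_get = MagicMock(return_value={"body": b"quarterly numbers"})

    size = archive_report(s3, http_get, "reports", "q1.txt")

    # Verify the code under test talked to its dependencies as expected.
    http_get.assert_called_once_with("https://api.example.com/report")
    s3.put_object.assert_called_once_with(
        Bucket="reports", Key="q1.txt", Body=b"quarterly numbers"
    )
    return size
```

The trade-off is the one the thread points at: mocks verify that the calls are shaped correctly, but they say nothing about real permissions or credentials.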
deet
1 hour ago
We've found that methods like this substantially increase the quality and reliability of coding agent output. The ability to run code in a sandbox, drive an interactive session through a browser, API calls, or other apps, and visually confirm output via vision models all adds up to plugging a big hole in the feedback loop for agents modifying a complex codebase.

We've had agents go as far as interactively testing how our product responds in video calls by launching our full stack in a set of Docker containers (app, API, DB, queues, etc.), all inside a larger sandbox, populating test data, connecting the mock system to a real video call solution like Google Meet, and injecting audio and video to test the response. End-to-end, like a real user flow.

It's not perfect yet, but if you are skeptical of the ability of AI agents to productively modify a complex product, I'd highly encourage you to play with a setup like this before ossifying your conclusions.
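
As a rough illustration of the "run it and read the result back" loop described above, here is a minimal Python sketch. It is an assumption-laden toy, not the setup from this comment: `run_candidate` is a hypothetical helper, and a plain subprocess in a temporary directory is not a real sandbox (no container, no network or filesystem isolation).

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_candidate(source: str, timeout_s: float = 5.0) -> dict:
    """Run candidate code in a separate process and return the outcome.

    The returned dict is the kind of signal an agent could read back
    before deciding on its next edit. This is NOT a security boundary.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "candidate.py"
        script.write_text(source)
        try:
            proc = subprocess.run(
                [sys.executable, str(script)],
                cwd=workdir,
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            # Report a hang as a failed run instead of raising.
            return {"ok": False, "stdout": "", "stderr": "timed out"}
    return {
        "ok": proc.returncode == 0,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }
```

For example, `run_candidate("print(1 + 1)")` reports a successful run with the printed output captured, while a script that raises returns `ok` as false with the traceback in `stderr`. A production setup would swap the subprocess for containers or a proper sandbox, as the comment describes.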

Elzair
2 hours ago
I wonder how long it will take for someone to pwn this?