- Write a .claude/commands/review.md. Simple but deprecated.
- Use a /code-review skill, either one you install or one you just write yourself (it's just Markdown, after all).
- Use the /pr-review subagent. Also just Markdown, but it runs "in the background" and "in parallel", so it must be better, I guess.
- Install the /code-review plugin. This just installs the skills and subagents above.
- Simply ask Claude to review the code. Probably works almost as well as the above in most situations.
They are all just variations of "insert a canned prompt", varying only along the dimensions of (a) how and where the prompt is installed and from where it is sourced, and (b) which context or contexts the prompt runs in. There's not much advice here about which option is best, and no clear best practices seem to have emerged yet either. Personally, I find just asking Claude to review the code works well enough.
Some of the advice here is also off. For example:
"Install a language server plugin. Type errors and unused imports caught after every edit. Highest-impact plugin you can install."
I work mostly with Rust, Python, and Dart, and followed similar advice, installing LSPs for all three in both Claude Code and Codex. Two months later, after heavy development in all three languages and hundreds of sessions - and frequently running out of RAM due to all the Rust analyzer, Dart analysis server, and Ty LSP servers the harnesses were spinning up - I checked the session logs to see how often the agents were actually invoking the LSP tools. The answer was they had invoked them literally once the entire time. I uninstalled all my LSPs and haven't looked back. The agents do just fine using ripgrep and calling cargo clippy, dart analyze, ty check, etc. themselves.
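For anyone who wants to repeat that check, here is a rough sketch. It assumes the harness writes its session logs as JSONL files and that each tool call shows up somewhere in a record as an object with `"type": "tool_use"` and a `"name"` field. That layout and the function names are assumptions, so adjust them to whatever your harness actually writes.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Any, Iterable, Iterator


def iter_tool_names(node: Any) -> Iterator[str]:
    """Yield the name of every tool-call object nested anywhere inside node."""
    if isinstance(node, dict):
        # Assumed record shape: {"type": "tool_use", "name": "<tool>", ...}
        if node.get("type") == "tool_use" and isinstance(node.get("name"), str):
            yield node["name"]
        for value in node.values():
            yield from iter_tool_names(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_tool_names(item)


def count_tool_calls(lines: Iterable[str]) -> Counter:
    """Count tool calls per tool name across JSONL lines."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or corrupt lines
        counts.update(iter_tool_names(record))
    return counts


def count_tool_calls_in_dir(log_dir: Path) -> Counter:
    """Aggregate counts over every *.jsonl file below log_dir."""
    total: Counter = Counter()
    for path in log_dir.rglob("*.jsonl"):
        with path.open(encoding="utf-8", errors="replace") as handle:
            total.update(count_tool_calls(handle))
    return total
```

Point `count_tool_calls_in_dir` at the directory where your harness keeps its transcripts and print `most_common()` on the result; if the LSP tool names barely register, the servers are probably not earning their RAM.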
When I need code review I should just say “review it”. Model should figure out what plugins, skills, etc. to use.
(/s - Blargh, writing like that by hand is exhausting)
What agentic platform would you recommend for those with API access (including other models)?
Much cheaper too.
Issue is that CC forced corps over 150 people into API pricing, which is, well, suboptimal compared to what we get. I think it will push those towards hiring more juniors (finally).
...H1Bs.
If you can't one-shot your problem with the free LLM that Google gives you, you're jerking yourself off.
Just write it the old fashioned way.
Ignore all the 3rd party frameworks (at least for now, probably forever.)
Makes one wonder...
While I'm also a huge fan of local LLMs and believe they will be key in the future, I think the claim of "just as good" is hyperbole. They're productively useful tools though, and something worth exploring.
But GLM is SOTA level for code, so it's obviously going to beat all local small models by a lot.
Download opencode GUI or cli. Sign up for Go or Zen plan, choose GLM-5.1 model.
- corporal threats of harm directly against Claude
- threats of prison for the entire board of directors of Anthropic
- explanation how every time it goes off the rails / makes mistakes, it gives more evidence to a class action lawsuit against Anthropic
Especially the latter two seem to have improved its "behaviour" to be more "careful" and "deliberate"
I'm hoping that when the robot apocalypse happens, they'll let me stay in the breeding harem, or worst case let me live a few extra minutes.
Apocalyptic safety is just a bonus.
Personally I'd like to see AI ceos (legally) exterminated for their traitorous crimes against American society and culture.
Still... I'm not ready to give it more autonomy. Even as it gets high-level things quite well, I still look at the code, give feedback, and have 3-4 rounds of tweaks until I'm happy with it, and also happy that I still feel I have a good handle on the codebase.
In Claude I use /branch and /rename a lot (context checkpoints, fork, go back)
I use sandboxing almost exclusively: https://github.com/nix-tools/bubblebox -- it's a generalisation of Numtide's claudebox with a few fixes and some feature additions (more coming). This is best compared to always running your Claude in Docker containers, except there's no Docker runtime. Works fine in WSL and nix-darwin, too.
Codex is way better at nix than I am.
On my own machine I just give it a Linux User Namespace, i.e. soft virtualisation via "bubblewrap."
What Docker Compose and Linux User Namespaces provide that a VPS doesn't: You can easily mount extra directories from your developer host machine in read or read+write mode. With the VPS you (most likely) need it to clone all of your resources separately, which requires SSH keys, and now you're slowly building towards an independent agentic environment, which is definitely very nice, but time-consuming, compared to piggybacking on your developer environment. Definitely the direction I'm going.
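The host-mount idea can be sketched as a small helper that assembles a `bwrap` command line. This is only an illustration under stated assumptions: it assumes a conventional merged-`/usr` distro, the helper name `bwrap_argv` and all paths are hypothetical, and it is not how bubblebox or claudebox build their sandboxes.

```python
from __future__ import annotations

import shlex
from typing import Sequence


def bwrap_argv(
    command: Sequence[str],
    project_dir: str,
    read_only: Sequence[str] = (),
    read_write: Sequence[str] = (),
    network: bool = True,
) -> list[str]:
    """Build a bubblewrap invocation that exposes only selected host paths."""
    argv = [
        "bwrap",
        "--unshare-all",        # new user, pid, ipc, uts, cgroup and net namespaces
        "--die-with-parent",    # kill the sandbox when the launching shell exits
        # Assumption: merged-/usr layout. On NixOS you would bind /nix instead.
        "--ro-bind", "/usr", "/usr",
        "--ro-bind", "/etc", "/etc",
        "--symlink", "usr/bin", "/bin",
        "--symlink", "usr/lib", "/lib",
        "--symlink", "usr/lib64", "/lib64",
        "--proc", "/proc",
        "--dev", "/dev",
        "--tmpfs", "/tmp",
    ]
    if network:
        argv.append("--share-net")  # re-enable networking after --unshare-all
    for path in read_only:
        argv += ["--ro-bind", path, path]   # visible but not writable
    for path in (project_dir, *read_write):
        argv += ["--bind", path, path]      # writable, changes land on the host
    argv += ["--chdir", project_dir, *command]
    return argv


def as_shell(argv: Sequence[str]) -> str:
    """Render the argv as a copy-pasteable shell command."""
    return shlex.join(argv)
```

For example, `as_shell(bwrap_argv(["claude"], "/home/me/proj", read_only=["/home/me/docs"]))` gives a command where the agent can write to the project, read the docs directory, and see nothing else under `/home`.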
I have a project that's mostly Rust sprinkled with C++ libs and Python helpers and it's easier to manage than the average virtualenv. Everything builds with nix build, everything runs with nix run, profiler/debugger works, IDE detects everything on any of my computers, builds and links with CUDA on x86, aarch64, NixOS, MacOS, Ubuntu or Amazon Linux. nix build can even build a Docker image for the odd need of Docker, and I haven't tried but I'm convinced that if I import the flake on my nix-config it will be built into the SD card for my Raspberry Pi just fine.
It's even replaced Ansible for me, colmena all the way.
Maybe you have some premade tooling that helps provide persistency between container invocations.
But by default, closing your agent container and opening it again just wipes everything you didn't host-mount.
What I'm advocating is really just the same functionality without the Docker runtime, because Linux has namespaces.
Feels more like you're on your host system with exactly the minor variations you specify.
Making Docker feel like your host system is possible, but I just never felt at home.
Though I also use nix to manage my machines :-D
How does fnox compare to sops?
How does hk compare to lefthook?
And do hk and fnox have Nix integrations similar to lefthook-nix and sops-nix?
I'm still hoping I don't need to make a better lefthook.
I kind of like sops-nix, not sure what's missing, really. Maybe fnox is similarly wholesome for non-Nix users.
I see that hk has a flake, so that's a good sign.
E.g., sure, you could go back to the old ways of using a drafting table for your engineering work if CAD went down, but it would be exponentially slower…
Personally with my workflow I spend 30-60 minutes per Claude feature spec doc when I’m pair planning. If Claude goes down I would just prepare spec docs on my own until it came back online and then rapidly review them before calling the coding workflow.
Precisely. Every online-only solution is a huge risk I personally do not want to take; I've always done my best to use offline-only tools.
That may restrict me from the latest and greatest, but I prefer not to be left at the mercy of any corpo.
But this is the reason "serious shops" do not use always online software and tools in critical parts of the SDLC. There is a difference between influencers/people on socials promoting things vs. reality where the expectation is that things don't just stop working because there is an internet outage or some 3rd party disruption
Do farmers still plough fields with a horse just in case their tractor runs out of diesel? Of course not. As technology moves on, we all have to accept the inherent risks in exchange for the huge benefits; otherwise the work you do will be too slow and your job will be taken by someone willing to leverage the tools available today.
Uh, people do say this thing. It is a basic factor and question asked during technology procurement. Uptime and fail states matter.
AI just seems exempt from all the questions people usually ask about relying on other people's software.
For someone to do this, they would have to think for themselves, which I've also not seen much of in the vibe-coding space.
I have seen, many many times in microcontroller forums, posts from first-timers along the lines of "hello sirs i have problem please show how to do this", followed by their own reply a few hours later asking again because they were still stuck. "This" was usually something really trivial, where you just needed to read the docs, and the right answer was "did you really not try anything in 6 hours?"
If hand coding pays better there will be plenty who can still do that.
Naturally you can also have an LLM one-shot a 14000 line PHP monstrosity - it's up to you still, LLM or not.
The main problem is that it'll probably be a waste of time to code anything yourself if Claude is back online in 8 hrs. It's like walking to the next bus stop when you missed your bus - it won't make you get home any sooner.
8 hrs will probably be better spent reading specs or checking things with stakeholders so the next features you let Claude implement are the ones the business actually wants.
(sorry couldn't resist)
The point is vendor lock-in. The vibe coding community has reinvented vendor lock-in and is bound to repeat every mistake associated with it.
Pre-AI, if my IDE was down for whatever reason, I wouldn't switch IDEs; I would do something else.
You don't need to put in any effort, just get Claude (Codex CLI if Claude is down) to generate the multi-harness config for you.
You sound like you might be a beginner so let me help you out with some advice -- You can get your multi-harness configurations completely identical by simply telling Claude to research the Codex spec and eliminate all feature drift between your configs. Hope this helps.
It's literally one prompt away. If you're worried about errors, just threaten to sue Anthropic in the prompt and the quality of the answer instantly improves by 75%.
2. How often do you think that happens, compared to Claude?
> What happens when you have a codebase made with claude using this setup and claude is down for let's say 8 hours?
So:
- A codebase made with Claude
- Using this [Claude] setup
- Claude is down
How can you come up with such nonsense?
The worst thing about LLMs is they can pass the Turing test, leading people to believe they have an Asimov style robot instead of a very cool statistical model. It feels like they should be able to follow instructions or keep instructions from content separate, but that’s not what’s happening.
```
# Development Workflow

*Always use `bun`, not `npm`.*

# 1. Make changes

# 2. Typecheck (fast)
bun run typecheck

# 3. Run tests
bun run test -- -t "test name"   # Single suite
bun run test:file -- "glob"      # Specific files

# 4. Lint before committing
bun run lint:file -- "file1.ts"
bun run lint

# 5. Before creating PR
bun run lint:claude && bun run test
```
I have these things in pre-commit; this way the targets are always run and the agent is forced to fix them (I ask claude to commit changes). The agents are erratic and very often skip these steps. Anything that can be deterministic I keep as scripts.
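A minimal sketch of what such a hook could look like, written in Python so it can live at `.git/hooks/pre-commit`. The script names come from the workflow quoted above; the `CHECKS` list and the `run_checks` helper are hypothetical, and a hook manager such as lefthook or hk would do the same job declaratively.

```python
import shlex
import subprocess
import sys
from typing import Callable, Sequence

# Assumed package.json scripts, taken from the quoted workflow.
CHECKS = [
    ["bun", "run", "typecheck"],
    ["bun", "run", "lint"],
    ["bun", "run", "test"],
]


def run_checks(
    checks: Sequence[Sequence[str]],
    runner: Callable[[Sequence[str]], int] = subprocess.call,
) -> int:
    """Run each check in order and stop at the first failure.

    Returns 0 when everything passes, otherwise the failing exit status.
    Git aborts the commit when the hook exits with a non-zero status.
    """
    for command in checks:
        print("pre-commit:", shlex.join(command), flush=True)
        status = runner(command)
        if status != 0:
            print(f"pre-commit: {shlex.join(command)} failed", file=sys.stderr)
            return status
    return 0


# In the hook file itself, finish with:
#     sys.exit(run_checks(CHECKS))
```

Because the hook stops at the first failing command, the agent sees one concrete error at a time and cannot produce a commit until it is fixed, unless it bypasses hooks with `git commit --no-verify`.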
Regarding commits; both codex and claude are terrible at writing them. I have in my user CLAUDE.md:
````
Pattern: `type(scope): message` where type is `fix`, `feat`, `chore`, `docs`, `refactor`, or `style`; scope marks what is affected; message is a short lowercased description.

Keep subject and body lines under 72 characters. Always write a body explaining what, how, and why in continuous human-readable text. For fixes include the error message being fixed. No first-person speech. Re-read the actual git diff before writing: the message must describe what changed, not what was planned.

Use the following command to create the commit:

```bash
git commit -F - <<'EOF'
type(scope): subject line

Body paragraph explaining what, how, and why.
EOF
```
````
Without it, the agent would write the body as a single long sentence; when asked to wrap the lines, it would just insert literal \n sequences, which were not interpreted as newlines and were instead rendered as characters.
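Since agents tend to skip instructions, the same rules can also be checked deterministically, for example from a `commit-msg` hook. The following is a rough sketch, not the commenter's setup: `check_commit_message` is a hypothetical helper, "lowercased" is interpreted loosely as "does not start with a capital letter", and "under 72 characters" is read as strictly fewer than 72.

```python
import re

ALLOWED_TYPES = ("fix", "feat", "chore", "docs", "refactor", "style")
SUBJECT_RE = re.compile(
    r"^(?P<type>[a-z]+)\((?P<scope>[^()\s]+)\): (?P<message>\S.*)$"
)
MAX_LINE = 72  # lines must be shorter than this


def check_commit_message(text: str) -> list:
    """Return a list of problems; an empty list means the message passes."""
    lines = text.splitlines()
    if not lines:
        return ["empty commit message"]
    problems = []

    match = SUBJECT_RE.match(lines[0])
    if match is None:
        problems.append("subject does not match type(scope): message")
    else:
        if match["type"] not in ALLOWED_TYPES:
            allowed = ", ".join(ALLOWED_TYPES)
            problems.append(f"unknown type {match['type']!r}; expected one of {allowed}")
        if match["message"][0].isupper():
            problems.append("message should start with a lowercase letter")

    # The failure mode described above: a backslash followed by 'n'
    # written into the message instead of an actual line break.
    if "\\n" in text:
        problems.append("message contains a literal \\n instead of a real newline")

    if len(lines) > 1 and lines[1].strip():
        problems.append("second line must be blank")
    if not any(line.strip() for line in lines[1:]):
        problems.append("missing body explaining what, how, and why")

    for number, line in enumerate(lines, start=1):
        if len(line) >= MAX_LINE:
            problems.append(
                f"line {number} is {len(line)} characters; keep lines under {MAX_LINE}"
            )
    return problems
```

Git passes the path of the message file as the first argument to a `commit-msg` hook, so the hook would read that file, call `check_commit_message`, print any problems, and exit non-zero if the list is not empty.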
Another thing I find helpful is VOCABULARY.md. Very often the agent would assume (connect?) a different thing than what I had in mind; with VOCABULARY I make sure that when I say "thing", claude and I both have the same "understanding" (connection?) of what "thing" is.
Example: https://github.com/rkuska/carn/blob/main/VOCABULARY.md
I always get the best results when I have live feedback with it.
If you aren't using AI for code review in 2026, why would you even bother? High quality, error-free, better-than-human code generation AND review is available for cheaper than ever. Why are you wasting your life reading code you didn't even write?
By this I mean there are some system prompts that make Claude very concerned about your autonomy.
I think in the future this type of system prompt will be embedded to force people to think a little.
Do yourself a favor and try Codex. Then do yourself an even bigger favor and try composer 2.5 from Cursor. It's a night-and-day difference. You don't even have time to get distracted, you stay in the zone.
The marketing strategy for the AI firms is to get people with poor reading and writing skills socially dependent on their "tools".
The selling point is that you can delay "reading and fixing it yourself in 5 minutes" ad infinitum, consequences be damned.
What we gain from LLMs is avoiding (heaven forbid) having to read and write for another 15 minutes.
So what’s the recommendation for Claude to have a feedback loop?
Because it’s not what follows in the article: _“Explore, then plan, then code.”, “Use plan mode…”, “Reference, do not describe.”_
For front end code it's giving claude a way to 'see' the work; for example, a Playwright MCP server seems common. https://playwright.dev/docs/getting-started-mcp
Could be a simple typo, but my mind jumped to `s/tip//g`, which is kinda interesting
Beyond the issue of AI serfdom, I just don’t want so much of my workflow to depend on “some other company.”
This whole setup is basically setting you up to have all your projects in a Claude SaaS lock-in.
I also think if AI was actually smart it wouldn’t need so much handholding. I don’t want to spend my time developing skills and writing markdown files to try to get this dumb thing to write code for me. Why isn’t the AI reading the codebase and understanding what to do?
Because it’s artificial, that’s why.
Generally, and more so with paid products, one should expect to get something that is ready to be used, tuned by whoever is selling it to the best of their ability. Instead, this is basically saying that the product is actually not much more than an empty box, and that it is your responsibility to augment it with third-party plugins and markdown texts that make it finally useful. And you better be carefully selecting the skills you install, you don't want to end up with second tier material made by GithubInfluencerA, you definitely need the work of GithubInfluencerB.
In the end, it's what is giving companies fuel to keep the hype running, because it lets them counter every possible argument or doubt about the technology, especially the ones made in good faith. No matter the problem you're facing, the blame is definitely on you, the user, for not setting up the tool in the right way.
I'm struggling in a lot of ways to accept LLMs, but if I ever become completely sold on them and take this technology seriously, it won't be before this mood has gone away.
Having an "unfinished" product is also a great marketing tool for companies like Anthropic: each skill/plugin/guide that you see on the internet is boosting their SEO + social validation metrics.
I would just say this: there is a difference between advice for using a product, and for _optimizing_ your use of a product. Between a user and a power user.
I think devs probably disproportionately like to see themselves as power users of any given tool, and thus with coding agents, there are 1000 "systems" being thrown out on GitHub on any given day. Generally speaking, it is safe to avoid these, especially if you're new to the tool.
But saying the fact that people are into optimizing their setups indicates some fundamental deficiency of the tool misses the point, I think.
Claude Code and Codex CLI (and OpenCode, and I'm sure many others) are _remarkably_ effective right out of the box. The teams behind these tools must make them _generically_ useful so that they are accessible to as many people, and as many use cases, as possible. That is part of why, when you become familiar with the tool, there is typically going to be a level of customization you can apply to it to optimize it for _your_ use cases, beyond the generic out of the box configuration.
Similarly, I don't think it would be fair to critique VS Code simply because most power users augment it with a suite of extensions. In fact, its customizability/extensibility is part of what makes it great.
Here, something different is going on instead of the usual "base tool is ok for 90% of use cases, remaining 10% is covered by plugins and extensions". A lot of developers are finding it difficult to commit to agentic coding workflows, feeling a stretch on a lot of different aspects.
Companies, with the help of a very prominent and vocal part of the web and social media community, are addressing every issue by simply blaming the users, saying it's their fault if they're not keeping up with all the alleged advancements in prompt strategies. See the whole "maybe you haven't tried it in the last two months, everything's changed now". While it's true that things have been moving very fast, the fundamental idea behind the technology is the same, and some concerns about it simply cannot be wiped away by scaling some factors.
Right like I bought an AWS EC2 m6a.metal instance expecting to get something that is ready to be used. Now being told to recite arcane "commands" from the cloud computing holy book. They claim their supposedly groundbreaking hypertext protocol isn't even accessible to mere mortals using a $6000/month EC2, the blame is definitely on you, the user, for not setting up the tool in the right way.
This sysadmin cloud cult is basically saying that the EC2 product is actually not much more than an empty box, and that it is your responsibility to augment it with third-party servers and interpreters and application source texts that make it finally useful. And you better be carefully selecting the tools you install.
It's not that Claude Code isn't a finished product per se; I certainly can find some value in it. What I'm saying is that the people selling it, through the convenient talks of prominent voices on the Internet and gullible C-suites, are trying to make it look like it's the only software engineer the world will need from now on. What makes me mad is not the deceptive advertising, that's already everywhere; it's the fact that the industry is happily believing all of this. If you raise any doubt, it must be that you haven't tried with the right skill.
I found this one: do you guys know of anything else?
Also, how is "Explore, then plan, then code" considered "beyond the basics"?
Their conclusion: environment-layer containment first, then model-layer steering. CLAUDE.md is the right configuration layer but it is not a containment layer. Worth thinking about whether your worst case is a lost afternoon or a lost database and all backups deleted, too: https://safebots.ai/compromise.html
But the more important point is the costs. People are starting to realize just how costly it can be to run agents without precomputing and caching: https://safebots.ai/costs.html and self-orchestrating agents can go up to 1000x: https://safebots.ai/kimi.html
This is also how you get a slop codebase that you won’t easily understand.
It becomes a labyrinth that only the Agent knows. It’s not a catastrophe when you’re making prototypes or projects like you see on X.
But if you are expanding your codebase or trying to build something more professional and maintainable, I find it important to explicitly spec things bit by bit so I can understand the code and somewhat keep my writing style in the codebase. But this is only productive when you have a fast model; otherwise it kills your chain of thought while you wait for the output.
If the model is slow, delegation is probably the only way.
The good bugs from AI are bugs that neither developer nor user has found, so it is more work.