I don't disagree, but I've been thinking about this a bit: a lot of _human_ written code was/is less-than-fine. And a lot of human devs didn't understand the context when they wrote it.
I'm not advocating that we fire devs, or evangelizing that LLMs are awesome. But I do wish there was a slightly more honest take on the pre-LLM world: it's not just about cost reduction, it's about solving some long-term structural deficiencies of the industry.
I don't think anyone truly disputes this, but I don't think it's a good argument.
If we already had a sort of poor quality output when humans were at the helm, then what can we expect from AI steered by the sort of humans that produced poor quality code?
My guess would be even more poor quality code, at much higher volumes than before. It is impossible for me to see this as an improvement over "some people always made poor quality code".
> When code production gets cheap, the cost doesn't disappear. It migrates.
> It was true then. It is unavoidably true now.
I find the constant critique of punchy style a bit tiring. It would be more productive for the grandparent to think about the content and state an opinion.
That's kind of similar to written content being posted and linked. There's an expectation that you are asking someone to take time to read it, and with LLMs now the cost to generate things to be read is a lot lower but our attention and capacity to read them remains the same.
there are many apps with ai-generated ideas, specs, and functionality. nobody uses them because of the contempt.
in either case, the part that's user-facing is ai-resistant
The right place for this is usually in the design doc or commit message, and robust engineering organizations will ensure that commits are cross-referenced back to design and requirements docs so you can trace decisions from git blame back to the actual rationale.
The same process also works pretty well with LLMs. Google, for example, is internally championing a process where the engineer has a dialog with the LLM to generate a design doc, oftentimes with an adversarial LLM to poke holes in the design. Once the design is fully specified, the last step is to ask the LLM to turn the design doc into code. This creates a human-readable artifact that traces the decisions that the human and AI collaboratively made, which then can be traced back from the code.
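One way to make that cross-reference mechanical is a commit trailer. A minimal sketch, assuming a hypothetical `Design-Doc:` trailer convention (the trailer name, the path, and the sample message are made up for illustration; nothing above says any particular organization does it this way):

```python
import re

# Hypothetical convention: each commit message ends with a trailer line such as
#   Design-Doc: docs/design/0042-session-tokens.md
# The trailer key is an assumption for this sketch, not a git built-in.
TRAILER_RE = re.compile(r"^Design-Doc:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def design_doc_refs(commit_message: str) -> list[str]:
    """Return every design-doc path referenced by a commit message."""
    return TRAILER_RE.findall(commit_message)


message = """Rotate session tokens on login

Tokens are now regenerated after authentication so a pre-login
token cannot be reused.

Design-Doc: docs/design/0042-session-tokens.md
"""

print(design_doc_refs(message))
```

Fed with the output of `git log --format=%B`, something like this could back a CI check that flags commits with no link to their rationale.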
It must be in order to have an effective AI-assisted software engineering workflow, but it's not something that's magically done for you (usually). Knowledge management of project context is everything. Even the little music app I'm building has >100 ADRs (https://en.wikipedia.org/wiki/Architectural_decision), and that's just one form of project memory.
Where are you keeping your design? If it's a WYSIWYG doc like Word, what would it take for you to switch to a more text-driven world?
Intent is something that is already sorely missing in most software projects.
At best you might get good commit messages, or messages that link back to good issue/PR discussion.
But something I realize is that /implementation/ intent is also important, not just higher level "why are we here in the first place" intent.
We tend to cargo cult the idea that you shouldn't have to explain what you're changing because it can be derived from the change. e.g. Commit messages should only explain the why since the what is redundant.
But the what is not redundant. Most bugs are a desync from the what, since you obviously didn't mean to implement the bug. And you didn't mean to use UUID v1 instead of UUID v4 for that session token.
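To make the UUID example concrete, here is a small Python sketch using only the standard library. The variable names are mine. Both calls return a `UUID` object, so the code alone doesn't tell a reviewer which one was intended; only a stated "what" does.

```python
import secrets
import uuid

# uuid1 is built from the current timestamp and a node id (often the MAC
# address), so its value is partly predictable.
guessable = uuid.uuid1()

# uuid4 is generated from random bytes.
random_id = uuid.uuid4()

# For session tokens specifically, the secrets module is the standard-library
# tool designed for security-sensitive random values.
session_token = secrets.token_urlsafe(32)

print(guessable.version, random_id.version)  # prints "1 4"
```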
Now that my workflow is centered around generating plan files instead of doing the implementation myself, the goal is specifying the plan with enough granularity that anyone could implement the intended change when given the plan.
It's data I wish we always had in a software project, but it would involve way too much work and energy to do this level of planning before every change until the LLM age.
Now that implementation is automated, we live at this specification step. And it's really the specification/plan that I want to see first rather than the code.
I think a great way to approach this is to change the PR / code review step to center around revising a markdown plan file until you can get the intent and specification right. That makes a lot more sense than only reviewing the code and then deriving the why and the intended what.
e.g. For my own projects, I'd rather someone submit a plan file as a PR that we can revise together, not code. The code is a derivation of the plan. And when code is derived from a plan, then the plan must contain all of the knowledge that is applied to (or missing from) the code since it's the source of truth.
Compare that to the status quo where you start with a PR code change and then kinda work backwards to synthesize the why and intended what, then see if the real what synchronizes with intention.
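A plan-first review could even be gated mechanically. A rough sketch, assuming a hypothetical convention where every plan file carries `## Why`, `## What`, and `## Out of scope` headings (those section names and the sample plan are invented for this example):

```python
# Hypothetical required sections for a plan file; adjust to your own template.
REQUIRED_SECTIONS = ("## Why", "## What", "## Out of scope")


def missing_sections(plan_markdown: str) -> list[str]:
    """Return the required headings that the plan file does not contain."""
    headings = {line.strip() for line in plan_markdown.splitlines()}
    return [s for s in REQUIRED_SECTIONS if s not in headings]


plan = """# Switch session tokens to random values

## Why
Session tokens must not be guessable.

## What
Generate tokens with uuid4 instead of uuid1 in the login handler.
"""

print(missing_sections(plan))  # prints "['## Out of scope']"
```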
And then I make a decision based on that.
I guess I'm wondering if the article is missing half the picture. Yes - AI is wrong some of the time (and that % varies based on a host of variables). But it can read code as well as just write it. And that does matter as it changes the trade-offs this article is weighing up.
Claude is even better at helping you understand the code base than it is at writing code! It can look at a bunch of files and give you an accurate rundown in ten minutes.
Several projects I work on call into or interact with gigantic codebases. A couple years ago I would sometimes have to allocate 30 minutes to either trace through the code base or set up and attach a debugger to step through the code until I found the part I needed to know about.
Now I send an LLM to go find it and it comes back with a list of files and line numbers in minutes.
It’s still not perfect. I had a codebase walk this morning where even GPT-5.5 extra high failed to find something I knew was in there on the first pass. It got it with some more directed prompting. If you delegate trust 100% to an LLM you will get bit eventually.
Once I even asked it to create a manim video about it for me.
With a new code base my first goal is to understand how it CRUDs state. What structures and what operations?
I'm less concerned at the outset whether it's sorting carrots or processing orders for shirts.
AI seems to work way better for me when I tell it to ignore the use case and focus on surfacing runtime and mathematical operations embedded in the code.
Constraints on new abstraction and telling it to stick to math terms and types and objects also seems to help avoid hallucinations and layers of useless indirection.
Claude in its default configuration has untapped potential for explaining and documenting code because it defaults to writing no comments. I added this to my global CLAUDE.md and so far so good:
# Comments
Keep existing comments unless they're wrong
or stale. Err slightly toward adding short
explanatory comments for non-obvious code.
> The code they [LLMs] produce is often fine. It works. It passes tests. It might ship as-is
The blog posts they [LLMs] write is often fine. It works. It passes tests. It might ship as-is
If you get them involved in the design process, they feel heard. Feeling heard is one surefire way to have a person feel involved. Feeling involved fosters a sense of ownership and pride which in turn helps keep a person engaged.
I know nothing about AI code generation (or about AI in general), but I wonder if you could include in your prompt a request that the AI describe the reasons for its choices and actually include those reasons as comments in the code.
This is not the case anymore. I never thought LLMs write bad solutions, but when you let one think for you, you lose something important: understanding. And when something breaks, some people consider this proof that the LLM is the problem. And in a sense it is. But you are also to blame. And that's exactly the issue with this whole industry: in order to move faster you don't need to "type" faster. A person bashing his hands on the keyboard randomly at top velocity can get pretty high wpm. In order to move faster you need to _think_ faster, be _snappier_ and _sharper_, and most people aren't.
I like how one of my colleagues phrased it in a company meeting. He asked one of the C-levels: "For LLMs to 10x me, I need to let them make all decisions and dictate intentions. By percentage, how much of the code do you want me to _know_?"
---
Me: I've noticed that Claude is deleting my comments and tends not to insert many of its own. I don't want to go too crazy, but can we modify CLAUDE.md to dial up the comments a little? Like, try not to delete comments that are still relevant, and add a few explanatory comments.
Claude: [... proposal for modifying CLAUDE.md...]
Claude: Want me to apply this, tweak the wording, or place it differently? Note that your global ~/.claude/CLAUDE.md-style instructions (visible to me as system guidance) currently say "default to writing no comments" — the project-level rule above will override that for this repo, which is what you want.
Me: Where is this rule that says "default to writing no comments"? That seems crazy?
Claude: That text is part of the system prompt I'm given by Claude Code itself — it's not in any file in your project or in your ~/.claude/ config. It's baked into the harness's default instructions to the model.
Claude: Specifically, two lines from my system prompt:
Claude: Under "Doing tasks": "Default to writing no comments. Only add one when the WHY is non-obvious..."
Claude: Under "Text output": "In code: default to writing no comments. Never write multi-paragraph docstrings or multi-line comment blocks — one short line max."
Really? I find that Claude really likes to write "why nots" in comments when iterating on implementations and fixing bugs, to the extent that the comments grow into spot-logs of overly-specific documentation of what was tried and why it was scrapped.
Writing a skill / set of rules around what makes a good commit message would encourage the LLM to record its reasoning (however much we truly consider it to be "reasoning").
I think Claude is just behaving like all those programmers who make a VERY BIG DEAL about how much they HATE HATE HATE comments that might reiterate what the code does so will go to extreme measures to avoid them.
After using AI for months (Claude, Gemini, ChatGPT) it is extremely rare for their code to work 'as is' first shot and almost always requires several iterations and cleaning up edge-cases.
When it does work 'first shot' it's usually when it's transferring existing working code to a new project which is slightly different.
I believe that increases the chances of one-shot code working, though it's also possible that it did that against Opus 4.5 and isn't necessary against Opus 4.7 but I haven't spotted the difference yet.
it's unbelievable, it will do the iterations for you, it will easily work 12 hours straight until it's a good output