When I was working we used to get requirements that literally said things like, "Get data and give it to the user". No definition of what the data is, where it's stored, or in what format to return it. We would then spend a significant amount of time with the product person trying to figure out what they really wanted.
In order to get good results with LLMs we need to do something similar. Vague requirements get vague results.
This has significantly helped devs and made sure that requirements are very clear.
Honestly, with the first step, it seems the PMs are already halfway there to implementation of the feature so I wonder if in the future they'll just do everything themselves and a few devs will be around as SDETs rather than full blown implementers.
Just lol. Is this what you guys mean by productivity boost?
Comical. LLMs aren’t all that great - it’s more that most orgs are horribly inefficient. Like it’s amazing how bad they are.
That’s why Elon succeeded with SpaceX - he saw how horribly inefficient the industry was, used that thinking to take a gamble, and it’s paid off.
And then someone copy pastes it into Claude and now those inaccuracies become part of the code and tests.
It's the equivalent of writer's block, which is why common advice for writers is to put anything they can onto the page and edit it later.
Yes please, I've seen the vibecoded slop PMs put out every day because software engineering is simply not a skill they have, and I'd love to make a LOT of money fixing their crap once it dies in production <3
An LLM will just say, "Sure! Here's the fully implemented code that gets the data and gives it to the user." and be done with it.
> What data should I retrieve, and where should I get it from? Please specify at least: ...
And it then goes on to ask just exactly what is necessary, being all constructive about it.
But the point still stands: in most contexts, the LLM will fill in the blanks with whatever it deems appropriate, like an overconfident intern at best and a bull in a china shop at worst.
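To make the "fill in the blanks" problem concrete, here's a minimal sketch of a pre-flight check that turns a vague request into clarifying questions instead of silently guessing. Everything here is hypothetical: the `DataRequest` type, its three fields, and the question wording are my assumptions, loosely based on the "what data, where is it stored, what format" example upthread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataRequest:
    """Hypothetical spec for a 'get data and give it to the user' ticket."""
    what: Optional[str] = None    # which data is wanted
    source: Optional[str] = None  # where that data is stored
    fmt: Optional[str] = None     # format to return it in


# One question per field; dict order is the order the questions are asked in.
QUESTIONS = {
    "what": "What data should be retrieved?",
    "source": "Where is that data stored?",
    "fmt": "In what format should it be returned?",
}


def clarifying_questions(req: DataRequest) -> list[str]:
    """Return one question for every field that is still unspecified."""
    return [q for name, q in QUESTIONS.items() if not getattr(req, name)]


def ready_to_implement(req: DataRequest) -> bool:
    """A request is implementable only once nothing is left to ask."""
    return not clarifying_questions(req)
```

The point isn't the code, it's that "ask before building" has to be an explicit step somewhere, whether a human or a model is doing the building.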
It's the wrong thing for important things under the hood (like durability and security requirements) that are not tangible to them.
When we talk about "the" bottleneck being specs, it just isn't the case that specs are the only thing LLMs do poorly. They're really bad at a lot of stuff in the SDLC.
They're also good at providing results which are bad but look OK if you either don't look too closely or don't know what you're looking for.
This was substantially predicted by Fred Brooks in 1986 in the classic No Silver Bullet [1] essay under the sections "Expert Systems" and "Automatic Programming".
In it, he lays out the core features of vibe coding and exactly the experience we are having now with it: Initial success in a few carefully chosen domains and then a reasonable but not ground breaking increase in productivity as it expands outside of those domains.
[1] https://worrydream.com/refs/Brooks_1986_-_No_Silver_Bullet.p...
The LLMs turn out fully formed clones of stuff for which there exists copious amounts of code openly searchable on the web doing the exact same thing.
LLMs require developer-like specification, task/subtask breakdown and detail where such example code already exists.
As a professional prior to LLMs, how many problems that you work on have many existing free solutions but you neglected to use that code and decided to spend days doing it yourself?
I can only think of hobby projects, like writing yet another emulator, expression parser or media processor in a new language I'm trying to master.
In a professional setting, you would always diligently explore libraries and only implement your own if there is no suitable alternative.
Only when the existing free solutions are licensed with something like GPL. Now I can just say, write me a C webserver library similar to mongoose and I get the functionality without the license burden.
And you now own full responsibility for maintenance.
Also I was joking, I'd never do that; feels gross. But I suppose it is a legitimate "productive" use of AI.
"what does X means? how will it work?"
while a programmer will ask about all the cases.
Can't good marketing teams, backed up by World Class Product people, sell anything we build, more or less?
</devil's advocate>
"Make a facebook clone" is the vague human promise to the end user. The reality is that it leads to so many assumptions which are insurmountable due to the vague interpretation so you have to change your requirements in the end to claim success.
Thus everything turns into a mediocre compromise. There is no exceptional outcome, which is what makes a marketable product. There are just corpses everywhere.
You need something better to both define requirements and implement them than this technology.
Anyone who thought that gap could be shrunk substantially lives in delululand.
Hence why we haven’t seen this explosion of ‘really great’ products come out.
Many will continue to parrot ‘bro but the models changed I swear’. I’m sure they did. But you’re missing the damn point.
LLMs just take the same vague or poor requirements and make them look believable until you dig in to them.
This is a big divide in HN LLM discussions. I am in the same no-specs work background camp, and so the idea that the humans who feed that kind of input to dev teams are suddenly going to get anything out of an LLM if they feed it the same thing directly is laughable. In most orgs in my career there has been no product person and we just talked directly to end users.
For that kind of org, it will accelerate some parts of the SWEs job at different multipliers, but all the non-dev work to get there with discussions, discovery, iteration, rework, etc remains.
If the input to your work is a 20 page specification document to accompany multi-paragraph Jira tickets with embedded acceptance criteria / test cases / etc, then yes, there is a danger the person creating that input just feeds it into an LLM.
Probably why I haven't ended up in any.
https://web.archive.org/web/20161211074810/http://www.commit...
On the other hand, it feels like we've been over this tens of times recently, on HN specifically and IRL at work. Another blog post isn't going to convince leaders that this is how the world works when they are socially and financially incentivized to pretend like AI really will speed things up. So now I just wait for their AI projects to fail or go as slowly as previous projects and hope they learn something.
Humanity knows how to solve starvation. Clear routes were laid out long ago. The work is in adoption.
So I am spending my days gardening and obsessively working on personal coding projects with these agentic tools. Y'know, building a high performance OLTP database from scratch, and a whole new logic relational persistent programming environment, a synthesizer based on some funky math, an FPGA soft processor. Y'know, normal things normal people do.
So I know what these tools are capable of in a single person's hands. They're amazing.
But I hear the stories from my friends employed at companies setting minimum token quotas or having leaderboards of people who are "star AI coders" telling people "not to do code reviews" and "stop doing any coding by hand" and I shake my head.
I dipped my toes into some contract work in the winter and it was fine but it mostly degraded into dueling LLMs on code reviews while the founder vibe coded an entire new project every weekend.
These tools suck for team work or any real team software engineering work.
I'll just let this shake out and sit out until the industry figures it out. The only places that are going to be sane to work at are places with older wiser people on staff who know how to say "slow down!" and get away with it.
In the meantime, quantities of cut rhubarb $5 a bunch in Hamilton, Ontario area for sale. Also asparagus. Lots and lots of asparagus.
Eg: I had a product manager say to me that he envisions a future where any meeting with stakeholders that does not result in an interactive prototype by the end of the meeting would be considered a failure. This feels directionally correct to me.
The other thing I expect to see is Vibecoding being the "Excel 2.0": it allows a lot of self-serve building of interactive apps, and those apps end up in a continual war with IT to turn them into something with better security guarantees, proper access control & logging, scalability, change management etc.
But the larger historical point here is that every revolutionary transition produces, in the early stages, "Steam Horses". The invention of the steam engine had people imagining that the future of transportation would involve horse shaped objects, powered by steam, pulling along conventional carts. It wasn't until later developments that we understood the function of transportation as divorced from the form.
I started talking about Steam Horses originally in the context of MOOCs, which was a classic Steam Horse idea.
If that sounds familiar, it’s because it’s what dang did over the course of several years.
It’s taken a few weeks. I started right around May, and now it’s able to render large HN threads (900+ comments) within a factor of five of production HN performance. (Thank you to dang for giving actual performance numbers to compare against.)
A couple days ago, mostly out of curiosity, I ran Claude with “/goal make this as fast as HN.” Somewhat surprisingly, it got the job done within a couple hours. I kept the experiment on separate branches, because the code is a mess, as all AI-generated code is at first. But the remarkable part is that it worked, and I can technically claim to have recreated HN within a few weeks.
The real work is in the specifications. My port of HN is missing around a hundred features. Things from favorited comments, to hiding threads, to being able to unvote and re-vote.
But catching up to HN is clearly a matter of effort (time spent actually working on the problem with Claude), not complexity. Each feature in isolation is relatively easy. Getting them all done within a short time span without ruining the codebase is the hard part. And I think that’s where a lot of people get tripped up: you can do a lot, but you have to manage it tightly, or else the codebase explodes into an unreadable mess.
It’s true that if you don’t do that crucial step of “manage the results”, you’ll end up making more work for yourself in the long run, by a large factor. But it’s also true that AI sped me up so much that I was able to do in weeks what would’ve otherwise taken years (and did take dang years). I’m not claiming parity, just that I got close enough to be an interesting comparison point.
AI can clearly accelerate us. But we need to be disciplined in how we use it, just like any other new tool. That doesn’t change the fact that it does work, and I think people might be underestimating how good the results can be.
I think projects where correct is very clearly defined can benefit from LLM acceleration, as you're describing here.
But so much of modern software development is figuring out what the right thing to build is. And in those situations, I don't think LLMs provide nearly as much benefit.
No, the code is actually almost always correct. The way it’s added is probably not what you’re going to like, if you know your code base well enough. You know there’s some ceremony about where things are added, how they are named, how many comments you’d like to add and where exactly. Stuff like that seems to irritate people like me when the agent doesn't get it right, and it seems to fail even if it’s in the AGENTS.md.
> If you were to give human developers the same amount of feature/scope documentation you would also see your productivity skyrocket.
Almost 2 decades in IT and I absolutely do not believe this can ever happen. And if it does, it’s so rare, it’s not even worth talking about it.
That's not my experience, especially when the inputs are bugs or performance issues. It frequently hallucinates and misdiagnoses without a guiding hand. However, it can still do RCA and analysis well and improve efficiency if you keep an eye on what it's doing and push it in the right direction.
> If you were to give human developers the same amount of feature/scope documentation you would also see your productivity skyrocket.
I think you run into a ceiling on how fast a person can digest and analyze the info compared to a machine.
But for a small studio, or independent developer, LLMs are a big game changer. Being able to do a mediocre job at 5 people's jobs is a huge leap over trying to get by without those jobs - relying on third party assets or other sorts of content, or even worse - doing a really awful job of trying to improv those jobs. See the UI of basically any program ever that was clearly laid out by a programmer and not a designer. Or there's the whole trying to rip off stuff from dribbble, but lacking the skills to do so. Whereas with AI, you can suddenly competently rip off everything and everybody - it's basically their entire MO.
What are the chances that this is the Gell-Mann amnesia effect? Sounds like the textbook definition of it.
Personally, I find the exact opposite to be true. LLMs only help me when I already know exactly what I'm doing.
To wit, the answer pre-AI was to hire an expert on that thing, and you would then critically assess their work product, despite being unable to build it yourself.
I get most value from them when I'm asking it to either fill in the blanks of something already half implemented or when I need some feature in a given context/language that only exists in other languages
Programming is a logical circuit breaker. There is a wide range of incompleteness that halts development or puts the solutions in an unpublishable state.
A product person has no compiler, no RAM, no database, no state machine. There is nothing that can fail. There are probably strategies to weed out some issues, but none will be perfect.
We need to combine reality with computers. Computers set the constraints and we can only check if we are in bounds of the constraints by solving the problems with computers.
Oddly enough AI has so far nothing to offer to improve the "product people" problems.
So well said.
AI is unveiling how the bureaucracy is the slow part.
Computing has been doing that for decades. If your process is fucked, computers make it fucked faster.
It’s just that now, we have entire generations alive that have never seen a world without digital computers. ~LLMs~ AI is a fun new lever in some uses so clearly it is finally the hammer that will drive the screws and bolts for us, with less effort on our part!
They just have to learn from experience. It’s what you do when you can’t be bothered to learn the lessons of the past.
Work in large orgs long enough and you will recognize these creatures. Ladder climbing is a skill orthogonal to adding any value to the customer/company.
It's happening about 10x faster than any other I've seen or read about.
Consider how long it took just to get barcode scanners rolled out in grocery stores. Or direct payment terminals. Or how many decades it's taken to get robotics into the manufacturing of cars at scale. I worked through the .com boom and I can tell you that "webification" took 10 years or more for most businesses (and many of them have since given up and just have a Facebook page instead etc).
This is a little insane what's happening now. It really does change everything. People who don't work in software I don't think have any idea what's coming.
It's highly salient to management, and being forced top-down by them at 10x speed, for sure, because they see a future cost save to reduce headcount.
For certain technical roles it's a force multiplier and already very saturated for sure.
On the other hand there's a lot of solution-looking-for-problem going on in large orgs where layers of management have been banging the table for 2-3 years on AI KPIs without any value being delivered.
In the weekly AI wins mail at a friend's company, multiple non-technical people were bragging about how AI has saved them 15 minutes a day by summarizing their morning inbox. This was the big game changer for them.
Because the "rate of improvement" is only astonishing in well understood areas and really only astonishing if you yourself are not that great at what you do. Speaking for myself here, my job is extremely safe given that my boss doesn't wanna sit there and prompt AI all day and I work in a fun little 4 person company. We already have plans for the next 3 years which involve me :-)
Once tooling (e.g. agent harnesses, external tools) becomes more mature and consistent, the other 2 will become less of a bottleneck.
If I were to take a gamble here, I would argue that development will at one point reach the more ideal scenario, whereas the project planning, the scoping, will become longer. Also, the documentation section will take almost the same as the development, slightly longer at the edges.
The new AI-assisted era will most likely push companies to adopt Waterfall management rather than Agile.
However, while the engineering team successfully fast-tracked development, UAT, and production testing, largely thanks to AI, other departments only began digging deeper into the project toward the end of April. To be fair, they do use AI in their workflows to some extent, but they haven't adapted their processes to keep pace with engineering's increased productivity.
In my opinion, this lag is mostly because many employees in those departments are older and hesitant to change their routines. While I understand that resistance to change is a natural human trait, what comes to my mind is this beautiful German adage, "Wer nicht mit der Zeit geht, geht mit der Zeit" which loosely translates to, "Who doesn't change with time is left behind by time"
>Process blocked on human inputs
Have AI check chat, email, issue tracker and see who it's blocked on and what latest status is. It may not save a huge amount of time but it can dig through the info pretty quick.
>Exploration
Once again, have it scour issue tracker, chat, customer suggestions, product documentation and summarize history and current status. Much quicker than setting up new meetings to try to rediscover and organize existing info.
Another use case, have agent build prototype, hand to people, have AI summarize and integrate feedback.
Claude or ChatGPT + Slack MCP + Jira MCP + Google Docs MCP + internal knowledgebase MCP + gh (GitHub) CLI + Datadog MCP--really 1 MCP per process in the Gantt chart--has been a huge boost at work just digging through context scattered all over the place and summarizing.
That said, it definitely still needs supervision and hand holding along the way
Another aspect that is not captured here is that the lawyers and subject matter experts will also be using AI to speed up their parts.
The way AI makes your processes go faster will have little to do with cutting software development time in itself. It will come from letting an organization be built with fewer people, which in itself lowers your misalignment issues. A giant company of 200K people will still be about as messy as one today, but you might be able to do a lot more with the same number of people, just like a lone programmer today, without AI, already does quite a bit more than anyone could do by themselves in the 80s.
Maybe some of the advantages are that you don't need quite as many developers, or maybe you can use a smaller marketing team, or you don't need to spend that much time answering questions, because an LLM is doing it for you, and it's tracking what's been asked of it, turning the questions into product research. Either way, the gains come from being able to run leaner, and therefore minimizing organizational misalignment.
The broader issue is the sheer number of businesses that built massively overcomplicated stacks, bought heavily into bandage solutions like AWS Lambda, got on dumb tech bandwagons like big data, NoSQL etc. This is just another one.
I think you can engineer yourself into being leaner, in some businesses AI will help but we’ve had over a decade of “we can just add more complexity” and it just does not work.
I’m a Rails guy. People forget that for every unicorn there are ten 9-figure businesses just ticking away in some niche with a VPS, Rails and like 4-10 devs.
This is how I felt when I first started seeing people discuss things like AGENTS.md etc.
Another option is that lower software costs would significantly reduce the cost of whatever non-software product the software supports (manufactured good, electricity, services, telecom etc.) but I don't know in which industry the cost of software is a large portion of the overall product cost.
And there's another thing. A company that makes tractors can't produce food without land. A company that makes metal machining equipment can't make cars without the raw materials. But a software company that makes software that automatically makes software could just produce the result software itself rather than sell the software-making software. If AI ever reaches the point it makes software at a marginal cost that's not much higher than the cost of the AI itself, what would be the incentive of selling that AI?
> ...but that doesn’t mean it’s generating the correct code.
Something I'm observing is that now a lot of the pressure moves to the product team to actually figure out the correct thing to build. Some product teams are simply not used to this and are YOLO-ing prototypes now, iterating, finding out they built and shipped the wrong thing, and then unwinding. Before, when there was the notion that "building is expensive", product teams would think things through, do user interviews up-front, actually do discovery around the customer + business context + underlying human process being facilitated with software.
This has shortened the cycle to first working prototype, but I'd guess that in the longer scale, it extends the time to final product because more time is wasted shifting the deliverable and experience on the user during this process of discovery versus nailing most of the product experience in big, stable chunks through design.
At the end of the day, there is a hidden cost to fast iterative shifts in the fundamental design of software that humans are meant to use and are responsible for operating. First is the cost to the end users, who have to stop, provide feedback, and then retrain on each cycle. Second is that the complexity compounding in the underlying implementation, as product learns the requirements and vibe-codes the solution, creates a system that becomes very challenging for humans to operationalize and maintain.
Ultimately, I think the bookends of the software development process are being neglected (as the author points out) to the detriment of both the end users and the teams that end up supporting the software. I do wonder if we're entering an "Ikea era" of software where we should just treat everything as disposable artifacts instead.
Careful who you share this information with - better to roll with the kool-aid drinkers when they're holding the cards.
There's no point in falling under the illusion that they'll finally get it now. This will all fall on deaf ears. They're convinced they're automating us out of existence when in fact they'll need the services of people who can surf complex systems more than ever.
We will be able to do more than ever and potentially faster. The issue remains that most of the things these people ask us to do and want us to do and pay us to do remain basically stupid and, as TFA points out, the last mile of getting shit properly shipped isn't going to speed up. It's going to slow down.
If you want to see what happens when you put people in charge who sincerely believe in the "AI automates SWEs out of existence" mantra, take a look at the code quality of Claude Code and the recent "bun rewrite in Rust" fiasco.
...but yeah most organizational processes & people aren't set up for leveraging it and roll out will be slow (same on learning where it does / doesn't work).
I’m currently working on a data migration for an enormous dataset. I’m writing the tooling in Go, which is a language I used to be very familiar with, but that I hadn’t touched in about 12 years when I started this. It definitely helped me get back into Go faster.
But after the initial speed up, I found myself in the “last 10% takes the other 90% of the time” phase. And it definitely took longer for me to wrap my head around the code than it would have if I’d skipped the AI. I might have some overall speed up, but if so it’s on the order of 10-20%. Nothing revolutionary.
I have been able to vibe code a few little one off tools that have made my life a little easier. And I have vibe coded a few iPad games for my kids for car trips, but for work I still have to understand the code and reading code is still harder than writing it.
This is also not from lack of trying. I spent $1000 last week during a company wide “AI week”, mostly on trying to get AI to replicate my migration tooling, complete with verification agents, testing agents, quality gates, elaborate test harnesses etc…
I’d let Claude (opus 4.7 max effort) crank away overnight only to immediately find that it had added some horrible new bug or managed to convince the verification agent that it wasn’t really cheating to pass my quality tests.
What I learned from last week is that we are so far away from not needing to understand the code that everyone who says otherwise is probably full of shit. Other people who I trust who have been running the same experiments have told me the same thing.
Until and unless we get to that point, it’s always going to be a 10-50% speed up (if that).
For many businesses that is revolutionary.
Not sure that's enough magic to make the math work for the trillions being invested, but on a ground level within companies even small wins stack up. You may have burned through $1000 without getting much done, but from a company perspective they've probably got an employee with better instincts as to what does or doesn't work
Where I have a problem is with the FOMO, panic, and mania that has come down from up top. There are people in my company saying that we should be spending 3x our salaries in tokens.
But if you’re in a business where a 20% speed up is revolutionary, there are so many things that have been on the table for years that you could have been focusing on. I’ve seen at least 5 advances over the last 20 years with that kind of boost.
That’s probably about what you’d get from spending time really learning vim or Emacs.
I think many things that were true prior to AI are still true or more so today, but new workflows and processes altogether are needed. I suspect that comprehensive, detailed planning and specification documentation must be assembled in advance of beginning code (akin to waterfall) when working with AI agents. Furthermore, I still believe customers and other key stakeholders need to be involved early and often so that the product can iterate towards a better ultimate end state (i.e., agile). Unlike prior to AI, it's completely plausible to implement both types of approaches, and they aren't mutually exclusive. We can do comprehensive, exhaustive, thorough planning and specification documentation prior to handing off to dedicated engineering and product teams, AND we can work quickly and iteratively via sprints that aim for frequent meetings and updates with the stakeholders that matter.
I also think the same validation gates that mattered before -- linting, SASTs, but most importantly, comprehensive automated testing that gets run locally and in CI/CD and is regularly expanded to cover all expectations about the behavior and structure of newly-implemented functionality -- continue to matter now, more than ever.
New tools and processes also must be built to make human review, the single biggest bottleneck in software development today, simpler, more streamlined, and less taxing. I think tools like CodeRabbit and Qodo can help automate and expedite the code-review and approval processes, but they would be even better if they were working off more surgical and tiny edits. Bloated, verbose AI-generated code edits are the core problem here. Process management techniques to mitigate the problem of AI code overload can prohibit the submission of AI-generated PRs, require senior engineer approval of any PRs prior to merging, or cap the number of lines or changes in a PR. More sophisticated processes like Graphite's stacking of PRs are genuinely helpful in breaking down massive PRs into smaller chunks.
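As a rough illustration of the "limit the number of lines changed" gate, here's a minimal sketch that sums a PR's size from `git diff --numstat` output. The 400-line budget and the function names are my own assumptions, not anything from a specific tool.

```python
MAX_CHANGED_LINES = 400  # assumed budget; pick whatever your reviewers can digest


def diff_size(numstat: str) -> int:
    """Total added + deleted lines from `git diff --numstat` output.

    Each line is <added><TAB><deleted><TAB><path>. Binary files report "-"
    for both counts and are skipped here.
    """
    total = 0
    for line in numstat.splitlines():
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # blank or unexpected line
        added, deleted = parts[0], parts[1]
        if added == "-" or deleted == "-":
            continue  # binary file, no line counts available
        total += int(added) + int(deleted)
    return total


def within_budget(numstat: str, limit: int = MAX_CHANGED_LINES) -> bool:
    """True if the diff is small enough to send for human review."""
    return diff_size(numstat) <= limit
```

In CI you'd feed it something like the output of `git diff --numstat origin/main...HEAD` (the base branch name is an assumption) and fail the job when `within_budget` returns False.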
Finally, precision-editing tools for AI coding assistants like HIC Mouse (full disclosure, my project) move beyond the options agents have today, whole-file replacement or exact string-replacement, and let agents make surgical, tiny changes at the editing-tool layer that don't touch any unrelated content. They also give agents specialized visibility, recovery, and next-step guidance mechanisms that safeguard AI workflows. Together these can materially reduce AI code slop by alleviating burdens upstream of code reviewers, both automated and human.
The bottom line: Shipping secure, production-grade code was never easy and always took a long time. It's not necessarily easier now just because certain aspects of the overall process can be generated much more rapidly. Arguably, the hardest parts like human review and approval are much harder now -- not easier. Solutions will take hard work and must be tested in the crucible of real-world enterprise usage. I am guessing that companies that deploy successful processes will be wildly profitable. Those that don't, including well-established incumbents, will fail. I do think AI absolutely can give organizations a game-changing boost in development velocity of genuinely high-quality code that might even be better than anything ever created previously. I also fully agree with the author that for many organizations, AI will not make their processes go faster and may even slow things down.
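For anyone who hasn't looked at the editing-tool layer, here's a minimal sketch of the baseline exact string-replacement style of edit mentioned above. To be clear, this is a generic illustration with a hypothetical function name, not how HIC Mouse or any particular agent harness implements it.

```python
def replace_exact(text: str, old: str, new: str) -> str:
    """Replace `old` with `new` only if `old` occurs exactly once in `text`.

    Refusing zero or multiple matches is what keeps the edit from silently
    landing in the wrong place.
    """
    if not old:
        raise ValueError("old must be non-empty")
    count = text.count(old)
    if count != 1:
        raise ValueError(f"expected exactly 1 match, found {count}")
    return text.replace(old, new, 1)
```

The failure mode is easy to see: the agent has to quote enough surrounding text to make the match unique, so a small change drags along a large `old` string, and any whitespace mismatch makes the edit fail outright.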
It might be the ultimate tool of disruption.
Have you thought about pair programming together with the AI?
My LLM outputs are intentional, in my style, and tightly reviewed by myself.
I'm also emitting Rust, which I've found to be the very best language to work with in AI. The AST and language design are focused around control flow and error handling. The borrow checker, sum types, filtering and mapping make it such that good design is idiomatic.
There's a lot of JavaScript, Python, PHP, and Java in the world. A lot of it isn't great. The architectures and styles are wildly varied too. Rust doesn't have that problem. The training data is really solid and idiomatic.