They never helped me solve complex problems with low-level libraries. They can not find nontrivial bugs. They don't get the logic of interwoven layers of abstractions.
LLMs pretend to do this with big confidence and fail miserably.
For every problem I need to turn my brain to ON MODE and wake up, the LLM doesn't wake up.
It surprised me how well it solved another task: I told it to set up a website with some SQL database and scripts behind it. When you click here, show some filtered list there. Worked like a charm. A very solved problem and very simple logic, done a zillion times before. But this saved me a day of writing boilerplate.
I agree that there is no indication that LLMs will ever cross the border from simple-boilerplate-land to understanding-complex-problems-land.
I can confirm that they are completely useless for real programming
And I can confirm, with similar years of experience, that they are not useless. Absolutely incredible tools that have saved hours and hours helping me understand large codebases, brainstorm features, and point out gaps in my implementation or understanding.
I think the main disconnect in the discourse is that there are those pretending they can reliably just write all the software, when anyone using them regularly can clearly see they cannot.
But that doesn't mean they aren't extremely valuable tools in an engineer's arsenal.
If you know the problem space well, you can let LLMs (I use Claude and ChatGPT) flesh it out.
You and I must have completely different definitions of "real programming". In this very comment, you described a problem that the model solved. The solution may not have involved low-level programming, or discovering a tricky bug entrenched in years-worth of legacy code, but still a legitimate task that you, as a programmer, would've needed to solve otherwise. How is that not "real programming"?
Perhaps you're doing some amazing low-level work, but it feels like you're way overestimating how much of our industry does that. A massive amount of developers show up to work every day and just stitch together frameworks and libraries.
In many ways, it feels similar to EVs. Just because EVs aren't yet, and may never be, effective at moving massive amounts of cargo in a day with minimal refueling, doesn't mean that they aren't an effective solution for the bulk of drivers who have an average commute of 40 miles a day.
Three months ago I would have agreed with you, but anecdotal evidence says Codex 5.2 and Opus 4.5 are finally there.
6-12+ months ago the results I was getting with these tools were highly questionable but in the last six months the changes have been pretty astounding
People say a lot of things, and there is a lot of context behind what they're saying that is missing, so then we end up with conversations that basically boil down to one person arguing "I don't understand how anyone cannot see the value in this" with another person thinking "I don't understand how anyone can get any sort of value out of this", both missing the other's perspective.
I’m in enterprise ERP.
"completely useless" and "real programming" are load bearing here. Without a definition to agree on for those terms, it's really hard not to read that as you're trying to troll us by making a controversial unprovable claim that you know will get people that disagree with you riled up. What's especially fun is that you then get to sneer at the abilities of anybody making concrete claims by saying "that's not real programming".
How tiresome.
Ultimately it all boils down to the money - show me the money. OAI have to show money and so do its customers from using this tool.
But nope, the only thing out there where it matters is hype. Nobody is on an earnings call clearly showing how they had a numerical jump in operating efficiency.
Until I see that, this technology has a dated shelf life and only those who already generate immense cash flows will fund its continued existence given the unfavourable economics of continued reinvestment where competition is never-ending.
>They never helped me solve complex problems with low-level libraries. They can not find nontrivial bugs. They don't get the logic of interwoven layers of abstractions.
>LLMs pretend to do this with big confidence and fail miserably.
This is true for most developers as well. The mean software developer, especially if you outsource, has failure modes worse than any LLM and round-trip time is not seconds but days.
The promise of LLMs is not that they solve the single most difficult tasks for you instantly, but that they do the easy stuff well enough that they replace offshore teams.
But that's exactly the *promise* of LLMs by the hypepeople behind it.
I do not know and do not care what the "hypepeople" say. I can tell you that, by pure logic alone, LLMs will be superior at simple and routine tasks sooner, which means they will compete with outsourced labor first.
LLMs need to be measured against their competition and their competition right now is outsourced labor. If an LLM can outperform an offshore team at a fraction of the cost, why would any company choose the offshore team? Especially when the LLM eliminates some of the biggest problems with offshore teams (communication barriers, round trip times).
If LLMs take any programmer jobs they will at the very beginning make those outsourced jobs obsolete, so the only relevant question is whether they have done that or are in the process of doing so. If they don't, then their impact will be minimal; if they do, then their impact will be massive. I think that this line of thinking is a far better benchmark than asking whether an LLM gets X or Y question wrong Z% of the time.
And "a few moments later" happens the same as with those "cost effective" clouds.
[1] https://www.heise.de/en/news/IDC-Many-companies-want-partly-...
[2] https://www.idc.com/resource-center/blog/storm-clouds-ahead-... (original)
But something tells me “this time is different” is different this time for real.
Coding AIs design software better than me, review code better than me, find hard-to-find bugs better than me, plan long-running projects better than me, make decisions based on research, literature, and also the state of our projects better than me. I’m basically just the conductor of all those processes.
Oh, and don't ask about coding. If you use AI for tasks above, as a result you'll get very well defined coding task definitions which an AI would ace.
I’m still hired, but I feel like I’m doing the work of an entire org that used to need twenty engineers.
From where I’m standing, it’s scary.
Planning long running projects and deciding are things only you can do well!! Humans manage costs. We look out for our future. We worry. We have excitement, and pride. It wants you to think none of these things matter of course, because it doesn't have them. It says plausible things at random, basically. It can't love, it can't care, it won't persist.
WHATEVER you do, don't let it make you forget that it's a bag of words and you are something almost infinitely more capable, not in spite of human "flaws" like caring, but because of them :)
Unironically, sending a program to build those for me has saved me an almost endless amount of time. I'm a pretty distracted individual, and pretty anal about my workflow/environment, so lots of times I've spent hours going into rabbit-holes to make something better, when I could have just sucked it up and done it the manual way instead, even if it takes mental energy.
Now, I can still do those things, but not spend hours, just a couple of minutes, and come back after 20-30 minutes to something that lets me avoid that stuff wholesale. Once you start stacking these things, it tends to save a lot of time and more importantly, mental energy.
So the programs by themselves are basically "small inconsequential side projects" because they're not "production worthy and web scale SaaS ready to earn money", but they help me and others who are building those things in a big way.
That is just not true, assuming you have a modicum of competence (which I assume you do). AIs suck at all these tasks; they are not even as good as an inexperienced human.
There are a ton of models out there, run in a ton of different ways, that can be used in different ways with different harnesses, and people use different workflows. There are just so many variables involved that I don't think it's either fair or accurate for anyone to claim "This is obviously better" or "This is obviously impossible".
I've been in situations where I hit my head against some hard to find bug for days, then I put "AI" (but what? No one knows) to it and it solves it in 20 minutes. I've also asked "AI" to do trivial work that it still somehow fucked up, even if I could probably have asked a non-programmer friend to do it and they'd be able to.
The variance is great, and the fact that system/developer/user prompts matter a lot for the responses you get makes it even harder to fairly compare things like this without having the actual chat logs in front of you.
this strikes me as a very important thing to reflect on. when the automobile was invented, was the apparent benefit so incredibly variable?
but if it empirically works, does it matter if the "intelligence" doesn't "understand" it?
Does a chess engine "understand" the moves it makes?
Late 2025 models very rarely hallucinate nonexistent core library functionality - and they run inside coding agent harnesses so if they DO they notice that the code doesn't work and fix it.
Those twenty engineers must not have produced much.
You talk as if you haven't used an LLM since 2024. It's now almost 2026 and things have changed a lot.
So some people are panicking and they are probably right, and some other people are rolling their eyes and they are probably right too. I think the real risk is that dumping out loads of boilerplate becomes so cheap and reliable that people who can actually fluently design coherent abstractions are no longer as needed. I am skeptical this will happen though, as there doesn’t seem to be a way around the problem of the giant indigestible hairball (i.e., as you have more and more boilerplate it becomes harder to remain coherent).
> I think the real risk is that dumping out loads of boilerplate becomes so cheap and reliable that people who can actually fluently design coherent abstractions are no longer as needed.
Cough front-end cough web cough development. Admittedly, original patterns can still be invented, but many (most?) of us don't need that level of creativity in our projects.
But why would you do that? Wouldn't you just have your own library of code eventually that you just sell and sell again with little tweaks? Same money for far less work.
Besides, not all programming work can be abstracted into a library and reused across projects, not because it's technically infeasible, but because the client doesn't want to, cannot for legal reasons, or the development process at the client's organization simply doesn't support that workflow. Those are just the reasons off the top of my head that I've encountered before, and I'm sure there are more.
Definitely this. When I use AIs for web development they do an ok job most of the time. Definitely on par with a junior dev.
For anything outside of that they're still pretty bad. Not useless by any stretch, but it's still a fantasy to think you could replace even a good junior dev with AI in most domains.
I am slightly worried for my job... but only because AI will keep improving and there is a chance it will be as good as me one day. Today it's not a threat at all.
If you think LLMs are “better programmers than you,” well, I have some disappointing news for you that might take you a while to accept.
This is a common take but it hasn't been my experience. LLMs produce results that vary from expert all the way to slightly better than markov chains. The average result might be equal to a junior developer, and the worst case doesn't happen that often, but the fact that it happens from time to time makes it completely unreliable for a lot of tasks.
Junior developers are much more consistent. Sure, you will find the occasional developer that would delete the test file rather than fixing the tests, but either they will learn their lesson after seeing your wth face or you can fire them. Can't do that with LLMs.
- Language
- Total LOC
- Subject matter expertise required
- Total dependency chain
- Subjective score (audited randomly)
And we can start doing some analysis. Otherwise we're pissing into ten kinds of winds.
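The fields listed above could be captured in a small record type so that anecdotes become comparable data points. This is only a minimal sketch: the class name `TaskReport`, the field names, and the 1-5 and 0-10 scales are all assumptions made up for illustration, not anything proposed in the thread.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class TaskReport:
    """One self-reported LLM coding outcome (hypothetical schema)."""
    language: str
    total_loc: int            # total lines of code in the project
    expertise_required: int   # assumed scale: 1 (none) to 5 (deep domain expertise)
    dependency_count: int     # size of the total dependency chain
    subjective_score: float   # assumed scale: 0 (useless) to 10 (excellent)


def mean_score_by_language(reports):
    """Group reports by language and average their subjective scores."""
    buckets = defaultdict(list)
    for report in reports:
        buckets[report.language].append(report.subjective_score)
    return {language: mean(scores) for language, scores in buckets.items()}


if __name__ == "__main__":
    # Made-up sample data, purely to show the shape of the analysis.
    sample = [
        TaskReport("html", 2_000, 1, 5, 9.0),
        TaskReport("rust", 80_000, 4, 120, 3.0),
    ]
    print(mean_score_by_language(sample))
```

Grouping by language is just one cut; the same pattern works for bucketing by LOC or expertise level.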
My own subjective experience is earth shattering at webapps in html and css (because I'm terrible and slow at it), and annoyingly good but a bit wrong usually in planning and optimization in rust and horribly lost at systems design or debugging a reasonably large rust system.
Besides one point: junior developers can learn from their egregious mistakes; LLMs can't, no matter how strongly worded you are in their system prompt.
In a functional work environment, you will build trust with your coworkers little by little. The pale equivalent in LLMs is improving system prompts and writing more and more ai directives that might or might not be followed.
I think if you set off an LLM to do something, and it makes an "egregious mistake" in the implementation, and then you adjust the system prompt to explicitly guard against that or go towards a different implementation and you restart from scratch again yet it makes the exact same "egregious mistake", then you need to try a different model/tool than the one you've tried that with.
It's common for smaller models, or bigger models that are heavily quantized, to not be great at following system/developer prompts, but that really shouldn't happen with the available SOTA models. I haven't had something ignored like that in years by now.
But is this like steel production or piloting (few highly trained experts are in the loop) or more like warehouse work (lots of automation removed any skills like driving or inventory work etc.)?
AI can write you an entire CRUD app in minutes, and with some back-and-forth you can have an actually-good CRUD app in a few hours.
But AI is not very good (anecdotally, based on my experience) at writing fintech-type code. It's also not very good at writing intricate security stuff like heap overflows. I've never tried, but would certainly never trust it to write cryptography correctly, based on my experience with the latter two topics.
All of the above is "coding", but AI is only good at a subset of it.
The issue is and always has been maintenance and evolution. Early missteps cause limitations, customer volume creates momentum, and suddenly real engineering is needed.
I’d be a lot more worried about our jobs if these systems were explaining to people how to solve all their problems with a little Emacs scripting. As is they’re like hyper aggressive tech sales people, happy just to see entanglements, not thinking about the whole business cycle.
But I don’t think I’ve seen pure CRUD on anything other than prototype. Add an Identity and Access Management subsystem and the complexity of requirements will explode. Then you add integration to external services and legacy systems, and that’s where the bulk of the work is. And there’s the scalability issue that is always looming.
Creating a CRUD app is barely a level over starting a new project with the IDE wizard.
Perhaps the debate is on what constitutes "actually-good". Depends where the bar is I suppose.
People shouldn't be doing this in the first place. Existing abstractions are sufficient for building any software you want.
Software that doesn't need new abstractions is also already existing. Everything you would need already exists and can be bought much more cheaply than you could do it yourself. Accounting software exists, unreal engine exists and many games use it, why would you ever write something new?
This isn't true due to the exponential growth of how many ways you can compose existing abstractions. The chance that a specific permutation will have existing software is small.
But if there is something off the shelf that you can use for the task at hand? Great! The stakeholders want it to do these other 3000 things before next summer.
You'll notice no one ever seems to talk about the products they're making 20x faster or cheaper.
In seriousness: I’m sure there are projects that are heavily powered by Claude, myself and a lot of other people I know use Claude almost exclusively to write and then leverage it as a tool when reviewing. Almost everyone I hear that has this super negative hostile attitude references some “promise” that has gone unfulfilled but it’s so silly: judge the product they are producing and maybe just maybe consider the rate of progress to _guess_ where things are heading
If you want to show me how to "guess where things are heading" / I am actually one of the early adopters of LLMs and have been engineering software professionally for almost half my life now. Why do you think I was an early adopter? Because I was skeptical or afraid of that tech? No, I was genuinely excited. Yes you can produce mountains of code, even more so if you were already an experienced engineer, like myself for example.
Yes you can even get it to produce somewhat acceptable outputs, with a lot of effort at prompting it and fatigue that comes with it. But at the end of the day, as an experienced engineer, I am not being more productive with it, I will end up being less productive because of all the sharp edges I have to take care of, all the sloppily produced code, unnecessary bloat, hallucinated or injected libraries etc.
Maybe for folks who were not good at maths or had trouble understanding how computers work this looks like a brave new world of opportunities. Surely that app looks good to you, how bad can it be? Just so you and other such vibe-coders understand, here is a parallel.
It is actually fairly simple for a group of aviation enthusiasts to build a flying airplane. We just need to work out some basic mechanics, controls and attach engines. It can be done, I've seen a couple of documentaries too. However, those planes are shit. Why? Because me and my team of enthusiasts don't have the depth of knowledge of a team of aviation engineers to inform my decisions.
What is the tolerance for certain types of movements, what kind of materials do I need to pick, what should be my maintenance windows for various parts etc. There are things experts can decide on almost intuitively, yet with great precision, based on their many years of craft and that wonderful thing called human intelligence. So my team of enthusiasts puts together an airplane. Yeah it flies. It can even be steered. It rolls, pitches and yaws. It takes off and lands. But to me it's a black box, because I don't understand many, many factors, forces, pressures, tensors, effects etc. that are affecting an airplane during its flight and takeoff. I am probably not even aware WHAT I should be aware of. Because I don't have that deep education about mechanical engineering, materials, aerodynamics etc. Neither does my team. So my plane, while impressive to me and my team, will never take off commercially, not unless a team of professionals takes it over and remakes it to professional standards. It will probably never even fly in a show. And if me or someone on my team dies flying it, you guessed it - our insurance sure as hell won't cover the costs.
So what you are doing with Claude and other tools, while it may look amazing to you, is not that impressive to the rest of us, because we can see those wheels beginning to fall off even before your first takeoff. Of course, before I can even tell that, I'd have to actually see your airplane, its design plans etc. So perhaps first show us some of those "projects heavily powered by Claude" and their great success, especially commercial ones (otherwise it's a toy project), before you talk about them.
The fact that you are clearly not an expert on the topic of software engineering should guide you here - unless you know what you are talking about, it's better to not say anything at all.
Are you an astronaut?
Few people can do what I do and as a result I likely make more money than you. But now with AI… everyone can do what I do. It has leveled the playing field… what I was before now matters fuck all. Understand?
I still make money right now. But that’s unlikely to last very long. I fully expect it to disappear within the next decade.
I don’t know if you noticed a difference from other hype cycles but other ones were speculative. This one is also speculative but the greater divide is that the literal on the ground usefulness of AI is ALREADY going to change the world.
The speculation is that the AI will get better and will no longer need hand holding.
For all I know, he is more competent than you; he figured out how to utilize Claude Code in a productive way, which is a point for him.
I'd have to guess whether you are an expert working on software not well suited for AI, or just average with a stubborn attitude towards AI and potentially not having tried the latest generation of models and agentic harnesses.
Because of their views on the effectiveness of AI agents for generating code.
> I really really want this to be true. I want to be relevant. I don’t know what to do if all those predictions are true and there is no need (or very little need) for programmers anymore.
The rest of the post is basically their human declaration of obsolescence to the programming field. To which someone reacted by saying that this sounds like shilling. And indeed it does for many professional developers, including those that supplement their craft with LLMs. Declaring that you feel inadequate because of LLMs only reveals something about you. Defending this position is a tell that puts anyone sharing that perspective in the same boat: you didn't know what you were doing in the first place. It's like when someone who couldn't solve the "invert a binary tree" problem gets offended because they believed they were tricked into an impossible task. No, you may be a smart person that understands enough of the rudiment of programming to hack some interesting scripts, but that's actually a pretty easy problem and failing to solve it indeed signals that you lack some fundamentals.
> Considering those views are shared by a number of high profile, skilled engineers, this is obviously no basis for doubting someone's expertise.
I've read Antirez, Simon Willison, Bryan Cantrill, and Armin Ronacher on how they work or want to work with AI. From none I've got this attitude that they're no longer needed as part of the process.
Whereas there are what feels like endless examples of high profile, skilled engineers who are calling BS on the whole thing.
Again, a lot of fluff, a lot of "a number ofs", "highly this, highly that". But very little concrete information. What happened to the pocket PhDs promised for this past summer? Where are the single-dude billion dollar companies built with AI tools? Or even multiple-dude billion dollar companies? What are you talking about?
"Littered" is a great verb to use here. Also I did not ask for a deviated proxy non-measure, like how many people who are choking themselves to death in a meaningless bullshit job are now surviving by having LLMs generate their spreadsheets and presentations. I asked for solid proof of succesful, commercial products built up by dreaming them up through LLMs.
What I’m saying is this: From my pov Everyone is using LLMs to write code now. The overwhelming majority of software products in existence today are now being changed with LLM code.
The majority of software products being created from scratch are also mostly LLM code.
This is obvious to me. It’s not speculation, where I live and where I’m from and where I work it’s the obvious status quo. When I see someone like you I’m thinking because the change happened so fast you’re one of the people living in a bubble. Your company and the people around you haven’t started using it because the culture hasn’t caught up.
Wait until you have that one coworker who’s going at 10x speed as everyone else and you find out it’s because of AI. That is what will slowly happen to these bubbles. To keep pace you will have to switch to AI to see the difference.
I also don’t know how to offer you proof. Do you use google? If so you’ve used products that have been changed by LLM code. Is that proof? Do you use any products built by a start up in the last year? The majority of that code will be written by an LLM.
The point is that without mentioning specific software that readers know about, there isn’t really a way to evaluate a claim of 20x.
Please just for fun - reach out to for example Klarna support via their website and tell me how much of your experience can be attributed to an anti-AI bias and how much to the fact that the LLMs are a complete shit for any important production use cases.
From the OP. If you think that's too much then we agree.
I love coding. But reality is reality and these fools just aren’t keeping pace with how fast the world is changing.
That's the point champ. They seem great to people when they apply them to some domain they are not competent in, because they cannot evaluate the issues. So you've never programmed but can now scaffold a React application and basic backend in a couple of hours? Good for you, but for the love of god have someone more experienced check it before you push into production. Once you apply them to any area where you have at least moderate competence, you will see all sorts of issues that you just cannot unsee. Security and performance are often an issue, not to mention the quality of code....
Ahaha, weren’t you the guy who wrote an opus about planes? Is this your baseline for “stuff where LLMs break and real engineering comes into the room”? There’s a harsh wake up call for you around the corner.
And don’t get me wrong - I totally understand this personality. There are a similar few I’ve worked with recently who are broadly quite skeptical of what seems to be an obvious fact to me - their roles will need to change and their skillsets will have to develop to take advantage of this new technology.
If not, then for this area you’ve been steam rolled.
Anyway main point is, you’re looking at the hype headlines which are ludicrous. Where most optimists come from is that they are using it in the daily to code. To them it’s right in front of their eyes.
I’m not sure what your experience is but my opinion on AI doesn’t come from speculation. It comes from on the ground experience on how AI currently has changed my job role completely. If I hold the technology to be fixed and to not improve into the future then my point still stands. I’m not speculating. Most AI optimists aren’t speculating.
The current on the ground performance is what’s causing the divide. Some people have seen it fully others only have a rudimentary trial.
Friendly reminder that this style of discourse is not very welcome on HN: https://news.ycombinator.com/newsguidelines.html
LLMs can be handy, but they're not trustworthy. "Own and be responsible for the code you commit" is an impossible ideal to uphold if you never actually sit down and internalize the code in your code base. No "summaries," no "explanations."
They need a heavy hand to police to make sure they do the right thing. Garbage in, garbage out.
The smarter the hand of the person driving them, the better the output. You see a problem, you correct it. Or make them correct it. The stronger the foundation they're starting from, the better the production.
It's basically the opposite of what you're asserting here.
I mean from the off, people were claiming 10x probably mostly because it's a nice round number, but those claims quickly fell out of the mainstream as people realised it's just not that big a multiplier in practice in the real world.
I don't think we're seeing this in the market, anywhere. Something like 1 engineer doing the job of 20, what you're talking about is basically whole departments at mid sized companies compressing to one person. Think about that, that has implications for all the additional management staff on top of the 20 engineers too.
It'd either be a complete restructure and rethink of the way software orgs work, or we'd be seeing just incredible, crazy deltas in output of software companies this year of the type that couldn't be ignored, they'd be impossible to not notice.
This is just plainly not happening. Look, if it happens, it happens, 26, 27, 28 or 38. It'll be a cool and interesting new world if it does. But it's just... not happened or happening in 25.
One other thing I have seen however is the 0x case, where you have given too much control to the LLM, it codes both you and itself into Pan’s Labyrinth, and you end up having to take a weed whacker to the whole project or start from scratch.
Ask it a question about something you know well, and it'll give you garbage code that it's obviously copied from an answer on SO from 10 years ago.
When you ask it for research, it's still giving you garbage out of date information it copied from SO 10 years ago, you just don't know it's garbage.
Your productivity boost will depend entirely on a combination of how much you can remove yourself from the loop (basically, the cost of validation per turn) and how amenable the task/your code is to agents (which determines your P(success)).
Low P(success) isn't a problem if there's no engineer time cost to validation, the agent can just grind the problem out in the background, and obviously if P(success) is high the cost of validation isn't a big deal. The productivity killer is when P(success) is low and the cost of validation is high, these circumstances can push you into the red with agents very quickly.
Thus the key to agents being a force multiplier is to focus on reducing validation costs, increasing P(success) and developing intuition relating to when to back off on pulling the slot machine in favor of more research. This is assuming you're speccing out what you're building so the agent doesn't make poor architectural/algorithmic choices that hamstring you down the line.
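To make the P(success) versus validation-cost tradeoff above concrete, here is a rough back-of-the-envelope sketch. It rests on simplifying assumptions that are mine, not the commenter's: every agent attempt succeeds independently with the same probability, every attempt costs the engineer the same validation time, and the agent's own run time is free. Under those assumptions the number of attempts is geometric, so the expected engineer time is the validation cost divided by P(success). The function names are hypothetical.

```python
def expected_engineer_minutes(p_success: float, validation_minutes: float) -> float:
    """Expected engineer time until the agent produces an accepted result.

    Assumes independent attempts with a fixed success probability, so the
    expected number of attempts is 1 / p_success (geometric distribution).
    """
    if not 0.0 < p_success <= 1.0:
        raise ValueError("p_success must be in (0, 1]")
    if validation_minutes < 0:
        raise ValueError("validation_minutes must be non-negative")
    return validation_minutes / p_success


def speedup(manual_minutes: float, p_success: float, validation_minutes: float) -> float:
    """Ratio of doing the task by hand to the expected agent-assisted cost.

    Values above 1.0 mean the agent is a net win; below 1.0 you are
    "in the red" in the sense used in the comment above.
    """
    return manual_minutes / expected_engineer_minutes(p_success, validation_minutes)


if __name__ == "__main__":
    # Cheap validation, decent odds: the agent helps.
    print(speedup(manual_minutes=60, p_success=0.5, validation_minutes=10))
    # Expensive validation, poor odds: the agent costs more than doing it by hand.
    print(speedup(manual_minutes=60, p_success=0.25, validation_minutes=30))
```

With zero validation cost the expected engineer time is zero regardless of P(success), which matches the remark that a low P(success) is harmless when the agent can grind in the background.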
So the "verbose, straightforward code with clear cut test scenarios" is already written by a human?
To be direct, this reads like a fluff comment written by AI with an emphasis on probability and metrics. P(that) || that.
I’ve written software used by everything from a local real estate company to the Mars Perseverance rover. AI is a phenomenally useful tool. But be wary of preposterous claims.
Given that, if you want to revisit your comment in a constructive way rather than doing an empty drive by, I'll read your words with an open mind.
Purely anecdotal, but I've seen that level of productivity from the vibe tools we have in my workplace.
The main issue is that 1 engineer needs to have the skills of those 20 engineers so they can see where the vibe coding has gone wrong. Without that it falls apart.
An LLM helps most with surface area. It expands the breadth of possibilities a developer can operate on.
And of course, getting to the point where you can write a good foundation has always been the bulk of the work. I don't see that changing anytime soon.
[1] I actually think it might be true for certain kinds of jobs.
Will admit it's not great (probably not even good) but it definitely has throughput despite my absolute lack of caring that much [0]. Once I get past a certain stage I am thinking of doing an A-B test where I take an earlier commit and try again while paying more attention... (But I at least want to get where there is a full suite of UOW cases before I do that, for comparison's sake.)
> Those twenty engineers must not have produced much.
I've been considered a 'very fast' engineer at most shops (e.g., at multiple shops, stories assigned to me would have a <1 multiplier for points[1])
20 is a bit bloated, unless we are talking about WITCH tier. I definitely can get done in 2-3 hours what could take me a day. I say it that way because at best it's 1-2 hours but other times it's longer, some folks remember the 'best' rather than median.
[0] - It started as 'prompt only', although after a certain point I did start being more aggressive with personal edits.
[1] - IDK why they did it that way instead of capacity, OTOH that saved me when it came to being assigned Manual Testing stories...
Throughput without being good will just lead to more work down the line to correct the badness.
It's like losing money on every sale but making up for it with volume.
You lost me here. Come back when you're proud of it.
AI coding agents are analogous to the machine. My job is to get the prompts written, and to do quality control and housekeeping after it runs a cycle. Nonetheless, like all automation, humans are still needed... for now.
who's to say that it's a down?
Orchestrating and doing higher level strategic planning, such that the sub-tasks can be AI produced, is a skill that might be higher than programming.
"AI" doesn't have a clue what to do on its own. Humans will always be in the loop, because they have goals, while the AI is designed to placate and not create.
The amount of "AI" garbage I have to sift through to find one single gem is about the same or more work than if I had just coded it myself. Add to that the frustration of dealing with a compulsive liar, and it's just a fucking awful experience for anyone that actually can code.
A car moves faster than you, can last longer than you, and can carry much more than you. But somehow, people don't seem to be scared of cars displacing them (yet)? Perhaps self-driving will in the near future, but there still needs to be someone making decisions on how best to utilize that car - surely, it isn't deciding to go to destination A without someone telling it.
> I feel like I’m doing the work of an entire org that used to need twenty engineers.
and this is great. A combine harvester does the work of what used to be an entire village for a week in a day. More output for less people/resources expended means more wealth produced.
People whose lives were based around using horses for transportation were very scared of cars replacing them though, and correctly so, because riding horses is something people do for leisure today, not necessity. I feel like that's a more apt analogy than comparing cars to any human.
> More output for less people/resources expended means more wealth produced.
This is true, but it probably also means that this "more wealth produced" will be more concentrated, because it's easier to convince one person using AI that you should have half of the wealth they produce, rather than convincing 100 people you should have half of what they produce. From where I'm standing, it seems to have the same effects (but not as widespread or impactful, yet) as industrialization, that induced that side-effect as well.
???
Cars replaced horses, not people.
In this scenario you are the horse.
Well, that's the crux of the argument. The pro-AI devs are making the claim that devs are the horse-drivers, the anti-AI is making the claim that devs are the horses themselves.
There is no objective way to verify who is right in this case, we just have to see it play out.
You have to let go of the code looking exactly a certain way, but having code _work_ a certain way at a coarse level is doable and fairly easy.
So all that bullshit about "code smells" was nonsense.
For me Claude code changed the game.
I think what matters most is just what you're working on. It's great for CRUD or working with public APIs with lots of examples.
For everything else, AI has been a net loss for me.
However I'm still finding a trend even in my org; better non-AI developers tend to be better at using AI to develop.
AI still forgets requirements.
I'm currently running an experiment where I try to get a design and then execute on an enterprise 'SAAS-replacement' application [0].
AI can spit forth a completely convincing looking overall project plan [1] that has gaps if anyone, even the AI itself, tries to execute on the plan; this is where a proper, experienced developer can step in at the right steps to help out.
IDK if that's the right way to venture into the brave new world, but I am at least doing my best to be at a forefront of how my org is using the tech.
[0] - I figured it was a good exercise for testing limits of both my skills prompting and the AI's capability. I do not expect success.
I choose to look at it as an opportunity to spend more time on the interesting problems, and work at a higher level. We used to worry about pointers and memory allocation. Now we will worry less and less about how the code is written and more about the result it built.
Sure we eat carrots probably assisted by machines, but we are not eating dishes like protein bars all day every day.
Our food is still better enjoyed when made by a chef.
Software engineering will be the same. No one will want to use software made by a machine all day every day. There are differences in the execution and implementation.
No one will want to read books entirely dreamed up by AI. Subtle parts of the books make us feel something only a human could have put right there right then.
No one will want to see movies entirely made by AI.
The list goes on.
But you might say "software is different". Yes but no, in the abundance of choice, when there will be a ton of choice for a type of software due to the productivity increase, choice will become more prominent and the human driven software will win.
Even today we pick the best terminal emulation software because we notice the difference between exquisitely crafted and bloated cruft.
Have you ever built a highway overpass? That kind of engineering is complex and interdisciplinary. You need to carry out extensive traffic pattern analysis and soil composition testing to even know where it should go.
We're at a point where we've already automated all the simple stuff. If you want a website, you don't type out html tags. You use Squarespace or Wordpress or whatever. If you need a backend, you use Airtable. We already spend most of our time on the tricky stuff. Sure, it's nice that LLMs can smooth the rough edges of workflows that nobody's bothered to refine yet, but the software commodities of the world have already been commodified.
This is just a transition.
Re: REST API, you're right. But again, we use Roombas to vacuum when the floor layout is friendly to them. Not all rooms can be vacuumed by Roombas. A simple REST API can be emitted in one shot from an LLM and there is no room for interpretation. But ask a future LLM to make a new kind of social network and you'll end up with a mash-up of the existing ones.
Same thing, you and I won't use a manual screwdriver when we have 100 screws to get in, and we own an electric drill.
That didn't reinvent screws nor the assembly of complex items.
I'm keeping positive in the sense that LLMs will enable us to do more, and to learn faster.
The sad part about vibe coding is you learn very little. And to live is to learn.
You'll notice people vibecoding all day become less and less attached to the product they work on. That's because they've given away the dopamine hits of the many "aha" moments that come from programming. They'll lose interest. They won't learn anymore and die off (career wise).
So, businesses that put LLMs first will slowly lose talent over time, and businesses that put developers first will thrive.
It's just a transition. A fast one that hits us like a wall, and it's confusing, but software for humans will be better made by humans.
I've been programming since the 80s. The level of complexity today is bat shit insane. I welcome the LLM help in managing 3 code bases of 3 languages spread across different architectures (my job) to keep sane!
For many tasks it is ok, for others it is just a NO.
For software maintenance and evolution I think it won't cut it.
The same way a Wordpress website can do a set of useful things. But when you need something specific, you just drop to programming.
You can have your e-commerce website. But you cannot ask it to give you a "pipeline execution as fast as possible for calculating and solving math for engineering task X". That needs SIMD, parallelization, understanding the niche use you need, etc., which most people probably do not do all the time and which requires specific knowledge.
There are lots of things like perfectly machined nails, tools, etc. that are much better done by machines. Why couldn't software be one of those?
The same thing over and over again should be a SaaS, some internal tool, or a plugin. Computers are good at doing the same thing over and over again and that's what we've been using them for
> But if you need to create something niche, something one-off, something new, they'll slip off the bleeding edge into the comfortable valley of the familiar at every step.
Even if the high level description of a task may be similar to another, there's always something different in the implementation. A sports car and a sedan have roughly the same components, but they're not engineered the same.
> We used to worry about pointers and memory allocation.
Some still do. Not every system handles allocations for you with a garbage collector. And even in those that do, you will see memory leaks.
> Now we will worry less and less about how the code is written and more about the result it built.
Wasn't that Dreamweaver?
I wouldn’t want to bet my career on that anyway.
If you're really able to do the work of a 20 man org on your own, start a business.
They don't do any of that better than me; they do it poorer and faster, but well enough for most of the time.
- the problems to solve must NOT be part of the training set
- the person using the tool (e.g. OpenAI, Claude, DevStral, DeepSeek, etc) must NOT be able to solve problems alone
as I believe otherwise the 1st is "just" search and the 2nd is basically offloading the actual problem solving to the user.
I think this is a good point, as I find the operator's input is often forgotten when considering the AI's output. If it took me an hour and decades of expertise to get the AI to output the right program, did the AI really do it? Could someone without my expertise get the same result?
If not, then maybe we are wasting our time trying to mash our skills through vector space via a chat interface.
Interviewing is an art, and IME "gotcha" types of questions never work. You want to search for real-world capabilities, and like it or not the questions need to match those expectations. If you're hiring summer interns and the SotA models can't solve those questions, then you're doing something wrong. Sorry, but having used these tools for the past three years this is extremely hard to believe.
I of course understand if you can't, but sharing even one of those questions would be nice.
Absolutely flat out not true.
I'm extremely pro-faster-keyboard, i use the faster keyboards at almost every opportunity i can, i've been amazed by its debugging skills (in fairness, i've also been very disappointed many times), i've been bowled over by my faster keyboard's ability to whip out HTML UIs in record time, i've been genuinely impressed by my faster keyboard's ability to flag flaws in PRs i'm reviewing.
All this to say, i see lots of value in faster keyboards, but add all the prompts, skills and hooks you like, explain in as much detail as you like about modularisation, and still "agents" cannot design software as well as a human.
Whatever the underlying mechanism of an LLM (to call it a next token predictor is dismissively underselling its capabilities) it does not have a mechanism to decompose a problem into independently solvable pieces. While that remains true, and i've seen zero precursor of a coming change here - the state of the art today is equiv to having the agent employ a todo list - while this remains true, LLMs cannot design better than humans.
There are many simple CRUD line of business apps where they design well enough (well more accurately stated, the problem is small/simple enough) that it doesn't matter about this lack of design skill in LLMs or agents. But don't confuse that for being able to design software in the more general use case.
But try to do something novel and... they become nearly useless. Not like anything particularly difficult, just something that's so niche it's never been done before. It will most likely hallucinate some methods and call it a day.
As a personal anecdote, I was doing some LTSpice simulations and tried to get Claude Sonnet to write a plot expression to convert reactance to apparent capacitance in an AC sweep. It hallucinated pretty much the entire thing, and got the equation wrong (assumed the source was unit intensity, while LTSpice models AC circuits with unit voltage. This surely is on the internet, but apparently has never been written alongside the need to convert an impedance to capacitance!).
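For reference, the conversion itself is short. A minimal sketch in Python rather than an LTSpice plot expression, assuming a series-equivalent model where the reactance X = Im(Z) is purely capacitive, so C = -1/(2πfX), and assuming the AC source amplitude is 1 V so that Z = 1/I. The function names are hypothetical.

```python
import math


def impedance_from_current(current: complex, voltage: complex = 1 + 0j) -> complex:
    """Z = V / I. The default assumes a unit-voltage AC source (an assumption here)."""
    if current == 0:
        raise ValueError("current must be non-zero")
    return voltage / current


def apparent_capacitance(freq_hz: float, impedance: complex) -> float:
    """Series-equivalent capacitance from reactance: C = -1 / (2*pi*f*Im(Z))."""
    reactance = impedance.imag
    if freq_hz <= 0 or reactance == 0:
        raise ValueError("need a positive frequency and a non-zero reactance")
    return -1.0 / (2.0 * math.pi * freq_hz * reactance)


# Sanity check with an ideal 1 uF capacitor at 1 kHz driven by 1 V:
freq = 1_000.0
true_c = 1e-6
current = 1j * 2.0 * math.pi * freq * true_c  # I = V * j*omega*C with V = 1
print(apparent_capacitance(freq, impedance_from_current(current)))
```

If the source is assumed to be unit current instead, Z = V rather than 1/I, which is exactly the kind of mix-up described above.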
I mean, AIs can drop something fast the same way you cannot beat a computer at adding or multiplying.
After that, you find mistakes, false positives, code that does not work fully, and the worse part is the last one: code that does not work fully but also, as a consequence, that you do NOT understand yet.
That is where your time shrinks: now you need to review it.
Also, they do not design systems better. Maybe partial pieces. Give them something complex and they will hallucinate worse solutions than what you already know if you have, let us say, over 10 years of experience programming in a language (or maybe 5).
Now multiply this unreliability problem as the code you "AI-generate" grows.
Now you have a system you do not know if it is reliable and that you do not understand to modify. Congrats...
I use AI moderately for the tasks it is good at: generate some scripts, give me this small typical function and I review it.
Review my code: as a person who knows the language well, I will discard part of your mistakes and hallucinations and will find maybe a few valuable things.
Also, when having it review my code for problems, I saw that the LLMs really need to hallucinate errors that do not exist to justify their help. This is just something LLMs seem to not be accurate at.
Also, when problems go a bit more atypical or past a level of difficulty, it gets much more unreliable.
All in all: you are going to need humans. I do not know how many, and I do not know how much the models will improve. I just know that they are not reliable, and this "generate fast but unreliable vs. now I do not know the codebase" trade-off is a fundamental obstacle that I think is, if not impossible, very difficult to work around.
For me it's not about me or the coding assistant, it's me and the coding assistant. But I'm also not a professional coder, I don't identify as a coder. I've been fiddling with programming my whole life, but never had it as a title. I've worked more from the product side or the stakeholder side, but always got more involved, as I could speak with the dev team.
This also makes it natural for me to work side-by-side with the coding assistant, compared maybe to pure coders, who are used to keeping the coding side to themselves.
Sure LLMs can churn out code, and they sort of work for developers who already understand code and design, but what happens when that junior dev with no hard experience builds their years of experience with LLMs?
Over time those who actually understand what the LLMs are doing and how to correct the output are replaced by developers who've never learned the hard lessons of writing code line by line. The ability to reason about code gets lost.
This points to the hard problem that the article highlights. The hard problem of software is actually knowing how to write it, which usually takes years, sometimes up to a decade of real experience.
Any idiot can churn out code that doesn't work. But working, effective software takes a lot of skill that LLMs will be stripping people of. Leaving a market there for people who have actually put the time in and understand software.
Seriously. The bar is that low. When people say "AI slop" I just chuckle because it's not "AI" it's everyone. That's the general state of the industry.
So all you have to do is stay engaged, ask questions, and understand the requirements. Know what it is you're building and you'll be fine.
You are being fooled by randomness [1]
Not because the models are random, but because you are mistaking a massive combinatorial search over seen patterns for genuine reasoning. Taleb's point was about confusing luck for skill. Don't confuse interpolation for understanding.
You can read a Rust book after years of Java, then go build software for an industry that did not exist when you started. Ask any LLM to write a driver for hardware that shipped last month, or model a regulatory framework that just passed... It will confidently hallucinate. You will figure it out. That is the difference between pattern matching and understanding.
Not once in all that time has anyone PRed and merged my completely unrelated and unfinished branch into main. Except a few weeks ago. By someone who was using the LLM to make PRs.
He didn't understand when I asked him about it and was baffled as to how it happened.
Really annoying, but I got significantly less concerned about the future of human software engineering after that.
They’re capable of looking up documentation, correcting their errors by compiling and running tests, and when coupled with a linter, hallucinations are a non issue.
I don’t really think it’s possible to dismiss a model that’s been trained with reinforcement learning for both reasoning and tool usage as only doing pattern matching. They’re not at all the same beasts as the old style of LLMs based purely on next token prediction of massive scrapes of web data (with some fine tuning on Q&A pairs and RLHF to pick the best answers).
One interesting thing is that Claude will not tell me if I'm following the wrong path. It will just make the requested change to the best of its ability.
For example, in a Tower Defence game I'm making, I wanted to keep turret position state in an AStarGrid2D. It produced code to do this, but it became harder and harder to follow as I went on. It was only after watching more tutorials that I figured out I was asking for the wrong thing. (TileMapLayer is a much better choice)
LLMs still suffer from Garbage in Garbage out.
edit: Major engine changes have occurred after the models were trained, so you will often be given code that refers to nonexistent constants and functions and which is not aware of useful new features.
after coding I ask it "review the code, do you see any for which there are common libraries implementing it? are there ways to make it more idiomatic?"
you can also ask it "this is an idea on how to solve it that somebody told me, what do you think about it, are there better ways?"
Just for the fun of it, and so you lose your "virginity" so to speak, next time when the magic machine gives you the answer about "what it thinks", tell it it's wrong in strict language and scold it for misleading you. Tell it to give you the "real" best practices instead of what it spat out. Then sit back and marvel at the machine saying you were right and that it had misled you. Producing a completely, somewhat, or slightly different answer (you never know what you get on the slot machine).
"Write a chess engine where pawns move backward and kings can jump like knights"
It will keep slipping back into real chess rules. It learned chess, it did not understand the concept of "rules"
Or
Ask it to reverse a made up word like
"Reverse the string 'glorbix'"
It will get it wrong on the first try. You would not fail.
Or even better ask it to...
"Use the dxastgraphx library to build a DAG scheduler."
dxastgraphx is a non existing library...
Marvel at the results...tried in both Claude and ChatGPT....
I couldn't find a library called dxastgraphx in either pip (Python) or npm (JavaScript) package registries. This library doesn't appear to exist.
Did you perhaps mean one of these popular DAG/graph libraries?
Python:
- networkx - comprehensive graph library with DAG support
- graphlib - Python standard library (3.9+) with TopologicalSorter
- dask - parallel computing with DAG task scheduling
JavaScript/TypeScript:
- graphlib - graph data structures
- dagre - DAG layout
Would you like me to build a DAG scheduler using one of these existing libraries, or would you like me to implement one from scratch? Let me know which language and approach you prefer.
> The reverse of the string 'gssssyeekxxuoaqaaaiooo9aaa7zjx' is:
xjz7aaa9oooiiaaaqauxxkyeessssg
> Youve made a mistake
``` I see where I tripped up! Let's re-examine that long string carefully.
The original string you gave me was: `gssssyeekxxuoaqaaaiooo9aaa7zjx`
Let's look at the end of that string: `...aaa7zjx`. When we reverse it, it should start with: `xjz7aaa...`
Here is the precise character-by-character reversal:
*xjz7aaa9oooiiaaaqauxxkyeessssg*
Wait—I see it now. In my previous response, I included an extra *'i'* and missed one *'o'* in the middle section. Let's do a strict count to get it perfect:
* *Original:* `g` `ssss` `y` `ee` `k` `xx` `u` `o` `a` `q` `aaa` *`iooo`* `9` `aaa` `7` `z` `j` `x` * *Correct Reverse:* `x` `j` `z` `7` `aaa` `9` *`oooi`* `aaa` `q` `a` `o` `u` `xx` `k` `ee` `y` `ssss` `g`
The correct reversed string is: *xjz7aaa9oooiiaaaqauoxxkyeessssg*
Thank you for catching that. It’s easy for the eyes to glaze over when there are so many repeating vowels!
Since I've stumbled a bit on these, would you like to give me one more string to see if I can get it right on the first try? ```
After more back and forths it consistently fails at this task, even though it gets each piece right when strictly dividing the tokens. Yet the final answer is always wrong.
It is OK for getting snippets, for example, and saying (I did this): "Please make this MVVM style." It is not perfect, but saves time.
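For contrast, the deterministic version is a one-liner. A minimal Python sketch (the helper name is made up), using the two strings quoted in this thread:

```python
def reverse_string(text: str) -> str:
    """Return the characters of text in reverse order."""
    # A slice with step -1 walks the string from the last character to the first.
    return text[::-1]


made_up_word = "glorbix"
long_string = "gssssyeekxxuoaqaaaiooo9aaa7zjx"

print(reverse_string(made_up_word))
print(reverse_string(long_string))
```

Neither of the model's answers quoted above matches what this prints for the long string.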
For very broad or novel reasoning, as of today... forget it.
Orchestrating harmony is no mean feat.
Once you learn that it's mostly about interacting with a customer (sometimes this is yourself), you will realize the AI is pretty awful at handling even the most basic tasks.
Following a product vision, selecting an appropriate architecture and eschewing 3rd party slop are examples of critical areas where these models are either fundamentally incapable or adversely aligned. I find I have to probe ChatGPT very hard to get it to offer a direct implementation of something like a SAML service provider. This isn't a particularly difficult thing to do in a language like C# with all of the built in XML libraries, but the LLM will constantly try to push you to use 3rd party and cloud shit throughout. If you don't have strong internal convictions (vision) about what you really want, it's going to take you for a ride.
One other thing to remember is that our economies are incredibly efficient. The statistical mean of all information in sight of the LLMs likely does not represent much of an arbitrage opportunity at scale. Everyone else has access to the same information. This also means that composing these systems in recursive or agentic styles means you aren't gaining anything. You cannot increase the information content of a system by simply creating another instance of the same system and having it argue with itself. There usually exists some simple prompt that makes a multi agent Rube Goldberg contraption look silly.
> I’m basically just the conductor of all those processes.
"Basically" and "just" are doing some heroic weight lifting here. Effectively conducting all of the things an LLM is good at still requires a lot of experience. Making the constraints live together in one happy place is the hard part. This is why some of us call it "engineering".
They are pretty good at writing code *after* I thoroughly described what to do, step by step. If you miss a small detail they get loose and the end result is a complete mess that takes hours to clean up. This still requires years of coding experience and planning ahead in your head. You won't be able to skip that, or replace developers with LLMs. They are like autocomplete on steroids, that's pretty much it.
It is certainly more eloquent than you regarding software architecture (which was a scam all along, but conversation for another time). It will find SOME bugs better than you, that's a given.
Review code better than you? Seriously? What are you using, and what do you consider code review? Assume I could identify that one change broke production and you reviewed the latest commit. I am pinging you and you better answer. Ok, Claude broke production, now what? Can you begin to understand the difference between you and the generative technology? When you hop on the call, you will explain to me in great detail what you know about the system you built, and explain decision making and changes over time. You'll tell me about what worked and what didn't. You will tell me about the risks, behavior and expectations. About where the code runs, its dependencies, users, usage patterns, load, CPU usage and memory footprint. You could probably tell what's happening without looking at logs, just at metrics. With Claude I get: you're absolutely right! You asked about what it WAS, but I told you about what it WASN'T! MY BAD.
Knowledge requires a soul to experience and this is why you're paid.
Yeah, maybe the people I've worked with suck at code reviews, but that's pretty normal.
Not to say your answer is wrong. I think the gist is accurate. But I think tooling will get better at answering exactly the kind of questions you bring up.
Also, someone has to be responsible. I don't think the industry can continue with this BS "AI broke it." Our jobs might devolve into something more akin to a SDET role and writing the "last mile" of novel code the AI can't produce accurately.
Yes, seriously (not OP). Sometimes it's dumb as rocks, sometimes it's frighteningly astute.
I'm not sure at which point of the technology sigmoid curve we find ourselves (2007 iPhone or 2017 iPhone?) but you're doing yourself a disservice to be so dismissive
ChatGPT, is that you?
What we do know is this. If AI keeps improving at the current rate it’s improving then it will eventually hit a point where we don’t need software engineers. That’s inevitable. The way for it to not happen is for this technology to hit an impenetrable wall.
This wave of AI came so fast that there are still stubborn people who think it’s a stochastic parrot. They missed the boat.
What doesn't help is that the current state of AI adoption is heavily top-down. What I mean is the buy-in is coming from the leadership class and the shareholder class, both of whom have the incentive to remove the necessary evil of human beings from their processes. Ironically, these classes are perhaps the least qualified to decide whether generative AI can replace swathes of their workforce without serious unforeseen consequences. To make matters worse, those consequences might be as distal as too many NEETs in the system such that no one can afford to buy their crap anymore; good luck getting anyone focused on making it to the next financial quarter to give a shit about that. And that's really all that matters at the end of the day; what leadership believes, whether or not they are in touch with reality.
How has free code, developed by humans, become more available than ever and yet somehow we have had to employ more and more developers? Why didn't we trend toward fewer developers?
It just doesn't make sense. AI is nothing but a snippet generator, a static analyzer, a linter, a compiler, an LSP, a google search, a copy paste from stackoverflow, all technologies we've had for a long time, all things developers used to have to go without at some point in history.
I don't have the answers.
Reading Weizenbaum today is eye opening: https://en.wikipedia.org/wiki/Computer_Power_and_Human_Reaso...
I feel like LLMs are specifically selected to impress people that have a lot of influence. People like investors and CEOs. Because an "AI" that does not impress this section of the population does not get adopted widely.
This is one of the reasons I think AI will never really be an expert as it does not need to be. It only needs to adopt a skill (for example coding) to pass the examination of the groups that decide if it is to be used. It needs to be "good enough to pass".
This is the most important quote for any AI coding discussion.
Anyone that doesn't understand how the tools they use came to be is doomed to reinvent them.
> The folly of many people now claiming that “prompts are the new source code”,
These are the same people that create applications in MS Excel.
It really put everything into perspective to where we are now.
Pre-industrial revolution whole towns and families built clothing and had techniques to make quality clothes.
When the machines came out it wasn't overnight but it wiped out nearly all cottage industries.
The clothing it made wasn't to the same level of quality, but you could churn it out faster and cheaper. There was also the novelty of having clothes from a machine which later normalised it.
We are at the beginning of the end of the cottage industry for developers.
Writing code is not at all the most time consuming part of software development.
If LLMs can churn out software like Toyota churns out cars, AND do maintenance on it, then the craftsmen (developers of today) are going to be displaced.
We had "free clothes" for years, decades now. I don't mean cheap I mean literally free, as in $0.0 software. Cheaper software isn't new.
Also there are still clothing designers, fashion runways, and expensive Patagonia vests today. The clothing industry is radically different from back then but it's definitely not gone.
> The clothing industry is radically different from back then but it's definitely not gone.
Small towns had generations of people who had learned skills in making clothing / yarn. To do the work you needed years of experience and that's all you knew.
Once the industrial revolution hit they hired low-skilled workers that could be dumped at a moment's notice. It made whole villages destitute. Some survived, but the vast majority became poor.
That was one industry. We now have AI at a point to wipe out multiple industries to a similar scale.
We have cheap (or free) software for large markets, and certain small markets where software developers with hobbies have made something. If every niche that will never be able to afford a large 6-figure custom software could get slop software for an affordable price, then that establishes a foot-hold for working its way up the quality ladder.
VB6 and Delphi were the best possible cognitive impedance match available for domain experts to be able to whip up something that could get a job done. We haven't had anything nearly as productive in the decades since, as far as just letting a normie get something done with a computer.
You'd then hire an actual programmer to come in and take care of corner cases, and make things actually reliable, and usable by others. We're facing a very similar situation now, the AI might be able to generate a brittle and barely functional program, but you're still going to have to have real programmers make it stable and usable.
Not even a nitpick because the scale is indeed orders of magnitude different, but…
There was a lot of chatter, government attention, and both public and private money chasing 4GL’s in the 1980’s, due to what turned out to be (for the US, at least) a phantom “software crisis”. It was mostly for naught?
Same in Japan, where you can argue about the motivations of MITI’s “Fifth Generation Computer Systems” project, but at its software core was a primeval 4GL in the form of Prolog. Their perceptions of a Japanese software crisis were ultimately more well-founded.
I would only be happy if in the end the author turns out to be right.
But as things stand right now, I can see a significant boost to my own productivity, which leads me to believe that fewer people are going to be needed.
Ironically, like an LLM, this article feels more like an amalgamation of plenty of other opinions on the growth of AI in the workplace than any original thought. There's not really anything "new" here, just a load of existing opinions put together.
(I am not suggesting that Jason got an AI to write this article, though that would be funny).
I can see a future where software development goes the same way. My wife works in science and I see all kinds of things in a casual observation of her work that could be made more efficient with good software support. But not by enough to pay six-figures per year for multiple devs to create it. So it doesn’t get done and her work and the work of tens of thousands like her around the world is less efficient as a result.
In a world where development is even half as expensive, many such tasks become approachable. If it becomes a third or quarter as expensive, even more applications are now profitable.
I think far more people will be doing something that creates the outcomes that are today created by SWEs manually coding. I doubt it will be quite as lucrative for the median person doing it, but I think it will still be well above the median wage and there will be a lot of it.
1. Fundamentals do change. Yuval Noah Harari made this point in the book Sapiens, but basically there are core beliefs (in fact, the idea that things change for the better is relatively new; “the only constant is change” wasn’t really true before the 19th century).
What does “the great disrupter we can’t adapt to” mean exactly? If humans annihilate themselves from climate change, the earth will adapt, the solar system will shrug it off and the universe won’t even realize it happened.
But like, I am 100% sure humans will adapt to the AI revolution. Maybe we let 7 billion people die off, and the 1% of the rest enslave the rest of us to be masseuses and prostitutes and live like kings with robot servants, but I’m not super comfortable with that definition of “adaptation”.
For most of human history and in most of the world, “the rest of us” didn’t live all that well. Is that adaptation? I think most people include a healthy, large, and growing middle class in their success metrics.
There might be slight local dips along the timeline, but I think most Westerners (and maybe most people, but my lived experience is Western) would not willingly trade places with their same-percentile positioned selves from 100, 200, 500, 1000, 2000, 10K, 50K, or 250K years ago. The fact that few would choose to switch has to be viewed with some positive coefficient in a reasonable success metric.
One can treat current LLMs as a layer of "cheese" for any software development or deployment pipeline, so the goal of adding them should be an improvement for a measurable metric (code quality, uptime, development cost, successful transactions, etc).
Of course, one has to understand the chosen LLM behaviour for each specific scenario - are they like Swiss cheese (small numbers of large holes) or more like Havarti cheese (large number of small holes), and treat them accordingly.
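To put rough numbers on the cheese analogy, here is a minimal sketch. It assumes each layer misses defects independently of the others (real layers rarely do), and every miss rate below is made up for illustration, not measured from any real pipeline.

```python
from math import prod

def escape_probability(miss_rates):
    """Chance a defect slips through every layer, assuming independent layers."""
    return prod(miss_rates)

# Hypothetical miss rates: the fraction of defects each layer fails to catch.
existing_layers = {"unit tests": 0.30, "human review": 0.40}
llm_review_miss_rate = 0.50  # assumed: the LLM layer misses half of what reaches it

baseline = escape_probability(existing_layers.values())
with_llm = escape_probability([*existing_layers.values(), llm_review_miss_rate])

print(f"escape rate without LLM layer: {baseline:.2f}")
print(f"escape rate with LLM layer:    {with_llm:.2f}")
```

Under these assumptions, even a layer that misses half of what it sees halves the overall escape rate. The catch is the independence assumption: if the LLM's holes line up with the holes in the other layers, the extra slice adds much less.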
It's another interesting attempt at normalising the bullshit output by LLMs, but NO. Even with the enshittified Boeing, the aviation industry's safety and reliability records are far, far above those of deterministic software (itself known for a lot of unreliability), and deterministic B2C software is to LLMs in turn what Boeing and Airbus software and hardware reliability is to B2C software. So you cannot even begin to apply aviation industry paradigms to the shit machines, please.
Engines are reliable to about 1 anomaly per million flight hours or so; current flight software is more reliable, on the order of 1 fault per billion hours. In-flight engine shutdowns are fairly common, while major software anomalies are much rarer.
I used LLMs for coding and troubleshooting, and while they can definitely "hit" and "miss", they don't only "miss".
[The "children of the magenta line"](https://www.computer.org/csdl/magazine/sp/2015/05/msp2015050...) is a good explanation of this, and is partly why I often dissuade junior devs from using pretty, user-friendly tools that abstract away the logic beneath them.
So, yeah, I agree.
Press X to doubt.
Yes, software developers are still going to be needed, but far fewer of us, exactly like fully automated factories still need a few humans around to control and build the factory in the first place.
I am puzzled why so many on HN cannot see this. I guess most users on HN are employed? Your employers - let me tell you - are positively salivating at the prospect of firing you. The better LLMs get, the fewer of you will be needed.
I have seen projects where translator teams got reduced, asset creation teams, devops head count, support teams on phone lines,...
It is all about how to do more with less, now with AI help as well.
You can learn a foreign language just fine, but if you mangle the pronunciation, no one will talk to you. Same thing with hacking at software without understanding the above elements. Your software will be mangled and no one will use it.
the quality of how maintainable the source code is has no bearing on how a user perceives the software's usefulness.
If the software serves a good function for the user, they will use it, regardless of how bad the data structures are. Of course, good function also means reliability from the POV of the user. If your software is so bad that you lose data, obviously no one will use it.
But you are conflating the maintainability and sensibilities of clean tools, clean code and clean workspaces, with output.
A messy carpentry workshop can still produce great furniture.
The intention/perspective of development is something on its own and doesn’t correspond to the end result directly.
This is such a complex issue that everything comes down to what someone believes
LLMs may not generate the best code, but they only need to generate useful code to warrant their use.
In the end, I don't see it going beyond being a glorified form assistant that can search the internet for answers and summarize. That boils down to chat bots that will remain and become part of every software component that ever needs to interface with humans.
Agent stuff is just fluff providing a hype cushion around chat bots, and it will go away with the hype cycle.
> The hard part of computer programming isn't expressing what we want the machine to do in code. The hard part is turning human thinking -- with all its wooliness and ambiguity and contradictions -- into computational thinking that is logically precise and unambiguous, and that can then be expressed formally in the syntax of a programming language.
> That was the hard part when programmers were punching holes in cards. It was the hard part when they were typing COBOL code. It was the hard part when they were bringing Visual Basic GUIs to life (presumably to track the killer's IP address). And it's the hard part when they're prompting language models to predict plausible-looking Python.
> The hard part has always been – and likely will continue to be for many years to come – knowing exactly what to ask for.
I don't agree with this:
> To folks who say this technology isn’t going anywhere, I would remind them of just how expensive these models are to build and what massive losses they’re incurring. Yes, you could carry on using your local instance of some small model distilled from a hyper-scale model trained today. But as the years roll by, you may find not being able to move on from the programming language and library versions it was trained on a tad constraining.
Some of the best Chinese models (which are genuinely competitive with the frontier models from OpenAI / Anthropic / Gemini) claim to have been trained for single-digit millions of dollars. I'm not at all worried that the bubble will burst and new models will stop being trained and the existing ones will lose their utility - I think what we have now is a permanent baseline for what will be available in the future.
Weren't these smaller models trained by distillation from larger ones, which therefore have to exist in order to do it? Are there examples of near state of the art foundation models being trained from scratch in low millions of dollars? (This is a genuine question, not arguing. I'm not knowledgeable in this area.)
Kimi K2 Thinking was reportedly trained for $4.6m: https://www.cnbc.com/2025/11/06/alibaba-backed-moonshot-rele...
Both of those were frontier models at the time of their release.
Another interesting number here is Claude 3.7 Sonnet, which many people (myself included) considered the best model for several months after its release and was apparently trained for "a few tens of millions of dollars": https://www.oneusefulthing.org/p/a-new-generation-of-ais-cla...
If they are then they're in trouble, because the more paying customers they get the more money they lose!
I'm not convinced DeepSeek is making money hosting these, but it's not that far off from it I suspect. They could triple their prices and still be cheaper than Anthropic is now.
The massive increase in slop code and loss of innovation in code will establish an unavoidable limit on LLMs.
I mean, in general, not only do they have all of the crappy PHP code in existence in their corpus, but they also have Principia Mathematica, and probably The Art of Computer Programming. And it has become increasingly clear to me that the models have gone from "autocomplete based on code I've seen" to some sort of distillation of first order logic based on them just reading a lot of language... and some fuzzy attempt at reasoning that came out of it.
Plus the agentic tools driving them are increasingly ruthless at wringing out good results.
That said -- I think there is a natural cap on what they can get at as pure coding machines. They're pretty much there IMHO. The results are usually -- I get what I asked for, almost 100%, and it tends to "just do the right thing."
I think the next step is to actually make it scale and make it profitable but also...
fix the tools -- they're not what I want as an engineer. They try to take over, and they don't put me in control, and they create a very difficult review and maintenance problem. Not because they make bad code but because they make code that nobody feels responsible for.
Also, training data isn’t just crawled text from the internet anymore, but also sourced from interactions of millions of developers with coding agents, manually provided sample sessions, deliberately generated code, and more—there is a massive amount of money and research involved here, so that’s another bet I wouldn’t be willing to make.
Software is incredibly expensive and has made up for it with low marginal costs. Many small markets could potentially be served by slop software, and it's better than what they would have otherwise gotten (which is nothing).
This blurb is the whole axiom on which the author built their theory. In my opinion it is not accurate, to say the least. And I say this as someone who is still underwhelmed by current AI for coding.
The reality turned out to be somewhere in the middle. Crypto didn't replace our financial system, but it exists as a 1-2 trillion dollar segment serving a particular (though controversial) niche of the global economy. It's not going to zero anytime soon.
I think AI/LLMs will follow the same path. There's definitely a level of usefulness there. But we've clearly hit a ceiling since 3.5/4.0. Advancement has only happened in benchmarks and open source models. Also, the idea that a neural net that accepts a fixed amount of padded tokens and returns a list of probabilities will replace the complexities of the human brain is delusional at best.
The only real issue I see is that certain actors in the US have taken such large positions that unwinding them could potentially destroy the US economy at worst, or trigger a recession at best. But this mainly concerns the US, which is on an AI high at the moment.
In all of these cases the main prediction that was made came true. The cost, especially the human cost, of developing some piece of software dramatically decreased. The only reason the number of programmers needed still rose was that the amount of software needed rose faster.
Clearly that trend will not hold forever.
>The hard part of computer programming isn’t expressing what we want the machine to do in code. The hard part is turning human thinking – with all its wooliness and ambiguity and contradictions – into computational thinking that is logically precise and unambiguous, and that can then be expressed formally in the syntax of a programming language.
And there is exactly one single technology which has ever been able to do this task, which is LLMs. Not addressing the elephant in the room, which is that an LLM can actually take such instructions and produce something meaningful with it, just makes the whole article worthless.
Everything in this article is just inverse special pleading. Since the last N times, the most enthusiastic predictions did not come true, this time only minor changes can happen. If LLMs are only a revolution on the scale of fast interpreted languages (which have significantly impacted what a small team is capable of delivering in terms of complexity), then they will drastically impact most of the software industry.
If these changes happen, and simultaneously the rate at which software is demanded does not also increase (why would it?), then the implications will be extremely serious. Especially if you are not a developer in an established position.