I just have to conclude 1 of 2 things:
1) I'm not good at prompting, even though I am one of the earliest AI in coding adopters I know, and have been consistent for years. So I find this hard to accept.
2) Other people are just less picky than I am, or they have a less thorough review culture that lets subpar code slide more often.
I'm not sure what else I can take from the situation. For context, I work on a 15 year old Java Spring + React (with some old pages still in Thymeleaf) web application. There are many sub-services, two separate databases,and this application needs to also 2-way interface with customer hardware. So, not a simple project, but still. I can't imagine it's way more complicated than most enterprise/legacy projects...
I’ve come back to the idea LLMs are super search engines. If you ask it a narrow, specific question, with one answer, you may well get the answer. For the “non-trivial” questions, there always will be multiple answers, and you’ll get from the LLM all of these depending on the precise words you use to prompt it. You won’t get the best answer, and in a complex scenario requiring highly recursive cross-checks— some answers you get won’t be functional.
It’s not readily apparent at first blush the LLM is doing this, giving all the answers. And, for a novice who doesn’t know the options, or an expert who can scan a list of options quickly and steer the LLM, it’s incredibly useful. But giving all the answers without strong guidance on non-trivial architectural points— entropy. LLMs churning independently quickly devolve into entropy.
Typical iterative-circular process "write code -> QA -> fix remarks" works because the code is analyzable and "fix" is on average cheaper than "write", therefore the process, eventually, converges on a "correct" solution.
LLM prompting is on average much less analyzable (if at all) and therefore the process "prompt LLM -> QA -> fix prompt" falls somewhere between "does not converge" and "convergence tail is much longer".
This is consistent with typical observation where LLMs are working better: greenfield implementations of "slap something together" and "modify well structured, uncoupled existing codebase", both situations where convergence is easier in the first place, i.e. low existing entropy.
The real problem btw was a bug introduced in the PDF handeling package 2 versions ago that caused resource handeling problems in some contexts, and the real solution was roling back to the version before the bug.
I'm still using AI daily in my development though, as as long as you sort of know what you are doing and have enough knowledge to evaluate it is very much a net productivity multiplier for me.
Now I'm wondering if I'm prompting wrong. I usually get one answer. Maybe a few options but rarely the whole picture.
I do like the super search engine view though. I often know what I want, but e.g. work with a language or library I'm not super familiar with. So then I ask how do I do x in this setting. It's really great for getting an initial idea here.
Then it gives me maybe one or two options, but they're verbose or add unneeded complexity. Then I start probing asking if this could be done another way, or if there's a simpler solution to this.
Then I ask what are the trade-offs between solutions. Etc.
It's maybe a mix of search engine and rubber ducking.
Agents are, like for OP, a complete failure for me though. Still can't get them to not run off into a completely strange direction, leaving a minefield of subtle coding errors and spaghetti behind.
And almost every time it screws up we create a test, and often for the whole class of problem. More recent it’s been far better behaved. Between Opus, skills, docs, generating Mermaid diagrams, tests it’s been a lot better. I’ve also cleaned up so much of the architecture so there’s only one way to do things. This keeps it more aligned and helps with entropy. And they’ll work better as models improve. Having a match between code, documents and tests means it’s not just relying on one source.
Prompts like this seem to work: “what’s the ideal way to do this? Don’t be pragmatic. Tokens are cheaper than me hunting bugs down years later”
Yes! This is exactly what it is. A search engine with a lossy-compressed dataset of most public human knowledge, which can return the results in natural language. This is the realization that will pop the AI bubble if the public could ever bring themselves to ponder it en masse. Is such a thing useful? Hell yes! Is such a thing intellegent? Certainly NO!
That's why I think you can work iteratively on code and change parts of the code while keeping others, because the code gets chunked and "probabilitized'. It can also do semantic processing and understanding where it can apply knowledge about one topic (like 'swimming') to another topic (like a 'swimming spaceship', it then generates text about what a swimming spaceship would be which is not in the dataset). It chunks it into patterns of probability and then combines them based on probability. I do think this is a lossy process though which sucks.
LLMs possess and can retrieve knowledge but they don't understand it, and when people try to get them to do that it's like talking to a non-expert who has been coached to smalltalk with experts. I remember reading about a guy who did this with his wife so she could have fun when travelling to conferences with him!
A proofreader would have caught this humorous gaffe. In fact, one just did.
What I will argue is that the LLMs are not just search engines. They have "compressed" knowledge. When they do this, they learn relations between all kinds of different levels of abstractions and meta patterns.
It is really important to understand that the model can follow logical rules and has some map of meta relationships between concepts.
Thinking of a LLM as a "search engine" is just fundamentally wrong in how they work, especially when connected to external context like code bases or live information.
After all, until quite recently, chess engines really were quite mechanically search engines too.
Also: in my experience 1. and 2. are not needed for you to have bad results. The existing code base is a fundamental variable. The more complex / convoluted it is, the worse is the result. Also in my experience LLMs are constantly better at producing C code than anything else (Python included).
I have the feeling that the simplicity of the code bases I produced over the years, and that now I modify with LLMs, and the fact they are mostly in C, is a big factor why LLMs appear to work so well for me.
Another thing: Opus 4.5 for me is bad on the web, compared to Gemini 3 PRO / GPT 5.2, and very good if used with Claude Code, since it requires to reiterate to reach the solution, why the others sometimes are better first-shotter. If you generate code via the web interface, this could be another cause.
There are tons of variables.
There's been a notable jump over the course of the last few months, to where I'd say it's inevitable. For a while I was holding out for them to hit a ceiling where we'd look back and laugh at the idea they'd ever replace human coders. Now, it seems much more like a matter of time.
Ultimately I think over the next two years or so, Anthropic and OpenAI will evolve their product from "coding assistant" to "engineering team replacement", which will include standard tools and frameworks that they each specialize in (vendor lock in, perhaps), but also ways to plug in other tech as well. The idea being, they market directly to the product team, not to engineers who may have specific experience with one language, framework, database, or whatever.
I also think we'll see a revival of monolithic architectures. Right now, services are split up mainly because project/team workflows are also distributed so they can be done in parallel while minimizing conflicts. As AI makes dev cycles faster that will be far less useful, while having a single house for all your logic will be a huge benefit for AI analysis.
I think instead the value is in getting a computer to execute domain-specific knowledge organized in a way that makes sense for the business, and in the context of those private computing resources.
It's not about the ability to write code. There are already many businesses running low-code and no-code solutions, yet they still have software engineers writing integration code, debugging and making tweaks, in touch with vendor support, etc. This has been true for at least a decade!
That integration work and domain-specific knowledge is already distilled out at a lot of places, but it's still not trivial. It's actually the opposite. AI doesn't help when you've finally shaved the yak smooth.
I have't checked the stats lately, but at one point most software written was in non-tech companies for the single business. The first 1/2 of my career was spent writing in-house software for a company that did everything from custom reporting and performance tracking to scraping data of automated phone dialers. There's so much software out there that effectively has a user base of a single company.
A lot of businesses are the only users of their own software. They write and use software in-house in order to accomplish business tasks. If they could get rid of their engineers, they would, since then they'd only have to pay the other employees who use the software.
They're much less likely to get rid of the user employees because those folks don't command engineer salaries.
The way I see feature development in the future is, PM creates a dev cluster (also much easier with a monolith), has AI implement a bunch of features to spec, AI provides some feedback and gets input on anywhere it might conflict with existing functionality, whether eventual consistency is okay, which pieces are performance criticial, etc., and provides the implementation, a bunch of tests for review, and errata about where to find observability data, design decisions considered and chosen, etc. PM does some manual testing across various personas and products (along with PMs from those teams), has AI add feature flags, launches. The feature flag rollout ends up being the long-pole, since generally the product team needs to monitor usage data for some time before increasing the rollout percentage.
So I see that kind of workflow as being a lot easier in a monolithic service. Granted, that's a few years down the road though, before we have AI reliable enough to do that kind of work.
Today we have chatGPT and only now will teams be uninsurable and sued into oblivion? LOL
If you want to go further, you can even require the LLM to produce a machine checkable proof that the software is correct. That's beyond the state of the art at the moment, but it's far from 'unsolvable'.
If you hallucinate such a proof, it'll just not work. Feed back the error message from the proof checker to your coding assistant, and the hallucination goes away / isn't a problem.
> you can easily check whether it builds and passes tests.
This link were on HN recently: https://spectrum.ieee.org/ai-coding-degrades "...recently released LLMs, such as GPT-5, have a much more insidious method of failure. They often generate code that fails to perform as intended, but which on the surface seems to run successfully, avoiding syntax errors or obvious crashes. It does this by removing safety checks, or by creating fake output that matches the desired format, or through a variety of other techniques to avoid crashing during execution."
The trend for LLM generated code is to build and pass tests but do not deliver functionality needed.Also, please consider how SQLite is tested: https://sqlite.org/testing.html
The ratio between test code and code itself is mere 590 times (590 LOC of tests per LOC of actual code), it used to be more than 1100.
Here is notes on current release: https://sqlite.org/releaselog/3_51_2.html
Notice fixes there. Despite being one of the most, if not the most, tested pieces of software in the world, it still contains errors.
> If you want to go further, you can even require the LLM to produce a machine checkable proof that the software is correct.
Haha. How do you reconcile a proof with actual code?It proudly declared the task done.
You can either proof your Rust code correct, or you can use a proof system that allows you to extract executable code from the proofs. Both approaches have been done in practice.
Or what do you mean?
Also tests and proof checkers only catch what they’re asked to check, if the LLM misunderstands intent but produces a consistent implementation+proof, everything “passes” and is still wrong.
> Ultimately I think over the next two years or so, Anthropic and OpenAI will evolve their product from "coding assistant" to "engineering team replacement", which will include standard tools and frameworks that they each specialize in (vendor lock in, perhaps), but also ways to plug in other tech as well.
This is the context of how this thread started, and this is the context in which DrammBA was saying that the spec problem is very hard to fix [without an engineering team].
To repeat: To specify what a "system" we want to create does, is a highly complicated task which needs human engioneers who understand the requirements for the system, and how parts of those reqquirements/specs interact with opther parts of the spec, what are the conbsequences of one spec to other specs.
So I expect that software engineers will still be in high demand, but they will be much more productive with AI than without it. This means there will be much more software because it will be cheaper to produce it. And the quality of the softweare will be higher in terms of doing what humans need it to do. Usability. Correctness. Evolvability. In a sense the natural language-spec we give the AI is really like something written in a very high-level programming-language - the language of engineers.
Given so much of the work of managing these systems has become so rote now, my only conclusion is that all that's left (before getting to 95+% engineer replacement) is an "agent engineering" problem, not an AI research problem.
(Termination in the wider sense: for example an event loop has to be able to finish each run through the loop in finite time.)
You can see eg Rust's or Haskell's type system as another light-weight formal model that lets you make and proof some simple statements, without having a full formal spec of the whole desired behaviour of the system.
The critical bugs here are related to security (DDoS attacks, authorization and authentication, data exfiltration, etc), concurrency, performance, data corruption, transactionality and so forth. Most enterprise systems are distributed or at least concurrent systems which depend on several components like databases, distributed lock managers, transaction managers, and so forth, where developing a proper formal spec is a monumental task and possibly impossible to do in a meaningful way because these systems were not initially developed with formal verification in mind. The formal spec, if faithful, will have to be huge to capture all the weird edge cases.
Even if you had all that, you need to actually formulate important properties of your application in a formal language. I have no idea how to even begin doing that for the vast majority of the work I do.
Proving the correctness of linear programs using techniques such as Hoare logic is hard enough already for anything but small algorithms. Proving the correctness of concurrent programs operating on complex data structures requires much more advanced techniques, setting up complicated logical relations and dealing with things like separation logic. It's an entirely different beast, and I honestly do not see LLMs as a panacea that will suddenly make these things scale for anything remotely close in size to a modern enterprise system.
I just gave the simplest example I could think of.
And termination is actually a much stronger and more useful property than you make it out to be---in the face of locks and concurrency.
But the context of this thread is the idea that the user daxfohl launched that these companies will, in the next few years, launch an "engineering team replacement" program; and then the user eru claimed that this is indeed more doable in programming than other domains because you can have specs and tests for programs in a way that you can't for, say, an animated movie.
I can see an argument where you can get none programers to create the input and output of said tests but if the can do that, they are basically programmers.
This is of course leaving aside that half the stated use cases I hear for AI are that it can 'write the tests for you'. If it is writing the code and the tests it is pointless.
What tests? You can't trust the tests that the LLM writes, and if you can write detailed tests yourself you might as well write the damn software.
At some point a human has to actually use their brain to decide what the actual goals of a given task are. That person needs to be a domain expert to draw the lines correctly. There's no shortcut around that, and throwing more stochastic parrots at it doesn't help.
For comparison have a look at compilers: nowadays approximately no one writes their software by hand, we write a 'prompt' in something like Rust or C, and ask another computer program to create the actual software.
We still need the human in the loop here, but it takes much less human time than creating the ELF directly.
I think this is part of it.
When coding style has been established among a team, or within an app, there are a lot of extra hoops to jump through, just to get it to look The Right Way, with no detectable benefit to the user.
If you put those choices aside and simply say: does it accomplish the goal per the spec (and is safe and scalable[0]), then you can get away with a lot more without the end user ever having a clue.
Sure, there's the argument for maintainability, and vibe coded monoliths tend to collapse in on themselves at ~30,000 LOC. But it used to be 2,000 LOC just a couple of years ago. Temporary problem.
[0]insisting that something be scalable isn't even necessary imo
It feels like you're diminishing the parent commenter's views, reducing it to the perspective of style. Their comment didn't mention style.
i.e. not a greenfield project.
Morphing an already decent PR into a different coding style is actually something that LLMs should excel at.
For what it's worth, multiple times in my career, I've worked at shops that once thought they could do it quick and cheap and it would be good enough, and then had to hire someone 'picky' like me to sort out the inevitable money-losing mess.
From what I've seen even Opus 4.5 spit, the 'picky' are going to remain in demand for a little while longer still. Will that last? No clue. We'll see.
A coding agent just beat every human in the AtCoder Heuristic optimization contest. It also beat the solution that the production team for the contest put together. https://sakana.ai/ahc058/
It's not enterprise-grade software, but it's not a CRUD app with thousands of examples in github, either.
Optimization is a very simple problem though.
Maintaining a random CRUD app from some startup is harder work.
C'mon, there's post every other week that optimization never happens anymore because it's too hard. If AI can take all the crap code humans are writing and make it better, that sounds like a huge win.
Program optimization problems are less simple than both, but still simpler than free-form CRUD apps with fuzzy, open ended acceptance criteria. It would stand to reason an autonomous agent would do well at mathematically challenging problems with bounded search space and automatically testable and quantifiable output.
(Not GP but I assume that's what they were getting at)
I'm happy enough for it to automate away the trivial coding tasks. That's an immense force multiplier in its own right.
For personal projects on the other hand, it has expedited me what? 10x, 30x? It's not measurable. My output has been so much more than what would have been possible earlier, that there is no benchmark because these level of projects would not have been getting completed in the first place.
Back to using at work: I think it's a skill issue. Both on my end and yours. We haven't found a way to encode our domain knowledge into AI and transcend into orchestrators of that AI.
Doesn't match my experience, that figure is closer to about 20-40% to me, though a lot of those changes I want are possible by just further prompting OR turning to a different model, or adding some automated checks that promptly fail and the AI can do a few more loops of fixes.
> Other people are just less picky than I am, or they have a less thorough review culture that lets subpar code slide more often.
This is also likely, or you are just doing stuff that is worse represented in the training data, or working on novel things where the output isn't as good. But I'm leaning towards people just being picky about what they view as "good code" (or underspecifying how the AI is supposed to output it) at least roughly since Sonnet 4, since with some people I work with it's just endless and oftentimes meaningless discussions and bikeshedding when in code review.
You can always be like: "This here pattern in these 20 files is Good Code™, use the same collection of approaches and code style when working on this refactoring/new feature."
…and then add that to your CLAUDE.md, and never worry about having to say it again manually.
What helped me a bunch was having prebuild scripts (can be Bash, can be Python, can be whatever) for each of the architectural or style conventions I want to enforce. Tools like ESLint are also nice but focused a bit more on the code than architecture/structure.
Problems start when a colleague might just remove some of those due to personal preference without discussion but then you have other problems - in my experience, with proper controls in place AI will cause less issues and friction than people (ofc depending on culture fit).
Not coding related but my wife is certainly better than most and yet I’ve had to reprompt certain questions she’s asked ChatGPT because she gave it inadequate context. People are awful at that. Us coders are probably better off than most but just as with human communication if you’re not explaining things correctly you’re going to get garbage back.
If you buy a table saw and can't figure out how to cut a straight line in a piece of wood with it - or keep cutting your fingers off - but didn't take any time at all to learn how to use it, that's on you.
Likewise a car, you have to take lessons and a test before you can use those!
Why should LLMs be any different?
The fact that an LLM works great for one user on one project does not mean it will work equally great for another user on a different project. It might! It might work better. It might work worse.
And both users might be using the tool equally well, with equal skill, insofar as their part goes.
The various magic incantations that LLMs require cannot be learned or repeated. Whatever the "just one more prompt bro" du jour you're thinking of may or may not work at any given time for any given project in any given language.
It's the same programming with LLMs. Through experience, you build up intuition and rules of thumb that allow you to get good results, even if you don't get exactly the same result every time.
It's like you buy Visual Studio and don't believe anyone who tells you that it's complex software with a lot of hidden features and settings that you need to explore in order to use it to its full potential.
I use LLMs as a better form of search engines and that's a useful product.
And that's the only issue here. Many programmers feel offended by an AI threatening their livelihood, and are too arrogant to invest some time in a tool they do deem below themselves—then proceed to complain how useless the tool is on the internet.
I'd really suggest taking antirez' advice at heart, and invest time in actually learning how to work with AI properly. Just because Claude Code has a text prompt like ChatGPT doesn't mean you know how to work with it yet. It is going to pay off.
I think this touches on the root of the issue. I am seeing a results over process winning. Code quality will reduce. Out of touch or apathetic project management who prioritize results, now are even more emboldened to have more tech debt riddled code
A bummer is that we have a genai team (louie.ai) and a gpu/viz/graph analytics team (graphistry), and those who have spent the last 2-3 years doing genai daily have a higher uptake rate here than those who aren't. I wouldn't say team 1 is better than team 2 in general: these are tools, and different people have different engineering skill and ai coding skill, including different amounts of time doing both.
What was a revelation for me personally was taking 1-2mo early in claude code's release was to go full cold turkey on manual coding, similar to getting immersed in a foreign language. That forced eliminating a lot of bad habits wrt effective ai coding both personally and in state of our repo tooling. Since then, it's been steady work to accelerate and smooth that loop, eg, moving from vibe coding/engineering to now more eval-driven ai coding loops: https://media.ccc.de/v/39c3-breaking-bots-cheating-at-blue-t... . That takes a LOT of buildout.
Gee I wonder why
Code quality is an issue you need to ignore with vibe coding - if code quality is important to your project or you then it’s not an issue. But if you abandon this concept and build things small enough or modular enough then speed gains await!
IMO codebases can be architected for LLMs to work better in them, but this is harder in brownfield apps.
Greenfield is fundamentally easier than maintaining existing software. Once software exists, users expect it to behave a certain way and they expect their data to remain usable in new versions.
The existing software now imposes all sorts of contraints that may not be explicit in the spec. Some of these constraints end up making some changes very hard. Bad assumptions in data modeling can make migrations a nightmare.
You can't just write entirely new software every time the requirements change.
Most of the architectural failures come from it still not having the whole codebase in mind when changing stuff.
The funny thing is reviewing stuff claude has made isn't actually unfamiliar to me in the slightest. It's something I'm intimately familiar with and have been intimately familiar with for many years, long before this AI stuff blew up...
..it's what code I've reviewed/maintained/rejected looks like when a consulting company was brought on board to build something. Such a company that leverages probably underpaid and overworked laborers both overseas and US based workers on visas. The delivered documentation/code is noisy+disjointed.
Yeah, which is what you get if your memory consists of everything you’ve read in the past 20 minutes. Most of my Claude work involves pointing it at the right things.
I think that mentally estimating the problem space helps. These things are probabilistic models, and if there are a million solutions the chance of getting the right one is clearly unlikely.
Feeding back results from tests really helps too.
I think part of the problem a lot of senior devs are having is that they see what they do as an artisanal craft. The rest of the world just sees the code as a means to an end.
I don't care how elegantly my toaster was crafted as long as it toasts the bread and doesn't break.
- it becomes brittle and rigid (can't change it, can't add to it)
- it becomes buggy and impossible to fix one bug without creating another
- it becomes harder to tell what it's doing
- plus it can be inefficient / slow / insecure, etc.
The problem with your analogy is that toasters are quite simple. The better example would be your computer, and if you want your computer to just run your programs and not break, then these things matter.
* You have made a new file format. Consider that it will live forever.
* You have added exactly what the user/product team asked for it. It must be supported forever.
Part of my job is to push back on user requests. I also think a lot about ease of use.
I think even with an LLM that can one-shot a task, the engineer writing the prompt must still have "engineering judgment".
Think of all the awful cheapest android phones and Windows PCs and laptops that are slow, buggy, have not had a security update in however long and are thus insecure, become virtually unusable within a couple years. The majority of the people in the world live on such devices either because they don't know better or have no better option. The world continues to turn.
People are fine with imperfection in their products, we're all used to it in various aspects of our lives.
Code being buggy, brittle, hard to extend, inefficient, slow, insecure. None of those are actual deal breakers to the end user, or the owners of the companies, and that's all that really matters at the end of the day in determining whether or not the product will sell and continue to exist.
If we think of it in terms of evolution, the selection pressure of all the things you listed is actually very weak in determining whether or not the thing survives and proliferates.
> The usefulness is a function of how quickly the consequences from poor coding arrive and how meaningful they are to the organization.
I would just add that these hypothetical senior devs we are talking about are real people with careers, accountability and responsibilities. So when their company says "we want the software to do X" those engineers may be responsible for making it happen and accountable if it takes too long or goes wrong.
So rather than thinking of them as being irrationally fixated on the artisanal aspect (which can happen) maybe consider in most cases they are just doing their best to take responsibility for what they think the company wants now and in the future.
At the same time, the direction of software by and large seems to me to be going in the direction of fast fashion. Fast, cheap, replaceable, questionable quality.
Not all software can tolerate this, as I mentioned in another comment, flight control software, the software controlling your nuclear power plant, but the majority of the software in the world is far more trivial and its consumers (and producers) more tolerant of flaws.
I don’t think of seniors as purely irrationally fixated on the artisanal aspect, I also think they are rationally, subconsciously or not, fearful of the implications for their career as the bottom falls out of this industry.
I could be wrong though! Maybe high quality software will continue to be what the industry strives for and high paying jobs to fix the flawed vibe coded slop will proliferate, but I’m more pessimistic than to think that.
Like in finance if your AI trading bot makes a drastic mistake it's immediately realized and can be hugely consequential, so AI is less useful. Retail is somewhat in the middle, but for something like marketing or where the largest function is something with data or managerial the negatives aren't as quickly realized so there can be a lot of hype around AI and what it may be able to do.
Another poster commented how very useful AI was to the insurance industry, which makes total sense, because even then if something is terribly wrong it has only a minor chance of ever being an issue and it's very unlikely that it would have a consequence soon.
A consumer or junior engineer cares whether the toaster toasts the bread and doesn’t break.
Someone who cares about their craft also cares about:
- If I turn the toaster on and leave, can it burn my house down, or just set off the smoke alarm?
- Can it toast more than sliced uniform-thickness bread?
- What if I stick a fork in the toaster? What happens if I drop it in the bathtub while on? Have I made the risks of doing that clear in such a way that my company cannot be sued into oblivion when someone inevitably electrocutes themselves?
- Does it work sideways?
- When it fills up with crumbs after a few months of use, is it obvious (without knowing that this needs to be done or reading the manual) that this should be addressed, and how?
- When should the toaster be replaced? After a certain amount of time? When a certain misbehavior starts happening?
Those aren’t contrived questions in service to a tortured metaphor. They’re things that I would expect every company selling toasters to have dedicated extensive expertise to answering.
> A consumer
is all that ultimately matters.
All those things you’re talking about may or may not matter some day, after years and a class action lawsuit that may or may not materialize or have any material impact on the bottom line of the company producing the toaster, by which time millions of units of subpar toasters that don’t work sideways will have sold.
The world is filled with junk. The majority of what fills the world is junk. There are parts of our society where junk isn’t well tolerated (jet engines, mri machines) but the majority of the world tolerates quite a lot of sloppiness in design and execution and the companies producing those products are happily profitable.
You're right that "there are parts of our society where junk isn't well tolerated", but the scope of those areas is far greater than you give credit for.
It won't be worth it the first few times you try this, and you may not get it to where you want it. I think you might be pickier than others and you might be giving it harder problems, but I also bet you could get better results out of the box after you do this with a few problems.
When things really broke open for me was when I adopted windsurf with Opus 4, and then again with Opus 4.5. I think the way the IDE manages the context and breaks down tasks helps extend llm usefulness a lot, but I haven't tried cursor and haven't really tried to get good at Claude code.
All that said, I have a lot of experience writing in business contexts and I think when I really try I am a pretty good communicator. I find when I am sloppy with prompts I leave a lot more to chance and more often I don't get what I want, but when I'm clear and precise I get what I want. E.g. if it's using sloppy patterns and making bad architectural choices, I've found that I can avoid that by explaining more about what I want and why I want it, or just being explicit about those decisions.
Also, I'm working on smaller projects with less legacy code.
So in summary, it might be a combination of 1, 2 and the age/complexity of the project you're working on.
1) Have an AGENTS.md that describes not just the project structure, but also the product and business (what does it do, who is it for, etc). People expect LLMs to read a snippet of code and be as good as an employee who has implicit understanding of the whole business. You must give it all that information. Tell it to use good practices (DRY, KISS, etc). Add patterns it should use or avoid as you go.
2) It must have source access to anything it interacts with. Use Monorepo, Workspaces, etc.
3) Most important of all, everything must be setup so the agent can iterate, test and validate it's changes. It will make mistakes all the time, just like a human does (even basic syntax errors), but it will iterate and end up on a good solution. It's incorrect to assume it will make perfect code blindly without building, linting, testing, and iterating on it. No human would either. The LLM should be able to determine if a task was completed successfully or not.
4) It is not expected to always one shot perfect code. If you value quality, you will glance at it, and sometimes ahve to reply to make it this other way, extract this, refactor that. Having said that, you shouldn't need to write a single line of code (I haven't for months).
Using LLMs correctly allow you to complete tasks in minutes that would take hours, days, or even weeks, with higher quality and less errors.
Use Opus 4.5 with other LLMs as a fallback when Opus is being dumb.
This was the biggest unlock for me. When I received a bug report I have the LLM tell me where it thinks the source of the bug is located, write a test that triggers the bug/fails, design a fix, finally implement the fix and repeat. I'm routinely surprised how good it is at doing this, and the speed with which it works. So even if I have to manually tweak a few things, I've moved much faster than without the LLM.
Writing logic that verifies something complex requires basically solving the problem entirely already.
Situation B) Model writes a new endpoint, runs lint and build, adds e2e tests with sample data and runs them.
Did situation B mathematically prove the code is correct? No. But the odds the code is correct increases enormously. You see all the time how the Agent finds errors at any of those steps and fixes them, that otherwise would have slipped by.
https://github.com/antirez?tab=overview&from=2026-01-01&to=2...
I have, it becomes a race to the bottom.
So you tell it again, "of course you are right", and the cycle repeats.
And then the context window gets exhausted. Compaction loses most of the details and degrades quality. You start a new session, but the new session has to re-learn the entire world from scratch and may or may not fix the issue.
And so the cycle continued.
Have another model review the code, and use that review as automatic feedback?
the outcome is usually a new or tweaked skill file.
it doesn't always fix the problem, but it's definitely been making some great improvements.
But it is astounding how terrible they are at debugging non-trivial assembly in my experience.
Anyone else have input here?
Am I in a weird bubble? Or is this just not their forte?
It's truly incredible how thoughtless they can be, so I think I'm in a bubble.
Oh god I thought I was the only one. Do you find yourself getting mad at them too?
Even when refactoring, it would change all my comments, which is really annoying, as I put a lot of thought into my comments. Plus, the time it took to do each refactoring step was about how long it would take me, and when I do it I get the additional benefit of feeling when I'm repeating code too often.
So, I'm not using it for now, except for isolating bugs. It's addicting having it work on it for me, but I end up feeling disconnected and then something inevitably goes wrong.
So many people hyping AI are only thinking about new projects and don't even distinguish between what is a product and what is a service.
Most software devs employed today work on maintaining services that have a ton of deliberate decisions baked in that were decided outside of that codebase and driven by business needs.
They are not building shiny new products. That's why most of the positive hype about AI doesn't make sense when you're actually at work and not just playing around with personal projects or startup POCs.
It seems that theres more people writing and finishing projets, but not many have reached the point where they have to maintain their code / deal with the tech debt.
Given how consistently terrible the code of Claude Code-d projects posted here have been, I think this is it.
I find LLMs pretty useful for coding, for multiple things(to write boilerplate, as an idiomatic design pattern search engine, as a rubber duck, helping me name things, explaining unclear error messages, etc.), but I find the grandiose claims a bit ridiculous.
Think about it like compiler output. Literally nobody cares if that is well formatted. They just care that they can get fairly performant code without having to write assembly. People still dip to assembly (very very infrequently now) for really fine performance optimizations, but people used to write large programs in it (miserably).
These are all important when you consider the long-term viability of a change. If you're working in a greenfield project where requirements are constantly changing and you plan on throwing this away in 3 months, maybe it works out fine. But not everyone is doing that, and I'd estimate most professional SWEs are not doing that, even!
It's just the Dunning-Kruger effect. People who think AI is the bee's knees are precisely the dudes who are least qualified to judge its effectiveness.
Instead I recommend that you use LLMs to fix the problems that they introduced as well, and over time you'll get better at figuring out the parts that the LLM will get confused by. My hunch is that you'll find your descriptions of what to implement were more vague than you thought, and as you iterate, you'll learn to be a lot more specific. Basically, you'll find that your taste was more subjective than you thought and you'll rid yourself of the expectation that the LLM magically understands your taste.
I feel like this is not the same for everyone. For some people, the "fire" is literally about "I control a computer", for others "I'm solving a problem for others", and yet for others "I made something that made others smile/cry/feel emotions" and so on.
I think there is a section of programmer who actually do like the actual typing of letters, numbers and special characters into a computer, and for them, I understand LLMs remove the fun part. For me, I initially got into programming because I wanted to ruin other people's websites, then I figured out I needed to know how to build websites first, then I found it more fun to create and share what I've done with others, and they tell me what they think of it. That's my "fire". But I've met so many people who doesn't care an iota about sharing what they built with others, it matters nothing to them.
I guess the conclusion is, not all programmers program for the same reason, for some of us, LLMs helps a lot, and makes things even more fun. For others, LLMs remove the core part of what makes programming fun for them. Hence we get this constant back and forth of "Can't believe others can work like this!" vs "I can't believe others aren't working like this!", but both sides seems to completely miss the other side.
I've worked with all the types, and no type is wrong. For example, I can certainly appreciate the PL researcher type who wants to make everything functional, etc... I won't fight against it as long as it doesn't get in the way of solving the problem. I've also found that my style works well with the other styles because I have way of always asking "so does this solve the problem??" which is sometimes forgotten by the code is beautiful people, etc...
Probably other people feel differently.
When I do have it one-shot a complete problem, I never copy paste from it. I type it all out myself. I didn't pay hundreds of dollars for a mechanical keyboard, tuned to make every keypress a joy, to push code around with a fucking mouse.
I’ve used Claude to create copies of my tests, except instead of testing X feature, it tests Y feature. That has worked reasonably well, except that it has still copied tests from somewhere else too. But the general vibe I get is that it’s better at copying shit than creating it from scratch.
Can’t you use vim controls?
I can recommend for that problem to make the "jumps" smaller, e.g. "Add a react component for the profile section, just put a placeholder for now" instead of "add a user profile".
With coding LLMs there's a bit of a hidden "zoom" functionality by doing that, which can help calibrating the speed/involvment/thinking you and the LLM does.
1. Look at it as a completely different discipline, dont consider it leverage for coding - it's it's own thing.
2. Try using it on something you just want to exist, not something you want to build or are interested in understanding.
3. Make the "jumps" smaller. Don't oneshot the project. Do the thinking yourself, and treat it as a junior programmer: "Let's now add react components for the profile section and mount them. Dont wire them up yet" instead of "Build the profile section". This also helps finding the right speed so that you can keep up with what's happening in the codebase
I don't get any enjoyment from "building something without understanding" — what would I learn from such a thing? How could I trust it to be secure or to not fall over when i enter a weird character? How can I trust something I do not understand or have not read the foundations of? Furthermore, why would I consider myself to have built it?
When I enter a building, I know that an engineer with a degree, or even a team of them, have meticulously built this building taking into account the material stresses of the ground, the fault lines, the stresses of the materials of construction, the wear amounts, etc.
When I make a program, I do the same thing. Either I make something for understanding, OR I make something robust to be used. I want to trust the software I'm using to not contain weird bugs that are difficult to find, as best as I can ensure that. I want to ensure that the code is clean, because code is communication, and communication is an art form — so my code should be clean, readable, and communicative about the concepts that I use to build the thing. LLMs do not assure me of any of this, and the actively hamstring the communication aspect.
Finally, as someone surrounded by artists, who has made art herself, the "doing of it" has been drilled into me as the "making". I don't get the enjoyment of making something, because I wouldn't have made it! You can commission a painting from an artist, but it is hubris to point at a painting you bought or commissioned and go "I made that". But somehow it is acceptable to do this for LLMs. That is a baffling mindset to me!
All of these questions are irrelevant if the objective is 'get this thing working'.
The majority of the work on a lot of famous masterpieces of art was done by apprentices. Under the instruction of a master, but still. No different than someone coming up with a composition, and having AI do a first pass, then going in with photoshop and manually painting over the inadequate parts. Yet people will knob gobble renaissance artists and talk about lynching AI artists.
It's true that many master artists had workshops with apprenticeships. Because they were a trade.
By the time you were helping to paint portraits, you'd spent maybe a decade learning techniques and skill and doing the unimportant parts and working your way up from there.
It wasn't a half-assed, slop some paint around and let the master come fix it later. The people doing things like portrait work or copies of works were highly skilled and experienced.
Typing "an army of Garfields storming the beach at Normandy" into a website is not the same.
These are ways I'd suggest to approach working with LLMs if you enjoy building software, and are trying to find out how it can fit into your workflow.
If this isnt you, these suggestions probably wont work.
> I don't get any enjoyment from "building something without understanding".
That's not what I said. It's about your primary goal. Are you trying to learn technology xyz, and found a project so you can apply it vs you want a solution to your problem, and nothing exists, so you're building it.
What's really important is that wether you understand in the end what the LLM has written or not is 100% your decision.
You can be fully hands off, or you can be involved in every step.
I've been really frustrated with the state of Heart Rate Variability (HRV) research and HRV apps, particularly those that claim to be "biofeedback" but are really just guided breathing exercises by people who seem to have the lights on and nobody home. [1]
I could have spent a lot of time reading the docs to understand the Web Bluetooth API and facing up to the stress that getting anything with Bluetooth working with a PC is super hit and miss so estimating the time I'd expect a high risk of spending hours rebooting my computer and otherwise futzing around to debug connection problems.
Although it's supposedly really easy to do this with the Web Bluetooth API I amazingly couldn't find any examples which made all the more apprehensive that there was some reason it doesn't work. [2]
As it was Junie coded me a simple webapp that pulled R-R intervals from my Polar H10 heart rate monitor in 20 minutes and it worked the first time. And in a few days, I've already got an HRV demo app that is superior to the commercial ones in numerous ways... And I understand how it works 100%.
I wouldn't call it vibe coding because I had my feet on the ground the whole time.
[1] for instance I am used to doing meditation practices with my eyes closed and not holding a 'freakin phone in my hand. why they expect me to look at a phone to pace my breathing when it could talk to be or beep at me is beyond me. for that matter why they try to estimate respiration by looking at my face when they could get if off the accelerometer if i put in on my chest when i am lying down is also beyond me.
[2] let's see, people don't think anything is meaningful if it doesn't involve an app, nobody's gotten a grant to do biofeedback research since 1979 so the last grad student to take a class on the subject is retiring right about now...
I think comments on YouTube like "anyone still here in $CURRENT_YEAR" are low effort noise, I don't care about learning how to write a web extension (web work is my day job) so I got Claude to write one for me. I don't care who wrote it, I just wanted it to exist.
You can bet that "AI" is coming for this too. The lawsuits that will result when buildings crumble and kill people because an LLM "hallucinated" will be tragic, but maybe we'll learn from it. But we probably won't.
I’ve wanted a good markdown editor with automatic synchronization. I used to used inkdrop. Which I stopped using when the developer/owner raised the price to $120/year.
In a couple hours with Claude code, I built a replacement that does everything I want, exactly the way I want. Plus, it integrates native AI chat to create/manage/refine notes and ideas, and it plugs into a knowledge RAG system that I also built using Claude code.
What more could I ask for? This is a tool I wanted for a long time but never wanted to spend the dozens of hours dealing with the various pieces of tech I simply don’t care about long-term.
This was my AI “enlightenment” moment.
I would argue that it's the same question as whether it's possible to get into a flow state when being the "navigator" in a pair-programming session. I feel you and agree that it's not quite the same flow state as typing the code yourself, but when a session with a human programmer or Claude Code is going well for me, I am definitely in something quite close to flow myself, and I can spend hours in the back and forth. But as others in this thread said, it's about the size of the tasks you give it.
The other day I was making changes to some CSS that I partially understood.
Without an LLM I would looked at the 50+ CSS spec documents and the 97% wrong answers on Stack Overflow and all the splogs and would have bumbled around and tried a lot of things and gotten it to work in the end and not really understood why and experienced a lot of stress.
As it was I had a conversation with Junie about "I observe ... why does it work this way?", "Should I do A or do B?", "What if I did C?" and came to understand the situation 100% and wrote a few lines of code by hand that did the right thing. After that I could have switched it to Code mode and said "Make it so!" but it was easy when I understood it. And the experience was not stressful at all.
In other words, getting to be the "ideas guy", but without sounding like a dipstick who can't do anything.
I don't think we're anywhere near that point yet. Instead we're at the same point where we are with self-driving: not doing anything but on constant alert.
imagine a game, like Galaxians but using tractor trailers,
and as a first person shooter. Three.js in index.html
Result: https://gisthost.github.io/?771686585ef1c7299451d673543fbd5dPrompt two:
No, let's try it again with an army of bagpipers.
Result: https://gisthost.github.io/?60e18b32de6474fe192171bdef3e1d91I'll be honest, the bagpiper 3D models were way better than I expected! That game's a bit too hard though, you have to run sideways pretty quickly to avoid being destroyed by incoming fire.
Here's the full transcript: https://gisthost.github.io/?73536b35206a1927f1df95b44f315d4c
I know this is not Reddit, but when I see such a comment, I can't resist posting a video of "the internet's favorite song" on an electrical violin and bagpipes:
> Through the Fire and Flames (Official Video) - Mia x Ally
You're welcome to drop the HTML into a coding agent and tell it to do that. In my experience you usually have to decide how you want that to work - I've had them build me on-screen D-Pad controls before but I've also tried things like getting touch-to-swipe plus an on-screen fire button.
There are full self driving systems that have been in operation with human driver oversight from multiple companies.
And the capabilities of the LLMs in regards to your specific examples were demonstrated below.
The inability of the public to perceive or accept the actual state of technology due to bias or cognitive issues is holding back society.
My app is fairly mature with well established patterns, etc. When I’m adding “just CRUD” as part of a feature it’s very tedious to prompt agents, reviewing code, rinse & repeat. Were I actually writing the code by hand I would probably be less productive and just as bored/unsatisfied.
I spent a decent amount of time today designing a very robust bulk upload API (compliance fintech, lots of considerations to be had) for customers who can’t do a batch job. When it was finished I was very pleased with the result and had performance tests and everything.
LLMs aren’t replacing the joy of coding for me, but they do seem to be helping me deal with the misery of being a professional coder.
To me, using an LLMs is more like having a team of ghostwriters writing your novel. Sure, you "built" your novel but it feels entirely different to writing it yourself.
To scientists, the purpose of science is to learn more about the world; to certain others, it's about making a number of dollars go up. Mathematicians famously enjoy creating math, and would have no use for a "create more math" button. Musicians enjoy creating music, which is very different from listening to it.
We're all drawn to different vocations, and it's perverse to accept that "maximize shareholder value" is the highest.
I've hit flow state setting it up to fly. When it flys is when the human gets out of the loop so the AI can look at the thing itself and figure out why centering the div isn't working to center the div, or why the kernel isn't booting. Like, getting to a point, pre-AI, where git bisect running in a loop is the flow state. Now, with ai, setting that up is the flow.
Like, if it tells you merge sort is better on that particular problem, do you trust it or do you go through an analysis to confirm it really is?
I have a hard time trusting what I don't understand. And even more so if I realize later I've been fooled. Note that it's the same with human though. I think I only trust technical decision I don't understand when I deem the risk of being wrong low enough. Overwise I'll invest in learning and understanding enough to trust the answer.
But this is just a small part from a much grander testing activity that needs to wrap the LLM code. I think my main job moved to 1. architecting and 2. ensuring the tests are well done.
What you don't test is not reliable yet, looking at code is not testing, it's "vibe-testing" and should be an antipattern, no LGTM for AI code. We should rely on our intuition alone because it is not strict enough, and it makes everything slow - we should not "walk the motorcycle".
So far, my biggest issue is, when the code produced is incorrect, with a subtle bug, then I just feel I have wasted time to prompt for something I should have written because now I have to understand it deeply to debug it.
If the test infrastructure is sound, then maybe there is a gain after all even if the code is wrong.
Like right now I am working on algorithms for computing heart rate variability and only looking at a 2 minute window with maybe 300 data points at most so whether it is N or N log N or N^2 is beside the point.
When I know I computing the right thing for my application and know I've coded it up correctly and I am feeling some pain about performance that's another story.
Who doesn't? But we have to trust them anyway, otherwise everyone should get a PhD on everything.
Also for people who "has a hard time trusting", they might just give up when encountering things they don't understand. With AI at least there is a path for them to keep digging deeper and actually verify things to whatever level of satisfaction they want.
But yes, the bench result will tell something true.
Coding with an LLM seems like it’s often more editing in service of less writing.
I get this is a very simplistic way of looking at it and when done right it can produce solutions, even novel solutions, that maybe you wouldn’t have on your own. Or maybe it speeds up a part of the writing that is otherwise slow and painful. But I don’t know, as somebody who doesn’t really code every time I hear people talk about it that’s what it sounds like to me.
Reminds me of this excerpt from Richard Hamming's book:
> Finally, a more complete, and more useful, Symbolic Assembly Program (SAP) was devised—after more years than you are apt to believe during which most programmers continued their heroic absolute binary programming. At the time SAP first appeared I would guess about 1% of the older programmers were interested in it—using SAP was “sissy stuff”, and a real programmer would not stoop to wasting machine capacity to do the assembly. Yes! Programmers wanted no part of it, though when pressed they had to admit their old methods used more machine time in locating and fixing up errors than the SAP program ever used. One of the main complaints was when using a symbolic system you do not know where anything was in storage—though in the early days we supplied a mapping of symbolic to actual storage, and believe it or not they later lovingly pored over such sheets rather than realize they did not need to know that information if they stuck to operating within the system—no! When correcting errors they preferred to do it in absolute binary addresses.
I do all of my programming on paper, so keystrokes and formal languages are the fast part. LLMs are just too slow.
Others have different prerogatives, but I personally do not want to work more than I need to.
You don't just code, you also test, and your safety is just as good as your test coverage and depth. Think hard about how to express your code to make it more testable. That is the single way we have now to get back some safety.
But I argue the manual inspection of code and thinking it through in your head is still not strict coding, it is vibe-testing as well, only code backed by tests is not vibe-based. If needed use TLA+ (generated by LLM) to test, or go as deep as necessary to test.
I think alot of us dont get everything specced out up front, we see how things fit, and adjust accordingly. most of the really good ideas I've had were not formulated in the abstract, but realizations had in the process of spelling things out.
I have a process, and it works for me. Different people certainly have other ones, and other goals. But maybe stop telling me that instead of interacting with the compiler directly its absolutely necessary that instead I describe what I want to a well meaning idiot, and patiently correct them, even though they are going to forget everything I just said in a moment.
This perfectly describes the main problem I have with the coding agents. We are told we should move from explicit control and writing instructions for the machine to pulling the slot lever over and over and "persuading the machine" hoping for the right result.
The first people using higher level languages did feel compelled to. That's what the quote from the book is saying. The first HLL users felt compelled to check the output just like the first LLM users.
But there is no reason to suppose that responsible SWEs would ever be able to stop doing so for an LLM, given the reliance on nondeterminism and a fundamentally imprecise communication mechanism.
That's the point. It's not the same kind of shift at all.
I think it's a very apt comparison.
That was the comparison made. AI is an eerily similar shift.
> I don't think that's valid at all.
I dont think you made the case by cherry picking what it can't do. This is exactly the same situation, as the time SAP appeared. There weren't symbols for every situation binary programmers were using at the time. This doesn't change the obvious and practical improvement that abstractions provided. Granted, I'm not happy about it, but I can't deny it either.
I had an inkling that the feeling existed back then, but I had no idea it was documented so explicitly. Is this quote from The Art of Doing Science and Engineering?
AI obviously brings big benefits into the profession. We just have not seen exactly what they are just yet. How it will unfold.
But personally I feel that a future of not having to churn out yet another crud app is attractive.
For those who are less experienced with the constant surprises that legacy code bases can provide, LLMs are deeply unsettling.
I've never worked in web development, where it seems to me the majority of LLM coding assistants are deployed.
I work on safety critical and life sustaining software and hardware. That's the perspective I have on the world. One question that comes up is "why does it take so long to design and build these systems?" For me, the answer is: that's how long it takes humans to reach a sufficient level of understanding of what they're doing. That's when we ship: when we can provide objective evidence that the systems we've built are safe and effective. These systems we build, which are complex, have to interact with the real world, which is messy and far more complicated.
Writing more code means that's more complexity for humans (note the plurality) to understand. Hiring more people means that's more people who need to understand how the systems work. Want to pull in the schedule? That means humans have to understand in less time. Want to use Agile or this coding tool or that editor or this framework? Fine, these tools might make certain tasks a little easier, but none of that is going to remove the requirement that humans need to understand complex systems before they will work in the real world.
So then we come to LLMs. It's another episode of "finally, we can get these pesky engineers and their time wasting out of the loop". Maybe one day. But we are far from that today. What matters today is still how well do human engineers understand what they're doing. Are you using LLMs to help engineers better understand what they are building? Good. If that's the case you'll probably build more robust systems, and you _might_ even ship faster.
Are you trying to use LLMs to fool yourself into thinking this still isn't the game of humans needing to understand what's going on? "Let's offload some of the understanding of how these systems work onto the AI so we can save time and money". Then I think we're in trouble.
This is a key question. If you look at all the anti-AI stuff around software engineering, the pervading sentiment is “this will never be a senior engineer”. Setting aside the possibility of future models actually bridging this gap (this would be AGI), let’s accept this as true.
You don’t need an LLM to be a senior engineer to be an effective tool, though. If an LLM can turn your design into concrete code more quickly than you could, that gives you more time to reason over the design, the potential side effects, etc. If you use the LLM well, it allows you to give more time to the things the LLM can’t do well.
I think LLMs need different coding languages, ones that emphasise correctness and formal methods. I think we'll develop specific languages for using LLMs with that work better for this task.
Of course, training an LLM to use it then becomes a chicken/egg problem, but I don't think that's insurmountable.
that's kind of not what we're talking about. a pretty large fraction of the community thinks programming is stone cold over because we can talk to an LLM and have it spit out some code that eventually compiles.
personally I think there will be a huge shift in the way things are done. it just won't look like Claude.
The scenario you describe is a legitimate concern if you’re checking in AI generated code with minimal oversight. In fact I’d say it’s inevitable if you don’t maintain strict quality control. But that’s always the case, which is why code review is a thing. Likewise you can use LLMs without just checking in garbage.
The way I’ve used LLMs for coding so far is to give instructions and then iterate on the result (manually or with further instructions) until it meets my quality standards. It’s definitely slower than just checking in the first working thing the LLM churns out, but it’s sill been faster than doing it myself, I understand it exactly as well because I have to in order to give instructions (design) and iterate.
My favorite definition of “legacy code” is “code that is not tested” because no matter who writes code, it turns into a minefield quickly if it doesn’t have tests.
I have tested throwing several features at an LLM lately and I have no doubt that I’m significantly faster when using an LLM. My experience matches what Antirez describes. This doesn’t make me 10x faster, mostly because so much of my job is not coding. But in term of raw coding, I can believe it’s close to 10x.
Unfortunately, "tests" don't do it, they have to be "good tests". I know, because I work on a codebase that has a lot of tests and some modules have good tests and some might as well not have tests because the tests just tell you that you changed something.
On the contrary, legacy code has, by definition, been battle tested in production. I would amend the definition slightly to:
“Legacy code is code that is difficult to change.”
Lacking tests is one common reason why this could be, but not the only possible reason.
The biggest barrier to changing code is usually insufficient automated testing. People are terrified of changing code when they can’t verify the results before breaking production.
More glibly legacy code is “any code I don’t want to deal with”. I’ve seen code written 1 year prior officially declared “legacy” because new coding standards were being put in place and no one wanted to update the old code to match.
For me, if I check in LLM-generated code, it means I've signed off on the final revision and feel comfortable maintaining it to a similar degree as though it were fully hand-written. I may not know every character as intimately as that of code I'd finished writing by hand a day ago, but it shouldn't be any more "legacy" to me than code I wrote by hand a year ago.
It's a bit of a meme that AI code is somehow an incomprehensible black box, but if that is ever the case, it's a failure of the user, not the tool. At the end of the day, a human needs to take responsibility for any code that ends up in a product. You can't just ship something that people will depend on not to harm them without any human ever having had the slightest idea of what it does under the hood.
1. a detailed spec, result of your discussions with the agent about work, when it gets it you ask the agent to formalize it into docs
2. an extensive suite of tests to cover every angle; the tests are generated, but your have to ensure their quality, coverage and depth
I think, to make a metaphor, that specs are like the skeleton of the agent, tests are like the skin, while the agent itself is the muscle and cerebellum, and you are the PFC. Skeleton provides structure and decides how the joints fit, tests provide pain and feedback. The muscle is made more efficient between the two.
In short the new coding loop looks like: "spec -> code -> test, rinse and repeat"
There is no process solution for low performers (as of today).
A lot of the criticisms of AI coding seem to come from people who think that the only way to use AI is to treat it as a peer. “Code this up and commit to main” is probably a workable model for throwaway projects. It’s not workable for long term projects, at least not currently.
The trade off with an LLM is different. It’s not actually a junior or underperforming engineer. It’s far faster at churning out code than even the best engineers. It can read code far faster. It writes tests more consistently than most engineers (in my experience). It is surprisingly good at catching edge cases. With a junior engineer, you drag down your own performance to improve theirs and you’re often trading off short term benefits vs long term. With an LLM, your net performance goes up because it’s augmenting you with its own strengths.
As an engineer, it will never reach senior level (though future models might). But as a tool, it can enable you to do more.
I'm not sure I can think of a more damning indictment than this tbh
Owning code requires you to maintain it. Finding out what parts of the code actual implement features and what parts are not needed anymore (or were never needed in the first place) is really hard. Since most of the time the requirements have never been documented and the authors have left or cannot remember. But not understanding what the code does removed all possibility to improve or modify it. This is how software dies.
Churning out code fast is a huge future liability. Management wants solutions fast and doesn't understand these long term costs. It is the same with all code generators: Short term gains, but long term maintainability issues.
The fact that AI can churn out code 1000x faster does not mean you should have it churn out 1000x more code. You might have a list of 20 critical features and it have time to implement 10. AI could let you get all 20 but shouldn’t mean you check in code for 1000 features you don’t even need.
It is implied that the code being created is for “capabilities”. If your AI is churning out needless code, then sure, that’s a bad thing. Why would you be asking the AI for code you don’t need, though? You should be asking it for critical features, bug fixes, the things you would be coding up regardless.
You can use a hammer to break your own toes or you can use it to put a roof on your house. Using a tool poorly reflects on the craftsman, not the tool.
I'm going to nit on this specifically. I firmly believe anyone that genuinely believes this either never writes tests that actually matter, or doesn't review the tests that an LLM throws out there. I've seen so many cases of people saying 'look at all these valid tests our LLM of choice wrote' only for half of them to do nothing and half of them misleading as to what it actually tests.
I recently had AI code up a feature that was essentially text manipulation. There were existing tests to show it how to write effective tests and it did a great job of covering the new functionality. My feedback to the AI was mostly around some inaccurate comments it made in the code but the coverage was solid. Would have actually been faster for me to fix but I’m experimenting with how much I can make the AI do.
On the other hand I had AI code up another feature in a different code base and it produced a bunch of tests with little actual validation. It basically invoked the new functionality with a good spectrum of arguments but then just validated that the code didn’t throw. And in one case it tested something that diverged slightly from how the code would actually be invoked. In that case I told it how to validate what the functionality was actually doing and how to make the one test more representative. In the end it was good coverage with a small amount of work.
For people who don’t usually test or care bunch about testing, yeah, they probably let the AI create garbage tests.
That seems like the kind of feature where the LLM would already have the domain knowledge needed to write reasonable tests, though. Similar to how it can vibe code a surprisingly complicated website or video game without much help, but probably not create a single component of a complex distributed system that will fit into an existing architecture, with exactly the correct behaviour based on some obscure domain knowledge that pretty much exists only in your company.
An LLM is not a principal engineer. It is a tool. If you try to use it to autonomously create complex systems, you are going to have a bad time. All of the respectable people hyping AI for coding are pretty clear that they have to direct it to get good results in custom domains or complex projects.
A principal engineer would also fail if you asked them to develop a component for your proprietary system with no information, but a principal engineer would be able to so their own deep discovery and design if they have the time and resources to do so. An AI needs you to do some of that.
And this also goes back to my first point about writing tests that matters. Coverage can matter, but coverage is not codifying business logic in your test suite. I've seen many engineers focus only on coverage only for their code to blow up in production because they didn't bother to test the actual real world scenarios it would be used in, which requires deep understanding of the full system.
You can’t ask an LLM to autonomously write complex test suites. You have to guide it. But when AI creates a solid test suite with 20 minutes of prodding instead of 4 hours of hand coding, that’s a win. It doesn’t need to do everything alone to be useful.
> writing tests that matters
Yeah. So make sure it writes them. My experience so far is that it writes a decent set of tests with little prompting, honestly exceeding what I see a lot of engineers put together (lots of engineers suck at writing tests). With additional prompting it can make them great.
An LLM only follows rules/prompts. They can never become Senior.
I think the second is part of RL training to optimize for self contained task like swe bench.
It can output something that looks like the "why" and that's probably good enough in a large percentage of cases.
Example from this morning, I have to recreate the EFI disk of one of my dev vm's, it means killing the session and rebooting the vm. I had Claude write itself a remaining.md to complement the overall build_guide.vm I'm using so I can pick up where I left off. It's surprisingly effective.
The nice thing about LLMs, however, is that they don't grumble about writing extra documentation and tests like humans do. You just tell them to write lots of docs and they do it, they don't just do the fun coding part. I can empathize why human programmers feel threatened.
This feels like a distinction without difference. This is an extension of the common refrain that LLMs cannot “think”.
Rather than get overly philosophical, I would ask what the difference is in practical terms. If an LLM can write out a “why” and it is sufficient explanation for a human or a future LLM, how is that not a “why“?
If you're planning on throwing the code away, fine, but if you're not, eventually you're going to have to revisit it.
Say I'm chasing down some critical bug or a security issue. I run into something that looks overly complicated or unnecessary. Is it something a human did for a reason or did the LLM just randomly plop something in there?
I don't want a made up plausible answer, I need to know if this was a deliberate choice, forex "this is to work around an bug in XY library" or "this is here to guard against [security issue]" or if it's there because some dude on Stackoverflow wrote sample code in 2008.
If your concern is practical and you are worried that the “why” an LLM might produce is arbitrary, then my experience so far says this isn’t a problem. What I’m seeing LLMs record in commit messages and summaries of work is very much the concrete reasons they did things. I’ve yet to see a “why” that seemed like nonsense or arbitrary.
If you have engineers checking in overly complex blobs of code with no “why”, that’s a problem whether they use AI or not. AI tools do not replace engineers and I would not with in any code base where engineers were checking in vibe coded features without understanding them and vetting the results properly.
There are many companies and scenarios where this is completely legitimate.
For example, a startup that's iterating quickly with a small, skilled dev team. A bunch of documentation is a liability, it'll be stale before anyone ever reads it.
Just grabbing someone and collaborating with them on what they wrote is much more effective in that situation.
This is a huge advantage for AI though, they don't complain about writing docs, and will actively keep the docs in sync if you pipeline your requests to do something like "I want to change the code to do X, update the design docs, and then update the code". Human beings would just grumble a lot, an AI doesn't complain...it just does the work.
> Just grabbing someone and collaborating with them on what they wrote is much more effective in that situation.
Again, it just sounds to me that you are arguing why AIs are superior, not in how they are inferior.
There are like eight bajillion systems out there that can generate low-level javadoc-ish docs. Those are trivial.
The other types of internal developer documentation are "how do I set this up", "why was this code written" and "why is this code the way it is" and usually those are much more efficiently conveyed person to person. At least until you get to be a big company.
For a small team, I would 100% agree those kinds of documentation are usually a liability. The problem is "I can't trust that the documentation is accurate or complete" and with AI, I still can't trust that it wrote accurate or complete documentation, or that anyone checked what it generated. So it's kind of worse than useless?
In particular IME the LLM generates a lot of documentation that explains what and not a lot of the why (or at least if it does it’s not reflecting underlying business decisions that prompted the change).
So with that, I can change the code by hand afterwards or continue with LLMs, it makes no difference, because it's essentially the same process as if I had someone follow the ideas I describe, and then later they come back with a PR. I think probably this comes naturally to senior programmers and those who had a taste of management and similar positions, but if you haven't reviewed other's code before, I'm not sure how well this process can actually work.
At least for me, I manage to produce code I can maintain, and seemingly others to, and they don't devolve into hairballs/spaghetti. But again, requires reviewing absolutely every line and constantly edit/improve.
The problem is, that code would require a massive amount of cleanup. I took a brief look and some code was in the wrong place. There were coding style issues, etc.
In my experience, the easy part is getting something that works for 99%. The hard part is getting the architecture right, all of the interfaces and making sure there are no corner cases that get the wrong results.
I'm sure AI can easily get to the 99%, but does it help with the rest?
Yes the AI can help with 100% is it. But the operator of the AI needs to be able to articulate this to the AI .
I've been in this position, where I had no choice but to use AI to write code to fix bugs in another party's codebase, then PR the changes back to the codebase owners. In this case it was vendor software that we rely on which the vendor hadn't fixed critical bugs in yet. And exactly as you described, my PR ultimately got rejected because even though it fixed the bugs in the immediate sense, it presented other issues due to not integrating with the external frameworks the vendor used for their dev processes. At which point it was just easier for the vendor to fix the software their way instead of accept my PR. But the point is that I could have made the PR correct in the first place, if I as the AI operator had the knowledge needed to articulate these more detailed and nuanced requirements to the AI. Since I didn't have this information then the AI generated code that worked but didn't meet the vendors spec. This type of situation is incredibly easy to fall into and is a good example of why you still need a human at the wheel on projects to set the guidance but you don't necessarily need the human to be writing every line of code.
I don't like the situation much but this is the reality of it. We're basically just code reviewers for AI now
Focus on architecture, interfaces, corner-cases, edge-cases and tradeoffs first, and then the details within that won't matter so much anymore. The design/architecture is the hard part, so focus on that first and foremost, and review + throw away bad ideas mercilessly.
I'd treat PRs like that as proof of concepts that the thing that can be done, but I'd be surprised if they often produced code that should be directly landed.
That nearly happened - it's why OpenAI didn't release open weight models past GPT2, and it's why Google didn't release anything useful built on Transformers despite having invented the architecture.
If we lived in the world today, LLMs would be available only to a small, elite and impossibly well funded class of people. Google and OpenAI would solely get to decide who could explore this new world with them.
I think that would suck.
With all due respect I don’t care about an acceleration in writing code - I’m more interested in incremental positive economic impact. To date I haven’t seen anything convince me that this technology will yield this.
Producing more code doesn’t overcome the lack of imagination, creativity and so on to figure out what projects resources should be invested in. This has always been an issue that will compound at firms like Google who have an expansive graveyard of projects laid to rest.
In fact, in a perverse way, all this ‘intelligence’ can exist. At the same time humans can get worse in their ability to make judgments in investment decisions.
So broadly where is the net benefit here?
I get the impression there's no answer here that would satisfy you, but personally I'm excited about regular people being able to automate tedious things in their lives without having to spend 6+ months learning to program first.
And being able to enrich their lives with access to as much world knowledge as possible via a system that can translate that knowledge into whatever language and terminology makes the most sense to them.
Bring the implicit and explicit costs to date into your analysis and you should quickly realise none of this makes sense from a societal standpoint.
Also you seem to be living in a bubble - the average person doesn’t care about automating anything!
His workplace has no one with programming skills, this is automation that would never have happened. Of course it’s not exactly replacing a human or anything. I suppose he could have hired someone to write the script but he never really thought to do that.
I do not want the chief of a fire station losing two days of work to something that could be scripted!
Currently, if you ask an LLM to do something small and self-contained like solve leetcode problems or implement specific algorithms, they will have a much lower rate of mistakes, in terms of implementing the actual code, than an experienced human engineer. The things it does badly are more about architecture, organization, style, and taste.
One of my life goals is to help bring as many people into my "technology can automate things for you" bubble as I possibly can.
For companies, if these tools make experts even more special, then experts may get more power certainly when it comes to salary.
So the productively benefits of AI have to be pretty high to overcome this. Does AI make an expert twice as productive?
- If the number of programmers will be drastically reduced, how big of a price increase companies like Anthropic would need to be profitable?
- If you are a manager, you now have a much higher bus factor to deal with. One person leaving means a greater blow on the team's knowledge.
- If the number of programmers will be drastically reduced, the need for managers and middle managers will also decline, no? Hmm...
"Oh, and check it out: I'm a bloody genius now! Estás usando este software de traducción in forma incorrecta. Por favor, consulta el manual. I don't even know what I just said, but I can find out!"
Sometimes I run into a problem that the LLM can't really handle yet, but I just break the problem up into more docs, tests, and code. So...that usually works, but I admit I'm move more slowly on those problems, and I'm not asking the LLM how to break the problem up yet (although I think we will get there soon).
I'm working on workflow processing to make this easier ATM (because I can't help my coworkers do what I'm doing, and what I'm doing is so ad hoc), which is why I'm talking about it so much. So the idea is that you request a change at the top, and the LLM updates everything to accommodate the change, keeping track of what changed in each artifact. When it goes to generate code...it has a change for the artifacts that input into code (which are just read in along with a prompt saying "generate the code!"). You just don't ask the LLM to change the code directly (because if you do that, none of the docs get updated for the change, and things can go bad after that...).
When things go wrong, I add extra context if I can spot the problem ("focus on X, X looks wrong because...") and that just merges with the other docs as the context. Sometimes if I can't figure out why a test is failing, I ask it to create a simpler version of the test and see if that fails (if it does, it will be easier to eye the problem). Manual intervention is still necessary (and ugh, sometimes the LLM is just having a bad day and I need to /clear and try again).
The files I had it generate were interesting but I’m not convinced looking at them that they contain the real info the AI needs to be more efficient. I should look into what kind of context analysis agents are passing back because that seems like what I want to save for later.
I can’t imagine asking AI to change some code without having a description of what the code does. You could maybe reverse engineer that, but that would basically be generating the documents after the fact. Likewise changing code without tests, where failing tests are actionable signals for the AI to make sure it doesn’t break things on update. Some people here think you can just ask it to write code without any other artifacts, thats nuts (maybe agentic will develop in the direction where AI writes persistent artifacts on its own without being told to do so, actually I’m sure that will happen eventually).
(via simonw, didn't see it already on HN)
I don't think many companies/codebases allow LLMs to autonomously edit code and deploy it, there is still a human in the loop that "prompt > generates > reviews > commits", so it really isn't hard to find someone to blame for those errors, if you happen to work in that kind of blame-filled environment.
Same goes with contractors I suppose, if you end up outsourcing work to a contractor, they do a shitty job but that got shipped anyways, who do you blame? Replace "contractor" with "LLM" and I think the answer remains the same.
Talk about a good thing coming from bad intentions! Congratulations on shaking that demon.
I don't think this is really it for many people (maybe any); after all, you can do all of that when writing a text message rather than a piece of code.
But it inches closer to what I think is the "right answer" for this type of software developer. There are aspects of software development that are very much like other forms of writing (e.g., prose or poetry).
Like other writing, writing code can constitute self-expression in an inherently satisfying way, and it can also offer the satisfaction of finding "the perfect phrase". LLMs more or less eliminate both sources of pleasure, either by eliminating the act of writing itself (that is, choosing and refining the words) or through their bland, generic, tasteless style.
There are other ways that LLMs can disconnect the people using them from what is joyful about writing code, not least of all because LLMs can be used in a lot of different ways. (Using them as search tools or otherwise consulting them rather than having them commit code to simply be either accepted/rejected "solves" the specific problems I just mentioned, for instance.)
There is something magical about speaking motion into existence, which is part of what has made programming feel special to me, ever since I was a kid. In a way, prompting an LLM to generate working code preserves that and I can imagine how, for some, it even seems to magnify the magic. But there is also a sense of essential mastery involved in the wonderful way code brings ideas to life. That mastery involves not just "understanding" things in the cursory way involved in visually scanning someone else's code and thinking "looks good to me", but intimately knowing how the words and abstractions and effects all "line up" and relate to each other (and hopefully also with the project's requirements). That feeling of mastery is itself one of the joys of writing code.
Without that mastery, you also lose one of the second-order joys of writing code that many here have already mentioned in these comments: flow. Delegation means fumbling in a way that working in your own context just doesn't. :-\
Exactly me.
One of the reasons why I learned vim was because I enjoy staying in the keyboard; I'm a fast typer and part of the fun is typing out the code I'm thinking.
I can see how some folks only really like seeing the final product rather than the process of building it but I'm just not cut for that — I hate entrepreneurship for the same reason, I enjoy the building part more than the end.
And it's the part that's killing me with all this hype.
I'm not entirely sure what that means myself, so please speak up if my statement resonates with you.
Drawing and painting is a ritual to me as well. No one pays me for it and I am happy about that.
Now the fun is gone, maybe I can do more important work.
This is a very sad, bleak, and utilitarian view of "work." It is also simply not how humans operate. Even if you only care about the product, humans that enjoy and take pride in what they're doing almost invariably produce better products that their customers like more.
Maybe there are people who are about literally typing the code, but I get satisfaction from making the codebase nice and neat, and now I have power tools. I am just working on small personal projects, but so far, Claude Opus 4.5 can do any refactoring I can describe.
Anecdotally, I’ve had a few coworkers go from putting themselves firmly in this category to saying “this is the most fun I’ve ever had in my career” in the last two months. The recent improvement in models and coding agents (Claude Code with Opus 4.5 in our case) is changing a lot of minds.
I know you didn't mean to, but I think that description is a mischaracterization. I'd wager most of us "I control the computer" people who enjoy crafting software don't really care for the actual imputation of symbols. That is just the mechanism by which we move code from our heads to the computer. What LLMs destroy – at least for me – is the creation of code in my head and its (more-or-less) faithful replication inside the computer. I don't particularly enjoy the physical act of moving my fingers across a piece of plastic, but I do enjoy the result executing my program on my computer.
If an LLM is placed in the middle, two things happen: first, I'm expressing the _idea_ of my program not to a computer, but to an LLM; and second, the LLM expresses its "interpretation" of that idea to the computer. Both parts destroy joy for me. That's of course not important to anyone but myself and likeminded people, and I don't expect the world to care. But I do also believe that both parts come with a whole host of dangers that make the end result less trustworthy and less maintainable over time.
I'm definitely warming to the role of LLMs as critics though. I also see value in having them write tests – the worst a bad or unmaintainable test will provide is a false error.
Do people actually spend a significant time typing? After I moved beyond the novice stage it’s been an inconsequential amount of time. What it still serves is a thorough review of every single line in a way that is essentially equivalent to what a good PR review looks like.
See, that also works.
Unfortunately the job market does not demand both types of programmer equally: Those who drive LLMs to deliver more/better/faster/cheaper are in far greater demand right now. (My observation is that a decade of ZIRP-driven easy hiring paused the natural business cycle of trying to do more with fewer employees, and we’ve been seeing an outsized correction for the past few years, accelerated by LLM uptake.)
I doubt that the LLM drivers deliver something better; quite the opposite. But I guess managers will only realize this when it's too late: and of course they won't take any responsibility for this.
That is your definition of “better”. If we’re going to trade our expertise for coin, we must ask ourselves if the cost of “better” is worth it to the buyer. Can they see the difference? Do they care?
Also HN: "Why does all commercial software seem to suck more and more as time goes on?"
This is exactly the phenomenon of markets for "lemons":
> https://en.wikipedia.org/wiki/The_Market_for_Lemons
(for the HN readers: a related concept is "information asymmetry in markets").
George Akerlof (the author of this paper), Michael Spence and Joseph Stiglitz got a Nobel Memorial Prize in Economic Sciences in 2001 for their analyses of markets with asymmetric information.
("Solving a problem for others" also resonates, but I think I implement that more by tutoring and mentoring.)
but luckily for us, we can still do that, and it's just as fun as it ever was. LLMs don't take anything away from the fun of actually writing code, unless you choose to let them.
if anything the LLMs make it more fun, because the boring bits can now be farmed out while you work on the fun bits. no, i don't really want to make another CRUD UI, but if the project i'm working on needs one i can just let claude code do that for me while i go back to working on the stuff that's actually interesting.
AI coding makes creating things far more efficient (as long as you use AI), and will likely mean you don't get paid much (unless you use AI).
You can still code for the fun of it, but you don't get the ancillary benefits.
It's not about the typing, it's about the understanding.
LLM coding is like reading a math textbook without trying to solve any of the problems. You get an overview, you get a sense of what it's about and most importantly you get a false sense of understanding.
But if you try to actually solve the problems, you engage completely different parts of your brain. It's about the self-improvement.
Most math textbooks provide the solutions too. So you could choose to just read those and move on and you’d have achieved much less. The same is true with coding. Just because LLMs are available doesn’t mean you have to use them for all coding, especially when the goal is to learn foundational knowledge. I still believe there’s a need for humans to learn much of the same foundational knowledge as before LLMs otherwise we’ll end up with a world of technology that is totally inscrutable. Those who choose to just vibe code everything will make themselves irrelevant quickly.
Well, it's both, for different people, seemingly :)
I also like the understanding and solving something difficult, that rewards a really strong part of my brain. But I don't always like to spend 5 hours in doing so, especially when I'm doing that because of some other problem I want to solve. Then I just want it solved ideally.
But then other days I engage in problems that are hard because they are hard, and because I want to spend 5 hours thinking about, designing the perfect solution for it and so on.
Different moments call for different methods, and particularly people seem to widely favor different methods too, which makes sense.
Can be, but… well, the analogy can go wrong both ways.
This is what Brilliant.org and Duolingo sell themselves on: solve problems to learn.
Before I moved to Berlin in 2018, I had turned the whole Duolingo German tree gold more than once, when I arrived I was essentially tourist-level.
Brilliant.org, I did as much as I could before the questions got too hard (latter half of group theory, relativity, vector calculus, that kind of thing); I've looked at it again since then, and get the impressions the new questions they added were the same kind of thing that ultimately turned me off Duolingo, easier questions that teach little, padding out a progressions system that can only be worked through fast enough to learn anything if you pay a lot.
Code… even before LLMs, I've seen and I've worked with confident people with a false sense of understanding about the code they wrote. (Unfortunately for me, one of my weaknesses is the politics of navigating such people).
I'm not trying to be snobbish here, it's completely fine to enjoy those sorts of products (I consume a lot of pop science, which I put in the same category) but you gotta actually get your hands dirty and do the work.
It's also fine to not want to do that -- I love to doodle and have a reasonable eye for drawing, but to get really good at it, I'd have to practice a lot and develop better technique and skills and make a lot of shitty art and ehhhh. I don't want it badly enough.
GET /svg/weather
|> jq: weatherData
|> jq: `
.hourly as $h |
[$h.time, $h.temperature_2m] | transpose | map({time: .[0], temp: .[1]})
`
|> gg({ "type": "svg", "width": 800, "height": 400 }): `
aes(x: time, y: temp)
| line()
| point()
`
I've even started embedding my DSLs inside my other DSLs!It obviously depends a lot on what exactly you're building, but in many projects programming entails a lot of low intellectual effort, repetitive work.
It's the same things over and over with slight variations and little intellectual challenge once you've learnt the basic concepts.
Many projects do have a kernel of non-obvious innovation, some have a lot of it, and by all means, do think deeply about these parts. That's your job.
But if an LLM can do the clerical work for you? What's not to celebrate about that?
To make it concrete with an example: the other day I had Claude make a TUI for a data processing library I made. It's a bunch of rather tedious boilerplate.
I really have no intellectual interest in TUI coding and I would consider doing that myself a terrible use of my time considering all the other things I could be doing.
The alternative wasn't to have a much better TUI, but to not have any.
I think I can reasonably describe myself as one of the people telling you the thing you don't really get.
And from my perspective: we hate those projects and only do them if/because they pay well.
> the other day I had Claude make a TUI for a data processing library I made. It's a bunch of rather tedious boilerplate. I really have no intellectual interest in TUI coding...
From my perspective, the core concepts in a TUI event loop are cool, and making one only involves boilerplate insofar as the support libraries you use expect it. And when I encounter that, I naturally add "design a better API for this" to my project list.
Historically, a large part of avoiding the tedium has been making a clearer separation between the expressive code-like things and the repetitive data-like things, to the point where the data-like things can be purely automated or outsourced. AI feels weird because it blurs the line of what can or cannot be automated, at the expense of determinism.
Plus the size of project that an LLM can help maintain keeps growing. I actually think that size may no longer have any realistic limits at all now: the tricks Claude Code uses today with grep and sub-agents mean there's no longer a realistic upper limit to how much code it can help manage, even with Opus's relatively small (by today's standards) 200,000 token limit.
The thing is:
1) A lot of the low-intellectual stuff is not necessarily repetitive, it involved some business logic which is a culmination of knowing the process behind what the uses needs. When you write a prompt, the model makes assumptions which are not necessarily correct for the particular situation. Writing the code yourself forced you to notice the decision points and make more informed choices.
I understand your TUI example and it's better than having none now, but as a result anybody who wants to write "a much better TUI" now faces a higher barrier to entry since a) it's harder to justify an incremental improvement which takes a lot of work b) users will already have processes around the current system c) anybody who wrote a similar library with a better TUI is now competing with you and quality is a much smaller factor than hype/awareness/advertisement.
We'll basically have more but lower quality SW and I am not sure that's an improvement long term.
2) A lot of the high-intellectual stuff ironically can be solved by LLMs because a similar problem is already in the training data, maybe in another language, maybe with slight differences which can be pattern matched by the LLM. It's laundering other people's work and you don't even get to focus on the interesting parts.
Yes, this follows from the point the GP was making.
The LLM can produce code for complex problems, but that doesn't save you as much time, because in those cases typing it out isn't the bottleneck, understanding it in detail is.
I've "vibe coded" a ton of stuff and so I'm pretty bullish on LLMs, but I don't see a world where "coding by hand" isn't still required for at least some subset of software. I don't know what that subset will be, but I'm convinced it will exist, and so there will be ample opportunities for programmers who like that sort of thing.
---
Why am I convinced hand-coding won't go away? Well, technically I lied, I have no idea what the future holds. However, it seems to me that an AI which could code literally anything under the sun would almost by definition be that mythical AGI. It would need to have an almost perfect understanding of human language and the larger world.
An AI like that wouldn't just be great at coding, it would be great at everything! It would be the end of the economy, and scarcity. In which case, you could still program by hand all you wanted because you wouldn't need to work for a living, so do whatever brings you joy.
So even without making predictions about what the limitations of AI will ultimately be, it seems to me you'll be able to keep programming by hand regardless.
LLMs do empower you (and by "you" I mean the reader or any other person from now on) to actually complete projects you need in the very limited free time and have available. Manually coding the same could take months (I'm speaking from experience developing a personal project for about 3 hours every Friday and there's still much to be done). In a professional context, you're being paid to ship and AI can help you grow an idea to an MVP and then to a full implementation in record-breaking time. At the end of the day, you're satisfied because you built something useful and helped your company. You probably also used your problem solving skills.
Programming is also a hobby though. The whole process matters too. I'm one of the people who feels incredible joy when achieving a goal, knowing that I completed every step in the process with my own knowledge and skills. I know that I went from an idea to a complete design based on everything I know and probably learned a few new things too. I typed the variable names, I worked hard on the project for a long time and I'm finally seeing the fruits of my effort. I proudly share it with other people who may need the same and can attest its high quality (or low quality if it was a stupid script I hastily threw together, but anyway sharing is caring —the point is that I actually know what I've written).
The experience of writing that same code with an LLM will leave you feeling a bit empty. You're happy with the result: it does everything you wanted and you can easily extend it when you feel like it. But you didn't write the code, someone else did. You just reviewed an intern's work and gave feedback. Sometimes that's indeed what you want. You may need a tool for your job or your daily life, but you aren't too interested in the internals. AI is truly great for that.
I can't reach a better conclusion than the parent comment, everyone is unique and enjoys coding in a different way. You should always find a chance to code the way you want, it'll help maintain your self-esteem and make your life interesting. Don't be afraid of new technologies where they can help you though.
You wouldn’t say, “It’s not that they hate electricity it’s just that they love harpooning whales and dying in the icy North Atlantic.”
You can love it all you want but people won’t pay you to do it like they used to in the good old days.
1. Those who see their codebase as a sculpture, a work of art, a source of pride 2. Those who focus on outcomes.
They are not contradictory goals, but I'm finding that if your emphasis is 1, you general dislike LLMs, and if your emphasis is 2, you love them, or at least tolerate them.
I have my personal projects where every single line if authored by hand.
Still, I will ask LLMs for feedback or look for ideas when I have the feeling something could be rearchitected/improved but I don't see how.
More often than not, they fluke, but occasionally they will still provide valid feedback which otherwise I'd missed.
LLMs aren't just for the "lets dump large amounts of lower-level work" use case.
It is about business value.
Programming exists, at scale, because it produces economic value. That value translates into revenue, leverage, competitive advantage, and ultimately money. For decades, a large portion of that value could only be produced by human labor. Now, increasingly, it cannot be assumed that this will remain true.
Because programming is a direct generator of business value, it has also become the backbone of many people’s livelihoods. Mortgages, families, social status, and long term security are tied to it. When a skill reliably converts into income, it stops being just a skill. It becomes a profession. And professions tend to become identities.
People do not merely say “I write code.” They say “I am a software engineer,” in the same way someone says “I am a pilot” or “I am a police officer.” The identity is not accidental. Programming is culturally associated with intelligence, problem solving, and exclusivity. It has historically rewarded those who mastered it with both money and prestige. That combination makes identity attachment not just likely but inevitable.
Once identity is involved, objectivity collapses.
The core of the anti AI movement is not technical skepticism. It is not concern about correctness, safety, or limitations. Those arguments are surface rationalizations. The real driver is identity threat.
LLMs are not merely automating tasks. They are encroaching on the very thing many people have used to define their worth. A machine that can write code, reason about systems, and generate solutions challenges the implicit belief that “this thing makes me special, irreplaceable, and valuable.” That is an existential threat, not a technical one.
When identity is threatened, people do not reason. They defend. They minimize. They selectively focus on flaws. They move goalposts. They cling to outdated benchmarks and demand perfection where none was previously required. This is not unique to programmers. It is a universal human response to displacement.
The loudest opponents of AI are not the weakest programmers. They are often the ones most deeply invested in the idea of being a programmer. The ones whose self concept, status, and narrative of personal merit are tightly coupled to the belief that what they do cannot be replicated by a machine.
That is why the discourse feels so dishonest. It is not actually about whether LLMs are good at programming today. It is about resisting a trend line that points toward a future where the economic value of programming is increasingly detached from human identity.
This is not a moral failing. It is a psychological one. But pretending it is something else only delays adaptation.
AI is not attacking programming. It is attacking the assumption that a lucrative skill entitles its holder to permanence. The resistance is not to the technology itself, but to the loss of a story people tell themselves about who they are and why they matter.
That is the real conflict. HN is littered with people facing this conflict.
This is because they have entrenched themselves in a comfortable position that they don’t want to give up.
Most won’t admit this to be the actual reason. Think about it: you are a normal hands on self thought software developer. You grew up tinkering with Linux and a bit of hardware. You realise there’s good money to be made in a software career. You do it for 20-30 years; mostly the same stuff over and over again. Some Linux, c#, networking. Your life and hobby revolves around these technologies. And most importantly you have a comfortable and stable income that entrenches your class and status. Anything that can disrupt this state is obviously not desireable. Never mind that disrupting others careers is why you have a career in the first place.
Not every software project has or did this. In fact I would argue many new businesses exist that didn't exist before software and computing and people are doing things they didn't beforehand. Especially around discovery of information - solving the "I don't know what I don't know" problem also expanded markets and demand to people who now know.
Whereas the current AI wave seems to be more about efficiency/industrialization/democratizing of existing use cases rather than novel things to date. I would be more excited if I saw more "product orientated" AI use cases other than destroying jobs. While I'm hoping that the "vibing" of software will mean that SWE's are needed to productionise it I'm not confident that AI won't be able to do that soon too nor any other knowledge profession.
I wouldn't be surprised with AI if there's mass unemployment but we still don't cure cancer for example in 20 years.
That's exactly what I am hoping to see happen with AI.
If the local unskilled job matters more than a SWE now these people have gone from being worth something to society to being less of worth than someone unskilled with a job. At that point following from your logic I can assume their long term value is one of an unemployed person which to some people is negative. That isn't just an identity crash; its a crash potentially on their whole lives and livelihood. Even smart people can be in situations where it is hard to pivot (as you say mortgages, families, lives, etc).
I'm sure many of the SWE's here (myself included) are asking the same questions; and the answers are too pessimistic to admit public ally and even privately. Myself the joy of coding is taken away with AI in general, in that there is no joy doing something that a machine will be able to do better soon for me at least.
What stands out to me is that there seems to be a threshold where reality itself becomes too pessimistic to consciously accept.
At that point people do not argue with conclusions. They argue with perception.
You can watch the systems work. You can see code being written, bugs being fixed, entire workflows compressed. You can see the improvement curve. None of this is hidden. And yet people will look straight at it and insist it does not count, that it is fake, that it is toy output, that it will never matter in the real world. Not because the evidence is weak, but because the implications are unbearable.
That is the part that feels almost surreal. It is not ignorance. It is not lack of intelligence. It is the mind refusing to integrate a fact because the downstream consequences are too negative to live with. The pessimism is not in the claim. It is in the reality itself.
Humans do this all the time. When an update threatens identity, livelihood, or future security, self deception becomes a survival mechanism. We selectively ignore what we see. We raise the bar retroactively. We convince ourselves that obvious trend lines somehow stop right before they reach us. This is not accidental. It is protective.
What makes it unsettling is seeing it happen while the evidence is actively running in front of us. You are holding reality in one hand and watching people try to look away without admitting they are looking away. They are not saying “this is scary and I do not know how to cope.” They are saying “this is not real,” because that is easier.
So yes, the questions you raise are the real ones. Do people still matter. How will they matter. What happens when economic value shifts faster than lives can adapt. Those questions are heavy, and I do not think anyone has clean answers yet.
But pretending the shift is not happening does not make the answers kinder. It just postpones the reckoning.
The disturbing thing is not that reality is pessimistic. It is that at some point reality becomes so pessimistic that people start editing their own perception of it. They unsee what is happening in order to preserve who they think they are.
That is the collision we are watching. And it is far stranger than a technical debate about code quality.
Have you considered that there are people who actually just enjoy programming by themselves?
The question is more about why my post triggered you... why would my simple opinion trigger you? Does disagreement trigger you? If I said something that is obviously untrue that you disagreed with, for example: "The world is flat." Would this trigger you? I don't think it would. So why was my post different?
Maybe this is more of a question you should ask yourself.
I have two things to add.
> This is not a moral failing. It is a psychological one.
(1) I disagree: it's not a failing at all. Resisting displacement, resisting that your identity, existence, meaning found in work, be taken away from you, is not a failing.
Such resistance might be futile, yes; but that doesn't make it a failing. If said resistance won, then nobody would call it a failing.
The new technology might just win, and not adapting to that reality, refusing that reality, could perhaps be called a failing. But it's also a choice.
For example, if software engineering becomes a role to review AI slop all day, then it simply devolves, for me, into just another job that may be lucrative but has zero interest for me.
(2) You emphasize identity. I propose a different angle: meaning, and intrinsic motivation. You mention:
> economic value of programming is increasingly detached from human identity
I want to rephrase it: what has been meaningful to me thus far remains meaningful, but it no longer allows me to make ends meet, because my tribe no longer appreciates when I act out said activity that is so meaningful to me.
THAT is the real tragedy. Not the loss of identity -- which you seem to derive from the combination of money and prestige (BTW, I don't fully dismiss that idea). Those are extrinsic motivations. It's the sudden unsustainability of a core, defining activity that remains meaningful.
The whole point of all these AI-apologist articles is that "it has happened in the past, time and again; humanity has always adapted, and we're now better off for it". Never mind those generations that got walked over and fell victim to the revolution of the day.
In other words, the AI-apologists say, "don't worry, you'll either starve (which is fine, it has happened time and agani), or just lose a large chunk of meaning in your life".
Not resisting that is what would be a failing.
What I was trying to point at is how strange it is to watch this happen in real time. You can see something unfolding directly in front of you. You can observe systems improving, replacing workflows, changing incentives. None of it is abstract. And yet the implications of what is happening are so negative for some people that the mind simply refuses to integrate them. It is not that the facts are unknown. It is that the outcome is psychologically intolerable.
At that point something unusual happens. People do not argue with conclusions, they argue with perception. They insist the thing they are watching is not really happening, or that it does not count, or that it will somehow stop before it matters. It is not a failure of intelligence or ethics. It is a human coping mechanism when reality threatens meaning, livelihood, or future stability.
Meaning and intrinsic motivation absolutely matter here. The tragedy is not that meaningful work suddenly becomes meaningless. It is that it can remain meaningful while becoming economically unsustainable. That combination is brutal. But denying the shift does not preserve meaning. It only delays the moment where a person has to decide how to respond.
What I find unsettling is not the fear or the resistance. It is watching people stand next to you, looking at the same evidence, and then effectively unsee it because accepting it would force a reckoning they are not ready for.
>I'm unsure if you've written it with AI-assistance, but even if that's the case, I'll tolerate it.
Even if it was, the world is changing. You already need to tolerate AI in code, it's inevitable AI will be part of writing.
https://en.wikipedia.org/wiki/Cognitive_dissonance
Or perhaps, a form of grief.
> denying the shift does not preserve meaning
I think you meant to write:
"denying the shift does not preserve sustainability"
as "meaning" need not be preserved by anything. The idea here is that meaning -- stemming from the profession being supplanted -- is axiomatic.
And with that correction applied, I agree -- to an extent anyway. I hope that, even if (or "when") the mainstream gets swayed by AI, pockets / niches of "hand-crafting" remain sustainable. We've seen this with other professions that used to be mainstream but have been automated away at large scale.
and from the first line of the article:
> I love writing software, line by line.
I've said it before and I'll say it again: I don't write programs "line by line" and typing isn't programming. I work out code in the abstract away from the keyboard before typing it out, and it's not the typing part that is the bottleneck.
Last time I commented this on HN, I said something like "if an AI could pluck these abstract ideas from my head and turn them into code, eliminating the typing part, I'd be an enthusiastic adopter", to which someone predictably said something like "but that's exactly what it does!". It absolutely is not, though.
When I "program" away from the keyboard I form something like a mental image of the code, not of the text but of the abstract structure. I struggle to conjure actual visual imagery in my head (I "have aphantasia" as it's fashionable to say lately), which I suspect is because much of my visual cortex processes these abstract "images" of linguistic and logical structures instead.
The mental "image" I form isn't some vague, underspecified thing. It corresponds directly to the exact code I will write, and the abstractions I use to compartmentalise and navigate it in my mind are the same ones that are used in the code. I typically evaluate and compare many alternative possible "images" of different approaches in my head, thinking through how they will behave at runtime, in what ways they might fail, how they will look to a person new to the codebase, how the code will evolve as people make likely future changes, how I could explain them to a colleague, etc. I "look" at this mental model of the code from many different angles and I've learned only to actually start writing it down when I get the particular feeling you get when it "looks" right from all of those angles, which is a deeply satisfying feeling that I actively seek out in my life independently of being paid for it.
Then I type it out, which doesn't usually take very long.
When I get to the point of "typing" my code "line by line", I don't want something that I can give a natural language description to. I have a mental image of the exact piece of logic I want, down to the details. Any departure from that is a departure from the thing that I've scrutinised from many angles and rejected many alternatives to. I want the exact piece of code that is in my head. The only way I can get that is to type it out, and that's fine.
What AI provides, and it is wildly impressive, is the ability to specify what's needed in natural language and have some code generated that corresponds to it. I've used it and it really is very, very good, but it isn't what I need because it can't take that fully-specified image from my head and translate it to the exact corresponding code. Instead I have to convert that image to vague natural language, have some code generated and then carefully review it to find and fix (or have the AI fix) the many ways it inevitably departs from what I wanted. That's strictly worse than just typing out the code, and the typing doesn't even take that long anyway.
I hope this helps to understand why, for me and people like me, AI coding doesn't take away the "line-by-line part" or the "typing". We can't slot it into our development process at the typing stage. To use it the way you are using it we would instead have to allow it to replace the part that happens (or can happen) away from the keyboard: the mental processing of the code. And many of us don't want to do that, for a wide variety of reasons that would take a whole other lengthy comment to get into.
There’s many who’s thinking is not so deep nor sharp as yours - LLM’s are welcomed by them but come at a tremendous cost to their cognition and the firms future well-being of its code base. Because this cost is implicit and not explicit it doesn’t occur to them.
> Because this cost is implicit and not explicit it doesn’t occur to them.
Your arrogance and naiveté blinds you to the fact it is does occur to them, but because they have a better understanding of the world and their position in it, they don't care. That's a rational and reasonable position.
Try not to use better/worse when advocating so vociferously. As described by the parent they are short-term pragmatic, that is all. This discussion can open up into a huge worldview where different groups have strengths and weaknesses based on this axis of pragmatic/idealistic.
"Companies" are not a monolith, both laterally between other companies, and what they are composed of as well. I'd wager the larger management groups can be pragmatic, where the (longer lasting) R&D manager will probably be the most idealistic of the firm, mainly because of seeing the trends of punching the gas without looking at long-term consequences.
> Try not to use better/worse when advocating so vociferously.
Hopefully you see the irony in your comment.
Software engineers are not paid to write code, we're paid to solve problems. Writing code is a byproduct.
Like, my job is "make sure our customers accounts are secure". Sometimes that involves writing code, sometimes it involves drafting policy, sometimes it involves presentations or hashing out ideas. It's on me to figure it out.
Writing the code is the easy part.
This is naiveté. Secure customer accounts and the work to implement them is tolerated by the business only while it is necessary to increase profits. Your job is not to secure customer accounts, but to spend the least amount of money to produce a level of account security that will not affect the bottom line. If insecure accounts were tolerated or became profitable, that would be the immediate goal and your job description would pivot on a dime.
Failure to understand this means you don't understand your role, employer, or industry.
I completely agree with every line of this statement. That is literally the job.
Of course I balance time/cost against risk. That's what engineers do. You don't make every house into a concrete bunker because it's "safer", that's expensive and unnecessary. You also don't engineer buildings for hurricanes in California. You do secure against earthquakes, because that's a likely risk.
Engineers are paid for our judgement, not our LOC. Like I said.
I agree with this. The hard part of software development happens when you're formulating the idea in your head, planning the data structures and algorithms, deciding what abstractions to use, deciding what interfaces look like--the actual intellectual work. Once that is done, there is the unpleasant, slow, error-prone part: translating that big bundle of ideas into code while outputting it via your fingers. While LLMs might make this part a little faster, you're still doing a slow, potentially-lossy translation into English first. And if you care about things other than "does it work," you still have a lot of work to do post-LLM to clean things up and make it beautiful.
I think it still remains to be seen whether idea -> natural language -> code is actually going to be faster or better than idea -> code. For unskilled programmers it probably already is. For experts? The jury may still be out.
Funny thing. I tend to agree, but I think it wouldn't look that way to an outside observer. When I'm typing in code, it's typically at a pretty low fraction of my general typing speed — because I'm constantly micro-interrupting myself to doubt the away-from-keyboard work, and refine it in context (when I was "working in the abstract", I didn't exactly envision all the variable names, for example).
Give it a first pass from a spec. Since you know how it should be shaped you can give an initial steer, but focus on features first, and build with testability.
Then refactor, with examples in prompts, until it lines up. You already have the tests, the AI can ensure it doesn't break anything.
Beat it up more and you're done.
This is just telling me to do this:
> To use it the way you are using it we would instead have to allow it to replace the part that happens (or can happen) away from the keyboard: the mental processing of the code.
I don't want to do that.
The entire idea using natural language for composite or atomic command units is deeply unsettling to me. I see language as an unreliable abstraction even with human partners that I know well. It takes a lot of work to communicate anything nuanced, even with vast amounts of shared context. That's the last thing I want to add between me and the machine.
What you wrote futher up resonates a lot for me, right down to the aphantasia bit. I also lack an internal monologue. Perhaps because of these, I never want to "talk" to a device as a command input. Regardless of whether it is my compiler, smartphone, navigation system, alarm clock, toaster, or light switch, issuing such commands is never going to be what I want. It means engaging an extra cognitive task to convert my cognition back into words. I'd much rather have a more machine-oriented control interface where I can be aware of a design's abstraction and directly influence its parameters and operations. I crave the determinism that lets me anticipate the composition of things and nearly "feel" transitive properties of a system. Natural language doesn't work that way.
Note, I'm not against textual interfaces. I actually prefer the shell prompt to the GUI for many recurring control tasks. But typing works for me and speaking would not. I need editing to construct and proof-read commands which may not come out of my mind and hands with the linearity it assumes in the command buffer. I prefer symbolic input languages where I can more directly map my intent into the unambiguous, structured semantics of the chosen tool. I also want conventional programming syntax, with unambiguous control flow and computed expressions for composing command flows. I do not want vagaries of natural language interfering here.
Almost more importantly is: the people who pay you to build software, don’t care if you type or enjoy it, they pay you for an output of working software
Literally nothing is stopping people from writing assembly in their free time for fun
But the number of people who are getting paid to write assembly is probably less than 1000
- making art as you thing it should be, but at the risk of it being non-commercial
- getting paid for doing commercial/trendy art
choose one
It’s so easy to be a starving artist; and in the world of commercial art it’s bloody dog-eat-dog jungle, not made for faint-hearted sissies.
Of course there are some artists who sit comfortably in the grey area between the two oppositions, and for these a little nudging towards either might influence things. But for most artists, their ideas or techniques are simply not relevant to a larger audience.
I'm not sure what your background is, but there are definitly artists out there drawing, painting and creating art they have absolutely zero care for, or even actively is against or don't like, but they do it anyways because it's easier to actually get paid doing those things, than others.
Take a look in the current internet art community and ask how many artists are actively liking the situation of most of their art commissions being "furry lewd art", vs how many commissions they get for that specific niche, as just one example.
History has lots of other examples, where artists typically have a day-job of "Art I do but do not care for" and then like the programmer, hack on what they actually care about outside of "work".
I was mostly considering contemporary artists that you see in museums, and not illustrators. Most of these have moved on to different media, and typically don't draw or paint. They would therefore also not be able to draw commission pieces. And most of the time their work does not sell well.
(Source: am professionally trained artist, tried to sell work, met quite a few artists, thought about this a lot. That's not to say that I may still be completely wrong though, so I liked reading your comment!)
Edit: and of course things get way more complicated and nuanced when you consider gallerists pushing existing artists to become trendy, and artists who are only "discovered" after their deaths, etc. etc.)
It's:
- Making art because you enjoy working with paint
- Making art because you enjoy looking at the painting afterward
This sounds like an alien trying and failing to describe why people like creating things. No, the typing of characters in a keyboard has no special meaning, neither does dragging a brush across a canvas or pulling thread through fabric. It's the primitive desire to create something by your own hands. Have people using AI magically lost all understanding of creativity or creation, everything has to be utilitarian and business?
Sometimes I like to make music because I have an idea of the final results, and I wanna hear it like that. Other times, I make music because I like the feeling of turning a knob, and striking keys at just the right moment, and it gives me a feeling of satisfaction. For others, they want to share an emotion via music. Does this mean someone of us are "making music for the wrong reasons"? I'd claim no.
In a creative process, when you really know your tools, you start being able to go from thought to result without really having to think about the tools. The most common example when it comes to computers would be touch-typing - when your muscle memory gets so good you don't think about the keyboard at all anymore, your hands "know" what to do to get your thoughts down. But for those of us with enough experience in the programming languages and editor/IDE we use, the same thing can happen - going from thought to code is nearly effortless, as is reading code, because we don't need to think about the layers in between anymore.
But this only works when those tools are reliable, when we know they'll do exactly what we expect. AI tooling isn't reliable: It introduces two lossy translation layers (thought -> English and English -> code) and a bunch of waiting in the middle that breaks any flow. With faster computers maybe we can eliminate the waiting, but the reliability just isn't there.
This applies to music, painting, all sorts of creative things. Sure there's prep time beforehand with physical creation like painting, but when someone really gets into the flow it's the same: they're not having to think about the tools so much as getting their thoughts into the end result. The tools "disappear".
> Other times, I make music because I like the feeling of turning a knob, and striking keys at just the right moment, and it gives me a feeling of satisfaction.
But I'll bet you're not thinking about "I like turning this knob" at the moment you're doing it, I'll bet you're thinking "Increase the foo" (and if you're like me it's probably more liking knowing that fact without forming the words) and the knob's immediate visceral feedback is where the satisfaction comes from because you're increasing the foo without having to think about how to do it - in part because of how reliable it is.
Nah bro, most of us learn touch typing and musical instrument finger exercises etc when starting out, it's usually abstracted away once we get competent.
AI takes away the joy of creation, not the low level actions. That's like abstracted twice over..
I use CC for both business and personal projects. In both cases: I want to achieve something cool. If I do it by hand, it is slow, I will need to learn something new which takes too much time and often time the thing(s) I need to learn is not interesting to me (at the time). Additionally, I am slow and perpetually unhappy with the abstractions and design choices I make despite trying very hard to think through them. With CC: it can handle parts of the project I don't want to deal with, it can help me learn the things I want to learn, it can execute quickly so I can try more things and fail fast.
What's lamentable is the conclusion of "if you use AI it is not truly creative" ("have people using AI lost all understanding of creativity or creation?" is a bit condescending).
In other threads the sensitive dynamic from the AI-skeptic crowds is more or less that AI enthusiasts "threaten or bully" people who are not enthusiastic that they will get "punished" or fall behind. Yet at the same time, AI-skeptics seem to routinely make passive aggressive implications that they are the ones truly Creating Art and are the true Craftsman; as if this venture is some elitist art form that should be gate kept by all of you True Programmers (TM).
I find these takes (1) condescending, (2) wrong and also belying a lack of imagination about what others may find genuinely enjoyable and inspiring, (3) just as much of a straw man as their gripes against others "bullying" them into using AI.
> How do I feel, about all the code I wrote that was ingested by LLMs? I feel great to be part of that, because I see this as a continuation of what I tried to do all my life: democratizing code, systems, knowledge.
I don't see it as democratic or democratising. TBH the knowledge is stored in three giga companies that used sometimes almost non-lawful (if not lawful?) methods to gain it, scraping it off the gpl projects etc. And now they are selling it to us without giving the models away. The cost IS understandable because the horrendously expensive vector cards do not come for free, but there is only one country the knowledge is gathered in so this might as well fade away one day when an orange present says so (gimme all the monies or else..)The way I see it, I can just start using AI once they get good enough for my type of work. Until then I'm continuing to learn instead of letting my brain atrophy.
I don't think that's true.
I'm really good at getting great results out of coding agents and LLMs. I've also been using LLMs for code on an almost daily basis since ChatGPT's release on November 30th 2022. That's more than three years ago now.
Meanwhile I see a constant flow of complaints from other developers who can't get anything useful out of these machines, or find that the gains they get are minimal at best.
Using this stuff well is a deep topic. These things can be applied in so many different ways, and to so many different projects. The best asset you can develop is an intuition for what works and what doesn't, and getting that intuition requires months if not years of personal experimentation.
I don't think you can just catch up in a few weeks, and I do think that the risk of falling behind isn't being taken seriously enough by much of the developer population.
I'm glad to see people like antirez ringing the alarm bell about this - it's not going to be a popular position but it needs to be said!
This has been my experience. When something gets good enough, someone will create some really good resource on it. Allowing the dust to settle, to me is a more efficient strategy than constantly trying to “keep up”. Maybe also not waiting too long to do so.
This wouldn’t work of course if a person was trying to be some AI thought leader.
Also, Simon, with all due respect, and I mean it, I genuinely look in awe at the amount of posts you have on your blog and your dedication, but it’s clear to anyone that the projects you created and launched before 2022 far exceed anything you’ve done since. And I will be the first to say that I don’t think that’s because of LLMs not being able to help you. But I do think it’s because what makes you really, really good at engineering you kept replacing slowly but surely with LLMs more and more by the month.
If I look at Django, I can clearly see your intelligence, passion, and expertise there. Do you feel that any of the projects you’ve written since LLMs are the main thing you focus on are similar?
Think about it this way: 100% of you wins against 100% of me any day. 100% of Claude running on your computer is the same as 100% of Claude running on mine. 95% of Claude and 5% of you, while still better than me (and your average Joe), is nowhere near the same jump from 95% Claude and 5% me.
I do worry when I see great programmers like you diluting their work.
It’s not just Simon that we’re getting less of, it’s YOU we’re getting less of too. And we want you around. Don’t go.
I’m keen to understand your reasoning on this. I don’t agree, but maybe I’m just stuck with old practices, so help me?
What’s your justification as to why starting a new chat is an antipattern?
I see what you're saying, but I'm not sure it is true. Take simonw and tymscar, put them each in charge of a team of 19 engineers (of identical capabilities). Is the result "nowhere near the same jump" as simonw vs. tymscar alone? I think it's potentially a much bigger jump, if there are differences in who has better ideas and not just who can code the fastest.
With LLMs its admittedly a bit closer to doing it yourself because the feedback loop is much tighter
There's a learning curve to any toolset, and it may be that using coding agents effectively is more than a few weeks of upskilling. It may be, and likely will be, that people make their whole careers about being experts on this topic.
But it's still a statistical text prediction model, wrapped in fancy gimmicks, sold at a loss by mostly bad faith actors, and very far from its final form. People waiting to get on the bandwagon could well be waiting to pick up the pieces once it collapses.
> But it's still a statistical text prediction model
This is reductive to the point of absurdity. What other statistical text prediction model can make tool calls to CLI apps and web searches? It's like saying "a computer is nothing special -- it's just a bunch of wires stuck together"
I wouldn't say it's shady or even untoward. Simon writes prolifically and he seems quite genuinely interested in this. That he has attached his public persona, and what seems like basically all of his time from the last few years, to LLMs and their derivatives is still a vested interest. I wouldn't even say that's bad. Passion about technology is what drives many of us. But it still needs saying.
> This is reductive to the point of absurdity. What other statistical text prediction model can make tool calls to CLI apps and web searches?
It's just a fact that these things are statistical text prediction models. Sure, they're marvels, but they're not deterministic, nor are they reliable. They are like a slot machine with surprisingly good odds: pull the lever and you're almost guaranteed to get something, maybe a jackpot, maybe you'll lose those tokens. For many people it's cheap enough to just keep pulling the lever until they get what they want, or go bankrupt.
But I'm still seeing clear evidence it IS a statistical text prediction model. You ask it the right niche thing and it can only pump out a few variations of the same code, that's clearly someone else's code stolen almost verbatim.
And I just use it 2 or 3 times a day.
How are SimonW and AntiRez not seeing the same thing?
How are they not seeing the propensity for both Claude + ChatGPT to spit out tons of completely pointless error handling code, making what should be a 5 line function a 50 line one?
How are they not seeing that you constantly have to nag it to use modern syntax. Typescript, C#, Python, doesn't matter what you're writing in, it will regularly spit out code patterns that are 10 years out of date. And woe betide you using a library that got updated in the last 2 years. It will constantly revert back to old syntax over and over and over again.
I've also had to deal with a few of my colleagues using AI code on codebases they don't really understand. Wrong sort, id instead of timestamp. Wrong limit. Wrong json encoding, missing key converters. Wrong timezone on dates. A ton of subtle, not obvious, bugs unless you intimately know the code, but would be things you'd look up if you were writing the code.
And that's not even including the bit where the AI obviously decided to edit the wrong search function in a totally different part of the codebase that had nothing to do with what my colleague was doing. But didn't break anything or trigger any tests because it was wrapped in an impossible to hit if clause. And it created a bunch of extra classes to support this phantom code, so hundreds of new lines of code just lurking there, not doing anything but if I hadn't caught it, everyone thinks it does do something.
The real unlock though is the coding agent harnesses. It doesn't matter any more if it statistically predicts junk code that doesn't compile, because it will see the compiler error and fix it. If you tell it "use red/green TDD" it will write the tests first, then spot when the code fails to pass them and fix that too.
> How are they not seeing the propensity for both Claude + ChatGPT to spit out tons of completely pointless error handling code, making what should be a 5 line function a 50 line one?
TDD helps there a lot - it makes it less likely the model will spit out lines of code that are never executed.
> How are they not seeing that you constantly have to nag it to use modern syntax. Typescript, C#, Python, doesn't matter what you're writing in, it will regularly spit out code patterns that are 10 years out of date.
I find that if I use it in a codebase with modern syntax it will stick to that syntax. A prompting trick I use a lot is "git clone org/repo into /tmp and look at that for inspiration" - that way even a fresh codebase will be able to follow some good conventions from the start.
Plus the moment I see it write code in a style I don't like I tell it what I like instead.
> And that's not even including the bit where the AI obviously decided to edit the wrong search function in a totally different part of the codebase that had nothing to do with what my colleague was doing.
I usually tell it which part of the codebase to execute - or if it decides itself I spot that and tell it that it did the wrong thing - or discard the session entirely and start again with a better prompt.
Honestly, what you're describing sounds like the older models. If you are getting these sorts of results with Opus 4.5 or 5.2-codex on high I would be very curious to see your prompts/workflow.
There are only so many ways to express the same idea. Even clean room engineers write incidentally identical code to the source sometimes.
Just like the stuff LLMs are being used for today. Why wouldn't "using LLMs well" be not just one of the many things LLMs will simplify too?
Or do you believe your type of knowledge is somehow special and is resistant to being vastly simplified or even made obsolete by AI?
Back in ~2024 a lot of people were excited about having "LLMs write the prompt!" but I found the results to be really disappointing - they were full of things like "You are the world's best expert in marketing" which was superstitious junk.
As of 2025 I'm finding they actually do know how to prompt, which makes sense because there's a ton more information about good prompting approaches in the training data as opposed to a couple of years ago. This has unlocked some very interesting patterns, such as Claude Code prompting sub-agents to help it explore codebases without polluting the top level token window.
But learning to prompt is not the key skill in getting good results out of LLMs. The thing that matters most is having a robust model of what they can and cannot do. Asking an LLM "can you do X" is still the kind of thing I wouldn't trust them to answer in a useful way, because they're always constrained by training data that was only aware of their predecessors.
It's different from assigning a task to a co-worker who already knows the business rules and cross-implications of the code in the real world. The agent can't see the broader picture of the stuff it's making, it can go from ignoring obvious (to a human that was present in the last planning meeting) edge cases to coding defensively against hundreds of edge cases that will never occur, if you don't add that to your prompt/context material.
- https://github.com/simonw/denobox is a new Python library that gives you the ability to run arbitrary JavaScript and WASM in a sandbox provided by Deno, because it turns out a Python library can depend on deno these days. I built that on my phone in bed yesterday morning.
- https://github.com/simonw/pwasm is a WebAssembly runtime written in pure Python with no dependencies, built by feeding Claude Code the official WASM specification along with its conformance test suite and having it hack away at that (again via my phone) to get as many of the tests to pass as possible. It's pretty slow and not really useful yet but it's certainly interesting.
- https://github.com/datasette/datasette-transactions is a Datasette plugin which provides a JSON API for starting a SQLite transaction, running multiple queries within it and then executing or rolling back that transaction. I built that one on my phone on a BART (SF Bay Area metro) trip.
- https://github.com/simonw/micro-javascript is a pure Python, no dependency JavaScript interpreter which started as a port of MicroQuickJS. Here's a demo of that one running in a browser https://simonw.github.io/micro-javascript/playground.html - that's my JavaScript interpreter running inside Python running in Pyodide in WebAssembly in your browser of choice, which I find inherently amusing.
All of those are from the past three weeks. Most of them were built on my phone while I was doing other things.
Looking at these projects, I have a few questions:
1. These seem to be fairly self-contained and well specified problems, which is the best case scenario for “vibe coding”. Do you have any examples of projects where the solution was somewhat vague and open-ended? If not, how do you think Claude Code or similar would perform?
2. Did you feel excited or energized by having an LLM implement these projects end-to-end? Personally, I find LLMs useful as a closely guided assistant, particularly to interactively explore the space of solutions. I also don’t feel energized at all by having it implement anything non-trivial end to end, outside of writing tests (and even then, not all types of tests!).
3. Do you think others would find these projects useful? In particular, if you vibe coded them, why couldn’t someone else do the same thing? And once these projects are picked up by future model training runs, they’ll probably be even easier to one shot, reducing the value even further.
Let me provide an example of what I mean by (2), at least in the context of hobbyist dev. I could have Claude Code vibe code a Gameboy emulator and it would probably do a fine job given that it’s a well specified problem that is likely well represented in its training data. But the process would neither be exciting nor energizing. I would rather spend hours gradually getting more and more working and experience the fruits of my labor (I did this already btw).
At $DAYJOB, I simply do not have confidence in an LLM doing anything non-trivial end to end. Besides, the complexity remains in defining the requirements and constraints, designing the solution, gaining consensus, and devising a plan for implementation. The goal would be for the LLM to pick up discrete, well defined chunks of work.
This one is pretty open ended, and I'm having a ton of fun designing and iterating on it: https://github.com/simonw/claude-code-transcripts - it's also attracting quite a few happy users now.
I have another project in the works in Go which is proving to be a ton of fun from a software design perspective, but it's not ready for outside eyes just yet.
"Did you feel excited or energized by having an LLM implement these projects end-to-end"
I'm enjoying myself so much right now. My BART rides have never been this entertaining before!
"Do you think others would find these projects useful? In particular, if you vibe coded them, why couldn’t someone else do the same thing?"
I don't think many developers have the combined taste and knowledge necessary to spin up Denobox or django-transactions. They both solve problems that I'm very confident need solving, but I expect to have to explain why those matter in some detail to all but a very small group of people who share my particular interests.
The other two are pretty standard - I suggest anyone who wants to learn more about JavaScript interpreters or WASM runtimes try something similar in the language of choice as a learning exercise.
I think that's the reason why LLMs work so well for some like you, and generate slop for others, because if you let them alone with projects that require opinionated code and actual decision making they most often don't grasp the users intention well or worse misinterpret it so confidently that you end up with something with all the wrong opinions and decisions compounding path-dependently into the strangest and most useless slop.
Yes, exactly! How amazing is it that we have technology now that lets us quickly build projects where we would normally have to spend considerable time on the finicky implementation details?
Of course I don't know for sure if you had any substantial input other than writing a few paragraphs of prompt text and sending Claude some links, because I didn't witness your workflow there. But I think this is kind of what irks some people including myself.
What's stopping me from "building" something similar also? Maybe I won't be as fast as you since you seem to be more experienced with these tools, but at the end of the day, would you be able to describe in detail what got built without you asking Claude about it? If you don't know anything about what you built other than just prompting an AI, in my opinion you didn't actually "build" anything -- Claude did.
One of my favorite options is "directed" - "I directed this". It's not quite obvious enough for me to use it in comments on threads like this though.
I've also experimented with "We built" but that feels uncomfortably like anthropomorphizing the model.
One of the reasons I publish almost all of my prompts and transcripts is that I don't believe in gatekeeping this stuff and I want other people to be able to learn how to do what I can do. Here are the transcripts for me Denobox project, for example: https://github.com/simonw/denobox/tree/transcripts - you can view those with my new https://orphanhost.github.io/ tool like this: https://orphanhost.github.io/?simonw/denobox/transcripts/ses...
People are under zero obligation to release their work to the public. Simon actually publishes and writes about a remarkable amount of the side projects he builds with AI.
The rest of us just build tons of cool stuff for personal use or for $JOB. Releasing stuff to the public is, in general, a massive amount of extra work for very little benefit. There are loads of FOSS maintainers trapped spending as much time managing their communities as they do their actual projects and many of us just don't have time for that.
I wouldn't worry about this.
There are many examples of people sharing a project they've used LLMs to help write, and the result was not a huge amount of attention & expectation of burden.
Perhaps "I don't share it because I'm worried people will love it too much" even suggests the opposite: you can concretely demonstrate the kinds of things you've been able to build using LLMs.
> This is such a tired response at this point.
Lack of specificity & concrete examples frequently mean all that's left for discussion is emotion for hype and anti-hype, though.
In this thread, the discussion was:
pro: use LLMs or get left behind
conserve: okay, I'll start using LLMs when they're good
pro: no no they won't be that good, it takes effort to get to use them
conserve: do you have any examples?
pro: why should we have to share examples?
I like LLMs. But making big claims while being reticent about concrete claims and demonstrations is irksome.Software quality has been on a step downwards curve as far as quality and capabilities are concerned, for years before LLM coding had its breakthrough. For all the promises I'd have expected to, three years later, at least notice the downward trajectory easing off. But it hasn't been happening.
> I could if I wanted to, but I just don't feel like it.
What am I missing where I can understand that's not what you meant?
People seem to believe that there is a burden of proof. There is not. What do I care if you are on board?
I don't know what could change your mind, but of course the answer is "nothing" as long as you aer not open to it. Just look around. There is so much stuff, from so many credible people in all domains. If you can't find anything that is convincing or at least interesting to you, you are simply not looking.
The burden of proof rests on those making the positive claim. You say you don't care if others get on board, but a) clearly a lot of others do (case in point: the linked article) and b) a quick check of your posts in this very thread shows that you are indeed making positive claims about the merits of LLM assisted software development.
Without enough adoption expect some companies you are a client of to increase prices more, or close entirely down the road, due to insufficient cash inflow.
So, you would care, if you want to continue to use these tools and see them evolve, instead of seeing the bubble pop.
https://github.com/williamcotton/gramgraph
The motivation? I needed a declarative plotting language for another DSL I'm working on called Web Pipe:
GET /weather.svg
|> fetch: `https://api.open-meteo.com/v1/forecast?latitude=52.52&longitude=13.41&hourly=temperature_2m`
|> jq: `
.data.response.hourly as $h |
[$h.time, $h.temperature_2m] | transpose | map({time: .[0], temp: .[1]})
`
|> gg({ "type": "svg", "width": 800, "height": 400} ): `
aes(x: time, y: temp)
| line()
| point()
`
"Web Pipe is an experimental DSL and Rust runtime for building web apps via composable JSON pipelines, featuring native integration of GraphQL, SQL, and jq, an embedded BDD testing framework, and a sophisticated Language Server."https://github.com/williamcotton/webpipe
https://github.com/williamcotton/webpipe-lsp
https://williamcotton.com/articles/basic-introduction-to-web...
I've been working at quite a clip for a solo developer who is building a new language with a full featured set of tooling.
I'd like to think that the approach to building the BDD-testing framework directly into the language itself and having the test runner using the production request handlers is at least somewhat novel!
GET /hello/:world
|> jq: `{ world: .params.world }`
|> handlebars: `<p>hello, {{world}}</p>`
describe "hello, world"
it "calls the route"
let world = "world"
when calling GET /hello/{{world}}
then status is 200
and selector `p` text equals "hello, {{world}}"
I'm married with two young kids and I have a full-time job. Before these tools there was no way I could build all of these experiments with such limited resources.- https://tools.simonwillison.net/bullish-bearish
- https://tools.simonwillison.net/user-agent
For more impressive examples see https://simonwillison.net/2025/Dec/10/html-tools/ and https://news.ycombinator.com/item?id=46574276#46582192
I can't gauge the other two since I don't use those things, so maybe they are cool, idk.
If you're here to converse in good faith, what's your opinion of the examples I shared in this post over here? https://news.ycombinator.com/item?id=46574276#46582192
Maybe AI gets good enough at writing code that it's users' knowledge of computer science and software development becomes irrelevant. In that case, approximately everyone on this site is just screwed. We're all in the business of selling that specialized knowledge, and if it's no longer required then companies aren't going to pay us to operate the AI, they're going to pay PMs, middle managers, executives, etc. But even that won't be particularly workable long term, because all their customers will realize they no longer need to pay the companies for software either. In this world, the price of software goes to zero (and hosting likely gets significantly more commoditized than it is now). Any time you put into learning to use LLMs for software development doesn't help you keep making money selling software, and actually stops you from picking up a new career.
If, on the other hand, CS and software engineering knowledge is still needed, companies will have to keep/restart hiring or training new developers. In terms of experience using AI, it is impossible for anyone to have less experience than these new developers. We will, however, have much more experience and knowledge of the aforementioned non-LLM skills that we're assuming (in this scenario) are still necessary for the job. In this scenario you might be better off if you'd started learning to prompt a bit earlier, but you'll still be fine if you didn't.
What's the impressive thing that can convince me it's equivalent, or better than anything created before, or without it?
I understand you've produced a lot of things, and that your clout (which depends on the AI ferver) is based largely because of how refined a workflow you've invented. But I want to see the product, rather than the hype.
Make me say; I wish I was good enough to create this!
Without that, all I can see is the cost, or the negative impact.
edit: I've read some of your other posts, and for my question, I'd like to encourage you to pick only one. Don't use the scatter shot approach that LLMs love, giving plenty of examples, hoping I'll ignore the noise for the single that sounds interesting.
Pick only one. What project have you created that you're truly proud of?
I'll go first, (even though it's unfinished): Verse
It's intuitive to use but hard to master
If you never AI coded before then get ready for fun!
It is just way easier for someone to get up to speed today than it was a year ago. Partly because capabilities have gotten better and much of what was learned 6+ months ago no longer needs to be learned. But also partly because there is just much more information out there about how to get good results, you might have coworkers or friends you can talk to who have gotten good results, you can read comments on HN or blog posts from people who have gotten good results, etc.
I mean, ok, I don't think someone can fully catch up in a few weeks. I'll grant that for sure. But I think they can get up to speed much faster than they could have a year ago.
Of course, they will have to put in the effort at that time. And people who have been putting it off may be less likely to ever do that. So I think people will get left behind. But I think the alarm to raise is more, "hey, it's a deep topic and you're going to have to put in the effort" rather than "you better start now or else it's gonna be too late".
The intuition just doesn't hold. The LLM gets trained and retrained by other LLM users so what works for me suddenly changes when the LLM models refresh.
LLMs have only gotten easier to learn and catch up on over the years. In fact, most LLM companies seem to optimise for getting started quickly over getting good results consistently. There may come a moment when the foundations solidify and not bothering with LLMs may put you behind the curve, but we're not there yet, and with the literally impossible funding and resources OpenAI is claiming they need, it may never come.
I mean, I just think of them like a dog that'll get distracted and go off doing some other random thing if you don't supervise them enough and you certainly don't want to trust them to guard your sandwich.
The pro AI people don't understand what quadratic attention means and the anti-ai people don't understand how much information can be contained in a tb of weights.
At the end of the day both will be hugely disappointed.
>The best asset you can develop is an intuition for what works and what doesn't, and getting that intuition requires months if not years of personal experimentation.
Intuition does not translate between models. Whatever you think dense llms were good at deepseek completely upended it in an afternoon. The difference between major revisions of model families is substantial enough that intuition is a drawback not an asset.
I've so far found that intuition travels between models of a similar generation remarkably well. The conformance suite trick (find a 9,200 test existing conformance suite and tell an agent to build a fresh implementation that passes all those tests) I first found with GPT-5.2 turned out to work exactly as well against Claude Opus 4.5, for example.
It argues that the self-attention mechanism in transformers works by having every token "attend to" every other token in a sequence, which is quadratic - n^2 against input - which should limit the total context length available to models.
This would explain why the top models have been stuck at 1 million tokens since Gemini 1.5 in February 2024 (there has been a 2 million token Gemini but it's not in wide use, and Meta claimed their Llama 4 Scout could do 10 million but I don't know of anyone who's seen that actually work.)
My counter-argument here is that Claude Opus 4.5 has a comparatively tiny 200,000 token window which turns out to work incredibly well for the kinds of coding problems we're throwing at it, when accompanied by a cleverly designed harness such as Claude Code. So this limit from 2022 has been less "disappointing" than people may have expected.
What's practically limiting context size IME is that results seem to get "muddy" and get off track when you have a giant context size. For a single-topic long session, I imagine you get a large number of places in the context which may be good matches for a given query, leading to ambiguous results.
I'm also not sure how much work is being put into reinforcement in extremely large context inference, as it's presumably quite expensive to do and hard to reliably test.
Perfect for a demo or work on a single self contained file.
Disastrous for a large code base with logic scattered all throughout it.
Greenfield is easy (but it always was). Working on well-organised modules that are self contained and cleanly designed is easy - but that always was, too.
Things that they couldn't do six months go might now be things that they can do - and knowing they couldn't do X six months ago is useful because it helps systematize your explorations.
A key skill here is to know what they can do, what they can't do and what the current incantations are that unlock interesting capabilities.
A couple I've learned in the past week:
1. Don't give Claude Code a URL to some code and tell it to use that, because by default it will use its WebFetch tool but that runs an extra summarization layer (as a prompt injection defense) which loses details. Telling it to use curl sometimes works but a guaranteed trick is to have it git clone the relevant repo to /tmp and look at the code there instead.
2. Telling Claude Code "use red/green TDD" is a quick to type shortcut that will cause it to write tests first, run them and watch them fail, then implement the feature and run the test again. This is a wildly effective technique for getting code that works properly while avoiding untested junk code that isn't needed.
Now multiply those learnings by three years. Sure, the stuff I figure out in 2023 mostly doesn't apply today - but the skills I developed in learning how to test and iterate on my intuitions from then still count and still keep compounding.
The idea that you don't need to learn these things because they'll get better to the point that they can just perfectly figure out what you need is AGI science fiction. I think it's safe to ignore.
A somewhat intelligent junior will dive deep for one week and be on the same knowledge level as you in roughly 3 years.
In particular, the idea of saying something like "use red/green TDD" is an expression of communication skill (and also, of course, awareness of software methodology jargon).
And if the hype is right, why would you need to know any of them? I've seen people unironically suggest telling the LLM to "write good code", which seems even easier.
Telling an intern to care about code quality might actually cause an intern who hasn't been caring about code quality to care a little bit more. But it isn't going to help the intern understand the intended purpose of the software.
> prompting the AI with words that you'd actually use when asking a human to perform the task, generally works better
Ok, but why would you assume that would remain true? There's no reason it should.
As AI starts training on code made by AI, you're going to get feedback loops as more and more of the training data is going to be structured alike and the older handwritten code starts going stale.
If you're not writing the code and you don't care about the structure, why would you ever need to learn any of the jargon? You'd just copy and paste prompts out of Github until it works or just say "hey Alexa, make me an app like this other app".
It's also useful for figuring out what I think and how best to express that. Sometimes I get really great replies too - I compared ethical LLM objections to veganism today on Lobste.rs and got a superb reply explaining why the comparison doesn't hold: https://lobste.rs/s/cmsfbu/don_t_fall_into_anti_ai_hype#c_oc...
I agree that CC becoming omniscient is science fiction, but the goal of these interfaces is to make LLM-based coding more accessible. Any strategies we adopt to mitigate bad outcomes are destined to become part of the platform, no?
I've been coding with LLMs for maybe 3 years now. Obviously a dev who's experienced with the tools will be more adept than one who's not, but if someone started using CC today, I don't think it would take them anywhere near that time to get to a similar level of competency.
Could it be, they use other definition of "good"?
Being "a person who can code" carries some prestige and signals intelligence. For some, it has become an important part of their identity.
The fact that this can now be said of a machine is a grave insult if you feel that way.
It's quite sad in a way, since the tech really makes your skills even more valuable.
I've had this conversation with a few people so far, and I've offered to personally walk through a project of their choosing with them. Everyone who has done this has changed their perspective. You may not be convinced it will change the world, but if you approach it with an open mind and take the time to learn how to best use it, I'm 100% sure you will see that it has so much potential.
There are tons of youtube videos and online tutorials if you really want to learn.
Here we go, as I said, and again and again and again it's always out fault we're not using well. It is impossible to counter argument. Btw to reply to your question, yes many times and proved to be useful in very small specialized tasks and a couple of migrations. I really like how LLMs are helping me in my day to day, but still so far away from all this astroturfing
You'd be sage with your time just to keep a high-level view until workflows become stable and aren't advancing every few months.
The time to consider mastering a workflow is when a casual user of the "next release" wouldn't trivially supersede your capabilities.
Similarly we're still in the race to produce a "good enough" GenAI, so there isn't value in mastering anything right now unless you've already got a commercial need for it.
This all reminds me of a time when people were putting in serious effort to learn Palm Pilot's Graffiti handwriting recognition, only for the skill to be made redundant even before they were proficient at it.
This is true, but still shocking. Professional (working with others at least) developers basically live or die by their ability to communicate. If you're bad at communication, your entire team (and yourself) suffer, yet it seems like the "lone ranger" type of programmer is still somewhat praised and idealized. When trying to help some programmer friends with how they use LLMs, it becomes really clear how little they actually can communicate, and for some of them I'm slightly surprised they've been able to work with others at all.
An example the other day, some friend complained that the LLM they worked with was using the wrong library, and using the wrong color for some element, and surprised that the LLM wouldn't know it from the get go. Reading through the prompt, they never mentioned it once, and when asked about it, they thought "it should have been obvious" which yeah, to someone like you who worked for 2 years on this project that might be obvious, but for some with zero history and zero context about what you do? How you expect it to know this? Baffling sometimes.
The world changed for good and we will need to adapt. The bigger and more important question at this point isn't anymore if LLMs are good enough, for the ones who want to see, but, as you mention in your article, is what will happen to people who will get unemployed. There's a reality check for all of us.
Learning all of the advanced multi-agent worklows etc. etc... Maybe that gets you an extra 20%, but it costs a lot more time, and is more likely to change over time anyway. So maybe not very good ROI.
The most advanced tooling today looks nothing like the tooling for writing software 3 years ago. We've got multi-agent orchestration with built in task and issue tracking, context management, and subagents now. There's a steep learning curve!
I'm not saying that everyone has to do it, as the tools are so nascent, but I think it is worthwhile to at least start understanding what the state of the art will look like in 12-24 months.
Part of the problem with things that iterate quickly is that iterations tend to reference previous versions. So, you try learning the new hotness (v261), but there are implied references to v254, v239, and v198. Then you realize, v1, v5, v48, v87, v138, v192, and v230 have cute identifiers that you aren't familiar with and are never explained anywhere. New concepts get introduced in v25, v50, v102, and v156 that later became foundational knowledge that is assumed to be understood by the reader and is never explained anywhere.
So, if you feel confident something will be the next hotness, it's usually best to be an early adopter, so you gain your knowledge slowly over years instead of having to cram when you need to pick it up.
I've learned a lot of new things this year thanks to AI. It's true that the low levels skills with atrophy. The high level skills will grow though; my learning rate is the same, just at a much higher abstraction level; thus covering more subjects.
The main concern is the centralisation. The value I can get out of this thing currently well exceeds my income. AI companies are buying up all the chips. I worry we'll get something like the housing market where AI will be about 50% of our income.
We have to fight this centralisation at all costs!
I don't think that's too far away. Anthropic, OpenAI, etc. are pushing the idea that you need a subscription but if opensource tools get good enough they could easily become an expensive irrelivance.
Always? I think that only holds for a certain amount of time (different for each sector) after which the open stuff is better.
I thought it was only true for dev tools, but I had to rethink it when I met a guy (not especially technical) who runs open source firmware on his insulin pump because the closed source stuff doesn't gives him as much control.
I’m surprised by how good the models I can run on my old M1 Max laptop are.
In a year’s time open models on something like a Mac Studio M5 Ultra are going to be very impressive compared to the closed models available today.
They won’t be state of the art for their time but they will be good enough and you’ll have full control.
Thank goodness this isn't in a problem!
In fact, I'd say I code even better since I started doing one hour per day of a mixture of fun coding and algo quizzes while at work I mostly focus on writing a requirements plan and implementation plan later and then letting the AI cook while I review all the output multiple times from multiple angles.
Guess we are still in the 1970s era of AI computing. We need to hope for a few more step changes or some breakthrough on model size.
I don't think it's a coincidence that some of the best developers[1] are using these tools and some openly advocating for them because it still requires core skills to get the most out of them
I can honestly say that building end-to-end products with claude code has made me a better developer, product designer, tester, code reviewer, systems architect, project manager, sysadmin etc. I've learned more in the past ~year than I ever have in my career.
[0] abandoned cursor late last year
[1] see Linus using antigravity, antirez in OP, Jared at bun, Charlie at uv/ruff, mitushiko, simonw et al
(I had been using GitHub Copilot for 5+ years already, started as an early beta tested, but I don’t really consider that the same)
I like to say it’s like learning a programming language. it takes time, but you start pattern matching and knowing what works. it took me multiple attempts and a good amount of time to learn Rust, learning effective use of these tools is similar
I’ve also learned a ton across domains I otherwise wouldn’t have touched
- find information about APIs without needing to open a browser
- writing a plan for your business-logic changes or having it reviewed
- getting a review of your code to find edge cases, potential security issues, potential improvements
- finding information and connecting the dots of where, what and why it works in some way in your code base?
Even without letting AI author a single line of code (where it can still be super useful) there are still major uses for AI.
It's not that different overall, I suppose, from the loop of thinking of an idea and then implementing it and running tests; but potentially very disorienting for some.
One of the key skills needed in working with LLMs is learning to ignore the hype and marketing and figure out what these things are actually capable of, as opposed to LinkedIn bluster and claims from CEOs who's net worth are tied to investor sentiment in their companies.
If someone spends more time talking about "AGI" then what they're actually building, filter that person out.
This is precisely what led me to realize that while they have some use for code review and analyzing docs, for coding purposes they are fairly useless.
The hypesters responses' to this assertion exclusively into 5 categories. Ive never heard a 6th.
If you believe that uncritically about everything else, then you have to answer why agentic workflows or MCP or whatever is the one thing that it can't evolve to do for us. There's a logical contradiction here where you really can't have it both ways.
oh I think I do get your point now after a few rereads (correct if wrong but you’re saying it should keep getting better until there’s nothing for us to do). “AI”, and computer systems more broadly, are not and cannot be viable systems. they don’t have agency (ironically) to affect change in their environment (without humans in the loop). computer systems don’t exist/survive without people. all the human concerns around what/why remain, AI is just another tool in a long line of computer systems that make our lives easier/more efficient
Prompt Engineer to AI Engineer: Designing agentic workflows is a waste of time, just pre/postfix whatever input you'd normally give to the agentic system with the request to "build or simulate an appropriate agentic workflow for this problem"
There is a staggering number of unserious folks in the ears of people with corporate purchasing power.
your argument amounts to “some people said stupid shit one time and I took it seriously”
Replace that with anything and you will notice that people who are building startups in this area will want to bring the narrative like that as it usually highly increases the value of their companies. When narrative gets big enough, then big companies must follow - or they look like "lagging behind". Whether the current thing brings value or not. It is a fire that keeps feeding itself. In the end, when it gets big enough - we call it as bubble. Bubble that may explode. Or not.
Whether the end user gets actual value or not, is just side effect. But everyone wants to believe that that it brings value - otherwise they were foolish to jump in the train.
For now i think people can still catch up quickly, but at the end of 2026 it's probably going to be a different story.
Can you elaborate? Skill in AI use will be a differentiator?
At some point you will need to combine multiple skills together:
- communication
- engineering skills (understanding requirements, finding edge cases, etc)
- architectural proficiency
- prompting
- agentic workflows and skills
- context management
- and yes, proper old fashioned coding skills to keep things tidy and consistent
Ah yes, an ecosystem that is fundamentally inherently built on probabilisitic quick sand and even with the "best prompting practices", you still get agents violating the basics of security and committing API keys when they were told not to. [0]
Design your secrets to include a common prefix, then use deterministic scanning tools like git hooks to prevent then from being checked in.
Or have a git hook that knows which environment variables have secrets in and checks for those.
For example, what if your code (that the LLM hasn't reviewed yet) has a dumb feature in where it dumps environment variables to log output, and the LLM runs "./server --log debug-issue-144.log" and commits that log file as part of a larger piece of work you ask it to perform.
If you don't want a bad thing to happen, adding a deterministic check that prevents the bad thing to happen is a better strategy than prompting models or hoping that they'll get "smarter" in the future.
Some of this negativity I think is due to unrealistic expectations of perfection.
Use the same guardrails you should be using already for human generated code and you should be fine.
CPUs are billions of transistors. sometimes one fails and things still work. “probabilistic quicksand” isn’t the dig you think it is to people who know how this stuff works
"Trust only me bro".
It takes 10 seconds to see the many examples of API keys + prompts on GitHub to verify that tweet. The issue with AI isn't limited to that tweet which demonstrates its probabilistic nature; Otherwise why do need a sandbox to run the agent in the first place?
Nevermind, we know why: Many [0] such [1] cases [2]
> CPUs are billions of transistors. sometimes one fails and things still work. “probabilistic quicksand” isn’t the dig you think it is to people who know how this stuff works
Except you just made a false equivalence. CPUs can be tested / verified transparently and even if it does go wrong, we know exactly why. Where as you can't explain why the LLM hallucinated or decided to delete your home folder because the way it predicts what it outputs is fundamentally stochastic.
[0] https://old.reddit.com/r/ClaudeAI/comments/1pgxckk/claude_cl...
[1] https://old.reddit.com/r/ClaudeAI/comments/1jfidvb/claude_tr...
[2] https://www.google.com/search?q=ai+deleted+files+site%3Anews...
my point is more “skill issue” than “trust me this never happens”
my point on CPUs is people who don’t understand LLMs talk like “hallucinations” are a real thing — LLMs are “deciding” to make stuff up rather than just predicting the next token. yes it’s probabilistic, so is practically everything else at scale. yet it works and here we are. can you really explain in detail how everything you use works? I’m guessing I can explain failure modes of agentic systems (and how to avoid them so you don’t look silly on twitter/github) and how neural networks work better than most people can explain the technology they use every day
That doesn't refute the probabilistic nature of LLMs despite best prompting practices. In fact it emphasises it. More like your 1 anecdotal example vs my 20+ examples on GitHub.
My point tells you that not only it indeed does happen, but a previous old issue is now made even worse and more widespread, since we now have vibe-coders without security best practices assuming the agent should know better (when it doesn't).
> my point is more “skill issue” than “trust me this never happens”
So those that have this "skill issue" are also those who are prompting the AI differently then? Either way, this just inadvertently proves my whole point.
> yes it’s probabilistic, so is practically everything else at scale. yet it works and here we are.
The additional problem is can you explain why it went wrong as you scale the technology? CPUs circuit design go through formal verification and if a fault happens, we know exactly why; hence it is deterministic in design which makes them reliable.
LLMs are not and don't have this. Which is why OpenAI had to describe ChatGPT's misaligned behaviour as "sycophancy", but could not explain why it happened other than tweaking the hyper-parameters which got them that result.
So LLMs being fundamentally probabilistic and are hence, more unexplainable being the reason why you have the screenshot of vibe-coders who somehow prompted it wrong and the agent committed the keys.
Maybe that would never have happened to you, but it won't be the last time we see more of this happening on GitHub.
yes AI makes leaking keys on GH more prevalent, but so what? it’s the same problem as before with roughly the same solution
I’m saying neural networks being probabilistic doesn’t matter — everything is probabilistic. you can still practically use the tools to great effect, just like we use everything else that has underlying probabilities
OpenAI did not have to describe it as sycophancy, they chose to, and I’d contend it was a stupid choice
and yes, you can explain what went wrong just like you can with CPUs. we don’t (usually) talk about quantum-level physics when discussing CPUs; talking about neurons in LLMs is the wrong level of abstraction
Verses your anecdote being a proof of what? Skill issue for vibe coders? Someone else prompting it wrong?
You do realize you are proving my entire point?
> yes AI makes leaking keys on GH more prevalent, but so what? it’s the same problem as before with roughly the same solution
Again, it exacerbates my point such that it makes the existing issue even worse. Additionally, that wasn't even the only point I made on the subject.
> I’m saying neural networks being probabilistic doesn’t matter — everything is probabilistic.
When you scale neural networks to become say, production-grade LLMs, then it does matter. Just like it does matter for CPUs to be reliable when you scale them in production-grade data centers.
But your earlier (fallacious) comparison ignores the reliability differences between them (CPUs vs LLMs.) and determinism is a hard requirement for that; which the latter, LLMs are not.
> OpenAI did not have to describe it as sycophancy, they chose to, and I’d contend it was a stupid choice
For the press, they had to, but no-one knows the real reason, because it is unexplainable; going back to my other point on reliability.
> and yes, you can explain what went wrong just like you can with CPUs. we don’t (usually) talk about quantum-level physics when discussing CPUs; talking about neurons in LLMs is the wrong level of abstraction
It is indeed wrong for LLMs because not even the researchers can practically give an explanation why a single neuron (for every neuron in the network) gives different values on every fine-tune or training run. Even if it is "good enough", it can still go wrong at the inference-level for other unexplainable reasons other than it "overfitted".
CPUs on the other hand, have formal verification methods which verify that the CPU conforms to its specification and we can trust that it works as intended and can diagnose the problem accurately without going into atomic-level details.
> I’m saying it doesn’t matter it’s probabilistic, everything is,
Maybe it doesn't matter for you, but it generally does matter.
The risk level of a technology failing is far higher if it is more random and unexplainable than if it is expected, verified and explainable. The former eliminates many serious use-cases.
This is why your CPU, or GPU works.
LLMs are neither deterministic, no formal verification exists and are fundamentally black-boxes.
That is why many vibe-coders reported many "AI deleted their entire home folder" issues even when they told it to move a file / folder to another location.
If it did not matter, why do you need sandboxes for the agents in the first place?
very little software (or hardware) used in production is formally verified. tons of non-deterministic software (including neural networks) are operating in production just fine, including in heavily regulated sectors (banking, health care)
No.
> very little software (or hardware) used in production is formally verified. tons of non-deterministic software (including neural networks) are operating in production just fine, including in heavily regulated sectors (banking, health care)
It's what happens when it all goes wrong.
You have to explain exactly why, a system failed in heavily regulated sectors.
Saying 'everything is probabilistic' as the reason for the cause of an issue, is a non answer if you are a chip designer, air traffic controller, investment banker or medical doctor.
So your point does not follow.
I have repeated myself many times and you decide to continue to ignore the reliability points that inherently impede LLMs in many use-cases which exclude them in areas where predictability in critical systems is required in production.
Vibe coders can use them, but the gulf between useful for prototyping and useful for production is riddled with hard obstacles as such a software like LLMs are fundamentally unpredictable hence the risks are far greater.
> I’m saying these non-deterministic neural networks are already widespread in industry with these regulations, and it’s fine.
So when a neural network scales beyond hundreds of layers and billions of parameters, equivalent to a production-grade LLM, explain exactly how is such a black-box on that scale explainable when it messes up and goes wrong?
> they can be explained despite your repeated assertions they cannot be.
With what methods exactly?
Early on, I said formal verification and testing on CPUs for explaining when they go wrong at scale. It is you that provided absolutely nothing of your own assertions with the equivalent for LLMs other than "they can be explained" without providing any evidence.
> also the entire point on CPUs which you may have noticed I dropped from my responses because you seemed distracted arguing about it). this is not productive and we’re both clearly stubborn, glhf
You did not make any point with that as it was a false equivalence, and I explained why the reliability of a CPU isn't the same as the reliability of a LLM.
The ones pushing this narrative have either the following:
* Invested in AI companies (which they will never disclose until they IPO / acquired)
* Employees at AI companies that have stock options which they are effectively paid boosters around AGI nonsense.
* Mid-life crisis / paranoia that their identity as a programmer is being eroded and have to pivot to AI.
It is no different to the crypto web3 bubble of 2021. This time, it is even more obvious and now the grifters from crypto / tech are already "pivoting to ai". [0]
> It is no different to the crypto web3 bubble of 2021
web3 didn't produce anything useful, just noise. I couldn't take a web3 stack to make an arbitrary app. with the PISS machine I can.
Do I worry about the future, fuck yeah I do. I think I'm up shit creek. I am lucky that I am good at describing in plain English what I want.
With AI companies still selling services far below cost, it's only a matter of time before the money runs out and the true value of these tools will be tested.
As someone who was at a large company that was dabbling in NFTs, there was no value apart from pure gambling. At the time that we were doing it, it was also too late, so it was just a jinormous
My issue with GenAI is the rampant copyright violation, and the effect it will have on the economy. Its also replacing all of the fun bits of the world that I inhabit.
At least with web3 it was mostly contained with in the BO infested basement that crypto bros inhabit. AI bollocks has infected half the world.
Stay in plan mode most of the time. It will produce a step by step set of instructions - more context - for the LLM to execute the change. It’s the best place to exert detailed control over what will happen. Claude lets you edit it in a vim window.
Think about testing strategy carefully. Connecting the feedback back into the LLM is what makes a lot of the magic happen. But it requires thought or the LLM might cheat or you get a suboptimal result.
Then with these two you spend your time thinking in terms of product correctness - good tests - and implementation plan - deciding if the LLM has a sane grasp of the problem and will create a sane result.
You’re at a higher level of abstraction, still caring about details, but rarely finicky up to your elbows in line by line code.
If you can get good at these you’re well on your way.
Force it to have clear metrics / observability on what it is doing. For instance the other day I wanted Claude to modify a Commodore 64 emulator, and I started saying it to implement an observability framework where as the emulator run, it can connect to a socket and ask for registers, read/write memory areas, check the custom chips status, set breakpoints, ... As you can guess, after this the work is of a different kind.
The concern mostly comes from the business side… that for all the usefulness on the tech there is no clearly viable path that financially supports everything that’s going on. It’s a nice set of useful features but without products with sufficient revenue flowing in to pay for it all.
That paints a picture of the tech sticking around but a general implosion of the startups and business models betting on making all this work.
The later isn’t really “anti-AI hype” but more folks just calling out the reality that there’s not a lot of evidence and data to support the amount of money invested and committed. And if you’ve been around the tech and business scene a while you’ve seen that movie before and know what comes next.
In 5 years time I expect to be using AI more than I do now. I also expect most of the AI companies and startups won’t exist anymore.
One is the time of a human (irreplaceable) and the other is a tool for some human to use, seems proportional to me.
Everyone is replaceable. Software devs aren't special.
If it's all meant to be ironical, it's a huge failure and people will use it to support their AI hype.
I for one look forward to the next AI winter, which I hope will be long, deep, and savage.
Don't let hype deter you to get your own hands dirty and try shit.
- when Google paid $1 bil for YouTube
- when Facebook paid $1 bil for Instagram
- when Facebook paid $1 bil for WhatsApp
The same thing - these 3 companies make no money, and have no path to making money, and that the price paid was crazy and decoupled from any economics.
Yet now, in hindsight, they look like brilliant business decisions.
you lack imagination, human workers are paid globally over $10 trillion dollars.
What? HN is absolutely packed with people complaining about LLMs are nothing more than net useless creators of slop.
Granted, fewer than six months ago, which should tell people something...
We have people who are running the same tasl 10 times in parallel and having one LLm write a prompt for another LLm to execute then sitting on their phone for an hour while they let the AI's battle it out. For tasks that should take 3 minutes. Then having another coding agent make a PR, update JIRA tickets, etc.
Frankly it blows my mind that so many developers have so little actual understanding of cost associated with AI.
Why we don't have to be anti-AI? Why in his opinion is just "HYPE"? I didn't find any answer in his post. He doesn't analyse the cons of AI and explain why some people might be anti-AI. He skipped the hard part and wrote a mild article that re-publish the narrative that is already getting spread on every social media.
Edit for clarification: I don't consider anti-AI the people that think LLMs don't work, they are wrong. I consider anti-AI people that are worried how this technology will impact society in so many ways that are hard to predict, including the future of software engineering.
I do think at least being proficient right now with the LLMs will help you with whatever comes next, just because you’ll build the intuition around it. Being anti-AI might negatively affect one’s employability, and especially the younger ones who don’t have seniority or connections over the decades.
From purely business and career perspective being anti-blockchain/NFT/online gambling/adtech/fascism (at least for now in US)/etc. is a self-own, too.
I'm sure everybody making a choice against that knows it.
Thankfully purely business and career perspectives don't dictate everything.
If you refuse to work with AI, however, you're already significantly limiting your opportunities. And at the pace things are going, you're probably going to find yourself constrained to a small niche sooner rather than later.
There's always more shady jobs than ethically satisfying ones. There's increasingly more jobs in prediction markets and other sorts of gambling, adtech (Meta, Google). Moral compromise pays.
But if you really think about it and set limits on what is acceptable for you to work on (interesting new challenges, no morally dubious developments like stealing IP for ML training, etc.) then you simply don't have that FOMO of "I am sacrificing my career" when you screen those jobs out. Those jobs just don't exist for you.
Also, people who tag everybody like that as some sort of "anti-AI" tinfoilhatters are making a straw man argument. Most people with an informed opinion don't like the ways this tech is applied and rolled out in ways that is unsustainable and exploitative of ordinary people and open-source ecosystem, the confused hype around it, circular investment, etc., not the underlying tech on its own. Being vocally against these matters does not make one an unemployable pariah in the slightest, especially considering most jobs these days build on open source and being anti license-violating LLMs is being pro sustainable open-source.
That, and you can’t also get the amazing results if you’re poor or have bad internet.
Why do you guys always assume we don't as though the oldest models are easy to use accidentally
Good for anything >= 1 month old.
Use other nonsense fear inducing argument in the mean time, continue gathering gobs of VC money, get your bag, continue till the bubble pops.
In all fairness, and putting hype and anti-hype aside, I’m really interested to see the actual value of LLM/agent services after the VC money subsidies dry out. Would people we willing to pay for services at 10x the current price?
They ran out of believable arguments or never had any to begin with?
As it was said on a thread here, LLMs are search engines. The rest is religion.
In software, we, the developers, have increasingly been a bottleneck. The world needs WAY more software than we can economically provide, and at long last a technology has arrived that will help route around us for the benefit of humanity.
Here's an excellent Casey Handmer quote from a recent Dwarkesh episode:
> One way to think about the industrial revolutions is [...] what you're doing is you're finding some way of bypassing a constraint or bypassing a bottleneck. The bottleneck prior to what we call the Industrial Revolution was metabolism. How much oats can a human or a horse physically digest and then convert into useful mechanical output for their peasant overlord or whatever? Nowadays we would giggle to think that the amount of food we produce is meaningful in the context of the economic power of a particular country. Because 99% of the energy that we consume routes around our guts, through the gas tanks of our cars and through our aircraft and in our grids and stuff like that.
> Right now, the AI revolution is about routing around cognitive constraints, that in some ways writing, the printing press, computers, the Internet have already allowed us to do to some extent. A credit card is a good example of something that routes around a cognitive constraint of building a network of trust. It's a centralized trust.
It's a great episode, I recommend it: https://www.dwarkesh.com/p/casey-handmer
Is that really true? I'm getting the impression that most software reinvents the wheel.
Everything you wrote here is directly contradicted by casual observation of reality.
Developers aren't a bottleneck. If they were, we wouldn't be in a historic period of layoffs. And before you say that AI is causing the layoffs -- it's not. They started before AI was widely used for production, and they're also being done at companies that aren't heavily using AI anyway. They're a result of massive over-hiring during periods of low interest rates.
Beyond that, who is demanding software developers? The things that make our lives better (like digital forms at the doctor's office) aren't complex software.
The majority of the demand is from enshittification companies making our lives worse with ads and surveillance. No one is demanding developers, but certainly individual humans aren't demanding them.
The world is chock-full of important, society-scale problems that have been out of reach because the economics have made them costly to work on and therefore risky to invest in. Lowering the cost of software development de-risks investment and increases the total pool of profitable (or potentially profitable) projects.
The companies that will work on those new problems are being conceived or born right now, and [collectively] they'll need lots of AI-native software devs.
No. I agree with the author, but it's hyperbolic of him to phrase it like this. If you have solid domain knowledge, you'll steer the model with detailed specs. It will carry those out competently and multiply your productivity. However, the quality of the output still reflects your state of knowledge. It just provides leverage. Given the best tractors, a good farmer will have much better yields than a shit one. Without good direction, even Opus 4.5 tends to create massive code repetion. Easy to avoid if you know what you are doing, albeit in a refactor pass.
One is LLMs writing code. Not everything and not for everyone. But they are useful for most of the code being written. It is useful.
What it does not do (yet, if ever) is bridging the gap from "idea" to a working solution. This is precisely where all the low-code ideas of the past decades fell apart. Translating an idea in to formal rules is very, very hard.
Think of all of the "just add a button there"-type comments we've all suffered.
Sure Opus can work fully on its own by just telling it “add a button that does X”, but do that 20 times and the good turns into mush. Steer the model with detailed tech specs on the other hand, and the output becomes magical
That qualifies as a good set of hints about what the end result should be.
Being differently trained and using different tools than almost everyone else I know in engineering my entire career has allowed me to find solutions and vulnerabilities others have missed time and time again. I exclusively use open source software I can always take apart, fully understand, and modify as I like. This inclination has served me well and is why I have the skillsets I do today.
If everyone is doing things one way, I instinctively want to explore all the other ways to train my own brain to continue to be adversarial and with a stamina to do hard experiments by hand when no tools exist to automate them yet.
Watching all my peers think more and more alike actually scares me, as they are all talking to the same LLMs. None for me, thanks.
"But this magic proprietary tool makes my job so much easier!!" has never been a compelling argument for me.
Antirez + LLM + CFO = Billion Dollar Redis company, quite plausibly.
/However/ ...
As for the delta provided by an LLM to Antirez, outside of Redis (and outside of any problem space he is already intimately familiar with), an Apples to Apples comparison would be he trying this on an equally complex codebase he has no idea about. I'll bet... what Antirez can do with Redis and LLMs (certainly useful, huge Quality of Life improvement to Antirez), he cannot even begin to do with (say) Postgres.
The only way to get there with (say) Postgres, would be to /know/ Postgres. And pretty much everyone, no matter how good, cannot get there with code-reading alone. With software at least, we need to develop a mental model of the thing by futzing about with the thing in deeply meaningful ways.
And most of us day-job grunts are in the latter spot... working in some grimy legacy multi-hundred-thousand line code-mine, full of NPM vulns, schelpping code over the wall to QA (assuming there is even a QA), and basically developing against live customers --- "learn by shipping", as they say.
I do think LLMs are wildly interesting technology, however they are poor utility for non-domain-experts. If organisations want to profit from the fully-loaded cost of LLM technology, they better also invest heavily in staff training and development.
Although calling AI "just autocomplete" is almost a slur now, it really is just that in the sense that you need to A) have a decent mental picture of what you want, and, B) recognize a correct output when you see it.
On a tangent, the inability to identify correct output is also why I don't recommend using LLMs to teach you anything serious. When we use a search engine to learn something, we know when we've stumbled upon a really good piece of pedagogy through various signals like information density, logical consistency, structuredness/clarity of thought, consensus, reviews, author's credentials etc. But with LLMs we lose these critical analysis signals.
You are calling out the and subtle nuance that many don’t get…
LLMs help with that part too. As Antirez says:
Writing code is no longer needed for the most part. It is now a lot more interesting to understand what to do, and how to do it (and, about this second part, LLMs are great partners, too).
How to know the "how to do it" is sensible? (sensible = the product will produce the expected outcome within the expected (or tolerable) error bars?)
How did you ever know? It's not like everyone always wrote perfect code up until now.
Nothing has changed, except now you have a "partner" to help you along with your understanding.
Who "knows"?
It's who has a world-model. It's who can evaluate input signal against said world-model. Which requires an ability to generate questions, probe the nature of reality, and do experiments to figure out what's what. And it's who can alter their world-model using experiences collected from the back-and-forth.
As I've mentioned often, I'm solving problems in a domain I had minimal background in before. However, that domain is computer vision. So I can literally "see" if the code works or not!
To expand, I've set up tests, benchmarks and tools that generate results as images. I chat with the LLM about a specific problem at hand, it presents various solutions, I pick a promising approach, it writes the code, I run the tests which almost always pass, but if they don't, I can hone in on the problem quickly with a visual check of the relevant images.
This has allowed me to make progress despite my lack of background. Interestingly, I've now built up some domain knowledge through learning by doing and experimenting (and soon, shipping)!
These days I think an agent could execute this whole loop by itself by "looking" at the test and result images itself. I've uploaded test images to the LLM and we had technical conversations about them as if it "saw" them like a human. However, there are ton of images and I don't want to burn the tokens at this point.
The upshot is, if you can set up a way of reliably testing and validating the LLM's output, you could still achieve things in an unfamiliar domain without prior expertise.
Taking your Postgres example, it's a heavily tested and benchmarked project. I would bet someone like Antirez would be able to jump in and do original, valid work using AI very quickly, because even if hasn't futzed with Postgres code, he HAS futzed with a LOT of other code and hence has a deep intuition about software architecture in general.
So this is what I meant by the meaning of "domain expert" changing. The required skills have become a lot more fundamental. Maybe the only required skills are intuition about software engineering, critical thinking, and basic knowledge of statistics and the scientific method.
For most of us vibe coding gives 0 advantage. Our software will just sit there and get no views and producing it faster means nothing. In fact, it just scares us that some exec is gonna look at this and write us for low performance because they saw someone do the same thing we are doing in 2 days instead of 4.
Most engineers in my experience are much less skillful at reading code than writing code. What I’ve seen so far with use of LLM tools is a bunch of minimally edited LLM produced content that was not properly critiqued.
It's not conceptually challenging to understand, but time consuming to write, test, and trust. Having an LLM write these types of things can save time, but please don't trust it blindly.
IDK, just two days ago I had a bug report/fix accepted by a project which I would have never dreamt of digging into as what it does is way outside my knowledge base. But Claude got right on in there and found the problem after a few rounds of printf debugging which lead to an assertion we would have hit with a debug build which led to the solution. Easy peasy and I still have no idea how the other library does its thing at all as Claude was using it to do this other thing.
It needs a lot of work to not be skeptical, when when I try it, it generates shit, especially when I want something completely new, not existing anywhere, and also when these people when they show how they work with it, it always turns out that it’s on the scale of terrible to bad.
I also use AI, but I don’t allow it to touch my code, because I’m disgusted by its code quality. I ask it, and sometimes it delivers, but mostly not.
(If you need help finding it try visiting https://tools.simonwillison.net/hn-comments-for-user and searching for simonw - you can then search my 1,000 most recent comments in one place.)
If my tests are green then it tells me a LOT about what the software is capable of, even if I haven't reviewed every line of the implementation.
The next step is to actually start using it for real problems. That should very quickly shake out any significant or minor issues that sneaked past the automated tests.
I've started thinking about this by comparing it to work I've done within larger companies. My team would make use of code written by other teams without reviewing everything those other teams had written. If their tests passed we would build against their stuff, and if their stuff turned out not to work we would let them know or help debug and fix it ourselves.
The overwhelming majority of users use and see the benefits of AI and at the same time are fully aware that you won't move software by copy pasting a jira task and lots of thinking is involved into planning and reviewing the changes.
There's vested interests posting 20 replies in a single thread that benefits them and flagging replies that don't
There's literally 20-25% of dissenters comments in each of these posts being repeatedly flagged.
I haven't flagged or downvoted anybody and I have no vested interest in anything. Not sure what my cause should be and what would be my benefit.
My profile contains my full name, you can search me, I'm a random freelancer, not somebody with any stakes in pushing AI.
You might feel great, thats fine, but I dont. And software quality is going down, I wouldn't agree that LLMs will help write better software
Your statement makes no sense.
Even if you don't let LLMs author a single line of your code, they can still review it, find edge cases you didn't think about or suggest different approaches.
The fact that AI allows lots of slop, does not negate its overall utility in good informed hands.
Is there some metric for this?
To be fair, it's been getting worse since before LLMs were a thing.
If so then none of this matters, because it will run through that lather-rinse-repeat loop itself in less than a minute.
I work mostly in C++ (MFC applications on Windows) and assembly language (analyzing crash reports).
For the C++ work, the AIs do all kinds of unsafe things like casting away constness or doing hacks to expose private class internals. What they give me is sometimes enough to get unstuck though which is nice.
For crash reports (a disassembly around the crash site and a stack trace) they are pretty useless and that’s coming from someone who considers himself to be a total novice at assembly. (Looking to up my x64 / WinDbg game and any pointers to resources would be appreciated!)
I do prototyping in Python and Claude is excellent at that.
Yeah. Most AIs today are pretty good at detecting that they're in a loop and aren't making progress. When that happens, they either take a different approach, or stop and say they are stuck. But, if you're really worried about it, you can cap monthly spend on the billing page of virtually every AI provider.
Github stars? That's 100% marketing. Shit that clears a low quality bar can rack up stars like crazy just by being well marketed.
Number of startups? That's 100% marketing. Investors put money into products that have traction, or founders that look impressive, and both of those are mostly marketing.
People actually are vibe coding stuff rather than using SaaS though, that one's for real. Your example is hyperbolic, but the Tailwind scenario is just one example of AI putting pressure on products.
If you don't have a halo already, you need to be blessed or you're just going to suffer. Getting a good mention by someone like Theo or SimonW >> 1000 well written articles.
Another one (I've open sourced, you can check it out here https://github.com/luvchurchill/mani-gpg) A site I use (manifold.markets) announced they are getting rid of DMs due to spam (they've since brought it back) so I made an extension which makes it easy to use pgp & age encryption on the site so we can do pseudo DMs. It injects "Decrypt" buttons next to exncrypted text etc etc. You can see screenshots at https://manifold.markets/post/an-extension-to-assist-with-so...
(Look at the comments for the latest look)
Besides for that, there are a few I'm sure can be scripts
There is not bad publicity. More you spam more you will be noticed. Human attention is limited. So grab as much as you can. And also this helps your product name to get into training data and thus later in LLM outputs.
Even more ideas. When you find an email address. Spam that too. Get your message out multiple times to each address.
It's hard to disambiguate this from people who have a "fanbase." People will upvote stuff from people like simonw sight unseen without reading. I'd like to do a study on HN where you hide the author, to see how upvote patterns change, in order to demonstrate the "halo" benefit.
Falling salaries?
Remember that an average software engineer only spends around 25% of their time coding.
Instead of asking "where are the AI-generated projects" we could ask about the easier problem of "where are the AI-generated ports". Why is it still hard to take an existing fully concrete specification, and an existing test suite, and dump out a working feature-complete port of huge, old, and popular projects? Lots of stuff like this will even be in the training set, so the fact that this isn't easy yet must mean something.
According to claude, wordpress is still 43% of all the websites on the internet and PHP has been despised by many people for many years and many reasons. Why no python or ruby portage? Harder but similar, throw in drupal, mediawiki, and wonder when can we automatically port the linux kernel to rust, etc.
We have a smaller version of that ability already:
- https://simonwillison.net/2025/Dec/15/porting-justhtml/
See also https://www.dbreunig.com/2026/01/08/a-software-library-with-...
I need to write these up properly, but I pulled a similar trick with an existing JavaScript test suite for https://github.com/simonw/micro-javascript and the official WebAssembly test suite for https://github.com/simonw/pwasm
And yet it doesn't feel true yet, otherwise we'd see it. Why do you think that is?
(This capability is also brand new: prior to Claude Opus 4.5 in November I wasn't getting results from coding agents that convinced me they could do this.)
It turns out there are some pretty big problems that works for, like HTML5 parsers and WebAssembly runtimes and reduced-scoped JavaScript language interpreters. You have to be selective though. This won't work for Linux.
I thought it wouldn't work for web browsers either - one of my 2026 predictions was "by 2029 someone will build a new web browser using mostly LLM-code"[1] - but then I saw this thread on Reddit https://www.reddit.com/r/Anthropic/comments/1q4xfm0/over_chr... "Over christmas break I wrote a fully functional browser with Claude Code in Rust" and took a look at the code and it's surprisingly deep: https://github.com/hiwavebrowser/hiwave
[1] https://simonwillison.net/2026/Jan/8/llm-predictions-for-202...
If that's what is shown then why doesn't it work on anything that has a sufficiently large test-suite, presumably scaling linearly in time with size? Why should we be selective, and based on what?
This is the advice I've been giving my friends and coworkers as well for a while now. Forget the hype, just take time to test them from time to time. See where it's at. And "prepare" for what's to come, as best you can.
Another thing to consider. If you casually look into it by just reading about it, be aware that almost everything you read in "mainstream" places has been wrong in 2025. The people covering this, writing about this, producing content on this have different goals in this era. They need hits, likes, shares and reach. They don't get that with accurate reporting. And, sadly, negativity sells. It is what it is.
THe only way to get an accurate picture is to try them yourself. The earlier you do that, the better you'll be. And a note on signals: right now, a "positive" signal is more valuable for you than many "negative" ones. Read those and try to understand the what, if not the how. "I did this with cc" is much more valuable today than "x still doesn't do y reliably".
You can refuse to support it on the grounds that its being used to harm people. That might not do anything but its still important to be on the right side of humanity.
I don't condemn the tech, but the tech depends on factors that are harming people and not supporting that part of it is an act of support for humanity.
However I can’t help but notice some things that look weird/amusing:
- The exact time that many programmers were enlightened about the AI capabilities and the frequency of their posts.
- The uniform language they use in these posts. Grandiose adjectives, standard phrases like ‘it seems to me’
- And more importantly the sense of urgency and FOMO they emit. This is particularly weird for two reasons. First is that if the past has shown something regarding technology is that open source always catches up. But this is not the case yet. Second, if the premise is that we re just the in beginning all these ceremonial flows will be obsolete.
Do not get me wrong, as of today these are all valid ways to work with AI and in many domains they increase the productivity. But I really don’t get the sense of urgency.
I attribute that to the holidays. Many people finally had the time to goof around with these tools. At least that's how it happened to me.
It was an incredible experience. I implemented a few features quickly and in a much better way than I could otherwise. Realized how many tiny holes my app had and a few suboptimal patterns I was using. Made me worry about my career, initially, but after using for a while, I now see it as going up the chain of abstraction. Only thing I'm not doing is writing code by hand. Im still having to do everything else like thinking about architecture and the big picture, keeping it dry and maintainable, debugging, etc - but with a lot of help from LLMs. Sometimes it's 10x and sometimes you wasted sometime, you know, just like how using packages made us go up the chain.
Mind-boggling amount of investment needing a return or the promise of a return
I want AI that responds instantaneously, and in a manner perfectly suited to my particular learning style.
I want AI so elegant in its form and function that I completely take it for granted.
What I'm getting instead is something clunky, slow, and flawed. So excuse me while I remain firmly in the anti-AI crowd.
I leverage LLMs where it makes sense for me to do so, but let's dispense with this FOMO silliness. People who choose not to aren't missing out on anything, any more than people who choose to use stock Vim rather than VSCode aren't missing out on anything.
Using AI you're increasing the level of abstraction you can work at, and reducing the amount of detail you have to worry about. You tell the AI what you want to do, not how to do it, other than providing context that does tell it about the things that you actually care about (as much or little as you choose, but generally the more the better to achieve a specific outcome).
If it were deterministic, yes, but it's not. When I write in a high level language, I never have to check the compiled code, so this comparison makes no sense.
If we see new kinds of languages, or compile targets, that would be different.
Effort that might be put into feeling that you need to manually review all code generated might better be put into things like automating quality checks (e.g code review, adherence to guidelines) ensuring that testing is comprehensive, and overall management of the design and process into modular testable parts the same way as if you'd done it manually.
While AI is a tool, the process of AI-centric software development is better regarded as a pair-design and pair-coding process, treating the AI more like a person than a tool. A human teammate isn't deterministic either, yet if they produce working artifacts that meet interface requirements and pass unit tests, you probably aren't going to insist on reviewing all of their code.
This is not the support argument you think it is, it just further allures to the fact that people raving about AI just generate slop and either don't review their code or just send it for their coworkers to review.
I guess AI bros are just the equivalent of script-kiddies, just running shit they don't know how it works and claiming credit for it.
We have to abandon the appeal to authority and take the argument on its merits, which honestly, we should be doing regardless.
I don't really agree. In virtually any field, when those who have achieved mastery speak, others, even other masters, tend to listen. That does not mean blindly trust them. It means adjust your priors and reevaluate your beliefs.
Software development is not special. When people like antirez (redis) and simonw (django) and DHH (rails) are speaking highly of AI, and when Linus Torvalds is saying he's using AI now, suggesting they may be on to something is not an appeal to authority. And frankly, claiming that they might be saying nice things about AI because of some financial motive is crazy.
That LLMs advocates are resorting to the appeal to authority fallacy isn't a good look for them either.
Well that's a way to put it. But not everyone enjoy the art only for the results.
I personally love learning, and by letting AI drive forward and me following, I don't learn. To learn is to be human.
So saying the fun is untouched is one-sided. Not everyone is in it for the same reasons.
If I have to do all this babysitting, is it really saving me anything other than typing the code? It hasn't felt like it yet and if anything it's scary because I need to always read the code to make sure it's valid, and reading code is harder than writing it.
This is the things thar gets me the most. Code review is _hard_. So hard that I'm convinced my colleagues don't do it and just slap "LGTM" on everything.
We are trading "one writer, one reader" for "two readers", and it seems like a bad deal.
Also add the huge security gap of letting a probabilistic tool with blurry boundaries execute shell commands. Add the fact that AI is currently not being profitable, and that all major players most likely train on your code (Anthropic does).
- you should document your best practices in a file and point it to the LLM (the standards are @claude or @agent markdown files
- you should manage context (the larger it gets the weaker the output)
- you should use good and clear prompts
- you should generally make it generate a plan with the requirements (business logic changes focused) and then follow and review the implementation plan (I generally produce both in two different markdown files).
- only then you let it code
The last phase, isn't even the most important to be honest, you can do it manually. But I have found that forcing myself through the first two and having AI find information in the codebase, edge cases in the business logic, propose different solutions, evaluate the impact of the changes is a huge productivity multiplier.
Very often I'm not worn out by the coding part, again, I can do it on my own, it's the finding information and connecting the dots the hard one. In that, it excels and I would struggle (mentally) to go back to jumping from file to file while keeping track of my findings in notes to figure out the wheres, whats and whys.
Right now, there’s a limit to how widely software is adopted, largely based on software quality and cost. AI will improve software quality (for example, you can add a ton of automated tests even if you don’t use AI to develop features) and reduce the cost of building software.
That will lead to better software—and software we didn’t build in the past because it was too complex, or so niche that we weren’t sure we could make enough profit to justify the development costs. It will say also change many other industries, but I think generally for the better: more ways to create new things, more variations, and more customization for specific purposes.
I think for some who are excited about AI programming, they're happy they can build a lot more things. I think for others, they're excited they can build the same amount of things, but with a lot less thinking. The agent and their code reviewers can do the thinking for them.
The goal of the labs is to continue these leaps will get even bigger with every generation. Unless you secretly believe that some portion of the craft will be left unexplored by the labs or the things that are still relatively borked now will not be worked on or fixed later is a silly notion to me. Future versions will be easier to prompt and the tools will do more of the heavy lifting of following up and re-rolling misinterpretations. I argue that a user sleeping through all of this is likely to use a future version better than someone who is obsessing with all their assumptions on how to coerce these models to work right now, current version hyper users will likely bring unnecessary baggage imo.
For now, even with Opus 4.5 the time horizon for delivering a full-stack project is not significantly different than before, it's still limited by how much you can push it. I'd argue that someone without understanding of how things work is unlikely to succeed in getting production-grade outcomes from these current versions. The point is, if you choose to learn more and get better in understanding and building things that work (with AI or otherwise) you'll be just fine to use the versions that have fully or mostly automated the entire process. Nobody will be left behind, only those who stop building altogether.
You will still need hardware to run those open models, and that avenue is far easier to contain and close than stopping code distribution. Expect the war on private/personal compute to ramp up even more significantly than ot already has.
This doesn't make sense to me.
Surely if you were "quite involved step-by-step through the whole prototyping phase" you would have been able to prevent architectural mistakes being made?
What does your process really look like?
I don't "vibe code" in the sense that I have it build entire apps without looking at the code; I prompt it to write maybe about the 100-200 lines of code I need next after thinking about what they should look like.
I don't see how you get architectural issues creeping in if you do it that way.
I've been taking a proper whack at the tree every 6 months or so. This time it seems like it might actually fall over. Every prior attempt I could barely justify spending $10-20 in API credits before it was obvious I was wasting my time. I spent $80 on tokens last night and I'm still not convinced it won't work.
Whether or not AI is morally acceptable is a debate I wish I had the luxury of engaging in. I don't think rejecting it would allow me to serve any good other than in my own mind. It's really easy to have certain views when you can afford to. Most of us don't have the privilege of rejecting the potential that this technology affords. We can complain about it but it won't change what our employers decide to do.
Walk the game theory for 5 minutes. This is a game of musical chairs. We really wish it isn't. But it is. And we need to consider the implications of that. It might be better to join the "bad guys" if you actually want to help those around you. Perhaps even become the worst bad guy and beat the rest of them to a functional Death Star. Being unemployed is not a great position to be in if you wish to assist your allies. Big picture, you could fight AI downstream by capitalizing on it near term. No one is keeping score. You might be in your own head, but you are allowed to change that whenever you want.
Trying to beat a demon long term by making a contract with it short term?
We now have top chart hits which are soulless AI songs. It's perhaps a testament to the fact that some of these genres where this happens a lot, were already trending towards industrially produced songs with little soul in them (you know what genres these are, and it's hilarious that one of them). But most concerning to me is the idea that we'll never trust our eyes with what's true starting now.
We can't trust that someone who calls us is human, or that a photo or recording is of a real event. This was always true in some sense, but it required a ton of effort to pull off at least. Now it's going to be trivial. And for every photo depicting an actual event, there will be a thousand depicting non-events. What does that do to the most important thing we have as a society: the "shared truth"? The decay of traditional media already put a big dent in this - with catastrophic results. Ai will make it 10x worse.
You’re right, there’s no end of things that are legal (except by making those things illegal).
Not in this climate. The laws are being circumvented by criminals. Everything is different now. You can tell yourself all you want that its ok to support a technology that is being used to enslave us, but its not going to change the outcome, we are still being harmed by people that have control over the technology.
The best thing to do right now is to stop supporting the tech where its being used by corporations that are in the business of harming the people by there actions and inaction.
And destroying the gaming industry and altering the energy grid and pooping on the environment
I always looked up to antirez. Redis was really taking off after I graduated and I was impressed by the whole system and the person behind it. I was impressed to see them walk away to do something different after being so successful. I was impressed to read their blog about tackling difficult problems and how they solved them.
I'm not a 10x programmer. I don't chase MVPs or shipping features. I like when my manager isn't paying attention and I can dig into a problem and just try things out. Our database queries have issues? Maybe I can write my own AST by parsing just part of the code. Things like that.
I love BUILDING, not SHIPPING. I learn and grow when I code. Maybe my job will require me to vibe code everything some day just to keep up with the juniors, but in my free time I will use AI only enough to help speed up my typing. Every vibe coded app I've made has been unmaintainable spaghetti and it takes the joy out of it. What's the point of that?
To bring it all together, I guess some part of me was disappointed to see a person that I considered a really good programmer, seem to indicate that they didn't care about doing the actual programming?
> Writing code is no longer needed for the most part
> As a programmer, I want to write more open source than ever, now.
This is the mentality of the big companies pushing AI. Write more code faster. Make more things faster. Get paid the same, understand less, get woken up in the middle of the night when your brittle AI code breaks.
Maybe that's why antirez is so prolific and I'm not.
Sometimes I wish I was a computer scientist, instead of a programmer...
My take on this is that we as a society are now on the verge of transitioning towards programming as an art form. And the methodologies of art vs non art programming are vastly different.
Take clothes, for example. Manufacturing is vastly optimized for throughput, but its art form is heavily optimized for design and customization. Maybe that is what all this is about now with programming, too?
I too would think of myself as someone who likes to code for the sake of explorative understanding and optimization. I'm pretty bad at the last 10%, like _reeeally_ bad actually.
But I am aware that the methodology of programming is changing. And currently I believe that design and customization might in parts also change, because a lot of LLM- / slop-coded successful projects were optimizing for something like text-in-the-loop where they started with a terminal CLI and made it a real design later, because the LLM agent was able to parse and understand CLI / TTY characters.
Maybe this is what it's actually about. Maybe we need to optimize things for text now so that LLMs can help us more in these topics?
I'm thinking lately a lot about scene graphs and event graphs and how to make them serializable so that I can be more efficient in generating UIs. Sorry for babbling, maybe these are just thoughts I'm gonna regret in the future.
It already was. This just makes it a subscription service.
I wonder if I’m the odd one out or if this is a common sentiment: I don’t give a shit about building, frankly.
I like programming as a puzzle and the ability to understand a complex system. “Look at all the things I created in a weekend” sounds to me like “look at all the weight I moved by bringing a forklift to the gym!”. Even ignoring the part that there is barely a “you” in this success, there is not really any interest at all for me in the output itself.
This point is completely orthogonal to the fact that we still need to get paid to live, and in that regard I'll do what pays the bills, but I’m surprised by the amount of programmers that are completely happy with doing away with the programming part.
I enjoy those things about programming too, which is why I'm having so much fun using LLMs. They introduce new layers of complex system understanding and problem solving (at that AI meta-layer), and let me dig into and solve harder and more time-consuming problems than I was able to without them.
This is not my experience at all. My experience is that the moment I stop using them as google or search on steroids and let them generate code, I start losing the grip of what is being built.
As in, when it’s time for a PR, I never feel 100% confident that I’m requesting a review on something solid. I can listen to that voice and sort of review myself before going public, but that usually takes as much time as writing myself and is way less fun, or I can just submit and be dishonest since then I’m dropping that effort into a teammate.
In other words, I feel that the productivity gain only comes if you’re willing to remove yourself from the picture and let others deal with any consequence. I’m not.
Maybe a factor here is that I've invested a huge amount of effort over the last ~10 years in getting better at reading code?
I used to hate reading code. Then I found myself spending more time in corporate life reviewing code then writing it myself... and then I realized the huge unlock I could get from using GitHub search to find examples of the things I wanted to do, I'd only I could overcome my aversion to reading the resulting search results!
When LLMs came along they fit my style of working much better than they would have earlier in my career.
The point is exactly that, that ai feels like reviewing other people’s code, only worse because bad ai written code mimics good code in a way that bad human code doesn’t, and because you don’t get the human factor of mentoring someone when you see they lack a skill.
If I wanted to do that for a living it’s always been an option, being the “architect” overseeing a group of outsourced devs for example. But I stay as individual contributor for doing quite different work.
Yeah, that's a good way to put it.
I've certainly felt the "mimics good code" thing in the past. It's been less of a problem for me recently, maybe because I've started forcing Claude Code into a red/green TDD cycle for almost everything which makes it much less likely to write code that it hasn't at least executed via the tests.
The mentoring thing is really interesting - it's clearly the biggest difference between working with a coding agent and coaching a human collaborator.
I've managed to get a weird simulacrum of that by telling the coding agents to take notes as they work - I even tried "add to a til.md document of things you learned" on a recent project - and then condensing those lessons into an AGENTS.md later on.
The naive view considers only the small scale ease of completing a task in isolation and expects compensation to be proportional to it. But that's not how things work. Yes abstraction makes individual tasks easier to complete, but with the extra time available more can be done, and as more is done and can be done, new complexities emerge. And as an individual can do more, the importance of trust grows as well. This is why CEO's make disproportionately more than their employees, because while the complexity of their work may scale only linearly with their position, or not at all even beyond a certain point, the impact of their decisions grows exponentially.
LLM's are just going to enhance the power and influence of software developers.
The managerial class believes that all the value in a business comes from managerial work. LLMs are being hyped by the managerial class because they are turning software development into managerial work and eliminating "programmer" as a professional category. The key insight Milt Bryce had with PRIDE is that software is a product that can be manufactured just like any other product. The ideal software production workflow is that of a factory, and the ideal factory is staffed by no more than a man and a dog—in other words, fully automated.
So the rules of business in your father's or grandfather's time prevail once again. It's up or out. Learn people skills, learn the business, and take on more responsibilities putting those skills to use and fewer responsibilities involving code. Or find yourself increasingly irrelevant.
You don't have to consider the feelings of your coding agent, or their specific taste, or what challenges would best help them advance in their skills or career.
You tell them to do something, and if they do it wrong you tell them what to fix, and you can keep on hammering away at them until you get the right result.
If they go too far off the tracks you reboot them with a clean slate and set them on the task again in a different direction.
The great thing about working with LLMs, from a business perspective—or at least the promise—is that you, as a programmer/software engineer, don't need to be building the software at all. A director on the business side could be telling the agents what to do just as they would tell a development division within the company, see it done with far less pushback and at far less cost, and stay focused on their business responsibilities like devising or implementing organizational strategy to align core competencies and achieve synergy. So again, programmers will need to transition to becoming businesspeople in order to keep their relevance within the company.
They did the same with Upton Sinclair's quote, which is now used against any worker who dares to hope for salary.
There is not much creativity in the pro-LLM faction, which is guided by monetary interests and does not mind to burn its social capital in exchange for loss of credibility and money.
Often while trying to fall asleep, I'll be thinking something like "I need my app to do such and such".
The next day, instead of forcing myself to start coding, I can literally say to Intelij Junie (using Claude), exactly that: "I need my app to do such and such". I'm often pleasantly surprised by the outcome. And if there's anything that needs to be tweaked, I'm now in the mode of critiquing and editing.
It strikes me that if you developed your skill set around using AI more effectively, you could have both developed a deep understanding and gotten what you wanted, and done it in less time and at higher quality than you could have done solo.
That said, the fact that you can use AI in an unskilled way to produce something kinda cool... is itself kinda cool! It means there's an on-ramp to using AI! People with no skills can get started, same day, and make stuff. And over time, can learn to make even better stuff! That's pretty cool to me.
Group 1 is untouched since they were writing code for the sake of writing and they have the reward of that altruism.
Group 2 are those that needed their projects to bring in some revenue so they can continue writing open-source.
Group 3 are companies that used open-source as a way to get market share from proprietary companies, using it more in a capitalistic way.
Overtime, I think groups 2 and 3 will leave open-source and group 1 will make up most of the open-source contributors. It is up to you to decide if projects like Redis would be built today with the monetary incentives gone.
It reads like someone with (good) English as a second language. LLMs don't write like that.
AI is also automation but the instructions are given in a higher level language. You still have to know how to automate it. You need to instruct the machine in sufficient detail, and if done correctly the machine will once again be able to interpret your intention, transform it to a lower level code, and execute it for you.
This does not actually follow from the way LLMs work.
And then goes on describing two things for which I bet almost anyone with enough knowledge of C and Redis could implement a POC in... Guess what? Hours.
At this point I am literally speechless, if even Antirez falls for this "you get so quick!!!" hype.
You get _some_ speed up _for things you could anyway implement_. You get past the "blank screen block" which prevents you from starting some project.
These are great useful things that AI does for you!
Shaving off _weeks_ of work? Let's come back in a couple of month when he'll have to rewrite everything that AI has written so well. Or, that code would just die away (which is another great use case for AI: throw away code).
People still don't understand that writing code is a way to understand something? Clearly you don't need to write code for a domain you already understand, or that you literally created.
What leaves me sad is that this time it is _Antirez_ that writes such things.
I have to be honest: it makes me doubt of my position, and I'll constantly reevaluate it. But man. I hope it's just a hype post for an AI product he'll release tomorrow.
I hope AI leads to a Cambrian explosion of software people running their own businesses, given the force multiplier it affords. On the other hand, the jaded part of me feels that AI may lead to a consolidation into a very small set of monopolies. We'll see.
I worry less about the model access and more about the hardwire required to run those models (i.e. do inference).
If a) the only way to compete in software development in the future is to outsource the entire implementation process to one of a few frontier models (Chinese, US or otherwise)
and b) only a few companies worldwide have the GPU power to run inference with those models in a reasonable time
then don't we already have a massive amount of centralization?
That is also something I keep wondering with agentic coding - being able to realize your epic fantasy hobby project you've on and off been thinking about for the last years in a couple of afternoons is absolutely amazing. But if you do the same with work projects, how do you solve the data protection issues? Will we all now just hand our entire production codebases to OpenAI or Anthropic etc and hope their pinky promises hold?
Or will there be a race for medium-sized companies to have their own GPU datacentets, not for production but solely for internal development and code generation?
But then how will we review each PR enough to have confidence in it?
How will we understand the overall codebase too after it gets much bigger?
Are there any better tools here other than just asking LLMs to summarize code, or flag risky code... any good "code reader" tools (like code editors but focused on this reading task)?
Seriously? If these were open source tools that anyone could run on their home PC that statement would make sense, but that's not what we are talking about here. LLMs are tools that cost massive amounts of money to operate, apparently. The tool goes away if the money goes away. Fossil fuels revolutionized the world, but only because the cost benefit made sense (at least in the relative short-term).
What's missing is (captured) the test of the changed software to verify the fixes solved the problem and no other problems where introduced ....
Then a analysis of the original software changes. An analysis of the test results, test cases, test evidence to ensure it is appropriate and adequate.
Notwithstanding the above, to my understanding LLM services are currently being sold below cost.
If all of the above is true, at some point the degredation of quality in codebases that use these tools will be too expensive to ignore.
Really, one of the first things he said, sums it up:
> facts are facts, and AI is going to change programming forever.
I have been using it in a very similar manner to how he describes his workflow, and it’s already greatly improved my velocity and quality.
I also can relate to this comment:
> I feel great to be part of that, because I see this as a continuation of what I tried to do all my life: democratizing code, systems, knowledge.
UBI gives government more control over individuals' finances, especially those without independent means. Poverty is also the result of unfair taxation, where poor people face onerous taxes while receiving less and less in return, and the wealthy avoid tax at every turn. Or that it is difficult for people to be self-employed due to red tape favouring big business. UBI does not address those issues.
UBI also centralises control at the expense of local self-determination and community engagement.
UBI potentially leads to inflation. If everyone has X amount of income then rents and prices go up accordingly.
Taxation is totally unfair. 20% of most of what we buy here is going into government coffers, raising our cost of living. We get less and less in return as public services are slashed. Add onto that other taxes, and it is the government, not just corporations who are major instigators of debt and the poverty trap...
You don’t spend weeks explaining intent, edge cases, or what I really meant to a developer. You iterate 1:1 with the system and adjust immediately when something feels off.
I’m starting to think of AI use more like a dietary choice. Most people are omnivores. Some people are vegans. Others are maxing protein. All of them can coexist in society and while they might annoy each other if the topic comes up, for the most part it’s a personal choice.
Every now and then I post the same exact comment here on HN, where the heck are the products then? Or where is the better outcome? The faster software? Let alone small team competing with bigger companies?
We are NOT anti AI we're exhausted to keep reading bs from ai astroturfers or wanna be ai tech influencers. It's so exhausting it's always your fault that you're not "using the tool properly", and you're going to be left behind. I'm not anti AI I just wish the bubble will pop so instead of fighting back bs from managers that "I read that on HN" I can go back coding with and without ai where applies to my needs
how AI speeds me up:
- no longer have to remember how to set up unit test boilerplate in each of the 6ish programming languages i commonly use
- can often vaguely gesture at an existing pattern and have AI "copy-paste" it into new code. "do that read-through cache pattern like you see there and there but do it for this table and this proto msg type."
- can quickly answer questions like "does anyone in the code seem to build this string manually instead of using the library/helper method for it"
- can quickly generate code like "all I want is a gosh dang PKCS-formatted key, why is that so hard for this library" which the docs did not provide
which is really cool. it absolutely speeds things up by 10-100x in some scenarios. a lot of the sucky parts of programming are being mired down in these kinds of messes.
how AI slows me down:
- have to explain to jr dev why, even though it has unit tests, the AI-generated bespoke mutex async cache is not going into our production codebase
- have to explain to PM why I cannot let them vibe code new features into the hot path of our prod services when they are not on-call to be forced to clean it up when it explodes at 3am
- have to explain to senior dev who should REALLY know better why you cannot _just_ ask someone to review a 2000 LOC PR
- have to explain to CEO in tremendous itemized, evidenced detail why [big project in eye of sauron] did not go noticeably faster than it did 2 years ago even though the team was hand-picked to be full of people he knew would use AI as much as he wanted them to.
- have to explain to CEO why I really wish he would stop playing with AI and bothering the crap out of the engineers and go back to actually doing whatever it is the CEO gets paid 10-100x what a software engineer salary to do. [actually still trying to figure this one out without getting fired.]
I'm as interested in AI use as anyone can be, when I have to put up with sycophantic "believers" who really wish they could replace me entirely with the chatbot.
Also, this shit is expensive and still being sold at a loss. I signed up for Amp and blew through my $10 of signup credit getting very little done. I'm certainly not paying my own money for that.
There really should be a label on the product to let the consumer know. This should be similar to Norway that requires disclosure of retouched images. No other way can I think of to help body image issues arising from pictorial people and how they never can being in real life.
And this is Hacker News, which you might expect to attract people who thrive on exploring the edges of weird new technology!
I don't believe that AI will put most of the working force out of jobs. That would be so different from what we had in history that I think the chances are minimal. However, they are not zero, and that is scary as fuck for a lot of people.
> However, this technology is far too important to be in the hands of a few companies.
I wholeheartedly agree 1000%. Something needs to change this landscape in the US.
Furthermore, the entire open source models being dominated by China is also problematic.
Nope. It was coding. Enjoying the process itself.
If I wanted to hand out specs and review code (which is what an AI jockey does), I'd be having fucking project managers as role models, not coders...
At it's core, AI has capability to extract structure/meaning from unstructured content and vice-versa. Computing systems and other machines required inputs with limited context. So far, it was a human's job to prepare that structure and context and provide it to the machines. That structure can be called as "program" or "form data" or "a sequence of steps or lever operations or button presses".
Now the machines got this AI wrapper or adapter that enables them to extract the context and structure from the natural human-formatted or messy content.
But all that works only if the input has the required amount of information and inherent structure to it. Try giving a prompt with jumbled up sequence of words. So it's still the human jobs to provide that input to the machine.
If A.I writes everything for you - cool, you can produce faster ? but is it really true ? if you're renting capacity ? what if costs go up, now you can't rent anymore - but you can't code anymore, the documentation is no longer there - coz mcp etc assumption that everything will be done by agents then what ?
what about the people that work on messy 'Information Systems' - things like redis - impressive but it's closed loop software just like compilers -
some smart guy back in the 80s - wrote it's always a people problem -
And no, my work as redteam IT sec. is completely unrelated :D
So my prediction is that any software worth scanning by redteams will become more secure. Not less.
If programmers keep up good coding practices then the vibe coders are the ones left behind
I would draw an analogy here between building software and building a home.
When building a home we have a user providing the requirements, the architect/structural engineer providing the blueprint to satisfy the reqs, the civil engineer overseeing the construction, and the mason laying the bricks. Some projects may have a project-manager coordinating these activities.
Building software is similar in many aspects to building a structure. If developers think of themselves as a mason they are limiting their perspective. If AI can help lay the bricks use it ! If it can help with the blueprint or the design use it. It is a fantastic tool in the tool belt of the profession. I think of it as a power-tool and want to keep its batteries charged to use it at any time.
The reason I am anti-AI is because I believe it poses a net-negative to society overall. Not because it is inherently bad, but because of the way it is being infused into society by large corps (and eventually governments). Yes, it makes me, and other developers, more productive. And it can more quickly solve certain problems that were time consuming or laborious to solve. And it might lead to new and greater scientific and technological advances.
But those gains do not outweigh all of the negatives: concentration of power and capital into an increasingly small group, the eventual loss of untold millions of jobs (with, as of yet, not even a shred of indication of what might be replace them), the loss of skills in the next generations who are delegating much of their critical thinking (or thinking period), to ChatGPT; the loss of trust in society now that any believable video can be easily generated; the concentration of power in the the control of information if everyone is getting their info from LLMs instead of the open internet (and ultimately, potentially the death of the open internet); the explosion in energy consumption by data centers which exacerbates rather than mitigates global warming; and plenty more.
AI might allow us to find better technological solutions to world hunger, poverty, mental health, water shortages, climate change, and war. But none of those problems are technological problems; technology only plays a small part. And the really important part is being negatively exacerbated by the "AI arms race". That's why I, who was my whole life a technological optimist, am no longer hopeful for the future. I wish I was.
It's obvious that AI, if it succeeds, will be primarily used to make people, even as physical beings, redundant.
From TFA:
> the more people get fired, the more political pressure there will be to vote for those who will guarantee a certain degree of protection
This is daydreaming. Just look at the US. "Political pressure" is not a thing.
There will be war.
That's fine if he feels that way, but he can only speak for himself, not for all the copyright holders of the other code that was "ingested" to power LLMs.
If you want to see how most creators who care about their work and actually own it (unlike most software), look at many book authors and illustrators. Many of whom have a burning hatred for AI bros not only stealing their work, but also then using it to destroy the livelihoods of their field.
A lot of the techbros who do care about their work aren't feeling as wronged or threatened, because we're trying to pivot to get a piece of the pie, from all the exploitation and pillaging of many fields.
The closest is probably music sampling, which has had a very robust money-based licensing scheme built around it for many years.
This is the crux. AI suddenly became good and society hasn't caught on yet. Programmers are a bit ahead of the curve here, being closer to the action of AI. But in a couple of years, if not already, all the other technical and office jobs will be equally affected. Translators, admin, marketing, scientists, writers of all sorts and on and on. Will we just produce more and retain a similar level of employment, or will AI be such a force multiplier that a significant number or even most of these jobs will be gone? Nobody knows yet.
And yet, what I'm even more worried about for their society upending abilities, is robots. These are coming soon and they'll arrive with just as much suddeness and inertia as AI did.
The robots will be as smart as the AI running them, so what happens when they're cheap and smart enough to replace humans in nearly all physical jobs?
Nobody knows the answer to this. But in 5 years, or 10, we will find out.
I assure you that this isn't anything like the level before.
Lawyering has changed forever.
I don't agree, unless by "art industry" what you actually mean is "art establishment".
If we broaden it to mean "anywhere that money is paid, or used to be paid, to people for any kind of artistic endeavor" - even if we limit that to things related to drawing, painting, illustrating, graphic design, 3d design etc. - then AI is definitely replacing or augmenting a ton of human work. Just go on any Photoshop forum. It's all about AI now, just like everywhere else.
There is enough evidence to support claims that AI is a black hole where money gets evaporated.
It’s great that you can delegate some tasks to it now and not have to write all of the code yourself. There is some evidence showing that it doesn’t benefit junior developers nearly as much. If you didn’t generate the specification test that demonstrates the concurrency issue you were trying to solve in Redis but you read the code it generated and understood it then you didn’t need to learn anything. How is a junior developer who has never solved such problems supposed to learn so they can do the same thing?
But worse, UBI and such are the solutions of libertarian oligarchs that dream of a world without people, according to Doctorow and I think he’s right. It seems like the author also wants this? He doesn’t seem to know what will happen to the jobless but we should vote in some one who will start a government program to take care of them. How long until the author is replaced as well?
Lastly… who’s “hyping” anti-AI and what do they gain from making false claims?
I think the real problem for programming is when these companies all collapse and take the rest of the economy down with them… are there going to be enough programmers left to maintain everything? Or will we be sifting though the mountains of tech debt never to see the light of day again?
Imo its to hard for companies to get infra into a place where text can be an interface. IaC is mostly an aspiration beyond a certain scale ime, which is close enough to interacting with infra through text.
There is no way I can convince a user that my vibe coded version of Todolist is better than 100 other made this week
This is already happening.
AI had an impact on simplest coding first, this is self-evident. So any impact it had, had to be on the quantity of software created, and only then on its quality and/or complexity. And mobile apps are/were a tedious job with a lot of scaffolding and a lot of "blanks to fill" to make them work and get accepted by stores. So first thing that had to skyrocket in numbers with the arrival of AI, had to be mobile apps.
But the number of apps on Apple Store is essentially flat and rate of increase is barely distinguishable from the past years, +7% instead of +5%. Not even visible.
Apparently the world doesn't need/can't make monetisable use of much more software than it already does. Demand wasn't quite satisfied say 5 years ago, but the gap wasn't huge. It is now covered many times over.
Which means, most of us will probably never get another job/gig after the current one - and if it's over, it's over and not worth trying anymore - the scraps that are left of the market are not worth the effort.
You will not find such a government. They're here for a different purpose
If I can run an agent on my machine, with no remote backend required, the problem is solved. But right now, aren't all developers throwing themselves into agentic software development betting that these services will always be available to them at a relatively low cost?
Show me these "facts"
One of the big red flags I see around the pro-AI side is this constant desire to promote the technology. At least the anti-ai side is reactionary.
Maybe he's a generous person.
You can't just hand-wavily say "a bigger percentage of programmers is using AI with success every day" and not give a link to a study that shows it's true
as a matter of fact, we know that a lot of companies have fired people by pretending that they are no longer needed in the age of AI... only to re-hire offshored people for much cheaper
for now, there hasn't been a documented sudden increase in velocity / robustness for code, a few anecdotical cases sure
I use it myself, and I admit it saves some time to develop some basic stuff and get a few ideas, but so far nothing revolutionary. So let's take it at face value:
- a tech which helps slightly with some tasks (basically "in-painting code" once you defined the "border constraints" sufficiently well)
- a tech which might cause massive disruption of people's livelihoods (and safety) if used incorrectly, which might FAR OUTWEIGH the small benefits and be a good enough reason for people to fight against AI
- a tech which emits CO2, increases inequalities, depends on quasi slave-work of annotators in third-world countries, etc
so you can talk all day long about not dismissing AI, but you should take it also with everything that comes with it
2. The US alone air conditioning usage is around 4 times the energy / CO2 usage of all the world data centers (not just AI) combined together. AI is 10% of the data centers usage, so just AC is 40 times that.
# Fact-Checking This Climate Impact Claim
Let me break down this claim with actual data:
## The Numbers
*US Air Conditioning:* - US A/C uses approximately *220-240 TWh/year* (2020 EIA data) - This represents about 6% of total US electricity consumption
*Global Data Centers:* - Estimated *240-340 TWh/year globally* (IEA 2022 reports) - Some estimates go to 460 TWh including cryptocurrency
*AI's Share:* - AI represents roughly *10-15%* of data center energy (IEA estimates this is growing rapidly)
## Verdict: *The claim is FALSE*
The math doesn't support a 4:1 ratio. US A/C and global data centers use *roughly comparable* amounts of energy—somewhere between 1:1 and 1:1.5, not 4:1.
The "40 times AI" conclusion would only work if the 4x premise were true.
## Important Caveats
1. *Measurement uncertainty*: Data center energy use is notoriously difficult to measure accurately 2. *Rapid growth*: AI energy use is growing much faster than A/C 3. *Geographic variation*: This compares one country's A/C to global data centers (apples to oranges)
## Reliable Sources - US EIA (Energy Information Administration) for A/C data - IEA (International Energy Agency) for data center estimates - Lawrence Berkeley National Laboratory studies
The quote significantly overstates the disparity, though both are indeed major energy consumers.
2. it's not because the US is incredibly bad at energy spending in AC that it somehow justifies the fact that we would add another, mostly unnecessary, polluting source, even if it's slightly lower. ACs have existed for decades. AI has been exploding for a few years, so we can definitely see it go way, way past the AC usage
also the idea is of "accelerationnism". Why do we need all this tech? What good does it make to have 10 more silly slop AI videos and disinformation campaigns during election? Just so that antirez can be a little bit faster at doing his code... that's not what the world is about.
Our world should be about humans, connecting together (more slowly, not "faster"), about having meaningful work, and caring about planetary resources
The exact opposite of what capitalistic accelerationism / AI is trying to sell us
> Why do we need all this tech?
Slightly odd question to be asking here on Hacker News!
That doesn't mean that we have to accept claims that LLMs drastically increase productivity without good evidence (or in the presence of evidence to the contrary). If anything, it means the opposite.
My own personal experience supports that too.
If you're determined to say "I refuse to accept appeal to authority here, I demand a solution to the measuring productivity problem first" then you're probably in for a long wait.
The problem is that we know that developers' - including experienced developers' - subjective impressions of whether LLMs increase their productivity at all is unreliable and biased towards overestimation. Similarly, we know that previously the claims of massive productivity gains were false (no study reputable showed even a 50% improvement, let alone the 2x, 5x, 10x, etc that some were claiming, indicators of actual projects shipped were flat, etc). People have been making the same claims for years at this point, and every time when we actually were able to check, it turned out they were wrong. Further, while we can't check the productivity claims (yet) because that takes time, we can check other claims (e.g. the assertion that a model produces code that doesn't need to be reviewed by a human anymore), and those claims do turn out to be false.
> If you're determined to say "I refuse to accept appeal to authority here, I demand a solution to the measuring productivity problem first" then you're probably in for a long wait.
Maybe, but my point still stands. In the absence of actual measurement and evidence, claims of massive productivity gains do not win by default.
> Slightly odd question to be asking here on Hacker News!
It's absolutely not? The first line of question when you work in a domain SHOULD BE "why am I doing this" and "what is the impact of my work on others"
Labor is worth less, capital and equity ownership make more or the same
I continue to hope that we see the opposite effect: the drop of cost in software development drives massively increased demand for both software and our services.
I wrote about that here: https://simonwillison.net/2026/Jan/8/llm-predictions-for-202...
I want to write less, just knowing that LLM models are going to be trained on my code is making me feel more strongly than ever that my open source contributions will simply be stolen.
Am I wrong to feel this? Is anyone else concerned about this? We've already seen some pretty strong evidence of this with Tailwind.
I know the GPL didn't have a specific clause for AI, and the jury is still out on this specific case (how similar is it to a human doing the same thing?), but I like to imagine, had it been made today, there probably would be a clause covering this usage. Personally I think it's a violation of the spirit of the license.
There are non-US jurisdictions where you have some options, but since most of them are trained in the US that won't help much.
They can claim whatever they want. You can still try to stop it via lawsuits and make them claim it in court. Granted, I believe there's already been some jurisdictions that have sided with fair use in those particular cases.
Strict copyright enforcement is a competitive disadvantage. Western countries lobbied for copyright enforcement in the 20th century because it was beneficial. Now the tables have turned, don't hold your breath for copyright enforcement against the wishes of the markets. We are all China now.
That the LLM itself is not allowed to produce copyrighted work (e.g. just copies of works or too structurally similar) without using a license for that work is something that is probably currently law. They are working around this via content filters. They probably also have checks during/after training that it does not reproduce work that is too similar. There are law suits about this pending if I remember correctly e.g. with the New York Times.
LLMs themselves are compressed models of the training data. The trick is the compression is highly lossy by being able to detect higher-order patterns instead of fucusing on the first-order input tokens (or bytes). If you look at how, for example, any of the Lempel-Ziv algorithms work, they also contain patterns from the input and they also predict the next token (usually byte in their case), except they do it with 100% probability because they are lossless.
So copyright should absolutely apply to the models themselves and if trained on AGPL code, the models have to follow the AGPL license and I have the right to see their "source" by just being their user.
And if you decompress a file from a copyrighted archive, the file is obviously copyrighted. Even if you decompress only a part. What LLMs do is another trick - by being lossy, they decompress probabilistically based on all the training inputs - without seeing the internals, nobody can prove how much their particular work contributed to the particular output.
But it is all mechanical transformation of input data, just like synonym replacement, just more sophisticated, and the same rules regarding plagiarism and copyright infringement should apply.
---
Back to what you said - the LLM companies use fancy language like "artificial intelligence" to distract from this so they can they use more fancy language to claim copyright does not apply. And in that case, no license would help because any such license fundamentally depends on copyright law, which as they claim does not apply.
That's the issue with LLMs - if they get their way, there's no way to opt out. If there was, AGPL would already be sufficient.
An open question would be if there is some degree of "loss" where copyright no longer applies. There is probably case law about this in different jurisdictions w.r.t. image previews or something.
There should be a system which protects all work (intellectual and physical) and makes sure the people doing it get rewarded according to the amount of work and skill level. This is a radical idea and not fully compatible with capitalism as implemented today. I have a lot on my to-read list and I don't think I am the first to come up with this but I haven't found anyone else describing it, yet.
And maybe it's broken by some degenerate case and goes tits up like communism always did. But AFAICT, it's a third option somewhere in between, taking the good parts of each.
For now, I just wanna find ways to stop people already much richer than me from profiting from my work without any kind of compensation for me. I want inequality to stop worsening but OTOH, in the past, large social change usually happened when things got so bad people rejected the status quo and went to the streets, whether with empty hands or not. And that feels like where we're headed and I don't know whether I should be exited or worried.
At some point, I'll have to look it up because if that's right, the billionaires and wannabe-trillionaires owe me a shitton of money.
I suppose the question is when does a machine applied transformation become a new work?
They cannot violate the license, because in their view they have not licensed anything from you.
I think that's horse shit, and a clear violation of the intellectual property rights that are supposed to protect creatives from the business boys, but apparently the stock market must grow.
(I didn't come up with this quote but I can't find the source now. If anything good comes out of LLMs, it's making me appreciate other people's more and trying to give credit where it's due.)
NVidia is a shovel-maker worth a few trillion dollars...
I haven't seen this argument made elsewhere, it would be interesting to get it into the courtrooms - I am told cases are being fought right now but I don't have the energy to follow them.
Plus as somebody else put it eloquently, it's labor theft - we, working programmers, exchanged out limited lifetime for money (already exploitative) in a world with certain rules. Now the rules changed, our past work has much more value, and we don't get compensated.
https://en.wikipedia.org/wiki/Idea–expression_distinction
https://en.wikipedia.org/wiki/Structure,_sequence_and_organi...
https://en.wikipedia.org/wiki/Abstraction-Filtration-Compari...
In a court of law you're going to have to argue that something is an expression instead of an idea. Most of what LLMs pump out are almost definitionally on the idea side of the spectrum. You'd basically have to show verbatim code or class structure at the expressive level to the courts.
There's a couple issues I see:
1) All of the concepts were developed with the idea that only humans are capable of certain kinds of work needed for producing IP. A human would not engage in highly repetitive and menial transformation of other people's material to avoid infringement if he could get the same or better result by working from scratch. This placed, throughout history, an upper limit on how protective copyright had to be.
Say, 100 years ago, synonym replacement and paraphrasing of sentences were SOTA methods to make copies of a book which don't look like copies without putting in more work than the original. Say, 50 years ago, computers could do synonym replacement automatically so it freed up some time for more elaborate restructuring of the original work and the level of protection should have shifted. Say, 10 years ago, one could use automatic replacement of phrases or translation to another language and back, freeing up yet more time.
The law should have adapted with each technological step up and according to your links it has - given the cases cited. It's been 30 years and we have a massive step up in automatic copying capabilities - the law should change again to protect the people who make this advancement possible.
Now with a sufficiently advanced LLM trained on all public and private code, you can prompt them to create a 3D viewer for Quake map files and I am sure it'll most of the time produce a working program which doesn't look like any of the training inputs but does feel vaguely familiar in structure. Then you can prompt it to add a keyboard-controlled character with Quake-like physics and it'll produce something which has the same quirks as Quake movement. Where did bunny hopping, wallrunning, strafing, circlejumps, etc. come from if it did not copy the original and the various forks?
Somebody had to put in creative work to try out various physics systems and figure out what feels good and what leads to interesting gameplay.
Now we have algorithms which can imitate the results but which can only be created by using the product of human work without consent. I think that's an exploitative practice.
2) It's illegal to own humans but legal to own other animals. The USA law uses terms such as "a member of the species Homo sapiens" (e.g. [0]) in these cases.
If the legality of tech in question was not LLMs but remixing of genes (only using a tiny fraction of human DNA) to produce a animals which are as smart as humans with chimpanzee bodies which can be incubated in chimpanzee females but are otherwise as sentient as humans, would (and should) it be legal to own them as slaves and use them for work? It would probably be legal by the current letter of the law but I assure you the law would quickly change because people would not be OK with such overt exploitation.
The difference is the exploitation by LLM companies is not as overt - in fact, mane people refer to LLMs as AIs and use pronouns such as "he" or "she", indicating them believe them to be standalone thinking entities instead of highly compressed lossy archives of other people's work.
3) The goal of copyright is progress, not protection of people who put in work to make that progress possible. I think that's wrong.
I am aware of the "is" vs "should" distinction but since laws are compromises between the monopoly in violence and the people's willingness to revolt instead of being an (attempted) codification of a consistent moral system, the best we can do is try to use the current laws (what is) to achieve what is right (what should be).
[0]: https://en.wikipedia.org/wiki/Unborn_Victims_of_Violence_Act
The idea of wallrunning should not be protected by copyright.
The quake special behaviors are results of essentially bugs which were kept because it led to fun gameplay. The model would almost certainly generate explicit handling for these behaviors because the original quake code is very obviously not the only reasonable way to do it. And in that case the model and its output is derivative work of the training input.
The issue is such an experiment (training a model with specific content excluded) would cost (tens/hundreds of?) millions of dollars and the only companies able to do it are not exactly incentivized to try.
---
And then there's the thing that current LLMs are fundamentally impossible to create without such large amounts of code as training data. I honestly don't care what the letter of the law is, to any reasonable person, that makes them derivative work of the training input and claiming otherwise is a scam and theft.
I always wonder if people arguing otherwise think they're gonna get something out of it when the dust settles or if they genuinely think society should take stuff from a subgroup of people against their will when it can to enrich itself.
That said, this comment is funny to me because I’ve done the same thing too, take some signal of disagreement, and assume the signal means I’m right and there’s a low-key conspiracy to hold me down, when it was far more likely that either I was at least a bit wrong, or said something in an off-putting way. In this case, I tend to agree with the general spirit of the sibling comment by @williamcotton in that it seems like you’re inventing some criteria that are not covered by copyright law. Copyrights cover the “fixation” of a work, meaning they protect only its exact presentation. Copyrights do not cover the Madlibs or Cliff Notes scenarios you proposed. (Do think about Cliff Notes in particular and what it implies about AI - Cliff Notes are explicitly legal.)
Personally, I’ve had a lot of personal forward progress on HN when I assume that downvotes mean I said something wrong, and work through where my own assumptions are bad, and try to update them. This is an important step especially when I think I’m right.
I’m often tempted to ask for downvote explanations too, but FWIW, it never helps, and aside from HN guidelines asking people to avoid complaining about downvotes, I find it also helps to think of downvotes as symmetric to upvotes. We don’t comment on or demand an explanation for an upvote, and an upvote can be given for many reasons - it’s not only used for agreement, it can be given for style, humor, weight, engagement, pity, and many other reasons. Realizing downvotes are similar and don’t only mean disagreement helps me not feel personally attacked, and that can help me stay more open to reflecting on what I did that is earning the downvotes. They don’t always make sense, but over time I can see more places I went wrong.
It shouldn't matter.
Currently, downvote means "I want this to be ranked lower". There really should be 2 options "factually incorrect" and "disagree". For people who think it should matter, there should be a third option, "rude", which others can ignore.
I've actually emailed about this with a mod and it seems he conflated talking about downvotes with having to explain a reason. He also told me (essentially) people should not have the right to defend themselves against incorrect moderator decisions and I honestly didn't know what to say to that, I'll probably message him again to confirm this is what he meant but I don't have high hopes after having similar interactions with mods on several different sites.
> FWIW, it never helps
The way I see it, it helped since I got 2 replies with more stuff to read about. Did you mean it doesn't work for you?
> downvotes as symmetric to upvotes
Yes, and we should have more upvote options too. I am not sure the explanation should be symmetric though.
Imagine a group conversation in which somebody lies (the "factually incorrect" case here). Depending on your social status within the group and group politics, you might call out the lie in public, in private with a subset or not at all. But if you do, you will almost certainly be expected to provide a reasoning or evidence.
Now imagine he says something which is factually correct. If you say you agree, are you expected to provide references why? I don't think so.
---
BTW, on a site which is a more technical alternative to HN, there was recently a post about strange behavior of HN votes. Other people posted their experience with downvotes here and they mirrored mine - organic looking (i.e. gradual) upvotes, then within minutes of each other several downvotes. It could be coincidence but me and others suspect voting rings evading detection.
I also posted a link to my previous comment as an experiment - if people disagree, they are more likely to also downvote that one. But I did not see any change there so I suspect it might be bots (which are unlikely to be instructed to also click through and downvote there). Note sample size is 1 here, for now.
2) re "hoot": You can say "fuck" here. You've been rudely dismissive twice now, yet you use a veil of politeness. I prefer when people don't hide their displeasure at me.
3) If you think I am wrong, you can say so instead of downvoting, it'll be more productive.
4) If you want me to expend effort on looking up statutes, you can say so instead of downvoting, it'll be more productive.
5) The law can be changed. If a well-reasoned argument is presented publicly, such as in a court room, and the general agreement is that the argument should apply but the court has to reject is because of poorly designed laws, that's a good impetus for changing it.
I don't think it's wrong, but misdirected maybe. What do you that someone can "steal" your open source contributions? I've always released most of my code as "open source", and not once has someone "stolen" it, it still sits on the same webpage where I initially published it, decades ago. Sure, it's guaranteed ingested into LLMs since long time ago, but that's hardly "stealing" when the thing is still there + given away for free.
I'm not sure how anyone can feel like their open source code was "stolen", wasn't the intention in the first place that anyone can use it for any purpose? That's at least why I release code as open source.
On the other side BSD0 is just a polite version of WTFPL, and people that like it doesn't care about what you do with the code.
> Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
> The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
> THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
The operative language here is “all copies or substantial portions of the Software.” LLMs, with rare exceptions, don’t retain copies or substantial portions of the software it was trained on. They’re not libraries or archives. So it’s unclear to me how training an AI model with an MIT-licensed project could violate the license.
(IAAL and this is my personal analysis, not legal advice.)
in other words, i've never been in the position that I felt my charitable givings anywhere were ever stolen.
Some people write code and put it out there without caveats. Some people jump into open source to be license warriors. Not me. I just write code and share it. If youre a person, great. if you're a machine then I suppose that's okay too -- I don't want to play musical chairs with licenses all day just to throw some code out there, and I don't particularly care if someone more clever than myself uses it to generate a profit.
I’ve never been a fan of coercive licensing. I don’t consider that “open.” It’s “strings-attached.”
I make mine MIT-licensed. If someone takes my stuff, and gets rich (highly unlikely), then that’s fine. I just don’t want some asshole suing me, because they used it inappropriately, or a bug caused them problems. I don’t even care about attribution.
I mainly do it, because it forces me to take better care, when I code.
I'm not exactly sure what you mean. I've been doing it for a couple of decades, so far, and haven't regretted it. Am I holding it wrong?
I'd be grateful for some elucidation.
Thanks!
Some people are happy to release code openly and have it used for anything, commercial or otherwise. Totally understandable and a valid choice to make.
Other people are happy to release code openly so long as people who incorporate it into their projects also release it in the same way. Again, totally understandable and valid.
None of this is hard to understand or confusing or even slightly weird.
I've written a ton of open source code and I never cared what people do with it, both "good" or "bad". I only want my code to be "useful". Not just to the people I agree with, but to anyone who needs to use a computer.
Of course, I'd rather people use my code to feed the poor than build weapons, but it's just a preference. My conviction is that my code is _freed_ from me and my individual preferences and shared for everyone to use.
I don't think my code is "stolen", if someone uses it to make themselves rich.
Why not say "... but to the people I disagree with"?
Would you be OK knowing your code is used to cause more harm than good? Would you still continue working on a hypothetical OSS which had no users, other than, say, a totalitarian government in the middle east which executes homosexuals? Would you be OK with your software being a critical directly involved piece of code for example tracking, de-anonymizing and profiling them?
Where is the line for you?
I'm not going to deliberately write code that's LIKELY to do more harm than good, but crippling the potential positive impact just because of some largely hypothetical risk? That feels almost selfish, what would I really be trying to avoid, personally running into a feel-bad outcome?
Douglas Crockford[0] tried this with JSON. Now, strictly speaking, this does not satisfy the definition of Open Source (it merely is open source, lowercase). But after 10 years of working on Open Source, I came to the conclusion that Open Source is not the absolute social good we delude ourselves into thinking.
Sure, it's usually better than closed source because the freedoms mean people tend to have more control and it's harder for anyone (including large corporations) to restrict those freedoms. But I think it's a local optimum and we should start looking into better alternatives.
Android, for example, is nominally Open Source but in reality the source is only published by google periodically[1], making any true cooperation between the paid devs and the community difficult. And good luck getting this to actually run on a physical device without giving up things like Google Play or banking apps or your warranty.
There's always ways to fuck people over and there always will be but we should look into further ways to limit and reduce them.
[0]: https://en.wikipedia.org/wiki/Douglas_Crockford
[1]: https://www.androidauthority.com/aosp-source-code-schedule-3...
The one thing I do care about is attribution — though maybe actually not in the nefarious cases.
I see this a lot and while being technically correct, I think it ignores the costs for them.
In practice such a government doesn't need to have laws and courts either but usually does because the appearance of justice.
Breaking international laws such as copyright also has costs for them. Nobody will probably care about one small project but large scale violations could (or at least should) lead to sanctions.
Similarly, if they want to offer their product in other countries, now they run the risk of having to pay fines.
Finally, see my sibling comment but a lot of people act like Open Source is an absolute good just because it's Open Source. By being explicit about our views about right and wrong, we draw attention to this delusion.
If I made something open source, you can train your LLM on it as much as you want. I'm glad my open source work is useful to you.
AI doesn't hold up its end of the bargain, so if you're in that mindset you now have to decide between going full hands-off like you or not doing any open source work at all.
I consider the payment I and my employer make to these AI companies to be what the LLM is paying me back for. Even the free ones get paid for my usage somehow. This stuff isn't charity.
It comes across as really trying too hard and a bit aggressive.
You could just write one top level comment and chill a bit. Same advice for any future threads too...
The entire point isn’t to allow a large corporation to make private projects out of your open source project for many open source licenses. It’s to ensure the works that leverage your code are open source as well. Something AI is completely ignoring using various excuses as to why their specific type of theft is ok.
In other words, the open source model of "open core with paid additional features" may be dead thanks to LLMs. Perhaps less so for some types of applications, but for frameworks like Tailwind very much so.
It's very hard to prevent specific types of usage (like feeding code to an LLM) without throwing out the baby with the bathwater and also preventing all sorts of other valid usages. AGPLv3, which is what antirez and Redis use goes to far IMHO and still doesn't quite get the job done. It doesn't forbid people (or tools) to "look" at the code which is what AI training might be characterized as. That license creates lots of headaches for corporate legal departments. I switched to Valkey for that reason.
I actually prefer using MIT style licenses for my own contributions precisely because I don't want to constrain people or AI usage. Go for it. More power to you if you find my work useful. That's why I provide it for free. I think this is consistent with the original goals of open source developers. They wanted others to be able to use their stuff without having to worry about lawyers.
Anyway, AI progress won't stop because of any of this. As antirez says, that stuff is now part of our lives and it is a huge enabler if you are still interested in solving interesting problems. Which apparently he is. I can echo much of what he says. I've been able to solve larger and larger problems with AI tools. The last year has seen quite a bit of evolution in what is possible.
> Am I wrong to feel this?
I think your feelings are yours. But you might at least examine your own reasoning a bit more critically. Words like theft and stealing are big words. And I think your case for that is just very weak. And when you are coding yourself are you not standing on the shoulders of giants? Is that not theft?
Why would a feeling be invalid? You have one life, you are under no obligation to produce clean training material, much less feel bad about this.
In future everyone will expect to be able to customise an application, if the source is not available they will not chose your application as a base. It's that simple.
The future is highly customisable software, and that is best built on open source. How this looks from a business perspective I think we will have to find out, but it's going to be fun!
I think there is room for closed source platforms that are built on top of using LLMs via some sort of API that it exposes. For example, iOS can be closed source and LLMs can develop apps for it to expand the capabilities of one's phone.
Allowing total customization by a business can allow them to mess up the app itself or make other mistakes. I don't think it's the best interface for allowing others to extend the app.
This seems unlikely. It's not the norm today for closed-source software. Why would it be different tomorrow?
I'm feeling this already.
Just the other day I was messing around with Fly's new Sprites.dev system and I found myself confused as to how one of the "sprite" CLI features worked.
So I went to clone the git repo and have Claude Code figure out the answer... and was surprised to find that the "sprite" CLI tool itself (unlike Fly's flycli tool, which I answer questions about like this pretty often) wasn't open source!
That was a genuine blocker for me because it prevented me from answering my question.
It reminded me that the most frustrating thing about using macOS these days is that so much of it is closed source.
I'd love to have Claude write me proper documentation for the sandbox-exec command for example, but that thing is pretty much a black hole.
• Increased upfront software complexity
• Increased maintenance burden (to not break officially supported plugins/customizations)
• Increased support burden
• Possible security/regulatory/liability issues
• The company may want to deliberately block functionality that users want (e.g. data migration, integration with competing services, or removing ads and content recommendations)
> That was a genuine blocker for me because it prevented me from answering my question.
It's always been this way. From the user's point of view there has always been value in having access to the source, especially under the terms of a proper Free and Open Source licence.
Tailwind is a business and they picked a business model that wasn't resilient enough.
To my surprise, my doctoral advisor told me to keep the code closed. She told me not only LLMs will steal it and benefit from it, but there's a risk of my code becoming a target after it's stolen by companies with fat attorney budgets and there's no way I could defend and prove anything.
https://archclx.medium.com/enforcing-gpg-encryption-in-githu...
My opinion on the matter is that AI models stealing the open source code would be ok IF the models are also open and remain so, and the services like chatgpt will remain free of cost (at least a free tier), and remain free of ads.
But we all know how it is going to go.
I want to write code to defy this logic and express my humanity. "To have fun", yes. But also to showcase what it means when a human engages in the act of programming. Writing code may increasingly not be "needed", but it increasingly is art.
Or accept that there definitely wont be open model businesses. Make them proprietary and accept the fact that even permissive licenses such as MIT, BSD Clause 2/3 wont't be followed by anyone while writing OSS.
And as for Tailwind, I donno if it is cos of AI.
Not everything needs to be mit or gnu.
Software licenses aren't, AI companies can just take your GPL code and spit it back out into non-GPL codebases and there's no way for you to even find out it happened, much less do anything about it, and the law won't help you either.
There's no such thing as a wrong feeling.
And I say this as one of those with the view that AI training is "learning" rather than "stealing", or at least that this is the goal because AI is the dumbest, the most error prone, and also the most expensive way, to try to make a copy of something.
My fears about setting things loose for public consumption are more about how I will be judged for them than about being ripped off, which is kinda why that book I started writing a decade ago and have not meaningfully touched in the last 12 months is neither published properly nor sent to some online archive.
When it comes to licensing source code, I mostly choose MIT, because I don't care what anyone does with the code once it's out there.
But there's no such thing as a wrong feeling, anyone who dismisses your response is blinding themselves to a common human response that also led to various previous violent uprisings against the owners of expensive tools of automation that destroyed the careers of respectable workers.
And sure, I could stubbornly refuse to use an LLM and write the code myself. But after getting used to LLM-assisted coding, particularly recent models, writing code by hand feels extremely tedious now.
LLMs are labor theft on an industrial scale.
I spent 10 years writing open source, I haven't touched it in the last 2. I wrote for multiple reasons none of which any longer apply:
- I believe every software project should have an open source alternative. But writing open source now means useful patterns can be extracted and incorporated into closed source versions _mechanically_ and with plausible deniability. It's ironically worse if you write useful comments.
- I enjoyed the community aspect of building something bigger than one person can accomplish. But LLMs are trained on the whole history and potentially forum posts / chat logs / emails which went into designing the SW too. With sufficiently advanced models, they effectively use my work to create a simulation of myself and other devs.
- I believe people (not just devs) should own the product they build (an even stronger protection of workers against exploitation than copyright). Now our past work is being used to replace us in the future without any compensation.
- I did it to get credit. Even though it was a small motivation compared to the rest, I enjoyed everyone knowing what I accomplished and I used it during job interviews. If somebody used my work, my name was attached to it. With LLMs, anyone can launder it and nobody knows how useful my work was.
- (not solely LLM related) I believed better technology improves the world and quality of life around me. Now I see it as a tool - neutral - to be used by anyone for both good and bad purposes.
Here's[0] a comment where I described why it's theft based on how LLMs work. I call it higher order plagiarism. I haven't seen this argument made by other people, it might be useful for arguing about those who want to legalize this.
In fact, I wonder if this argument has been made in court and whether the lawyers understand LLMs enough to make it.
I believe open source will become a bit less relevant in it’s current form, as solution/project tailored libraries/frameworks can be generated in a few hours with LLMs.
I love AI and pay for four services and will never program without AI again.
It pleases me that my projects might be helping out.
Meaning 99% of everything oss released now is de-facto abandonware.
If running an open source model means that I have only given out without receiving anything, there remains the possibility of being exploited. This dynamic has always existed, such as companies using a project and sending in vulnerability reports and the like but not offering to help, and instead demanding, often quite rudely.
In the past working with such extractive contributors may have been balanced with other benefits such as growing exposure leading to professional opportunities, or being able to sell hosted versions, consulting services and paid features, which would have helped the maintainer of the open source project pay off their bills and get ahead in life.
However with the rise of LLMs, it both facilitates usage of the open source tools without getting a chance to direct their attention towards these paid services, nor allows the maintainer to have direct exposure to their contributors. It also indirectly violates the spirit of said open source licenses, as LLMs can spit out the knowledge contained in these codebases at a scale that humans cannot, thus allowing people to bypass the license and create their own versions of the tools, which are themselves not open source despite deriving their knowledge from such data.
Ultimately we don't need to debate about this; if open source remains a viable model in the age of LLMs, people will continue to do it regardless of whether we agree or disagree regarding topics such as this; on the other hand, if people are not rewarded in any way we will only be left with LLM generated codebases that anyone could have produced, leaving all the interesting software development to happen behind closed doors in companies.
I’ve written complete GUIs in 3D on the front end. This GUI was non traditional. It allows you to playback, pause speed up, slow down and rewind a gps track like a movie. There is real time color changing and drawing of the track as the playback occurs.
Using mapbox to do this straight would be to slow. I told the AI to optimize it by going straight into shader extensions for mapbox to optimize GPU code.
Make no mistake. LLMs are incredible for things that are non systems based that require interaction with 3D and GUIs.
> However, this technology is far too important to be in the hands of a few companies.
This is the most important assessment and we should all heed this warning with great care. If we think hyperscalers are bad, imagine what happens if they control and dictate the entire future.
Our cellphones are prisons. We have no fundamental control, and we can't freely distribute software amongst ourselves. Everything flows through funnels of control and monitoring. The entire internet and all of technology could soon become the same.
We need to bust this open now or face a future where we are truly serfs.
I'm excited by AI and I love what it can do, but we are in a mortally precarious position.
Great news if you know the current generation of languages, you won't need to learn a new one for quite some time.
LLMs understand language Grammar files really well. A new language is easy for them (you can tell this by giving them a JSON schema and seeing how well they do)
What they don't always have is good taste with what preexisting libraries work together well. But this isn't a problem for new languages.
If you're developing a new programming language today, one of the assets you need to prepare is a short (~10,000 token or less) LLM-friendly guide to your language, plus a bunch of examples that coding agents can search through and crib from.
Done well, I expect this could accelerate the adoption of your new language - as users can start prompting their coding agents to build with it before they've even finished reading the tutorial themselves.
Your disadvantage will be that LLMs won't recommend your language when people ask "what could I build this in", but people discovered new languages via word-of-month before LLMs came along and I expect that to continue, especially if your language has something genuinely new and interesting to offer.
Who is going to control AI? The people in power obviously. The will buy all of the computers so running models locally will no longer be feasible. In case it hasn’t been obvious that this is already happening. It will only get worse.
They will not let themselves be taxed.
But who will buy the things the people in power produce if nobody has a job?
This is how civilization collapses.
I think there are some negative consequences to this; perhaps a new form of burn out. With the force multiplier and assisted learning utility comes a substantial increase in opportunity cost.
It definitely can.
The innovation that was the open, social web of 20 years ago? still an option, but drowned between closed ad-fueled toxic gardens and drained by AI illegal copy bots.
The innovation that was democracy? Purposely under attack in every single place it still exists today.
Insulin at almost no cost (because it costs next to nothing to produce)? Out of the question for people that live under the regime of pharmaceutical corporations that are not reigned by government, by collective rules.
So, a technology that has a dubious ROI over the energy and water and land consumed, incites illegal activities and suicides, and that is in the process of killing the consumer public IT market for the next 5 years if not more, because one unprofitable company without solid verifiable prospects managed to pass dubious orders with unproven money that lock memory components for unproven data centers... yes, it definitely can be taken back.
The tech will still be there. As much as blockchains, crypto, NFTs and such, whose bubbles have not yet burst (well, the NFT one did, it was fast).
But (Gen)AI today is much less about the tech, and much more about the illegal actions (harvesting copyrighted works) that permit it to run and the disastrous impact it has on ... everything (resources, jobs, mistaken prospectives, distorted IT markets, culture, politics) because it is not (yet) regulated to the extent it should.
AI is both a near-perfect propaganda machine and, in the programming front, a self-fulfilling prophecy: yes, AI will be better at coding than human. Mostly because humans are made worse by using AI.
Obviously, you are also joking about the thing that AI is immune to consanguinity, right ?
I'm confident you are wrong about that.
AI makes people who are intellectually lazy and like to cheating worse, in the same way that a rich kid who hires someone to do their university homework for them is hurting their ability to learn.
A rich kid who hires a personal tutor and invests time with them is spending the same money but using it to get better, not worse.
Getting worse using AI is a choice. Plenty of people are choosing to use it to accelerate and improve their learning and skills instead.
Every time I got bad results, looking back I noticed my spec was just vague or relied on assumptions. Of course you can't fix your collegues, if they suck they suck and sombody gotta do the mopping :)
I've never used AI to code, I'm a software architect and currently assume I get little value out of an LLM. It would be useful for me if this debate had a vaguely engineering-smelling quality to it, because its currently just two groups shouting at eachother and handwaving criticism away.
If you actually deal with AI generated problems, I love it, please make a post about it so we have something concrete to point to.
I wasted so much work time trying to steer one of these towards the light, which is very demotivating when design and "why did you do this?" questions are responded to with nothing but another flurry of commits. Even taking the time to fully understand the problem and suggest an alternative design which would fix most of the major issues did nothing (nothing useful must have emerged when that was fed into the coin slot...)
Since I started the review, I ended up becoming the "blocker" for this feature when people started asking why it wasn't landed yet (because I also have my own work to do), to the point where I just hit Approve because I knew it wouldn't work at all for the even more complex use cases I needed to implement in that area soon, so I could just fix/rewrite it then.
From my own experience, the sooner you accept code from an LLM the worse a time you're going to have. If wasn't a good solution or even was the wrong solution from the get-go, no amount of churning away at the code with an LLM will fix it. If you _don't know_ how to fix it yourself, you can't suddenly go from reporting your great progress in stand-ups to "I have nothing" - maybe backwards progress is one of those new paradigms we'll have to accept?
We are talking about a "stupid" tool that parses a google sheet and makes calls to a third-party API
So there is one google sheet per team, with one column per person
One line per day
And each day, someone is in charge of the duty
The tool grabs the data from the sheet and configures pagerduty so that alerts go to the right person
Very basic, no cleverness needed, really straightforward actually
So we have 1 person that wrote the code, with AI. Then we have a second person that checked the code (with AI). Then the shit comes to my desk. To see this kind of cruft:
def create_headers(api_token: str) -> dict:
"""Create headers for PagerDuty API requests.
Args:
api_token: PagerDuty API token.
Returns:
Headers dictionary.
"""
return {
"Accept": "application/vnd.pagerduty+json;version=2",
"Authorization": f"Token token={api_token}",
"Content-Type": "application/json",
}
And then, we have 5 usage like this: def delete_override(
base_url: str,
schedule_id: str,
override_id: str,
api_token: str,
) -> None:
"""Delete an override from a schedule.
Args:
base_url: PagerDuty API base URL.
schedule_id: ID of the schedule.
override_id: ID of the override to delete.
api_token: PagerDuty API token.
"""
headers = create_headers(api_token)
override_url = f"{base_url}/schedules/{schedule_id}/overrides/{override_id}"
response = requests.delete(override_url, headers=headers, timeout=60)
response.raise_for_status()
No HTTP keep-alive, no TCP reuse, the API key is passed down to every method, so is the API's endpoint. Timeout is defined in each method.
The file is ~800 lines of python code, contains 19 methods and only deals with pagerduty (not google sheet). It tooks 2 fulltime days.These people fail to produce anything meaningful, this is not really a surprise given their failure to do sane things with such a basic topic
Does AI brings good idea: obviously no, but we knew this. Does AI improves the quality of the result (regardless of the quality of the idea): apparently no Does AI improves productivity: again, given this example: no Are these people better, more skilled or else: no
Am I too demanding ? Am I asking too much ?
> No HTTP keep-alive, no TCP reuse, the API key is passed down to every method, so is the API's endpoint. Timeout is defined in each method. Fix all of those issues.
The issue here is more organizational with the engineers not getting the code up to standards before handing off, not the capabilities of the AI itself.
For the type of work I do, I found it best to tightly supervise my LLMs. Giving lots of design guidance upfront, and being very critical towards the output. This is not easy work. In fact, this was always the hard part, and now I'm spending a larger percentage of my time doing it. As the impact of design mistakes is a lot smaller, I can just revert after 20 minutes instead of 3 days, I also get to learn from mistakes quicker. So I'd say, I'm improving my skills faster than before.
For juniors though, I think you are right. By relying on this tech from early on in their careers, I think it will be very hard to grow their skills, taste and intuition. But maybe I'm just an old guy yelling at the clouds, and the next generation of developers will do just fine building careers as AI whisperers.
What I would really urge people to avoid doing is listening to what any tech influencer has to say, including antirez. I really don't care what famous developers think about this technology, and it doesn't influence my own experience of it. People should try out whatever they're comfortable with, and make up their own opinions, instead of listening what anyone else has to say about it. This applies to anything, of course, but it's particularly important for the technology bubble we're currently in.
It's unfortunate that some voices are louder than others in this parasocial web we've built. Those with larger loudspeakers should be conscious of this fact, and moderate their output responsibly. It starts by not telling people what to do.
Said by someone who spent his career writing code, it lacks a bit of details... a more correct way to phrase it is: "if you're already an expert in good coding, now you can use these tools to skip most of code writing"
LLMs today are mostly some kind of "fill-in-the-blanks automation". As a coder, you try to create constraints (define types for typechecking constraints, define tests for testing constraints, define the general ideas you want the LLM to code because you already know about the domain and how coding works), then you let the model "fill-in the blanks" and you regularly check that all tests pass, etc
No, I really don't think they will. Software has only been getting worse, and LLMs are accelerating the rate at which incompetent developers can pump out low quality code they don't understand and can't possibly improve.
There's still no point. Resharper and clang-tidy still have more value than all LLMs. It's not just a hype, it's a bloody cult, right besides those nft and church of COVID people.
IMO the only rebuttal to this can be that LLMs are almost at their peak and there is not going to be any possible significant breakthrough or steady improvement in the next years, in which case they will never become "the new computers".
This is what reams of the AI proponents fail to understand. "Amazing, I don't have to write code, 'only' review AI slop" is sitting backwards on the horse. Who the heck wants to do that?
Model improvements may have flattened, the quality improvements due to engineering work around those models certainly have not.
If we always wait for technology to calcify and settle before we interact with it, then that would be rather boring for some of us. Acquiring knowledge is not really that much of a heavy burden that it's an issue if it's outdated a year in . But that's maybe just a mindset thing.
Next breakthrough will happen in 2030 or it might happen next Tuesday; it might have already happened, it's just that the lab which did it is too scared to release it. It doesn't matter: until it happens, you should work with what you've got.
> AI code is slop, therefore you shouldn't use it
You should learn how to responsibly use it as a tool, not a replacement for you. This can be done, people are doing it, people like Salvatore (antirez), Mitchell (of Terraform/Ghostty fame), Simon (swillison) and many others are publicly talking about it.
> AI can't code XYZ
It's not all-or-nothing. Use it where it works for you, don't use it where it doesn't. And btw, do check that you actually described the problem well. Slop-in, slop-out. Not sayin' this is always the case, but turns out it's the case surprisingly often. Just sayin'
> AI will atrophy your skills, or prevent you from learning new ones, therefore you shouldn't use it
Again, you should know where and how to use it. Don't tune out while doing coding. Don't just skim the generated code. Be curious, take your time. This is entirely up to you.
> AI takes away the fun part (coding) and intensifies the boring (management)
I love programming but TBH, for non-toy projects that need to go into production, at least three quarters are boring boilerplate. And making that part interesting is one of the worst things you can do in software development! That path lies resume-driven development, architecture astronautics, abusing design patterns du jour, and other sins that will make code maintenance on that thing a nightmare! You want boring, stable, simple. AI excels at that. Then you can focus on the small tiny bit that's fun and hand-craft that!
Also, you can always code for fun. Many people with boring coding jobs code for fun in the evenings. AI changes nothing here (except possibly improving the day job drudgery).
> AI is financially unsustainable, companies are losing money
Perhaps, and we're probably in the bubble. Doesn't detract from the fact that these things exist, are here now, work. OpenAI and Anthropic can go out of business tomorrow, the few TB of weights will be easily reused by someone else. The tech will stay.
> AI steals your open source code, therefore you shouldn't write open-source
Well, use AI to write your closed-source code. You don't need to open source anything if you're worried someone (AI or human) will steal it. If you don't want to use something on moral grounds, that's a perfectly fine thing to do. Others may have different opinion on this.
> AI will kill your open source business, therefore you shouldn't write open-source
Open source is not a business model (I've been saying this for longer than median user of this site has been alive). AI doesn't change that.
As @antirez points out, you can use AI or not, but don't go hiding under a rock and then being surprised in a few years when you come out and find the software development profession completely unrecognizable.
You apparently see "making the boilerplate interesting" as doing a bunch of overengineering. Strange. To my mind, the overengineering is part of the boilerplate. "Making the boilerplate interesting" in my mind is not possible; but rather the goal is to fix the system such that it doesn't require boilerplate any more. (Sometimes that just means a different implementation language.)
A company I worked with a while ago had a microservices architecture, and have decided to not use one of a few standard API serialization/deserialization options, but write their own, because was going to be more performant, easier to maintain, better fit for their use case. A few years on, after having grown organically to support all the edge cases, it's more convoluted, slower, and buggy than if they went with the boring option that ostensibly had "a bit more boilerplate" from the start.
A second example is from a friend, whose coworker decided to write a backend-agnostic, purpose-agnostic, data-agnostic message broker/routing library. They spent a few months of this, delivered a beautifully architected solution in a few dozen k lines of code. The problem is the solution solves many problems the company didnt and wouldn't have, and will be a maintenance drag from then forevermore. Meanwhile, they could have done it in a few hundred lines of code that would be coupled to the problem domain, but still farily decend from most people's point of view.
These two are from real projects. But you can also notice that in general people are often picking a fancy solution over a boring one, ostensibly because it has something "out of the box". The price of the "out of the box"-ness (aside from potential SaaS/infra costs and vendor lock in), is that you now need to adapt your own code to work with the mental model (domain) of the fancy solution.
Or to harp on something trivial, you end up depending on left-pad because writing it yourself was boring.
> fix the system such that it doesn't require boilerplate any more.
I think perhaps I used a more broad meaning for "boilerplate" than you had in mind. If we're talking about boilerplate as enumerating all the exceptions a Java method may raise, or whatever unholy sad thing we have to do in C to use GTK/GObject, then I agree.
But I also meant something more closer to "glue code that isn't the primary carrier of value of the project", or to misuse financial language in this context, the code that's a cost center, not a profit center.
There's also a short-termism aspect of AI generated code that's seemingly not addressed as much. Don't pee your pants in the winter to keep warm.
But maybe we should cherish these people. Maybe it's among them we find the embryo to the resistance - people who held out when most of us were seduced - seduced into giving the machine all our knowledge, all our skills, all the secrets about us we were not even aware of ourselves - and setting it up to be orders of magnitude more intelligent than any of us, combined. And finally - just as mean, vindictive and selfish as most of the people in the training data on which it was trained.
Maybe it's good to stay skeptical a bit longer.