Naur (https://gwern.net/doc/cs/algorithm/1985-naur.pdf) called it "theory building":
> The death of a program happens when the programmer team possessing its theory is dissolved. A dead program may continue to be used for execution in a computer and to produce useful results. The actual state of death becomes visible when demands for modifications of the program cannot be intelligently answered. Revival of a program is the rebuilding of its theory by a new programmer team.
Lamport calls it "programming ≠ coding", where programming is "what you want to achieve and how" and coding is telling the computer how to do it.
I strongly agree with all of this. Even if your dev team skipped any kind of theory-building or modelling phase, they'd still passively absorb some of the model while typing the code into the computer. I think that it's this last resort of incidental model building that the LLM replaces.
I suspect that there is a strong correlation between programmers who don't think that there needs to be a model/theory, and those who are reporting that LLMs are speeding them up.
I have some anecdotal evidence that suggests that we can accomplish far more value-add on software projects when completely away from the computer and any related technology.
It's amazing how fast the code goes when you know exactly what you want. At this point the LLM can become very useful, because its hallucinations are instantly obvious from your perspective. If you don't know what you want, I don't see how this works.
I really never understood the rationale of staring at the technological equivalent of a blank canvas for hours a day. The LLM might shake you loose and get you going in the right direction, but I find it much more likely to draw me into a wild goose chase.
The last 10/10 difficulty problem I solved probably happened in my kitchen while I was chopping some onions.
To quote Russ Ackoff[1]:
> Improving a system requires knowing what you could do if you could do whatever you wanted to. Because if you don't know what you would do if you could do whatever you wanted to, how on earth are you going to know what you can do under constraints?
This isn't true in programming or real-world tasks where you are trying to accomplish some external objective.
If I could move my rook there, it's a win. Is there any way I can make that happen? How about if I sacrifice my knight, etc.?
Let me guess, you had tears in your eyes when you found the solution?
> "It's amazing how fast the code goes when you know exactly what you want"
Yeah, it's the same reason why demand for pen and paper still exists. It's the absolute best way for one to think and get their thoughts out. I can personally attest to this - no digital whiteboard can ever compete with just a pen and paper. My best and original ideas come from a blank paper and a pen.
Solutions can emerge from anywhere. But it's most likely to happen when the mind is focused and in a calm state - that's why walking, for instance, is great.
Goes back to Fred Brooks' Mythical Man-Month: Start with understanding the requirements; then design the architecture. Only after that, begin programming.
Were you the one who developed TOR?
And some code is solving additional complexities, not essential ones (like making it POSIX instead of using bashisms). In this case, it's just familiarity with the tools that can help you derive alternative approaches.
It's like you don't know how to ski and you're going down a really steep hill...now with AI, imagine that really steep hill is iced over.
With AI this loop is much easier. It is cheap to even build 3 parallel implementations of something and maybe another where you let the system add whatever capability it thinks would be interesting. You can compare and use that to build much stronger "theory of the program" with requirements, where the separation of concerns are, how to integrate with the larger system. Then having AI build that, with close review of the output (which takes much less time if you know roughly what should be being built) works really well.
That only works for certain types of simpler products (mostly one-man projects, things like web apps) - you're not going to be building a throw-away prototype, either by hand or using AI, of something more complex like your company's core operating systems or an industrial control system.
When it comes to general software development for customers in the everyday world (phones, computers, web), I often write once for proof, iterate as product requirements become clearer/refined, and rewrite if necessary (code smell, initial pattern was inefficient for the final outcome).
On a large project, often I’ll touch something I wrote a year ago and realize I’ve evolved the pattern or learned something new in the language/system and I’ll do a little refactor while I’m in there. Even if it’s just code organization for readability.
I do this, too. And it makes me awful at generating "preliminary LOEs", because I can't tell how long something will take until I get in there and experiment a little.
Self-created or formalized methods work, but they have to have habits or practices in place that prevent disengagement and complacency.
With LLMs there is the problem of automation bias in humans, which affects almost all human endeavors.
Unfortunately that will become more problematic as tools improve, so make sure to stay engaged and skeptical - the only successful strategy I have found, with support from fields like human factors research.
NASA and the FAA are good sources for information if you want to develop your own.
Maybe I am more of a Leet coder than I think?
The primary reason is that what you are rapidly refactoring in these early prototypes/revisions is the meta-structure and the contracts.
Before AI, the cost of putting tests in place from the beginning, or doing TDD, slowed your iteration speed dramatically.
In the early prototypes, what you are figuring out is the actual shape of the problem, the best division of responsibilities, and how to fit the pieces together to match the vision for how the code will be required to evolve.
Now with AI, you can let the AI build test harnesses at little velocity cost, but TDD is still not the general approach.
Like any framework, they all have costs, benefits, places they work, and others where they don't.
Unless you take time to figure out what your inputs and expected outputs are, I would agree with you about the schools of thought that targeted writing all the tests up front, and even implementation-detail tests.
If you can focus on writing inputs vs. outputs, especially during a spike, then I need to take prompt engineering classes from you.
Yep, and I believe that one will be harder to overcome.
Nudging an LLM into the right direction of debugging is a very different skill from debugging a problem yourself, and the better the LLMs get, the harder it will be to consciously switch between these two modes.
So I look up at the token usage, see that it cost 47 cents, and just `git reset --hard`, and try again with an improved prompt. If I had hand-written that code, it would have been much harder to do.
In my experience this is a bad workflow. "Build it crappy and fast" is how you wind up with crappy code in production, because your manager sees you have something working fast and thinks it's good enough.
The question is, will the ability of LLMs to whip out boilerplate code cause managers to be more willing to rebuild currently "working" code into something better, now that the problem is better understood than when the first pass was made? I could believe it, but it's not obvious to me that this is so.
That's insightful how you connected the "comprehension debt" of LLM-generated code with the idea of programming as theory building.
I think this goes deeper than the activity of programming, and applies in general to the process of thinking and understanding.
LLM-generated content - writing and visual art also - is equivalent to the code, it's what people see on the surface as the end result. But unless a person is engaged in the production, to build the theory of what it means and how it works, to go through the details and bring it all into a whole, there is only superficial understanding.
Even when LLMs evolve to become more sophisticated so that they can perform this "theory building" by themselves, what use is such artificial understanding without a human being in the loop? Well, it could be very useful and valuable, but eventually people may start losing the skill of understanding when it's more convenient to let the machine do the thinking.
I also strongly agree with Lamport, but I'm curious why you don't think AI can help in the "theory building" process, both for the original team and for a team taking over a project? I.e., understanding a code base, the algorithms, etc.? I agree this doesn't replace all the knowledge, but it can bridge a gap.
In other words, they can help you identify what fairly isolated pieces of code are doing. That's helpful, but it's also the single easiest part of understanding legacy code. The real challenges are things like identifying and mapping out any instances of temporal coupling, understanding implicit business rules, and inferring undocumented contracts and invariants. And LLM coding assistants are still pretty shit at those tasks.
You could paste your entire repo into Gemini and it could map your forest and also identify the "trees".
Assuming your codebase is smaller than Gemini's context window. Sometimes it makes sense to upload a package's code into Gemini and have it summarize and identify the key ideas and functions, then repeat this for every package in the repository and combine the results. It sounds tedious, but a rather small Python program does this for me.
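For what it's worth, the loop is roughly the sketch below. `call_gemini` is a placeholder for whatever Gemini client you actually use, and the prompts and the assumption that packages are top-level directories of Python files are mine, not the commenter's:

```python
# Rough sketch of the per-package summarization loop described above.
# `call_gemini` is a placeholder for whatever Gemini client/SDK you use.
import pathlib

def call_gemini(prompt: str) -> str:
    raise NotImplementedError("wire this up to your Gemini client of choice")

def package_source(pkg_dir: pathlib.Path) -> str:
    """Concatenate all source files of one package into a single blob."""
    parts = []
    for path in sorted(pkg_dir.rglob("*.py")):
        parts.append(f"# FILE: {path}\n{path.read_text()}")
    return "\n\n".join(parts)

def summarize_repo(repo_root: str) -> str:
    root = pathlib.Path(repo_root)
    package_summaries = []
    for pkg_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        summary = call_gemini(
            "Summarize this package: key ideas, main functions, and how it "
            "fits into the larger system.\n\n" + package_source(pkg_dir)
        )
        package_summaries.append(f"## {pkg_dir.name}\n{summary}")
    # Combine the per-package summaries into one repo-level overview.
    return call_gemini(
        "Combine these package summaries into a single architectural overview "
        "of the repository:\n\n" + "\n\n".join(package_summaries)
    )
```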
Concrete example: last week a colleague of mine used a tool like this to help with a code & architectural review of a feature whose implementation spanned four repositories with components written in four different programming languages. As I was working my way through the review, I found multiple instances where the information provided by the LLM missed important details, and that really undermined the value of the code review. I went ahead and did it the old-fashioned way, and yes it took me a few hours, but I also found four defects and failure modes we previously didn't know about.
The client loved him, for obvious reasons, but it's hard to wrap my head around such an approach to software construction.
Another time, I almost took on a gig, but when I took one look at the code I was supposed to take over, I bailed. Probably a decade would still not be sufficient for untangling and cleaning up the code.
True vibe coding is the worst thing. It may be suitable for one-off shell scripts and < 100-line utilities and such; anything more than that and you are simply asking for trouble.
The big problem is that LLMs do not *understand* the code you tell them to "explain". They just take probabilistic guesses about both function and design.
Even if "that's how humans do it too", this is only the first part of building an understanding of the code. You still need to verify the guess.
There are a few limitations to using LLMs for such first-guessing: in humans, the built-up understanding feeds back into the guessing - as you understand the codebase more, you can intuit function and design better. You start to know patterns and conventions. The LLM will always guess from zero understanding, relying only on the averaged-out training data.
A follow-on effect is the one bunderbunder points out in their reply: while LLMs are good at identifying algorithms - mere pattern recognition - they are exceptionally bad at world-modelling the surrounding environment the program was written in and the high-level goals it was meant to accomplish. Especially for any information obtained outside the code. A human can run a git-blame and ask what team the original author was on; an LLM cannot and will not.
This makes them less useful for the task, especially in any case where you intend to write new code. Sure, it's great that the LLM can give basic explanations about a programming language or framework you don't know, but if you're going to be writing code in it, you'd be better off taking the opportunity to learn it.
To clarify my question: Based on my experience (I'm a VP for a software department), LLMs can be useful to help a team build a theory. It isn't, in and of itself, enough to build that theory: that requires hands-on practice. But it seems to greatly accelerate the process.
But people still think and reason just fine, but now at a higher level that gives them greater power and leverage.
Do you feel like you're missing something when you "cook for yourself" but you didn't plant and harvest the vegetables, raise and butcher the protein, forge the oven, or generate the gas or electricity that heats it?
You also didn’t write the CPU microcode or the compiler that turns your code into machine language.
When you cook or code, you're already operating on top of a very tall stack of abstractions.
Sure, manager-types will generally be pleased when they ask AI for some vanilla app. But when it doesn't work, who will show up to make it right? When they need something more complex, will they even know how to ask for it?
It's the savages praying to Vol, the stone idol that decides everything for them, and they've forgotten their ancestors built it and it's just a machine.
Now, we're driving such things with AI; it follows that we will see better results if we do some of the work climbing down into the supporting abstractions to make their interface more suitable for AI use. To extend your cooking metaphor, it's time to figure out the manufactured food megafactory now; yes, we're still "cooking" in there, but you might not recognize the spatulas.
Things like language servers (LSPs) are a step in this direction: making it possible to interact with the language's parser/linter/etc before compile/runtime. I think we'll eventually see that some programming languages end up being more apropos to efficiently get working, logically organized code out of an AI; whether that is languages with "only one way to do things" and extremely robust and strict typing, or something more like a Lisp with infinite flexibility where you can make your own DSLs etc remains to be seen.
Frameworks will also evolve to be AI-friendly with more tooling akin to an LSP that allows an MCP-style interaction from the agent with the codebase to reason about it. And, ultimately, whatever is used the most and has the most examples for training will probably win...
I really like this definition of "life" and "death" of programs, quite elegant!
I've noticed that I struggle the most when I'm not sure what the program is supposed to do; if I understand this, the details of how it does it become more tractable.
The worry is that LLMs make it easier to just write and modify code without truly "reviving" the program... And even worse, they can create programs that are born dead.
I was once on a project where all the original developers suddenly disappeared and it was taken over by a new team. All institutional knowledge had been lost.
We spent a ridiculous amount of time trying to figure out the original design. Introduced quite a few bugs until it was better understood. But also fixed a lot of design issues after much head bashing.
By the end, it had been mostly rewritten and extended to do things not originally planned.
But the process was painful.
Point is, LLMs make this problem 1000 times worse, so it really is a ticking time bomb that's totally new. Most people, most programmers, most day-to-day work will not include some head-in-the-clouds abstract metaprogramming, but now LLMs both force programmers to "do more" and constantly destroy anyone's flow state, memory, and the 99% of the talent and skill that comes from actually writing good code for hours a day.
LLMs are amazing, but they also totally suck, because they essentially steal learning potential and focus while increasing work pressure and complexity. And this really is new, because senior programmers are also affected by this, and you really will feel it at some point after using these systems for a while.
They make you kind of demented, and no, you can't fight this with personal development and forced book reading after getting up at 4 am - just as with scrolling and the decrease in everyone's focus, even bibliophiles'.
At least at this point, LLMs are great at the "how", but are often missing context for the "what" and "why" (whether that's because it's often not written down or not as prevalent in their training data).
Additionally, and IMO critically to this discussion: it's easy for products or features to "die" not only when the engineers associated with them lose coherence on how they are implemented from a technical perspective, but also when the product people associated with them lose coherence on why they exist or who they exist for. The product can die even if one party (e.g. engineers) still maintains coherence while the other party (e.g. product/business) does not. At this point you've hit a state where the system cannot be maintained or worked on because everyone is too afraid of breaking an existing workflow.
LLMs are, like, barely 3% of the way toward solving the hardest problems I and my coworkers deal with day-to-day. But the bigger problem is that I don't yet know which 3% it is. Actually, the biggest problem is maybe that it's a different, dynamic 3% of every new problem.
Once they're gone or no longer applying pressure, the strain is relieved, and we can shift to a more natural operation, application, or programming model.
For this reason, it helps to set expectations that people are cycled through teams at slow intervals - stable enough to build rapport, expertise, and goodwill, but transient enough to avoid stalls based on shared assumptions.
What I'm finding with LLMs is that, if you follow good modularization principles and practices, then LLMs actually make it easier to start working on a codebase you don't know very well yet, because they can help you a lot in navigating, understanding "as much as you need", and making specific changes. But that's not something that LLMs do on their own, at least from my own experience - you still need a human to enforce good, consistent modularization.
Yes, I think the point is, LLMs are making it a lot worse.
And then compounding that: in 10 years, no new senior devs will have been created, so nobody will be around to fix it. Extreme of course - there will be devs, they'll just be underwater, piled on with trying to debug the LLM stuff.
So in that theory, the senior devs of those days will still be able to command large salaries if they know their stuff - specifically, how to untangle the mess of LLM code.
If and when technical debt becomes a paralyzing problem, we'll come up with solutions. Probably agents with far better refactoring skills than we currently have (most are kind of bad at refactoring right now). What's crazy to me is how tolerant the consumer has become. We barely even blink when a program crashes. A successful AAA game these days is one that only crashes every couple hours.
I could show you a Java project from 20+ years ago and you'd have no idea wtf is going on, let alone why every object has 6 interfaces. Hey, why write SQL (a declarative, somewhat functional language, which you'd think would be in fashion today!), when you could instead write reams of Hibernate XML?! We've set the bar pretty low for AI slop.
Even were I to store the prompts & model parameters, I suspect that I wouldn't get an exact duplicate of the code running the LLM again.
Also, as soon as a human is involved in implementation, it becomes less clear. You often won't be able to assume intent correctly. There will also be long lived bugs, pointer references that are off, etc.
I concede that the opacity and inconsistency of LLMs is a big (and necessary) downside though for sure.
My hope is that people keep the dialogue going because you may be right about the feeling of LLMs speeding things up. It could likely be because people are not going through the proper processes including planning and review. That will create mountains of future work; bugs, tech debt, and simply learning. All of which still could benefit from AI tools of course. AI is a very helpful tool, but it does require responsibility.
In the medium to longer term, we might be in a situation where only the most powerful next-generation AI models are able to make sense of giant vibe-coded balls of spaghetti and mud we're about to saddle ourselves with.
You can't just replace your whole coding team and think you can proceed at the same development pace. Even if the code is relatively good and the new developers relatively skilled. Especially if you lack "architecture model" level docs.
But yeah, LLMs push it to like an absurd level. What if all your coders were autistic savant toddlers who get changed out for a new team of toddlers every month?
I could ask questions about how things were done, have it theorize about why, etc.
Obviously it's not perfect, but that's fine, humans aren't perfect either.
Strongly agree with your comment. I wonder now if this "theory building" can have a grammar, and be expressed in code; be versioned, etc. Sort of like a 5th-generation language (the 4th-generation being the SQL-likes where you let the execution plan be chosen by the runtime).
The closest I can think of:
* UML
* Functional analysis (ie structured text about various stakeholders)
* Database schemas
* Diagrams
Brownfield tasks are harder for the LLM, at least in part because it's harder to retroactively explain regular structure in a way the LLM understands and can serialize into e.g. CLAUDE.md.
As in linear programming or dynamic programming.
> I suspect that there is a strong correlation between programmers who don't think that there needs to be a model/theory, and those who are reporting that LLMs are speeding them up.
This is an interesting prediction. I think you'll get a correlation regardless of the underlying cause because most programmers don't think there needs to be a model/theory and most programmers report LLMs speeding them up.
But if you control for that, there are also some reasons you might expect the opposite to be true. It could be that programmers who feel the least sped up by LLMs are the ones who feel their primary contribution is in writing code rather than having the correct model. And people who view their job as finding the right model are more sped up because the busywork of getting the code in the right order is taken off their plate.
Ah rationalism vs empiricism again
Kant up in heaven laughing his ass off
Yeah but we can ask an LLM to read the code and write documentation, if that happens.
Also, no "small" program is ever at risk of dying in the sense that Naur describes it. Worst case, you can re-read the code. The problem lies with the giant enterprise code bases of the 60s and 70s where thousands of people have worked on it over the years. Even if you did have good documentation, it would be hundreds of pages and reading it might be more work than just reading the code.
Hopefully we can iterate and get the system producing useful documents automagically, but my worry is that it will not generalise across different systems, and as a result we will have invested a huge amount of effort into creating "AI"-generated docs for our system that could have been better spent just having humans write the docs.
But sure let's just have it generate docs, that's gonna work great.
Was some thread on here the other day, where someone said they routinely give Claude many paragraphs specifying what the code should and shouldn't do. Take 20 minutes just to type it up.
I mean, even if that did work, you still gotta read the docs to roughly the same degree as you would have had to read the code, and you have to read the code to work with it anyway.
I'd see it like transcribing a piece of music where an LLM, or an uninformed human, would write down "this is a sequence of notes that follow a repetitive pattern across multiple distinct blocks. The first block has the lyrics X, Y ...", but a human would say "this is a pop song about Z, you might listen to it when you're feeling upset."
An LLM is not capable of subtext or reading between the lines or understanding intention or capability or sarcasm or other linguistic traits that apply a layer of unspoken context to what is actually spoken. Unless it matches a pattern.
It has one set of words, provided by you, and another set of words, provided by its model. You will get the bang average response every single time and mentally fill in the gaps yourself to make it work.
But "Teams that care about quality will take the time to review and understand LLM-generated code" is already failing. Sounds nice to say, but you can't review code being generated faster than you can read it. You either become a bottleneck (defeats the point) or you rubber-stamp it (creates the debt). Pick your poison.
Everyone's trying to bolt review processes onto this. That's the wrong layer. That's how you'd coach a junior dev, who learns. AI doesn't learn. You'll be arguing about the same 7 issues forever.
These things are context-hungry but most people give them nothing. "Write a function that fixes my problem" doesn't work, surprise surprise.
We need different primitives. Not "read everything the LLM wrote very carefully", but ways to feed it the why, the motivation, the discussion, and prior art. Otherwise yeah, we're building a mountain of code nobody understands.
Gemini and Claude at least seem to work well with it, but they sometimes still make mistakes (e.g. not using C++ `auto` is a recurrent thing, even though the context markdown file clearly states not to). I think as the models improve and get better at instruction handling it will get better.
Not saying this is "the solution" but it gets some of the way.
I think we need to move away from "vibe coding", to more caring about the general structure and interaction of units of code ourselves, and leave the AI to just handle filling in the raw syntax and typing the characters for us. This is still a HUGE productivity uplift, but as an engineer you are still calling the shots on a function by function, unit by unit level of detail. Feels like a happy medium.
You might know this, but telling the LLM what to do instead of what not to do generally works better, or so I heard.
Same thing with syntax - so far we've been optimizing for humans, and humans work best at a certain level of terseness and context-dependent implicitness (when things get too verbose, it's visually difficult to parse), even at the cost of some ambiguity. But for LLMs verbosity can well be a good thing to keep the model grounded, so perhaps stuff like type inference, even for locals, is a misfeature in this context. In fact, I wonder if we'd get better results if we forced the models to e.g. spell out the type of each expression in full, maybe even outright ban stuff like method chains and require each call result to be bound to some variable (thus forcing the LLM to give it a name, effectively making a note on what it thinks it's doing).
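A contrived sketch of that difference (the domain and names are made up): the second version binds every intermediate result to a named, typed variable, which is the kind of verbosity that might keep a model grounded:

```python
from dataclasses import dataclass

@dataclass
class User:
    email: str
    score: int

# Terse, human-optimized style: heavy inference, chained expressions.
def top_emails_terse(users: list[User]) -> list[str]:
    return [u.email for u in sorted(users, key=lambda u: u.score, reverse=True)[:10]]

# Verbose, "LLM-grounding" style: every intermediate result is bound to a
# named, fully typed variable, which forces the writer (human or model) to
# state what it thinks each step produces.
def top_emails_verbose(users: list[User]) -> list[str]:
    users_sorted_by_score_desc: list[User] = sorted(
        users, key=lambda u: u.score, reverse=True
    )
    top_ten_users: list[User] = users_sorted_by_score_desc[:10]
    top_ten_emails: list[str] = [user.email for user in top_ten_users]
    return top_ten_emails
```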
Literate programming also feels like it should fit in here somewhere...
So, basically, a language that would be optimized specifically for LLMs to write, and for humans to read and correct.
Going beyond the language itself, there's also a question of ecosystem stability. Things that work today should continue to work tomorrow. This includes not just the language, but all the popular libraries.
And what are we doing instead? We're having them write Python and JavaScript, of all things. One language famous for its extreme dynamism, with a poorly bolted on static type system; another also like that, but also notorious for its footguns and package churn.
I think there's more juice to squeeze there. A lot of what we're going to learn is how to pick the right altitude of engagement with AI, I think.
It's better if the bottleneck is just reviewing, instead of both coding and reviewing, right?
We've developed plenty of tools for this (linting, fuzzing, testing, etc). I think what's going on is people who are bad at architecting entire projects and quickly reading/analyzing code are having to get much better at that and they're complaining. I personally enjoy that kind of work. They'll adapt, it's not that hard.
The problem is that LLM-driven changes require this adversarial review on every line, because you don't know the intent. Human changes have a coherence to them that speeds up review.
(And if your company culture is line-by-line review of every PR, regardless of complexity... congratulations, I think? But that's wildly out of the norm.)
Not really. There's something very "generic" about LLM-generated code that makes you just want to gloss over it, no matter how hard you try not to.
It still takes a lot of thought and effort up front to put that together, and I'm not quite sure where the breakover line is between easier-to-do-it-myself and hand-off-to-the-LLM.
The correct primitives are the tests. Ensure your model is writing tests as you go, and make sure you review the tests, which should be pretty readable. Don't merge until both old and new tests pass. Invest in your test infrastructure so that your test suite doesn't get too slow, as it will be in the hot path of your model checking future work.
Legacy code is that which lacks tests. Still true in the LLM age.
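To be concrete about what "pretty readable" tests means here, a made-up example of the behavior-level kind that is quick to review even if you never read the implementation (`apply_discount` is just a stand-in for the generated code under test):

```python
# Made-up example of the kind of behavior-level test that's quick to review
# even if you never read the implementation it exercises.
def apply_discount(total_cents: int, coupon: str | None) -> int:
    """Toy implementation standing in for LLM-generated code under test."""
    if coupon == "SAVE10":
        return total_cents - total_cents // 10
    return total_cents

def test_save10_coupon_takes_ten_percent_off():
    assert apply_discount(10_000, "SAVE10") == 9_000

def test_unknown_coupon_changes_nothing():
    assert apply_discount(10_000, "BOGUS") == 10_000

def test_no_coupon_changes_nothing():
    assert apply_discount(10_000, None) == 10_000
```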
How...?
When I find code snippets on StackOverflow, I read them before pasting them into my IDE. I'm the bottleneck. Therefore there is no point in using StackOverflow...?
First, your prompts should be direct enough that the LLM doesn't wander around producing complexity for no reason.
Second, you should add rules/learning/context to always solve problems in the simplest way possible.
Lastly, after generation, you can prompt the LLM to reduce the complexity of the solution.
Coding in an object-oriented language in an enormous code base (big tech). A junior dev is making a new class and they start it off with LLM generation. The LLM adds in three separate abstract classes to the inheritance structure, for a total of seven inherited classes. Each of these inherited classes ultimately comes with several required classes that are trivial to add but end up requiring another hundred lines of code, mostly boilerplate.
Tell me how you, without knowing the code base, get the LLM to not add these classes? Our language model is already trained on our code base, and it just so happens that these are the most common classes a new class tends to inherit. Junior dev doesn't know that the classes should only be used in specific instances.
Sure, you could go line by line and say "what does this inherited class do, do I need it?" and actually, the dev did that. It cut down the inherited classes from three to two, but missed two of them because it didn't understand on a product side why they weren't needed.
Fast forward a year, these abstract classes are still inherited, no one knows why or how because there's no comprehension but we want to refactor the model.
"Well we have this starting function which clearly can solve the task at hand. Its something 99 developers would be happy with, but I can't help but see that if we just reformulate it into a do-while instead we now can omit the checks here and here, almost cutting it in half."
Now obviously it doesn't suffice as a real-world example but, when scaled up, it is a great view of what waste can accumulate at the macro level. I would say the ability to do this is tied to a survival instinct, one which, undoubtedly, will be touted as something that'll be put in the 'next iteration' of the model. It's not strictly something I think can be trained to be achievable, as in pattern matching, but it's clearly not achievable yet, as in your example from above.
Stop talking to it like a chatbot.
Draft, in your editor, the best contract-of-work you can as if you were writing one on behalf of NASA to ensure the lowest bidder makes the minimum viable product without cutting corners.
---
Goal: Do X.
Sub-goal 1: Do Y.
Sub-goal 2: Do Z.
Requirements:
1. Solve the problem at hand in a direct manner with a concrete implementation instead of an architectural one.
2. Do not emit abstract classes.
3. Stop work and explain if the aforementioned requirements cannot be met.
---
For the record: Yes, I'm serious. Outsourcing work is neither easy nor fun.
If doing those is easy, then I would assume that the software isn't that novel in the first place. Maybe get something COTS.
I've been coding for 25 years. It is easier for me to describe what I need in code than it is to do so in English. May as well just write it.
20 here, mostly in C; mixture of systems programming and embedded work.
My only experience with vibe-coding is when working under a time-crunch very far outside of my domain of expertise, e.g., building non-transformer-based LLMs in Python.
No amount of "knowing how to program" is going to give me >10 years of highly-specialized PhD-level Mathematics experience in under three months.
You tell them not to create extra abstract classes and put that in your onboarding docs.
You literally do the same thing with llms. Instead of onboarding code standards docs you make rules files or whatever the llm needs.
I started reading it and a key plot point is that there is a computer system that is thousands of years old. One of the main characters has "cold sleeped" for so long that he's the only one who knows some of the hidden backdoors. That legacy knowledge is then used to great effect.
Highly recommend it for a great fictional use of institutional knowledge on a legacy codebase (and a great story overall).
Another great example:
In Fire Upon the Deep, due to the delay in communications between star systems, everyone uses a descendant of Usenet.
The future I dream of.
Looks like it is the second in a trilogy. Can you just dive in or did you read the first book before?
However, I would recommend skipping Children of the Sky. It's not as good, and was clearly intended as the first installment of a series which Vinge was unable to complete. :(
Chronologically, DitS takes place before FotD. But there is exactly one character in common between the two books, and while he is a major character in both, none of the events of DitS are relevant to the story in FotD (which makes sense since FotD was written first).
So it's really largely a matter of preference as to which one to read first. I would say that FotD has more action and, for the lack of better term, "weirdness" in the setting; while DitS is more slow-paced, with more character development and generally more fleshed-out characters, and explores its themes deeper. But both books have plenty for your mind to chew on.
All in all I think FotD is an easier read, and DitS is a more rewarding one, but this is all very subjective.
One upside to the books being decoupled as much as they are is that whichever one you start with, you get a complete story, so even if you're a completionist you can disregard the other book if you don't like the first one.
General recommendation is to read them in order (Fire first, Deepness second) but I don't really think it matters.
The second book is just as good, but doesn't try as hard to get you addicted early on. The assumption is that you already know how good Vinge's work is.
I recommend starting with Fire Upon the Deep.
LLM is pushing that layer towards natural language and spec-driven development. The only *big* difference is that high level programming languages are still deterministic but natural language is not.
I'm guessing we've reached an irreducible point where the amount of information needed to specify the behavior of a program is nearly optimally represented in programming languages after decades of evolution. More abstraction into the natural-language realm would make it lossy. And less abstraction, down to low-level code, would make it verbose.
The previous tools (assemblers, compilers, frameworks) were built on hard-coded logic that can be checked and even mathematically verified. So you could trust what you're standing on. But with LLMs we jump off the safely-built tower into a world of uncertainty, guesses, and hallucinations.
JavaScript has a ton of behavior that is very uncertain at times and I'm sure many JS developers would agree that trusting what you're standing on is at times difficult. There is also a large percentage of developers that don't mathematically verify their code, so the verification is kind of moot in those cases, hence bugs.
The current world of LLM code generation lacks the verification you are looking for, however I am guessing that these tools will soon emerge in the market. For now, building as incrementally as possible and having good tests seems to be a decent path forward.
We call a C->asm compiler "correct" if the meaning of every valid C program turns into an assembly program with equivalent meaning.
The reason LLMs don't work like other compilers is not that they're non-deterministic, it's that the source language is ambiguous.
LLMs can never be "correct" compilers, because there's no definite meaning assigned to English. Even if English had precise meaning, LLMs will never be able to accurately turn any arbitrary English description into a C program.
Imagine how painful development would be if compilers produced incorrect assembly for 1% of all inputs.
The LLM in this loop is the equivalent of a human, which also has ambiguous source language if we’re going by your theory of English being ambiguous. So it sounds like you’re saying that if a human produces a C program, it is not verifiable and testable because the human used an ambiguous source language?
I guess for some reason people thought I meant that the compiler would be LLM > machine code, where actually I meant the compiler would still be whatever language the LLM produces down to machine code. It's just that the language the LLM produces can be checked through things like TDD or a human, etc...
I don't think you have thought about this deeply enough. Who or what would do the checking, and according to what specifications?
I understand that an input to an LLM will create a different result in many cases, making the output non-deterministic, but that doesn't mean we can't use probability to arrive at results eventually.
Verifying code produced is a much simpler task for some code because I, as a human, can look at a generated snippet and reason about it and determine if it is what I want. I can also create tests to say “does this code have this effect on some variable” and then proceed to run the test.
Most programmers that write JavaScript for a living don't really understand how to scale applications in JavaScript, which includes data structures in JavaScript. There is a very real dependence on layers of abstractions to enable features that can scale. They don't understand the primary API to the browser, the DOM, at all and many don't understand the Node API outside the browser.
For an outside observer it really begs the Office Space question: what would you say you do here? It's weird trying to explain it to people completely outside software. For the rest of us in software, we are so used to this that we take the insanity for granted as an inescapable reality.
Ironically, at least in the terms of your comment, when you confront JavaScript developers about this lack of fundamental knowledge, comparisons to assembly frequently come up. As though writing JavaScript directly is somehow equivalent to writing machine code - but for many people in that line of work, they are equally distant realities.
The introduction of LLMs makes complete sense. When nobody knows how any of this code works then there isn't a harm to letting a machine write it for you, because there isn't a difference in the underlying awareness.
Although I'm sure you are correct, I would also want to mention that most programmers that write JavaScript for a living aren't working for Meta or Alphabet or other companies that need to scale to billions, or even millions, of users. Most people writing JavaScript code are, realistically, going to have fewer than ten thousand users for their apps. Either because those apps are for internal use at their company (such as my current project, where at most the app is going to be used by 200-250 people, so although I do understand data structures I'm allowing myself to do O(N^2) business logic if it simplifies the code, because at most I need to handle 5-6 requests per minute), or else because their apps are never going to take off and get the millions of hits that they're hoping for.
If you don't need to scale, optimizing for programmer convenience is actually a good bet early on, as it tends to reduce the number of bugs. Scaling can be done later. Now, I don't mean that you should never even consider scaling: design your architecture so that it doesn't completely prevent you from scaling later on, for example. But thinking about scale should be done second. Fix bugs first, scale once you know you need to. Because a lot of the time, You Ain't Gonna Need It.
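A contrived illustration of that trade-off (the names are made up): the first version is the nested O(N*M) pass that is obviously correct at a glance; the second adds bookkeeping to scale. At a couple hundred users, either is fine.

```python
# Contrived illustration: the nested pass is O(N*M) but obvious at a glance;
# the indexed version scales better at the cost of extra bookkeeping.
def orders_per_user_simple(users: list[str], orders: list[dict]) -> dict[str, int]:
    return {
        user: sum(1 for order in orders if order["user"] == user)
        for user in users
    }

def orders_per_user_scalable(users: list[str], orders: list[dict]) -> dict[str, int]:
    counts = {user: 0 for user in users}
    for order in orders:
        if order["user"] in counts:
            counts[order["user"]] += 1
    return counts
```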
Case in point: I'm seeing much more success in LLM driven coding with Rust, because the strong type system prevents many invalid states that can occur in more loosely or untyped languages.
It takes longer, and often the LLM has to iterate through `cargo check` cycles to get to a state that compiles, but once it does the changes are very often correct.
The Rust community has the saying "if it compiles, it probably works". You can still have plenty of logic bugs of course, but the domain of possible mistakes is smaller.
What would be ideal is a very strict (logical) definition of application semantics that LLMs have to implement, and that ideally can be checked against the implementation. As in: have a very strict programming language with dependent types, littered with pre/post conditions, etc.
LLMs can still help to transform natural language descriptions into a formal specification, but that specification should be what drives the implementation.
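A minimal sketch of what "the spec drives the implementation" could look like, using plain assertions as stand-ins for real pre/post conditions or dependent types (the function and its contract are invented for illustration):

```python
# Minimal sketch: plain assertions as stand-ins for real pre/post conditions.
# The point is that the spec is checkable code the generated implementation
# must satisfy, not prose.
def allocate_seats(requested: int, available: int) -> int:
    # Preconditions (the spec the caller must meet).
    assert requested >= 0, "requested seats must be non-negative"
    assert available >= 0, "available seats must be non-negative"

    granted = min(requested, available)  # candidate implementation

    # Postconditions (the spec the implementation must meet).
    assert 0 <= granted <= requested
    assert granted <= available
    assert granted in (requested, available)
    return granted
```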
> By then Mike had voder-vocoder circuits supplementing his read-outs, print-outs, and decision-action boxes, and could understand not only classic programming but also Loglan and English, and could accept other languages and was doing technical translating—and reading endlessly. But in giving him instructions was safer to use Loglan. If you spoke English, results might be whimsical; multi-valued nature of English gave option circuits too much leeway.
For those unfamiliar with it, it's not that Lojban is perfectly unambiguous. It's that its design strives to ensure that ambiguity is always deliberate by making it explicit.
The obvious problem with all this is that Lojban is a very niche language with a fairly small corpus, so training AI on it is a challenge (although it's interesting to note that existing SOTA models can read and write it even so, better than many obscure human languages). However, Lojban has the nice property of being fully machine parseable - it has a PEG grammar. And, once you parse it, you can use dictionaries to construct a semantic tree of any Lojban snippet.
When it comes to LLMs, this property can be used in two ways. First, you can use structured output driven by the grammar to constrain the model to output only syntactically valid Lojban at any point. Second, you can parse the fully constructed text once it has been generated, add semantic annotations, and feed the tree back into the model to have it double-check that what it ended up writing means exactly what it wanted to mean.
With SOTA models, in fact, you don't even need the structured output - you can just give them the parser as a tool and have them iterate. I did that with Claude and had it produce Lojban translations that, while not perfect, were very good. So I think that it might be possible, in principle, to generate Lojban training data out of other languages, and I can't help but wonder what would happen if you trained a model primarily on that; I suspect it would reduce hallucinations and generally improve metrics, but this is just a gut feel. Unfortunately this is a hypothesis that requires a lot of $$$ to properly test...
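For what it's worth, the "parser as a tool" loop is roughly the shape below; both `generate` and `parse_lojban` are placeholders for whatever model client and Lojban PEG parser you wire in, not real APIs:

```python
# Rough shape of the "parser as a tool" loop. `generate` and `parse_lojban`
# are placeholders, not real APIs.
def generate(prompt: str) -> str:
    raise NotImplementedError("call your model of choice here")

def parse_lojban(text: str) -> tuple[bool, str]:
    """Return (ok, diagnostics) from a PEG parse of the candidate text."""
    raise NotImplementedError("call a Lojban PEG parser here")

def translate_to_lojban(english: str, max_rounds: int = 5) -> str:
    candidate = generate(f"Translate into Lojban:\n{english}")
    for _ in range(max_rounds):
        ok, diagnostics = parse_lojban(candidate)
        if ok:
            return candidate
        # Feed the parse errors (ideally plus an annotated semantic tree)
        # back to the model and ask for a revision.
        candidate = generate(
            "Your Lojban did not parse.\n"
            f"Diagnostics:\n{diagnostics}\n"
            f"Original English:\n{english}\n"
            f"Previous attempt:\n{candidate}\n"
            "Produce a corrected translation."
        )
    raise ValueError("no parseable translation after max_rounds attempts")
```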
The nature of programming might have to shift to embrace the material property of LLM. It could become a more interpretative, social, and discovery-based activity. Maybe that's what "vibe coding" would eventually become.
This sounds like an unmaintainable, tech debt nightmare outcome to me
It is, for example, possible to formally verify or do 100% exhaustive testing as you go lower down the stack. I can't imagine this would be possible between NLs and PLs.
Arguably, determinism isn't everything in programming: It's very possible to have perfectly deterministic, yet highly surprising (in terms of actual vs. implied semantics to a human reader) code.
In other words, the axis "high/low level of abstraction" is orthogonal to the "deterministic/probabilistic" one.
Without determinism, learning becomes less rewarding.
Specs are not more abstract but more ambiguous, which is not the same thing.
This is why you see so many failed startups around Slack/email/Jira efficiency. Half the time you do not know whether you missed critical information, so you need to go to the source, negating the gains you got from the information that was successfully summarized.
It took a ton of effort on his part to convince his manager that this wasn't ready to be merged.
I wonder how much vibe coded software is out there in the wild that just appears to work?
The more dangerous thing is that such idiot managers can judge you through their lens of shipping LLM garbage they never applied in reality to see the consequences, living in a fantasy due to a lack of technical knowledge. Of course it directly leads to firing people and piling more tasks/ballooning expectations onto the leftover team, who are trapped into burning out and being replaced as trash, since that makes total sense in their worldview and "evidence".
https://www.scottsmitelli.com/articles/altoids-by-the-fistfu...
I haven't seen a truly non-technical manager in over 15 years.
Even a calculator doesn't work if one doesn't use it correctly. Agentic coding works very well if used correctly, such as in the following way:
1. Define your task prompt as well as possible. Refine it via the LLM, having the LLM review it, and repeat this process ad infinitum until there are no important issues left to fix. If possible, use multiple LLMs to identify gaps in your task prompt. You now have your refined task specification. This is the most time consuming step. Sometimes it's necessary to add API docs and SDKs to the context.
2. Use a good reasoning LLM by OpenAI or Claude or Gemini or Grok to execute the spec.
3. Review the generated code line by line. Make any necessary changes, either manually or again using the LLM. With any luck there won't be anything to fix.
If used in this way, it works so well.
0. Pick a task that's not too complicated for the LLM, and use languages and frameworks that it knows about.
If you're within that zone, it all feels magical. Step outside and things fall apart really quickly.
And why should they? Most people will pay them, churn out whatever code, it will likely never be deployed or used by anyone (this is true of most code created by a real engineer too). By the time the user has figured out what they have "created" isn't real, Loveable is on to the next mark/user.
You just don't build up the necessary mental model of what the code does when vibing, and so although you saved time generating the code, you lose all that anyway when you hit a tricky bug and have to spend time building up the mental model to figure out what's wrong.
And saying "oh just do all the planning up front" just doesn't work in the real world where requirements change every minute.
And if you ever see anyone using "accepted lines" as a metric for developer productivity/hours saved, take it with a grain of salt.
Why? It was almost meant in jest, as a joke - no one seriously believes you don't need to review code. You end up in spaghetti land so quickly that I can't believe anyone who tried "vibe coding" for more than a couple of hours didn't quickly give up on something that is so obviously infeasible.
Now, reviewing whatever the LLM gives you back, carefully massaging it into the right shape, then moving on definitely helps my programming a lot - but careful review is needed to ensure the LLM had the right context, so it's actually correct. But then we're in "pair programming" territory rather than blindly accepting whatever the LLM hands you, AKA "vibe coding".
However, code that is well-designed by humans tends to be easier to understand than LLM spaghetti.
Additionally you may have institutional knowledge accessible. I can ask a human and they can explain what they did. I can ask an LLM, too and they will give me a plausible-sounding explanation of what they did.
The weights won't have that by default, true, that's not how they were built.
But if you're a developer and can program things, there is nothing stopping you from letting LLMs have access to those details, if you feel like that's missing.
I guess that's why they call LLMs "programmable weights", you can definitely add a bunch of context to the context so they can use it when needed.
LLMs can barely do 2+2, humans don't even understand the weights if they see them. LLMs can have all the access they want to their own weights and they won't be able to explain their thinking.
Modern Addendum: And if you have an LLM generate your code, you'll need one twice as smart to debug it.
In other words, debugging can be at the same "intelligence" level, but since an LLM doesn't really know what it is doing, it can make errors it won't comprehend on its own. The experience is a lot like working with a junior programmer, who may write a bunch of code but cannot figure out what they got wrong.
Just maybe, it's the difference between the "medium" and "high" suffixed thinking modes of an LLM.
Fwiw, for complicated functions that must exist, I have the LLM write a code comment explaining the intent and the approach.
The challenge of navigating rapidly changing or poorly documented code isn’t new: It’s been a constant at every company I’ve worked with. At larger organizations the sheer volume of code, often written by adjacent teams, will outpace your ability to fully understand it. Smaller companies tend to iterate so quickly (and experience so much turnover) that code written two weeks ago might already be unrecognizable, if the original author is even still around after those two weeks!
The old adage still applies: the ability to read code is more crucial than the ability to write it. LLMs just amplify that dynamic. The only real difference is that you should assume the author is gone the moment the code lands. The author is ephemeral, or they went on PTO/quit immediately afterward: Whatever makes you more comfortable.
LLMs don't "just" amplify that dynamic
They boost it to impossibly unsustainable levels
First, refactoring code. Specifically, recently I used it on a library that had solid automated testing coverage. I needed to change the calling conventions of a bunch of methods and classes in the library, but didn’t want to rewrite the 100+ unit tests by hand. Claude did this quickly and without fuss.
Second is one-time-use code. Basically, let's say you need to convert a bunch of random CSV files to a single YAML file, or convert a bunch of video files in different formats to a single standard format, or find any photos in your library that are out of focus. This works reasonably well (see the sketch below).
Bonus one is just generating sample code for well known libraries.
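For the second category (one-time-use code), the kind of throwaway script meant is something like this sketch; the directory layout, file names, and keys are made up:

```python
# The kind of throwaway, one-time-use script meant here: merge a directory
# of CSV files into a single YAML file.
import csv
import pathlib

import yaml  # PyYAML

def merge_csvs_to_yaml(csv_dir: str, out_path: str) -> None:
    merged: dict[str, list[dict]] = {}
    for csv_file in sorted(pathlib.Path(csv_dir).glob("*.csv")):
        with csv_file.open(newline="") as f:
            merged[csv_file.stem] = list(csv.DictReader(f))
    with open(out_path, "w") as f:
        yaml.safe_dump(merged, f, sort_keys=False)

if __name__ == "__main__":
    merge_csvs_to_yaml("exports/", "combined.yaml")
```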
I have been curious what would happen if I handed something like Claude a whole server and told it to manage it however it wants with relatively little instruction.
In the past the problem was about transferring a mental model from one developer to the other. This applied even when people copy-pasted poorly understood chunks of example code from StackOverflow. There was specific intent and some sort of idea of why this particular chunk of code should work.
With LLM-generated software there can be no underlying mental model of the code at all. None. There is nothing to transfer or infer.
I’ve had to give feedback to some junior devs who used quite a bit of LLM created code in a PR, but didn’t stop to question if we really wanted that code to be “ours” versus using a library. It was apparent they didn’t consider alternatives and just went with what it made.
I.e., if I need a method (or small set of methods) with clearly defined inputs and outputs, probably because they follow a well-known algorithm, AI is very useful. But in this case, wider comprehension isn't needed, because all the LLM is doing is copying and adjusting.
E.g. "extract the logic in MyFunc() in foo.cc into a standalone helper and set up all the namespaces and headers so that it can be called from MyFunc() and also in bar.cc. Add tests and make sure it all compiles and works as expected, then call it in bar.cc in the HTTP handler stub there."
It never needs to make architectural decisions. If I watch it and it looks like it is starting to go off the rails and do something odd, I interrupt it and say "Look at baz.cc and follow the coding style and pattern there" or whatever.
Seems to work well.
I feel like as an engineer I am moving away from concrete syntax, and up an abstraction level into more of abstract form where I am acting more like a TL reviewing code and making the big-brush decisions on how to structure things, making course corrects as I go. Pure vibe-coding is rare.
And while I could catch that because I wrote the code in question and know the answers to those questions, others do not have that benefit. The notion that someone new to the codebase - especially a relatively inexperienced dev - would have AI "documentation" as a starting point is honestly quite terrifying, and I don't see how it could possibly end with anything other than garbage out.
I'm not sure how or why the conversation shifted from LLMs helping you "consume" vs helping you "produce". Maybe there's not as much money in having an Algolia-on-steroids as there is in convincing execs that it will replace people's jobs?
However I think we should be thinking harder about how coding will change as LLMs change the economics of writing code:
- If the cost of delivering a feature is ~0, what's the point in spending weeks prioritizing it? Maybe Product becomes more like an iterative QA function?
- What are the risks that we currently manage through good software engineering practices, and what's the actual impact of those risks materializing? For instance, if we expose customer data that's probably pretty existential, but most companies can tolerate a little unplanned downtime (even if they don't enjoy it!). As the economics change, how sustainable is the current cost/benefit equilibrium of high-quality code?
We might not like it, but my guess is that in ≤ 5 years actual code is more akin to assembler: sure, we might jump in and optimize, but we are really just monitoring the test suites, coverage, and risks rather than tuning whether or not the same library function is being evolved in a way which gives leverage across the code base.
"High quality code"? The standard today is "barely functional", if we lower the standards any further we will find ourselves debating how many crashes a day we're willing to live with, and whether we really care about weekly data loss caused by race conditions.
Writing code and delivering a feature are not synonymous. The time spent writing code is often significantly less than the time spent clarifying requirements, designing the solution, adjusting the software architecture as necessary, testing, documenting, and releasing. That effort won't be driven to 0 even if an LLM could be trusted to write perfect code that didn't need human review.
My experiences so far boil down to: APIs, function descriptions, overall structure, and testing. In other words, ask a dev to become an architect who defines the project and lays out the structure. As long as the first three points are well settled, code-gen quality is pretty good. Many people believe the last point (testing) should be done automatically as well. While LLMs may help with unit tests or tests on macro structures, I think people need to define high-level, end-to-end testing goals from a new angle.
Just like strong typing reduces the number of tests you need (because the scope of potential errors is reduced), there is a giant increase in error scope when you can't assume the writer to be rational.
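A small illustration of the first half of that claim, using a hypothetical Fetch() API that is not from the original comment: a stronger type removes a whole class of mistakes, so no test is needed for that class.

```cpp
#include <chrono>
#include <iostream>
#include <string>

// Hypothetical API used only to illustrate the point. With a bare `int timeout`
// you would want a test for "caller passed seconds where milliseconds were
// expected"; with std::chrono that whole class of mistake fails to compile.
void Fetch(const std::string& url, std::chrono::milliseconds timeout) {
  std::cout << "GET " << url << " (timeout " << timeout.count() << " ms)\n";
}

int main() {
  // Fetch("https://example.com", 5000);                  // rejected: 5000 of what unit?
  Fetch("https://example.com", std::chrono::seconds(5));  // converts losslessly to ms
}
```

The commented-out call with a bare 5000 is exactly the kind of mistake that would otherwise need a test (or a production incident) to catch.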
Staying realistic, we can say with some confidence that within the next 6-12 months alone, there are good reasons to believe that local, open-source models will match their bigger cloud cousins in coding ability, or get very close. Within the next year or two, we will quite probably see GPT-6 and Sonnet 5.0 come out, dwarfing all the models that came before. With this, there is a high probability that any comprehension or technical debt accumulated over the past year or more will be rendered completely irrelevant.
The benefits of any development done between now and then, even sloppy development, should more than make up for the downside caused by tech debt or excess complexity. Even if I'm dead wrong and we hit a ceiling in LLMs' ability to grok huge or complex codebases, it is unlikely to appear within the next few months. Additionally, behind closed doors the progress being made is nothing short of astounding. Recent research at Stanford might quite simply change all of these naysayers' minds.
When I really need to understand what's happening with code, I generally write each step out myself.
LLMs make it much easier for me to do this step and more. I've used LLMs to quickly file PRs for new (to me) code bases.
A lot of these criticisms are valid and I recognise there's a need for people to put their own personal stake in the ground as being one of the "true craftsmen" but we're now at the point where a lot of these articles are not covering any real new ground.
At least some individual war stories from people who have tried to apply LLMs would be nice, as well as not pretending that the problem of sloppy code didn't exist before LLMs.
Certainly not remotely the same volume of sloppy code
Impossibly high volumes of bad code are a new problem
Is this something people are doing?
https://github.com/github/spec-kit
""" Spec-Driven Development flips the script on traditional software development. For decades, code has been king — specifications were just scaffolding we built and discarded once the "real work" of coding began. Spec-Driven Development changes this: specifications become executable, directly generating working implementations rather than just guiding them. """
The takeaway is that instead of vibecoding you write specs and you get the LLM to align the generated code to the specs.
An LLM-assisted engineer writes code faster than a careful person can review it.
Eventually the careful engineers get run over by the sheer amount of work to check, and code starts passing reviews when it shouldn't.
It sounds obvious that careless work is faster than careful work, but there are psychological issues in play - management's expectation of AI as a speed multiplier, personal interest in being perceived as someone who delivers fast, engineers' concern about being seen as a bottleneck for others…
In many cases, it's more than expectation. For top management especially, these are the people who have signed off on massive AI spending on the basis that it will improve productivity. Any evidence to the contrary is not just counter to their expectations - it's a giant flashing neon sign screaming "YOU FUCKED UP". So of course organizations run by those people are going to pretend that everything is fine, for as long as anything works at all.
And then the other side of this is the users. Who have already been conditioned to shrug at crappy software because we made that the norm, and because the tech market has so many market-dominant players or even outright monopolies in various niches that users often don't have a meaningful choice. Which is a perfect setup for slowly boiling the frog - even if AI is used to produce sloppy code, the frog is already used to hot water, and already convinced that there's no way out of the pot in any case, so if it gets hotter still they just rant about it but keep buying the product.
Which is to say, it is a shitshow, but it's a shitshow that can continue for longer than most engineers have the emotional capacity to sustain without breaking down. In the long term, I expect AI coding in this environment to act as a filter: it will push the people who care about quality and polish out of the industry, and reward those who treat clicking "approve" on AI slop as their real job description.
The thing I struggle with most when I use LLMs to generate entire features with limited guidance (so far only in hobby projects) is the LLM duplicating functionality or not sticking to existing abstractions. For example, if in the existing code A calls B to get some data, and now you need to do some additional work on that data (e.g. enriching or verifying it), that change could be made in A, made in B, or you could make a new B2 that is just like B but with that slight tweak. Each of those could be appropriate, and LLMs sometimes make hilariously bad calls here.
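A minimal sketch of that A/B/B2 choice, with hypothetical names since the original comment gives no code:

```cpp
#include <vector>

// Hypothetical types, only to make the A / B / B2 options concrete.
struct Order { int id = 0; double total = 0.0; };
struct Database { /* connection details elided */ };

std::vector<Order> LoadOrders(Database&) { return {}; }  // "B": existing data access

void RenderDashboard(Database& db) {                     // "A": existing caller
  std::vector<Order> orders = LoadOrders(db);
  // ... render ...
}

// New requirement: verify/enrich the orders. Three placements a model (or a
// human) might pick, each defensible in some codebases:
//  1. In A:  RenderDashboard() verifies after calling LoadOrders().
//  2. In B:  LoadOrders() verifies before returning, changing every caller.
//  3. A B2:  LoadVerifiedOrders() wraps LoadOrders(); easy to end up with two
//            near-identical loaders that drift apart because nobody remembers
//            the second one exists.
```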
Yes...? Why wouldn't you always do this LLM or not?
"Please generate unit tests for the website that exercise documented functionality" into the LLM used to generate the website should do it.
The quote that is interesting in the context of fast-paced LLM development is this:
> The Dark Matter Developer will never read this blog post because they are getting work done using tech from ten years ago and that's totally OK
[1] https://www.hanselman.com/blog/dark-matter-developers-the-un...
Well, they kept limping along with that mess for another ten years while the industry sprinted ahead. They finally released a new product recently, but I don't think anyone cares, because everyone else did it better five years ago.
And when you think about it, LLMs are pretty much, by design, machines that look for truth in mediocrity, in the etymological sense of the word.
LLMs have made content worth precisely zero. Any content can be duplicated with a prompt. That means code is also worth precisely zero. It doesn't matter if humans can understand the code; what matters is whether the LLM can understand the code and make modifications.
As long as the LLM can read the code and adjust it based on the prompt, what happens on the inside doesn't matter. Anything can be fixed with simply a new prompt.
You can have functional tests, sure, but if there's one thing that LLMs (and AI in general) are good at, it's finding unconventional ways to game metrics.
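A hypothetical (and deliberately silly) example of what gaming a functional test can look like:

```cpp
#include <cassert>
#include <string>

// Hypothetical "fix" that turns a failing functional test green without fixing
// anything: special-case the exact input the test uses.
std::string FormatPrice(long cents) {
  if (cents == 199) return "$1.99";    // pasted in to satisfy the test below
  return "$" + std::to_string(cents);  // still wrong for every other input
}

int main() {
  assert(FormatPrice(199) == "$1.99"); // passes -- the metric, not the behaviour, was fixed
}
```

The test suite reports green either way, which is exactly the problem being pointed at.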
> Teams that care about quality will take the time to review and understand (and more often than not, rework) LLM-generated code before it makes it into the repo. This slows things down, to the extent that any time saved using the LLM coding assistant is often canceled out by the downstream effort.
I recently tried a mini experiment for myself to (dis)prove similar notions. I feel more convinced we'll figure out a way to use LLMs and keep maintainable repositories.
I intentionally tried to use a language I'm not as proficient in (but I obviously have a lot of background in programming) to see if I could keep steering the LLM effectively.
https://kau.sh/blog/container-traffic-control/
and I saved a *lot* of time.
I think this might be the wrong assumption. In the same way the news happens to be wrong about topics you know, I think it's probably better to judge it on code you know rather than code you don't.
It's easy to accept whatever the output was if you don't know what you're looking at.
It'll be interesting to see what it tells experts about sloppy, private code bases (you can't use existing OSS examples, because opinions and docs would be in the LLM corpus rather than derived from the code itself).
But "I'm not proficient" != "I don't know" (i.e. I worked with JavaScript many moons ago, but I wouldn't consider myself an expert at it today).
I like to think I can still spot unmaintainable vs maintainable code, but I understand your point that maybe the thinking is to have an expert state that opinion.
The code is [oss](https://github.com/kaushikgopal/ff-container-traffic-control) btw, so I'd love to get other takes.
But nearly every engineer I've ever spoken to has over-indexed on 'tech debt bad'. Tech debt is a lot like normal debt - you can have a lot of it and still be a healthy business.
The other side of the equation is that it's easier to understand and make changes to code with LLMs. I've been able to create "Business Value" (tm) in other people's legacy code bases in languages I don't know by making CRUD apps do things differently from how they currently do things.
Before, I'd have needed to hire a developer who specialises in that language and pay them to get up to speed on the code base.
So I agree with the article that the concerns are valid, but overall I'm optimistic that it's going to balance out in the long run - we'll have more code, throw away more code, and edit code faster, and a lot of that will cancel.
If the assertion is, I want to use non-LLM methods to maintain LLM-generated code, then I agree, there is a looming problem.
The solution to making LLM-generated code maintainable involves:
1) Using good design practices before generating the code, e.g. have a design and write it down. This is a good practice regardless of maintainability issues because it is part of how you get good results getting LLMs to generate code.
2) Keeping a record of the prompts that you used to generate the code, as part of the code. Do NOT exclude CLAUDE.md from your git repo, for instance, and extract and save your prompts (a minimal sketch of one such convention follows after this list).
3) Maintain the code with LLMs, if you generated it with LLMs.
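As a sketch of point 2, one possible convention (entirely hypothetical, not part of any tool) is a provenance header in each generated file that points at the committed prompt and transcript:

```cpp
// csv_export.cc -- hypothetical generated file.
//
// Provenance (assumed convention):
//   Prompt:    prompts/csv-export.md            (checked into the repo)
//   Session:   docs/llm-sessions/csv-export.txt (full transcript, also committed)
//   Reviewed:  yes; manual edits are marked "EDIT:" and should be folded back
//              into the prompt record whenever behaviour changes.
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// EDIT: renamed from export_csv() during review to match project style.
std::string ToCsvRow(const std::vector<std::string>& fields) {
  std::ostringstream row;
  for (std::size_t i = 0; i < fields.size(); ++i) {
    if (i > 0) row << ',';
    row << fields[i];
  }
  return row.str();
}
```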
Mandatory car analogy:
Of course there was a looming maintenance problem when the automobile was introduced, because livery stables were unprepared to deal with messy, unpredictable automobiles.
They won't. In a year or two these will be articles that get linked back to similar to "Is the internet just a fad?" articles of the late 90s.
The issue is that LLMs don't "understand." They merely copy without contributing original thought or critical thinking. This is why LLMs can't handle complicated concepts in codebases.
What I think we'll see in the long run is:
(Short term) Newer programming models that target LLMs: IE, describe what you want the computer to do in plain English, and then the LLM will allow users to interact with the program in a more conversational manner. Edit: These will work in "high tolerance" situations where small amounts of error is okay. (Think analog vs digital, where analog systems tend to tolerate error more gracefully than digital systems.)
(Long term) Newer forms of AI that "understand." These will be able to handle complicated programs that LLMs can't handle today, because they have critical thinking and original thought.
“The Internet? Bah! Hype alert: Why cyberspace isn't, and will never be, nirvana” by Clifford Stoll (1995)
Excerpt: “How about electronic publishing? Try reading a book on disc. At best, it's an unpleasant chore: the myopic glow of a clunky computer replaces the friendly pages of a book. And you can't tote that laptop to the beach. Yet Nicholas Negroponte, director of the MIT Media Lab, predicts that we'll soon buy books and newspapers straight over the Internet. Uh, sure.”
https://www.nysaflt.org/workshops/colt/2010/The%20Internet.p...
“Why most economists' predictions are wrong” by Paul Krugman (1998)
Excerpt: “By 2005 or so, it will become clear that the Internet's impact on the economy has been no greater than the fax machine's.”
https://web.archive.org/web/19980610100009/http://www.redher...
> The growth of the Internet will slow drastically, as the flaw in "Metcalfe's law"--which states that the number of potential connections in a network is proportional to the square of the number of participants--becomes apparent: most people have nothing to say to each other! By 2005 or so, it will become clear that the Internet's impact on the economy has been no greater than the fax machine's.
> As the rate of technological change in computing slows, the number of jobs for IT specialists will decelerate, then actually turn down; ten years from now, the phrase information economy will sound silly.
There is certainly real market penetration with LLMs. However, there is a huge gap between fantasy and reality - as in what is being promised vs what is being delivered and the effects on the economy are yet to play out.
Less than a year ago I was generating somewhat silly and broken unit tests with Copilot. Now I'm generating entire feature sets while doing loads of laundry.
> But those of us who’ve experimented a lot with using LLMs for code generation and modification know that there will be times when the tool just won’t be able to do it.
The pace of change here (the new normal pace) has the potential to make this look outdated in mere months. Finding that the curve topped out in exactly late 2025, such that this remains the state of development for many years, seems intuitively very unlikely.
The last percentage points needed to get something just right are the hardest, so why are you so sure that the flaws in LLMs will be gone in such a short time frame?
For complex tasks, I use it just to help me plan or build a draft (and hacky) pull request, to explore options. Then I rewrite it myself, again keeping the best part for myself.
To me, LLMs have made writing code even more fun than it was before. I guess the outcome only depends on the user. At this point, it's clear that all my peers who can't have fun with it are using it the way they use ChatGPT: just throwing a prompt at it, hoping for the best, and then getting frustrated.
You can't change your stance later, it will just give you a headache.
When the former breaks, you fix it like conventional bug hunting. When the latter breaks, you fix it by either asking LLM to fix it or scrap it and ask LLM to regenerate it.
The wave’s still breaking, so I’m going to ride it out until it smooths into calm water. Maybe it never will. I don't know.
This is a pretty seriously bad difference imo
Tests prevent regressions and act as documentation. You can use them to prove any refactor is still going to have the same outcome. And you can change the production code on purpose to break the tests and thus prove that they do what they say they do.
And your AI can use them to work on the codebase too.
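A tiny example of what that looks like in practice; Slugify() and its expected outputs are hypothetical, the point is only that the asserts pin the behaviour for the next refactor, human or agent:

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: the tests below pin its observable behaviour, so any
// rewrite (by a person or an agent) that changes the output fails immediately.
std::string Slugify(const std::string& title) {
  std::string out;
  for (unsigned char c : title) {
    if (std::isalnum(c)) out += static_cast<char>(std::tolower(c));
    else if (!out.empty() && out.back() != '-') out += '-';
  }
  while (!out.empty() && out.back() == '-') out.pop_back();
  return out;
}

int main() {
  assert(Slugify("Hello, World!") == "hello-world");
  assert(Slugify("  spaces  ") == "spaces");
  return 0;
}
```

Deliberately breaking Slugify() (say, dropping the lowercase step) makes both asserts fail, which is the "prove the tests do what they say they do" step from the comment above.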
I like writing code because eventually I have to fix code. The writing will help me have a sense for what's going on. Even if it will only be 1% of the time I need to fix a bug, having that context is extremely valuable.
Then reserve AI coding for when there's true boilerplate or a near copy-paste of a pattern.
This is not full blown vibe coding of a web application to be sure.
The analogy with debt breaks down when you can discard the program and start anew, probably at great cost to the company. But since that cost falls on the company rather than the developers, no developer is actually paying the debt - greenfield development is almost always more invigorating than maintaining legacy code. It's a bailout (really debt forgiveness) of technical debt by the company, who also happens to be paying the developers a good wage on the very nebulous promise that this won't happen again (spoiler: it will).
What developers need to do to get a bailout is enough reputation and soft skills to convince someone a rewrite is feasible and the best option. And leadership who is not completely convinced you should never rewrite programs from scratch.
Joel Spolsky's beliefs here are worth a revisit in the face of hastened code generation by LLMs too, as it was based completely on human-created code: https://www.joelonsoftware.com/2000/04/06/things-you-should-...
Some programs still should not be rewritten: Excel, Word, many of the more popular and large programs. However, many smaller or medium applications being maintained by developers using LLMs in this way will more easily end up with a larger fraction of LLM-generated code that is harder to understand (again, if you believe the article). Whereas before you might have rewritten a small program, you might now rewrite a medium one.
I can ask questions like, “how is this code organized” and, “where does [thing] happen?”
The market will eventually self correct once folks get more burned by that.
And this is where I stop reading. You cannot make such a descriptive statement without some sort of corroborating evidence other than your intuition/anecdotes.
Most complex production systems do not have this level of documentation and/or regression coverage, nor I suspect will any AI-generated system. The requirements you fed the AI to "specify" the system aren't even close to a 100% coverage regression test suite, even of the product features, let alone all the more detailed behaviors that customers may be used to.
It's hard to see mission-critical code (industrial control, medical instruments, etc) ever being written in this way since the cost of failure is so high.
Fix your tests, not your resulting code.
When velocity and quantity are massively incentivized over understanding, strategy, and quality, this is the result. Enshittification of not only the product, but our own professional minds.
Large organizations are increasingly made up of technical specialists who are very good at their little corner of the operation. In the past, you had employees present at firms for 20+ years who not only understand the systems in a holistic way, but can recall why certain design or engineering decisions were made.
There is also a demographic driver. The boomer generation with all the institutional memory have left. Gen-X was a smaller cohort, and was not able to fully absorb that knowledge transfer. What is left are a lot of organizations run by people under the age of 45 working on systems where they may not fully understand the plumbing or context.
At first I was frustrated but my boss said it was actually a perfect sequence, since that "crappy code" did generate a working demo that our future customers loved, which gave us the validation to re-write. And I agree!
LLMs are just another tool in the chest: a curious, lightning-fast junior developer with an IQ of 85 who can't learn and needs a memory wipe whenever they make a design mistake.
When I use it knowing its constraints, it's a great tool! But yeah, if used wrong you're going to make a mess, just like with any powerful tool.
"Comprehension debt" is a perfect description for the thing I've been the most concerned about with AI coding.
Once I got past the Dunning-Kruger phase and started really looking at what was being generated, I ran into this comprehension issue.
With a human, even a very junior one, you can sort of "get in the developer's head". You can tell which team member wrote which code and what they were thinking at the time. This leads to a narrative, or story of execution which is mostly comprehensible.
With the AI stuff, it's just stochastic-parrot stuff. It may work just fine, but there will be things like random functions that are never called, hundreds or thousands of lines of extra code to do very simple things, and references to things that don't exist and never have.
I know this stuff can exist in human code bases too - but generally I can reason about why. "Oh, this was taken out for this issue and the dev forgot to delete it".
I can track it, even if it's poor quality.
With the AI stuff, it's just randomly there. No idea why, if it is used, was ever used, makes sense, is extra fluff or brilliant.
It takes a lot of work to figure out.
The goal is to see how far I can push the LLM. How good is it... really?
I'm also not sure about your basic premise that understanding will improve. That depends on the size of the network's internal representation(s), which will start overfitting at some point.
What happened? I don't really use LLMs, so I'm not sure how people have completely lost their ability to problem-solve. They surely must remember that 6 months ago they were debugging just fine?
1. Initially euphoria both with having the tool and seeing how much can be done quickly, not having a good sense of its limits or reach. Mining too deep, and disturbing the Balrog. Basically: doing too much.
2. Not sufficiently reviewing the work it produces.
3. The tools themselves being badly designed, from a UX point of view, in ways that encourage #1 and #2.
From my perspective, there's a fundamental mis-marketing of the agentic tools, and a failure on the part of the designers of these products. What they could be producing is a tool that works with developers in a Socratic dialogue, interactively, with more of a mandatory review and discussion step that ensures a guided authoring process.
When guided and fenced with a good foundational architecture, Claude can produce good work. But the long term health of the project depends on the engineer doing the prompting to be 100% involved. And this can actually be an insanely exhausting process.
In the last 6 months, I have gone from highly skeptical and cynical about LLMs as coding agents, to euphoric and delighted, back to a more cautious approach. I use Claude Code daily and constantly. But I try to use it in a very supervised fashion.
What I'd like to see is agentic tools that are less agentic and more interactive. Claude will prompt you Yes/No diff by diff but this is the wrong level of granularity. What we need is something more akin to a pair programming process and instead of Yes/No prompts there needs to be a combination of an educational aspect (tool tells you what it's discovered, and you tell it what you've discovered) with review.
The makers of these tools need to have them slow down and stop pretending to automate us out of work, and instead take their place as tools used by skilled engineers. If they don't, we're in for a world of mess.
It might be a way to collect the paycheck if that's the only thing you care about. But for people who want to find at least some enjoyment in what they do, it's a shortcut to hell.