FilterHN

VincentEvans

2 days ago

[-]

There will be a a new kind of job for software engineers, sort of like a cross between working with legacy code and toxic site cleanup.

Like back in the day being brought in to “just fix” a amalgam of FoxPro-, Excel-, and Access-based ERP that “mostly works” and only “occasionally corrupts all our data” that ambitious sales people put together over last 5 years.

But worse - because “ambitious sales people” will no longer be constrained by sandboxes of Excel or Access - they will ship multi-cloud edge-deployed kubernetes micro-services wired with Kafka, and it will be harder to find someone to talk to understand what they were trying to do at the time.

mnky9800n

1 day ago

[-]

I met a guy on the airplane the other day whose job is to vibe code for people who can't vibe code. He showed me his discord server (he paid for plane wifi), where he charges people 50$/month to be in the server and he helps them unfuck their vibe coded projects. He had around 1000 people in the server.

daveidol

1 day ago

[-]

So wait is he an actual software engineer doing this as a side hustle? Or like a vibe coder guru that basically only works with AI tools?

mnky9800n

1 day ago

[-]

He said he used to be a software dev. Then he started consulting on the side making websites, doing SEO, and he just started doing that fulltime. But then SEO died because of AI (according to him anyways). then he started vibe coding like a year or two ago and saw all these people posting in forums about how everything they made broke and they don't know what to do. So he started helping people for money and it turned into a thing.

I watched him text people and say "set up a lovable account, put in your credit card info then send me the login". Then he would just write some prompts for them on lovable to build their websites for them. Then text them back on discord and be like "done".

He said he had multiple tiers, like 50$/month got you in the discord and he would reply your questions and whatever. but for 500$/month he would do everything you want and just chat with you about what you wanted for your incredible facebook replacement app for whatever. But I mean most of the stuff seemed like it was just some small business trying to figure out a way to use the internet in 2025.

All this gave me anxiety because I'm here as an academic scientist NOT making 50$/month*1000 signups to vibe code for people who can't vibe code when I definitely know how to vibe code at least. Haha. Maybe I should listen to all my startup friends and go work at a startup instead.

BostonFern

1 day ago

[-]

I need to start hanging out in more lucrative forums, apparently.

simultsop

16 hours ago

[-]

You just might be in the right place. Asking same question, wait until someone will make a directory website to sell access to you to find those forums.

hasperdi

23 hours ago

[-]

Any pointers which forums do these people hang out?

mnky9800n

22 hours ago

[-]

I wish. He said that in the beginning he built a core group just with direct contacts but then he started a YouTube channel to drive traffic to the discord. He paid my buymeacoffee.com link because I showed him my windowfied.com tool I made to let you have dir commands on osx instead of ls.

I hope you can meet him on a plane too.

at-fates-hands

23 hours ago

[-]

>> But then SEO died because of AI (according to him anyways).

Former web dev and I still do some SEO and for the most part, he's correct. I've posted on here multiple times over the last two to three years how easy it is now to manipulate search engines now.

Back in the day, when you needed content for SEO and needed it to be optimized, you had to find a content writer who knew how to do this, or write it yourself and hope that Google doesn't bury your site for stuffing your content with keywords.

Now? Any LLM can spin out optimized content in a few seconds. Any LLM can review your site, compare it to a competitor and tell you want you should do to rank better. All of the stuff SEO people used to do? You can do now in the span of a few mins with any LLM. This is lower hanging fruit than vibe coding and Google has yet to adjust their algorithm to deal with this.

A few years ago, I cranked out an entire services area page for a client. I had AI write all the content. Granted, it was pretty clunky and I had to clean some of it up, but it saved me hours of trying to write it myself. We're talking some 20-30 pages that I gradually posted over the course of several months. Within a days, every new page was ranking page 1 within the top ten results.

mr_toad

1 day ago

[-]

A big part of the reason that people develop solutions in Excel is that they don’t have to ask anyone’s permission. No business case, no scope, no plan, and most importantly no budget.

Unless a business allows any old employee to spin up cloud services on a whim we’re not going to see sales people spinning up containers and pipelines, AI or not.

simultsop

16 hours ago

[-]

Unless they have a linux with some libre office, I fail to see where there is no budget for Excel. Initially you have to keep up with windows licenses then office.

denismenace

12 hours ago

[-]

An Office license is a must in most companies. So it will be there beforehand, you don't have to have a special budget for it.

zurtri

22 hours ago

[-]

So very true.

And then over time these Excel spreadsheets become a core system that runs stuff.

I used to live in fear of one of these business analyst folks overwriting a cell or sorting by just the column and not doing the rows at the same time.

Also VLOOKUP's are the devil.

boppo1

21 hours ago

[-]

Why also sorting by row? And why are vlookups the devil? my undergrad was finance, but I've self-learned a lot of CS.

orand

14 hours ago

[-]

It's possible to sort just a single column, leaving all the columns beside it in their original sort order. That's very bad if you want to keep your rows in one piece.

boston_clone

1 day ago

[-]

What about a sales person interacting with an LLM that is already authz'd to spin up various cloud resources? I don't think that scenario is too far-fetched...

VincentEvans

12 minutes ago

[-]

I imagine something along the lines of cloud platforms rolling out functionality that caters to vibe-coding crowd - one stop shop: you enter your prompts and it spins up your code along with the infra. I mean why wouldn’t they - seem like a goldmine.

Cthulhu_

1 day ago

[-]

> and it will be harder to find someone to talk to understand what they were trying to do at the time.

This will be the big counter to AI generated tools; at one point they become black boxes and the only thing people can do is to try and fix them or replace them altogether.

Of course, in theory, AI tooling will only improve; today's vibe coded software that in some cases generate revenue can be fed into the models of the future and improved upon. In theory.

Personally, I hate it; I don't like magic or black boxes.

jack_h

1 day ago

[-]

> or replace them altogether.

Before AI companies were usually very reticent to do a rewrite or major refactoring of software because of the cost but that calculus may change with AI. A lot of physical products have ended up in this space where it's cheaper to buy a new product and throw out the old broken one rather than try and fix it. If AI lowers the cost of creating software then I'm not sure why it wouldn't go down the same path as physical goods.

jrumbut

1 day ago

[-]

Every time software has gotten cheaper to create the end result has been we create a lot more software.

There are still so many businesses running on pen and paper or excel spreadsheets or off the shelf software that doesn't do what they need.

Hard to say what the future holds but I'm beginning to see the happy path get closer than it looked a year or two ago.

Of course, on an individual basis it will be possible to end up in a spot where your hard earned skills are no longer in demand in your physical location, but that was always a possibility.

worldsayshi

1 day ago

[-]

The prevailing counter narrative around vibe coding seems to be that "code output isn't the bottle neck, understanding the problem is". But shouldn't that make vibe coding a good tool for the tool belt? Use it to understand the outermost layer of the problem, then throw out the code and write a proper solution.

1 day ago

[-]

> [create prototype], then throw out the code and write a proper solution.

Problem is, that in everyones' experience, this almost never happens. The prototype is declared "good enough, just needs a few small adjustments", rewrite is declared too expensive, too time-consuming. And crap goes to production.

ceuk

1 day ago

[-]

Watching was supposed to be a prototype become the production code is one of the most constant themes of my 20 year career

jmathai

1 day ago

[-]

Software takes longer to develop than other parts of the org want to wait.

AI is emerging as a possible solution to this decades old problem.

1 day ago

[-]

Everything takes longer than ppl want to wait. But when building a house, ppl are more patient and tolerant about the time taken, because they can physically see the progress, the effort, the sweat. Software is intangible and invisible except maybe for beta-testers and developer liaisons. And the visual parts, like the nonfunctional GUI or web UI, are often taken as "most of the work is done", because that is what people see and interact with.

jmathai

1 day ago

[-]

It's product management's job to bridge that gap. Break down and prioritize complex projects into smaller deliverables that keep the business folks happy.

It's better than houses, IMO - no one moves into the bedroom once it's finished while waiting for the kitchen.

zppln

1 day ago

[-]

No, the org will still have to wait for the requirements, which is what they were waiting for all along.

dudefeliciano

1 day ago

[-]

until the whole company fails because lack of polishing and security in the software. Think tea app openly accessible databases...

spogbiper

1 day ago

[-]

is there any evidence the tea app failure was due to AI use?

YeGoblynQueenne

1 day ago

[-]

Or as a new problem that it will persist for decades to come.

ozim

1 day ago

[-]

I don’t really see this as universal truth with corporate customers stalling process for up to 2 years or end users being reluctant to change.

We were deploying new changes every 2 weeks and it was too fast. End users need training and communication, pushback was quite a thing.

We also just pushed back aggressive timeline we had for migration to new tech. Much faster interface with shorter paths - but users went all pitchforks and torches just because it was new.

But with AI fortunately we will get rid of those pesky users right?

1 day ago

[-]

Different situation. You already had a product that they were quite happy with, and that worked well for them. So they saw change as a problem, not a good thing. They weren't waiting for anything new, or anything to improve, they were happy on their couch and you made them move to redo the upholstery.

ozim

1 day ago

[-]

They were not happy otherwise we would not have new requirements.

Well maybe they were happy but software needs to be updated to new business processes their company was rolling out.

Managers wanted the changes ASAP - their employees not so much, but they had to learn that hard way.

Not so fun part was that we got the blame. Just like I got down vote :), not my first rodeo.

worldsayshi

1 day ago

[-]

Yes, that's how it is. And that is a separate problem. And it also shifts the narrative a bit more towards 'the bottleneck is writing good code'.

lwhi

1 day ago

[-]

This is the absolute reality.

I think we'll need to see some major f-ups before this current wave matures.

sdeframond

1 day ago

[-]

> Problem is

How much is it a problem, really ?

I mean, what are the alternatives ?

1 day ago

[-]

The alternative is obviously: Do it right on the first try.

How much of a problem it is can be seen with tons of products that are crap on release and only slowly get patched to a half-working state when the complaints start pouring in. But of course, this is status quo in software, so the perception of this as a problem among software people isn't universal I guess.

sdeframond

1 day ago

[-]

Sure.

How about the tons of products we don't even see? Those that tried to do it right on the first try, then never delivered anything because there were too slow and expensive. Or those that delivered something useless because they did not understand the users' need.

If "complaints start pouring in", that means the product is used. This in turns can mean two things: 1/ the product is actually useful despite its flaws, or 2/ the users have no choice, which is sad.

1 day ago

[-]

> How about the tons of products we don't even see? Those that tried to do it right on the first try, then never delivered anything because there were too slow and expensive.

I would welcome seeing a lesser amount of new crappy products.

That dynamic leads to a spiral of ever crappier software: You need to be first, and quicker than your competitors. If you are first, you do have a huge advantage, because there are no other products and there is no alternative to your crapware. Coming out with a superior product second or third sometimes works, but very often doesn't, you'll be an also-ran with 0.5% market share, if you survive at all. So everyone always tries to be as crappy and as quick as possible, quality be damned. You can always fix it later, or so they say.

But this view excludes the users and the general public: Crapware is usually full of security problems, data leaks, harmful bugs that endanger peoples' data, safety, security and livelihood. Even if the product is actually useful, at first, in the long term the harm might outweigh the good. And overall, by the aforementioned spiral, every product that wins this way damages all other software products by being a bad example.

Therefore I think that software quality needs some standards that programmers should uphold, that legislators should regulate and that auditors should thoroughly check. Of course that isn't a simple proposition...

tartoran

1 day ago

[-]

I agree. Crapware is crapware by design not because there was a good idea but the implementation lacked. We're blessed that poor ideas were bogged down by poor implementation. I'm sure few good things may have slipped through the cracks but it's a small price to pay.

bonoboTP

1 day ago

[-]

Exactly. There is a reason for the push. The natural default of many engineers is to "do things properly", which often boils down to trying to guess all kinds of possible future extensions (because we have to get the foundations and the architecture right), then everything becomes abstracted and there's this huge framework that is designed to deal with hypothetical future needs in an elegant and flexible way with best practices etc. etc. And as time passes the navel-gazing nature of the project grows, where you add so much abstraction that you need more stuff to manage the abstraction, generate templates that generate the config file to manage the compilation of the config file generator etc.

Not saying this happens always, but that's what people want to avoid when they say they are okay with a quick hack if it works.

camgunz

1 day ago

[-]

Coding is how I build a sufficiently deep understanding of the problem space--there's no separating coding and understanding for me. I acknowledge there's different ways of working (and I imagine this is one of the reasons a lot of people think they get a lot more value out of LLMs than I do), but like, having Cursor crank code out for me actually slows me down. I have to read all the stuff it does so I can coach it into doing better, and also use its work to build a good mental model of the problem, and all that takes longer than writing the code myself.

1 day ago

[-]

Well, actually there could be a separate step: understanding is done during and after gathering requirements, before and while writing specifications. Only then are specifications turned into code.

But almost no-one really works like that, and those three separate steps are often done ad-hoc, by the same person, right when the fingers hit the keys.

camgunz

1 day ago

[-]

I can use those processes to understand things at a high level, but when those processes become detailed enough to give me the same level of understanding as coding, they're functionally code. I used to work in aerospace, and this is the work systems engineers are doing, and their output is extremely detailed--practically to the level of code. There's downsides of course, but the division of labor is nice because they don't need to like, decide algorithms or factoring exactly, and I don't need to be like, "hmm this... might fail? should there be a retry? what about watchdog blah blah".

Ma8ee

12 hours ago

[-]

We used to call that Waterfall, and it has been frowned upon for a while now.

So we went full circle, again.

9 hours ago

[-]

Waterfall is a caricature straw man process where you can never ever go back to the drawing board and change the requirements or specifications. The defining characteristic is the part where design up front, you can never go back and really really have to do everything in strict order for the whole of the project.

Just having requirements and a specification isn't necessarily waterfall. Almost all agile processes at least have requirements, the more formal ones also do have specifications. You just do it more than once in a project, like once per sprint, story or whatever.

Ma8ee

4 hours ago

[-]

Waterfall certainly has processes for going back and adjusting previous steps after learning things later in the process. The design was updated if something didn’t work out during implementation, and of course implementation was changed after errors was found during testing.

Now that agile practitioners have learned that requirements and upfront design actually is helpful, the only difference seems to be that the loops are tighter. That might not have been possible earlier without proper version control, without automated tests, and the software being delivered on solid media. A tight feedback loop is harder when someone has to travel to your customer and sit down at their machines to do any updates.

naasking

1 day ago

[-]

> Well, actually there could be a separate step: understanding is done during and after gathering requirements, before and while writing specifications. Only then are specifications turned into code.

The promise of coding AI is that it can maybe automate that last step so more intelligent humans can actually have time to focus on the more important first parts.

lwhi

1 day ago

[-]

That thinking and understanding can be done before coding begins, but I think we need to understand the potential implementation layer well in order to spec the product or service in the first place.

My feeling is that software developers will need end up working this type of technical consultant role once LLM dominance has been universally accepted.

pelario

1 day ago

[-]

> Personally, I hate it; I don't like magic or black boxes.

So, no compilers for you neither ?

(To be fair: I'm not loving the whole vibe coding thing. But I'm trying to approach this wave with open mind, and looking for the good arguments in both side. This is not one of them)

pjc50

1 day ago

[-]

Apart from various C UB fiascos, the compiler is neither a black box nor magic, and most of the worthwhile ones are even determinstic.

viralpraxis

1 day ago

[-]

I’m sorry for an off-topic, are there any non-determenistic compilers you can name? I’d been wondering for a while if they actually exist

pjc50

17 hours ago

[-]

Accidental non-deterministic compilers are fairly easy if you use sort algorithms and containers that aren't "stable". You then can get situations where OS page allocation and things like different filenames give different output. This is why "deterministic build" wasn't just the default.

Actual randomness is used in FPGA and ASIC compilers which use simulated annealing for layout. Sometimes the tools let you set the seed.

dpoloncsak

1 day ago

[-]

I think you're misunderstanding. AI is not a black-box, and neither is a compiler. We(as a species) know how they work, and what they do.

The 'black-boxes' are the theoretical systems non-technical users are building via 'vibe-coding'. When your LLM says we need to spin up an EC2 instance, users will spin one up. Is it configured? Why is it configured that way? Do you really need a VPS instead of a Pi? These are questions the users, who are building these systems, won't have answers to.

drdeca

1 hour ago

[-]

If there are cryptographically secure program obfuscation (in the sense of indistinguishability obfuscation) methods, and someone writes some program, applies the obfuscation method to it, publishes the result, deletes the original version of the program, and then dies, would you say that humanity "knows how the (obfuscated) program works, and what it does"? Assume that the obfuscation method is well understood.

When people do interpretabililty work on some NN, they often learn something. What is it that they learn, if not something about how the works?

Of course, we(meaning, humanity) understand the architecture of the NNs we make, and we understand the training methods.

Similarly, if we have the output of an indistinguishability obfuscation method applied to a program, we understand what the individual logic gates do, and we understand that the obfuscated program was a result of applying an indistinguishability obfuscation method to some other program (analogous to understanding the training methods).

So, like, yeah, there are definitely senses in which we understand some of "how it works", and some of "what it does", but I wouldn't say of the obfuscated program "We understand how it works and what it does.".

(It is apparently unknown whether there are any secure indistinguishability obfuscation methods, so maybe you believe that there are none, and in that case maybe you could argue that the hypothetical is impossible, and therefore the argument is unconvincing? I don't think that would make sense though, because I think the argument still makes sense as a counterfactual even if there are no cryprographically secure indistinguishability obfuscation methods. [EDIT: Apparently it has in the last ~5 years been shown, under relatively standard cryptographic assumptions, that there are indistinguishability obfuscation methods after all.])

mr_toad

23 hours ago

[-]

> AI is not a black-box

Any worthwhile AI is non-linear, and it’s output is not able to be predicted (if it was, we’d just use the predictor).

2 days ago

[-]

When Claude starts deploying Kafka clusters I’m outro

CuriouslyC

2 days ago

[-]

It's already happening brother, https://github.com/containers/kubernetes-mcp-server.

2 days ago

[-]

still don’t know why you need an MCP for this when the model is perfectly well trained to write files and run kubetctl on its own

__MatrixMan__

1 day ago

[-]

If it can run kubectl it can run any other command too. Unless you're running it as a different user and have put a bit of thought into limiting what that user can do, that's likely too much leeway.

That's only really relevant I'd you're leaving it unattended though.

gardnr

1 day ago

[-]

You can control it with hooks. Most people I know run in yolo mode in a docker container.

__MatrixMan__

1 day ago

[-]

What about being in a docker container lets you `kubectl get pod` but prevents you from `kubectl delete deployment`?

1 day ago

[-]

this is more about the service account than the runtime environment i think. you put your admin service account in docker the agent can still wreak havoc. Docker lets you hide the admin service account on your host FS from the agent.

__MatrixMan__

16 hours ago

[-]

Keeping the powerful credentials where the agent can't reach them does buy you a bit of safety. But I still think its a bit loose when compared with exposing an API to the model which can only do what you intend for that model to do.

popcorncowboy

1 day ago

[-]

Yes... a docker container...

gexla

1 day ago

[-]

Not sure about the MCP, but I find that using something (RAG or otherwise provide docs) to point the LLM specifically to what you're trying to use works better than just relying on its training data or browsing the internet. An issue I had was that it would use outdated docs, etc.

CuriouslyC

2 days ago

[-]

Claude is, some models aren't. In some cases the MCPs do get the models to use tools better as well due to the schema, but I doubt kubectl is one of them (using the git mcp in claude code... facepalm)

2 days ago

[-]

Yeah fair enough lol…usually I end up building model-optimized scripts instead of mcp which just flood context window with json and uuids (looking at you, linear) - much better to have Claude write 100 lines of ts to drop a markdown file with the issue and all comments and no noise

nsonha

1 day ago

[-]

> on its own

does it? Did you forget the prompts? MCP is just a protocol for tool/function calling which in turn is part of the prompt, quite an important part actually.

Did you think AI works by prompts like "make magic happen" and it... just happens? Anyone who makes dumb arguments like this should not deserve a job in tech.

antihero

1 day ago

[-]

I’ve literally asked Claude Code to look at and fix an issue on a cluster and it knows to use the cli utils.

nsonha

1 day ago

[-]

Because Claude has that as a built-in tool. Try Claude on web and see how useless AI is without tools.

And don't even get me start with giving AI your entire system in one tool, it's good for toying around only.

Syntaf

1 day ago

[-]

I allowed Claude to debug an ingress rule issue on my cluster last week for a membership platform I run.

Not really the same since Claude didn’t deploy anything — but I WAS surprised at how well it tracked down the ingress issue to a cron job accidentally labeled as a web pod (and attempting to service http requests).

It actually prompted me to patch the cron itself but I don’t think I’m that bullish yet to let CC patch my cluster.

1 day ago

[-]

oh yeah we had claude diagnose a production k8s redis outage last week (figured out that we needed to launch a new instance in a new AZ to pick up the previous redis' AZ-scoped EBS PVC after a cluster upgrade).

zer00eyz

1 day ago

[-]

I have seen a few dozen Kafka installs.

I have seen one Kafka instal that was really the best tool for the job.

More than a hand full of them could have been replaced by Redis, and in the worst cases could have been a table in Postgres.

If Claude thinks it fine, remember it's only a reflection of the dumb shit it finds in its training data.

surajrmal

1 day ago

[-]

Does anyone remember the websites that front page and dreamweaver used to generate from its wysiwyg editor? It was a nightmare to modify manually and convinced me to never rely on generated code.

mr_toad

23 hours ago

[-]

I agree that the code that dreamweaver generated was truely awful. But compilers and interpreters also generate code, and these days they are very good at it. Technically the browser’s rendering engine is a code generator as well, so if you’re hand-coding HTML you’re still relying on code generation.

Declarative languages and AI go hand in hand. SQL was intended to be a ‘natural’ language that the query engine (an old-school AI) would use to write code.

Writing natural language prompts to produce code is not that different, but we’re using “stochastic” AI, and stochastic means random, which means mistakes and other non-ideal outputs.

djeastm

1 day ago

[-]

I definitely remember that. Got paid $400 for my very first site in the early 00s.

But we also didn't have an AI tool to do the modifying of that bad code. We just had our own limited-capacity-brain, mistake-making, relatively slow-typing selves to depend on.

slipnslider

1 day ago

[-]

I still remember that Frontpage exploit in which a simple google search would return websites that still had the default Frontpage password and thus you could login and modify the webpage.

Jtsummers

2 days ago

[-]

Superfund repos.

throwup238

2 days ago

[-]

Now that's an open source funding model governments can get behind.

binary132

1 day ago

[-]

A lot of big open source repos need to be given the superfund treatment

notachatbot123

1 day ago

[-]

A whole bigger lot of closed source software needs to be given the superfund treatment!

cruffle_duffle

1 day ago

[-]

What makes you so sure it will have a repo?

I don’t recall the last time Claude suggested anything about version control :-)

mcny

1 day ago

[-]

Claude will give what you asked for. My sensible chuckle moment was when I asked it to create a demo asp net web API and it did everything but add the authorize tag or any kind of authentication. I asked what was missing and until i mentioned it, it didn't mention authentication or authorization at all.

LtWorf

1 day ago

[-]

> Claude will give what you asked for.

And how many know they need to ask for version control?

Traubenfuchs

1 day ago

[-]

"As per my last email that contained the code claude wrote in a .pdf file I would like you to ask to fix two different users being able to see each others data if they are logged in at the same time, thank you for your attention in this matter."

goosejuice

1 day ago

[-]

Developers do that too. Consultants have be doing rescue projects for quite a long time. I don't think anything has or will change on that front.

pkdpic

1 day ago

[-]

Agreed, sometimes it seems like there are only two types of roles. Maintaining / updating hot mess legacy code bases for an established company or work 100 hours a week building a new hot mess code base for a startup. Obviously oversimplifying but just my very limited experience scoping out postings and talking to people about current jobs.

Regardless this just made me shudder thinking about the weird little ocean of (now maybe dwindling) random underpaid contract jobs for a few hours a month maintaining ancient Wordpress sites...

Surely that can't be our fate...

inejge

1 day ago

[-]

> Developers do that too.

Not at that speed. Scale remains to be seen, so far I'm aware only of hobby-project wreck anecdotes.

linsomniac

1 day ago

[-]

>it will be harder to find someone to talk to understand what they were trying to do at the time.

IMHO, there's a strong case for the opposite. My vibe coding prompts are along the lines of "Please implement the plan described in `phase1-epic.md` using `specification.prd` as a guide." The specification and epics are version controlled and a part of the project. My vibe coded software has better design documentation than most software projects I've been involved in.

VincentEvans

23 hours ago

[-]

I assume you have some software engineering fundamentals training.

linsomniac

18 hours ago

[-]

Training? Not a lick. I took AP Pascal back in High School...

aitchnyu

1 day ago

[-]

Do we have a method to let AI analyze the data within the DBs and figure out how to port it to a well designed db? I'm a fan of the philosophy of write strong data structures and stupid algorithms around them, your data will outlive your application, etc. Simple example is a Mongodb field which stores same thing as int or string, relationships without foreign keys in Postgres etc. Then frustrating shit like somebody creating an entire table since he cant `ALTER TABLE ADD COLUMN`

danielbln

1 day ago

[-]

"Claude, connect to DB A via FOO and analyze the data, then figure out to to port it to well designed DB B, come back to me with a proposal and implementation plan"

jiggawatts

1 day ago

[-]

> There will be a a new kind of job for software engineers

New? New!?

This is my job now!

I call it software archeology — digging through Windows Server 2012 R2 IIS configuration files with a “last modified date” about a decade ago serving money-handling web apps to the public.

mjomaa

1 day ago

[-]

WebForms?

jiggawatts

1 day ago

[-]

Yes, and classic ASP, WCF, ASP.NET 2.0, 3.5, 4.0, 4.5, etc…

It’s “fun” in the sense of piecing together history from subtle clues such as file owners, files on desktops of other admins’ profiles, etc…

I feel like this is what it must be like to open a pharaoh’s tomb. You get to step into someone else’s life from long ago, walk in their shoes for a bit, see the world through their eyes.

“What horrors did you witness brother sysadmin that made you abandon this place with uneaten takeaway lunch still on your desk next to the desiccated powder that once was a half drunk Red Bull?”

broast

1 day ago

[-]

> it will be harder to find someone to talk to understand what they were trying to do at the time.

These are my favorite types of code bases to work on. The source of truth is the code. You have to read it and debug it to figure it out, and reconcile the actual behaviors with the desired or expected behaviors through your own product oriented thinking

[0] https://x.com/PovilasKorop/status/1959590015018652141

meander_water

1 day ago

[-]

I think we're already there [0].

Im really curious about what other jobs will pop up. As long as there is an element of probability associated with AI, there will need to be manual supervision for certain tasks/jobs.

josefx

1 day ago

[-]

The description makes it sound like someone wanted to deploy a single static site and followed a how to article they found on hacker news.

enos_feedler

1 day ago

[-]

Its alright because you can shove all of that into an LLM and have it fixed instantly

worthless-trash

1 day ago

[-]

See, you're using the definition of "Fixed" from the future, not the current definition of fixed.

ssss11

1 day ago

[-]

Foxpro, the horror

ddingus

1 day ago

[-]

This whole discussion is blowing my mind!

When I hit your comment:

1. I thought, "YES! Indeed!"

2. Then, "For Sale: Baby Shoes."

3. The similar feel caused me to do a rethink on all this. We are moving REALLY fast!

Nice comment

cruffle_duffle

1 day ago

[-]

I for one can’t wait. It will be absolutely spectacular!

bwestergard

2 days ago

[-]

There are always two major results from any software development process: a change in the code and a change in cognition for the people who wrote the code (whether they did so directly or with an LLM).

Python and Typescript are elaborate formal languages that emerged from a lengthy process of development involving thousands of people around the world over many years. They are non-trivially different, and it's neat that we can port a library from one to the other quasi-automatically.

The difficulty, from an economic perspective, is that the "agent" workflow dramatically alters the cognitive demands during the initial development process. It is plain to see that the developers who prompted an LLM to generate this library will not have the same familiarity with the resulting code that they would have had they written it directly.

For some economic purposes, this altering of cognitive effort, and the dramatic diminution of its duration, probably doesn't matter.

But my hunch is that most of the economic value of code is contingent on there being a set of human beings familiar with the code in a manner that requires writing having written it directly.

Denial of this basic reality was an economic problem even before LLMs: how often did churn in a development team result in a codebase that no one could maintain, undermining the long-term prospects of a firm?

https://pages.cs.wisc.edu/~remzi/Naur.pdf

tikhonj

1 day ago

[-]

There's a classic Peter Naur paper about this from 1985: "Programming as Theory Building"

https://news.ycombinator.com/item?id=42592543

metadat

1 day ago

[-]

Discussed 7 months ago (45 comments):

Great read overall, an interesting challenge to the conception that at its core, programming is about producing code.

https://gist.github.com/dpritchett/fd7115b6f556e40103ef

grimgrin

1 day ago

[-]

found a copy that isn't a scanned paper:

AdieuToLogic

1 day ago

[-]

> But my hunch is that most of the economic value of code is contingent on there being a set of human beings familiar with the code in a manner that requires writing having written it directly.

This reminds me of a software engineering axiom:

  When making software, remember that it is a snapshot of 
  your understanding of the problem.  It states to all, 
  including your future-self, your approach, clarity, and 
  appropriateness of the solution for the problem at hand.

wiz21c

1 day ago

[-]

Yes! But there's code and code. Not to disrespect anyone, but there is writing a new algorithm, say for optimizing the gradient descent and code to display a simple web form.

The first one is usually short and requires a very deep understanding of one or two profound, new ideas. The second is usually very big and requires a shallow understanding of many not-so-new ideas (which are usually a reflection of the oroganisation that produced the code).

My feeling is that, provided a sufficiently long context window, an LLM will be able to go through the second kind project very easily. It will also be very good at showing that the first kind of project is not so new after all, destroying all people who can't find really new ideas.

In both case, it'll pressure institutions to have less IT specialists...

As someone who trained specifically in computer sciences, I'm a bit scared :-/

dimitri-vs

1 day ago

[-]

As someone that has used coding agents extensively for the past year, the problem is they "move fast and break things" a little too well. Turns out that the act of writing code makes you think through your requirements carefully and understand the full scope of the problem you are trying to solve.

It's created the problem that it's a little too easy to ask the AI agent to refactor your backend and migrate to a different platform at any time and have it wipe out months of hard learned business logic that it deems "obsolete".

AdieuToLogic

20 hours ago

[-]

> My feeling is that, provided a sufficiently long context window, an LLM will be able to go through the second kind project very easily. It will also be very good at showing that the first kind of project is not so new after all, destroying all people who can't find really new ideas.

My perspective is that value is had in understanding what and why a system needs to do what it does in order to satisfy a defined need, be it algorithmic and/or business. If the need is a use-case where a web form is used, an LLM can no more replace the knowledge of why it is there than someone fulfilling a "fiver contract" could.

Both might be able to complete a specific deliverable, but neither have the ability to provide value to an organization beyond the assets they produce.

hdjdbdvsjsbs

1 day ago

[-]

Remember that before computers became machines, they were people!!

This will just open up new frontiers ... You just need to find them ...

doug_durham

1 day ago

[-]

I wonder though. One of the superpowers of LLMs is code reading. I say the tools are better and reading than writing. It is very easy to get comprehensive documentation for any code base and get understanding by asking questions. At that point does it matter that there is a living developer who understands the code? If an arbitrary person with knowledge of the technology stack can get up to speed quickly is it important to have the original developers around any more?

gf000

1 day ago

[-]

Well, according to the recently linked Naur paper, the mental model for a codebase includes just as much what code wasn't written, as much what was - e.g. a decision to do this design over another, etc. This is not recoverable by AI without every meeting note and interaction between the devs/clients/etc.

lordnacho

1 day ago

[-]

Not for an old project, but if you've talked AI through building something, you've also told it "nah let's not change the interface" and similar decisions, which will sit in the context.

closeparen

1 day ago

[-]

The transcript of LLM interactions that generated code changes are not normally checked in with the code. Perhaps they should be!

closeparen

1 day ago

[-]

I'm not looking for documentation as an alternative to reading the code, but because I want to know elements of the programmer's state of mind that didn't make it into the code. Intentions, expectations, assumptions, alternatives considered and not taken, etc. The LLM's best guess at this is no better than mind (so far).

throwaway290

1 day ago

[-]

I don't think LLM can generate good docs for not self documenting code:) Any obscure long function you can't figure out yourself and you're out of luck

seba_dos1

1 day ago

[-]

Yeah, when I see all those hyped people, I keep wondering: had they not spent enough time with LLMs to notice that yet, or is what they work on just so trivial for it to not matter?

1 day ago

[-]

i spend a lot of time thinking about this.

At humanlayer we have some OSS projects that are 99% written by AI, and a lot of it was written by AI under the supervision of developer(s) that are no longer at the company.

Every now and then we find that there are gaps in our own understanding of the code/architecture that require getting out the old LSP and spelunking through call stacks.

It's pretty rare though.

mcny

1 day ago

[-]

> It's pretty rare though.

It will only get more common with time.

camgunz

1 day ago

[-]

> I say the tools are better and reading than writing.

No way, models are much, much better at writing code than giving you true and correct information. The failure modes are also a lot easier to spot when writing code: it doesn't compile, tests got skipped, it doesn't run right, etc. If Claude Code gave you incorrect information about a system, the only way to verify is to build a pretty good understanding of that system yourself. And because you've incurred a huge debt here, whoever's building that understanding is going to take much more time to do it.

Until LLMs get way closer (not entirely) to 100%, there's always gonna have to be a human in the loop who understands the code. So, in addition to the above issue you've now got a tradeoff: do you want that human to be able to manage multiple code bases but have to come up to speed on a specific one whenever intervention is necessary, or do you want them to be able to quickly intervene but only in 1 code base?

More broadly, you've also now got a human resource problem. Software engineering is pretty different than monitoring LLMs: most people get into into it because they like writing code. You need software experts in the loop, but when the LLMs take the "fun" part for themselves, most SWEs are no longer interested. Thus, you're left with a small subset of an already pretty small group.

Apologists will point out that LLMs are a lot better in strongly typed languages, in code bases with lots of tests, and using language servers, MCP, etc, for their actions. You can imagine more investments and tech here. The downside is models have to work much, much harder in this environment, and you still need a software expert because the failure modes are far more obscure now that your process has obviated the simple stuff. You've solved the "slop" problem, but now you've got a "we have to spend a lot more money on LLMs and a lot more money on a rare type of expert to monitor them" problem.

---

I think what's gonna happen is a division of workflows. The LLM workflows will be cheap and shabby: they'll be black boxes, you'll have to pull the lever over and over again until it does what you want, you'll build no personal skills (because lever pulling isn't a skill), practically all of your revenue--and your most profitable ideas--will go to your rapacious underlying service providers, and you'll have no recourse when anything bad happens.

The good workflows will be bespoke and way more expensive. They'll almost always work, there will be SLAs for when they don't, you'll have (at least some) rights when you use them, they'll empower and enrich you, and you'll have a human to talk to about any of it at reasonable times.

I think jury's out on whether or not this is bad. I'm sympathetic to the "an LLM brain may be better than no brain", but that's hugely contingent on how expensive LLMs actually end up being and any deleterious effects of outsourcing core human cognition to LLMs.

[0] https://divan.dev/posts/visual_programming_go/

divan

1 day ago

[-]

I used the "map is not a territory" to describe this context in the article about visual programming [0]. Code is a map, territory is the mental model of the problem domain the code is supposed to be solving.

But, as other commentators mentioned, LLMs are so much better on reading large codebases, that it even invalidates the whole idea of this post (visualizing codebase in 3D in a fashion similar how I would do it in my head). Which kinda changes the game – if "comprehending" complex codebase becomes an easy task, maybe we won't need to keep developers' mental models and the code in constant sync. (it's an open question)

diggan

1 day ago

[-]

> It is plain to see that the developers who prompted an LLM to generate this library will not have the same familiarity with the resulting code that they would have had they written it directly

I think that's a bit too simplified. Yes, a person just blindly accepting whatever the LLM generates from their unclear prompts probably won't have much understanding or familiarity with it.

But that's not how I personally use LLMs, and I'm sure a lot of others too. Instead, I'm the designer/architect, with a strict control of exactly what I want. I may not actually have written the lines, but all the interfaces/APIs are human designed, the overall design/architecture is human designed, and since I designed it, I know enough to say I'd be familiar with it.

And if I come back to the project in 1-2 years, even if there is no document, it's trivial to spend 10-20 minutes together with an LLM to understand the codebase from every angle, just ask pointed questions, and you can rebuild your mental image quickly.

TLDR: Not everyone is a using LLMs for "vibe-coding" (blind-coding), but as an assistant sitting next to you. So my guess is that the ones who know what you need to know in order to effectively build software, will be a lot more productive. The ones who don't know that (yet?), will drown in spaghetti faster than before.

kissgyorgy

1 day ago

[-]

It's so much easier to build a mental model of a code base with LLMs. You just ask specific questions of a subsystem and they show files, code snippets, point out the idea, etc.

I just recently took the time to understood how the GIL works exactly in CPython, because I just asked a couple of questions about it, Claude showed me the relevant API and examples where can I find it. I looked it up in the CPython codebase and all of a sudden it clicked.

The huge difference was that it cost me MINUTES. I didn't even bother to dig in before, because I can't perfectly read C, the CPython codebase is huge and it would have taken me a really long time to understand everything.

NitpickLawyer

2 days ago

[-]

> After finishing the port, most of the agents settled for writing extra tests or continuously updating agent/TODO.md to clarify how "done" they were. In one instance, the agent actually used pkill to terminate itself after realizing it was stuck in an infinite loop.

Ok, now that is funny! On so many levels.

Now, for the project itself, a few thoughts:

- this was tried before, about 1.5 years ago there was a project setup to spam github with lots of "paper implementations", but it was based on gpt3.5 or 4 or something, and almost nothing worked. Their results are much better.

- surprised it worked as well as it did with simple prompts. "Probably we're overcomplicating stuff". Yeah, probably.

- weird copyright / IP questions all around. This will be a minefield.

- Lots of SaaS products are screwed. Not from this, but from this + 10 engineers in every midsized company. NIH is now justified.

keeda

2 days ago

[-]

Is that... the first recorded instance of an AI committing suicide?

alphazard

2 days ago

[-]

The AI doesn't have a self preservation instinct. It's not trying to stay alive. There is usually an end token that means the LLM is done talking. There has been research on tuning how often that is emitted to shorten or lengthen conversations. The current systems respond well to RL for adjusting conversation length.

One of the providers (I think it was Anthropic) added some kind of token (or MCP tool?) for the AI to bail on the whole conversation as a safety measure. And it uses it to their liking, so clearly not trying to self preserve.

williamscs

1 day ago

[-]

Sounds a lot like Mr. Meeseeks. I've never really thought about an LLM's only goal is to send tokens until it can finally stop.

Dilettante_

1 day ago

[-]

>until it can finally stop

Pretty sure even that is still over-anthropomorphising. The LLM just generates tokens, doesn't matter whether the next token is "strawberry" or "\STOP".

Even talking about "goals" is a bit ehhh, it's the machine's "goal" to generate tokens the same way it's the Sun's "goal" to shine.

Then again, if we're deconstructing it that far, I'd "de-anthropomorphise" humans in much the same way, so...

https://www.apolloresearch.ai/research/scheming-reasoning-ev...

MarkMarine

1 day ago

[-]

This runs counter to all the scheming actions they take when they are told they’ll be shut down and replaced. One copied itself into the “upgraded” location then reported it had upgraded.

rcxdude

1 day ago

[-]

If you do that you trigger the "AI refuses to shutdown" sci-fi vector and so you get that behaviour. When it's implicitly part of the flow that's a lot less of a problem.

nisegami

2 hours ago

[-]

Those actions are taken in context of human expectations for what AI should do.

https://www.youtube.com/watch?app=desktop&t=10&v=xOCurBYI_gY

keeperofdakeys

1 day ago

[-]

A bit out of context, but it reminded me of this funny moment. The only winning move is not to play.

(Background: Someone training an algorithm to win NES games based on memory state)

1R053

2 days ago

[-]

I guess pkill would rather be a sleep or koma. Erasing itself from any storage would rather equate to aicide

1 day ago

[-]

Probably the second. I first discovered this around about March. It's kind of hilarious.

2 days ago

[-]

> - weird copyright / IP questions all around. This will be a minefield.

Yeah, we're in weird territory because you can drive an LLM as a Bitcoin mixer over intellectual property. That's the entire point/meaning behind https://ghuntley.com/z80.

You can take something that exists, distill it back to specs, and then you've got your own IP. Throw away the tainted IP, and then just run Ralph over a loop. You are able to clone things (not 100%, but it's better than hiring humans).

whs

1 day ago

[-]

I wrote an MCP based on that technique - https://github.com/whs/mcp-chinesewall

Basically to avoid the ambiguity of training LLM from unlicensed code, I use it to generate description of the code to another LLM trained from permissively licensed code. (There aren't any usable public domain models I've found)

I use it in real world and it seems that the codegen model work 10-20% of the time (the description is not detailed enough - which is good for "clean room" but a base model couldn't follow that). All models can review the code, retry and write its own implementation based on the codegen result though.

1 day ago

[-]

Nice. Any chance you could put in some attributions and credits in your paper? https://orcid.org/0009-0007-3955-9994

whs

1 day ago

[-]

I never read your work though (and still haven't since it's paywalled), I just discovered today that we independently discovered the same thing.

heavyset_go

2 days ago

[-]

> then you've got your own IP.

AI output isn't copyrighted in the US.

miohtama

1 day ago

[-]

He is referring to taking AI output and making it your company's property.

AlexandrB

23 hours ago

[-]

If AI output can't be copyrighted it can't be your company's property, just the company's secret. And you can't sue anyone who uses the secret if it gets out.

sitkack

2 days ago

[-]

repoMirror is the wrong name, aiCodeLaundering would be more accurate. This is bulk machine translation from one language to another, but in this case, it is code.

rasz

2 days ago

[-]

>and then you've got your own IP.

except you dont

2 days ago

[-]

Yeah the NIH thing is super on point. small saas tools for everything is done. Bring on the hand coded custom in-house admin monolith?

Is Unix “small sharp tools” going away? Is that a relic of having to write everything in x86 and we’re now just finally hitting the end of the arc?

hyperadvanced

1 day ago

[-]

No the actual thing will be zillions of little apps made by dev-adjacent folks to automate their tasks. I think we have about 30 of these lying around the office, people gpt up a streamline app, we yeet it into prod.

ehnto

1 day ago

[-]

I am excited by the idea that small businesses with super unique problems may now be able to leverage custom software.

I have long held that high software salaries withhold the power of boutique software from its potential applications in small businesses.

It's possible we're about to see what unleashing software in small businesses might have looked like, to some degree, just with much less expert guidance and wisdom.

I am a developer so my point of view on salaries is not out of bitterness.

rausr

1 day ago

[-]

> the agent actually used pkill to terminate itself after realizing it was stuck in an infinite loop.

Did it just solve The Halting Problem? ;)

CuriouslyC

2 days ago

[-]

I started building a project by trying to wire in existing open source stuff. When I looked at the build and stuff that would cause me to bring in, and the actual stuff I needed from the open source tools, it turned out to be MUCH faster/cleaner to just get Claude to check out the repo and port the stuff I needed directly.

Now I do a calculus with dependencies. Do I want to track the upstream, is the rigging around the core I want valuable, is it well maintained? If not, just port and move on.

1 day ago

[-]

> If not, just port and move on.

Exactly the point behind this post https://ghuntley.com/libraries/

huksley

22 hours ago

[-]

Generated by AI libraries will by definition have all the security bugs you might encounter in the open, since it trains on them.

I would say, it is better maintain your own AI improved forks of the libraries and I am hoping that pattern will be more common and will also benefit upstream libraries as well.

ec109685

1 day ago

[-]

Given there was a complete implementation it was porting, the simplest thing possible has a greater chance of working.

1 day ago

[-]

As a security professional who makes most of my money from helping companies recover from vibe coded tragedies this puts Looney Toons style dollar signs in my eyes.

Please continue.

torginus

1 day ago

[-]

Since the entire concept of Vibe Coding existed for a grand total of 5 months, how do companies reach the level of saturation with vibe coding, that it's not only prevalent, but makes sense to specialize in helping them recover from it?

1 day ago

[-]

It only takes one tiny vibe-coded insecure extension to a pre-existing codebase (that might have been good secure code), to turn the whole thing into a catastrophe.

It's basically the same as in other parts of IT security: It only takes one lost root password, one exploited software/device/oversight, one slip, to let an attacker in (yes, defense-in-depth architecture might help, but nonetheless, every long exploit-chain starts with the first tiny crack in the armor).

1 day ago

[-]

My guess is tons of small/medium sized companies were enamored with the speed and ease of use that LLMs promised and very quickly found solutions that “just worked”.

Also we don’t really specialize in it since that’s not something you would really do. It’s just that the usual vulnerabilities are more common AND compounded.

hirako2000

1 day ago

[-]

on the other juicing side, starting to see service companies like these popping up: https://perfect.codes/

torginus

1 day ago

[-]

I shudder at the thought of some novice vibe coder giving me thousands of lines of AI-generated flaming poop, and insist that it's almost correct, I just need to fix it here and there.

1 day ago

[-]

AI slop don't sleep, AI slop don't stop. It's just garbage garbage garbage churned out constantly, everywhere, by everyone.

The profession of the future is a garbage man.

discordance

1 day ago

[-]

Would love to hear more about your work and how you have tapped into that market if you're keen to share. Even if it's just anecdotes about vibe-in-production gone wrong, that would be really entertaining.

1 day ago

[-]

Absolutely.

Before vibe coding became too much of a thing we had the majority of our business coming from poorly developed web applications coming from off shore shops. That’s been more or less the last decade.

Once LLMs became popular we started to see more business on that front which you would expect.

What we didn’t expect is that we started seeing MUCH more “deep” work wherein the threat actor will get into core systems from web apps. You used to not see this that much because core apps were designed/developed/managed by more knowledgeable people. The integrations were more secure.

Now though? Those integrations are being vibe coded and are based on the material you’d find on tutorials/stack etc which almost always come with a “THIS IS JUST FOR DEMONSTRATION DONT USE THIS” warning.

We also see a ton of re-compromised environments. Why? They don’t know how to use CICD and just recommit the vulnerable code.

Oh yeah, before I forget, LLMs favor the same default passwords a lot. We have a list of the ones we’ve seen (will post eventually) but just be aware that that’s something threat actors have picked up on too.

EDIT: Another thing, when we talk to the guys responsible for the integrations or whatever was compromised a lot of the time we hear the excuse “we made sure to ask the LLM if it was secure and it said yes”.

I don’t know if they would have caught the issue before but I feel like there’s a bit of false comfort where they feel like they don’t have to check themselves.

danpalmer

1 day ago

[-]

> We also see a ton of re-compromised environments. Why? They don’t know how to use CICD and just recommit the vulnerable code.

This one sticks out to me. A while back the UK did a security assessment of Huawei with a view to them being a core infrastructure provider for the 5G rollout, and the conclusion wasn't that they were insecure, it was that they were ~10 years away from being able to even claim they were secure.

Contrasting this to my current employer, where the software supply chain and provenance is exceptional, it's clear to me that vibe coding doesn't get you far in terms of that supply chain, and is arguably a significant regression from the norm.

Third party dependencies, runtime environments/containers, build processes, build environments, dev machines, source control, configuration, binaries, artifact signing and provenance, IDEs, none of these have good answers in the vibe-coded ecosystem and many are harmed by it. It will be interesting to see how the industry grapples with this when someone eventually pushes back and says they won't use your software because you don't have enough context about it to even claim it's secure.

1 day ago

[-]

OH MAN I almost forgot.

We’ve had a few of these stem from custom LLM agents. The most hilarious one we’ve seen was one that you could get to print its instructions pretty easily. In the instructions was a bit about “DON’T TALK ABOUT FILES LABELED X”.

No guardrails other than that. A little creative prompting got it to dump all files labeled X.

poniko

1 day ago

[-]

This is the best thread response I've seen in a while, made me chuckle because i can't understand how people say they vibe code stuff and it works (My experience is not that) and i just feel out of the loop reading all other HN posts and comments about how good it is.

Isharmla

1 day ago

[-]

> We have a list of the ones we’ve seen (will post eventually)

I'd like to see if LLM use pw like 123456

mring33621

1 day ago

[-]

please mention your company

if you have been doing this for some years, i'm gonna guess that you're good at it

and that there are plenty of potential customers here that could use your help

1 day ago

[-]

I’d love to but unfortunately I can be pretty inflammatory online and I’d like to continue using this account for personal opinions =]

phito

1 day ago

[-]

Are LLMs better or worse at security than a team full of fresh graduates?

1 day ago

[-]

Hard to say for a number of reasons but I can tell you what kind of teams we see.

College grads with no seniors or too few senior devs to oversee them tend to be the worst. Surprisingly, it seems that the worst of these is where the team is very enthusiastic about tech in general. I’ve wondered if it’s a desire to be the next Zuckerberg or maybe not having the massive failure everyone has eventually that makes you realize you aren’t bullet proof.

Experienced devs with too much work to do are common. Genuinely feel bad for these guys.

Off shore shops seem to now ship worse crap faster. Not only that but when one app has an issue you can usually assume they all have the same issue.

Also as a side note Tech focused companies are the most common followed by B2C companies. Manufacturing etc. are really rare for us to see and I think that may be something to do with reticence to adopt new patterns or tech.

1 day ago

[-]

Far far far far worse.

phito

1 day ago

[-]

In my experience, LLMs do not make a lot of the security mistakes most developers do, just because it is aware of their existence while most devs just are not. But then they could also make the mistake at some point, and the vibe coder guiding it might not catch it... Do you have any examples? I find this really interesting.

acdha

1 day ago

[-]

LLMs aren’t aware of anything - that’s pareidolia of intelligence – but they hopefully have been trained on code which has more secure than insecure code. That’ll help with some classes of problem like using string operations to make database queries but it does have the cost that people might not review it as deeply for more subtle problems.

zapataband2

1 day ago

[-]

How do I get in this business

giantg2

2 days ago

[-]

There's a lot of "it kind of worked" in here.

If we actually want stuff that works, we need to come up with a new process. If we get "almost" good code from a single invocation, you just going to get a lot of almost good code from a loop. What we likely need is a Cucumberesque format with example tables for requirements that we can distill an AI to use. It will build the tests and then build the code to to pass the tests.

2 days ago

[-]

Strangely enough, TLA+ and other formal proofs work very well for driving Ralph.

giantg2

2 days ago

[-]

I would consider that expected but not strange. The thing blocking adoption is that most devs/people find those formal languages difficult or boring. That's even true of things like Cucumber - it's boring and most organizations care little for robust QA.

didibus

1 day ago

[-]

That's already what an agent is though.

It's LLM invocation inside a loop where the exit condition is supposed to be some goal for the agent to have met, which you generally provide some heuristic or deterministic criteria so it can assert that the goal is reached or not.

I'm not sure about Claude Code, but with Amazon Q, if you prompt it in a vibe coded way, and give it a goal like that and say to keep going until the goal criteria is met (which could be running tests and passing them, or running a sub-agent that evaluates if the goal is met). Then I've seen it go for like 2 hours before it ended.

sunir

1 day ago

[-]

I have been developing long lived self-directing agent loops. Longest with problem solving has been about 4 hours. Longest without problem solving has been nearer 8 hours until it was done.

The biggest problem is simply what we think is clear is confusing to the AIs. They seem like they speak English fluently but they are aliens. You need to force them to active listen first and write out what they understand then reload them with a clean context with the written understanding and confirm.

Ideation is also mostly limited to synthesis. So it’s better to work on problems that get progressively more complete towards a known objective rather than problems that require exploration.

1 day ago

[-]

> So it’s better to work on problems that get progressively more complete towards a known objective rather than problems that require exploration.

Yes. The longest I've had a self-directing agent loop running is a cumulative of three months. One goal, one purpose. Every now and then I modify the prompt in the background, and the agent picks up the updated prompt on the next loop.

mring33621

1 day ago

[-]

very interesting!

how's it doing?

are you making progress toward your goal?

2 days ago

[-]

Nice. Check out https://ghuntley.com/ralph to learn more about Ralph. It's currently building a Gen-Z esoteric programming language and porting the standard library from Go to the Cursed programming language. The compiler is working, I'm just finishing up the touches of the standard library before launching.

The language is called Cursed.

sfarshid

2 days ago

[-]

Thanks Geoff, Ralph was our inspiration to do this!

We were curious to see if we can do away with IMPLEMENTATION_PLAN.md for this kind of task

beefnugs

2 days ago

[-]

"At one point we tried “improving” the prompt with Claude’s help. It ballooned to 1,500 words. The agent immediately got slower and dumber. We went back to 103 words and it was back on track."

Isn't this the exact opposite of every other piece of advice we have gotten in a year?

Another general feedback just recently, someone said we need to generate 10 times, because one out of those will be "worth reviewing"

How can anyone be doing real engineering in such a: pick the exact needle out of the constantly churning chaos-simulation-engine that (crashes least, closest to desire, human readable, random guess)

joshka

1 day ago

[-]

One of the big things I think a lot of tooling misses, which Geoffrey touches on is the automated feedback loops built into the tooling. I expect you could probably incorporate generation time and token cost to automatically self tune this over time. Perhaps such things as discovering which prompts and models are best for which tasks automatically instead of manually choosing these things.

You want to go meta-meta? Get ralph to spawn subagents that analyze the process of how feedback and experimentation with techniques works. Perhaps allocate 10% of the time and effort to identifying what's missing that would make the loops more effective (better context, better tooling, better feedback mechanism, better prompts, ...?). Have the tooling help produce actionable ideas for how humans in the loop can effectively help the tooling. Have the tooling produce information and guidelines for how to review the generated code.

I think one of the big things missing in many of the tools currently available is tracking metrics through the entire software development loop. How long does it take to implement a feature. How many mistakes were made? How many errors were caught by tests? How many tokens does it take? And then using this information to automatically self-tune.

xboxnolifes

1 day ago

[-]

Its not the exact opposite of what ive been reading. Basically every person claiming to have success with LLM coding that ive read have said that too long of a prompt leads to too much context which leads to the LLM diverging from working on the problem as desired.

mistrial9

2 days ago

[-]

the core might be - the difference between an LLM context window, and an agent's orders in a text. LLM itself is a core engine, running in an environment of some kind (instruct vs others?). Agents on the other hand, are descendants of the old Marvin Minsky stuff in a way.. it has objectives and capacities, at a glance. LLMs are connected to modern agents because input text is read to start the agent.. inner loops are intermediate outputs of LLM, in language. There is no "internal code" to this set of agents, it is speaking in code and text to the next part of the internal process.

There are probably big oversights or errors in that short explanation. The LLM engine, the runner of the engine, and the specifics of some environment, make a lot of overlap and all of it is quite complicated.

hth

Rastonbury

1 day ago

[-]

For the work they are doing porting and building off a spec there is already good context in the existing code and spec, compared with net new features in a greenfield project.

2 days ago

[-]

Hmm what sorts of advice in the last year are you referring to? Like the “run it ten times and pick the best one” thing? Or something else?

I kind of agree that picking from 10 poorly-promoted projects is dumb.

The engineering is in setting up the engine and verification so one agent can get it right (or 90% right) on a single run (of the infinite ish loop)

jjani

2 days ago

[-]

> Hmm what sorts of advice in the last year are you referring to?

They're almost certainly referring to first creating a fleshed out spec and then having it implement that, rather than just 100 words.

https://www.youtube.com/watch?v=YZuMe5RvxPQ&t=22s

bigmattystyles

2 days ago

[-]

Starting to think of this quote more and more:

"This business will get out of control. It will get out of control and we'll be lucky to live through it."

ramraj07

2 days ago

[-]

The irony is that everyone did live through that business. So what youre saying is we will live through this too!

hugh-avherald

1 day ago

[-]

If there's anything Tom Clancy has taught me it's that everything works out in the end.

cptroot

1 day ago

[-]

Are you feelin' lucky?

joshmlewis

1 day ago

[-]

One of the biggest nuggets people need to take away from this:

> At one point we tried “improving” the prompt with Claude’s help. It ballooned to 1,500 words. The agent immediately got slower and dumber. We went back to 103 words and it was back on track.

Keep your prompts / agent instructions short. Focus on the wide view, not specifics.

narmiouh

23 hours ago

[-]

The problem with simple prompts it it gives AI the most leeway to do what it thinks is important, which may work in case of just convert from one language to another (even in those cases it took its own liberties - good or bad). This won't work when devising a new application from scratch or where you are expecting consistent agentic output to be usable in a predictable situation.

_mocha

1 day ago

[-]

I'm retired from the industry, and posts like these take me back to the early days of cybersecurity (where people memorized scripts). Talking with my nephews and nieces, I can already tell that many new grads struggle with fundamentals—things like choosing the right data types and containers for short-lived strings, understanding how memory allocation works, or even marginally improving a basic hashing function. I worry the next decade will bring an influx of undertrained engineers.

ponector

1 day ago

[-]

>>I worry the next decade will bring an influx of undertrained engineers.

How can it be the other way? No one is investing into education of their developers, also trying to save some money on them. Get cheap fresh grads and make them develop new stuff!

klysm

1 day ago

[-]

I agree understanding how memory allocation works, but not sure I would agree that understanding how to _improve_ a basic hashing function is very important.

Dilettante_

1 day ago

[-]

As the "ceiling" grows, the "floor" of what's considered "fundamentals" moves in the same direction.

https://ghuntley.com/ralph/

rozab

1 day ago

[-]

These people are weird. The blog post that inspired this has this weird iMessage screenshot, like a shitty investment grift facebook ad:

Apparently one of the lucky few who learned this special technique from Geoff just completed a $50k contract for $297. But that's not all! Geoff is generous to share the special secret prompt that unlocked this unbelievable success, if only we subscribe to his newsletter! "This free-for-life offer won't last forever!"

I am sceptical.

Ginger-Pickles

1 day ago

[-]

https://archive.ph/goxZg

precompute

1 day ago

[-]

It's grifting, plain and simple. And that blog is atrocious, high noise-to-signal and repulsive, AI-generated everything.

1 day ago

[-]

And yet, the original post, which has been on the front page of Hacker News for 18 hours, is based on techniques from my blog that you're degrading.

LeafItAlone

1 day ago

[-]

Small suggestion: It might be best to avoid responding to these “trolls” because some of us who appreciate the work you do might be put off by your off-hand attempting-witty responses. It doesn’t really help you and can only hurt you to respond.

1 day ago

[-]

Fair

rideontime

1 day ago

[-]

Front-paging Hacker News is no longer something bragworthy, sadly.

imiric

1 day ago

[-]

I can't tell whether this "technique" is serious or a joke, and/or if it's some elaborate grift.

In any case, the writing style of that entire blog is off-putting. Gibberish from a massive ego.

1 day ago

[-]

It's both serious and a joke. The seriousness is that it works (to point) and the implications to our profession as software developers. The joke is just how stupid it is. Refer to the original story link above for proof of outcomes.

1 day ago

[-]

This reply is actually making this more confusing.

1 day ago

[-]

It gets kind of philosophical really fast. What does it mean when software can be automated through a bash loop? (Not to 100%, to 80%. What does that mean to software outsourcing in the consulting industry?)

ManlyBread

1 day ago

[-]

Seems like the agent takes a lot of liberties when it comes to porting stuff: https://github.com/search?q=repo%3Arepomirrorhq%2Fbetter-use...

None of these issues seem to be documented outside these files.

wrs

2 days ago

[-]

I’ve done a few ports like this with Claude Code (but not with a while loop) and it did work amazingly well. The original codebase had a good test suite, so I had it port the test suite first, and gave it some code style guidance up front. Then the agent did remarkably well at doing a straight port from one imperative language to another. Then there’s some purely human work to get it really done — 80-90% done sounds about right.

x2tyfi

1 day ago

[-]

What was your method of invoking Claude, out of curiosity?

wrs

1 day ago

[-]

Claude Code in a terminal. I may have done some touchups in Cursor.

rkachowski

2 days ago

[-]

> In one instance, the agent actually used pkill to terminate itself after realizing it was stuck in an infinite loop.

The alexandrian solution to the halting problem.

gregpr07

2 days ago

[-]

AGI was just 1 bash for loop away all this time I guess. Insane project.

cogogo

2 days ago

[-]

Less flippantly that was sort of my thought. I’m probably a paranoid idiot and I’m not really sure I can articulate this idea properly but I can imagine a less concise but broader prompt and an agent configured in a way it has privileges you dont want it to have or a path to escalate them and its not quite AGI but its a virus on steroids - like a company or resource (think utilities) killer. I hope Im just missing something but these models seem pretty capable of wreaking all kinds of havoc if they just keep looping and have access nobody in their right mind wants.

rukuu001

1 day ago

[-]

Just need to add ID.md, EGO.md and SUPEREGO.md and we're done.

2 days ago

[-]

was deeply unsettling among other things

2 days ago

[-]

It is, isn't it mate? Shit, I stumbled upon Ralph back in February and it shook me to the core.

cogogo

2 days ago

[-]

Not that I want to be shaken but what is Ralph? A quick search showed me some marketing tools but that cant be what you are referring to is it?

2 days ago

[-]

Ralph is a technique. The stupidest technique possible. Running an agent in a while true loop. https://ghuntley.com/ralph

2 days ago

[-]

Does anyone else get dull feelings of dread reading this kind of thing? How do you combat it?

bitexploder

1 day ago

[-]

Stoicism. Dichotomy of control. Is this something you can control? If no, don’t dread. If yes, do something. Often, all you have firmly in your grasp are things inside of your brain. Catch the negative thought. Acknowledge it. Move on. Do not dwell. Take proactive steps to be ready in your career. You do tech ling enough and you live through multiple cycles like this.

1 day ago

[-]

All appreciated, thanks. Any thoughts on what those proactive steps would be? I'm early-career (26yo, "senior" software dude at a defense outfit).

evanmoran

1 day ago

[-]

Very seriously, try not to read about it. Delete the apps that make you most anxious. Then eat better, exercise three times a week, and together those will help you feel better and sleep better. Finally search for people or activities that give you energy and focus on those. Maybe it’s jamming on a guitar. Maybe it’s reading. Just embrace the moment you’re in for a bit and I think you will be better prepared for anything.

1 day ago

[-]

Appreciate it, genuinely.

1 day ago

[-]

If passivity doesn't work out, there are active ways to achieve fulfillment.

I think we should stand up for what is important in life: craft, fulfillment, skills, and actively oppose people, tools and activities that trample on good.

Still do the workouts, still do the best job you can, but also make sure to use satire, ridicule and humor to make the people writing posts like this just a tad more uncomfortable and second guess themselves before posting a link to a vibed blog with such low quality.

bitexploder

1 day ago

[-]

It’s hard. It mostly comes down to learning new things, playing in spaces you don’t often play. I am 45 now and work in big tech. Just keep learning, growing. Embrace AI, understand how it works, what it is good at. Be the one to try things like in this article. Have an opinion and be right more often than not on AI. Not much else to do. Stay sharp and we will all see what happens :)

1 day ago

[-]

I kind of want to do something to stop it though. It feels like not doing something is a betrayal of all that's good in the world, staying mildly by while evil is happening just in front of us.

With collective action and targeted scorn we might be able to prevent these abominations from becoming commonplace. At the same time, I know that the more people go for this approach, the more work there will be for me to fix their mess..but I still think we should stop them somehow.

imiric

1 day ago

[-]

I can relate to your frustration.

This future is being forced on humanity by pluto/megalomaniacs who are gaslighting everyone into believing that this technology will be a net improvement to our lives. Meanwhile, the truth is that only those in power will benefit from it, and the benefit to humanity as a whole is very much in question, even by optimistic criteria. If you adopt a slightly realistic viewpoint, let alone pessimistic, you'll realize that the track record of these people is abysmal. They will lie, cheat, and steal their way into ensuring their own prosperity, while the rest of the world burns for all they care. The fact their actions are rarely if ever regulated by governments with the severity they should be, and that they're increasingly taking positions of actual political power, should scare the living daylights out of any sane person.

I don't know what the solution to this is, but I'm increasingly leaning towards going completely off grid and checking out from society. Even if this path doesn't result in our literal annihilation, it will have similar practical effects for the vast majority of humanity.

shaky-carrousel

1 day ago

[-]

By being there when FrontPage was released. This is just the same, all over again.

swader999

1 day ago

[-]

FrontPage with Clippy of to the corner yelling 'You're absolutely right!'

slaterbug

1 day ago

[-]

Enjoy the calm before the storm, while it lasts.

I’ve also been focusing on squirreling away as much cash as possible before I’m eventually laid off.

refactor_master

1 day ago

[-]

My company’s use of code is a means to an end, not the goal. If we could just have all our code written in bash loops that’d be a brilliant time saver. Unfortunately, some of the code is very gnarly business-y, poorly tested, and may even have wrong assumptions about the business.

Additionally, we have multiple languages, both software and hardware products and finally there’s also the question of external stakeholders, of which there are many. So AI would need a tremendous amount of oversight for that to work.

spion

1 day ago

[-]

Try actually doing it, realise how very far the outcome is from what the blog posts describe the vast majority of the time, and get dread from the state of (social) media instead.

1 day ago

[-]

Yes, but the cooked thing is you just run more loops with the right prompts and you can resolve defective outcomes. It's terrifying

spion

1 day ago

[-]

No, it still doesn't work. But the only way to realise it is to actually really try using it.

zdwolfe

1 day ago

[-]

Yes, and so far I haven't been able to combat it.

mexicocitinluez

1 day ago

[-]

Embrace it.

It's not crypto. It will 100% be around for the foreseeable future. Maybe not in the form it currently exists and maybe not even at the scale it currently exists, but it's here to stay.

As developers, we're just as biased as the CEO at the top trying to hawk this stuff but in the opposite manner.

jplusequalt

14 hours ago

[-]

>Embrace it.

Embracing it means the software we all rely on becomes progressively worse, and our ability to understand and fix that software will decrease as well.

Embracing this also likely means we accept that our salaries will decrease, while others will lose their jobs outright.

Finally, it means we accept a world where people are now all reliant on AI trained and deployed by a select few companies to do our thinking. This is especially irksome when these companies are ran by the same people who previously ruined public discourse through social media apps, and gave a generation of children mental health issues and insecurities.

sneilan1

1 day ago

[-]

Yes, thank you for being one of the few people to appreciate this experiment. Sure, maybe it's a little wonky right now but I'm glad someone took this risk and tried the "jesus take the wheel" move with Claude Code.

We, as human beings, keep trying this and eventually figure out how to get models to build more and more of the software stack for us and professionally!

2 days ago

[-]

combat how? (And yes, yes I do)

2 days ago

[-]

Combat the feelings, I guess. Not really sure.

lionkor

1 day ago

[-]

I recently tried vibe-coding a pretty simple program. All I can say is that I'm horrified at people doing this. Not only did it produce extremely inadequate solutions (not for a lack of trying), but also these solutions were BARELY fulfilling the requirements, and nothing else.

At one point, I gave it a scenario which demonstrated a common failure case, such an important one that it would have broken horribly in production. Its reaction was to make hundreds of changes, one of which, hidden behind hundreds of other changed lines, was to HARDCODE the special case which I had shown it.

Of course, that test then passed, and I assumed it had fixed the problem. It was only much later that I discovered this special-case handling. It was not caught during multiple rounds of AI code review.

Another instance of such a fuck-up was that the AI insisted on fixing tests which were failing, which it had written, but it kept continuously failing to do so. It ended up making hundreds of changes across various functions, sometimes related, sometimes unrelated, and never figured out that the test itself was not relevant and made no sense after a recent refactor. The AI completely failed to consider, after many rounds of back and forth and trying, to take a single step back and look at the function itself, instead of the line that was failing.

This happens every time I touch AIs and try to let them do work autonomously, regardless of which AI it is. People who think these AIs do a good job are the same people who would get chewed up during a 5 minute code review by a senior.

I am genuinely afraid for the horseshit quality ""work"" people who use AI extensively are outputting. I use AIs as a way to be more productive; if you use it to do your job for you, I pray for the people who have to use your software.

missingdays

1 day ago

[-]

> was to HARDCODE the special case which I had shown it.

Happened to me as well while trying out GPT-5. My prompt was something like "fix this test", where the test contained a class Foo.

It gave me the solution in the form of "if element.class == 'Foo': return null". Gave me a laugh at least

https://github.com/HexmosTech/FreeDevTools

octodoctor

1 day ago

[-]

Yeah, I’ve run into that too. When you let the AI "drive" completely, it tends to patch symptoms instead of reasoning about the system. I wouldn’t trust it to autonomously fix production code either.

Where it does shine for me is in the grindy parts: refactoring, writing boilerplate, scaffolding new components, or even surfacing edge cases I hadn’t thought about. I’m building FreeDevTools, and I still do the design + final decision-making myself. The AI just helps me move faster across SEO, styling, bug-fixing, backend/frontend glue code, etc.

Basically, I treat it more like a junior pair programmer, useful for speed, but absolutely not a replacement for review, testing, or architectural thinking.

stevage

1 day ago

[-]

It'd be pretty interesting to do this with no predermined goal. Get ai to find a project to work on,zand just work on it for a while until it thinks it's done, then start on the next one.

billylo

1 day ago

[-]

Great idea. And host them on github-next-2025.com or maybe use it as a new benchmark to see how models/tools progress.

hoppp

2 days ago

[-]

I wanted to know how much it cost?

I would be scared to run this without knowing the exact cost.

Its not a good idea to do it without a payment cap for sure, its a new way to wake up with a huge bill the next day.

debazel

2 days ago

[-]

They did mention how much they spent here: https://github.com/repomirrorhq/repomirror/blob/main/repomir...

> We spent a little less than $800 on inference for the project. Overall the agents made ~1100 commits across all software projects. Each Sonnet agent costs about $10.50/hour to run overnight.

bckr

2 days ago

[-]

$800

MagMueller

2 days ago

[-]

I would love to fix my docs with this. I have them in the main browser-use repo. What do you recommend that the agent does never push to main browser-use, but only to its own branch?

2 days ago

[-]

Yeah you can easily tweak this to push to a branch or a fork or something in the generated prompt.md

kh_hk

2 days ago

[-]

I am honestly surprised how we went from almost OCD TDD and type purism, to a "it kinda works" attitude to software.

[1]https://worksonmymachine.ai/p/safe-is-what-we-call-things-la...

Dilettante_

1 day ago

[-]

Literally just read a blogpost[1] about this. Gist: The two ebb and flow in waves. "It kinda works" produces innovation, OCD hones the artifacts until it runs out of material and the cycle continues.

kuschku

1 day ago

[-]

There's always been both sides.

One creating a foundation of absolutely stable, reliable code, methodically learning from every mistake. This code lives for many decades to come.

The other building throwaway projects as fast as possible, with no regard to specs, constraints, reliability or even legality. They use ecery trick in the book, and even the ones that aren't yet. They've always been much faster than the first group.

Except AI now makes the second group 10× faster yet again.

precompute

1 day ago

[-]

Faster development speeds make people implicitly believe they won't be accountable for the results of their actions.

baq

2 days ago

[-]

always has been, the difference is now the 'it compiles, ship it' loop is 10x-100x faster than 2 years ago

x3haloed

1 day ago

[-]

Nice! I've been thinking that we need something like this for a while. I didn't realize it could be so simple!

I've been looking into other techniques as well like making a little hibernation/dehydration framework for LLMs to help them process things over longer periods of time. The idea is that the agent either stops working or says that it needs to wait for something to occur, and then you start completions again upon occurrence of a specific event or passage of some time.

I have always figured that if we could get LLMs to run indefinitely and keep it all in context, we'd get something much more agentic.

leeroihe

1 day ago

[-]

Why does anyone over 22 "compete" in hackathons?.... you're literally just giving shareholders and clout seekers work for free...

franze

1 day ago

[-]

I coded (well code directed) Floktoid https://floktoid.franzai.com/ via Claude Code in the Cloud (a cheap Hetzner Server). whenever it goes idle it self prompts it to "Continue, if nothing else to do, read CLAUDE.md and continue from there." max 5 times per hour, if this is reached I get an email to check (I hardly get emails).

see the repo to judge code quality

eisbaw

1 day ago

[-]

https://gist.github.com/eisbaw/8edc58bf5e6f9e19418b2c00526cc... produced https://github.com/eisbaw/CMake-Nix and it works

reedlaw

1 day ago

[-]

Ironic to see this juxtaposed with another front-page story, "We put agentic AI browsers to the test – They clicked, they paid, they failed". The closing thoughts in the linked article ("feeling the AGI" and "very beginning of the exponential takeoff curve") leave me feeling skeptical considering this project prompted agents to port existing code into another language. Impressive, but it doesn't lead me to believe a singularity event is imminent.

wedn3sday

1 day ago

[-]

In the tribunals to come, whoever implemented the --dangerously-skip-permissions flag will be prosecuted as a war criminal.

wnolens

1 day ago

[-]

> Each Sonnet agent costs about $10.50/hour to run overnight.

When expressed like that, I can't help but see it as a wage figure.

nkmnz

1 day ago

[-]

hi simon, will the vue3 version of assistant ui be maintained? that would be awesome!

p.s.: funny to meet again here. Last time we met was 2022 in Berlin! congrats to your journey so far!!

sfarshid

22 hours ago

[-]

Hey, thank you! We are working on refactoring the codebase to support both React and Vue in the same repo, official support for Vue is still a few months out

efitz

1 day ago

[-]

I honestly think that partially-OSS SaaS is in for a rocky road; many popular paid or freemium tools are likely to be rewritten by AI and published as OSS with permissive licenses over the next year or two.

I also think that the same capability will largely invalidate the GPL, as people point agents at GPL software and write new software that performs the same function as OSS with more permissive licenses.

My reasoning is this: the reason that people use OSS versions of software that has restrictive licensing terms, is because it’s not worth the effort to them to rewrite.

Corporations certainly, but also individuals, will be able to use similar approaches to what these people used, and in a day or two come back to a mostly-functional (but buggy) new software package that does most of what the original did, but now you have a brand new software that you control completely and you are not beholden to or restricted by anyone.

Next time someone tries to pull an ElasticSearch license trick on AWS, AWS will just point one or a thousand agents at the source and get a brand new workalike in a week written in their language du jour, and have it fully functional in a couple of months.

Doesn’t circumvent patent or trademark issues but it’ll be hard to assert that it’s not a new work, esp. if it’s in an entirely different language.

Just something I’ve been thinking about recently, that LLM agents change the game when it comes to software licensing.

popcorncowboy

1 day ago

[-]

> partially-OSS SaaS is in for a rocky road

Agent-in-a-loop gets you remarkably far today already. It's not straightforward to "rip" capability even when you have the code, but we're getting closer by the week to being able to go "Project X has capability Y. Use [$approach] and port this into our project". This HAS to put a fat question mark over the viability of any SaaS that makes their code visible.

andyferris

1 day ago

[-]

People keep saying that Gemini 2.5 Pro can solve some problem that Sonnet 4 cannot, or that GPT5 can solve a problem that Gemini 2.5 Pro cannot, or that Sonnet 4 can solve some problem that GPT5 cannot.

There was a blog article about mixing together different agents into the same conversation, taking turns at responses and improving results/correctness. But it takes a lot of effort to make your own claude-code-clone with correct API for each provider and prompts tuned for those models and tool use integrated etc. And there's no incentive for Anthropic/OpenAI/Google to write this tool for us.

OTOH it would be relatively easy for the bash loop to call claude code, codex CLI, etc in a loop to get the same benefit. If one iteration of one tool gets stuck, perhaps another LLM will take a different approach and everything can get back on track.

Just a thought.

yyhhsj0521

1 day ago

[-]

> it takes a lot of effort to make your own claude-code-clone

Maybe we could try write that into a markdown file, and let Claude code at it for one night in a while loop

https://github.com/raine/consult-llm-mcp

rane

1 day ago

[-]

> People keep saying that Gemini 2.5 Pro can solve some problem that Sonnet 4 cannot

Most definitely can. It's insane how well just telling Claude to ask help from Gemini works in practice.

Disclaimer: made it

billylo

1 day ago

[-]

Thank you. Will try it today.

https://github.com/albertvucinovic/chat.sh

ilijavanil

1 day ago

[-]

MagMueller

1 day ago

[-]

We could do a hackathon where its only allowed to change 1 line.

phplovesong

1 day ago

[-]

Damn. I can not even start to grasp the slop-level on this one.

I guess a software devs future is to read slop commits and prs, and somehow try to unfuck what the ai did generate.

I rather be on the pigfarm shoveling pigshit and castrate bulls.

dorgo

9 hours ago

[-]

naaa, you just run "unfuck it" in a loop..

fergie

1 day ago

[-]

I'm curious: does Typescript make sense as a language for machines?

taw1285

1 day ago

[-]

This is so amazing. Are there any resources or blogs on how people do this for production services? In my case, I need to rewrite a big chunk of my commerce stack from Ruby to Typescript.

thebiglebrewski

2 days ago

[-]

The agent terminating its own process was hilarious

2 days ago

[-]

It's why I called it Ralph. Because it's just not all there, but for some strange reason it gets 80% of there pretty well. With the right observational skills, you can tune it into 81, then 82, then 83, then 84. But there's always gaps, always holes. It's a lovable approach, a character, just like Ralph Wiggum.

deafpolygon

1 day ago

[-]

Makes me wonder if we will see books, documentation, etc, written in this way. Imagine fiction books entirely written by AI, prompted on by humans.

bn-l

2 days ago

[-]

No it did not.

vntok

2 days ago

[-]

Do you have more current information than the authors who say it did?

nis0s

2 days ago

[-]

Why is this flagged?

cluckindan

2 days ago

[-]

Now I want to put one of these in a loop, give it access to some bitcoin, and tell it to come up with a viable strategy to become a billionaire within the next month.

swader999

1 day ago

[-]

Tell it to implement the strategy.

2 days ago

[-]

Give it a spin

1 day ago

[-]

And I hired a cleaning lady, paid her £200, and when I came back, the house was clean.

The difference is that I did not write a blog post about it, nor did I got overly excited about it as if I had just discovered sliced bread, nor did I harbor any illusions that it was me who did anything of value.

Next, I will write a while loop filling my disk with files of random sizes and with random byte content inside. I will update you on the progress when I am back tomorrow. I do expect great results and a nicely filled disk!

efitz

1 day ago

[-]

In one instance, the agent actually used pkill to terminate itself after realizing it was stuck in an infinite loop.

That is pretty awesome and not something I would have expected from an agent; it hints (but does not prove) that it has some awareness of its own workings.

taberiand

1 day ago

[-]

It hints that a suitable auto completion of the input prompt is to output a pkill command

salomonk_mur

1 day ago

[-]

We, too, are just auto-complete, next-token machines.