Only for people to start screwing around with his database and API keys, because the generated code just stuck the keys into the client-side JavaScript and he didn't even have enough of a technical background to know that was something to watch out for.
IIRC he resorted to complaining about bullying and just shut it all down.
First, use Claude's plan mode, which generates a step-by-step plan that you have to approve. One tip I've seen mentioned in videos by developers: plan mode is where you want to increase to "ultrathink" or use Opus.
Once the plan is developed, you can use Sonnet to execute the plan. If you do proper planning, you won't need to worry about Claude skipping things.
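For anyone who wants to see the shape of that workflow outside of Claude Code, here is a rough sketch of "expensive model plans, human approves, cheaper model executes". This is not how Claude Code implements plan mode; `planner`, `executor` and `approve` are hypothetical callables I made up for illustration, and the two fake models only exist so the snippet runs on its own.

```python
from typing import Callable, List

# Hypothetical model interface: takes a prompt, returns text.
Model = Callable[[str], str]


def plan_then_execute(
    task: str,
    planner: Model,
    executor: Model,
    approve: Callable[[List[str]], bool],
) -> List[str]:
    """Plan with one model, gate on approval, execute step by step with another."""
    # Ask the stronger (more expensive) model for a step-by-step plan.
    plan_text = planner(f"Write a step-by-step plan for: {task}")
    steps = [line.strip() for line in plan_text.splitlines() if line.strip()]

    # Nothing is executed until the plan has been approved.
    if not approve(steps):
        raise RuntimeError("plan rejected")

    # Hand each approved step to the cheaper model, one at a time.
    return [
        executor(f"Do exactly this step and nothing else: {step}")
        for step in steps
    ]


# Stand-in models so the sketch is runnable (assumptions, not real APIs).
def fake_planner(prompt: str) -> str:
    return "1. add failing test\n2. fix parser\n"


def fake_executor(prompt: str) -> str:
    return "done: " + prompt.split(": ", 1)[1]


results = plan_then_execute(
    "fix the date parser",
    fake_planner,
    fake_executor,
    approve=lambda steps: len(steps) > 0,
)
print(results)
```

The point of splitting it this way is that the approval gate sits between the two models, so a vague plan gets caught before any tokens are spent executing it.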
It's a bit annoying having to swap back and forth tbh.
I also find planning to be a bit vague, whereas i feel like Sonnet benefits from more explicit instructions. Perhaps i should push it to reduce the scope of the plan until it's detailed enough to be sane, will give it a try.
> Tries to fix some tests for a while
> Fails and just .skip the test
I thought we were in it right now?
Sure there are many more people building slop with AI now, but I meant the peak of "vibe coding" being parroted around everywhere.
I feel like reality is starting to sink in a little by now as the proponents of vibe coding see that all the companies telling them that programming as a career is going to be over in just a handful of years, aren't actually cutting back on hiring. Either that or my social media has decided to hide the vibe coding discourse from me.
As an old man, this is hilarious.
One trick is to write goto statements that don't go anywhere.
So I ran a Bourne shell in my Emacs, which was the style at the time.
Now just to build the source code cost an hour, and in those days, timesheets had hours on them.
Take my five hours for $20, we'd say.
They didn't have blue checkmarks, so instead of tweeting, we'd just finger each other.
The important thing was that I ran a Bourne shell in my Emacs, which was the style at the time...
In those days, we used to call it jiggle coding.
Seems like he's still going on about being able to replicate billion dollar companies' work quickly with AI, but at least he seems a little more aware that technical understanding is still important.
Unless models get better people are not going to pay more.
If people stop bothering to ask and answer questions online, where will the information come from?
Logically speaking, if there's going to be a continuous need for shared Q&A (which I presume), there will be mechanisms for that. So I don't really disagree with you. It's just that having the model just isn't enough, a lot of the time. And even if this sorts itself out eventually, we might be in for some memorable times in-between two good states.
The biggest 'rug pull' here is that the coding agent company raises their price and kills your budget for "development."
I think a lot of MBA types would benefit from taking a long look at how they "blew up" IT and switched to IaaS / Cloud and then suddenly found their business model turned upside down when the providers decided to up their 'cut'. It's a double whammy: the subsidized IT costs to gain traction, then the loss of IT jobs because of the transition, leading to fewer and fewer IT employees. Then when the switch comes there is a huge cost wall if you try to revert to the 'previous way' of doing it, even if your costs of doing it that way today would be cheaper than what the service provider is now charging you.
Spending a bunch of money on GPUs and running them yourself, as well as using tools that are compatible with Ollama/OpenAI type APIs feels like a safe bet.
Though having seen the GPU prices to get enough memory to run anything decent, I feel like the squeeze is already happening there at a hardware level and options like Intel Arc Pro B60 can't come soon enough!
This feels like a bit of a leap?
That's like saying "I just bought the JetBrains IDE Ultimate pack and some other really cool tools, so we no longer need a founding engineer!" All of that AI stuff can just be a force multiplier and most attempts at outright replacing people with them are a bit shortsighted. Closer to a temporary and somewhat inconsistent freelance worker, if anything.
That said, not wanting to pay for AI tools if they indeed help in your circumstances would also be like saying "What do you need JetBrains IDEs for, Visual Studio Code is good enough!" (and sometimes it is, so even that analogy is context dependent)
I'm reminded of rule 9 of the Joel Test: https://www.joelonsoftware.com/2000/08/09/the-joel-test-12-s...
I've been using a tool I developed (https://github.com/stravu/crystal) to run several sessions in parallel. Sometimes I will run the same prompt multiple times and pick the winner, or sometimes I'll be working on multiple features at once, reviewing and testing one while waiting on the others.
Basically, with the right tooling you can burn tokens incredibly fast while still receiving a ton of value from them.
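To make the "same prompt several times, pick the winner" idea concrete, here is a minimal sketch. It is not the tool's actual code or API: `run_agent` and `score` are hypothetical callables (in practice `run_agent` would launch a session in its own worktree and `score` would be you reviewing the result), and `fake_agent` is just a stand-in so the snippet runs.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def best_of_n(
    prompt: str,
    run_agent: Callable[[str, int], str],
    score: Callable[[str], float],
    n: int = 3,
) -> str:
    """Run the same prompt n times in parallel and return the best-scoring result."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        # Each attempt gets an index so a real runner could map it
        # to a separate worktree or branch.
        results = list(pool.map(lambda i: run_agent(prompt, i), range(n)))
    return max(results, key=score)


# Stand-in agent for demonstration only (an assumption, not a real agent).
def fake_agent(prompt: str, attempt: int) -> str:
    return f"{prompt} :: attempt {attempt}" + "!" * attempt


winner = best_of_n("add a retry to the HTTP client", fake_agent, score=len, n=3)
print(winner)
```

Note that n attempts cost roughly n times the tokens, which is where the "burn tokens incredibly fast" part comes from.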
AI is a large motivating factor in data center build outs, and data centers are projected to form an increasing portion of new energy usage. An individual query may not use much but the macro effect is quite serious, especially considering the climate crisis we are already failing to manage. It’s a bit like throwing plastic out your window on the highway and ignoring the garbage patch floating in the middle of the Pacific.
But based on my costs, yours sounds much much higher :)
Love the idea by the way! We do need new IDE features which are centered around switching between Git worktrees and managing multiple active agents per worktree.
Edit: oh, do you invoke normal CC within your tool to avoid this issue and then post-process?
Surprised that this works, but useful if true.
`pathToClaudeCodeExecutable`!
I'm on $100 and i'm shocked how much usage i get out of Sonnet, while Opus feels like no usage at all. I barely even bother with Opus since most things i want to do just run out super quick.
Usage for Opus is my only "complaint", but i've used it so little i don't even know if it's that much better than Sonnet. As it is, even with more generous Opus limits i'd probably want a more advanced Claude Code behavior - where it uses Opus to plan and orchestrate, and Sonnet would do the grunt work for cheaper tokens. But i'm not aware of that as a feature atm.
Regardless, i'm quite pleased with Claude Code on $100 Max. If it was a bit smarter i might even upgrade to $200, but atm it's too dumb to give it more autonomy and that's what i'd need for $200. Opus might be good enough there, but $100 Opus limits are so low i've not even gotten enough experience with it to know if it's good enough for $200
I use and abuse mine, running multiple agents, and I know that I'd spend the entire month of fees in a few days otherwise.
So it seems like a ploy to improve their product and capture the market, like usual with startups that hope for a winner-takes-all.
And then, like uber or airbnb, the bait and switch will raise the prices eventually.
I'm wondering when the hammer will fall.
But meanwhile, let's enjoy the free buffet.
In their dreams.
Does "one" Claude Opus instance count as the full model being loaded onto however many GPUs it takes?
For anything moderately complex, use Claude's plan mode; you get to approve the plan before turning it loose. The planning phase is where you want to use a more sophisticated model or use extended thinking mode.
Once you have a great plan, you can use a less sophisticated model to execute it.
Even if you're a great programmer, you may suck at prompting. There's an art and a science to prompting; perhaps learn about it? [1]
Don't forget; in addition to telling Claude or any other model what to do, you can also tell them what not to do in the CLAUDE.md or equivalent file.
[1]: https://docs.anthropic.com/en/docs/build-with-claude/prompt-...
But $200/month is unbearable for open source / free software developers.
Last I checked no one is still there who was there originally, except the vendor. And the vendor was charging around $90k/mo for integration services and custom development in 2017 when my team was let go. My team was around $10k/mo including rent for our cubicles.
That was another weird practice I've never seen elsewhere: to pay rent, we had to charge the other departments for our services. They turned IT and infrastructure into a business and expected it to turn a profit, which pissed off all the departments who had to start paying for their projects. So they started outsourcing all development work to vendors, killing our income stream, which required multiple rounds of layoffs until only management was left.
Second largest private university of my state, 30000 students. They cut 5 software development positions that were halfway on their rewrite, then purchased a blank slate ERP for 1 million (50% discount, imagine that!), and had spent a few years and around 2-3 million on customising said ERP with consultants by the time I left them.
He would have considered that company to be running a perfectly controlled cost experiment. Though it was so perfectly controlled they forgot that humans actually did the work. With cost accounting projects, you pay morale and staffing charges well after the project itself was costed.
I hadn’t thought of that since the late 90s. Good comment but how the heck did I get that old??? :)
They even had one of their vendors extend a job offer to me for slightly more than I was making, but I couldn't in good conscience take that offer. Fool me once, and all that.
But yeah, doesn't explain non-payment for AI tools.
Current job "permits" Claude usage, but does not pay for it.
That seems like the worst of all worlds from their perspective.
By not paying for it they introduce a massive security concern.
My read was the article takes it as a given that $200/m is worth it.
The question in the article seems more: is an extra $800/m to move from Claude Code to an agent using o3 worth it?
Here in the EU, unless your work agreement says otherwise, it's pretty common for people to work a full-time job and also work as a self-employed contractor for other companies.
So when I'm finished with my work (home office, of course), I just work on my "contractor" projects.
Honestly, I wouldn't sign a full time contract banning me from other work.
And if you have enough customers, you just drop the full-time job and pay social security and health insurance, which you must pay by law anyway.
And in my country especially, it's even more ridiculous: as self-employed you pay lower taxes than full-time employees, whose taxes, truth be told, are ridiculously high. Nearly 40% of your salary.
Freelancing as a side hustle may be forbidden if your employer refuses
And it makes sense to pay more taxes since you also have more social benefits (paid leaves, retirement money and unemployment money), nothing is free
There are always loopholes and ways to work around which our tax code will happily discover and kill year on year.
So you get to pick how you want to pay tax but the amount usually isn’t much different when you get to the highest brackets
First time I'm hearing this. Where in the EU are you? I don't know anybody doing this, but it could depend on the country (I'm in the nordics).
Absolutely not a common thing in my corner of the EU.
Any employer with 2 brain cells will figure out that you are more productive as a developer by using AI tools, they will mandate all developers use it. Then that's the new bar and everyone's salary stays the same.
By the way, this also applies to the "Free market" ideal...
There being problems with absolute libertarian free markets doesn't mean all policies that evoke the free market ideal must be disregarded, nor do the problems with communism mean that all communist actions must be ignored.
We can see a problem with an ideal, but still wish to replicate the good parts.
For example, mislabelling socialism as communism. The police department, fire department, and roads are all socialist programs. Only a moron would call this communism and yet for some reason universal healthcare...
There's also this nonsense when someone says "That's the free market at work", and I'm like, if we really lived in a free market then you'd be drinking DuPont's poison right now.
Using the words "Communism" and "Free market" just shows an (often intentional) misunderstanding of the nuance of how things actually work in our society.
The communism label must be the most cited straw man in all of history at this point.
There is nothing ideal about communism. I'd rather own my production tools and be as productive as I want to be. I'd rather build wealth over trading opportunities, I'd rather hire people and reinvest earnings. That is ideal.
If you don't address that, you'll end up with a "dictatorship of the proletariat".
Who in the actual real world with any authority at all is telling you you can't be as productive as you want to be, build wealth, hire people, and reinvest your earnings?
Just because it hasn't been "successfully implemented" according to your personal opinion doesn't mean it cannot be scrutinized.
That's like if there is a sign that says "do not cross 3km/h" when someone says "that's too slow" you go "a-hah! straw man! How do you know you can't go 300kmph with that in place? nobody implemented that sign before!". Socrates would be proud.
OK but that's irrelevant to my post. There's lots of books and manifestos that say lots of stupid things. You're arguing as if this manifesto is a real threat, and I'm saying "show me this threat". This isn't a real person with any impact on your day to day, like say a politician. It's a fantasy opposition.
> Just because it hasn't been "successfully implemented" according to your personal opinion doesn't mean it cannot be scrutinized.
OK sure, where? Where is this real world communism that meets the manifesto you are railing against?
> That's like if there is a sign that says "do not cross 3km/h" when someone says "that's too slow" you go "a-hah! straw man! How do you know you can't go 300kmph with that in place? nobody implemented that sign before!". Socrates would be proud.
OK that's an awkward analogy. It's more like someone wrote a manifesto that said cars shouldn't go over 3km/h and you want to use this "slow manifesto" to argue that any laws that would slow you down are some sort of slippery slope into "slowmunism".
No one with any authority in the real world is trying to impose the Communist Manifesto on you. Not even the terrifying Bernie Sanders wants anything to do with communism. For the love of god, there is no communist threat. You can relax.
But I get it. You are basically arguing that nothing and nobody exists or ever existed or do or does anything to anything or anyone or had any ideas and arguing ideas or what people do or could do or would do is pointless.
Well, have fun with that. Sorry all this thread space was a waste.
And so now you are just putting words in my mouth I assume because you have no argument. I can’t even parse this.
You started an argument with a stance I never took by railing against a bogeyman I never advanced. And now you’re doing it again.
If communism was anything more than an impractical ideal then you should have been able to point out where it actually exists. But of course it doesn’t exist. It’s just a fantasy. Maybe you want it to exist so you can point a finger and say “see what happens when you don’t do what I want?”?
your salary stays x1
and your work hours stay x1
Productivity multiplies x2
You keep your job x0.5
Your salary x0.8 (because the guy we just fired will gladly do your job for less)
Your work hours x1.4 (because now we expect you to do the work of 2 people, but didn’t account for all the overhead that comes with it)
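Taking those multipliers at face value, the arithmetic works out like this. The variable names are mine, and reading "productivity x2" as "output per person doubles" is my assumption about what the list means.

```python
# Multipliers from the list above, relative to today's baseline of 1.0.
salary_factor = 0.8   # salary x0.8
hours_factor = 1.4    # work hours x1.4
output_factor = 2.0   # productivity x2 (assumed to mean output per person)

# Pay per hour worked, relative to today.
hourly_pay_factor = salary_factor / hours_factor

# What the employer pays per unit of output, relative to today.
pay_per_output_factor = salary_factor / output_factor

print(round(hourly_pay_factor, 2))      # 0.57
print(round(pay_per_output_factor, 2))  # 0.4
```

So in that scenario you'd earn about 57% of your current hourly rate, while the employer pays 40% of what it used to per unit of output.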
tbh, if im gonna bust my ass I'd rather own the thing.
99% of startups die off worthless and your equity never realises.
Capitalism encourages you to put your butt in your own seat and reap the rewards of your efforts.
Of course it also provides you the decision making to keep your butt in someone else’s seat if the risk vs. reward of going your own isn’t worth it.
And then it allows your employer to put another butt in your seat if you don’t adopt efficiency patterns.
So: capitalism is compatible with communism as an option, but it’s generally a suboptimal option for one or both parties.
but the state keeps meddling and making oligarchs and friends have unfair advantages.
It's hard to compete when the system is rigged from the start.
In a true capitalist market you end up with oligarchy.
This might also just be a feature of the change in problem size - perhaps the larger problems that necessitate o3 are also too open-ended and would require much more planning up front. But at that point it's actually more natural to just iterate with sonnet and stay in the driver's seat a bit. Plus sonnet runs 5x faster.
I'm fortunate in that my own use of the AI tools I'm personally paying for is squished into my off-time on nights and weekends, so I get by with a $20/month Claude subscription :).
Is it? Many hobbies cost much more money. A nice bike (motorbike or road bike, doesn't matter), a sailing boat, golf club/trips, a skiing season pass ... $100/month is significantly less than what you'd burn with those other things. Sure you can program in your free time without such a subscription, and if you enjoy that then by all means, but if it takes away the grunt work and you are having more fun, I don't see the issue.
Gym memberships are in that order of magnitude too, even though you could use some outdoor gym in a city park for free. Maybe those indoor perks of heating, lights, roof and maintained equipment are worth something? Similar with coding agents for personal projects...
That's the only reason I subscribed to GitHub Copilot. Currently using it for Aider.
Can't have your cake and eat it too.
Behold the holy trifecta of: Number of Projects - Code Quality - Coding Agent Cost
Can anyone name one single widely-used digital product that does _not_ have to be precisely correct/compatible/identical to The Original and that everyone _does_ pay $200/month for?
Therefore, should prices that users pay get anywhere even close to that number, there will naturally be opportunities for competitors to bring prices down to a reasonable level.
As to the nailgun thing, that's an interesting analogy. I'm actually building my own house right now entirely with hand tools, and it's on track to finish in 1/5 the time some of these McMansions take, at 1/100th of the cost, because I'm building what I actually need and not screwing around with stuff for business reasons. I think you'll find software projects are more similar to that than you'd expect.
My point was not that AI will necessarily be cheaper to run than $200, but that there is not much profit to be made. Of course the cost of inference will form a lower bound on the price as well.
If we all go that way, there might be no new haskells and webassemblies in the future.
"given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content"
Source: Gemini 1.5's paper from March 2024 https://storage.googleapis.com/deepmind-media/gemini/gemini_...
Solo developer doing small jobs but I code every day and $10 per month would be a busy month for me. I still read every line of code though..
Wow. It really is like a ridiculous, over-confident, *very* junior developer.
In my experience, o4-mini-high is good enough, even just through the chat interface
Cursor et al can be more comfy because they have access to the files directly. But when working on a sufficiently large/old/complex code base, the main limitation is the human in the loop and managing context, so things end up evening out. Not only that, but a lot of times it’s just easier/better to manually feed things to ChatGPT/Claude - that way you get to more carefully curate and understand the tasks and the changes
I still haven’t seen any convincing real life scenario with larger in-production code bases in which agents are able to autonomously write most of the code
If anyone has a video/demo, would love to see it
A breakpoint in a debugger is much much quicker than feeding the AI all the context needed and then confirming it didn’t miss some flow in some highly abstract code
Also, local agents miss context all the time, at which point you need to manually start adding/managing the context anyway
And, if you are regularly working on the codebase, at some point you’ll probably have better in-brain initial context than the agent
* It's not clear how much revenue or how many new customers are generated by using a coding agent
* It's not clear how things are going in production. The article only talks about development
I feel AI coding agents will give you the edge. It's just that this article doesn't talk about the revenue or PnL side of things, only the perceived costs saved from not employing an engineer.
It will instead sign a deal with Microsoft for AI that is 'good enough' and limit expensive AI to some. Or bring in the big consultancies as usual to do the projects.
Having typing skills >= 120 wpm will triple your efficacy.
I worry, with an article like this floating around, and with this as the competition, and with the economics of all this stuff generally... major price increases are on the horizon.
Businesses (some) can afford this, after all it's still just a portion of the costs of a SWE salary (tho $1000/m is getting up there). But open source developers cannot.
I worry about this trend, and when the other shoe will drop on Anthropic's products, at least.
I'm very bullish on the future of smaller, locally-run models, myself.
That said, I suspect a lot of the value in Claude Code is hand-rolled, fine-tuned heuristics built into the tool itself, not coming from the LLM. It does a lot of management of TODO lists, backtracking through failed paths, etc., which looks more like old-school symbolic AI than something the LLM is doing on its own.
Replicating that will also be required.
The underlying inference is not super expensive. All the tricks they're pulling to make it smarter certainly multiply the price, but the price being charged almost certainly covers the cost. Basic inference on tuned base models is extremely cheap. But certainly it looks like Anthropic > OpenAI > Google in terms of inference cost structure.
Prices will only go up if there's a profit opportunity; if one of the vendors has a clear edge and gains substantial pricing power. I don't think that's clear at this point. This article is already equivocating between o3 and Opus.
You wouldn't gain anything from asking the most expensive model to adjust some css.
But broadly agree to the argument of the post - just spending more might still be worth it.
Every time I try it I find it to be useless compared to Claude or Gemini.
I occasionally use ChatGPT (free version without logging in) and the amount of times it's really wrong is very high. Often times it takes a lot of prompting and feeding it information from third party sources for it to realize it has incorrect information and then it corrects itself.
All of these prompts would be using money on a paid plan right?
I also used Cursor (free trial on their paid plan) for a bit and I didn't find much of a difference. I would say whatever back-end it was using was possibly worse. The code it wrote was busted and over engineered.
I want to like AI and in some cases it helps gain insight on something, but I feel like literally 90% of my time is it providing me information that straight up doesn't work. Eventually it might work, but getting there takes a lot of time and effort.
1. Go to https://aider.chat/docs/leaderboards/ and pick one of the top (but not expensive) models. If unsure, just pick Gemini 2.5 Pro (not Flash).
2. Get API access.
3. Find a decent tool (hint: Aider is very good and you can learn the basics in a few minutes).
4. Try it on a new script/program.
5. (Only after some experience): Read people's detailed posts describing how they use these tools and steal their ideas.
Then tell us how it went.
[1] https://openrouter.ai No affiliation
It took some time for me to learn how to use agents, but they are very powerful once you get the hang of it.
Claude Pro + Projects is a good middle ground between the two. Things didn't really "click" for me as a non-developer until I got access to both.
I'm actually employer mandated to continue to try/use AI bots / agents to help with coding tasks. I'm sort of getting them to help me, but I'm still really not being blown away, and I still tend to prefer not to bother with them for things I'm frequently iterating on. They are more useful when I have to learn some totally new platform/API. Why is that? Do we think there's something wrong with me?
I think a lot of this comes down to the context management. I've found that these tools work worse at my current employer than my prior one. And I think the reason is context - my prior employer was a startup, where we relied on open source libraries and the code was smaller, following public best practices regarding code structure in Golang and Python. My current employer is much bigger, with a massive monorepo of custom written/forked libraries.
The agents are trained on lots of open source code, so popular programming languages/libraries tend to be really well represented, while big internal libraries are a struggle. Similarly smaller repositories tend to work better than bigger ones, because there is less searching to figure out where something is implemented. I've been trying some coding agents with my current job, and they spend a lot more time searching through libraries looking to understand how to implement or use something if it relies on an internal library.
I think a lot of these struggles and differences are also present with people, but we tend to discount this struggle because people are generally good at reasoning. Of course, we also learn from each task, so we improve over time, unlike a static model.
I would invert the question, how can you think it's a waste (for OP) if they're willing to spend $1000/mo on it? This isn't some emotional or fashionable thing, they're tools, so you'd have to assume they derive $1000 of value.
> free version... the amount of times it's really wrong is very high... it takes a lot of prompting and feeding it information from third party
Respectfully, you're using it wrong, and you get what you paid for. The free versions are obviously inferior, because obviously they paywall the better stuff. If OP is spending $50/day, why would the company give you the same version for free?
The original article mentions Cursor. With (paid) cursor, the tool automatically grabs all the information on behalf of the user. It will grab your code, including grepping to find the right files, and it will grab info from the internet (eg up to date libraries, etc), and feed that into the model which can provide targeted diffs to update just select parts of a file.
Additionally, the tools will automatically run compiler/linter/unit tests to validate their work, and iterate and fix their mistakes until everything works. This write -> compile -> unit test -> lint loop is exactly what a human will do.
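As a rough illustration of that loop (not Cursor's actual implementation), here is a sketch where a candidate is run through a list of checks and any failure is fed back to a reviser until everything passes or a revision budget runs out. All names here are hypothetical; `fake_revise` stands in for the model, and the "unit test" check uses `exec` only because this is a toy.

```python
from typing import Callable, List, Optional, Tuple

# A check returns an error message, or None if it passes.
Check = Callable[[str], Optional[str]]


def first_failure(code: str, checks: List[Tuple[str, Check]]) -> Optional[str]:
    """Run checks in order and report the first one that fails."""
    for name, check in checks:
        error = check(code)
        if error is not None:
            return f"{name}: {error}"
    return None


def iterate_until_green(
    code: str,
    checks: List[Tuple[str, Check]],
    revise: Callable[[str, str], str],
    max_revisions: int = 5,
) -> Tuple[str, bool]:
    """Check, feed the failure back, revise, repeat. Returns (code, all_green)."""
    for _ in range(max_revisions):
        failure = first_failure(code, checks)
        if failure is None:
            return code, True
        code = revise(code, failure)
    # Budget spent: report whether the last revision happens to pass.
    return code, first_failure(code, checks) is None


def compiles(src: str) -> Optional[str]:
    try:
        compile(src, "<candidate>", "exec")
        return None
    except SyntaxError as exc:
        return str(exc)


def passes_test(src: str) -> Optional[str]:
    scope: dict = {}
    exec(src, scope)  # toy only: never exec untrusted code like this
    return None if scope["add"](1, 2) == 3 else "add(1, 2) != 3"


def fake_revise(src: str, failure: str) -> str:
    # Stand-in for the model: swaps the buggy operator.
    return src.replace("a - b", "a + b")


final, ok = iterate_until_green(
    "def add(a, b):\n    return a - b\n",
    [("compile", compiles), ("unit test", passes_test)],
    fake_revise,
)
print(ok)
```

The revision cap matters: without it, a reviser that keeps repeating a failed fix would loop forever.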
I used the paid (free trial) version of Cursor to look at Go code. I used the free version of ChatGPT for topics like Rails, Flask, Python, Ansible and various networking things. These are all popular techs. I wouldn't describe either platform as "good" if we're measuring good by going from an idea to a fully working solution with reasonable code.
Cursor did a poor job. The code it provided was mega over engineered to the point where most of the code had to be thrown away because it missed the big picture. This was after a lot of very specific prompting and iterations. The code it provided also straight up didn't work without a lot of manual intervention.
It also started to modify app code to get tests to pass when in reality the test code was the thing that was broken.
Also it kept forgetting things from 10 minutes ago and repeating the same mistakes. For example when 3 of its solutions didn't work, it started to go back and suggest using the first solution that was confirmed to not work (and it even output text explaining why it didn't work just before).
I feel really bad for anyone trusting AI to write code when you don't already have a lot of experience so you can keep it in check.
So far at best I barely find it helpful for learning the basics of something new or picking out some obscure syntax of a tool you don't know well, after giving it a link to the tool's docs and source code.
You definitely should be skilled in your domain to use it effectively.
If someone spends a lot of money on something but they don't derive commensurate value from that purchase, they will experience cognitive dissonance proportional to that mismatch. But ceasing or reversing such purchases are only some of the possibilities for resolving that dissonance. Another possibility is adjusting one's assessment of the value of that purchase. This can be subconscious and automatic, but it can also involve validation-seeking behaviors like reading positive/affirming product reviews.
In this present era of AI hype, purchase-affirming material is very abundant! Articles, blog posts, interviews, podcasts, HN posts... there's lots to tell people that it's time to "get on board", to "invest in AI" both financially and professionally, etc.
How much money people have to blow on experiments and toys probably makes a big difference, too.
Obviously there are limits and caveats to this kind of distortion. But I think the reality here is a bit more complicated than one in which we can directly read the derived value from people's purchasing decisions.
This is not borne out in my personal experience at all. In my experience, both in the physical and software tool worlds, people are incredibly emotional about their tools. There are _deep_ fashion dynamics within tool culture as well. I mean, my god, editors are the prima donna of emotional fashion running roughshod over the developer community for decades.
There was a reason it was called "Tool Time" on Home Improvement.
That’s right. Productivity does go up, but most of these employees aren’t really contributing directly to revenue. There is no code to dollar pipeline. Finishing work faster means some roadmap items move quicker, but they just move quicker toward true bottlenecks that can’t really be resolved quickly with AI. So the engineers sit around doing nothing for longer periods of time waiting to be unblocked. Deadlines aren’t being estimated tighter, they are still as long as ever.
Enjoy this time while it lasts. Someday employers might realize they need to hire fewer people and just cram more work into individual engineers' schedules, because AI should supposedly make work much easier.
We are already past that point. The high water mark for Devs was ironically in late 2020 during Covid, before RTO when we were in high demand.
I'm not talking about some SV megacorps where better code can directly, if slightly, affect revenue or valuation and thus more time is spent coding and debugging. I'm talking about basically all other businesses that somehow need developers.
Even if I were 10x faster, project managers would barely notice. And I would lose a lot of the creative fun that good coding tends to bring. Also debugging: 0 help there, it's all on you and your mind and experience.
LLMs are so far banned in my banking megacorp and I ain't complaining.