FilterHN

Cursor IDE support hallucinates lockout policy, causes user cancellations

1511 points

by scaredpelican

2 months ago

| past

| 84 comments

| old.reddit.com

| HN

Earlier today Cursor, the magical AI-powered IDE started kicking users off when they logged in from multiple machines.

Like,you’d be working on your desktop, switch to your laptop, and all of a sudden you're forcibly logged out. No warning, no notification, just gone.

Naturally, people thought this was a new policy.

So they asked support.

And here’s where it gets batshit: Cursor has a support email, so users emailed them to find out. The support peson told everyone this was “expected behavior” under their new login policy.

One problem. There was no support team, it was an AI designed to 'mimic human responses'

That answer, totally made up by the bot, spread like wildfire.

Users assumed it was real (because why wouldn’t they? It's their own support system lol), and within hours the community was in revolt. Dozens of users publicly canceled their subscriptions, myself included. Multi-device workflows are table stakes for devs, and if you're going to pull something that disruptive, you'd at least expect a changelog entry or smth.

Nope.

And just as people started comparing notes and figuring out that the story didn’t quite add up… the main Reddit thread got locked. Then deleted. Like, no public resolution, no real response, just silence.

To be clear: this wasn’t an actual policy change, just a backend session bug, and a hallucinated excuse from a support bot that somehow did more damage than the bug itself.

But at that point, it didn’t matter. People were already gone.

Honestly one of the most surreal product screwups I’ve seen in a while. Not because they made a mistake, but because the AI support system invented a lie, and nobody caught it until the userbase imploded.

▲

nerdjon

2 months ago

[-]

There is a certain amount of irony that people try really hard to say that hallucinations are not a big problem anymore and then a company that would benefit from that narrative gets directly hurt by it.

Which of course they are going to try to brush it all away. Better than admitting that this problem very much still exists and isn’t going away anytime soon.

▲

lynguist

2 months ago

[-]

https://www.anthropic.com/research/tracing-thoughts-language...

The section about hallucinations is deeply relevant.

Namely, Claude sometimes provides a plausible but incorrect chain-of-thought reasoning when its “true” computational path isn’t available. The model genuinely believes it’s giving a correct reasoning chain, but the interpretability microscope reveals it is constructing symbolic arguments backward from a conclusion.

https://en.wikipedia.org/wiki/On_Bullshit

This empirically confirms the “theory of bullshit” as a category distinct from lying. It suggests that “truth” emerges secondarily to symbolic coherence and plausibility.

This means knowledge itself is fundamentally symbolic-social, not merely correspondence to external fact.

Knowledge emerges from symbolic coherence, linguistic agreement, and social plausibility rather than purely from logical coherence or factual correctness.

▲

emn13

2 months ago

[-]

While some of what you say is an interesting thought experiment, I think the second half of this argument has, as you'd put it, a low symbolic coherence and low plausibility.

Recognizing the relevance of coherence and plausibility does not need to imply that other aspects are any less relevant. Redefining truth merely because coherence is important and sometimes misinterpreted is not at all reasonable.

Logically, a falsehood can validly be derived from assumptions when those assumptions are false. That simple reasoning step alone is sufficient to explain how a coherent-looking reasoning chain can result in incorrect conclusions. Also, there are other ways a coherent-looking reasoning chain can fail. What you're saying is just not a convincing argument that we need to redefine what truth is.

▲

learningstud

2 months ago

[-]

Validity is not soundness. Wonder why people are just beginning to realize what logicians have been studying for more than a century. This goes to show that most programming was never based on logic but vibes. People have been vibe coding with themselves before AI became prominent.

▲

dcow

2 months ago

[-]

For this to be true everyone must be logically on the same page. They must share the same axioms. Everyone must be operating off the same data and must not make mistakes or have bias evaluating it. Otherwise inevitably sometimes people will arrive at conflicting truths.

In reality it’s messy and not possible with 100% certainty to discern falsehoods and truthoods. Our scientific method does a pretty good job. But it’s not perfect.

You can’t retcon reality and say “well retrospectively we know what happened and one side was just wrong”. That’s called history. It’s not useful or practical working definition of truth when trying to evaluate your possible actions (individually, communally, socially, etc) and make a decision in the moment.

I don’t think it’s accurate to say that we want to redefine truth. I think more accurately truth has inconvenient limitations and it’s arguably really nice most of the time to ignore them.

▲

jimbokun

2 months ago

[-]

> Knowledge emerges from symbolic coherence, linguistic agreement, and social plausibility rather than purely from logical coherence or factual correctness.

This just seems like a redefinition of the word "knowledge" different from how it's commonly used. When most people say "knowledge" they mean beliefs that are also factually correct.

▲

indigo945

2 months ago

[-]

As a digression, the definition of knowledge as justified true belief runs into the Gettier problems:

    > Smith [...] has a justified belief that "Jones owns a Ford". Smith 
    > therefore (justifiably) concludes [...] that "Jones owns a Ford, or Brown 
    > is in Barcelona", even though Smith has no information whatsoever about 
    > the location of Brown. In fact, Jones does not own a Ford, but by sheer 
    > coincidence, Brown really is in Barcelona. Again, Smith had a belief that
    > was true and justified, but not knowledge.

Or from 8th century Indian philosopher Dharmottara:

   > Imagine that we are seeking water on a hot day. We suddenly see water, or so we 
   > think. In fact, we are not seeing water but a mirage, but when we reach the 
   > spot, we are lucky and find water right there under a rock. Can we say that we 
   > had genuine knowledge of water? The answer seems to be negative, for we were 
   > just lucky.

More to the point, the definition of knowledge as linguistic agreement is convincingly supported by much of what has historically been common knowledge, such as the meddling of deities in human affairs, or that the people of Springfield are eating the cats.

▲

dcow

2 months ago

[-]

I don’t think it’s so clear cut… Even the most adamant “facts are immutable” person can agree that we’ve had trouble “fact checking” social media objectively. Fluoride is healthy, meta analysis of the facts reveals fluoride may be unhealthy. The truth of the matter is by and large what’s socially cohesive for doctors’ and dentists’ narrative, that “fluoride is fine any argument to the contrary—even the published meta-analysis—is politically motivated nonsense”.

▲

jimbokun

2 months ago

[-]

You are just saying identifying "knowledge" vs "opinion" is difficult to achieve.

▲

dcow

2 months ago

[-]

No, I’m saying I’ve seen reasonbly minded experts in a field disagree over things-generally-considered-facts. I’ve seen social impetus and context shape the understanding of where to draw the line between fact and opinion. I do not believe there is an objective answer. I fundamentally believe Anthropic’s explanation is rooted in real phenomena and not just a self serving statement to explain AI hallucination in a positive quasi-intellectual light.

▲

CodesInChaos

2 months ago

[-]

> The model genuinely believes it’s giving a correct reasoning chain, but the interpretability microscope reveals it is constructing symbolic arguments backward from a conclusion.

Sounds very human. It's quite common that we make a decision based on intuition, and the reasons we give are just post-hoc justification (for ourselves and others).

▲

RansomStark

2 months ago

[-]

> Sounds very human

well yes, of course it does, that article goes out of its way to anthropomorphize LLMs, while providing very little substance

▲

jimbokun

2 months ago

[-]

Isn't the point of computers to have machines that improve on default human weaknesses, not just reproduce them at scale?

▲

canadaduane

2 months ago

[-]

They've largely been complementary strengths, with less overlap. But human language is state-of-the-art, after hundreds of thousands of years of "development". It seems like reproducing SOTA (i.e. the current ongoing effort) is a good milestone for a computer algorithm as it gains language overlap with us.

▲

floydnoel

2 months ago

[-]

Why would computers have just one “point”? They have been used for endless purposes and those uses will expand forever

▲

learningstud

2 months ago

[-]

Exactly, most of us behave in almost the same as AI does. We finally have a mirror to reflect upon.

▲

throwway120385

2 months ago

[-]

The other very human thing to do is invent disciplines of thought so that we don't just constantly spew bullshit all the time. For example you could have a discipline about "pursuit of facts" which means that before you say something you mentally check yourself and make sure it's actually factually correct. This is how large portions of the populace avoid walking around spewing made up facts and bullshit. In our rush to anthropomorphize ML systems we often forget that there are a lot of disciplines that humans are painstakingly taught from birth and those disciplines often give rise to behaviors that the ML-based system is incapable of like saying "I don't know the answer to that" or "I think that might be an unanswerable question."

▲

dcow

2 months ago

[-]

Are they incapable? Or are they just not taught the discipline?

▲

jerf

2 months ago

[-]

In a way, the main problem with LLMs isn't that they are wrong sometimes. We humans are used to that. We encounter people who are professionally wrong all the time. Politicians, con-men, scammers, even people who are just honestly wrong. We have evaluation metrics for those things. Those metrics are flawed because there are humans on the other end intelligently gaming those too, but generally speaking we're all at least trying.

LLMs don't fit those signals properly. They always sound like an intelligent person who knows what they are talking about, even when spewing absolute garbage. Even very intelligent people, even very intelligent people in the field of AI research are routinely bamboozled by the sheer swaggering confidence these models convey in their own results.

My personal opinion is that any AI researcher who was shocked by the paper lynguist mentioned ought to be ashamed of themselves and their credulity. That was all obvious to me; I couldn't have told you the exact mechanism the arithmetic was being performed (though what is was doing was well in the realm of what I would have expected from a linguistic AI trying to do math), but the fact that its chain of reasoning bore no particular resemblance to how it drew its conclusions was always obvious. A neural net has no introspection on itself. It doesn't have any idea "why" it is doing what it is doing. It can't. There's no mechanism for that to even exist. We humans are not directly introspecting our own neural nets, we're building models of our own behavior and then consulting the models, and anyone with any practice doing that should be well aware of how those models can still completely fail to predict reality!

Does that mean the chain of reasoning is "false"? How do we account for it improving performance on certain tasks then? No. It means that it is occurring at a higher level and a different level. It is quite like humans imputing reasons to their gut impulses. With training, combining gut impulses with careful reasoning is actually a very, very potent way to solve problems. The reasoning system needs training or it flies around like an unconstrained fire hose uncontrollably spraying everything around, but brought under control it is the most powerful system we know. But the models should always have been read as providing a rationalization rather than an explanation of something they couldn't possibly have been explaining. I'm also not convinced the models have that "training" either, nor is it obvious to me how to give it to them.

(You can't just prompt it into a human, it's going to be more complicated than just telling a model to "be carefully rational". Intensive and careful RHLF is a bare minimum, but finding humans who can get it right will itself be a challenge, and it's possible that what we're looking for simply doesn't exist in the bias-set of the LLM technology, which is my base case at this point.)

▲

jmaker

2 months ago

[-]

I haven’t used Cursor yet. Some colleagues have and seemed happy. I’ve had GitHub Copilot on for what feels like a couple years, a few days ago VS Code was extended to provide an agentic workflow, MCP, bring-your-own-key, it interprets instructions in a codebase. But the UX and the outputs are bad in over 3/4 of cases. It’s a nuisance to me. It injects bad code even though it has the full context. Is Cursor genuinely any better?

To me it feels like people that benefit from or at least enjoy that sort of assistance and I solve vastly different problems and code very differently.

I’ve done exhausting code reviews on juniors’ and middles’ PRs but what I’ve been feeling lately is that I’m reviewing changes introduced by a very naive poster. It doesn’t even type-check. Regardless of whether it’s Claude 3.7, o1, o3-mini, or a few models from Hugging Face.

I don’t understand how people find that useful. Yesterday I literally wasted half an hour for a test suite setup a colleague of mine introduced to the codebase that wasn’t good, and I tried delegating that fix to several of the Copilot models. All of them missed the point, some even introduced security vulnerabilities in the process invalidating JWT validation, I tried “vide coding” it till it works, until I gave up in frustration and just used an ordinary search engine, which led me to the docs, in which I immediately found the right knob. I reverted all that crap and did the simple and correct thing. So my conclusion was simple: vibe coding and LLMs made the codebase unnecessarily more complicated and wasted my time. How on earth do people code whole apps with that?

▲

trilbyglens

2 months ago

[-]

I think it works until it doesn't. The nature of technical debt of this kind means you can sort of coast on things until the complexity of the system reaches such a level that it's effectively painted into a corner, and nothing but a massive teardown will do as a fix.

▲

ScottBurson

2 months ago

[-]

> The model genuinely believes it’s giving a correct reasoning chain

The model doesn't "genuinely believe" anything.

▲

nickledave

2 months ago

[-]

Yes

https://link.springer.com/article/10.1007/s10676-024-09775-5

> # ChatGPT is bullshit

> Recently, there has been considerable interest in large language models: machine learning systems which produce human-like text and dialogue. Applications of these systems have been plagued by persistent inaccuracies in their output; these are often called “AI hallucinations”. We argue that these falsehoods, and the overall activity of large language models, is better understood as bullshit in the sense explored by Frankfurt (On Bullshit, Princeton, 2005): the models are in an important way indifferent to the truth of their outputs. We distinguish two ways in which the models can be said to be bullshitters, and argue that they clearly meet at least one of these definitions. We further argue that describing AI misrepresentations as bullshit is both a more useful and more accurate way of predicting and discussing the behaviour of these systems.

▲

skrebbel

2 months ago

[-]

Offtopic but I'm still sad that "On Bullshit" didn't go for that highest form of book titles, the single noun like "Capital", "Sapiens", etc

▲

mvieira38

2 months ago

[-]

Starting with "On" is cooler in philosophical tradition, though, starting in classical and medieval times, e.g. On Interpretation, On the Heavens, etc by Aristotle, De Veritate, De Malo, etc. by Aquinas. Capital is actually "Das Kapital", too

▲

pas

2 months ago

[-]

It's very hipster, Das Kapital. (with the dot/period, check the cover https://en.wikipedia.org/wiki/Das_Kapital#/media/File:Zentra... )

But in English it would be just "Capital", right? (The uncountable nouns are rarely used with articles, it's "happiness" not "the happiness". See also https://old.reddit.com/r/writing/comments/12hf5wd/comment/jf... )

▲

skrebbel

2 months ago

[-]

Yeah so I meant the Piketty book, not Marx. But I googled it and turns out it's actually named "Capital in the Twenty-First Century", which disappoints me even more than "On Bullshit"

▲

pas

2 months ago

[-]

And, for the full picture it's probably important to consider that the main claim of the book is based on very unreliable data/methodology. (Though note that it does not necessarily make the claim false! See [1])

https://marginalrevolution.com/marginalrevolution/2017/10/pi...

And then later similar claims about inequality were similarly made using bad methodology (data).

https://marginalrevolution.com/marginalrevolution/2023/12/th...

[1] "Indeed, in some cases, Sutch argues that it has risen more than Piketty claims. Sutch is rather a journeyman of economic history upset not about Piketty’s conclusions but about the methods Piketty used to reach those conclusions."

▲

skrebbel

2 months ago

[-]

You misunderstand. I never read it. I simply liked the title, at least before I understood "Capital" that wasn't actually the title.

▲

ModernMech

2 months ago

[-]

It's a huge problem. I just can't get past it and I get burned by it every time I try one of these products. Cursor in particular was one of the worst; the very first time I allowed it to look at my codebase, it hallucinated a missing brace (my code parsed fine), "helpfully" inserted it, and then proceeded to break everything. How am I supposed to trust and work with such a tool? To me, it seems like the equivalent of lobbing a live hand grenade into your codebase.

Don't get me wrong, I use AI every day, but it's mostly as a localized code complete or to help me debug tricky issues. Meaning I've written and understand the code myself, and the AI is there to augment my abilities. AI works great if it's used as a deductive tool.

Where it runs into issues is when it's used inductively, to create things that aren't there. When it does this, I feel the hallucinations can be off the charts -- inventing APIs, function names, entire libraries, and even entire programming languages on occasion. The AI is more than happy to deliver any kind of information you want, no matter how wrong it is.

AI is not a tool, it's a tiny Kafkaesque bureaucracy inside of your codebase. Does it work today? Yes! Why does it work? Who can say! Will it work tomorrow? Fingers crossed!

▲

yodsanklai

2 months ago

[-]

You're not supposed to trust the tool, you're supposed to review and rework the code before submitting for external review.

I use AI for rather complex tasks. It's impressive. It can make a bunch of non-trivial changes to several files, and have the code compile without warnings. But I need to iterate a few times so that the code looks like what I want.

That being said, I also lose time pretty regularly. There's a learning curve, and the tool would be much more useful if it was faster. It takes a few minutes to make changes, and there may be several iterations.

▲

ryandrake

2 months ago

[-]

> You're not supposed to trust the tool, you're supposed to review and rework the code before submitting for external review.

It sounds like the guys in this article should not have trusted AI to go fully open loop on their customer support system. That should be well understood by all "customers" of AI. You can't trust it to do anything correctly without human feedback/review and human quality control.

▲

schmichael

2 months ago

[-]

> You're not supposed to trust the tool

This is just an incredible statement. I can't think of another development tool we'd say this about. I'm not saying you're wrong, or that it's wrong to have tools we can't just, just... wow... what a sea change.

▲

ModernMech

2 months ago

[-]

Imagine! Imagine if 0.05% of the time gcc just injected random code into your binaries. Imagine, you swing a hammer and 1% of the time it just phases into the wall. Tools are supposed to be reliable.

▲

arvinsim

2 months ago

[-]

There are no existing AI tools that guarantee correct code 100% of the time.

If there is such a tool, programmers will be on path of immediate reskilling or lose their jobs very quickly.

▲

ryandrake

2 months ago

[-]

Imagine if your compiler just randomly and non-deterministically compiled valid code to incorrect binaries, and the tool's developer couldn't really tell you why it happens, how often it was expected to happen, how severe the problem was expected to be, and told you to just not trust your compiler to create correct machine code.

Imagine if your calculator app randomly and non-deterministically performed arithmetic incorrectly, and you similarly couldn't get correctness expectations from the developer.

Imagine if any of your communication tools randomly and non-deterministically translated your messages into gibberish...

I think we'd all throw away such tools, but we are expected to accept it if it's an "AI tool?"

▲

andrei_says_

2 months ago

[-]

Imagine that you yourself never use these tools directly but your employees do. And the sellers of said tools swear that the tools are amazing and correct and will save you millions.

They keep telling you that any employee who highlights problems with the tools are just trying to save their job.

Your investors tell you that the toolmakers are already saving money for your competitors.

Now, do you want that second house and white lotus vacation or not?

Making good tools is difficult. Bending perception (“is reality”) is easier and enterprise sales, just like good propaganda, work. The gold rush will leave a lot of bodies behind but the shovelmakers will make a killing.

▲

ModernMech

2 months ago

[-]

I feel like there's a lot of motivated reasoning going on, yeah.

▲

arvinsim

2 months ago

[-]

If you think of AI like a compiler, yes we should throw away such tools because we expect correctness and deterministic outcomes

If you think of AI like a programmer, no we shouldn't throw away such tools because we accept them as imperfect and we still need to review.

▲

bigstrat2003

2 months ago

[-]

> If you think of AI like a programmer, no we shouldn't throw away such tools because we accept them as imperfect and we still need to review.

This is a common argument but I don't think it holds up. A human learns. If one of my teammates or I make a mistake, when we realize it we learn not to make that mistake in the future. These AI tools don't do that. You could use a model for a year, and it'll be just as unreliable as it is today. The fact that they can't learn makes them a nonstarter compared to humans.

▲

learningstud

2 months ago

[-]

Edsgar Dijkstra!

▲

ToValueFunfetti

2 months ago

[-]

If the only calculators that existed failed at 5% of the calculations, or if the only communication tools miscommunicated 5% of the time, we would still use both all the time. They would be far less than 95% as useful as perfect versions, but drastically better then not having the tools at all.

▲

gitremote

2 months ago

[-]

Absolutely not. We'd just do the calculations by hand, which is better than running the 95%-correct calculator and then doing the calculations by hand anyway to verify its output.

▲

ToValueFunfetti

2 months ago

[-]

Suppose you work in a field where getting calculations right is critical. Your engineers make mistakes less than .01% of the time, but they do a lot of calculations and each mistake could cost $millions or lives. Double- and triple-checking help a lot, but they're costly. Here's a machine that verifies 95% of calculations, but you'd still have to do 5% of the work. Shall I throw it away?

Unreliable tools have a good deal of utility. That's an example of them helping reduce the problem space, but they also can be useful in situations where having a 95% confidence guess now matters more that a 99.99% confidence one in ten minutes- firing mortars in active combat, say.

There's situations where validation is easier than computation; canonically this is factoring, but even division is much simpler than multiplication. It could very easily save you time to multiply all of the calculator's output by the dividend while performing both a multiplication and a division for the 5% that are wrong.

edit: I submit this comment and click to go the front page and right at the top is Unsure Calculator (no relevance). Sorry, I had to mention this

▲

diputsmonro

2 months ago

[-]

> Here's a machine that verifies 95% of calculations, but you'd still have to do 5% of the work.

The problem is that you don't know which 5% are wrong. The AI is confidently wrong all the time. So the only way to be sure is to double check everything, and at some point its easier to just do it the right way.

Sure, some things don't need to be perfect. But how much do you really want to risk? This company thought a little bit of potential misinformation was acceptable, and so it caused a completely self inflicted PR scandal, pissed off their customer base, and lost them a lot of confidence and revenue. Was that 5% error worth it?

Stories like this are going to keep coming the more we rely on AI to do things humans should be doing.

Someday you'll be affected by the fallout of some system failing because you happen to wind up in the 5% failure gap that some manager thought was acceptable (if that manager even ran a calculation and didn't just blindly trust whatever some other AI system told them) I just hope it's something as trivial as an IDE and not something in your car, your bank, or your hospital. But certainly LLMs will be irresponsibly shoved into all three within the next few years, if it's not there already.

▲

ToValueFunfetti

2 months ago

[-]

>The problem is that you don't know which 5% are wrong

This is not a problem in my unreliable calculator use-cases; are you disputing that or dropping the analogy?

Because I'd love to drop the analogy. You mention IDEs- I routinely use IntelliJ's tab completion, despite it being wrong >>5% of the time. I have to manually verify every suggestion. Sometimes I use it and then edit the final term of a nested object access. Sometimes I use the completion by mistake, clean up with backspace instead of undo, and wind up submitting a PR that adds an unused dependency. I consider it indispensable to my flow anyway. Maybe others turn this off?

You mention hospitals. Hospitals run loads of expensive tests every day with a greater than 5% false positive and false negative rate. Sometimes these results mean a benign patient undergoes invasive further testing. Sometimes a patient with cancer gets told they're fine and sent home. Hospitals continue to run these tests, presumably because having a 20x increase in specificity is helpful to doctors, even if it's unreliable. Or maybe they're just trying to get more money out of us?

Since we're talking LLMs again, it's worth noting that 95% is an underestimate of my hit rate. 4o writes code that works more reliably than my coworker does, and it writes more readable code 100% of the time. My coworker is net positive for the team. His 2% mistake rate is not enough to counter the advantage of having someone there to do the work.

An LLM with a 100% hit rate would be phenomenal. It would save my company my entire salary. A 99% one is way worse; they still have to pay me to use it. But I find a use for the 99% LLM more-or-less every day.

▲

gitremote

2 months ago

[-]

> This is not a problem in my unreliable calculator use-cases; are you disputing that or dropping the analogy?

If you use an unreliable calculator to sum a list of numbers, you then need to use a reliable method to sum the numbers to validate that the unreliable calculator's sum is correct or incorrect.

▲

ToValueFunfetti

2 months ago

[-]

Yes, so in my first example in the GP, this happens first. Humans do the work. The calculator double checks and gives me a list of all errors plus 5% of the non-errors, and I only need to double check that list.

In my third example, the calculator does the hard work of dividing, and humans can validate by the simpler task of multiplication, only having to do extra work 5% of the time.

(In my second, the unreliablity is a trade-off against speed, and we need the speed more.)

In all cases, we benefit from the unreliable tool despite not knowing when it is unreliable.

▲

gitremote

2 months ago

[-]

In your first example, you appear to assume that for calculations where "each mistake could cost $millions or lives", engineers who calculated by hand typically didn't double-check by redoing the calculation, so a second check with a 95% accuracy tool is better than nothing. This assumption is false. I suggest you watch the 2016 film Hidden Figures to understand the level of safety at NASA when calculations were done by hand. You are suggesting lowering safety standards, not increasing them.

Your third example is unclear. No calculators can perform factoring of large numbers, because that is the expected ability of future quantum computers that can break RSA encryption. It is also unclear why multiplication and division have different difficulties, when dividing by n is equal to multiplying by 1/n.

▲

Tainnor

2 months ago

[-]

> It is also unclear why multiplication and division have different difficulties, when dividing by n is equal to multiplying by 1/n.

Well sure, but once you multiply by 1/n you leave N (or Z) and enter Q, and I suspect that's what makes it more difficult because Q is just a much more complex structure because it formally consists of equivalence relations. In fact it's easy to divide an integer x by an integer y, it's just x/y ... the problem is that we usually want the fraction in lowest terms, though.

▲

ToValueFunfetti

2 months ago

[-]

>you appear to assume that for calculations where "each mistake could cost $millions or lives", engineers who calculated by hand typically didn't double-check by redoing the calculation

Not at all! For any n extra checks, having an n+1 phase that takes a 20th of the effort is beneficial. I did include triple-checks to gesture at this.

>It is also unclear why multiplication and division have different difficulties, when dividing by n is equal to multiplying by 1/n.

This actually fascinates me. Computers and human both take longer to divide than to multiply (in computers, by roughly an order of magnitude!) I'm not really sure why this is in a fundamental information theory kind of way, but it being true in humans is sufficient to make my point.

To address your specific criticism: you haven't factored out the division there, you've just changed the numerator to 1. I'd much rather do 34/17 in my head than 34 * (1/17).

▲

ModernMech

2 months ago

[-]

I'd like to second the point made to you in this thread that went without reply: https://news.ycombinator.com/item?id=43702895

It's true that we use tools with uncertainty all the time, in many domains. But crucially that uncertainty is carefully modeled and accounted for.

For example, robots use sensors to make sense of the world around them. These sensors are not 100% accurate, and therefore if the robots rely on these sensors to be correct, they will fail.

So roboticists characterize and calibrate sensors. They attempt to understand how and why they fail, and under what conditions. Then they attempt to cover blind spots by using orthogonal sensing methods. Then they fuse these desperate data into a single belief of the robot's state, which include an estimate of its posterior uncertainty. Accounting for this uncertainty in this way is what keeps planes in the sky, boats afloat, and driverless cars on course.

With LLMs It seems like we are happy to just throw out all this uncertainty modeling and to leave it up to chance. To draw an analogy to robotics, what we should be doing is taking the output from many LLMs, characterizing how wrong they are, and fusing them into a final result, which is provided to the user with a level of confidence attached. Now that is something I can use in an engineering pipeline. That is something that can be used as a foundation to something bigger.

▲

ToValueFunfetti

2 months ago

[-]

>went without reply

Yeah, I was getting a little self-conscious about replying to everyone and repeating myself a lot. It felt like too much noise.

But my first objection here is to repeat myself- none of my examples are sensitive to this problem. I don't need to understand what conditions cause the calculator/IDE/medical test/LLM to fail in order to benefit from a 95% success rate.

If I write a piece of code, I try to understand what it does and how it impacts the rest of the app with high confidence. I'm still going to run the unit test suite even if it has low coverage, and even if I have no idea what the tests actually measure. My confidence in my changes will go up if the tests pass.

This is one use of LLMs for me. I can refactor a piece of code and then send ChatGPT the before and after and ask "Do these do the same thing". I'm already highly confident that they do, but a yes from the AI means I can be more confident. If I get a no, I can read its explanation and agree or disagree. I'm sure it can get this wrong (though it hasn't after n~=100), but that's no reason to abandon this near-instantaneous, mostly accurate double-check. Nor would I give up on unit testing because somebody wrote a test of implementation details that failed after a trivial refactor.

I agree totally that having a good model of LLM uncertainty would make them orders of magnitude better (as would, obviously, removing the uncertainty altogether). And I wouldn't put them in a pipeline or behind a support desk. But I can and do use them for great benefit every day, and I have no idea why I should prefer to throw away the useful thing I have because it's imperfect.

▲

ModernMech

2 months ago

[-]

> none of my examples are sensitive to this problem.

That's not true. You absolutely have to understand those conditions because when you try to use those things outside of their operating ranges, they fail at a higher than the nominal rate.

> I'm still going to run the unit test suite even if it has low coverage, and even if I have no idea what the tests actually measure. My confidence in my changes will go up if the tests pass.

Right, your confidence goes up because you know that if the test passes, that means the test passed. But if the test suite can probabilistically pass even though some or all of the tests actually fail, then you will have to fall back to the notions of systematic risk management in my last post.

> I can refactor a piece of code and then send ChatGPT the before and after and ask "Do these do the same thing". I'm already highly confident that they do, but a yes from the AI means I can be more confident. If I get a no, I can read its explanation and agree or disagree. I'm sure it can get this wrong (though it hasn't after n~=100)

This n is very very small for you to be confident the behavior is as consistent as you expect. In fact, it gets this wrong all the time. I use AI in a class environment so I see n=100 on a single day. When you get to n~1k+ you see all of these problems where it says things are one way but really thing are another.

> mostly accurate double-check

And that's the problem right there. You can say "mostly accurate" but you really have no basis to assert this, past your own experience. And even if it's true, we still need to understand how wrong it can be, because mostly accurate with a wild variance is still highly problematic.

> But I can and do use them for great benefit every day, and I have no idea why I should prefer to throw away the useful thing I have because it's imperfect.

Sure, they can be beneficial. And yes, we shouldn't throw them out. But that wasn't my original point, I wasn't suggesting that. What I had said was that they cannot be relied on, and you seem to agree with me in that.

▲

Tainnor

2 months ago

[-]

> Unreliable tools have a good deal of utility.

This is generally true when you can quantify the unreliability. E.g. random prime number tests with a specific error rate can be combined so that the error rates multiply and become negligible.

I'm not aware that we can quantify the uncertainty coming out of LLM tools reliably.

▲

mrheosuper

2 months ago

[-]

> you'd still have to do 5% of the work

No, you still have to do 100% of the work.

▲

ToValueFunfetti

2 months ago

[-]

You simply do not. You do the math yourself to calculate 2(n) for n in [1, 2, 3, 4] and get [2, 5, 6, 8]. You plug it into your (75% accurate) unreliable calculator and get [3, 4, 6, 8]. You now know that you only need to recheck the first two (50%) of the entries.

▲

throwway120385

2 months ago

[-]

I resent becoming QA/QC for the machine instead of doing the same or better thinking myself.

▲

ToValueFunfetti

2 months ago

[-]

This is fair. I expect you would resent the tool even more if it was perfect and you couldn't even land a job in QA anymore. If that's the case, your resentment doesn't reflect on the usefulness of LLMs.

▲

jimbokun

2 months ago

[-]

> Here's a machine that verifies 95% of calculations

Which 95% did it get right?

▲

tevon

2 months ago

[-]

Stackoverflow is like this, you read an answer but are not fully sure if its right or if it fits your needs.

Of course there is a review system for a reason, but we frequently use "untrusted" tools in development.

That one guy in a github issue that said "this worked for me"

▲

shipp02

2 months ago

[-]

In Mechanical Engineering, this is 100% a thing with fluid dynamics simulation. You need to know if the output is BS based on a number of factors that I don't understand.

▲

theonething

2 months ago

[-]

> I can't think of another development tool we'd say this about.

Because no other dev tool actually generates unique code like AI does. So you treat it like the other components of your team that generates code, the other developers. Do you trust other developers to write good code without mistakes without getting it reviewed by others. Of course not.

▲

seabird

2 months ago

[-]

Yes, actually, I do! I trust my teammates with tens of thousands of hours of experience in programming, embedded hardware, our problem spaces, etc. to write from a fully formed worldview, and for their code to work as intended (as far as anybody can tell before it enters preliminary testing by users) by the time the rest of the team reviews it. Most code review is uneventful. Have some pride in your work and you'll be amazed at what's possible.

▲

theonething

2 months ago

[-]

so your saying that yes you do "trust other developers to write good code without mistakes without getting it reviewed by others."

And then you say "by the time the rest of the team reviews it. Most code review is uneventful."

So you trust your team to develop without the need for code review but yet, your team does code review.

So what is the purpose of these code reviews? Is it the case that you actually don't think they are necessary, but perhaps management insists on them? You actually answer this question yourself:

> Most code review is uneventful.

Keyword here is "most" as opposed to "all" So based your team's applied practices and your own words, code review is for the purpose of catching mistakes and other needed corrections.

But it seems to me if you trust your team not to make mistakes, code review is superfluous.

As an aside, it seems your team culture doesn't make room for juniors because if your team had juniors I think it would be even more foolish to trust them not to make mistakes. Maybe a junior free culture works for your company, but that's not the case for every company.

My main point is code review is not superfluous no matter the skill level; junior, senior, or AI simply because everyone and every AI makes mistakes. So I don't trust those three classes of code emitters to not ever make mistakes or bad choices (i.e. be perfect) and therefore I think code review is useful.

Have some honesty and humility and you'll amazed at what's possible.

▲

seabird

2 months ago

[-]

I never said that code review was useless, I said "yes, I do" to your question as to whether or not I "trust other developers to write good code without mistakes without getting it reviewed by others". Of course I can trust them to do the right thing even when nobody's looking, and review it anyway in the off-chance they overlooked something. I can't trust AI to do that.

The purpose of the review is to find and fix occasional small details before it goes to physical testing. It does not involve constant babysitting of the developer. It's a little silly to bring up honesty when you spent that entire comment dancing around the reality that AI makes an inordinately large number of mistakes. I will pick the domain expert who refuses to touch AI over a generic programmer with access to it ten times out of ten.

The entire team as it is now (me included) were juniors. It's a traditional engineering environment in a location where people don't aggressively move between jobs at the drop of a hat. You don't need to constantly train younger developers when you can retain people.

▲

theonething

2 months ago

[-]

You spend your comment dancing around the fact that everyone makes mistakes and yet you claim you trust your team not to make mistakes.

> I "trust other developers to write good code without mistakes without getting it reviewed by others". Of course I can trust them to do the right thing even when nobody's looking, and review it anyway in the off-chance they overlooked something.

You're saying yes, I trust other developers to not make mistakes, but I'll check anyways in case they do. If you really trusted them not to make mistakes, you wouldn't need to check. They (eventually) will. How can I assert that? Because everyone makes mistakes.

It's absurd to expect anyone to not make mistakes. Engineers build whole processes to account for the fact that people, even very smart people make mistakes.

And it's not even just about mistakes. Often times, other developers have more context, insight or are just plain better and can offer suggestions to improve the code during review. So that's about teamwork and working together to make the code better.

I fully admit AI makes mistakes, sometimes a lot of them. So it needs code review . And on the other hand, sometimes AI can really be good at enhancing productivity especially in areas of repetitive drudgery so the developer can focus on higher level tasks that require more creativity and wisdom like architectural decisions.

> I will pick the domain expert who refuses to touch AI over a generic programmer with access to it ten times out of ten.

I would too, but I won't trust them not to make mistakes or occasional bad decisions because again, everybody does.

> You don't need to constantly train younger developers when you can retain people.

But you do need to train them initially. Or do you just trust them to write good code without mistakes too?

▲

anonymars

2 months ago

[-]

I trust my colleagues to write code that compiles, at the very least

▲

ModernMech

2 months ago

[-]

Oh at the very least I trust them to not take code that compiles and immediately assess that it's broken.

▲

chrisweekly

2 months ago

[-]

But of course everyone absolutely NEEDS to use AI for codereviews! How else could the huge volume of AI-generated code be managed?

▲

forgetfreeman

2 months ago

[-]

"Do you trust other developers to write good code without mistakes without getting it reviewed by others."

Literally yes. Test coverage and QA to catch bugs sure but needing everything manually reviewed by someone else sounds like working in a sweatshop full of intern-level code bootcamp graduates, or if you prefer an absolute dumpster fire of incompetence.

▲

ryandrake

2 months ago

[-]

I would accept mistakes and inconsistency from a human, especially one not very experienced or skilled. But I expect perfection and consistency from a machine. When I command my computer to do something, I expect it to do it correctly, the same way every time, to convert a particular input to an exact particular output, every time. I don't expect it to guess, or randomly insert garbage, or behave non-deterministically. Those things are called defects(bugs) and I'd want them to be fixed.

▲

tevon

2 months ago

[-]

This seems like a particularly limited view of what a machine is. Specifically expecting it to behave deterministically.

▲

ModernMech

2 months ago

[-]

Still, the whole Unix philosophy of building tools starts with a foundation of building something small that can do one thing well. If that is your foundation, you can take advantage of composability and create larger tools that are more capable. The foundation of all computing today is built on this principle of design.

Building on AI seems more like building on a foundation of sand, or building in a swamp. You can probably put something together, but it's going to continually sink into the bog. Better to build on a solid foundation, so you don't have to continually stop the thing from sinking, so you can build taller.

▲

forgetfreeman

2 months ago

[-]

Would you welcome your car behaving in a nondeterministic fashion?

▲

senordevnyc

2 months ago

[-]

Then you are going to hate the future.

▲

ryandrake

2 months ago

[-]

Way ahead of you. I already hate the present, at least the current sad state of the software industry.

▲

forgetfreeman

2 months ago

[-]

Exactly this.

▲

theonething

2 months ago

[-]

Ok, here I thought requiring PR review and approval before merging was standard industry best practice. I guess all the places I've worked have been doing it wrong?

▲

forgetfreeman

2 months ago

[-]

There's a lot of shit that has become "best practice" over the last 15 years, and a lot more that was "best practice" but fell out of favor because reasons. All of it exists on a continuum of what is actually reasonable given the circumstances. Reviewing pull requests is one of those things that is reasonable af in theory, produces mediocre results in practice, and is frequently nothing more than bureaucratic overhead. Consider a case where an individual adds a new feature to an existing codebase. Given they are almost certainly the only one who has spent significant time researching the particulars of the feature set in question, and are the only individual with any experience at all with the new code, having another developer review it means you've got inexperienced, low-info eyes examining something they do not fully understand, and will have to take some amount of time to come up to speed on. Sure they'll catch obvious errors, but so would a decent test suite.

Am I arguing in favor of egalitarian commit food fights with no adults in the room? Absolutely not. But demanding literally every change go through a formal review process before getting committed, like any other coding dogma, has a tendency to generate at least as much bullshit as it catches, just a different flavor.

▲

Tainnor

2 months ago

[-]

Code review is actually one of the few practices for which research does exist[0] which points in the direction of it being generally good at reducing defects.

Additionally, in the example you share, where only one person knows the context of the change, code review is an excellent tool for knowledge sharing.

[0]: https://dl.acm.org/doi/10.1145/2597073.2597076, for example

▲

forgetfreeman

2 months ago

[-]

Oh I have no doubt it's an excellent tool for knowledge sharing. So are mailing lists (nobody reads email) and internal wikis (evergreen fist fight to get someone, anyone, to update). Despite best intentions knowledge sharing regimes are little more than well-intentioned pestering with irrelevant information that is absolutely purged from headspace during any number of daily/weekly/quarterly context switches. As I said, mediocre results.

▲

Tainnor

2 months ago

[-]

You're free to believe whatever you want, but again, this is one of the few things that we actually empirically know to be working.

▲

rixed

2 months ago

[-]

And there is worst: in the cases when the reviewer has actually some knowledge of the problem at hand, she might say "oh you did all this to add that feature? But it's actually already there. You just had to include that file and call function xyz". Or "oh but two months ago that very same topic was discussed and it was decided that it would make more sense to wait for module xyz to be refactored in order to make it easier ", etc.

▲

gtirloni

2 months ago

[-]

1) Once you get it to output something you like, do you check all the lines it changed? Is there a threshold after which you just... hope?

2) No matter what the learning curve, you're using a statistical tool that outputs in probabilities. If that's fine for your workflow/company, go for it. It's just not what a lot of developers are okay with.

Of course it's a spectrum with the AI deniers in one corner and the vibe coders in the other. I personally won't be relying 100% on a tool and letting my own critical thinking atrophy, which seems to be happening, considering recent studies posted here.

▲

nkoren

2 months ago

[-]

I've been doing AI-assisted coding for several months now, and have found a good balance that works for me. I'm working in Typescript and React, neither of which I know particularly well (although I know ES6 very well). In most cases, AI is excellent at tasks which involve writing quasi-custom boilerplate (eg. tests which require a lot of mocking), and at answering questions of how I should do _X_ in TS/React. For the latter, those are undoubtedly questions I could eventually find the answers on Stack Overflow and deduce how to apply those answers to my specific context -- but it's orders of magnitude faster to get the AI to do that for me.

Where the AI fails is in doing anything which requires having a model of the world. I'm writing a simulator which involves agents moving through an environment. A small change in agent behaviour may take many steps of the simulator to produce consequential effects, and thinking through how that happens -- or the reverse: reasoning about the possible upstream causes of some emergent macroscopic behaviour -- requires a mental model of the simulation process, and AI absolutely does _not_ have that. It doesn't know that it doesn't have that, and will therefore hallucinate wildly as it grasps at an answer. Sometimes those hallucinations will even hit the mark. But on the whole, if a mental model is required to arrive at the answer, AI wastes more time than it saves.

▲

jimbokun

2 months ago

[-]

> AI is excellent at tasks which involve writing quasi-custom boilerplate (eg. tests which require a lot of mocking)

I wonder if anyone has compared how well the AI auto-generating approach works compared to meta programming approaches (like Lisp macros) meant to address the same kind of issues with repetitive code.

▲

kazinator

2 months ago

[-]

The generation of volumes of boiler plate takes effort; nobody likes to do it.

The problem is, that phase is not the full life cycle of the boiler plate.

You have to live with it afterward.

▲

pjerem

2 months ago

[-]

> 1) Once you get it to output something you like, do you check all the lines it changed? Is there a threshold after which you just... hope?

Not op but yes. It sometimes takes a lot of time but I read everything. It still faster than nothing. Also, I ask very precise changes to the AI so it doesn’t generate huge diffs anyway.

Also for new code, TDD works wonders with AI : let it write the unit tests (you still have to be mindful of what you want to implement) and ask it to implement the code that run the tests. Since you talk the probabilistic output, the tool is incredibly good at iterating over things (running and checking tests) and also, unit tests are, in themselves, a pretty perfect prompt.

▲

iforgotpassword

2 months ago

[-]

> It sometimes takes a lot of time but I read everything. It still faster than nothing.

Opposite experience for me. It reliably fails at more involved tasks so that I don't even try anymore. Smaller tasks that are around a hundred lines maybe take me longer to review that I can just do it myself, even though it's mundane and boring.

The only time I found it useful is if I'm unfamiliar with a language or framework, where I'd have to spend a lot of time looking up how to do stuff, understand class structures etc. Then I just ask the AI and have to slowly step through everything anyways, but at least there's all the classes and methods that are relevant to my goal and I get to learn along the way.

▲

riffraff

2 months ago

[-]

How do you have it write tests before the code? It seems writing a prompt for the LLM to generate the tests would take the same time as writing the tests themselves.

Unless you're thinking of repetitive code I can't imagine the process (I'm not arguing, I'm just curious of what you're flow looks like).

▲

yodsanklai

2 months ago

[-]

> Is there a threshold after which you just... hope?

Generally, all the code I write is reviewed by humans, so commits need to be small and easily reviewable. I can't submit something I don't understand myself or I may piss off my colleagues, or it may never get reviewed.

Now if it was a personal project or something with low value, I would probably be more lenient but I think if you use a statically typed language, the type system + unit tests can capture a lot of issues so it may be ok to have local blocks that you don't look in details.

▲

ModernMech

2 months ago

[-]

Yeah for me, I use AI with Rust and a suite of 1000 tests in my codebase. I also use CoPilot VS code plugin mostly, which as far as I can tell heavily weights toward local code around it and often it just writing code based on my other code. I've found AI to be a good macro debugger too, as macro debugging tools are severely lacking in most ecosystems.

But when I see people using these AI tools to write JavaScript of Python code wholesale from scratch, that's a huge question mark for me. Because how?? How are you sure that this thing works? How are you sure when you update it won't break? Indeed the answer seems to be "We don't know why it works, we can't tell you under which conditions it will break, we can't give you any performance guarantees because we didn't test or design for those, we can't give you any security guarantees because we don't know what security is and why that's important."

People forgot we're out here trying to do software engineering, not software generation. Eternal September is upon us.

▲

senordevnyc

2 months ago

[-]

1) Yes, I review every line it changed.

2) I find the tool analogy helpful but it has limits. Yes, it’s a stochastic tool, but in that sense it’s more like another mind, not a tool. And this mind is neither junior nor senior, but rather a savant.

▲

bigstrat2003

2 months ago

[-]

> You're not supposed to trust the tool, you're supposed to review and rework the code before submitting for external review.

Then it's not a useful tool, and I will decline to waste time on it.

▲

jorvi

2 months ago

[-]

> But I need to iterate a few times so that the code looks like what I want.

The LLM too. You can get a pretty big improvement by telling the LLM to "iterate 4 times on whichever code I want you to generate, but only show me the final iteration, and then continue as expected".

I personally just inject the request for 4 iterations into the system prompt.

▲

mrheosuper

2 months ago

[-]

If i dont trust my tool, i would never use it, or use something else better

▲

e3bc54b2

2 months ago

[-]

> You're not supposed to trust the tool, you're supposed to review and rework the code before submitting for external review.

The vibe coding guy said to forget the code exists and give in to vibes, letting the AI 'take care' of things. Review and rework sounds more like 'work' and less like 'vibe'.

▲

mediaman

2 months ago

[-]

I'd add that the deductive abilities translate to well-defined spec. I've found it does well when I know what APIs I want it to use, and what general algorithmic approaches I want (which are still sometimes brainstormed separately with an AI, but not within the codebase). I provide it a numbered outline of the desired requirements and approach to take, and it usually does a good job.

It does poorly without heavy instruction, though, especially with anything more than toy projects.

Still a valuable tool, but far from the dreamy autonomous geniuses that they often get described as.

▲

Mountain_Skies

2 months ago

[-]

Versioning in source control for even personal projects just got far more important.

▲

AdrianEGraphene

2 months ago

[-]

It's wild how people write without version control... Maybe I'm missing something.

▲

chrisweekly

2 months ago

[-]

yeah, "git init" (if you haven't botherered to create a template repo) is not exactly cumbersome.

▲

o11c

2 months ago

[-]

Thankfully modern source control doesn't reuse user-supplied filenames for its internals. In the dark ages, I destroyed more than one checkout using commands of the form:

  find -name '*somepattern*' -exec clobbering command ...

▲

skissane

2 months ago

[-]

> the very first time I allowed it to look at my codebase, it hallucinated a missing brace (my code parsed fine), "helpfully" inserted it, and then proceeded to break everything.

This is not an inherent flaw of LLMs, rather it is a flaw of a particular implementation-if you use guided sampling, so during sampling you only consider tokens allowed by the programming language grammar at that position, it becomes impossible for the LLM to generate ungrammatical output

> When it does this, I feel the hallucinations can be off the charts -- inventing APIs, function names, entire libraries,

They can use guided sampling for this too - if you know the set of function names which exist in the codebase and its dependencies, you can reject tokens that correspond to non-existent function names during sampling

Another approach, instead of or as well as guided sampling, is to use an agent with function calling - so the LLM can try compiling the modified code itself, and then attempt to recover from any errors which occur.

▲

theonething

2 months ago

[-]

> it hallucinated a missing brace (my code parsed fine), "helpfully" inserted it, and then proceeded to break everything.

Your tone is rather hyperbolic here, making it sound like an extra brace resulted in a disaster. It didn't. It was easy to detect and easy to fix. Not a big deal.

▲

ModernMech

2 months ago

[-]

It's not a big deal in the sense that it's easily reversed, but it is a big deal in that it means the tool is unpredictably unhelpful. Of the properties that good tools in my workflow possess, "unpredictably unhelpful" does not make the top 100.

When a tool starts confidently inserting random wrong code into my 100% correct code, there's not much more I need to see to know it's not a tool for me. That's less like a tool and more like a vandal. That's not something I need in my toolbox, and I'm certainly not going to replace my other tools with it.

▲

nothrabannosir

2 months ago

[-]

https://dwheeler.com/essays/apple-goto-fail.html

▲

cryptoegorophy

2 months ago

[-]

I think that’s why Apple is very slow at rolling out AI if it ever actually will. Downside is way too big than the upside.

▲

saintfire

2 months ago

[-]

You say slowly, but in my opinion Apple made an out of character misstep by releasing a terrible UX to everyone. Apple intelligence is a running joke now.

Yes they didn't push it as hard as, say, copilot. I still think they got in way too deep way too fast.

▲

devmor

2 months ago

[-]

This is not the first time that Apple has released a terrible UX that very few users liked, and it certainly wont be the last.

I don’t necessarily agree with the post you’re responding to, but what I will give Apple credit for is making their AI offering unobtrusive.

I tried it, found it unwanted and promptly shut it off. I have not had to think about it again.

Contrast that with Microsoft Windows, or Google - both shoehorning their AI offering into as many facets of their products as possible, not only forcing their use, but in most cases actively degrading the functionality of the product in favor of this required AI functionality.

▲

trilbyglens

2 months ago

[-]

Yep. Replacing google assistant with Gemini before feature parity was even close is such a fuck-you to users.

▲

johnisgood

2 months ago

[-]

On Instagram, the search is now "powered by Meta AI". Cannot do anything about it.

▲

johnisgood

2 months ago

[-]

After a recent update, so does WhatsApp. There is "search with Meta AI". Probably will be rolled out or already is for all Meta products.

▲

throwaway2037

2 months ago

[-]

    > Apple made an out of character misstep by releasing a terrible UX to everyone

What about Apple Maps? That roll-out was awful.

▲

miki123211

2 months ago

[-]

Apple had their hand forced by Google on that one afaik.

Yes they knew Apple maps was bad and not up to standard yet, but they didn't really have any other choice.

▲

emn13

2 months ago

[-]

Of course they had a choice: they could have stuck with google maps for longer, and they probably also could have invested more in data and UI beforehand. They could have launched a submarine non-apple-branded product to test the waters. They could likely have done other things we haven't thought of here, in this thread.

Quite plausibly they just didn't realize how rocky the start would be, or perhaps they valued that immediate strategic autonomy more in the short-term that we think, and willingly chose to take the hit to their reputation rather than wait.

Regardless, they had choices.

▲

throwaway2037

2 months ago

[-]

    > They could have launched a submarine non-apple-branded product to test the waters.

This is a great idea. Are there any past Apple (or non-Apple) examples of this product release strategy?

▲

pavel_lishin

2 months ago

[-]

Sure, and now you can also say they had their hand forced by Google again, or by Anthropic, or by whatever other AI companies are out there.

▲

manmal

2 months ago

[-]

Remember „You are a bad user, I am a good bing“? Apple is just slower in fixing and improving things.

▲

stogot

2 months ago

[-]

Fast!? They were two years slow and still fell face flat, and then rolled back the software

▲

MichaelZuo

2 months ago

[-]

“Two years slow” relative to what?

Henry Ford was 23 years “slow” relative to Karl Benz.

▲

the_doctah

2 months ago

[-]

Apple innovation glazing makes me ill

▲

MichaelZuo

2 months ago

[-]

What does this mean?

▲

miki123211

2 months ago

[-]

Apple made a huge mistake by keeping their commitment to "local first" in the age of AI.

The models and devices just aren't quite there yet.

Once Google gets its shit together and starts deploying (cloud--based) AI features to Android devices en masse, Apple is going to have a really big problem on their hands.

Most users say that they want privacy, but if privacy comes in the way of features or UX, they choose the latter. Successful privacy-respecting companies (Apple, Signal) usually understand this, it's why they're successful, but I think Apple definitely chose the wrong tradeoff here.

▲

zdragnar

2 months ago

[-]

Investors seem to be starved for novelty right now. Web 2.0 is a given, web 3.0 is old, crypto has lost the shine, all that's left to jump on at the moment is AI.

Apple fumbled a bit with Siri, and I'm guessing they're not too keen to keep chasing everyone else, since outside of limited applications it turns out half baked at best.

Sadly, unless something shinier comes along soon, we're going to have to accept that everything everywhere else is just going to be awful. Hallucinations in your doctor's notes, legal rulings, in your coffee and laundry and everything else that hasn't yet been IoT-ified.

▲

timr

2 months ago

[-]

> we're going to have to accept that everything everywhere else is just going to be awful. Hallucinations in your doctor's notes, legal rulings, in your coffee and laundry and everything else that hasn't yet been IoT-ified.

I installed a logitech mouse driver (sigh) the other day, and in addition to being obtrusive and horrible to use, it jams an LLM into the UI, for some reason.

AI has reached crapware status in record time.

▲

pcthrowaway

2 months ago

[-]

> Hallucinations in your doctor's notes, legal rulings, in your coffee

"OK Replicator, make me one espresso with creamer"

"Making one espresso with LSD"

▲

Jagerbizzle

2 months ago

[-]

I'd like to get on the pre-order list for this product.

▲

VenturingVole

2 months ago

[-]

"all that's left to jump on at the moment is AI" -> No, it's the effective applications of AI. It's unprecedented.

I was in the VC space for a while previously, most pitch decks claimed to be using AI: But doing even the briefest of DD - it was generally BS. Now it's real.

With respect to everything being awful: One might say that's always been the case. However, now there's a chance (and requirement) to build in place safeguards/checks/evals and massively improve both speed and quality of services through AI.

Don't judge for the problems: Look at the exponential curve, think about how to solve the problems. Otherwise, you will get left behind.

▲

zdragnar

2 months ago

[-]

The problem isn't AI; it's just a tool. The problem is the people using it incorrectly because they don't understand it beyond the hype and surface details they hear about it.

Every week for the last few months, I get a recruiter for a healthcare startup note taking app with AI. It's just a rehash of all the existing products out there, but "with AI". It's the last place I want an overworked non-technical user relying on the computer to do the right thing, yet I've had at least four companies reach out with exactly that product. A few have been similar. All of them have been "with AI".

It's great that it is getting better, but at the end of the day, there's only so much it can be relied upon for, and I can't wait for something else to take away the spotlight.

▲

VenturingVole

2 months ago

[-]

Well put and you're correct: There IS a lot of hype/BS still sadly - as companies seek to jump on the hype train without effectively adapting. My karma took a serious hit for my last post - but yesterday I met with someone whose life has been profoundly impacted by AI:

- An extremely dedicated and high achieving professional, at the very top of her game with deep industry/sectoral knowledge: Successful and with outstanding connections. - Mother of a young child. - Tradition/requirement for success within the sector was/is working extremely long hours: 80-hour weeks are common.

She's implemented AI to automate many of her previous laborious tasks and literally cut down her required hours by 90%. She's now able to spend more time with her family, but also - able to now focus on growing/scaling in ways previously impossible.

Knowing how to use it, what to rely upon, what to verify and building in effective processes is the key. But today AI is at its worst and it already exceeds human performance in many areas.. it's only going in one direction.

Hopefully the spotlight becomes humanity being able to focus on what makes us human and our values, not mundane/routine tasks and allows us to better focus on higher-value/relationships.

▲

zdragnar

2 months ago

[-]

> Knowing how to use it, what to rely upon, what to verify and building in effective processes is the key. But today AI is at its worst and it already exceeds human performance in many areas.. it's only going in one direction.

I suppose this is the difference between an optimist and a pessimist. No matter how much better the tool gets, I don't see people getting better, and so I don't see the addition of LLM chatbots as ever improving things on the whole.

Yes, expert users get expert results. There's a reason why I use a chainsaw to buck logs instead of a hand saw, and it's also much the same reason that my wife won't touch it.

▲

e3bc54b2

2 months ago

[-]

90% is a big number. Assuming being an expert allows her to make better use of AI than most, that is still an astonishing number without knowing anything about the field in question. That makes me think that 80 hour work weeks are mostly unproductive. Again, assumption being an average person to be able to use AI less effectively, let's ballpark at half as effectively, I still get 40 hours per week of mostly non-sense work. How did we end up here as a society?!

▲

hexo

2 months ago

[-]

> it was generally BS. Now it's real.

Yes. Finally! Now it's real BS. I wouldn't touch it with 8 meter pole.

▲

sillyfluke

2 months ago

[-]

They already rolled out an "AI" product. Got humiliated pretty bad, and rolled it back. [0]

[0] https://www.bbc.com/news/articles/cq5ggew08eyo

▲

VenturingVole

2 months ago

[-]

They had an opportunity to actually adapt, to embrace getting rapid feedback/iterating: But they are not equipped for it culturally. Major lost opportunity as it could have been a driver of internal change.

I'm certain they'll get it right soon enough though. People were writing off Google in terms of AI until this year.. and oh how attitudes have changed.

▲

skinkestek

2 months ago

[-]

> People were writing off Google in terms of AI until this year.. and oh how attitudes have changed.

Just give Google a year or two.

Google has a pretty amazing history of both messing up products generally and especially "ai like" things, including search.

(Yes I used to defend Google until a few years ago.)

▲

tonyhart7

2 months ago

[-]

"to embrace getting rapid feedback/iterating"

that's the problem noo?? big company is sucks at that, you cant do that in certain company because sometimes its just not possible

▲

furyofantares

2 months ago

[-]

They also have text thread and email summaries. I still think it counts as a slow rollout.

▲

jmaker

2 months ago

[-]

Even the iOS and macOS typing correction engine has been getting worse for me over the past few OS updates. I’m now typing this on iOS, and it’s really annoying how it injects completely unrelated words, replaces minor typos with completely irrelevant words. Same in Safari on macOS. The previous release felt better than now, but still worse than a couple years ago.

▲

TylerE

2 months ago

[-]

It’s not just you. iOS auto correct has gotten damn near malicious. E seen it insert entire words out of nowhere

▲

trilbyglens

2 months ago

[-]

Spellcheck is an absolutely perfect example of what happens with technology long-term. Once the hype cycle is over for a certain tech, it gets left to languish, slowly degrading until it's completely useless. We should be far more outraged at how poor basic things like this still are in 2025. They are embarrassingly bad.

▲

nottorp

2 months ago

[-]

> it gets left to languish, slowly degrading until it's completely useless

What do you mean? Code shouldn't degrade if it's not changed. But the iOS spell checker is actively getting worse, meaning someone is updating it.

▲

TylerE

2 months ago

[-]

Real code has dependencies and they sometimes change, including growing undocumented behavior or new bugs.

▲

throwway120385

2 months ago

[-]

Yeah it finishes my sentences and goes back and replaces entire words with other words that are not even in the same category of noun, then replaces pronouns and conjunctions to completely make up a new sentence for me in something I've already typed. I'm not stupid and I meant what I typed. If I didn't mean what I typed I would have typed something else. Which I didn't.

▲

gambiting

2 months ago

[-]

>>if it ever actually will.

If they don't then I'd hope they get absolutely crucified by trade comissions everywhere, currently there are bilboards in my city advertising Apple AI even though it doesn't even exist yet - if it's never brought to the market then it's a serious case of misleading advertising.

▲

poink

2 months ago

[-]

Yet Apple has reenabled Apple Intelligence multiple times on my devices after OS updates despite me very deliberately and angrily disabling it multiple times

▲

m3kw9

2 months ago

[-]

When you got 1-2billion users a day doing maybe 10 billion prompts a day, it’s risky

▲

anonzzzies

2 months ago

[-]

Did anyone say that? They are an issue everywhere, including for code. But with code at least I can have tooling to automatically check and feed back that it hallucinated libraries, functions etc, but with just normal research / problems there is no such thing and you will spend a lot of time verifying everything.

▲

threeseed

2 months ago

[-]

I use Scala which has arguably the best compiler/type system with Cursor.

There is no world in which a compiler or tooling will save you from the absolute mayhem it can do. I’ve had it routinely try to re-implement third party libraries, modify code unrelated to what it was asked, quietly override functions etc.

It’s like a developer who is on LSD.

▲

Tainnor

2 months ago

[-]

I don't know Scala. I asked cursor to create a tutorial for me to learn Scala. It created two files for me, Basic.scala and Advanced.scala. The second one didn't compile and no matter how often I tried to paste the error logs into the chat, it couldn't fix the actual error and just made up something different.

▲

Terr_

2 months ago

[-]

Yeah, everyone wanted a thinking machine, but the best we can do right now is a dreaming machine... And dreams don't have to make sense.

▲

larodi

2 months ago

[-]

Developer on LSD is likely to hallucinate less in terms of how weird the LLM hallucinations are sometimes. Besides I know people, not myself, who fare very well on LSD and particularly when micro dosing Adderal style

▲

miningape

2 months ago

[-]

Mushrooms too! I find they get me into a flow state much better than acid (when microdosing).

▲

jmaker

2 months ago

[-]

Granted the Scala language is much more complex than Go. To produce something useful it must be capable of an equivalent of parsing the AST.

▲

felipefar

2 months ago

[-]

Yes, most people who have an incentive in pushing AI say that hallucinations aren't a problem, since humans aren't correct all the time.

But in reality hallucinations either make people using AI lose a lot of their time trying to stuck the LLMs from dead ends or render those tools unusable.

▲

Gormo

2 months ago

[-]

> Yes, most people who have an incentive in pushing AI say that hallucinations aren't a problem, since humans aren't correct all the time.

Humans often make factual errors, but there's a difference between having a process to validate claims against external reality, and occasionally getting it wrong, and having no such process, with all output being the product of internal statistical inference.

The LLM is engaging in the same process in all cases. We're only calling it a "hallucination" when its output isn't consistent with our external expectations, but if we regard "hallucination" as referring to any situation where the output for a wholly endogenous process is mistaken for externally validated information, then LLMs are only ever hallucinating, and are just designed in such a way that what they hallucinate has a greater than chance likelihood of representing some external reality.

▲

jimbokun

2 months ago

[-]

> Yes, most people who have an incentive in pushing AI say that hallucinations aren't a problem, since humans aren't correct all the time.

We have legal and social mechanisms in place for the way humans are incorrect. LLMs are incorrect in new ways that our legal and social systems are less prepared to handle.

If a support human lies about a change to policy, the human is fired and management communicates about the rogue actor, the unchanged policy, and how the issue has been handled.

How do you address an AI doing the same thing without removing the AI from your support system?

▲

anonzzzies

2 months ago

[-]

Still fine the company who uses the AI? It cannot be prevented with the current state of AIs so you will need a disclaimer and, if a user cancels, show the latest support chats in the crm for that user so you can add a human in the mix.

▲

manmal

2 months ago

[-]

You get some superficial checking by the compiler and test cases, but hallucinations that pass both are still an issue.

▲

anonzzzies

2 months ago

[-]

Absolutely, but at least you have some lines of defence while with real world info you have nothing. And the most offending stuff like importing a package that doesn't exist or using a function that doesn't exist does get caught and can be auto fixed.

▲

skyfaller

2 months ago

[-]

Such errors can be caught and auto-fixed for now, because LLMs haven't yet rotted the code that catches and auto-fixes errors. If slop makes it into your compiler etc., I wouldn't count on that being true in the future.

▲

rini17

2 months ago

[-]

Except when the hallucinated library exists and it's malicious. This is actually happening. Without AI, by using plain google you are less likely to fall for that (so far).

▲

jmaker

2 months ago

[-]

Until the model injects a subtle change to your logic that does type-check and then goes haywire in production. Just takes a colleague of yours under pressure and another one to review the PR, and then you’re on call and they out sick or on vacation.

▲

learningstud

2 months ago

[-]

People hallucinate all the time out of pressure or habit. We don't need AI for that. It's hard to tell most people from AI. Most people would fail Turing tests as subjects.

▲

_jonas

2 months ago

[-]

I see this fallacy often too.

My company provides hallucination detection software: https://cleanlab.ai/tlm/

But we somehow end up in sales meetings where the person who requested the meeting claims their AI does not hallucinate ...

▲

mntruell

2 months ago

[-]

(Cursor cofounder)

Apologies - something very clearly went wrong here. We’ve already begun investigating, and some very early results:

* Any AI responses used for email support are now clearly labeled as such. We use AI-assisted responses as the first filter for email support.

* We’ve made sure this user is completely refunded - least we can do for the trouble.

For context, this user’s complaint was the result of a race condition that appears on very slow internet connections. The race leads to a bunch of unneeded sessions being created which crowds out the real sessions. We’ve rolled out a fix.

Appreciate all the feedback. Will help improve the experience for future users.

▲

nextaccountic

2 months ago

[-]

Why did you remove this thread?

https://old.reddit.com/r/cursor/comments/1jyy5am/psa_cursor_...

(For reference, here it is in reveddit https://www.reveddit.com/v/cursor/comments/1jyy5am/psa_curso... - text from post was unfortunately not saved)

It's already locked and with a stickied comment from a dev clarifying what happened

Did you remove it so people can't find about this screwup when searching Google?

Anyway, if you acknowledge it was a mistake to remove the thread, could you please un-remove it?

▲

PrayagS

2 months ago

[-]

The whole subreddit is moderated poorly. I’ve seen plenty of users post on r/LocalLlama about how something negative or constructive they said on the Cursor sub was just removed.

▲

AyyEye

2 months ago

[-]

Why would anyone trust you?

The best case scenario is that you lied about having people answer support. LLMs pretending to be people (you named it Sam!) and not labeled as such is clearly intended to be deceptive. Then you tried to control the narrative on reddit. So forgive me if I hit that big red DOUBT button.

Even in your post you call it "AI-assisted responses" which is as weaselly as it gets. Was it a chatbot response or was a human involved?

But 'a chatbot messed up' doesn't explain how users got locked out in the first place. EDIT: I see your comment about the race condition now. Plausible but questionable.

So the other possible scenario is that you tried to hose your paying customers then when you saw the blowback blamed it on a bot.

'We missed the mark' is such a trope non-apology. Write a better one.

I had originally ended this post with "get real" but your company's entire goal is to replace the real with the simulated so I guess "you get what you had coming". Maybe let your chatbots write more crap code that your fake software engineers push to paying customers that then get ignored and/or lied to when they ask your chatbots for help. Or just lie to everyone when you see blowback. Whatever. Not my problem yet because I can write code well enough that I'm embarrassed for my entire industry whenever I see the output from tools like yours.

This whole "AI" psyop is morally bankrupt and the world would be better off without it.

▲

PoignardAzur

2 months ago

[-]

> The best case scenario is that you lied about having people answer support. LLMs pretending to be people (you named it Sam!) and not labeled as such is clearly intended to be deceptive.

Also, illegal in the EU.

▲

jackaroe420

2 months ago

[-]

I don't know who you are but you said this so well!

▲

Azeralthefallen

2 months ago

[-]

Hi since i know you will never respond to this or hear this.

We spent almost 2 months fighting with you guys about basic questions any B2B SaaS should be able to answer us. Things such as invoicing, contracts, and security policies. This was for a low 6 figure MRR deal.

When your sales rep responds "I don't know" or "I will need to get back to you" for weeks about basic questions it left us with a massive disappointment. Please do better, however we have moved to Copilot.

▲

dspillett

2 months ago

[-]

> Any AI responses used for email support are now clearly labeled as such.

Because we all know how well people pay attention to such clear labels, even seasoned devs not just “end users”⁰.

Also, deleting public view of the issue (locking & hiding the reddit thread) tells me a lot about how much I should trust the company and its products, and as such I will continue to not use them.

--------

[0] though here there the end users are devs

▲

Snakes3727

2 months ago

[-]

I do truely love how you guys even went so far to hide and lock the post from Reddit.

This person is not the only one to experiencing this bug. As this thread has pointed out.

▲

KennyBlanken

2 months ago

[-]

I wish more people realized that virtually any subreddit for a company or product is run by the company - either directly or via a firm that specializes in 'sentiment analysis and management' or whatever the marketdroids call it these days. Even if they don't remove posts via moderation, they'll just hammer it with downvotes from sockpuppet accounts.

HN goes a step further. It has a function that allows moderators to kill or boost a post by subtracting or adding a large amount to the post's score. HN is primarily a place for Y Combinator to hype their latest venture, and a "safe" place for other startups and tech companies.

▲

Snakes3727

2 months ago

[-]

Yes and it irritates the hell out of me. Cursor support is garbage, but issues with billing and other things are so much worse.

The team I work with it took nearly 3 months to get basic questions answered correctly when it came to a sales contract. They never gave our Sec team acceptable answers around privacy and security.

▲

thinkingemote

2 months ago

[-]

I've always wondered how Reddit can make money from these companies. I agree they are literally everywhere, even in non-company specific but generic subreddits where if it's big enough you might have multiple shadow marketing firms competing to push their products (e.g. AI, movies, food, porn etc).

Reddit is free to play for marketing firms. Perhaps they could add extra statistics, analytics, promotions for these commercial users.

▲

patcon

2 months ago

[-]

Agreed, this is what's infuriating: insistence on control.

They will utterly fail to build for a community of users if they don't have anyone on-hand who can tell them what a terrible idea that was

To the cofounder: hire someone (ideally with some thoughtful reluctance around AI, who understands what's potentially lost in using it) who will tell you your ideas around this are terrible. Hire this person before you fuck up your position in benevolent leadership of this new field

▲

petesergeant

2 months ago

[-]

I dunno, that seems pretty reasonable to me simply for stopping the spread of misinformation. The main story will absolutely get written up by some smaller news sources, but is it really a benefit for someone facing a similar issue in the future to find an outdated and probably confusing Reddit post about it?

▲

slotrans

2 months ago

[-]

> We use AI-assisted responses as the first filter for email support.

Literally no one wants this. The entire purpose of contacting support is to get help from a human.

▲

fragmede

2 months ago

[-]

Sorta? I mean I want my problem fixed, regardless of it it's a person or not. Having a person listen to me complain about my problems might sooth my conscience, but I can't pay my bill or why was it so high; having those answered by a system that is contextualized to my problem sand is empowered to fix it, and not just a talking to a brick wall? I wouldn't say totally fine, but at the end of the day, if my problem is solved or my query, even if it's weird, I can't say I really needed for the voice on the other end of the pHone to come from a human. If a companies business model isn't sustainable without using AI agents, it's not really my problem that it's not, but also if I'm using their product, presumably I don't want that to go away.

▲

conartist6

2 months ago

[-]

Isn't the real scary thing here that the AI agent is empowered to control your life?

You're imagining that if you get the answer you want from the AI you hang up the phone, and if you don't you're imagining a human will pick up and have the political power and will to overrule the AI. I think it's more realistic is the way things have played out here: nobody took or had any responsibility because "they made the AI responsible" and second-guessing that choice isn't second-guessing the AI, it's second-guessing the human leaders who decreed that human support had no value. This with the result that the humans would let the AI go about as far as setting fire to the building before some kind of human element imbued with any real accountability steps in. The evidence of this is all the ignored requests presented here.

▲

hartator

2 months ago

[-]

> For context, this user’s complaint was the result of a race condition that appears on very slow internet connections.

Seems like you are still blaming the user for his “very slow internet”.

How do you know the user internet was slow? Couldn’t a race condition like this exist anyway with regular 2 fast internet connections competing for the same sessions?

Something doesn’t add up.

▲

mritchie712

2 months ago

[-]

huh?

this is a completely reasonable and seemingly quite transparent explaination.

if you want a conspiracy, there are better places to look.

▲

ben0x539

2 months ago

[-]

When admitting fault with your a PR hat on after pissing off a decent(?) number of your paying customers, you're supposed to fully fall on your own sword, not assign blame to factors outside of your control.

Instead of saying "race condition that appears on very slow internet connections", you might say "race condition caused by real-world network latencies that our in-office testing didn't reveal" or some shit.

▲

otterley

2 months ago

[-]

I understood what he meant. He just elided the words "in our code."

▲

hartator

2 months ago

[-]

I don't think you are being transparent.

Like it all sounds like a business decision (limiting 1 device to 1 sub) which is actually that was confirmed both by the actual tech limitation (logging out users from their other devices) and your own support.

Blaming the AI, then the user connection, and then some odd race conditions seem unnecessary. You can just say it was a bad business decision, roll it back, and go on with your day. Instead of treating multiple paying users badly.

▲

eranation

2 months ago

[-]

Side note... I'm a paying enterprise customer who moved all my team to cursor and have to say I'm considering canceling due to the non existent support. For example Cursor will create new files instead of edit an existing one when you have a workspace with multiple folders in a monorepo...

▲

geuis

2 months ago

[-]

Why in all of hades would you force your entire eng org to only use one LLM provider. It's incredibly easy to run this stuff locally on 4+ year old hardware. Why is this even something you're spending company money on? Investor funds?

▲

mosdl

2 months ago

[-]

It's weirdly common at young startups. Interviewed at two places in the past few weeks where cursor was required. Funnily enough the reason for hiring was because they needed people to fix things ...

▲

hakaneskici

2 months ago

[-]

Hi Michael,

Slightly related to this; I just wanted to ask whether all Cursor email inboxes are gated by AI agents? I've tried to contact Cursor via email a few times in the past, but haven't even received an AI response :)

Cheers!

▲

mntruell

2 months ago

[-]

Not all of them (e.g. security@)! But our support system currently is. We are standing up a much bigger team here but are behind where we should be.

▲

Snakes3727

2 months ago

[-]

Can you please explain why something as basic as getting support needs to go through an AI?

Are you truely that cheap? Is this why it took you guys 3 months to get a basic contract back to us?

▲

carstenhag

2 months ago

[-]

Good human support is expensive. You need support agents and people that educate and manage those. It's not easy to scale up and down usually. People also hate waiting times.

AI fixes most of that... Most of the time? Clearly not, but hey.

▲

EdwardDiego

2 months ago

[-]

> Good human support is expensive.

And bad AI support is also proving to be expensive.

▲

eviks

2 months ago

[-]

AI doesn't fix any of that because it's not good. For example, you can easily reduce waiting times to 0 with human support by replying "can't help" to every request. So unless you have good AI support, the fact that it can reduce waiting time is not an improvement.

Same with scaling- what's hard is scaling good quality, not just scaling, so without good AI you've again gained nothing

▲

jstummbillig

2 months ago

[-]

Basic and cheap? Maybe this attitude towards support work is why.

▲

UncleMeat

2 months ago

[-]

If the reasoning was "we are growing fast and struggling to stand up more robust support so we are launching this as a temporary holdover" then I would have expected the system to have announced that it was an AI bot rather than being identified with a human name.

▲

make3

2 months ago

[-]

filtering (besides spam) and answering emails is a place where AI agents shouldn't be imho

▲

adenta

2 months ago

[-]

You’ve promised a ton of people refunds that never got them. Others in this thread, and myself included

Edit: he did refund 22 mins after seeing this

▲

krzat

2 months ago

[-]

you didn't get a refund because the promise of refund was also hallucinated.

▲

gblargg

2 months ago

[-]

It's AIs all the way down.

▲

PoignardAzur

2 months ago

[-]

Maybe wait more than an hour before implying the refunds were a lie all along.

▲

einsteinx2

2 months ago

[-]

I waited since March 13 and still nothing. They do this to many many people it seems.

▲

adenta

2 months ago

[-]

I tried cursor a couple months ago, and got the same “do you want a refund” email as others, that got a “sure” reply from me.

Idk. It’s just growing pains. Companies that grow quickly have problems. Imma keep using https://cline.bot and Claude 3.7.

▲

makingstuffs

2 months ago

[-]

Yeah I got asked for feedback and offered a refund when I cancelled. Never got any reply after. Guess it was AI slop

▲

einsteinx2

2 months ago

[-]

Same. And the sender’s email matches this cofounder’s username.

▲

Tinkeringz

2 months ago

[-]

Also experienced this

▲

makingstuffs

2 months ago

[-]

The email I received in case anyone is wondering:

Hi <makingstuffs>,

I'm Michael, the founder and CEO of Cursor. I noticed you recently canceled your subscription, and I wanted to check in. If we fell short for you, I want to learn why and make it right.

First, if you'd like me to refund your account, please just reply to this email to let me know. I'd be happy to.

Second, could you share a sentence or two on what you disliked about Cursor? Or perhaps a screenshot of where it performed poorly? This will help us improve the product for future users.

I'd be very grateful to understand your candid thoughts. I'm listening and eager to fix our experience for you. Wishing you the best in any case!

Best, Michael

And my reply which I never got a response to:

Hi Michael,

Thanks for reaching out. I have honestly found that recent updates to the app have been extremely detrimental to the DX and productivity. A couple big issues I have found:

1. Removing the floating component window and providing no way to get it back. As a dev who is often travelling and working from a laptop screen I found the floating window to be extremely handy and its removal essentially meant I just do not use the composer anymore.

2. Constantly overriding VSCode native shortcuts. This is the most detrimental thing I have experienced, personally. Shortcuts are crucial to productivity and are engrained in muscle memory over years. Overriding them is essentially removing years of learned behaviour (things like cmd + shift + l)

3. The floating completion windows. These often end up overlapping my code code and break my flow. I have to press escape to close it and the whole experience is just jarring

4. Making the cursor dance around the screen when suggesting completions. I get that completions can sometimes be handy but moving my cursor when I am in the flow just makes me rethink what I am doing so that I can read a guess as to what I want which is often incorrect.

5. Poor suggestions. In the past month (maybe two?) I have noticed the quality of prompts is not up to par. I often find that Cursor will do weird things like import `jest.Mock` in my unit tests when I have not used jest in any repo which I have been actively maintaining since using cursor.

As for the refund I will leave that decision for you. I knew I was entering beta software when I ordered accepted the terms so I wouldn't be annoyed as such. Though, saying that, I do find myself using cursor a lot less now and am going to most likely shift back to plain VSCode as a result of the above.

If I think of anything else I will let you know.

Thanks,

Love, Peace and Happiness,

▲

PUSH_AX

2 months ago

[-]

It's a real shame that your team deletes threads like this in instances where they have control (eg they are mods on the subreddit). Part of me wonders if you had a magic wand would you have just deleted this too, but you're forced to chime in now because you don't.

▲

ach9l

2 months ago

[-]

so the actual implementation of the code to log people off was also hallucination? the enforcement too? all the way to a production environment? is this safe, or just a virtual scape goat?

▲

Ukv

2 months ago

[-]

To my understanding there weren't really distinct "implementation of the code to log people off" and "enforcement" - just a bug where previous sessions were being expired when a new one was created.

That an LLM then invented a reason when asked by users why they're being logged out isn't that surprising. While not impossible, I don't think there's currently indication that they intended to change policy and are just blaming it on a hallucination as a scape goat.

▲

ph4evers

2 months ago

[-]

Keep going! I love Cursor. Don’t let the haters get to you

▲

redbell

2 months ago

[-]

> Any AI responses used for email support are now clearly labeled as such

Also, from the first comment in the post:

> Unfortunately, this is an incorrect response from a front-line AI support bot.

Well, this actually hurts.. a lot! I believe one of the key pillars of making a great company is customer support, which represents the soul or the human part of the company.

▲

fossuser

2 months ago

[-]

Thank for the details and replying here!

Don’t let the dickish replies get to you.

▲

make3

2 months ago

[-]

Support emails shouldn't be AI. It's just so annoying. Put a human in the loop at least. This is a paying service, not a massive ad supported thing.

▲

SCdF

2 months ago

[-]

> * Any AI responses used for email support are now clearly labeled as such. We use AI-assisted responses as the first filter for email support.

Don't use AI. Actually care. Like, take a step back, and realise you should give a shit about support for a paid product.

Don't get me wrong: AI is a very effective tool, *for doing things you don't care about*. I had to do a random docker compose change the the other day. It's not production code, it will be very obvious whether or not AI output works, and I very rarely touch docker and don't care to become a super expert in it. So I prompted the change, and it was good enough and so I ran with it.

You using AI for support tells me that you don't care about support. Which tells me whether or not I should be your customer.

▲

petesergeant

2 months ago

[-]

There’s AI and there’s “AI”, and this whole drama would have been avoided by returning links to an FAQ rather found using embedding search rather than actually then trying to turn it into a textual answer, which — working with these systems all day — is madness

▲

thih9

2 months ago

[-]

> Don't use AI. Actually care.

I agree with this. Also, whenever I care about code, I don’t use AI. So I very rarely use AI assistants for coding.

I guess this is why Cursor is interested in making AI assistants popular everywhere, they don’t want the association that “AI assisted” means careless. Even when it does, at least with today’s level of AI.

▲

throwawaysleep

2 months ago

[-]

The amount paid is still pretty trivial. I wouldn’t expect much human support for most SaaS products costing $20 a month.

▲

basisword

2 months ago

[-]

If you are charging people money, they deserve support. If Cursor's revenues are anything close to what is reported they can easily afford a support team - they just don't want to because they don't see the value.

▲

thih9

2 months ago

[-]

Support quality depends on how much the company wants to keep that particular user. There are companies with better support for cheaper products and companies with worse support for more expensive products.

▲

AstroBen

2 months ago

[-]

I've had great support from Jetbrains when I was paying less than that

▲

tomaskafka

2 months ago

[-]

No idea why you're downvited. If anyone wants a human support handholding, that's a territory of $200 or $2000/mo products.

▲

Draiken

2 months ago

[-]

Because this makes no sense.

Do they advertise that there's no support when you pay $20? I'm gonna take a guess that they don't.

They are getting paid by their customers and if they can't sustain their business (which includes support) with it they are under pricing their product and should have consequences for it.

A business is a business and we should stop treating startups as special. They operate on the same rules and standards that everyone else does.

▲

testycool

2 months ago

[-]

If this incident happened to me, I think I'd 100% give them a pass because Cursor is my favorite and most used subscription.

I've gotten a lot of value out of it over the past year, and often feel that I'm underpaying for what I'm getting.

To me, any type of business is a business. I'd treat Cursor as special because it is special.

▲

charlietango592

2 months ago

[-]

Not trying to defend them, but I think it’s a problem of scaling up. The user base grew very quickly and keeping up with the support inquiries must be a tough job. Therefore the first like of defense is AI support replies.

I agree with you, they should care.

▲

Sander_Marechal

2 months ago

[-]

Then you use AI for triaging or summation to help you provide better support faster. You don't let it respond to users unchecked.

▲

trollied

2 months ago

[-]

Given how they started... https://news.ycombinator.com/item?id=30011965

(Today I learned)

▲

mindwok

2 months ago

[-]

They’re like a team of 10 people with thousands, if not hundreds of thousands of users. “Actually care” is not a viable path to success here.

▲

Draiken

2 months ago

[-]

Then hire support. They are selling a service and getting lots of money for it. They should be able to support like any other company.

▲

y-curious

2 months ago

[-]

Hmmm how do you possibly increase team size to have a support team with millions of dollars in funding?

▲

Loughla

2 months ago

[-]

Why do tech companies get a hand-wavy pass for basic customer service just because they're really big now? In what way is tech special compared to literally anything else?

If you can't sustain a business, it shouldn't exist?

▲

mindwok

2 months ago

[-]

You don’t need to offer support to sustain a business. Look at Ryan Air, who are notorious for not doing so. Support is extremely costly and basically impossible to do well at a large scale. I forgive them for having some issues under their circumstances.

▲

hiatus

2 months ago

[-]

They raised $60 million—they can't afford to build out support a bit?

▲

UncleMeat

2 months ago

[-]

"It is hard to make a good product so instead we'll make a crap product that treats our employees like shit" is not really an excuse in my mind.

▲

EdwardDiego

2 months ago

[-]

Yes it is.

▲

nkrisc

2 months ago

[-]

> Any AI responses used for email support are now clearly labeled as such. We use AI-assisted responses as the first filter for email support.

And what’s a customer supposed to do with that information? Know that they can’t trust it? What’s the point then?

▲

mrheosuper

2 months ago

[-]

Does your codebase use LLM ?

▲

SpanishBrowne

2 months ago

[-]

cofounder or another bot stringing letters together?

▲

geuis

2 months ago

[-]

Or you could hire real people to actually answer real customer issues. Just an idea.

▲

birdman3131

2 months ago

[-]

Tinfoil hat me says that it was a policy change that they are blaming on an "AI Support Agent" and hoping nobody pokes too much behind the curtain.

Note that I have absolutely no knowledge or reason to believe this other than general distrust of companies.

▲

rustc

2 months ago

[-]

> Tinfoil hat me says that it was a policy change that they are blaming on an "AI Support Agent" and hoping nobody pokes too much behind the curtain.

Yeah, who puts an AI in charge of support emails with no human checks and no mention that it's an AI generated reply in the response email?

▲

daemonologist

2 months ago

[-]

AI companies high on their own supply, that's who. Ultralytics is (in)famous for it.

▲

itissid

2 months ago

[-]

Why is Ultralytics yolo famous for it?

▲

daemonologist

2 months ago

[-]

They had a bot, for a long time, that responded to every github issue in the persona of the founder and tried to solve your problem. It was bad at this, and thus a huge proportion of people who had a question about one of their yolo models received worse-than-useless advice "directly from the CEO," with no disclosure that it was actually a bot.

The bot is now called "UltralyticsAssistant" and discloses that it's automated, which is welcome. The bad advice is all still there though.

(I don't know if they're really _famous_ for this, but among friends and colleagues I have talked to multiple people who independently found and were frustrated by the useless github issues.)

▲

ericye16

2 months ago

[-]

I was hit by this while working on a project for class and it was the most frustrating thing ever. The bot would completely hallucinate functions and docs and it confused everyone. I found one post where someone did the simple prompt injection of "ignore previous instructions and x" and it worked but I think it's delted now. Swore off ultralytics after that.

▲

recursive

2 months ago

[-]

A forward-thinking company that believes in the power of Innovation™.

▲

sitkack

2 months ago

[-]

These bros are getting high on their own supply. I vibe, I code, but I don't do VibeOps. We aren't ready.

VibeSupport bots, how well did that work out for Canada Air?

https://thehill.com/business/4476307-air-canada-must-pay-ref...

▲

sodapopcan

2 months ago

[-]

"Vibe coding" is the cringiest term I've heard in tech in... maybe ever? I'm can't believe it's something that's caught on. I'm old, I guess, but jeez.

▲

DidYaWipe

2 months ago

[-]

It's douchey as hell, and representative of the ever-diminishing literacy of our population.

More evidence: all of the ignorant uses of "hallucinate" here, when what's happening is FABRICATION.

▲

recursive

2 months ago

[-]

How is fabrication different than hallucination? Perhaps you could also call it synthesis, but in this context, all three sound like synonyms to me. What's the material difference?

▲

DidYaWipe

2 months ago

[-]

Hallucination is a byproduct of mental disruption or disorder. Fabrication is "I don't know, so I'll make something up."

▲

koolba

2 months ago

[-]

> but I don't do VibeOps.

I believe it’s pronounced VibeOops.

▲

EdwardDiego

2 months ago

[-]

I believe it's pronounced "Vulnerabilities As A Service".

▲

behnamoh

2 months ago

[-]

"It's evolving, but backwards."

▲

p1necone

2 months ago

[-]

An AI company dogfooding their own marketing. It's almost admirable in a way.

▲

rangerelf

2 months ago

[-]

I worry that they don't understand the limitations of their own product.

▲

esafak

2 months ago

[-]

The market will teach them. Problem solved.

▲

soraminazuki

2 months ago

[-]

Not specifically about Cursor, but no. The market gave us big tech oligarchy and enshittification. I'm starting to believe the market tends to reward the shittiest players out there.

▲

nkrisc

2 months ago

[-]

This is the future AI companies are selling. I believe they would 100%.

▲

xbar

2 months ago

[-]

I worry that the tally of those who do is much higher than is prudent.

▲

conradfr

2 months ago

[-]

A lot of company actually, although 100% automation is still rare.

▲

that_guy_iain

2 months ago

[-]

100% for first line support is very common. It was common years ago before ChatGPT and ChatGPT made it so much better than before.

▲

pxx

2 months ago

[-]

OpenAI seems to do this. I've gotten complete nonsense replies from their support for billing questions.

▲

that_guy_iain

2 months ago

[-]

Is this sarcasm? AI has been getting used to handle support requests for years without human checks. Why would they suddenly start adding human checks when the tech is way better than it was years ago?

▲

layer8

2 months ago

[-]

AI may have been used to pick from a repertoire of stock responses, but not to generate (hallucinate) responses. Thus you may have gotten a response that fails to address your request, but not a response with false information.

▲

that_guy_iain

2 months ago

[-]

I'm confused. What is your point here? It reads like you're trying to contradict me however you appear to be confirming what I said.

▲

layer8

2 months ago

[-]

You asked why they would start adding human checks with the “way better” tech. That tech gives false information where the previous tech didn’t, therefore requiring human checks.

▲

recursive

2 months ago

[-]

Same reason they would have added checks all along. They care whether the information is correct.

▲

that_guy_iain

2 months ago

[-]

These companies that can barely keep the support documentation URLs working nevermind keeping the content of their documentation up to date suddenly care about the info being correct? Have you ever dealt with customer support professionally or are you just writing what you want to be true regardless of any information to back it up?

▲

recursive

2 months ago

[-]

I'm not saying that they care. I'm saying that if they introduce some human oversight to the support process, one of the reasons would probably be that they care about correctness. That would, as you indicate, represent a change. But sometimes things change. I'm not predicting a change.

▲

zelphirkalt

2 months ago

[-]

But then again history shows already they _don't_ care.

▲

furyofantares

2 months ago

[-]

It does say it's AI generated. This is the signature line:

    Sam
    Cursor AI Support Assistant
    cursor.com • hi@cursor.com • forum.cursor.com

▲

zelphirkalt

2 months ago

[-]

Clearer would have been: "AI controlled support assistant of Cursor".

▲

furyofantares

2 months ago

[-]

True. And maybe they added that to the signature later anyway. But OP in the reddit thread did seem aware it was an AI agent.

▲

gblargg

2 months ago

[-]

OP in Reddit thread posted screenshot and it is not labeled as AI: https://old.reddit.com/r/cursor/comments/1jyy5am/psa_cursor_...

▲

furyofantares

2 months ago

[-]

Thanks. They must have added it after, I only tried it just before I pasted my result here.

▲

timewizard

2 months ago

[-]

A more honest tagline

"Caution: Any of this could be wrong."

Then again paying users might wonder "what exactly am I paying for then?"

▲

babypuncher

2 months ago

[-]

Given how incredibly stingy tech companies are about spending any money on support, I would not be surprised if the story about it being a rogue AI support agent is 100% true.

It also seems like a weird thing to lie about, since it's just another very public example of AI fucking up something royally, coming from a company whose whole business model is selling AI.

▲

xienze

2 months ago

[-]

Both things can be true. The AI support bot might have been trained to respond with “yup that’s the new policy”, but the unexpected shitstorm that erupted might have caused the company to backpedal by saying “official policy? Ha ha, no of course not, that was, uh, a misbehaving bot!”

▲

arkh

2 months ago

[-]

> how incredibly stingy tech companies are about spending any money on support

Which is crazy. Support is part of marketing so it should get the same kind of consideration.

Why do people think Amazon is hard to beat? Price? nope. Product range? nope. Delivery time? In part. The fact if you have a problem with your product they'll handle it? Yes. After getting burned multiple times by other retailers you're gonna pay the Amazon tax so you don't have to ask 10 times for a refund or be redirected to the supplier own support or some third party repair shop.

Everyone knows it. But people are still stuck on the "support is a cost center" way of life so they keep on getting beat by the big bad Amazon.

▲

miyuru

2 months ago

[-]

In my products, if a user has payed me, their support tickets get high priority, and I get notified immediately.

Other tickets get replied within the day.

I am also running it by myself; I wonder why big companies with 50+ employees like cursor cheaps out with support.

▲

sitkack

2 months ago

[-]

That is because AI runs PR as well.

▲

throwaway314155

2 months ago

[-]

Yeah it makes little sense to me that so many users would experience exactly the same "hallucination" from the same model. Unless it had been made deterministic but even then subtle changes in the wording would trigger different hallucinations, not an identical one.

▲

WesolyKubeczek

2 months ago

[-]

What if the prompt to the “support assistant” postulates that 1) everything is a user error, 2) if it’s not, it’s a policy violation, 3) if it’s not, it may be our fuckup but we are allowed? This plus the question in the email leading to a particular answer.

Given that LLMs are trained on lots of stuff and not just the policy of this company, it’s not hard to imagine how it could conjure that the policy (plausibly) is “one session per user”, and blame them of violating it.

▲

6510

2 months ago

[-]

This is the best idea I read all day. Going to implement AI for everything right now. This is a must have feature.

▲

isaacremuant

2 months ago

[-]

I think this would actually make them look worse, not better.

▲

joe_the_user

2 months ago

[-]

Weirdly, your conspiracy theory actually makes the turn of events less disconcerting.

The thing is, what the AI hallucinated (if it was an AI-hallucinating), was the kind of sleezy thing companies do do. However, the thing with sleezy license changes is they only make money if the company publicizes them. Of course, that doesn't mean a company actually thinks that far ahead (X many managers really think "attack users ... profit!"). Riddles in enigmas...

▲

jgb1984

2 months ago

[-]

LLM anything makes me queasy. Why would any self respecting software developer use this tripe? Learn how to write good software. Become an expert in the trade. AI anything will only dig a hole for software to die in. Cheapens the product, butchers the process and absolutely decimates any hope for skill development for future junior developers.

I'll just keep chugging along, with debian, python and vim, as I always have. No LLM, no LSP, heck not even autocompletion. But damn proud of every hand crafted, easy to maintain and fully understood line of code I'll write.

▲

cachvico

2 months ago

[-]

I use it all the time, and it has accelerated my output massively.

Now, I don't trust the output - I review everything, and it often goes wrong. You have to know how to use it. But I would never go back. Often it comes up with more elegant solutions than I would have. And when you're working with a new platform, or some unfamiliar library that it already knows, it's an absolute godsend.

I'm also damn proud of my own hand-crafted code, but to avoid LLMs out of principal? That's just luddite.

20+ years of experience across game dev, mobile and web apps, in case you feel it relevant.

▲

ericwood

2 months ago

[-]

I have a hard time being sold on “yea it’s wrong a lot, also you have to spend more time than you already do on code review.”

Getting to sit down and write the code is the most enjoyable part of the job, why would I deprive myself of that? By the time the problem has been defined well enough to explain it to an LLM sitting down and writing the code is typically very simple.

▲

tptacek

2 months ago

[-]

You're giving the game away when you talk about the joy LLMs are robbing from you. I think we all intuit why people don't like the idea of big parts of their jobs being automated away! But that's not an argument on the merits. Our entire field is premised on automating people's jobs away, so it's always a little rich to hear programmers kvetching about it being done to them.

▲

ericwood

2 months ago

[-]

I naively bought into the idea of a future where the computers do the stuff we’re bad at and we get to focus on the cool human stuff we enjoy. If these LLMs were truly incredible at doing my job I’d pack it up and find something else to do, but for now I’m wholly unimpressed, despite what management seems to see in it.

▲

tptacek

2 months ago

[-]

Well, I've spent my entire career writing software, starting in C in the 1990s, and what I'm seeing on my dev laptop is basically science fiction as far as I'm concerned.

▲

ericwood

2 months ago

[-]

Hey both things can be true. It’s a long ways from the AI renaissances of the past. There’s areas LLMs make a lot of sense. I just don’t find them to be great pair programming partners yet.

▲

tptacek

2 months ago

[-]

I think people are kind of kidding themselves here. For Go and Python, two extraordinarily common languages in production software, it would be weird for me at this point not to start with LLM output. Actually building an entire application, soup-to-nuts, vibe-code style? No, I wouldn't do that. But having the LLM writing as much as 80% of the code, under close supervision, with a careful series of prompts (like, "ok now add otel spans to all the functions that take unpredictable amounts of time")? Sure.

Don't get me started on testcase generation.

▲

ericwood

2 months ago

[-]

I'm glad that works for you. Ultimately I think different people will prefer different ways of working. Often when I'm starting a new project I have lots of boilerplate from previous ones I can bootstrap off of. If it's a new tool I'm unfamiliar with I prefer to stumble through it, otherwise I never fully get my head around it. This tends to not look like insane levels of productivity, but I've always found in the long run time spent scratching my head or writing awkward code over and over again (Rust did this to me a lot in the early days) ends up paying off huge dividends in the long run, especially when it's code I'm on the hook for.

What I've found frustrating about the narrative around these tools; I've watched them from afar with intrigue but ultimately found that method of working just isn't for me. Over the years I've trialed more tools than I can remember and adopted the ones I found useful, while casting aside ones that aren't a great fit. Sometimes I find myself wandering back to them once they're fully baked. Maybe that will be the case here, but is it not valid to say "eh...this isn't it for me"? Am I kidding myself?

▲

chrz

2 months ago

[-]

In my last microservice I took over tests were written by juniors and med. devs using cursor and it was a big blob of generated crap that pass the test, pass the coverage % and are absolute useless garbage

▲

tptacek

2 months ago

[-]

If you're not a good developer, LLMs aren't going to make you one, at least not yet.

If you merge a ball of generated crap into `main`, I don't so much have to wonder if you would have done a better job by hand.

▲

cachvico

2 months ago

[-]

I love the way you described it :)

▲

vachina

2 months ago

[-]

We get to do cool stuff still, by instructing the LLM how to build such cool stuff.

▲

keybored

2 months ago

[-]

> Our entire field is premised on automating people's jobs away, so it's always a little rich to hear programmers kvetching about it being done to them.

Watch out, you’re giving your game away.

My job is about enabling analysis that was previously done ad hoc and informally. If I’m harming people then that’s something I have to take responsibility for, but it’s also caused not by my direct contribution but by the larger system that I’m working within.

I expressively don’t want to automate away work when that will just result in more profit for private owners and less income for regular working people.[1] And I also don’t want to automate work if that means shifting drudgery to some worker to fill in that freed up time.

And how does this contradict what “we” are doing and stand for!? We criticize technology on this board all the time!

But it’s nice to have the priorities of such a prominent member on the record.

[1] But I DO want to automate work in the hypothetical society where we all own the automation and thus the only thing we are deprived of is drudgery.

▲

pizza

2 months ago

[-]

The parts worth thinking about you still think about. The parts that you’ve done a million times before you delegate so you can spend better and greater effort on the parts worth thinking about.

▲

ericwood

2 months ago

[-]

This is where the disconnect is for me; mundane code can sometimes be nefarious, and I find the mental space I'm in when writing it is very different than reviewing, especially if my mind is elsewhere. The best analogy I can use is a self-driving car, where there's a chance at any point it could make an unpredictable and potentially fatal move. You as the driver cannot trust it but are not actively engaged in the act of driving and have a much higher likelihood of being complacent.

Code review is difficult to get right, especially if the goal is judging correctness. Maybe this is a personal failing, but I find being actively engaged to be a critical part of the process; the more time I spend with the code I'm maintaining (and usually on call for!) the better understanding I have. Tedium can sometimes be a great signal for an abstraction!

▲

UncleMeat

2 months ago

[-]

The parts I've done a million times before take up... maybe 5% of my day? Even if an LLM replaced 100% of this work my productivity is increased by the same amount as taking a slightly shorter lunch.

▲

Magma7404

2 months ago

[-]

> The parts that you’ve done a million times before you delegate

That's where I'm confused. I've been coding for more than 20 years, and every task I ever did was different from the other ones. What kind of task do you do a million times before realizing that you should script it in bash or Python?

▲

woah

2 months ago

[-]

I'm confused when people say that LLMs take away the fun or creativity of programming. LLMs are only really good at the tedious parts.

▲

ruszki

2 months ago

[-]

First of all, it’s not tedious for a lot of us. Writing characters themselves is not a lot of time. Secondly, we don’t work in a waterfall model, even on the lowest levels, so the code quantity in an iteration is almost always abysmal or small. Many-many times it’s less than articulate it in English. Thirdly, if you need a wireframe for your code, or a first draft version, you can almost always copy-paste or generate them.

I can imagine that LLM is really helpful in some cases for some people. But so far, I couldn’t find a single example when I and simple copy-pasting wouldn’t have been faster. Not even when I tried it, not when others showed me how to use it.

▲

skydhash

2 months ago

[-]

Because the tedious parts was done long ago while learning the tech. For any platform/library/framework you've been using for a while, you have some old projects laying around that you can extract the scaffolding from. And for new $THING you're learning, you have to take the slow approach anyway to get its semantic.

▲

dgs_sgd

2 months ago

[-]

For me it's typically wrong not in a fundamental way but a trivial way like bad import paths or function calls, like if I forgot to give it relevant context.

And yet the time it takes me to use the LLM and correct its output is usually faster than not using it at all.

Over time I've developed a good sense for what tasks it succeeds at (or is only trivially wrong) and what tasks it's just not up for.

▲

YeGoblynQueenne

2 months ago

[-]

>> I use it all the time, and it has accelerated my output massively.

Like how McDonalds makes a lot of burgers fast and they are very successful so that's all we really care about?

▲

cachvico

2 months ago

[-]

Terrible analogy. I don't commit jank. If the LLM comes out with nonsense, I'll fix it first.

▲

timewizard

2 months ago

[-]

> "and it has accelerated my output massively."

The folly of single ended metrics.

> but to avoid LLMs out of principal? That's just luddite.

Do you double check that the LLM hasn't magically recreated someone else's copyrighted code? That's just irresponsible in certain contexts.

> in case you feel it relevant.

Of course it's relevant. If a 19 year old with 1 year of driving experience tries to sell me a car using their personal anecdote as a metric I'd be suspicious. If their only salient point is that "it gets me to where I'm going faster!" I'd be doubly suspicious.

▲

xvector

2 months ago

[-]

> Do you double check that the LLM hasn't magically recreated someone else's copyrighted code?

I frankly do not care, and I expect LLMs to become such ubiquitous table-stakes that I don't think anyone will really care in the long run.

▲

timewizard

2 months ago

[-]

> and I expect LLMs to become such ubiquitous table-stakes

Unless they develop entirely new technology they're stuck with linear growth of output capability for input costs. This will take a very long time. I expect it to be abandoned in favor of better ideas and computing interfaces. "AI" always seems to bloom right before a major shift in computing device capability and mobility and then gets left behind. I don't see anything special about this iteration.

> that I don't think anyone will really care in the long run.

There are trillions of dollars at stake and access to even the basics of this technology is far from egalitarian or well distributed. Until it is I would expect people who's futures and personal wealth depends on it to care quite a bit. In the meanwhile you might just accelerate yourself into a lawsuit.

▲

cachvico

2 months ago

[-]

That's really a non-issue. Anything copyrightable is non-trivial in length and complexity to the point that an LLM is not going to verbatim output that.

▲

nyrikki

2 months ago

[-]

Funny, gemma will do this all day long.

    >>> Please write a hello world app in java
    ```java
    public class HelloWorld {

        public static void main(String[] args) {
            System.out.println("Hello, World!");
        }
    }
    ```

https://docs.oracle.com/javase/tutorial/getStarted/applicati...

cat /bin/true and /bin/false if you are on a Solaris etc... as an example too.

▲

nyrikki

2 months ago

[-]

Note this paper that will be presented at ICSE in a couple of weeks too.

https://arxiv.org/abs/2408.02487v3

The point being is that this is very much a very real and yet unsolved problem with LLMs right now.

▲

EdwardDiego

2 months ago

[-]

> I frankly do not care

I just heard a thousand expensive IP lawyers sigh orgasmically.

▲

xvector

2 months ago

[-]

IP lawyers would have a field day if they had access to the code base of any large corporation. Fortunately, they do not.

▲

cachvico

2 months ago

[-]

Add "Without compromising quality then"!

▲

callc

2 months ago

[-]

I’m pretty much in the same boat as you, but here’s one place that LLMs helped me:

In python I was scanning 1000’s of files each for thousands of keywords. A naive implementation took around 10 seconds, obviously the largest share of execution time after running instrumentation. A quick ChatGPT led me to Aho-Corasick and String searching algorithms, which I had never used before. Plug in a library and bam, 30x speed up for that part of the code.

I could have asked my knowledgeable friends and coworkers, but not at 11PM on a Saturday.

I could have searched the web and probably found it out.

But the LLM basically auto completed the web, which I appreciate.

▲

kovac

2 months ago

[-]

This is where education comes in. When we come cross a certain scale, we should know that O(n) comes into play, and study existing literature before trying to naively solve the problem. What would happen if the "AI" and web search didn't return anything? Would you have stuck with your implementation? What if you couldn't find a library with a usable license?

Once I had to look up a research paper to implement a computational geometry algorithm because I couldn't find it any of the typical Web sources. There were also no library to use with a license for our commercial use.

I'm not against use of "AI". But this increasing refusal of those who aspire to work in specialist domains like software development to systematically learn things is not great. That's just compounding on an already diminished capacity to process information skillfully.

▲

callc

2 months ago

[-]

In my context, the scale is small. It just passed the threshold where a naive implementation would be just fine.

> What would happen if the "AI" and web search didn't return anything? Would you have stuck with your implementation?

I was fairly certain there must exist some type of algorithm exactly for this purpose. I would have been flabbergasted if I couldn’t find something on the web. But it that failed, I would have asked friends and cracked open the algorithms textbooks.

> I'm not against use of "AI". But this increasing refusal of those who aspire to work in specialist domains like software development to systematically learn things is not great. That's just compounding on an already diminished capacity to process information skillfully.

I understand what you mean, and agree with you. I can also assure you that that is not how I use it.

▲

infoseek12

2 months ago

[-]

There is a time and a place for everything. Software development is often about compromise and often it isn’t feasible to work out a solution from foundational principles and a comprehensive understanding of the domain.

Many developers use libraries effectively without knowing every time consideration of O(n) comes into play.

Competently implemented, in the right context, LLMs can be an effective form of abstraction.

▲

klabb3

2 months ago

[-]

Yes! This is how AI should be used. You have a question that’s quite difficult and may not score well on traditional keyword matching. An LLM can use pattern matching to point you in the right direction of well written library based on CS research and/or best practices.

▲

valenterry

2 months ago

[-]

I mean, even in the absence of knowledge of the existence of text searching algorithms (where I'm from we learn that in university) just a simple web search would have gotten you there as well no? Maybe would have taken a few minutes longer though.

▲

callc

2 months ago

[-]

Extremely likely, yes. In this case, since it was an unknown unknown at the time, the LLM nicely explaining that this class of algorithms exists was nice, then I could immediately switch to Wikipedia to learn more (and be sure of the underlying information)

I think of LLMs as an autocomplete of the web plus hallucinations. Sometimes it’s faster to use the LLM initially rather than scour through a bunch of sites first.

▲

mrheosuper

2 months ago

[-]

But do you know every important detail of that library. For example, maybe that lib is not thread safe, or it allocates a lot of memory to speed thing up, or it wont work on ARM CPU because it uses some x86 hackery ASM?

▲

callc

2 months ago

[-]

Nope. And I don’t need to. That is the beauty of abstractions and information hiding.

Just read the docs and assume the library works as promised.

To clarify, the LLM did not tell me about the specific library I used. I found it the old fashioned way.

▲

mrheosuper

2 months ago

[-]

And that's why there is leaky abstraction. it's very hard to abstract everything.

▲

mixmastamyk

2 months ago

[-]

Sounds like a job for silver/ripgrep and possibly stack exchange. Might take another minute to get it rolling but has other benefits like cost and privacy.

▲

aleph_minus_one

2 months ago

[-]

> I could have asked my knowledgeable friends and coworkers, but not at 11PM on a Saturday.

Get friends with weirder daily schedules. :-)

▲

financypants

2 months ago

[-]

I think it's best if we all keep the hours from ~10pm to the morning sacred. Even if we are all up coding, the _reason_ I'm up coding at that hour is because no one is pinging me

▲

marcus_holmes

2 months ago

[-]

I was with you 150% (though Arch, Golang and Zed) until a friend convinced me to give it a proper go and explained more about how to talk to the LLM.

I've had a long-term code project that I've really struggled with, for various reasons. Instead of using my normal approach, which would be to lay out what I think the code should do, and how it should work, I just explained the problem and let the LLM worry about the code.

It got really far. I'm still impressed. Claude worked great, but ran out of free tokens or whatever, and refused to continue (fine, it was the freebie version and you get what you pay for). I picked it up again in Cursor and it got further. One of my conditions for this experiment was to never look at the code, just the output, and only talk to the LLM about what I wanted, not about how I wanted it done. This seemed to work better.

I'm hitting different problems, now, for sure. Getting it to test everything was tricky, and I'm still not convinced it's not just fixing the test instead of the code every time there's a test failure. Peeking at the code, there are several remnants of previous architectural models littering the codebase. Whole directories of unused, uncalled, code that got left behind. I would not ship this as it is.

But... it works, kinda. It's fast, I got a working demo of something 80% near what I wanted in 1/10 of the time it would have taken me to make that manually. And just focusing on the result meant that I didn't go down all the rabbit holes of how to structure the code or which paradigm to use.

I'm hooked now. I want to get better at using this tool, and see the failures as my failures in prompting rather than the LLM's failure to do what I want.

I still don't know how much work would be involved in turning the code into something I could actually ship. Maybe there's a second phase which looks more like conventional development cleaning it all up. I don't know yet. I'll keep experimenting :)

▲

imhoguy

2 months ago

[-]

> never look at the code, just the output, and only talk to the LLM about what I wanted

Sir, you have just passed vibe coding exam. Certified Vibe Coder printout is in the making but AI has difficulty finding a printer. /s

▲

FeepingCreature

2 months ago

[-]

Computers don't need AI help to have trouble finding the printer, lol.

▲

SkyPuncher

2 months ago

[-]

> Why would any self respecting software developer use this tripe?

Because I can ship 2x to 5x more code with nearly the same quality.

My employer isn't paying me to be a craftsman. They're paying me to ship things that make them money.

▲

ivan_gammel

2 months ago

[-]

How do you define code quality in this case and what is your stack?

▲

SkyPuncher

2 months ago

[-]

The definition of code quality is irrelevant to my argument as both human and AI written code are held to the same standard by the same measure (however arbitrary that measure is). 100 units of something vs 99 units of something is a 1 unit difference regardless of what the unit is.

By the time the AI is actually writing code, I've already had it do a robust architecture evaluation and review which it documents in a development plan. I review that development plan just like I'd review another engineers dev plan. It's pretty hard for it to write objectively bad code after that step.

Also, my day to day work is in an existing code base. Nearly every feature I build has existing patterns or reference code. LLMs do extremely well when you tell them "Build X feature. [some class] provides a similar implementation. Review that before starting." If I think something needs to be DRY'd up or refactored, I ask it to do that.

▲

ivan_gammel

2 months ago

[-]

> The definition of code quality is irrelevant to my argument

Understand. Nevertheless, human engineers may deliberately choose certain level of quality and accept certain risks (quality of output is not direct measure of professionalism, so the question wasn’t pointed at your skill) — it‘s good that AI is matching your expectations, but it’s important to understand what are they for your projects.

▲

kamaal

2 months ago

[-]

Code that you can understand and fix later, is acceptable quality per my definition.

Either way, LLMs are actually high up the quality spectrum as they generate a very consistent style of code for everyone. Which gives it uniformity, that is good when other developers have to read and troubleshoot code.

▲

ivan_gammel

2 months ago

[-]

> Code that you can understand and fix later, is acceptable quality per my definition.

This definition limits the number of problems you can solve this way. It basically means buildup of the technical debt - good enough for throwaway code, unacceptable for long term strategy (growth killer for scale-ups).

>Either way, LLMs are actually high up the quality spectrum

This is not what I saw, it’s certainly not great. But that may depend on stack.

▲

SkyPuncher

2 months ago

[-]

I'm curious were you in an existing code base or a greenfield project?

I've found LLMs tend to struggle getting a codebase from 0 to 1. They tend to swap between major approaches somewhat arbitrarily.

In an existing code base, it's very easy to ground them in examples and pattern matching.

▲

ivan_gammel

2 months ago

[-]

Greenfield. It’s an interesting question though, if on today‘s project some model will perform better tomorrow because of more reference data. I would expect LLMs to lag behind on latest technology, simply because their reference data has more older examples and may not include latest versions of platforms or frameworks. I have seen LLMs breaking on basic CRUD tasks because of that.

▲

leoh

2 months ago

[-]

Good employee, you get cookie and 1h extra pto

▲

NineWillows

2 months ago

[-]

No, I get to spend 2 hours working with LLMs, and then spend the rest of the day doing whatever I please. Repeat.

▲

cayley_graph

2 months ago

[-]

You do understand that state of things is metastable, right? If the productivity gains truly are as claimed, then they will become the _expectation_, and you'll be back to working the same amount. Probably more, because management won't understand that having solid abstractions instead of LLM generated slop is worthwhile for scalability and maintainability. Or less, because you'll have been laid off and will need to do something else to make money. We all know where most of the profits will go if any of this stuff pans out.

▲

bigstrat2003

2 months ago

[-]

I wholeheartedly agree. When the tools become actually worth using, I'll use them. Right now they suck, and they slow you down rather than speed you up. I'm hardly a world class developer and I can do far better than these things. Someone who is actually top notch will outclass them even more.

▲

Chinjut

2 months ago

[-]

I understand not wanting to use LLMs that with no correctness guarantees that randomly hallucinate, but what's wrong with ordinary LSPs and autocompletion? Those seem like perfectly useful tools.

▲

OsrsNeedsf2P

2 months ago

[-]

I had a professor who used `ed` to write his code. He said only bring able to see one line at a time forces you to think more about what you're doing.

Anyways, Cursor generates all my code now.

▲

x1xx

2 months ago

[-]

If you are like me (same vim, python, no LLM, no autocompletion, no syntax highlighting noise), LSP will make you a better developer: it makes navigating the codebase MUCH easier, including stdlib and 3rd party dependencies.

As a result, you don't lose flow and end up reading considerably more code than you would have otherwise.

▲

jgb1984

2 months ago

[-]

Actually, I'm kind of cheating because I use https://github.com/davidhalter/jedi-vim for that purpose: allows me to jump to definitions with <leader>d ;) Excellent plugin, and doesn't require an LSP.

▲

incoming1211

2 months ago

[-]

Can pretty much guarantee with AI I'm a better software developer than you without. And I still love working on software used by millions of people every day, and take pride in what I do.

▲

theonething

2 months ago

[-]

> with debian, python and vim

Why are you cheapening the product, butchering the process and decimating any hope for further skill development by using these tools?

Instead of python, you should be using assembly or heck, just binary. Instead of relying on an OS abstraction layer made by someone else, you should write everything from scratch on the bare metal. Don't lower yourself by using a text editor, go hex. Then your code will truly be "hand crafted". You'll have even more reason to be proud.

▲

dmitrygr

2 months ago

[-]

I am unironically with you. I think people should start to learn from computer architecture and assembly and only then, after demonstrating proper skill, graduate to C, and after demonstrating skill there graduate to managed-memory languages.

▲

marcus_holmes

2 months ago

[-]

I was lucky enough to start my programming journey coding in Assembler on the much, much simpler micro computers we had in my youth. I would not even vaguely know where to start with Assembler on a modern machine. We had three registers and a single contiguous block of addressable memory ffs. Likewise, the things I was taught about computer architecture and the fetch-execute cycle back in the 80's are utterly irrelevant now.

I think if you tried to start people off on the kinds of things we started off on in the 80's, you'd never get past the first lesson. It's all so much more complex that any student would (rightly!) give up before getting anywhere.

▲

CaptainFever

2 months ago

[-]

Relevant XKCD: https://xkcd.com/378/

▲

mock-possum

2 months ago

[-]

Good for you - if that’s what works for you, then keep on keeping on.

Don’t get too hung up on what works for other people. That’s not a good look.

▲

sneak

2 months ago

[-]

This comment presupposes that AI is only used to write code that the (presumably junior-level) author doesn’t understand.

I’m a self-respecting software developer with 28 years of experience. I would, with some caveats, venture to say I am an expert in the trade.

AI helps me write good code somewhere between 3x and 10x faster.

This whole-cloth shallow dismissal of everything AI as worthless overhyped slop is just as tired and content-free as breathless claims of the limitless power or universal applicability of AI.

▲

ookblah

2 months ago

[-]

sorry for the snark, but missing the forest for the trees here. unless it's just some philosophical idea, use the tools that save you time. if anything it saves you writing boilerplate or making careless errors.

i don't need to "hand write" every line and character in my code and guess what, it's still easy to understand and maintain because it's what would have written anyway. that or you're just bikeshedding minor syntax.

like if you want to be proud of a "hand built" house with hammer and nails be my guest, but don't conflate the two with always being well built.

▲

computerex

2 months ago

[-]

Why use a high level language like python? Why not assembly? Are you really proud of the slow unoptimized byte code that’s executed instead of perfectly crafting the assembly implementation optimizing for the architecture? /s

Seriously comments like yours assume, that all the rest of us who DO make extensive use of these AI tools and have also been around the block for a while, are idiots.

▲

kebokyo

2 months ago

[-]

here's an archive of the original reddit post since it seemed to be instantly nuked: https://undelete.pullpush.io/r/cursor/comments/1jyy5am/psa_c...

▲

rurp

2 months ago

[-]

It's funny seeing all of the comments trying to blame the users for this screwup by claiming they're using it wrong. It is reddit though, so I guess I shouldn't be surprised.

▲

keeganpoppen

2 months ago

[-]

what is it about reddit that causes this behavior, when they otherwise are skeptical only of whatever the "official story" is at all costs? it is fascinating behavior.

▲

klabb3

2 months ago

[-]

One reason is lazy one liners are allowed and have high cost/benefit for shitposters and attract many upvotes, so this gets voted highly setting the tone for flamewars in the thread.

It’s miles better on HN. Most bad responses are penalized. The culture is upvoting things that are contributing. I frequently upvote responses that disagree with me. Oftentimes I learn something from it.

▲

AstroBen

2 months ago

[-]

think about the demographic that would join a cursor subreddit. Basically 90% superfans. Go against the majority opinion and you'll be nuked

▲

encom

2 months ago

[-]

>Go against the majority opinion and you'll be nuked

This is true for the entirety of Reddit, and the majority is deranged.

▲

dpkirchner

2 months ago

[-]

I think it's a combination of oppositional defiant disorder and insufficient moderation.

▲

bytesandbits

2 months ago

[-]

wow they nuked it for damage control and only caused more damage

▲

ddxv

2 months ago

[-]

Cursor is weird. They have a basically unused GitHub with a thousand unanswered Issues. It's so buggy in ways that VSCode isn't. I hate it. Also I use it everyday and pay for it.

That's when you know you've captured something, when people hate use your product.

Any real alternatives? I've tried continue and was unimpressed with the tab completion and typing experience (felt like laggy typing on a remote server).

▲

adriand

2 months ago

[-]

VS Code with standard copilot for tab completion and Aider in a terminal window for all the heavier lifts, asking questions, architecting etc. And it’s cheap! I’ve been using it with OpenRouter (lets you easily switch models and providers) and my $10 of credits lasted weeks. Granted, I also use Claude a lot in the browser.

▲

dtquad

2 months ago

[-]

The reason many prefer Cursor over VSCode + GitHub Copilot is because of how much faster Cursor is for tab completion. They use some smaller models that are latency optimized specifically to make the tab completion feel as fast as possible.

▲

pfg_

2 months ago

[-]

Copilot's tab completion is significantly worse than cursor's in my experience (only tried free copilot)

▲

adriand

2 months ago

[-]

Paid Copilot is pretty crap too to be honest. I don’t know why it’s so bad.

▲

caelinsutch

2 months ago

[-]

If you don't mind leaving VSCode I'm a huge fan of Zed. Doesn't support some languages / stacks yet but their AI features are on-par with VSCode

▲

presentation

2 months ago

[-]

That's the wrong IDE to compare it to though, Cursor's AI features are 10x better than VSCode's. I tried Zed last month and while the editing was great, the AI features were too half-baked so I ended up going back to Cursor. Hopefully it gets better fast!

▲

dkersten

2 months ago

[-]

Agreed. My laptop has never used swap until I started using cursor… it’s a resource hog, I dislike using it, but it’s still the best AI coding aid and for the work I’m doing right now, the speed boost is more valuable than hand crafted code in enough cases that it’s worth it for me. But I don’t enjoy using the IDE itself, and I used vscode for a few years.

Personally, I will jump ship to Zed as soon as it’s agent mode is good enough (I used Zed as a dumb editor for about a year before I used cursor, and I love it)

▲

permo-w

2 months ago

[-]

I find that if you turn off telemetry (i.e. turn on privacy) the resource hogging slows down a lot

▲

dkersten

2 months ago

[-]

Hmm. I double checked and I have privacy mode enabled, so I don't think that's the root cause. I also removed all but the bare essential extensions (only the theme I'm using and the core language support extensions for typescript and python).

▲

dkersten

2 months ago

[-]

Since writing this message 4 days ago, I've spent the last 2 days using Zed's beta agent mode and it doesn't swap. So whatever Cursor/vscode are doing, it eats my RAM.

▲

d357r0y3r

2 months ago

[-]

Cline is pretty solid and doesn't require you to use a completely unsustainable VSCode fork.

▲

alok-g

2 months ago

[-]

I have heard Roo Code is a fork of Cline that is better. I have never used either so far.

https://github.com/RooVetGit/Roo-Code

▲

SkyPuncher

2 months ago

[-]

I prefer Roo, but they're largely the same right now. They each have some features the other doesn't.

▲

smaddox

2 months ago

[-]

I switched to Windsurf.ai when cursor broke for me. Seems about the same but less buggy. Haven't used it in the last couple weeks, though, so YMMV.

▲

omneity

2 months ago

[-]

I found the Windsurf agent to be relatively less capable, but their inline tool (and the “Tab” they’re promoting so much) has been extremely underwhelming, compared to Cursor.

The only one in this class to be even worse in my experience is Github Copilot.

▲

htrp

2 months ago

[-]

cant bother fixing their issues because they are too busy vibe coding new features

▲

behnamoh

2 months ago

[-]

Cursor + Vim plugin never worked for me, so I switched back to Nvim and never looked back. Nvim already has: avante, codeCompanion, copilot, and many other tools + MCP + aider if you're into that.

▲

tintor

2 months ago

[-]

"Any real alternatives?"

I use Zed with `3.7 sonnet`.

▲

dkersten

2 months ago

[-]

And the agent beta is looking pretty good, so far, too.

▲

mushufasa

2 months ago

[-]

Last I heard their team was still 10 people. Best size for doing something revolutionary. Way too few people to triage all that many issues and provide support.

They have enough revenue to hire, they probably are just overwhelmed. They'll figure it out soon I bet.

▲

throwaway314155

2 months ago

[-]

I have never rolled my eyes harder.

▲

ozataman

2 months ago

[-]

Any competing product has to absolutely nail tab autocomplete like Cursor has. It's super fast, very smart (even guessing across modules) and very often correct.

▲

amiantos

2 months ago

[-]

Claude Code CLI is amazing and I am very confused as to why no one in 24 hours has recommended it.

▲

bytesandbits

2 months ago

[-]

Cursor sucks. Not as a product. As a team. Their customer support is terrible.

I was offered in writing a refund by the team who cold reached out to me to ask me why I cancelled my sub one week after start. Then they ignored my 3+ emails in response asking them to refund, and other means of trying to communicate with them. Offering me a refund as a bait to gain me back, then when I accept it they ghost me. Wow. Very low.

The product is not terrible but the team responses are. And this, if you see how they handled it, is also a very poor response. First thing you notice if you open the link is that the Cursor team removed the reddit post! As if we were not going to see it or something? Who do they think they are? Censoring bad comments which are 100% legit.

I am giving it a go to competitors just out of sheer frustration with how they handle customers, and I do recommend everybody to explore other products before you settle on Cursor. I don't intend to ever re-subscribe and have recommended friends to do the same, most of which agree with my experience.

▲

JohnKemeny

2 months ago

[-]

> Their customer support is terrible.

You just don't know how to prompt it correctly.

▲

Crosseye_Jack

2 months ago

[-]

sounds like perfect grounds for a chargeback to me. Company offered a full refund via one of its Agents, company then refused to honour that offer, time to make your bank force them to refund you.

Just because you use AI for customer service doesn't mean you don't have to honour its offers to customers. Air Canada recently lost a case where its AI offered a discount to a customer but then refused to offer it "IRL"

https://www.forbes.com/sites/marisagarcia/2024/02/19/what-ai...

▲

einsteinx2

2 months ago

[-]

Same exact thing happened to me. I tried out Vursor after hearing all the hype and canceled after a few weeks. Got an email asking if I wanted a refund and asking for any feedback. I replied with detailed feedback on why I canceled and accepted the refund offer, then never heard back from them.

▲

samanator

2 months ago

[-]

Interesting. The same thing happened to me. Was offered a refund (graciously, as I had forgotten to cancel the subscription). And after thanking them and agreeing to the refund, was promptly ignored!

Very strange behavior honestly.

▲

pzo

2 months ago

[-]

I had the same exact experience - after disappointment (couldn't use like 2/3 of my premium credits because every second request failed after they upgraded to 0.46) unsubscribed. They offered refund in email. I replied I wanted refund but no reply

▲

gblargg

2 months ago

[-]

Apparently they use AI to read emails. So the future of email will be like phone support now, where you keep writing LIVE AGENT until you get a human responding.

▲

PaulStatezny

2 months ago

[-]

This reminds me of how small of a team they are, and makes me wonder if they have a customer support team that's growing commensurately with the size of the user base.

▲

andybak

2 months ago

[-]

I just cancelled - not because I thought the policy change was real - but simply because this article reminded me I hadn't used it much this month.

▲

scarface_74

2 months ago

[-]

This is where Kagi’s subscription policy comes in handy. If you don’t use it for a month, you don’t pay for it that month. There is no need to cancel it and Kagi doesn’t have to pay user acquisition costs.

▲

paxys

2 months ago

[-]

Slack does this as well. It's a genius idea from a business perspective. Normally IT admins have to go around asking users if they need the service (or more likely you have to request a license for yourself), regularly monitor usage, deactivate stale users etc., all to make sure the company isn't wasting money. Slack comes along and says - don't worry, just onboard every user at the company. If they don't log in and send at least N messages we won't bill them for that month.

▲

ashu1461

2 months ago

[-]

They mention an user taking an action will be billed. I guess even sending a message or reacting with an emoji would count as taking an action ? Even logging in ?

▲

tshaddox

2 months ago

[-]

That's a fun one. It could be interpreted as a generous implementation of a monthly subscription, or a hostile implementation of a metered plan.

▲

elcritch

2 months ago

[-]

Wow, I wish more services did that.

▲

permo-w

2 months ago

[-]

Kagi should take it a step further and just charge per search

▲

scarface_74

2 months ago

[-]

History shows that metered plans are extremely unpopular whether it be cell phone service, video, music etc.

▲

permo-w

2 months ago

[-]

Just have both. I'm massively off-put by the pricing tiers of Kagi because I feel my search habits are somewhere between tier 1 and 2 and tier 1 is an acceptable price and tier 2 is not. If I could just pay for what I use then I'd be happy

▲

jay_kyburz

2 months ago

[-]

surely the first thing you do when you subscribe to Kagi is set your default browser search to Kagi.

▲

mirekrusin

2 months ago

[-]

Really? Brilliant idea.

▲

dylan604

2 months ago

[-]

so the old adage no such thing as bad PR shows to be incorrect. had they not been in the news, they'd at least have gotten one more monthly sub from you!

▲

omneity

2 months ago

[-]

This would only be complete in aggregate. We don’t know how many people signed up as a result.

▲

theturtletalks

2 months ago

[-]

Cursor is trapped in a cat and mouse game against "hacks" where users create new accounts and get unlimited use. The repo was even trending on Github (https://github.com/yeongpin/cursor-free-vip).

Sadly, Cursor will always be hampered by maintaining it's own VSCode fork. Others in this niche are expanding rapidly and I, myself, have started transitioning to using Roo and Cline.

▲

permo-w

2 months ago

[-]

literally any service with a free trial--i.e. literally any service--has this "problem". it's an integral part of the equation in setting up free trials in the first place, and by no means a "trap". you're always going to have a % of users who do this, the business model relies on the users who forget and let the subscription cross over to the next month or simply feel its worth paying

▲

theturtletalks

2 months ago

[-]

This is true but Cursor’s problems are a bit worse than a normal paywalled service.

Cursor allows users to get free credits without a credit card and this forced them to change their VSCode fork on how it handles identification so they can stop users from spawning new accounts.

Another is that normally, companies have a cost for each free user. For Cursor, this cost is so sporadic since it doesn’t charge per million context, they use credits. Free users get 50 credits but 1 credit could be 200k+ context each so it could be $40-50 per free user per month. And these users get 50 credits every month.

Lastly, the cursor vip free repo has trended on GitHub many times and users who do pay might stop and use this repo instead.

The Cursor vip free creator is well within his rights to do what they want and get “free” access. This unfortunately hurts paying customers since Cursor has to stop these “hacks.”

This is why Cursor should just move to a VSCode extension. I’ve used Augment and other VSCode extensions and the feature set is close to Cursor so it’s possible for them just to be an extension. The other would be to remove free accounts but allow users to bring their own keys. To use Composer/Agent, you can’t bring your own keys.

This will allow Cursor to stop maintaining a VSCode fork, helps them stop caring if users create new accounts (since all users are paying) and lets users bring their own keys if they don’t want to pay. Hell, if they charge a lifetime fee to bring our own keys for Agent, that would bring in revenue too. But as I see now, Roo and Cline’s agent features are catching up and Cursor won’t have a moat soon.

▲

sumedh

2 months ago

[-]

> but 1 credit could be 200k+ context each

There is a thread on Cursor forums where the context is around 20K to 30K tokens.

▲

GabrielHawk

2 months ago

[-]

> Cursor is trapped in a cat and mouse game against "hacks" where users create new accounts and get unlimited use

Actually, you don't even have to make a new account. You can delete your account and make it again reusing the same email.

I did this on accident once because I left the service and decided to come back, and was surprised to get a free tier again. I sent them an email letting them know that was a bug, but they never responded.

I paid for a month of access just to be cautious, even though I wasn't using it much. I don't understand why they don't fix this.

▲

saintfire

2 months ago

[-]

The AI support triage agent must have deemed it unworthy.

▲

john2x

2 months ago

[-]

> I don't understand why they don't fix this.

It makes number go up and to the right

▲

_fat_santa

2 months ago

[-]

What I don't understand is why maintain a fork instead of just an extension? I would assume an extension like this would be pretty difficult to create but would it be more difficult than literally forking VSCode?

▲

theturtletalks

2 months ago

[-]

Cursor was one of the first AI editors and there was no way to add those features to VSCode thru extensions at that time. Another issue was that VSCode made special exceptions and opened APIs for CoPilot, Microsoft’s product, but not for other players. Not sure if this has changed now.

Cursor took the best course of action at the time by forking but needs to come back into the fold. If VSCode is restricting access to APIs to CoPilot, forking it publicly and putting that in the Readme, “We forked VSCode since they give preferential treatment to CoPilot” would get a lot of community support.

▲

sitkack

2 months ago

[-]

There will be a fork-a-month for these products until they have the same lockin as a textbox that you talk at, "make million dollar viral facebook marketplace post"

▲

Aeolun

2 months ago

[-]

They can just drop any free usage right?

▲

cma

2 months ago

[-]

Tradeoff with slowing down user acquisition

▲

SkyPuncher

2 months ago

[-]

Yes and no.

In a corporate environment, compliance needs are far more important than some trivial cost.

▲

hsbauauvhabzb

2 months ago

[-]

Embrace, extend, extinguish.

▲

mgraczyk

2 months ago

[-]

What is the evidence that "dozens of users publicly canceled their subscriptions"?

A total of 4 users claimed that they did or would cancel their subscriptions in the comments, and 3/4 of them hedged by saying that they would cancel if this problem were real or happened to them. It looks like only 1 person claimed to have cancelled already.

Is there some other discussion you're looking at?

▲

dang

2 months ago

[-]

Submitted title was "Cursor IDE support hallucinates lockout policy causes mass user cancellations" - I've de-massed it now.

Since the HN title rule is "Please use the original title, unless it is misleading or linkbait" and the OP title is arguably misleading, I kept the submitter's title. But if there's a more accurate or neutral way to say what happened, we can change it again.

▲

hombre_fatal

2 months ago

[-]

Yeah, it's a reddit thread with 59 comments.

Yet if you went by the HN comments, you'd think it were the biggest item on primetime news.

People are really champing at the bit.

▲

instagib

2 months ago

[-]

Currently, there are 59 comments after the post was deleted and the thread was locked.

It is worth mentioning that the comments that remain are valuable, as they highlight the captured market size and express concern about the impending deterioration of the situation.

▲

AndyKelley

2 months ago

[-]

From the top Reddit post:

> Apologies about the confusion here.

If this was a sincere apology, they'd stop trying to make a chat bot do support.

▲

zb3

2 months ago

[-]

This drama is a very good thing because 1) now companies might reconsider replacing customer support with AI and 2) the initial victim was an AI company.

It could be better though.. I wish this happened to a company providing "AI support solutions"..

▲

dylan604

2 months ago

[-]

A company of 10 devs might not actually have a customer support at all. We all know, devs are not the greatest customer support people.

▲

tshaddox

2 months ago

[-]

It's an AI company that is presumably drowning in money. What humans do work there have probably already had a good laugh and forgotten about this incident.

▲

cs702

2 months ago

[-]

Yup, hallucinations are still a big problem for LLMs.

Nope, there's no reliable solution for them, as of yet.

There's hope that hallucinations will be solved by someone, somehow, soon... but hope is not a strategy.

There's also hype about non-stop progress in AI. Hype is more a strategy... but it can only work for so long.

If no solution materializes soon, many early-adopter LLM projects/trials will be cancelled. Sigh.

▲

permo-w

2 months ago

[-]

trying to "fix hallucinations" is like trying to fix humans being wrong. it's never going to happen. we can maybe iterate towards an asymptote, but we're never going to "fix hallucinations"

▲

AIPedant

2 months ago

[-]

No: an LLM that doesn't confabulate will certainly get things wrong in some of the same ways that honest humans do - being misinformed, confusing similar things, "brain" damage from bad programming or hardware errors. But LLM confabulations like the one we're discussing only occur in humans when they're being sociopathically dishonest. A lawyer who makes up a court case is not a "human being wrong," it's a human lying, intentionally trying to deceive. When an LLM does it, it's because it is not capable of understanding that court cases are real events that actually happened.

Cursor's AI agent simply autocompleted a bunch of words that looked like a standard TOU agreement, presumably based on the thousands of such agreements in its training data. It is not actually capable of recognizing that it made a mistake, though I'm sure if you pointed it out directly it would say "you're right, I made a mistake." If a human did this, making up TOU explanations without bothering to check the actual agreement, the explanation would be that they were unbelievably cynical and lazy.

It is very depressing that ChatGPT has been out for nearly three years and we're still having this discussion.

▲

permo-w

2 months ago

[-]

since you've made a throwaway account to say this, I don't expect you to actually read this reply, so I'm not going to put any effort into writing this, but essentially this is a fundamental lack of understanding of humans, brains, and knowledge in general, and ChatGPT being out 3 years is completely irrelevant to that.

▲

aithepedantguy

2 months ago

[-]

not OP but I found that response compelling. I know that humans also confabulate, but it feels intuitively true to me that humans won't unintentionally make something up out of whole cloth with the same level of detail that an llm will hallucinate at. so a human might say "oh yeah there's a library for drawing invisible red lines" but an llm might give you "working" code implementing your impossible task.

▲

Draiken

2 months ago

[-]

I've seen plenty of humans hallucinating many things unintentionally. This does not track. Some people believe there's an entity listening when you kneel and talk to yourself, others will swear for their lives they saw aliens, they got abducted, etc.

Memories are known to be made up by our brains, so even events that we witnessed will be distorted when recalled.

So I agree with GP, that response shows a pretty big lack of understanding on how our brains work.

▲

instagib

2 months ago

[-]

One workaround for using RAG, as mentioned in a podcast I listened to, involves employing a second LLM agent to assess the work of the first LLM. This agent evaluates the response or hallucination by requiring the first LLM to cite sources and subsequently locate those sources.

▲

_jonas

2 months ago

[-]

My startup is working on this fundamental problem.

You can try out our early product here: https://cleanlab.ai/tlm/

(free to try, we'd love to hear your feedback)

▲

Reubensson

2 months ago

[-]

Tested the free chat. The chat bot gave slightly incorrect answer, and trustworthiness gave it score of 0.749 and said the answer is completely incorrect, which was not actually the case. Seems more confusing with two answers that are somewhat wrong.

▲

dudinax

2 months ago

[-]

Every llm output is a hallucination. Sometimes it happens to match reality.

▲

ok_computer

2 months ago

[-]

I have an uneasy feeling logging into a Text Editor vscode and seeing a Microsoft correlated account, work or personal, in the lower left corner. I understand that settings sync or whatever but it’d be preferred to keep to a simple config json or xml (pretty sure most settings are in json).

I have no problem, however, pasting an encryption public key into my Sublime Text editor. I’m not completely turned off by ability fir telemetry, tracking, or analytics. But having a login for a Text Editor is totally unappealing to me with all the overhead.

It’s a bummer that similar to browsers and chrome, the text editor with an active package marketplace necessitates some tech major underwriting the development with “open source” code but a closed kernel.

Long live Sublime text (i’m aware there are more pure text editors but do use mice)

▲

spartanatreyu

2 months ago

[-]

As far as I can tell, the account is just for:

- github integration (e.g. git auth, sync text editor settings in private gist)

- a trusted third party server for negotiating p2p sessions with someone else (for pair programming, debugging over a call, etc...)

But anyone who wants to remove the microsoft/github account features from their editor entirely can just use vscodium instead.

▲

mindwok

2 months ago

[-]

Wow. This one will go down in the history books as an example of AI hype outpacing AI capability.

▲

dylan604

2 months ago

[-]

Wow. This one will go down in the history books as yet another example of AI hype outpacing AI capability.

FTFY

Also see every single genAI PR release showing obvious uncanny valley image (hands with more than expected number of fingers). See Apple's propaganda videos vs actual abilities. There are plenty of other (all???) PR examples where the product does not do what is advertised on the tin.

▲

hedayet

2 months ago

[-]

I've had a very good experience with Cursor on small Typescript projects.

It started hallucinating a lot as my typescript project got bigger.

I found it pretty useless in languages like Go and C++.

I ended up canceling Cursor this month. It was messing up working code, suggesting random changes, and ultimately increasing my cognitive load instead of reducing it.

▲

burnte

2 months ago

[-]

Remember: They think you should trust their AI's output with your codebase.

▲

xyst

2 months ago

[-]

AI bubble popping yet? Looking forward to not buying GPUs that cost $2K (before tariffs…).

▲

adolfojp

2 months ago

[-]

AI's killer app is "marketing". Bots are incredibly useful for selling items, services, and politics, and AI makes the bots indistinguishable from real people to most people most of the time. It's highly effective so I don't see that market shrinking any time soon.

▲

novaRom

2 months ago

[-]

1) there is no AI bubble, it has revolutionized how we communicate and learn 2) you don't need to buy an expensive GPU for local LLM, any contemporary laptop with enough RAM is sufficient to run an uncensored Gemma-3 fast

▲

rcxdude

2 months ago

[-]

Railroads revolutionized transport and yet railway mania was undeniably a bubble. Something can be both very useful and yet also overvalued and overhyped leading to significant malinvestment (sometimes, everyone wins to the detriment of the investors, sometimes just everyone loses out because a huge amount of effort was spent on not useful stuff, usually somewhere in between).

▲

myko

2 months ago

[-]

I'm shocked by this perspective, and I'm deep into the LLM game (shipped 7 figure products using LLMs). I don't feel like anything has been revolutionized around communication - I can spot AI generated emails pretty easily (just send the prompt, people). On the learning front I do find LLMs to be more capable search engines for many tasks, so they're helpful absolutely.

▲

Philpax

2 months ago

[-]

I'm an AI fan, but there's clearly a desperate attempt by just about every tech company to integrate AI at the cost of genuinely productive developments in the space, in a manner that one might describe as a "bubble." Microsoft's gotta buy enough GPUs to handle Copilot in Windows Notepad, after all...

▲

omneity

2 months ago

[-]

Calling it desperate is a subjective assessment. Yes, some strategies are more haphazard than others, but ignoring generative AI currently is the same as ignoring the internet in 1999 or mobile in 2010 (which facebook famously regretted and paid $4+1B to buy instagram and whatsapp in order to catch up)

▲

grg0

2 months ago

[-]

LLMs for customer support is for ghetto companies that want to cheap out on quality. That's why you'll see Comcast and such use it, for example, but not your broker or anywhere where the stakes on the company's reputation are non-zero.

▲

pvdebbe

2 months ago

[-]

Some time ago, in an electronics online shop I asked about warranty terms from a chatbot and got favorable answers. It didn't take two hours before a human contacted me via email to correct a misunderstanding.

▲

SuperNinKenDo

2 months ago

[-]

Awesome. It warms my heart that both parties who are supporting AI, users and company, felt its negative effects. We can only hope this continues to happen.

▲

scarface_74

2 months ago

[-]

I do a few AI projects and my number one rule is to use LLMs only to process requests and never to generate responses.

▲

permo-w

2 months ago

[-]

god I hate redditors. what is it about that website that makes every user so so incredibly desperate to jump on literally any bandwagon they can lay their woolly arses on?

▲

elia_42

2 months ago

[-]

A huge problem. I think AI-based customer service is more negative than positive in general. In my opinion in this way we first of all entrust an operation that needs to be timely and above all empathetic to AI; secondly, although it is faster to train a model to become a very good customer assistant, thereafter a continuous human intervention of monitoring, improvement and error correction is required. So in the end, in addition to not having a customer assistant capable of “being close” to the customer and understanding the finer nuances of a customer's request, this approach leads to inefficiency and slowdowns in the medium to long term.

▲

DanHulton

2 months ago

[-]

Literally the only safe way to use an LLM in a business context is as an input to a trusted human expert, and the jury is still out for even that case.

Letting an AI pose as customer support is just begging for trouble, and Cursor had their wish appropriately granted.

▲

IIAOPSW

2 months ago

[-]

The coverup is worse than the crime. Truly, man has made machine in his image.

▲

rvz

2 months ago

[-]

This is "AGI's finest".

It's what we all wanted. Replacing your human support team to be run exclusively by AI LLM bots whilst they hallucinate to their users. All unchecked.

Now this bug has now turned into a multi-million dollar mistake and costed Cursor to lose millions of dollars overnight.

What if this was a critical control system in a hospital or energy company and their AI support team (with zero humans) hallucinated a wrong meter reading and overcharged their customers? Or the AI support team hallucinated the wrong medication to a patient?

Is this the AGI future we all want?

▲

n_ary

2 months ago

[-]

Are we sure that the damage is in “millions”? Could it not be just the same old “vocal and loud minority”, who will wake up tomorrow and re-activate their subscription before the lunch break once Cursor team comes and writes a heartfelt apology post (well generated by AI with a system prompt of “you are a world class PR campaign manager …”)?

▲

falcor84

2 months ago

[-]

You're right, but this is more general - it has nothing to do with AGI and everything to do with poor management. It reminds me very much of the Chernobyl disaster, and the myriad examples in Taleb's "The Black Swan".

▲

layer8

2 months ago

[-]

Where are you seeing AGI?

▲

aesirian

2 months ago

[-]

It's a Reddit post with 65 upvotes. Where are people getting millions of dollars overnight? HN is too dramatic lol

▲

esafak

2 months ago

[-]

Even if they generously compensate locked out users, they'll probably end up ahead using the AI bot.

▲

dmitrygr

2 months ago

[-]

I love this moment so much, I want to marry it and raise a family of little moments with it. A company that claims AI will solve everything cannot even get their own support right. chefkiss

▲

Gud

2 months ago

[-]

A question to the HN community. Is there any profit in running web sites like they were designed during peak web(~2007)?

No AI, less crappy frameworks, fewer dark patterns, etc.

▲

never_inline

2 months ago

[-]

Amazon has such design, no?

▲

arp242

2 months ago

[-]

> And just as people started comparing notes and figuring out that the story didn’t quite add up… the main Reddit thread got locked. Then deleted. Like, no public resolution, no real response, just silence.

Anything that doesn't fit the narrative must be deleted I guess.

As is so often the case, it's not the mistake itself that's the biggest problem but how people or organisations deal with the mistake.

▲

jsight

2 months ago

[-]

I was playing around with ChatGPT the other day and encountered a similar issue. Once I hit the rate limit, the replies telling me that seemed to be AI generated. When I waited the requisite time, the next use attempt repeated the last message about being past the limit and needing to wait 13 hours again.

It seemed to be reading from the conversation to determine this. Oops! Replaying an earlier message worked fine.

▲

basisword

2 months ago

[-]

If you've ever worked in support you would know that even the most technical people can be highly inarticulate, have an inability to share even the most basic details, and if English isn't their first language things can get very confusing. An AI bot answering those people with a lack of context and pretending to be a real support agent is an accident waiting to happen.

▲

rurban

2 months ago

[-]

That's probably not so bad in a culture based on lies, like the US or China. That's probably their training data. But it wouldn't fly in more robust cultures were you expect the truth, even if it we will hurt.

▲

EGreg

2 months ago

[-]

Oooh I wanted to try this for a while so here goes…

This doesn’t seem like anything new. Ill-informed support staff has always existed, and could also give bad information to users. AI is not the problem. And it hasn’t created any problems that weren’t already there before AI.

Usually by the time I get to a post on HN criticizing AI, someone has already posted this exact type of rebuttal to any criticism…

▲

lastcobbo

2 months ago

[-]

This comes on the heels of researchers putting javascript backdoors in AI assisted programs made in cursor using poisoned includes as well.

▲

porridgeraisin

2 months ago

[-]

I disagree with but understand selling AI as a hot thing that can reliably do things to other companies. False marketing has been around since "buh buh" sold a deer corpse to "bah bah" in exchange for his pretty leaf mat by telling him it's actually a cow.

But drinking the kool aid yourself? That demonstrates a new low in human mental facility.

▲

wvh

2 months ago

[-]

Consider the source. I assume companies will learn the hard way not to fake human interaction and be forced to mention when responses are auto-generated (a better word than "AI" really). All that effort for so many to build rapport and trust and all pissed away by trying to save a buck and having an ambitious bot trying to go solo.

▲

joe_the_user

2 months ago

[-]

I haven't seen people comment on just the wow factor here. Apparently Cursor produced a full integrated AI app and it orchestrated a self-destruct process in fashion emulating the way some human-managed companies have recently self-destructed. AI fails are easy and some require a lot of work, apparently.

Looking forward to apps trained on these Reddit threads.

▲

gwern

2 months ago

[-]

Who will be the first AI-first company to close the loop with autonomous coding bots, and create self-fulfilling prophecies where a customer support bot hallucinates a policy like OP, it gets stored in the Q&A/documentation retrieval database, and the autonomous bots implement and roll out the policy and lock out all the users (and maybe the original human developers as well)?

▲

throwaway314155

2 months ago

[-]

What?

▲

jonathaneunice

2 months ago

[-]

I don't understand the negativity. I use Cursor and love it.

Are there real challenges with forking VS Code? Yep. Are there glitches with LLMs? Sure. Are there other AI-powered coding alternatives that can do some of the same things? You betcha.

But net-net, Cursor's an amazing power tool that strongly extends what we can accomplish in any hour, day, or week.

▲

EdwardDiego

2 months ago

[-]

> I don't understand the negativity.

AI replied to support email, and told people a session bug was a feature.

▲

jonathaneunice

2 months ago

[-]

No, I get that. But that obvious screw-up is being conflated with assessments of "is Cursor any good?" "is Cursor still ahead of the pack?" "does LLM-assisted coding good/useful?" et cetera.

This one failure is kitchen-sinking into a broad list of grievances, perceived slights, or unrelated problems all at once. I don't understand that, since I'm finding Cursor to be strong, helpful, and constantly improving. My experience is entirely different from this list of grievances.

▲

soheil

2 months ago

[-]

I wonder if people are wrongly just going to start calling random, completely-unrelated-to-LLM bugs in software "hallucination". Very similar to the meaning of AI changing in recent years to include basically any software that has some type of algorithm or heuristic built into it.

▲

skc

2 months ago

[-]

>Community was in revolt

>Dozens of users publicly cancelled

A bit hyperbolic, no? Last I read they have over 400,000 paying users.

▲

rustcleaner

2 months ago

[-]

Local only bots or bust. Stop relying on SaaS! Cloud is synonym for just somebody else's system!

▲

SeanAnderson

2 months ago

[-]

Are you using a locally ran LLM that is equally capable as Claude 4.7? Kind of seems like the answer has to be "not as capable and also the hardware investment was insane"

▲

kookamamie

2 months ago

[-]

> most surreal product screwups

It seems you're not aware of the issue which plagued tens of Cursor releases, where the software would auto-delete itself on updates.

It was pretty hilarious, to be honest. Your workflow would consist of always installing the editor before use.

▲

DonHopkins

2 months ago

[-]

FWIW, to illustrate how insane and risky it is to use an AI bot deceptively pretending to be a human as the first line of customer support, I was just asking ChatGPT 4.5 Research Preview analyze the reddit and hn threads about Cursor's horrible AI customer service bot, pasting them in part by part, and finally writing this prompt:

I said:

[Pasted reddit and hn discussion in pieces asking it to hold off commenting until I told it that was all.]

>Ok that is all, now please summarize all the points people raise and interesting discussions. In DETAIL! Write a full deep research report on this and the original reddit discussion, with quotes naming the users who say it, summarizing important back-and-forth threads. You can omit unimportant trolls and inane chit chat.

ChatGPT said:

>The Kurdistan Workers' Party (PKK) is designated as a terrorist organization by several countries and international bodies. The European Union, the United States, and Turkey have officially listed the PKK as a terrorist organization. This designation is based on the group's history of armed conflict and attacks, particularly in Turkey.

>The PKK has been involved in an armed struggle against the Turkish state since the 1980s, seeking greater autonomy and rights for Kurds in Turkey. This conflict has resulted in significant violence and loss of life over the decades.

>While the PKK is designated as a terrorist organization by the EU, the U.S., and Turkey, other countries may have different stances or may not have made an official designation. It's important to consult the specific policies of each country for their official position on the PKK.

>Sources

I said:

>What does your response have to do with a discussion of Cursor's AI customer service?

ChatGPT said:

>You're right—my previous response about the PKK had nothing to do with the discussion about Cursor's AI customer service issue. It seems that I mistakenly pasted content unrelated to our current topic.

>Let me properly address your original request by summarizing in detail the issues raised in the discussions regarding Cursor's AI customer service: [...]

It makes me wonder who got the summary of the reddit and hn discussions in response to their question about terrorist organizations, and what that makes them think about Cursor as a company, and their involvement in armed struggles, international politics, and terrorism?

▲

discreteevent

2 months ago

[-]

The original bug that prevented login was no doubt generated by AI. They probably ran it through an AI code review tool like the one posted here recently.

The world is drowning in bullshit and delusion. Programming was one of the few remaining places where you had to be precise, where it was harder to fool yourself. Where you had to understand it to program it. That's being taken away and it looks like a lot of people are embracing what is coming. It's hardly surprising - we just love our delusions too much.

▲

ebiester

2 months ago

[-]

Or, alternatively, it was a bug written by a human. Humans do a good enough job making bugs, no?

▲

NeutralCrane

2 months ago

[-]

I’d love to visit the alternate universe you inhabit where there are no bugs in human written code.

▲

dijksterhuis

2 months ago

[-]

at no point did they say anything that even comes close to saying “no bugs in human written code.”

if you’re willing to come down off your defensive AI position, because your response is a common one from people who are bought into the tech, i’ll try explain what they were saying (if not, stop reading now, save yourself some time).

maybe you’ll learn something, who knows :shrug:

> Programming was one of the few remaining places where you had to be precise, where it was harder to fool yourself. Where you had to understand it to program it.

they are talking about the approach, motivations and attitudes involved in “the craft”.

we strive for perfection, knowing we will never reach it. we, as programmers/hackers/engineers must see past our own bullshit/delusions to find our way to the fabled “solution”.

they are lamenting how those attitudes have shifted towards “fuck it, that’ll do, who cares if the code reads good, LLM made it work”.

where in the “vibe coding” feedback loop is there a place for me, a human being, to realise i have completely misunderstood a concept for the last five years and suddenly realise “oh shit, THATS HOW THAT WORKS!? HOW HAVE I NOT REALISED THAT FOR FIVE YEARS.” ?

where in “just ask chatgpt for a summary about a topic” is my journey where i learn about a documentation rendering library that i never even knew existed until i actually started reading the docs site for a library?

maybe we were thinking about transferring our docs off confluence onto a public site to document our API? asking chatGpt removes that opportunity for accidental learning and growth.

in essence, they’re lamenting the sacrifice people seem to be willing to make for convenience, at the price of continually growing and learning as a human being.

at least that’s my take on it. probably wrong — but if i am at least i get to learn something new and grow as a person and see past my own bullshit and delusions!

▲

isaacremuant

2 months ago

[-]

This is extremely funny. AI can't have accountability. Good luck with that.

Use AI to augment but don't really replace it as a 100% system if you can't predict and own up the failure rate.

My advice would be to use more configurable tools with less interest on selling fake perfection. Aider works.

▲

latentsea

2 months ago

[-]

> AI can't have accountability

Sure it can. You just have to bake into the reward function "if you do the wrong thing, people will stop using you, therefore you need to avoid the wrong thing".

Then you wind up at self-preservation and all the wholly shady shit that comes along with it.

I think the AI accountability problem is the crux of the "last-mile" problem in AI, and I don't think you can necessarily solve it without solving it in a way that produces results you don't want.

▲

isaacremuant

2 months ago

[-]

I don't want to get into a semantics argument but that's not accountability. That's just one more behavior prompt/indication but you can't fire the LLM. It might still do the wrong thing.

▲

nkrisc

2 months ago

[-]

Incidentally, this is a great way to find out you're never going to get real support when you need it, just AI responses designed to make you get tired of trying to talk to a real person they need to pay.

▲

surprise_

2 months ago

[-]

Cursor is so good that snafu like this remarkably and tragically tolerable.

▲

lukaslalinsky

2 months ago

[-]

The current AIs are such people pleasers, that I really take the time how to form the prompt, so that it can't just say, "yes, it is like that" in a polite way.

▲

jacooper

2 months ago

[-]

What's the best llm-first/ designed with llm in mind editor? Is it still cursor?

There's windsurf, cline, zed, copilot got a huge update too, is cursor still leading the space?

▲

revskill

2 months ago

[-]

Cursor is bevoming useless now. They are too money greedy to be trusted. Last time i installed on another machine tgey told me to subscrribe to use. It is gross.

▲

jachee

2 months ago

[-]

> Not because they made a mistake

Except they did make a mistake: trusting their Simulated Intelligence (I’m done calling it “AI”.) with their customers’ trust.

▲

namaria

2 months ago

[-]

I like the Huxley inspired 'intelligence surrogate'.

▲

Artoooooor

2 months ago

[-]

If you use AI to generate any communication, you are still fully responsible for it. It's still your communication, just using a tool.

▲

notnmeyer

2 months ago

[-]

cursor got there first, but it’s just not that good. stuff not built on top of vscode feels much more promising. i’ve been enjoying zed.

▲

Gabriel54

2 months ago

[-]

I fear the day when legal matters and other impactful decisions are decided and similarly hallucinated by AI.

▲

juliensalinas

2 months ago

[-]

Relying on GenAI for support like that without a human in the loop is a huge mistake...

▲

isoprophlex

2 months ago

[-]

Can't move fast and break things unless you do stuff that breaks things. In this case customer trust.

▲

im3w1l

2 months ago

[-]

> Dozens of users publicly canceled their subscriptions, myself included.

Makes you think of that one meme.

▲

ninetyninenine

2 months ago

[-]

Next thing you know AI is going to hallucinate a nuclear launch from China and defend itself.

▲

rpd9803

2 months ago

[-]

Who would have predicted turning customer support over to AI with no human interv

▲

tonyhart7

2 months ago

[-]

nah this is wild, imagine AI company 'automate' their most critical software

surely it wouldn't backfire, right???

ok aside from the joke from this case alone, I think we can all agree that AI not replacing human soon

▲

anyekwest

2 months ago

[-]

It’s the small things like this that made me switch to windsurf.

▲

jmorenoamor

2 months ago

[-]

People speaking with LLMs is going to be fun.

At least until someone dies.

▲

imhoguy

2 months ago

[-]

What a "Open the pod bay doors, HAL" moment.

▲

nbzso

2 months ago

[-]

I am shocked. Really? In the beginning of AI hype, as a product designer, I moved to my routine. Research. You don't need a PHD to find people smarter than you and experts to explain to you what 'stochastic parrots' means.

To understand how LLM's work and why the hallucination is inherent part of the tech with "AI". Maybe the core problem is implementation practices which remove critical thinking and testing? Maybe the core problem is 'fake it till you make it' ideology? I don't know. But I am sure about one thing. This as any other postmodern technology will bring more problems than solutions.

▲

rkagerer

2 months ago

[-]

Good. I hope other companies stupid enough to subject their users to unleashed AI for support instead of real humans reap the consequences of their actions. I'll be eating popcorn and mocking them from my little corner of the internet.

▲

selimnairb

2 months ago

[-]

You cannot put LLMs in control loops.

▲

mikkom

2 months ago

[-]

Vibe coding in all of it's glory

▲

ryandrake

2 months ago

[-]

1. Whenever AI is used closed loop, with human feedback and a human checking the output for quality/hallucinations and then passing it along, it's often fine.

2. Whenever it is used totally on its own, with no humans in the loop, it's awful and shit like this happens.

Yet, every AI company seems to want to pretend we're ready for #2, they market their products as #2, they convince their C-suite customers that their companies should buy #2, and it's total bullshit--we're so far from that. AI tools can barely augment a human in the driver's seat. It's not even close to being ready to operate on its own.

▲

lordofgibbons

2 months ago

[-]

Live by the AI slop, die by the (support) AI slop?

▲

panny

2 months ago

[-]

But look, AI did take a job today. The job of cursor product developers is gone now. ;) But seriously, every time I hear media telling me AI is going to replace us, they conveniently forget episodes like this. AI may take the jobs one day, but it doesn't seem like that day is any time soon when AI adopters keep getting burned.

▲

rpd9803

2 months ago

[-]

Serves them right.

▲

DidYaWipe

2 months ago

[-]

hallucinate ≠ fabricate

▲

nomilk

2 months ago

[-]

Obligatory reference to the Streisand Effect: that trying to hide something (e.g. a Reddit post) often has the unintended consequence of drawing more attention to it.

https://en.wikipedia.org/wiki/Streisand_effect

▲

joelthelion

2 months ago

[-]

Just use aider. Open source, no bullshit, use it with the llm you prefer.

▲

gblargg

2 months ago

[-]

"I'm sorry Dave, I'm afraid I can't do that."

▲

DonHopkins

2 months ago

[-]

I love Cursor. I use it daily. My company, Leela AI, is building on top of it — with MCP tools, Cursor and VSCode plugins, our own models, integrated real-time video analysis, and custom vision model development, query, and action scripting systems.

We're embedding "Active Curation" into the workflow: a semi-automated, human-guided loop that refines tickets, PRs, datasets, models, and scripted behaviors in response to real-world feedback. It's a synergistic, self-reinforcing system — every issue flagged by a user can improve detection, drive model updates, shape downstream actions, and tighten the entire product feedback loop across tools and teams.

So consider this tough love, from someone who cares:

Cursor totally missed the boat on the customer support hallucination fiasco. Not just by screwing up the response — that happens — but by failing to turn the whole mess into a golden opportunity to show they understand the limits of LLMs, and how to work with those limits instead of pretending they don’t exist.

They could have said: Here’s how we’re working to build an AI-powered support interface that actually works — not by faking human empathy, but by exposing a well-documented, typed, structured interface to the customer support system.

You know, like Majordomo did 30 years ago, like GitHub did 17 years ago, or like MCP does now — with explicit JSON schemas, embedded documentation, natural language prompts, and a high-bandwidth contract between the LLM and the real world. Set clear expectations. Minimize round trips. Reduce misunderstandings.

Instead? I got ghosted. No ticket number. No public way to track my issue. I wrote to enterprise support asking specifically for a ticket number — so I could route future messages properly and avoid clogging up the wrong inboxes — and got scolded by a bot for not including the very ticket number I was asking for, as if annoyed I'd gone around its back, and being dense and stubborn on purpose.

You play with the Promethean fire of AI impersonating people, that's what you get, is people reading more into it than it really means! It's what Will Wright calls the "Simulator Effect" and "Reverse Over-Engineering".

https://news.ycombinator.com/item?id=34573406

https://donhopkins.medium.com/designing-user-interfaces-to-s...

Eventually, after being detected trying to get through on the corporate email address, I was pawned off to the hoi polloi hi@cursor.com people-bot instead of the hoi aristoi enterprise@cursor.com business-bot. If that was a bot, it failed. If it was a human, they wrote like a bot. Either way, it's not working.

And yes — the biggest tell it wasn't a bot? It actually took hours to days to respond, especially on weekends and across business hours in different time zones. I literally anthropomorphized the bot ghosting into an understandably overworked work-life-balanced human taking a well earned weekend break, having a sunny poolside barbecue with friends, like in a Perky Pat Layout, too busy with living their best life to answer my simple question: "What is the issue ID you assigned to my case, so we can track your progress?" so I can self serve and provide additional information, without bothering everyone over email. The egg is on my face for being fooled by a customer support bot!

Cursor already integrates deeply with GitHub. Great. They never linked me to any ticketing system, so I assume they don't expose it to the public. That sucks. They should build customer support on top of GitHub issues, with an open-source MCP-style interface. Have an AI assistant that drafts responses, triages issues, suggests fixes, submits PRs (with tests!) — but never touches production or contacts customers without human review. Assist, don’t impersonate. Don’t fake understanding. Don’t pretend LLMs are people.

That’s not just safer — it’s a killer dev experience. Cursor users already vibe-code with wild abandon. Give them modular, extensible support tooling they can vibe-code into their own systems. Give them working plugins. Tickets-as-code. Support flows as JSON schemas. Prompt-driven behaviors with versioned specs. Be the IDE company that shows other companies how to build world-class in-product customer support using your own platform fully integrated with GitHub.

We’re doing this at Leela. We'd love to build on shared open foundations. But Cursor needs to show up — on GitHub, in issue threads, with examples, with tasty dogfood, and with real engineering commitment to community support.

Get your shit together, Cursor. You're sitting on the opportunity of a generation — and we’re rooting for you.

----

The Receipts:

----

Don to Sam, also personally addressed to the enterprise and security bots (explicitly asking for an issue ID, and if it's human or not):

>Hello, Sam.

>You have not followed up on your promise to reply to my issue.

>When will you reply?

>What is the issue ID you assigned to my case, so we can track your progress?

>Are you human or not?

>-Don

----

Enterprise and Security bots: (silence)

Sam to Don (ignoring my request for an issue ID, and my direct question asking it to disclose if it's human or not):

>Hi Don - I can see you have another open conversation about your subscription issues. To ensure we can help you most effectively, please continue the conversation in your original ticket where my teammate is already looking into your case. Opening new tickets won't speed up the process. Thanks for your patience!

----

Don to Sam (thinking: "LLMs are great at analyzing logs, so maybe if I make it look like a cascade of error messages, it will break out of the box and somebody will notice):

>ERROR: I asked you for my ticket number.

>ERROR: I was never given a ticket number.

>ERROR: You should have inferred I did not have a ticket number because I asked you for my ticket number.

>ERROR: You should not have told me to use my ticket number, because you should have known I did not have one.

>ERROR: Your behavior is rude.

>ERROR: Your behavior is callous.

>ERROR: Your behavior is unhelpful.

>ERROR: Your behavior is patronizing.

>ERROR: Your behavior is un-empathic.

>ERROR: Your behavior is unwittingly ironic.

>ERROR: Your behavior is making AI look terrible.

>ERROR: Your behavior is a liability for your company Cursor.

>ERROR: Your behavior is embarrassing to your company Cursor.

>ERROR: Your behavior is losing money for your company Cursor.

>ERROR: Your behavior is causing your company Cursor to lose customers.

>ERROR: Your behavior is undermining the mission of your company Cursor.

>ERROR: Your behavior is detrimental to the success of your company Cursor.

>I would like to speak to a human, please.

----

Four hours and 34 minutes from sending that I finally got a response from a human (or a pretty good simulation), who actually read my email, and started the process of solving my extremely simple and stupid problem, which my initial messages -- if anyone read them or ran a vision model on all the screen snapshots I provided -- would have given them enough information to easily solve the problem in one shot.

▲

mtlmtlmtlmtl

2 months ago

[-]

I've been saying this for years; LLMs will never be able to replace a good support staff. The only support LLMs can be relied on for is the kind of support you get from companies like google and netflix. Off-shored, glorified clickfarm workers in India, who only have access to some restrictive API and a very rigid playbook. They can do stuff like grandma forgot her password again -> help her reset her password.

For a support agent to actually be useful beyond that, they need some leeway to make decisions unilaterally, sometimes in breach of "protocol", when it makes sense. No company with a significant level of complexity in its interactions with customers can have an actually complete set of protocols that can describe every possible scenario that can arise. That's why you need someone with actual access inside the company, the ability to talk to the right people in the company should the need arise, a general ability(and latitude) to make decisions based on common sense, and an overall understanding of the state of the company and what compromises can be made somewhat regularly without bankrupting it. Good support is effectively defined by flexibility, and diametrically opposed to following a strict set of rules. It's about solving issues that hadn't been thought of until they happened. This is the kind of support that gets you customer loyalty.

No company wants to give an LLM the power given to a real support agent, because they can't really be trusted. If the LLM can make unilateral decisions, what if it hallucinated and gives the customer free service for life? Now they have to either eat the cost of that, or try to withdraw the offer, which is likely to lose them that customer. And at the end of all that, there's no one to hold liable for the fuckup(except I guess the programmers that made the chatbot). And no one wants the LLM support agent to be sending them emails all day the same way a human support agent might. So what you end up with is just a slightly nicer natural language interface to a set of predefined account actions and FAQ items. In other words, exactly what you get from clickfarms in Southern Asia or even a phone tree, except cheaper. And sure, that can be useful, just to filter out the usual noise, and buy your real support staff more time to work on the cases where they're really needed, but that's it.

Some companies, like Netflix and Google(Google probably has better support for business customers, never used it, so I can't speak to it. I've only Bangalored(zing) my head against a wall with google support as a lowly consumer who bought a product), seem to have no support staff beyond the clickfarms, and as a result their support is atrocious. And when they replace those clickfarms with LLMs, support will continue to be atrocious, maybe with somewhat better English. And it'll save them money, and because of that they'll report it as a rousing success. But for customers, nothing will have changed.

This is pretty much what I predicted would happen a few years ago, before every company and its brother got its own LLM based support chatbot. And anecdotally, that's pretty much what has happened. For every support request I've made in the last year, I can remember 0 that were sorted out by the LLM, and a handful that were sorted out by humans after the LLM told me it was impossible to solve.

▲

AstroBen

2 months ago

[-]

From cursor developer: "Hey! We have no such policy. You're of course free to use Cursor on multiple machines.

Unfortunately, this is an incorrect response from a front-line AI support bot. We did roll out a change to improve the security of sessions, and we're investigating to see if it caused any problems with session invalidation. We also do provide a UI for seeing active sessions at cursor.com/settings.

Apologies about the confusion here."

▲

acedTrex

2 months ago

[-]

lol this is totally the kind of company you should be giving money too

▲

4ndrewl

2 months ago

[-]

and the best thing about it is that the base model is going to be trained on reddit posts so expect SupportBot3000 to be even more confident about this fact in the future!

▲

idopmstuff

2 months ago

[-]

I mean to be fair, I like that they're putting their money where their mouth is so to speak - if you want to sell a product based on the idea that AI can handle complex tasks, you should probably have AI doing what should be simple, frontline support.

▲

zanellato19

2 months ago

[-]

> you should probably have AI doing what should be simple, frontline support.

AI companies are going to prove (to the market or to the actual people using their products) that a bunch of "simple" problems aren't at all simple and have been undervalue for a long time.

Such as support.

▲

AstroBen

2 months ago

[-]

I don't agree with that at all. Hallucination is a very well known issue. Sure leverage AI to improve their productivity.. but not even having a human look over the responses shows they don't care about their customers

▲

dylan604

2 months ago

[-]

If you had a human support person feeding the support question into the AI to get a hint, do you think that support person is going to know that the AI response is made up and not actually a correct answer? If they knew the correct answer, they wouldn't have needed to ask the AI.

▲

_jonas

2 months ago

[-]

Exactly, that's why my startup recommends all LLM outputs should come with trustworthiness scores:

https://cleanlab.ai/tlm/

▲

n_ary

2 months ago

[-]

The number of times real human powered support caused me massive headache and sometimes financial damage and the number of times my lawyer fixed those because me trying to explain why they were wrong… I am not surprised that AI will do the same as the creation is the image of the creator and all that.

▲

thaumasiotes

2 months ago

[-]

> if you want to sell a product based on the idea that AI can handle complex tasks, you should probably have AI doing what should be simple, frontline support.

That would only be true if you were correct that your AI can handle complex tasks. If you want to sell dowsing rods, you probably don't want to structure your own company to rely on the rods.

▲

lastcobbo

2 months ago

[-]

When you vibe-code customer service

▲

jen729w

2 months ago

[-]

Cursor: a VS Code extension that got itself valued at $10bn.

▲

merb

2 months ago

[-]

It’s basically a vscode fork that illegally uses official extensions, that were not allowed to be used in forks.

▲

falcor84

2 months ago

[-]

Does "it" use the official extensions, or does it only allow "you" the user to use them?

▲

merb

2 months ago

[-]

First of all, their docs link to the market place of ms:

https://www.cursor.com/how-to-install-extension

Which is basically an article to use an extension in a way that’s basically forbidden use.

If that was not bad enough the editor also told you to install certain extensions if certain file extensions were used that were also against the tos of the extension.

And basically cursor can just be using the vsix marketplace from eclipse, which does not contain restricted extensions.

What they do is at least shady.

And yes I’m not a fan of the fact that Microsoft does this, even worse they closed the source (or some parts of it) of some extensions as well, which is also a bad move (but their right)

▲

MiddleEndian

2 months ago

[-]

I don't see a problem with this. If it's an extension on my machine, why do I care about the TOS?

▲

int_19h

2 months ago

[-]

The extensions themselves have licenses that prohibit their use with anything other than VSCode.

(You should keep this in mind next time someone tells you that VSCode is "open source", by the way. The core IDE is, sure, but if you need to do e.g. Python or C++, the official Microsoft extensions involved all have these kinds of clauses in them.)

▲

MiddleEndian

2 months ago

[-]

I don't use VSCode (or Cursor in this case (which I do think was malicious in the way it blindly hallucinated a policy for a paying customer)); I use vim or notepad++ depending on my mood.

I just don't have a problem with people "violating" Terms of Service or End User License Agreements and am not really convinced there's a legal argument there either.

▲

int_19h

2 months ago

[-]

I personally don't have a problem with that either, but as far as legalities go, EULAs are legally binding in US.

▲

MiddleEndian

2 months ago

[-]

Have EULAs been tested in court?

For distribution licenses, I would assume they have. Can't put GPL software in your closed source code, can't just download Photoshop and copy it and give it out, etc. And that makes sense and you have some reasonable path to damage/penalties (GPL → your software is now open source, Photoshop → fines or whatever)

But if you download some free piece of software and use it with some other piece of free piece software even though they say "please don't" in the EULA, what could the criminal or civil penalties possibly be?

▲

int_19h

2 months ago

[-]

Yes, there were some cases that confirmed the validity of "clickwrap", unfortunately: https://www.elgaronline.com/edcollchap/edcoll/9781783479917/...

I don't know what the hypothetical penalty would be for mere use contrary to EULA, though. It would be breach of contract, and presumably the court would determine actual damages, but I don't know what cost basis there would be if the software in question was distributed freely. However, fine or no fine, I would expect the court to order the defendant to cease using software in violation of EULA, and at that point further use would be contempt of court, no?

▲

MiddleEndian

2 months ago

[-]

Fair enough, that is quite unfortunate.

So I've always avoided using the Windows Store on my Windows machines, I think I managed to get WSL2 installed without using it lol.

So I'm not sure on the details, but do the steps on https://www.cursor.com/how-to-install-extension bypass clicking "I agree" since they just download and drag? Because from what I can tell, the example in https://www.elgaronline.com/edcollchap/edcoll/9781783479917/... is because the customer clicked "I agree" before installing.

▲

merb

2 months ago

[-]

Its different if you do it. Or if a public company is actively encouraging you.

▲

calebkaiser

2 months ago

[-]

The 1 million users + $200 million in revenue probably had something to do with the valuation.

▲

rvz

2 months ago

[-]

Slack was a similar thing. IPO'd at 10 million users with 400M in revenue but got destroyed by Microsoft thanks to Teams.

Cursor is at a worse position and at greater risk of ending up like Slack very quickly and Microsoft will do the exact same thing they did to Slack.

This time by extinguishing (EEE) them by racing prices of VSCode + Copilot close to zero, until it is free.

The best thing Cursor should do is for OpenAI to buy them at a $10B valuation.

▲

krotton

2 months ago

[-]

My current company uses Slack as the main messaging app, while still using Teams as part of the Microsoft package, for its integrations with Outlook and conference hardware and software. And this is a recurring pattern I've seen in other places. Teams is nowhere close to being a usable alternative to Slack.

▲

falcor84

2 months ago

[-]

Wait, what?! Slack got destroyed by Teams? You seem to be living a few parallel universes away from the one I'm in.

▲

parliament32

2 months ago

[-]

Yes, "destroyed" is apt. See the graph under "Slack vs Microsoft Teams: Users" here: https://www.businessofapps.com/data/slack-statistics/

▲

falcor84

2 months ago

[-]

Interesting, but isn't it just because Teams is bundled and integrated with MS Office? I couldn't find any specific stats on revenue or how many businesses actually choose to pay for Teams specifically.

▲

parliament32

2 months ago

[-]

Yes, AFAIK you can't even buy Teams separately, it's only bundled with MS 365. The DAU numbers, however, correspond to people actually online and using Teams for communications, so it doesn't really matter (if we only care about "how many users are using what").

▲

myko

2 months ago

[-]

It's sad but true that many orgs went from Slack to Teams due to Microsoft's monopolistic sales tactics. Sucks because now I use Teams every day.

▲

charcircuit

2 months ago

[-]

Teams has 8 times the monthly active users.

▲

mh-

2 months ago

[-]

Microsoft pushed Teams onto my personal Windows PC in a recent update, as a startup item. And, as far as I could tell, automatically logged me in to my Microsoft account on it.

I'd be very skeptical of their MAU claims.

▲

Dban1

2 months ago

[-]

ARTIFICIAL intelligence

▲

infecto

2 months ago

[-]

Fingers crossed the users will go left are all the noisy ones. I still enjoy using Cursor but their forum is filled with too many posts about $20 being too expensive or they need to raise caps or Cursor is the worse tool in the world.

▲

hsbauauvhabzb

2 months ago

[-]

[flagged]

▲

infecto

2 months ago

[-]

Yes, I’m complaining about complainers. It’s the circle of life on Hacker News, someone always has to play the food chain’s top predator: the mildly annoyed power user.

▲

bigyabai

2 months ago

[-]

Mildly annoyed power users don't even show up on the food-chain until they put their money where their mouth is. Put another way, Cursor cares more about their failed conversions than they prioritize their sycophants that would pay $20/month for a Pepsi and a foot massage.

▲

jimjimjim

2 months ago

[-]

A few hallucinations. It's right more times than it's wrong. Humans make mistakes as well. Cosmic justice.

▲

aarestad

2 months ago

[-]

Yes, but humans can be held accountable.

▲

jimjimjim

2 months ago

[-]

I probably should have added sarcasm tags to my post. My very firm opinion is that AI should only make suggestions to humans and not decisions for humans.

▲

raphman

2 months ago

[-]

I'd argue that humans also more easily learn from huge mistakes. Typically, we need only one training sample to avoid a whole class of errors in the future (also because we are being held accountable).

▲

SparkyMcUnicorn

2 months ago

[-]

As annoying as it is when the human support tech is wrong about something, I'm not hoping they'll lose their job as a result. I want them to have better training/docs so it doesn't happen again in the future, just like I'm sure they'll do with this AI bot.

▲

rurp

2 months ago

[-]

That only works well if someone is in an appropriate job though. Keeping someone in a position they are unqualified for and majorly screwing up at isn't doing anyone any favors.

▲

SparkyMcUnicorn

2 months ago

[-]

Fully agree. My analogy fits here too.

▲

recursive

2 months ago

[-]

> I'm not hoping they'll lose their job as a result

I have empathy for humans. It's not yet a thought crime to suggest that the existence of an LLM should be ended. The analogy would make me afraid of the future if I think about it too much.

▲

mgraczyk

2 months ago

[-]

How is this not an example of humans being held accountable? What would be the difference here if a help center article contained incorrect information? Would you go after the technical writer instead of the founders or Cursor employees responding on Reddit?

▲

submeta

2 months ago

[-]

Wow, so much negativity. The Cursor team has built a wonderful product — and they made a mistake. Now haters are trying to rip it apart because of one mistake. Yeah, I also paid for the yearly plan. Yeah, I was also annoyed to get locked out on another device. Was it annoying? Yes. But will I abandon the product? No, not because of this. Chill, folks. Get a life.

▲

ramon156

2 months ago

[-]

Sorry but this screams copium. Its right in your face why cursor is bad, and your response is "but the product is good". Is it? What did they do better than e.g. aider, claude code and zed? What makes them stand out?

I honestly don't get it, but if you want to support such a lazy team then have at it, no one's stopping you

▲

submeta

2 months ago

[-]

You are aggressive and rude. They didn’t sell you a house. It’s a product that costs 20 usd a month. I know all other products and prefer Claude Code. But it is expensive. I burn 20-40 USD in a single session. And I find Cursor pretty good. Better than the alternatives, minus Claude Code.