FilterHN

    This project uses shared planning documents for collaboration with Claude Code. Please:

    1. First read and understand these files:

       - PLAN.md - current project roadmap and objectives

       - ARCHITECTURE.md - technical decisions and system design

       - TODO.md - current tasks and their status

       - DECISIONS.md - decision history with rationale

       - COLLABORATION.md - handoff notes from other tools

    2. Before making any significant changes, check these documents for:

       - Existing architectural decisions

       - Current sprint priorities

       - Tasks already in progress

       - Previous context from Claude Code

    3. After completing work, update the relevant planning documents with:

       - Task completion status

       - New decisions made

       - Any changes to architecture or approach

       - Notes for future collaboration

    Always treat these files as the single source of truth for project state.

▲

novaleaf

3 hours ago

[-]

problem is that claude doesn't actually read those or keep them in context unless you prompt it to. it has to be in CLAUDE.md or it'll quickly forget about the contents

▲

ako

3 hours ago

[-]

I've added these instructions in CLAUDE.md and .windsurfrules, and yes sometimes you have to remind it, but overall it works quite well.

▲

MrGreenTea

3 hours ago

[-]

Have you thought about adding a session start hook that reads this file and adds it to the context?
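A minimal sketch of what such a hook script could do, assuming the tool runs a command at session start and adds whatever that command prints to the context (check your tool's hook documentation for how to register it). The function name is made up for illustration; the file list comes from the instructions quoted at the top of the thread.

```python
from pathlib import Path

# File names taken from the instructions quoted at the top of the thread.
PLANNING_DOCS = ["PLAN.md", "ARCHITECTURE.md", "TODO.md", "DECISIONS.md", "COLLABORATION.md"]


def build_session_context(root: str, names=PLANNING_DOCS) -> str:
    """Concatenate the planning docs that exist under root, each under a header."""
    parts = []
    for name in names:
        path = Path(root) / name
        if path.is_file():  # silently skip docs the project does not have
            parts.append(f"## {name}\n\n{path.read_text(encoding='utf-8').strip()}\n")
    return "\n".join(parts)


if __name__ == "__main__":
    # Assumption: the hook's stdout is what ends up in the model's context.
    print(build_session_context("."))
```

This only gets the text into the context at the start; it does nothing about the model drifting away from it later in a long session.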

▲

ako

2 hours ago

[-]

Not yet, but that sounds like a good suggestion.

▲

troupo

1 hour ago

[-]

> it has to be in CLAUDE.md or it'll quickly forget about the contents

And then it will promptly forget about CLAUDE.md as well (happened to me on several occasions)

▲

bikeshaving

7 hours ago

[-]

Every time I’ve ever read a {CLAUDE|GEMINI|QWEN}.md I’ve thought all this information could just be in CONTRIBUTING.md instead.

▲

hahajk

7 hours ago

[-]

Yes! I want an option to always add README.md to the context. It would force me to have a useful, up-to-date document about how to build, run, and edit my projects.

▲

tyre

5 hours ago

[-]

You can include in your prompt for it to read the README!

▲

______

4 hours ago

[-]

Ultimately, if this stuff is actually intelligent, it should be using the same sources of information that we intelligent beings use. Feels silly to have to jump through all these hoops to make it work today.

▲

eru

4 hours ago

[-]

> It would force me to have a useful, up to date document about how to build, run, and edit my projects.

Not really: our AI agents are probably smart enough to even make sense of somewhat bad instructions.

▲

dotancohen

58 seconds ago

[-]

Not the case at all. AI agents will happily turn your bad ideas into code.

▲

yunohn

2 hours ago

[-]

They’re definitely not. Claude and all other agents frequently forget the build and test commands present in CLAUDE/etc.md for my various repos (even though most of them were initialized by the AI).

▲

eru

1 hour ago

[-]

Whether Claude and co understand is probably not a great proxy for whether your docs are good for humans.

▲

fastball

7 hours ago

[-]

That sounds nice and I have the same pain, but not sure AGENT.md is the right abstraction either. After all, these models are indeed different and will respond differently even given the same prompting. Not to mention that different wrappers around those models have different capabilities.

e.g. maybe for CURSOR.md you just want to provide context and best practices without any tool-calling context (because you've found it doesn't do a great job of tool-calling), while for CLAUDE.md (for use with Claude Code) you might want to specify tools that are available to it (because it does a great job with tool calling).

Probably best if you have an AGENT.md that applies to all, and then the tools can also ingest their particular flavor in addition, which (if anything is in conflict) would trump the baseline AGENT file.
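A rough sketch of that layering, assuming a wrapper script assembles the prompt itself. The function name and the wording of the override note are hypothetical, and no existing tool is claimed to work this way.

```python
from pathlib import Path


def load_instructions(repo: str, tool_file: str, baseline: str = "AGENT.md") -> str:
    """Return the baseline instructions followed by the tool-specific ones.

    The tool-specific text comes last and is introduced with an explicit note,
    so the model is told which side should win a conflict.
    """
    root = Path(repo)
    sections = []
    base_path = root / baseline
    if base_path.is_file():
        sections.append(base_path.read_text(encoding="utf-8").strip())
    tool_path = root / tool_file
    if tool_path.is_file():
        sections.append(
            f"The following {tool_file} instructions override anything above:\n\n"
            + tool_path.read_text(encoding="utf-8").strip()
        )
    return "\n\n".join(sections)
```

For example, `load_instructions(".", "CLAUDE.md")` would give the shared AGENT.md text first and the Claude-specific additions last. Whether the model actually honors the precedence note is a separate question.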

▲

zarzavat

3 hours ago

[-]

Never. It's a marketing strategy. Some percentage of users will check these files into their repos, and some percentage of repo browsers will think "what is this X.md?" Given how much money people are spending on these things the value of having a unique filename must be enormous.

▲

drdaeman

2 hours ago

[-]

It’s a marketing strategy that works here and now, but “never” is a very long time. What could be seen as pioneers claiming names today could be also seen as retrogressive stubbornness tomorrow and lose its marketing value.

▲

getflourish

2 hours ago

[-]

Brand asset

▲

neuronexmachina

9 hours ago

[-]

I really like the idea of standardizing on AGENT.md, although it's too bad it doesn't really work with the .cursor/rules/ approach of having several rules files that get included based on matching the descriptions or file globs in frontmatter. Then again, I'm not sure if any other agents support an approach like that, and in my experience Cursor isn't entirely predictable about which rules files it ends up including in the context.

I guess having links to supplementary rules files is an option, but I'm not sure which agents (if any) would work well with that.

▲

prmph

10 hours ago

[-]

Yep, that's a peeve of mine. I've resorted to using AGENT.md, and aliasing Claude, Gemini, etc to a command that calls them with an initial instruction to read that file. But of course they will forget after some time.

The whole agentic coding via CLI experience could be much improved by:

- Making it easy to see what command I last issued, without having to scroll up through reams of output hunting for context

- Making it easy to spin up a proper sandbox to run sessions unattended

- Etc.

Maybe for code generation, what we actually need is a code generator that is itself deterministic but uses AI, instead of AI that does code generation.

▲

thehamkercat

10 hours ago

[-]

I think most of them provide an option to change the default file, but it'll be really good if they all can switch to AGENT.md by default

Till then you can also use symlinks

there are issues opened in some repos for this

- Support "AGENT.md" spec + filename · Issue #4970 · google-gemini/gemini-cli

https://github.com/google-gemini/gemini-cli/issues/4970#issu...
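For the symlink route, a small sketch that points tool-specific file names at one AGENT.md. The alias names are only examples from this thread and the helper name is made up; symlinks may need extra privileges on Windows.

```python
import os
from pathlib import Path


def link_agent_files(repo: str, aliases, target: str = "AGENT.md"):
    """Create relative symlinks such as CLAUDE.md -> AGENT.md inside repo.

    Existing files are left untouched; returns the aliases actually created.
    """
    root = Path(repo)
    if not (root / target).is_file():
        raise FileNotFoundError(f"{target} not found in {repo}")
    created = []
    for alias in aliases:
        link = root / alias
        if link.exists() or link.is_symlink():
            continue  # never overwrite a real file or an existing link
        os.symlink(target, link)  # relative target, so the link survives a repo move
        created.append(alias)
    return created
```

Usage would be something like `link_agent_files(".", ["CLAUDE.md", "GEMINI.md"])`, run once from the repo root.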

▲

ximeng

7 hours ago

[-]

https://github.com/anthropics/claude-code/issues/1091

Here for Claude

▲

troupo

1 hour ago

[-]

And then they should standardize on usage rules (an idea in Elixir space: https://hexdocs.pm/usage_rules/readme.html )

▲

codesnik

22 minutes ago

[-]

maybe symlinking will work

▲

manmal

3 hours ago

[-]

In case you haven’t seen this, you can just pipe contents into Claude. Eg

cat AGENT.md | claude

IIRC this saves some tokens.

▲

stpedgwdgfhgdd

1 hour ago

[-]

The deeper problem is the custom commands, hooks and subagents. The time has come to make a strategic choice: once you have invested heavily in CC, it is not easy to switch to an alternative.

Side remark: CC is very expensive when using API billing (compared to e.g. GPT-5). Once a company adopts CC and all developers start to adapt to it at full scale, the bill will go through the roof.

▲

js2

10 hours ago

[-]

That's a more obvious (but less fun) name than what I've been using: ROBOTS.md with symlinks.

▲

thehamkercat

10 hours ago

[-]

https://ampcode.com/AGENT.md#migration

they also suggest using symlinks for now

▲

anp

7 hours ago

[-]

FWIW at least with Claude and Jules on a project I have a decent setup where I put all of the real content in an agents.md and then use “@agents.md” in CLAUDE.md. If all of the tools supported these kinds of context references in markdown it wouldn’t be that hard to have a single source of truth for memory files.
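For tools that don't resolve such references themselves, a preprocessing step could inline them. This is a sketch under the assumption that a reference is a line containing only `@path`; it is not a description of how Claude Code or any other tool implements it, and `expand_refs` is a made-up name.

```python
import re
from pathlib import Path

# A line consisting only of "@some/path.md" is treated as an include.
_REF = re.compile(r"^@(\S+)$")


def expand_refs(path: str, _seen=None) -> str:
    """Return the file's text with @path lines replaced by the referenced file.

    Paths are resolved relative to the including file. A file that was already
    expanded is skipped, which also prevents include cycles.
    """
    file = Path(path).resolve()
    seen = set() if _seen is None else _seen
    if file in seen:
        return ""
    seen.add(file)
    out = []
    for line in file.read_text(encoding="utf-8").splitlines():
        match = _REF.match(line.strip())
        target = file.parent / match.group(1) if match else None
        if target is not None and target.is_file():
            out.append(expand_refs(str(target), seen))
        else:
            out.append(line)  # ordinary text, or a reference we cannot resolve
    return "\n".join(out)
```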

▲

yougotwill

5 hours ago

[-]

Same here: each tool-specific instruction file (VS Code, Cursor, etc.) just says to read AGENTS.md for instructions.

▲

dgunay

7 hours ago

[-]

I just wish the AGENTS.md standard wasn't a single file. I have a lot of smaller context documents that aren't applicable to every task, so I like to throw them into a folder (.ai/ or .agents/) and then selectively cat them together or tell the agent to read them.
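The "selectively cat them together" part might look like this. The folder layout (one `<topic>.md` per topic) and the function name are assumptions for illustration.

```python
from pathlib import Path


def gather_context(folder: str, topics) -> str:
    """Join the chosen topic files (e.g. 'testing' -> testing.md) from folder."""
    chunks = []
    for topic in topics:
        path = Path(folder) / f"{topic}.md"
        if not path.is_file():
            raise FileNotFoundError(f"no context document for topic {topic!r}")
        # Label each chunk so it is clear where a piece of guidance came from.
        chunks.append(f"<!-- {path.name} -->\n{path.read_text(encoding='utf-8').strip()}")
    return "\n\n".join(chunks)
```

Something like `gather_context(".agents", ["testing", "database"])` then yields one blob to pipe into whichever agent is in use.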

▲

esafak

6 hours ago

[-]

You can put them in subdirectories like CODEOWNERS files. https://ampcode.com/AGENT.md#multiple-agent-files

▲

what

4 hours ago

[-]

>When multiple files exist, tools SHOULD merge the configurations with more specific files taking precedence over general ones.

How is the tool supposed to merge multiple md files?

▲

esafak

4 hours ago

[-]

The same way it synthesizes anything else. We're talking about LLMs here; this is their bread and butter.
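The tool still has to decide which files to hand over and in what order. One naive reading of "more specific files take precedence" is to walk from the repo root down to the working directory and feed the files in that order. The function below is a sketch of that reading with a hypothetical name, not something taken from the spec.

```python
from pathlib import Path


def collect_agent_files(repo: str, working_dir: str, name: str = "AGENT.md"):
    """List AGENT.md files from the repo root down to working_dir.

    The most general file comes first and the most specific last, so later
    text can be presented as taking precedence.
    """
    root = Path(repo).resolve()
    current = Path(working_dir).resolve()
    if current != root and root not in current.parents:
        raise ValueError("working_dir must be inside repo")
    chain = [current, *current.parents]
    dirs = chain[: chain.index(root) + 1]  # drop everything above the repo root
    return [d / name for d in reversed(dirs) if (d / name).is_file()]
```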

▲

troupo

1 hour ago

[-]

Elixir has this idea with usage rules: https://hexdocs.pm/usage_rules/readme.html

▲

stillsut

9 hours ago

[-]

Yeah I suspect some of these providers will become Microsoft-in-the-'90s type bully holdouts on implementing the emerging conventions. But ultimately with a CLI interface you have workarounds to get all the major providers to read in your system guidelines. But in an IDE (e.g. like MS had with Visual Studio) you have more lock-in potential for your config files.

Yesterday, I was writing about a way I found to pass the same guideline documents into Claude, Gemini, and Aider CLI-coders: https://github.com/sutt/agro/blob/master/docs/case-studies/a...

▲

rapind

9 hours ago

[-]

Isn't this just a symlink?

▲

mrits

6 hours ago

[-]

I'm at a point where I symlink different sets of docs to try to focus context so much that I feel like maybe I need a git submodule with different branches of context I want. I left managing people to now manage AI.

▲

ijidak

10 hours ago

[-]

Agree. It's all English. That's the whole point of these tools.

Why are we purposely creating CLI dialects?

▲

stonecharioteer

7 hours ago

[-]

Symlink $AGENT.md to AGENT.md in your repo.

▲

irrationalfab

10 hours ago

[-]

Thank you so much for this!

▲

sneak

10 hours ago

[-]

When they stop getting desperate for differentiation by spamming their brand advertising in your repo against your will.

Claude Code likes to add "attribution" in commit messages, which is just pure spam.

▲

striking

10 hours ago

[-]

You can turn it off: https://docs.anthropic.com/en/docs/claude-code/settings#avai... `includeCoAuthoredBy: false`
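If you'd rather script that than edit JSON by hand, here is a small sketch. The key name is the one given above; which settings file to pass in (for example a project-level `.claude/settings.json`) is an assumption, and the linked docs are the authority on where settings live.

```python
import json
from pathlib import Path


def disable_coauthor_line(settings_path: str) -> dict:
    """Set includeCoAuthoredBy to false in a JSON settings file, keeping other keys."""
    path = Path(settings_path)
    settings = json.loads(path.read_text(encoding="utf-8")) if path.is_file() else {}
    settings["includeCoAuthoredBy"] = False
    path.parent.mkdir(parents=True, exist_ok=True)  # create the folder on first use
    path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```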

▲

what

4 hours ago

[-]

It’s not spam, it lets people know it was written by an LLM and maybe you should look closer at it.

▲

mrcwinn

7 hours ago

[-]

I went from years of vscode to "Cursor is the future" to never using Cursor at all. Claude Code, even with new limits, is just too good. If I were to switch to gpt-5, why wouldn't I just use Codex? I'm struggling to understand the value of what they're presenting.

▲

LeafItAlone

5 hours ago

[-]

I find the Codex CLI to be the worst of the CLI tools I’ve used (including, but not limited to, Claude Code, Gemini, Aider). There’s something about it that makes it clunky. Haven’t tried Cursor CLI yet though.

▲

gpeal

3 hours ago

[-]

We (Codex) shipped a pretty large CLI update today and have many more improvements coming. Give it a try if you haven't.

https://x.com/OpenAIDevs/status/1953559797883891735 (0.19 now)

▲

cadamsdotcom

17 minutes ago

[-]

Tried your latest version - thanks for posting about it.

Codex needs plan mode (shift-tab in Claude Code)

And Codex needs the prompt to always be available. So you can type to the model while it’s working & have it eventually receive the message and act on it, instead of having to Ctrl-C to interrupt it before typing. Claude Code’s prompt is always ready to type at - you can type while it is working. That goes a long way towards it feeling like it cares about the user.

▲

chrisvalleybay

1 minute ago

[-]

Thanks for mentioning this. These are the kinds of features that are 100% required for me to even consider Codex.

▲

LeafItAlone

2 hours ago

[-]

Thanks for the heads up. I’ll check it out!

I don’t visit Twitter links. Why not a link to the GitHub changelog?

Also, as an aside since you are on the team - the organization verification is frustrating in that the docs indicate:

>You must not have recently verified another organization, as each ID can only verify one organization every 90 days.

I champion OpenAI at my work, so naturally I’d be the one to verify there. But I apparently can’t, because I verify for my personal-led org. That gets in the way of me proselytizing gpt-5 based coding tools (such as, possibly, Codex CLI).

▲

cheema33

1 hour ago

[-]

Another +1 to this. Some of us are unwilling to click on a Twitter link. Link to changelog would be more appropriate.

▲

kristjansson

2 hours ago

[-]

Huge +1 to this. We have two orgs at work (for separate budget/rate limit blast radii) and had to get two people to verify this morning…

▲

tough

1 hour ago

[-]

nice! auto-updates like open code could help to not have to remember to update

loving the animations and todos so far

also gpt-5 is just great at agentic stuff

▲

teaearlgraycold

7 hours ago

[-]

Why is Claude Code better than Cursor?

▲

meowtimemania

4 hours ago

[-]

My company has a huge codebase, for me cursor would freeze up / not find relevant files. Claude code seems able to find the right files by itself.

I seem to always have better outcomes with Claude code.

▲

JyB

5 hours ago

[-]

Because iterating multiple sessions through multiple terminals is obviously more efficient and seamless than interacting through a scuffed IDE side panel UI.

▲

theshrike79

2 hours ago

[-]

Claude Code has some non-LLM magic in it that just makes it better for code in general, despite (or because of) having minimal IDE integration.

▲

fastball

7 hours ago

[-]

In my experience, it is much better at tool-calling, which is huge when we're talking about agentic coding. It also seems to do a better job of keeping things clean and not going off on tangents for anything that isn't accomplished in one shot.

▲

benbayard

6 hours ago

[-]

I have had the exact opposite experience. Claude Code in any meaningful codebase for me gets stuck in loops of doing the wrong thing. Then when that doesn't work it deletes files and makes its own that don't have the problem it's encountering.

Cursor on the other hand, especially with GPT-5 today but typically with Sonnet 4.1, has been a workhorse at my company for months. I have never had Claude Code complete a meaningful ticket once. Even a small thing like fixing a small bug or updating the documentation on the site.

Would love any tips on how to make Claude Code not a complete waste of electricity.

▲

alwillis

3 hours ago

[-]

> Cursor on the other hand, especially with GPT-5 today but typically with Sonnet 4.1

You probably mean Opus 4.1; there's no Sonnet 4.1 yet.

▲

benbayard

3 hours ago

[-]

Yes that’s correct.

▲

user3939382

4 hours ago

[-]

If you don’t know how to divide a problem up given a toolset you won’t be able to solve it regardless of what those tools are. Maybe Cursor’s interface is more intuitive for you.

▲

benbayard

3 hours ago

[-]

The problems I’ve given CC are incredibly simple and basic, things I knew how to fix immediately. I would tell it the file to change and how to change it. And it will get lost when the types are incorrect, or when it causes a test to fail. It will, like, just delete the test.

I don’t doubt I could improve my prompts but I don’t have those same prompting problems with cursor.

▲

JyB

5 hours ago

[-]

Better prompts?

▲

alwillis

3 hours ago

[-]

> Better prompts?

I think you're right.

People getting really poor results probably don't recognize that their prompts aren't very good.

I think some users make assumptions about what the model can't do before they even try, so their prompts don't take advantage of all the capabilities the model provides.

▲

benbayard

3 hours ago

[-]

I don’t really have a problem prompting cursor with the same models. But I have no doubt my prompts could be improved

▲

cft

3 hours ago

[-]

Opposite experience. I worked with Claude code a lot, then switched to Cursor and then tried to switch back and discovered that CC often gets stuck in loops. Cursor just works. It definitely helps that I can switch the foundational models in Cursor when it gets stuck.

▲

plantain

3 hours ago

[-]

CC just feeds the whole codebase and entire files into the model, no RAG, nothing in the way. It works substantially better because of that, but it's $expensive$.

▲

nlh

4 hours ago

[-]

What I have found Claude Code is extremely good at is that it makes one change at a time, gives you a chance to read the code it's changing, and lets you give feedback in real time and steer it properly. I find the mental load with this method to be MUCH lower than Cursor or any of the other tools which give you two very different options: "Ask" mode which dumps a ton of suggestions on you and then requires semi-manual implementation, or "Agent" mode which dumps a ton of actual changes on you and requires your inspection and feedback and roll-backs, etc.

This may not work for everyone, but as a solo dev who wants to keep a real mental model of my work (and not let it get polluted with AI slop), the Claude Code approach just works really well for me. It's like having a coding partner who can iterate and change direction as you talk, not a junior dev who dumps a pile of code on your plate without discussion.

▲

anxman

1 hour ago

[-]

+1 to this. Cursor's Agent feels too difficult to wrangle. CC is easier to monitor.

▲

amclennon

11 hours ago

[-]

At this point, there are more AI coding agents announced every week than Javascript frameworks, but to be honest, I'm here for it.

▲

mirkodrummer

1 hour ago

[-]

Think how much training has been done on such JavaScript frameworks... no one stops to wonder what the outcome would be. The mere fact that when I ask it to create an app, without any further detail about what to use, it defaults to React is, imo, a total failure, whatever the agent.

▲

wilg

10 hours ago

[-]

Think how many JavaScript frameworks can be vibe coded now!

▲

kylecordes

10 hours ago

[-]

(This is an exaggeration:)

Sure, you can have your LLM code with any JavaScript framework you want, as long as you don't mind it randomly dropping React code and React-isms in the middle of your app.

▲

throwup238

10 hours ago

[-]

It’s not a real JS framework without JSX support and Typescript types that generate page long errors.

▲

touristtam

2 hours ago

[-]

To be honest, I am being positive, and hopefully we'll see an explosion of AI agents that will help iron out all the bugs in FOSS hosted on different source code hosting platforms. Renovate on steroids. I would work on that if my day job wasn't my main and only source of revenue.

▲

bloppe

31 minutes ago

[-]

Ask a FOSS maintainer and they will not be nearly as optimistic about AI reducing the number of bugs. A lot of AI-generated pull requests are broken or useless and end up wasting a lot of the maintainers' time.

▲

irrationalfab

10 hours ago

[-]

Ironically, LLMs might make it very hard for new frameworks to gain popularity since they are trained on the popular ones.

▲

alwillis

3 hours ago

[-]

If we're not there already, it's just a matter of time before LLMs will be able to read and understand a framework they haven't seen before and be able to use it anyway.

LLMs are already trained on JavaScript at a deep level; as LLM reasoning and RAG techniques improve, there will be a time in the not-too-distant future when an LLM can be pointed to the website of a new framework and be able to use it.

▲

stavros

10 hours ago

[-]

Why would we create a framework to make coding easier when nobody writes code by hand any more?

▲

tayo42

10 hours ago

[-]

Make one that's optimal for AI somehow

▲

irrationalfab

10 hours ago

[-]

Like convex.dev

▲

fullstackwife

10 hours ago

[-]

The concept of a JS framework that lets you rapidly develop an app has the same underlying vibe as a coding agent.

▲

phren0logy

12 hours ago

[-]

Holy moly. I did not see that coming, but it makes sense. I’m enjoying the terminal-based coding agents way more than I ever would have expected. I can keep one spinning in the background while I do #dayjob, and as a bonus I feel like a haX0r.

2025 is the year of the terminal, apparently?

For my prototype purposes, it’s great, and Claude Code is the most fun I’ve had with tech in a jillion years.

▲

tsvetkov

11 hours ago

[-]

Fascinating to see how agents are redefining what IDEs are. This was not really the case in the chat AI era. But as autonomy increases, the traditional IDE UI becomes a less important form of interaction. I think those CLI tools have a pretty good chance to create a new dev tools ecosystem. Creating a full-featured language plugin (let alone a full IDE) for VSCode or IntelliJ is not for the faint-hearted, and cross-IDE portability is limited. CLI tools + MCP can be a lot simpler, more composable and more portable.

▲

cheschire

6 hours ago

[-]

IDE UI should shift to focusing on catching agentic problems early and obviously, and providing drop dead simple rollback strategies, parallel survival-of-the-fittest solution generation, etc

▲

lherron

10 hours ago

[-]

With all the frontier labs competing in this space now, and them letting you use your consumer subscription through the CLI, I don’t understand how the Cursor products will survive. Why pay an extra $X/mo when I can get this functionality included in the $Y/mo I’m already paying OAI/Anthropic/GOOG?

▲

risho

6 hours ago

[-]

I think the complete opposite. I love the ux for claude code, but it would be better if it wasnt locked to a single vendor's model. It seems pretty clear to me that a vendor neutral product with a UX as good as Claude Code would be the clear winner.

▲

MrGreenTea

3 hours ago

[-]

Have you tried opencode? I haven't really, but it can use your Anthropic subscription and also switch to most other models. It also looks quite nice IMO.

▲

didibus

9 hours ago

[-]

I'm actually starting to think the opposite.

If Cursor can build the better UX for all the use-cases, mobile/desktop chatbot, assistant, in IDE coding agent, CLI coding agent, web-based container coding agent, etc.

In theory, they can spend all their resourcing on this, so you could assume they could have those be more polished.

If they win the market share here, then the models are just a commodity; Cursor lets you pick whichever is best at any given time.

In a sense, "users" are going to get locked in on the tooling. They learn the commands, configuration, and so on of Cursor, it's a higher cost for them to re-learn a different UX. Uninstalling and re-installing another app, plugin, etc. is annoying.

▲

lvl155

9 hours ago

[-]

No, model providers are not going to let Cursor eat their pie. The biggest cost in AI is in developing LLM models and inference. Players incurring those costs will basically control this market.

▲

didibus

6 hours ago

[-]

I don't think we'll have more than 2 players. I think it's like AMD and Intel, the LLM is almost like providing hardware. The software that exposes the LLM capabilities to the user is the layer that will be able to differentiate.

The models are just going to be fighting performance/cost. And people will choose the best performance for their budget.

And that's ignoring how good local models are getting as well.

It's not that they'll have their lunch eaten by Cursor, it's just that they can't be as focused on user experience when they're also laser-focused on improving the models to stay competitive.

▲

vineyardmike

9 hours ago

[-]

I agree that cursor has to take an aggressive and differentiated approach to succeed, but they have the benefit of pushing each lab into a commodity.

I pay for Cursor and ChatGPT. I can imagine I’d pay for Gemini if I used an android. The chat bots (1) won’t keep the subscription competitive with APIs because the cost and usage models are different and (2) most chat bots today are more of a UX competition than model quality. And the only winners are ChatGPT and whatever integrated options the user has by default (Gemini, MSFT Copilot, etc).

▲

impulser_

10 hours ago

[-]

Because you can always use the best model. Yesterday it was Claude Opus 4.1, today it's GPT-5. If you were just paying Anthropic, you would be stuck with Claude.

▲

lherron

10 hours ago

[-]

Yeah but I still want a general purpose chatbot subscription also. So I’d have to buy Cursor + something else.

I guess Cursor makes sense for people who only use LLMs for coding.

▲

byronic

8 hours ago

[-]

I'm having trouble finding a use for this outside of virtualized unused environments. Why not instead give me a virtual machine that runs this in a confined storage space?

I would _never_ give an LLM access to any disk I own or control if it had anything more than read permissions

▲

alwillis

2 hours ago

[-]

For example, Gemini CLI [1] can use native sandboxing on macOS. It's just a matter of time before every major coding agent will run inside of an operating system's native sandbox/container/jail/VM.

[1]: https://github.com/google-gemini/gemini-cli/blob/main/docs/c...

▲

extr

7 hours ago

[-]

Why not? Have you ever actually used these things? The risk is incredibly low. I run claude code with zero permissions every day for hours. Never a problem.

▲

byronic

6 hours ago

[-]

I have (not an exhaustive list) SSH keys and sensitive repositories hanging out on my filesystem. I don't trust _myself_ with that, let alone an LLM, unless I'm running ollama or similar local nonsense with no net connectivity.

I'm a few degrees removed from an air gapped environment so obviously YMMV. Frankly I find the idea of an LLM writing files or being allowed to access databases or similar cases directly distasteful; I have to review the output anyway and I'll decide what goes to the relevant disk locations / gets run.

▲

Touche

6 hours ago

[-]

They don't have arbitrary access over your file system. They ask permission for doing most everything. Even reading files, they can't do that outside of the current working directory without permission.

▲

mark_undoio

52 minutes ago

[-]

I'm pretty comfortable with the agent scaffolding just restricting directory access but I can see places it might not be enough...

If you were being really paranoid then I guess they could write a script in the local directory that then runs and accesses other parts of the filesystem.

I've not seen any evidence an agent would just do that randomly (though I suppose they are nondeterministic). In principle maybe a malicious or unlucky prompt found somewhere in the permitted directory could trigger it?
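For reference, the directory restriction itself is roughly this kind of check. This is a sketch of the general technique, not any agent's actual code, and `is_inside` is a made-up name.

```python
from pathlib import Path


def is_inside(base: str, candidate: str) -> bool:
    """True if candidate, after resolving '..' and symlinks, lies under base."""
    base_path = Path(base).resolve()
    # Relative candidates are interpreted against base; absolute ones stand alone.
    target = (base_path / candidate).resolve()
    return target == base_path or base_path in target.parents
```

A check like this covers the paths the scaffolding itself reads or writes. It says nothing about what a script the agent wrote does once it is executed, which is the gap described above.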

▲

swader999

6 hours ago

[-]

You're obviously skilled; spending the money on a Claude-only machine would pay for itself in less than three weeks. If I were your employer, it would be a no-brainer.

▲

byronic

5 hours ago

[-]

Make me that offer :D

▲

LeoPanthera

11 hours ago

[-]

That's funny. I was really hoping that Anthropic would make a "Claude GUI".

▲

consumer451

10 hours ago

[-]

In one of their Claude Code talks they said it didn’t seem worth it, given their expectation that all IDEs will become obsolete by next year.

▲

kridsdale3

9 hours ago

[-]

Xcode pretty much hung up their hat this year, and threw in with Claude.

▲

pizza

7 hours ago

[-]

If I'm not mistaken, it may be feasible to build one with the Claude Code sdk

▲

didibus

9 hours ago

[-]

Isn't that Claude Desktop?

▲

eagerpace

5 hours ago

[-]

I really like the IDE. It makes enough mistakes that I need to be constantly testing and catching little errors. I’ll interrupt the flow often when it’s going down a path I don’t want it to. When using Codex, for example, it’s doing too much in the background that is harder to correct afterwards. Am I doing this wrong?

▲

treve

4 hours ago

[-]

People have preferred either the terminal or chunky IDEs for decades. Neither are wrong.

▲

unsupp0rted

12 hours ago

[-]

What's the benefit of this compared to the IDE? To be more like Claude Code?

▲

gorjusborg

11 hours ago

[-]

Flip your thinking around for a second and consider why an IDE is required for an agent that codes for you?

The IDE/editor is for me, the agent doesn't need it. That also means I am not forced to use whatever imperfect forked IDE the agent is implemented against.

▲

worldsayshi

11 hours ago

[-]

> why an IDE is required for an agent that codes for you

Because the agents aren't yet good enough for a hands off experience. You have to continuously monitor what it does if you want a passable code base.

▲

tsvetkov

10 hours ago

[-]

Sure, but monitoring, reviewing and steering do not really require modern IDEs in their current form. Also, I'm sure agents can benefit from parts of IDE functionality (navigation, static analysis, integration with build tools, codebase indexing, ...), but they sure don't need the UI. And without the UI those parts can become simpler, more composable and more portable (being compatible with multiple agent tools). IMO another way to think about CLI agentic coding tools is as a new form of IDE.

▲

imp0cat

2 hours ago

[-]

As was already mentioned elsewhere, Emacs + Magit to monitor incoming changes is a great combo.

▲

dagss

1 hour ago

[-]

Whenever I have to take the wheel myself the AI tab completion makes it much smoother so I am kind of addicted to that. Semi-automatic mode.

I would much rather use IntelliJ so perhaps my habits will change at some point, but right now I am stuck with Cursor/vscode for the tab completion.

▲

stavros

10 hours ago

[-]

I don't really need an IDE, but I do need a great code review interface.

▲

Touche

6 hours ago

[-]

I use lazygit for that. But any diff tool you like will work.

▲

Xenoamorphous

10 hours ago

[-]

As someone who hasn’t used Claude Code yet, can’t you configure it somehow to use a different tool of your liking, or it has to be in the cli?

▲

stavros

9 hours ago

[-]

I end up using the VCS tooling (lazygit for me), but coding agents really need to be integrated with this review environment. We need an extra step where the agent will group its changes into logical units (database models in one commit, types in another, business logic in another, tests in another), rather than having to review per-file.
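A crude version of that grouping step can be done on file paths alone. The patterns and group names below are hypothetical and would need adjusting to the repository at hand; a real tool would presumably look at the diff content too.

```python
from collections import defaultdict
from fnmatch import fnmatch

# Hypothetical layout: adjust the patterns to the repository at hand.
GROUPS = [
    ("tests", ["tests/*", "*_test.py"]),
    ("database models", ["*/models/*", "*/migrations/*"]),
    ("types", ["*/types/*", "*.pyi"]),
]


def group_changes(paths, groups=GROUPS, default="business logic"):
    """Bucket changed file paths into review units; first matching group wins."""
    buckets = defaultdict(list)
    for path in paths:
        label = next(
            (name for name, patterns in groups if any(fnmatch(path, p) for p in patterns)),
            default,
        )
        buckets[label].append(path)
    return dict(buckets)
```

Each bucket could then become its own commit, so the review happens per logical unit instead of per file.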

Programming has changed from writing code to reviewing/QAing and reprompting, but the tooling hasn't yet caught up with that workflow. We need Gerrit for coding agents, basically.

▲

fooster

4 hours ago

[-]

I just merge the change and review the diff. If it’s wrong I either revert or ask Claude to fix it.

▲

bangaladore

11 hours ago

[-]

Many of these companies are realizing that mainline VSCode is a moat of sorts. I and many people I know won't use any of these that require forking VSCode.

With the benefit that you can also pull in people who don't like using VSCode such as people who use Jetbrains or terminal based code editors.

▲

nojs

9 hours ago

[-]

So you can use an IDE other than VS code.

▲

jstummbillig

11 hours ago

[-]

I am so curious to know. Why is Cursor not just putting whatever this supposedly does better into... Cursor?

▲

anthonypasq

10 hours ago

[-]

i dont think it actually does anything better than the chat window in the editor. its strictly worse tbh. it just lets you not be tied to a VSCode interface for editing. im sure Jetbrains diehards would very much appreciate this, but honestly i will find it hard to utilize given the fact Cursor's tab auto-complete is so amazing.

▲

jonplackett

11 hours ago

[-]

To compete with Claude code

▲

jstummbillig

11 hours ago

[-]

They are competing with Claude Code already. The competition is not over who can build the nicest CLI.

▲

sblawrie

11 hours ago

[-]

You can spin up the Cursor CLI inside the terminal of your IDE of choice and not be tethered to Claude's models.

▲

zaphirplane

11 hours ago

[-]

Is there a better agent than the Anthropic one?

▲

NitpickLawyer

4 hours ago

[-]

Depends how you define "better". Quality/breadth of tasks/capabilities? Probably not (TBD how gpt5 will fare, colleagues were saying that it was better at some frontend tasks than claude4 in the alpha/beta horizon tests).

But if you take speed/availability/cost into account, there might be "better" offers out there. I did some tests w/ windsurf when they announced their swe1 and swe1-lite models, and the -lite could handle easy tasks pretty well. I also tested 4.1-mini and 4.1-nano. There are tasks that I could see them handle reliably enough to make sense (and they're fast, cheap and don't throttle you).

▲

alwillis

2 hours ago

[-]

You can already use non-Anthropic models with Claude Code with tools like Claude Code Router [1].

[1]: https://github.com/musistudio/claude-code-router

▲

ribeyes

8 hours ago

[-]

i'm betting on cursor being the long-term best toolset.

1. with tight integration between cli, background agent, ide, github apps (e.g. bugbot), cursor will accommodate the end-to-end developer experience.

2. as frontier models internalize task routing, there won't be much that feels special about claude code anymore.

3. we should always promote low switching costs between model providers (by supporting independent companies), keeping incentives toward improving the models not ui/data/network lock-in.

▲

postalcoder

7 hours ago

[-]

i’d respectfully bet against this.

cursor and 3rd party tools, unless they make their own superior foundation model, will always have to fight the higher marginal cost battle. This is particularly bad insofar as they offer fixed pricing subscriptions. That means they’re going to have to employ more context saving tricks which are at odds with better performance.

If the cost economics result in Cursor holding, say, 20% fewer tokens in context versus model-provider coding agents, they will necessarily get worse performance, all things equal.

Unless Cursor offers something dramatically different outside of the basic agentic coding stack it’s hard to see why the market will converge to cursor.

▲

blueblisters

5 hours ago

[-]

> we should always promote low switching costs between model providers (by supporting independent companies), keeping incentives toward improving the models not ui/data/network lock-in

You’re underestimating the dollars at play here. With cursor routing all your tokens, they will become a foundation model play sooner than you may think

▲

TechDebtDevin

5 hours ago

[-]

You're allowing them to train on your code?

▲

blueblisters

5 hours ago

[-]

The code isn’t the valuable part. They know all the most common workflows and failure modes, allowing them to create better environments for training agentic models

▲

ramoz

8 hours ago

[-]

Happy to short that bet, as I think agentic harnesses will be molded along the RL training of the actual model. Tony + the suit, created together. That's why Claude in Claude Code became existential for Cursor, and why Cursor moved quickly to go agentic and build up with OpenAI in a big headline way here.

Unless they pair up with OpenAI or Meta.

▲

snthpy

3 hours ago

[-]

What differentiates the CLI tools at this point and makes you prefer one over the other?

opencode and Crush can use any model, so apart from a nicer visual experience, are there any aspects that actually make you more productive in one vs the other?

▲

ankit219

8 hours ago

[-]

I think CLI is a good idea for now. Next abstraction seems to be Github PRs where someone (likely me) files an issue/feature, then I click a button, and the agent fixes the issue/feature. Github has talked about something similar, but it sure was a pain to figure out whether it was GA and whether I had access to it, given how many different variations they have called gh copilot. (PS: it exists, but not as smooth as I described: https://docs.github.com/en/copilot/how-tos/use-copilot-agent... )

▲

imp0cat

2 hours ago

[-]

You can already have that with Jules. It's quite impressive.

https://jules.google/

▲

g42gregory

2 hours ago

[-]

Boris Cherny was a (main?) creator of Claude Code at Anthropic. He moved over to Cursor about a month ago. I hope Cursor CLI is a port of the Claude Code agent to Cursor. Hopefully, the code quality will be comparable, modulo Cursor's abridged model access. We will know shortly.

▲

mliker

2 hours ago

[-]

He actually returned to Anthropic shortly after joining Cursor

▲

risho

11 hours ago

[-]

is there a way to get it to display more information? it's stuck not doing anything and i can't tell if that's because it timed out or it is running a script or it is thinking or what is even happening. sometimes it just does things without even giving any feedback at all. i don't know what it is thinking or what it is trying to do and i can't really see the output of the terminal commands it is running. it just pauses every once in a while and asks to run a command.

is there a way to make it more verbose?

▲

joshmlewis

11 hours ago

[-]

I noticed it was taking a while on the first large-ish task I gave it. I'm assuming it was just a bit overloaded at the moment.

▲

daviding

7 hours ago

[-]

Can you pick thinking models with this or is that implied?

GPT-5 seems a bit slow so far (in terms of deciding and awareness). I’ve gone from waiting for a compiler, to waiting for assets to build to now waiting for an agent to decide what to do - progress I guess :)

▲

cheema33

10 hours ago

[-]

My first thought was, "meh, I already have Claude Code". But then I remembered my primary frustration with Claude Code. I need other LLMs to be able to validate Claude Code's assumptions and work. I need to do this in an automated way. Before Cursor CLI, I did not have a way to programmatically ask Cursor to do this. It was very manual, very painful. But now I can create a Claude Code agent that is a "cursor-specialist" that uses Cursor CLI to do all of that in an automated way.
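A rough sketch of that kind of automated cross-check, under stated assumptions: `second_opinion` and `reviewer_cmd` are hypothetical names, and `reviewer_cmd` is just an argv list for whatever non-interactive CLI acts as the second reviewer. No specific Cursor CLI flags are assumed here; the demo uses the Python interpreter as a stand-in reviewer so the sketch runs on its own.

```python
import subprocess
import sys


def second_opinion(diff_text, reviewer_cmd, timeout=300):
    """Pipe a diff to an external reviewer command and return its reply.

    reviewer_cmd is a placeholder: an argv list for any CLI that reads a
    prompt on stdin and writes its answer to stdout.
    """
    prompt = "Review this diff. List wrong assumptions and bugs.\n\n" + diff_text
    result = subprocess.run(
        reviewer_cmd,
        input=prompt,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise if the reviewer exits with a non-zero status
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Stand-in reviewer: echoes the prompt back in upper case.
    fake_reviewer = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"]
    print(second_opinion("+ return a - b  # was a + b", fake_reviewer))
```

Keeping the reviewer as a plain command list means the same function could be pointed at a different model's CLI without touching the calling code.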

▲

good8675309

10 hours ago

[-]

Interesting, are you saying you would set up a Stop Hook in Claude Code that calls the Cursor CLI to have it validate and prompt Claude Code with further instructions?

▲

cyounkins

8 hours ago

[-]

Could anyone compare this with Claude Code and aider?

▲

afro88

11 hours ago

[-]

Claude Code but can use GPT-5 built in. Not a bad selling point

▲

jasonjmcghee

10 hours ago

[-]

Claude Code can use GPT-5 via LiteLLM

https://docs.anthropic.com/en/docs/claude-code/llm-gateway#l...

https://docs.litellm.ai/docs/tutorials/claude_responses_api

▲

huydotnet

11 hours ago

[-]

and access to Cursor's background agent on the web as well, like ChatGPT Codex. So at this point, I already regret cancelling my Cursor subscription

▲

jameskraus

11 hours ago

[-]

I wonder if this will support directly interfacing with OpenAI's APIs vs. going through Cursor's APIs (and billing).

▲

joshmlewis

11 hours ago

[-]

I would highly doubt it. Even when you BYOK inside of Cursor they still say it's routed through their servers.

▲

ayerajath

5 hours ago

[-]

has to be, given the hype surrounding claude code, a few of them are using claude code just cause it's terminal based.

▲

macawfish

11 hours ago

[-]

Hopefully this one is as good as Claude code. None of them that I've tried have come close yet.

▲

lvl155

9 hours ago

[-]

Have you tried opencode?

▲

macawfish

7 hours ago

[-]

Yeah, opencode and crush. I'm gonna give Claude code router a good try soon.

▲

thornewolf

10 hours ago

[-]

They realized that CLI is the much better interface for these kinds of tasks.

▲

rtuin

10 hours ago

[-]

It seems they haven’t implemented MCP client features in Cursor CLI yet

▲

Rhubarrbb

9 hours ago

[-]

Does it work with local LLMs like through Ollama or llama.cpp?

▲

daft_pink

10 hours ago

[-]

Is the pricing any good?

▲

blitzar

11 hours ago

[-]

Pivot to CLI

▲

cpursley

10 hours ago

[-]

There are certainly some lessons here that go beyond coding agents (when it comes to shipping products).

▲

asadm

11 hours ago

[-]

seems pretty basic. I don't see anything unique here. I am happy with my Gemini CLI.

▲

teaearlgraycold

6 hours ago

[-]

I’m mostly going to use this as a convenient way to run ffmpeg. Previously I’d need to open Cursor and ask for commands in the terminal there.

▲

kamatour

8 hours ago

[-]

So we’re all just waiting for AGENT.md to become the new README, huh? I’m ready when the agents are.

▲

alessandrorubio

11 hours ago

[-]

Wouldn't it be better to just use the Warp AI solution at this point?

▲

buremba

10 hours ago

[-]

Only if it would work. I think they miss a big opportunity here by (1) not caring about security at all, (2) trying to develop their own model and only make it available in the cloud.

▲

didibus

9 hours ago

[-]

What's the difference between Warp and just opening multiple tabs in my terminal?

▲

htrp

11 hours ago

[-]

They are all clones of gemini cli at this point?

▲

hollerith

11 hours ago

[-]

Since Gemini CLI was released under the Apache license, a clone is easy to make.

▲

twapi

11 hours ago

[-]

Claude Code finally has a serious competitor.

▲

asdfologist

10 hours ago

[-]

Not sure. So far Reddit seems largely negative on Cursor CLI + GPT-5

https://www.reddit.com/r/cursor/comments/1mk8ks5/discussion_...

▲

wahnfrieden

5 hours ago

[-]

They are all using mid-tier gpt-5 variants (not the "-high" one that's hidden by default, not gpt-5-thinking) and don't realize it

▲

lvl155

9 hours ago

[-]

Seriously Cursor. You can’t just write wrappers all your life. VSCode wrapper and now Gemini CLI wrapper. Can you make something from scratch for once? It’s as if they want an exit and they’re putting in minimum effort until that materializes.

▲

AdieuToLogic

6 hours ago

[-]

When I saw this, the question which immediately came to mind was:

  Who would turn loose arbitrary commands (content)
  generated by an LLM onto their filesystem?

Then I saw the installation instructions, which are:

  curl https://cursor.com/install -fsS | bash

And it made sense.

Only those comfortable with installing software by downloading shell commands from an arbitrary remote web site and immediately executing them would use it.

So what then is the risk of running arbitrary file system modifications generated from a program installed via arbitrary shell commands? None more than what was accepted in order to install it.

Both are opaque, unreviewed, and susceptible to various well known attacks (such as a supply chain attack[0]).

0 - https://en.wikipedia.org/wiki/Supply_chain_attack
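One common way to reduce (not remove) the risk described above is to save the install script to a file, read it, and only run bytes that match a checksum recorded after that review. The sketch below assumes you pin the digest yourself; nothing here implies that Cursor publishes an official checksum, and `sha256_hex` and `is_trusted` are illustrative names.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def is_trusted(script: bytes, pinned_sha256: str) -> bool:
    """True only if the script matches a digest pinned after a manual review."""
    return hmac.compare_digest(sha256_hex(script), pinned_sha256.lower())


if __name__ == "__main__":
    # Stand-in for a script that was downloaded to disk and read by a human.
    reviewed = b"#!/bin/bash\necho installing\n"
    pinned = sha256_hex(reviewed)

    # A later download that differs by even one line no longer matches.
    later_download = reviewed + b"echo something-new\n"
    print("reviewed copy trusted:", is_trusted(reviewed, pinned))
    print("later download trusted:", is_trusted(later_download, pinned))
```

This only guards against the script changing after you read it; it does nothing about whatever the installed program downloads or runs afterwards.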

▲

adhamsalama

14 minutes ago

[-]

I couldn't even install Cursor on Ubuntu. The issue still exists. Why didn't they ask the AI to fix it?