The problem was that context kept disappearing between tasks. With multiple Claude agents running in parallel, I’d lose track of specs, dependencies, and history. External PM tools didn’t help because syncing them with repos always created friction.
The solution was to treat GitHub Issues as the database. The "system" is ~50 bash scripts and markdown configs that:
- Brainstorm with you to create a markdown PRD, spin up an epic, decompose it into tasks, and sync them with GitHub Issues
- Track progress across parallel streams
- Keep everything traceable back to the original spec
- Run fast from the CLI (commands finish in seconds)
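To give a flavor of the sync step, here's a rough sketch of the kind of thing the scripts do. The script below is illustrative only: the directory layout follows the repo conventions, but the label scheme and the script itself are simplified stand-ins, not the actual ccpm code.

```bash
#!/usr/bin/env bash
# Illustrative sketch only, not the actual ccpm scripts.
# Push each local task file under epics/<epic-name>/ to GitHub as an issue,
# using the task file's first heading as the issue title.
set -euo pipefail

EPIC="${1:?usage: sync-epic.sh <epic-name>}"

for task in epics/"$EPIC"/*/*.md; do
  title=$(head -n 1 "$task" | sed 's/^#* *//')   # strip leading markdown '#'s
  gh issue create \
    --title "[$EPIC] $title" \
    --body-file "$task" \
    --label "epic:$EPIC"      # label must already exist in the repo
done
```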
We’ve been using it internally for a few months and it’s cut our shipping time roughly in half. Repo: https://github.com/automazeio/ccpm
It’s still early and rough around the edges, but has worked well for us. I’d love feedback from others experimenting with GitHub-centric project management or AI-driven workflows.
> 89% less time lost to context switching
> 5-8 parallel tasks vs 1 previously
> 75% reduction in bug rates
> 3x faster feature delivery
The rest of the README is llm-generated so I kinda suspect these numbers are hallucinated, aka lies. They also conflict somewhat with your "cut shipping time roughly in half" quote, which I'm more likely to trust.
Are there real numbers you can share with us? Looks like a genuinely interesting project!
Every epic gets its own branch, so if multiple developers are working on multiple epics, merging back to the main branch will, in most cases, need to be done patiently by humans.
To be clear, I am not suggesting that this is a fix-all system; it is a framework that helped us a lot and should be treated just like any other tool or project management system.
The test-runner sub-agent knows exactly how to run tests, summarize failures, etc. It loads up all the context specific to running tests and frees the main agent's context from all of that. And so on...
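For anyone who hasn't set one up: a sub-agent like that is just a small markdown file with a scoped prompt. A minimal sketch, assuming the usual Claude Code sub-agent layout (the file path and frontmatter fields here are my assumptions and may differ from your setup):

```bash
# Minimal sketch: define a test-runner sub-agent for Claude Code.
# Path and frontmatter fields are assumptions; adjust to your setup.
mkdir -p .claude/agents
cat > .claude/agents/test-runner.md <<'EOF'
---
name: test-runner
description: Runs the test suite and reports a concise summary of failures.
tools: Bash, Read, Grep
---
You are the test runner. When invoked:
1. Run the project's test suite (e.g. `npm test` or `pytest`, whichever applies).
2. Do not dump raw output; summarize failing tests, assertion messages,
   and the most likely file responsible for each failure.
3. Return only that summary, so the main agent's context stays small.
EOF
```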
How are people using auto-edits and these kinds of higher-level abstractions?
When using agents like this, you only see a speedup because you're offloading the time you'd spend thinking about / understanding the code. If you can review code faster than you can write it, you're cutting corners on your code reviews. That's normally fine with humans (this is why we pay them), but not with AI. Most people just code review for nitpicks anyway (rename a variable, add some whitespace, use map/reduce instead of forEach) instead of taking the time to understand the change (you'll be looking at lots of code and docs that aren't present in the diff).
That is, unless you type really slowly, which I've recently discovered is actually a bottleneck for some professionals (slow typing, syntax issues, constantly checking docs, etc.). I'll add that I experience this too when learning a new language, and AI is immensely helpful there.
Claude! Get off HN and get back to work.
I keep wondering why. Every project I've ever seen needed lines of code, nuts and bolts removed rather than added. My best libraries consist of a couple of thousand lines.
Of course, there are many, many other kinds of development: when developing novel low-level systems for complicated requirements, you're going to get much poorer results from an LLM, because the project won't fit as neatly into one of the "templates" it has memorized, and the LLM's reasoning capabilities are not yet sophisticated enough to handle arbitrary novelty.
Great engineers who pick up vibe coding without adopting the ridiculous "it's AI so it can't be better than me" attitude are the ones who become incredibly proficient, able to move mountains in very little time.
People stuck in the "AI can only produce garbage" mindset are unknowingly saying something about themselves. AI is mainly a reflection of how you use it. It's a tool, and learning how to use that tool proficiently is part of your job.
Of course, some people have the mistaken belief that by taking the worst examples of bullshit-coding and painting all vibe coders with that same brush, they'll delay the day they lose their job a tiny bit more. I've seen many of those takes by now. They're all blind, and they get upvoted by people who either haven't had the experience (or the correct setup) yet, or who are in pure denial.
The secret? The secret is that just as before you had a large amount of "bad coders", now you also have a large amount of "bad vibe coders". I don't think it's news to anyone that most people tend to be bad or mediocre at their job. And there's this mistaken thinking that the AI is the one doing the work, so the user cannot be blamed… but yes they absolutely can. The prompting & the tooling set up around the use of that tool, knowing when to use it, the active review cycle, etc - all of it is also part of the work, and if you don't know how to do it, tough.
I think one of the best skills you can have today is to be really good at "glance-reviews" in order to be able to actively review code as it's being written by AI, and be able to interrupt it when it goes sideways. This is stuff non-technical people and juniors (and even mediors) cannot do. Readers who have been in tech for 10+ years and have the capacity to do that would do better to use it than to stuff their head in the sand pretending only bad code can come out of Claude or something.
You can. People do. It's not perfect at it yet, but there are success stories of this.
I mean, the parent even pointed out that it works for vibe coding and stuff you don't care about; ...but the 'You can't' refers to this question by the OP:
> I really need to approve every single edit and keep an eye on it at ALL TIMES, otherwise it goes haywire very very fast! How are people using auto-edits and these kinds of higher-level abstractions?
No one I've spoken to is just sitting back writing tickets while agents do all the work. If it was that easy to be that successful, everyone would be doing it. Everyone would be talking about it.
To be absolutely clear, I'm not saying that you can't use agents to modify existing code. You can. I do; lots of people do. ...but that's using it like you see in all the demos and videos; at a code level, in an editor, while editing and working on the code yourself.
I'm specifically addressing the OPs question:
Can you use unsupervised agents, where you don't interact at the code level at all, only at a higher level of abstraction?
...and, I don't think you can. I don't believe anyone is doing this. I don't believe I've seen any real stories of people doing this successfully.
My view, after having gone all-in with Claude Code (almost only Opus) for the last four weeks, is "no". You really can't. The review process needs to be diligent and all-encompassing and is, quite frankly, exhausting.
One improvement I have made to my process for this is to spin up a new Claude Code instance (or clear context) and ask for a code review based on the diff of all changes. My prompt for this is carefully structured. Some issues it identifies can be fixed with the agent, but others need my involvement. It doesn’t eliminate the need to review everything, but it does help focus some of my efforts.
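Something in the same spirit can be done straight from the shell; a rough sketch, assuming the `claude` CLI's non-interactive print mode (`-p`) and a much-simplified prompt rather than my actual one:

```bash
# Rough sketch: feed the branch's full diff to a fresh, non-interactive
# Claude session for review. Assumes `claude -p` (print mode) reads stdin;
# the prompt here is deliberately simplified.
git diff main...HEAD | claude -p "Act as a skeptical senior reviewer of this diff. \
List: (1) likely correctness bugs, (2) missing or weak tests, \
(3) security or data-handling concerns, (4) decisions that need a human. \
Reference specific files and hunks from the diff."
```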
It really depends on the area though. Some areas are simple for LLMs, others are quite difficult even if objectively simple.
Granted, atm I'm not a big believer in vibe coding in general, but IMO it requires quite a bit of knowledge to be hands-off and not have it fall into wells of confusion.
For me, that's "just one", and that's why LLM coding doesn't scale very far for me with these tools.
if you have to understand the code to progress, it's regular fucking programming.
I don't go gushy about code generation when I use yasnippet or a vim macro, why should super autocomplete be different?
this is an important distinction because if Karpathy's version becomes real we're all out of a job, and I'm sick of hearing developers role play publicly towards leaders that their skills aren't valuable anymore
The question is how much do you review, and how much does your experience help it? Even if you didn't know code you're still going to review the app. Ideally incrementally or else you won't know what's working and what isn't. Reviewing the technical "decisions" from the LLM is just an incremental step towards reviewing every LOC. There's a large gulf between full reviews and no reviews.
Where in that gulf you decide to call it "vibe coding" is up to you. If you only consider it vibing if you never look at the code though, then most people don't vibe code imo.
I think of "vibe coding" as synonymous with "sloppy/lazy coding". Eg you're skipping details and "trusting" that the LLM is either correct or has enough guardrails to be correct in the impl. How many details you skip though is variable, imo.
Is that where the goalposts are now?
There are a lot of people who are entering programming via this thing.
In my experience, if you can't review the code and point out the LLM's mistakes to it, the codebase gets brittle fast. Maybe other people are better vibe coders than me, but I never managed to solve that problem, not even with Opus 4.1.
There is no magic way. It boils down to less strict inspection.
I try to maintain an overall direction and try to care less about the individual line of code.
You want to periodically have coverage improvement -> refactor loops after implementing a few features. You can figure out the refactors you want while the agent is implementing the code, after you've sussed out any test issues, then just queue up instructions on how to refactor once the tests are passing.
Essentially, I'm treating Claude Code as a very fast junior developer who needs to be spoon-fed with the architecture.
With that being said, a video will be coming very soon.
"We follow a strict 5-phase discipline" - So we're doing waterfall again? Does this seem appealing to anyone? The problem is you always get the requirements and spec wrong, and then AI slavishly delivers something that meets spec but doesn't meet the need.
What happens when you get to the end of your process and you are unhappy with the result? Do you throw it out and rewrite the requirements and start from scratch? Do you try to edit the requirements spec and implementation in a coordinated way? Do you throw out the spec and just vibe code? Do you just accept the bad output and try to build a new fix with a new set of requirements on top of it?
(Also, the LLM-authored README is hard for me to read. Everything is a bullet point or an emoji, and it's not structured in a way that makes it clear what it is. I didn't even know what a PRD (product requirements document) was until halfway through.)
I think the big difference between this and waterfall is that waterfall talked about the execution phase before the testing phase, and we have moved past defining the entire system as a completed project before breaking ground. Nothing in defining a feature in documentation up front stops continuous learning and adaptation.
However, LLMs and code break the "Working software over comprehensive documentation" component of agile. It breaks because documentation now matters in a way it didn't when working with small teams.
However, it also breaks because writing comprehensive documentation is now cheaper in time than it was three years ago. The big problem now is maintaining that documentation. Nobody is doing a good job of that yet - at least that I've seen.
(Note: I think I have an idea here if there are others interested in tackling this problem.)
The waterfall we know was always a mistake. The downhill-only flow we know and (don't) love came from someone at the DOD who only glanced at the second diagram (Figure 2) in the original 1970 Royce paper and said "This makes sense, we'll do it!" and... we're still doing waterfall.
So, go to the paper that started it all, but was arguing against it:
- https://www.praxisframework.org/files/royce1970.pdf
I encourage you to look at the final diagram in the paper and see some still controversial yet familiar good ideas:
- prototype first
- coding informs design
- design informs requirements
- iterate based on tests -> design -> requirements (~TDD)
Crucially, these arrows go backwards. See also the "Spiral Model" that attempts to illustrate this a different way: https://en.wikipedia.org/wiki/Spiral_model#/media/File:Spira...
Amazing that waterfall arguably spread from this paper, where it's actually an example of "what not to do."
Here's what Royce actually says about the waterfall diagram:
> The implementation described above is risky and invites failure. … The testing phase which occurs at the end of the development cycle is the first event for which timing, storage, input/output transfers, etc., are experienced as distinguished from analyzed. These phenomena are not precisely analyzable. … Yet if these phenomena fail to satisfy the various external constraints, then invariably a major redesign is required. … The required design changes are likely to be so disruptive that the software requirements upon which the design is based and which provides the rationale for everything are violated. … One can expect up to a 100-percent overrun in schedule and/or costs.
This is 55 years ago.
That's not to say you shouldn't still have good engineering practices, like short-lived branches and continuous integration. But you should be merging branches in on a schedule that is independent of sprints (and hopefully faster than the sprint length).
One of the benefits of using AI is that these processes, which I personally never followed in the pre-AI era, are now easy and frictionless to implement.
Especially discovering unknown unknowns that lead to changes in your original requirements. This often happens at each step of the process (e.g. when writing the PRD, when breaking down the tickets, when coding, when QAing, and when documenting for users).
That’s when the agent needs to stop and ask for feedback. I haven’t seen (any) agents do this well yet.
I was impressed that someone took it up to this level, till I saw the telltale signs of AI-generated content in the README. Now I have no faith that this is a system that was developed, iterated, and tested to actually work, and not just a prompt to an AI to dress up a more down-to-earth workflow like mine.
Evidence of results improvement using this system is needed.
Kidding aside, of course we used AI to build this tool and get it ready for the "public". This includes the README.
I will post a video here and on the repository over the weekend with an end-to-end tutorial on how the system works.
P.S.: And it wasn't the em-dashes, it's the general structure and the recognizable bullet points with emojis.
Hopefully, your GitHub tickets are large enough: covering one vertical scope, one cross-cutting function, or some reactive work such as bug fixing or troubleshooting.
The reason is that coding agents are good at decomposing work into small tasks/TODO lists. IMO, too many tickets on GitHub will interfere with this.
When we break an epic down into tasks, we get CC to analyze what can be run in parallel and use each issue as a conceptual grouping of smaller tasks, so multiple agents can work on the same issue in parallel.
The issues are relatively large, and depending on the feature, every epic has between 5 and 15 issues. When it's time to work on an issue, your local Claude Code will break it down into minute tasks to carry out sequentially.
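As a rough illustration of that granularity (the `epic:<name>` label here is illustrative, not necessarily how ccpm actually tags things):

```bash
# Illustrative: list the open issues that make up one epic.
# Assumes issues were labeled "epic:<name>" when synced; the real
# labeling scheme may differ.
gh issue list --label "epic:user-auth" --state open \
  --json number,title \
  --jq '.[] | "#\(.number)\t\(.title)"'
```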
I've come across several projects that try to replicate agile/scrum/SAFe for agents, and I'm trying to understand the rationale. Since these frameworks largely address human coordination and communication challenges, I'm curious about the benefits of mapping them to AI systems. For instance, what advantages does separating developer and tester provide versus having unified agents that handle both functions?
I talked to an extremely strong engineer yesterday who is basically doing exactly this.
Would love to see a video/graphic of this in action.
I point Claude to my codebase, and Claude writes up a PRD that matches/reflects/describes the codebase. Then I iteratively (a) edit the PRD to reflect where I want my codebase to go, and (b) have Claude execute on it.
Huge rules systems, all-encompassing automations, etc all assume that more context is better, which is simply not the case given that "context rot" is a thing.
Are people really doing this? My brain gets overwhelmed if I have more than 2 or 3.
That being said, if a task requires editing three different files, I would launch three different sub-agents, each editing one file, cutting down implementation time by two-thirds.
I recently launched https://letsorder.app, https://github.com/brainless/letsorder.
100% of the product (2 web UI apps, 1 backend, 1 marketing site) was generated by LLMs, including the deployment scripts. I follow a structured approach. My workflow is a mix of Claude Code, Gemini CLI, Qwen Code, and other coding CLI tools with GitHub (issues, documentation, branches, worktrees, PRs, CI, CodeRabbit, and other checks). I have recently started documenting my thoughts about user flows by voice and transcribing them. It has shown fantastic results.
Now I am building https://github.com/brainless/nocodo as the most ambitious project I have tried with LLMs (vibe coding). It runs the entire developer setup on a managed Linux server and gives you access through desktop and mobile apps. All self-hosted on your cloud accounts. It would basically take an idea all the way to going live with full-stack software.
Maybe the ordering flow does work, but how much traction are you going to really get without the demo actually doing what it's supposed to?
Not trying to be snarky - just trying to understand if people actually pay for mediocre or low-quality products like these
This has nothing to do with being LLM-generated. I work on about 4-5 projects at the moment, https://github.com/brainless. All of them are there to test how far LLM-driven development can go. This, alongside making time daily to reach out to people, create posts, and host lessons on vibe coding: https://lu.ma/user/brainless
I will get these bugs sorted when I get some time. Let's Order is not a commercial project, it is an exercise to show what a solo founder can get done these days with LLMs.
And also, I am not selling Let's Order. I am selling vibe coding, building a product around it, content and coaching - and yes I have customers for this.
I understand people getting mad. I am an engineer with 16 years of building products. I would be mad if I had a cushy job in the US that I felt was under threat from AI. I switched to vibe coding because I see the benefits for the rest of the world, which most engineers with cushy jobs never cared about. And there is money in this for someone like me, driving this solo, from a life that is far away from VC-funded startups.
- Brainstorm a PRD via guided prompts (prds/[name].md).
- Transform PRD into epics (epics/[epic-name]/epic.md).
- Decompose epic into tasks (epics/[epic-name]/[feature-name]/[task].md).
- Sync: push epics & tasks to GitHub Issues.
- Execute: Analyze which tasks can be run in parallel (different files, etc). Launch specialized agents per issue.
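A very rough sketch of what the execute step boils down to, assuming headless `claude -p` workers and the same illustrative `epic:<name>` label convention as above (simplified; the real system does the parallel/dependency analysis first, while this just fans out one worker per open issue):

```bash
# Simplified sketch of the execute step: one headless worker per open issue.
# Assumes the `gh` and `claude` CLIs; the label convention is illustrative.
EPIC="user-auth"
mkdir -p logs

for num in $(gh issue list --label "epic:$EPIC" --state open \
               --json number --jq '.[].number'); do
  (
    gh issue view "$num" --json title,body --jq '.title, .body' \
      | claude -p "Implement this GitHub issue on a dedicated branch, run the tests, and stop to ask if anything is ambiguous." \
      > "logs/issue-$num.log" 2>&1
  ) &
done
wait   # block until all issue workers finish
```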