It always seems as if the code review is the only time when all stakeholders really get involved and start thinking about a change. There may be some discussion earlier on in a jira ticket or meeting, and with some luck someone even wrote a design spec, but there will still often be someone from a different team or distant part of the organization who only hears about the change when they see the code review. This includes me. I often only notice that some other team implemented something stupid because I suddenly get a notification that someone posted a code review for some part of the code that I watch for changes.
Not that I know how to fix that. You can't have everyone in the entire company spend time looking at every possible thing that might be developed in the near future. Or can you? I don't know. That doesn't seem to ever happen anyway. At university in the 1990s, a course about development processes covered not only code reviews but also design reviews, and that isn't something I ever encountered in the wild (in any formal sense), but I don't know if even a design review process would be able to catch all the things you would want to catch BEFORE starting to implement something.
Because in the software engineering world there is very little engineering involved.
That being said, I also think that the industry is unwilling to accept the slowness of a proper engineering process for various reasons, including the non-criticality of most software and the possibility of amending bugs and errors on the fly.
Other engineering fields enjoy no such luxuries: the bridge either holds the train or it doesn't, you either nailed the manufacturing plant or there's little room for fixing it, the plane's engine either works or it doesn't.
Different stakes and patching opportunities lead to different practices.
Also, in many countries, for one to call themselves a Software Engineer, they actually have to hold a proper degree from a certified university or professional college, validated by the country's engineering order.
Because naturally a 5-year (or 3, depending on the country) degree in Software Engineering is the same as a six-week bootcamp.
I don't mind (hypothetically) not being allowed to call myself "engineer", but I do mind the false dichotomy of "5 year course" vs "six week bootcamp". In the IT world it's entirely possible to learn everything yourself and learn it better than a one-size-fits-all course ever could.
I took lots of electives outside my major, and I know that I could have easily loved chemistry, mathematics, mechanical engineering, electrical engineering, or any number of fields. But when you're 12 years old with a free period in the school computer lab, you can't download a chemistry set or an oscilloscope or parts for building your next design iteration. You can download a C compiler and a PDF of K&R's "The C Programming Language," though.
CS just had a huge head-start in capturing my interest compared to every other subject because the barrier to entry is so low.
Strong disagree. However, this is closer to the truth:
In the IT world, if you have learned everything yourself, it's entirely possible to think you have learned it better than a one-size-fits-all course ever could.
There is lots of theoretical knowledge that comes with the degree that, while mostly useless in day-to-day work, is priceless in those rare moments when it comes in handy. A self-taught developer won't even know they are missing this knowledge. An example of this is knowing how compilers work (which is surprisingly useful) - without the theoretical background one might attempt to parse HTML with regex and expect correct results.
Not that all degrees are created equal. But those X years do give you an edge over self-taught developers. You still need to work on other skills too, of course.
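As a toy illustration of the regex-vs-HTML example above (mine, not the commenter's): nested tags are exactly what a regular expression cannot track, because HTML is not a regular language.

```
# Assumes GNU grep with PCRE support (-P); the HTML snippet is made up.
html='<div>outer <div>inner</div> tail</div>'
echo "$html" | grep -oP '<div>.*?</div>'
# Prints: <div>outer <div>inner</div>
# The match stops at the FIRST closing tag, truncating the outer element;
# matching arbitrarily nested structure needs a real parser, not a regex.
```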
However, just by having gone through the degree, there is a whole set of skills that one would not have gotten otherwise.
Assuming that they actually did it the right way, and didn't just get through it with minimal effort.
Of course you would/could.
1) a degree doesn't imply you've built any specific skills or retained any information, just that you passed a set of exams. I've met a huge bunch of people from important universities who clearly studied just to pass exams with good grades, but were absolutely crap problem solvers and even worse coders.
2) plenty of brilliant engineers did not graduate, from Leonardo Da Vinci to, just to stay in software, John Carmack, Zuckerberg, Paul Allen, Romero, Wozniak (technically he did, 12 years after founding Apple), Karp, and many others.
What I'm trying to say: engineering skills are acquired by sheer will of studying and solving problems. And in 2025 you can follow pretty much any course/lecture from most top-rated schools just by watching on your computer. A person doing so with interest will leapfrog anybody sitting there and going through the exams just because they have to.
Exceptional individuals who made an impact on mankind, regardless of which kind.
There are schools for gifted kids with advanced cognitive skills for a reason.
Also, there is a huge difference between being immersed in an engineering degree for 3 to 5 years (depending on the country), almost every single day, with compulsory assignments, and occasionally watching a couple of videos or reading one or two books.
All this without getting into the soft and ethical skills that engineering degrees also require.
Writing code is the design phase.
You don’t need a design phase to do design.
Will drop link to relevant video later.
In my experience (and I have quite a bit of it, in some fairly significant contexts), “It Depends” is really where it’s at. I’ve learned to take an “heuristic” approach to software development.
I think of what I do as “engineering,” but not because of particular practices or educational credentials. Rather, it has to do with the Discipline and Structure of my approach, and a laser focus on the end result.
I have learned that things don’t have to be “set in stone,” but can be flexed and reshaped, to fit a particular context and development goal, and that goals can shift, as the project progresses.
When I have worked in large, multidisciplinary teams (like supporting hardware platforms), the project often looked a lot more “waterfall” than when I have worked in very small teams (or alone), on pure software products. I’ve also seen small projects killed by overstructure, and large projects killed by too much flexibility. I’ve learned to be very skeptical of “hard and fast” rules that are applied everywhere.
Nowadays, I tend to work alone, or on small teams, achieving modest goals. My work is very flexible, and I often start coding early, with an extremely vague upfront design. Having something on the breadboard can make all the difference.
I’ve learned that everything that I write down, “ossifies” the process (which isn’t always a bad thing), so I avoid writing stuff down, if possible. It still needs to be tracked, though, so the structure of my code becomes the record.
Communication overhead is a big deal. Everything I have to tell someone else, or that they need to tell me, adds rigidity and overhead. In many cases, it can’t be avoided, but we can figure out ways to reduce the burden of this crucial component.
It’s complicated, but then, if it were easy, everyone would be doing it.
I disagree. The design phase of a substantial change should be done beforehand with the help of a design doc. That forces you to put in writing (and in a way that is understandable by others) what you are envisioning. This exercise is really helpful in forcing you to think about alternatives, pitfalls, pros & cons, etc. This way, once stakeholders (your TL, other team members) have agreed, the reviews related to that change become only code related (style, use this standard library function that does it, ...) but the core idea is there.
Having an initial design approved and set in stone, and then a purely implementation phase is very waterfall and very rarely works well. Even just "pitfalls and pros & cons" are hard to get right because what you thought was needed or would be a problem may well turn out differently when you get hands-on and have actual data in the form of working code.
Making software can definitely be engineering; most of the time it is not, not because of the nature of software, but because of the characteristics of the industry and culture that surround it. And the argument in this article is not convincing (15 not-very-random engineers is not much to support the argument from "family resemblance").
In the context of software vs other sub-disciplines, the big difference is in the cost of iterating and validating. A bridge has very high iteration cost (generally, it must be right first time) and validation is proven over decades. Software has very low iteration cost, so it makes much more sense to do that over lots of upfront design. Validation of software can also generally be implemented through software tools, since it's comparatively easy to simulate the running environment of the software.
Other disciplines like electronics live a little closer to a bridge, but it's still relatively cheap to iterate, so you tend to plan interim design iterations to prove out various aspects.
People forget that software is used in those other disciplines. CFD, FEA, model-based design etc. help to verify ideas and design without building any physical prototype and burning money in the real lab.
You can do some strain and stress analysis on a virtual bridge to get a high degree of confidence that the real bridge will perform fine. Of course, then you need to validate it at all stages of development, and at the end perform final validation under weight.
The thing is that people building engines, cars, planes, sensors, PCBs and bridges actually do so, largely because they are required to do so. If you give them the freedom not to do that, many of them will spare themselves such effort. And they understand the principles of the things they are working on. No one requires any of that from someone who glued together a few NPM packages with a huge JS front-end framework, and such a person may not even know anything about how HTTP works, how the browser handles the DOM etc. It's like having a mechanical engineer who doesn't even understand basic principles of dynamics.
There are industries that deal with the software (i.e. controls design) that have much higher degree of quality assurance and more validation tools, including meaningful quantitative criteria, so it clearly is not a matter of software vs hardware.
By that standard, doctors and hair stylists are also engineers, as are some chimps and magpies. I don't think it's a useful definition, it's far too broad.
No, the big difference is that in the Engineering disciplines, engineers are responsible end-to-end for the consequences of their work. Incompetent or unethical engineers can and regularly do lose their ability to continue engineering.
It's very rare that software developers have any of the rigour or responsibilities of engineers, and it shows in the willingness of developers to write and deploy software which has real-world costs. If developers really were engineers, they would be responsible for those downstream costs.
That is by definition not engineering.
> Equally, there's plenty of examples of software where careful processes are in place to demonstrate exactly the responsibilities you discuss.
Software engineering of course exists, but 99%+ of software is not engineered.
I'm not sure the generally accepted definition of engineering makes any reference to taking responsibility: https://dictionary.cambridge.org/dictionary/english/engineer...
This is the talk on real software engineering: https://www.youtube.com/watch?v=RhdlBHHimeM
Way too general to be useful. By that definition the store clerk is an engineer (tool: cash register, problem solved: my lack of gummy bears), as are janitors swinging mops, or automotive techs changing oil.
Engineering is applied science.
Software is clearly different than "hardware", but it doesn't mean that other industries do not use experiment and iteration.
However many, probably half, that I work with, and most that I worked with overall for the last 25+ years (since after I dropped out) have an engineering degree. Especially the younger ones, since this century there has been more focus on getting a degree and fewer seem to drop out early to get a job like many of us did in my day.
So when American employers insist on giving me titles like "software engineer" I cringe. It's embarrassing really, since I am surrounded by so many that have a real engineering degree, and I don't. It's like if I dropped out of medical school and then people started calling me "doctor" even if I wasn't one, legally. It would be amazing if we could find a better word so that non-engineers like me are not confused with the legally real engineers.
As an aside, I find your example of doctor amusing because it's overloaded, with many considering the term a synonym of physician, and the confusion that can cause with other types of doctors.
And proper software development definitely has engineering parts. Otherwise titles are just labels.
Rich Hickey agrees it's a part of it, yes. https://www.youtube.com/watch?v=c5QF2HjHLSE
No, it really isn't. I don't know which amateur operation you've been involved with, but that is really not how things work in the real world.
In companies that are not entirely dysfunctional, each significant change to the system involves a design phase, which often includes reviews from stakeholders and involved parties, such as security reviews and data protection reviews. These tend to happen before any code is even written. This doesn't rule out spikes, but their role is to verify and validate requirements and approaches, and allow new requirements to emerge to provide feedback to the actual design process.
The only place where cowboy coding has a place is in small refactoring, features and code fixes.
You need a high level design up-front but it should not be set in stone. Writing code and iterating is how you learn and get to a good, working design.
Heavy design specs up-front are a waste of time. Hence, the agile manifesto's "Working software over comprehensive documentation", unfortunately the key qualifier "comprehensive" is often lost along the way...
On the whole I agree that writing code is the design phase. Software dev. is design and test.
Yes, you need a design that precedes code.
> Writing code and iterating is how you learn and get to a good, working design.
You are confusing waterfall-y "big design upfront" with having a design.
It isn't.
This isn't even the case in hard engineering fields such as aerospace where prototypes are used to iterate over design.
In software engineering fields you start with a design and you implement it. As software is soft, you do not need to pay the cost of a big design upfront.
I do not and I have explained it.
> In software engineering fields you start with a design and you implement it
And part of my previous comment is that this "waterfall-y" approach in which you design first and implement second does not work and has never worked.
> you do not need to pay the cost of a big design upfront
Exactly, and not only that but usually requirements will also change along the way. The design can change and will change as you hit reality and learn while writing actual, working code. So keep your design as a high-level initial architecture then quickly iterate by writing code to flesh out the design.
Software is often opposed to "traditional engineering" but it is actually the same. How many experiments, prototypes, and iterations go into building a car or a rocket? Many. Engineers do not come up with the final design up front. The difference is that this is expensive, while in software we can iterate much more, much quicker, and for free to get to the final product.
Nowhere did anyone claim you need the full final design up front. For cars/rockets, how many of those experiments, prototypes, and iterations had designs? All of them. You never see a mechanical engineer walk out to the shop and just start hammering on a pile of slop until it sort of looks like a car.
>The difference it is that this is expensive while in software we can iterate much more, much quicker, and for free to get to the final product.
If you have no design to meet how do you judge the output of an iteration or know you have arrived at the final product?
I think you mean "requirements" here instead of "design".
No. This is exactly what you are getting wrong. Requirements are constraints that guide the design. The design then is used to organize, structure, and allocate work, and determine what code needs to be written.
You should review the sources of your confusions and personal misconceptions, as you deny design and then proceed to admit there is design.
> And part of my previous comment is that this "waterfall-y" approach in which you design first and implement second does not work and has never worked.
Nonsense. "Big design upfront" works, but is suboptimal in software development. That's why it's not used.
"Big design upfront" approaches are costly as it requires know-how and expertise to pull off, which most teams lack, and it assumes requirements don't change, which is never the case.
Once you acknowledge that requirements will change and new requirements will emerge, you start to think of strategies to accommodate them. In software development, unlike in any hard engineering field, the primary resource consumed is man-hours. This means that, unlike in hard engineering fields, a software development process can go through total rebuilds without jeopardizing their success. Therefore in software development there is less pressure to get every detail right at the start, and thus designs can be reviewed and implementations can be redone with minimal impact.
> Exactly, and not only that but usually requirements will also change along the way. The design can change and will change as you hit reality and learn while writing actual, working code.
Yes.
But you do need a design upfront, before code is written. Design means "know what you need to do". You need to have that in place to create tickets and allocate effort. It makes no sense at all to claim that writing code is the design stage. Only in amateur pet projects this is the case.
The "some math" is used in engineering fields in things like preliminary design, sizing, verification&validation, etc. To a lesser degree, "some math" can be used in the design stages of software development projects. For example, estimating the impact of micro services tax in total response times to verify if doing synchronous calls can work vs doing polling/messaging. Another example is estimating max throughput per service based on what data features in a response and how infrastructure is scaled. This is the kind of things that you do way before touching code to determine if the expected impact of going with a particular architecture vs another that mitigates issues.
> In software, the logical details are the finished product. The math is what you're trying to make.
You're confused. The design stage precedes writing any code, let alone the finished product. Any remotely complex work, especially if it involves architecture changes, is preceded by a design stage where alternatives are weighed and validated, and tradeoffs are evaluated.
To further drive the point home, in professional settings you also have design reviews for things like security and data protection. Some companies even establish guidelines such as data classification processes and comparative design to facilitate these reviews.
> If you've actually thought through all of the details, you have written the software (if only in your head). If you haven't thought through all of the details and only figured out a high level design, you've still written some software (essentially, stubbing out some functionality, or leaving it as a dependency to be provided. However you want to think of it).
You're confusing having a design stage with having a big design upfront. This is wrong.
The purpose of the design stage is to get the necessary and sufficient aspects right from the start, before resources are invested (and wasted) in producing something that meets requirements. No one cares what classes or indentation style you use to implement something. The ultimate goal is to ensure the thing is possible to deliver, what it actually does and how it does it, and if it is safe enough to use. You start writing code to fill in the details.
With data classification, you're going to need to think through what data you are using and what you want to do with it. i.e. write a program.
I didn't claim class structure or indentation matters. I'm saying that assuming you are discussing some sort of algorithm or functionality, a formal language is a perfectly fine thing to use for thinking about the problem and writing down your ideas. Writing "what it actually does and how it does it" is just programming. If you write your ideas in a language like Scala, they can easily be more concise (so easier to review) than they would be in English, and you get a compiler helping you think through things.
Operation that delivers features instead of burning budget on discussions.
Operation that uses test/acceptance environments where you deploy and validate the design so people actually see the outcome.
Obviously you have to write down the requirements - but writing down requirements is not a design phase.
Design starts with an idea, which is written down in a couple of sentences or paragraphs, then turned into code, and while it is still on test/acceptance it is still the design phase. Once the feature goes to production in a release, the "design phase" is done; implementation and changes are part of design and of finding out issues and limitations.
My opinion is that reality is more nuanced. Both "the code is self documenting" and "the code is the design" are reasonable takes in reasonable situations.
I'll give an example.
I work in a bureaucratic organization where there's a requirement to share data and a design doc that goes through a series of not-really-technical approvals. The entire point of the process is to be consumable to people who don't really know what an API is. It's an entirely reasonable point of view that we should just create the swagger doc and publish that for approval.
I worked in another organization where everything was an RFC. You make a proposal, all the tech leads don't really understand the problem space, and you have no experience doing the thing, so you get the nod to go ahead. You now have a standard that struggles against reality, and is difficult to change because it has broad acceptance.
I'm not saying we should live in a world with zero non-code artifacts, but as someone who hops org to org, most of the artifacts aren't useful, but a CI/CD that builds, tests, and deploys, looking at the output, and looking at the code gives me way more insight than most non-code processes.
I can count on one hand the number of times I've been given the time to do a planning period for something less than a "major" feature in the past few years. Oddly, the only time I was able to push good QA, testing, and development practices was at an engineering firm.
I find this to be one of the most important things in our team. Once people don't agree on code it all kinda goes downhill with nobody wanting to interact with code they didn't write for various reasons.
In bigger orgs I believe it's still doable this way as long as responsibilities are shared properly and it's not just 4 guys who know it all and 40 others depend on them.
That is a problem with your organization, not with Git or any version control system. PRs are orthogonal to it.
If you drop by a PR without being aware of the ticket that made the PR happen and the whole discussion and decision process that led to the creation of said tickets, you are out of the loop.
Your complaint is like a book publisher complaining that the printing process is flawed because seeing the printed book coming out of the production line is the only time when all stakeholders really get involved. Only if you work in a dysfunctional company.
Sometimes it's not even about a PR, it's about an entire project. I always do reviews (design and code, separate stages) for projects where the code is almost complete by the time people come for design reviews, and by the time we get to code reviews it is usually too late to fix problems other than showstoppers. I've worked in small companies and huge companies (over 100k employees); some are better, most are bad, in my experience. YMMV, of course.
You don't need to. I've seen this generally work with some mix of the following:
1. Try to decouple systems so that it's less likely for someone in a part of the org to make changes that negatively impact someone in a more distant part of the org.
2. Some design review process: can be formal "you will write a design doc, it will be reviewed and formally approved in a design committee" if you care more about integrity and less about speed, or can be "write a quick RFC document and share it to the relevant team(s)".
3. Some group of people that have broad context on the system/code-base (usually more senior or tenured engineers). Again, can be formal: "here is the design review committee" or less formal: "run it by this set of folks who know their stuff". If done well, I'd say you can get pretty broad coverage from a group like this. Definitely not "everyone in the entire company". That group can also redirect or pull others in.
4. Accept that the process will be a bit lossy. Not just because you may miss a reviewer, but also, because sometimes once you start implementing the reality of implementation is different than what people expect. You can design the process for this by encouraging POC or draft implementations or spikes, and set expectations that not all code is expected to make it into production (any creative process includes drafts, rewrites, etc that may not be part of the final form, but help explore the best final form).
I've basically seen this work pretty well at company sizes from 5 engineers all the way up to thousands.
These can be written either for just our team or for the eyes of all other software teams. In the latter case we put these forward as RFCs for discussion in a fortnightly meeting, which is announced well in advance so people can read them, leave comments beforehand, and only need to attend the meeting if there's an RFC of interest to them up for discussion.
This has gone pretty well for us! It can feel like a pain to write some of these, and at times I think we overuse them somewhat, but I much prefer our approach to any other place I've worked where we didn't have any sort of collaborative design process in place at all.
But writing the whole working code just to discuss some APIs is too much and will require extra work to change if problems are surfaced on review.
So a design document is something in the middle: it should draw a line where the picture of the planned change is as clear as possible and can be communicated to stakeholders.
Other possible middle grounds include PRs that don’t pass all tests or that don’t even build at all. You just have to choose the most appropriate sequence of communication tools to come to agreements in the team and come to a point where the team is on the same page on all the decisions and how the final picture looks.
Regarding design reviews, we used to have them at my current job. However we stopped doing both formal design documents and design reviews in favor of prototyping and iterative design.
The issue with the design phase is that we often failed to account for some important details. We spent considerable time discussing things and, when implementing, realized that we omitted some important detail or insight. But since we already invested that much time in the design phase, it was tempting to take shortcuts.
What's more, design reviews were not conducted by the whole team, since it would be counter-productive to have 10-more people in the same room. So we'd still discover things during code reviews.
And then not everyone is good at/motivated to producing good design documents.
In the end, I believe that any development team above 5 people is bound to encounter these kinds of inefficiencies. The ideal setup is to put 5 people in the same room with the PO and close to a few key users.
(I suspect you are aware, but just in case this is new to you.) This is essentially the core of Extreme Programming.
It seems like the standard around me is between 8 to 12 people. This is too many in my opinion.
I believe this is because management is unknowingly aiming for the biggest team that does not completely halt, instead of seeking the team that delivers the most bang for the buck.
Personally, 8 is the largest I would have a team. At that point you should consider breaking it into two teams of 4 (even if those teams periodically recombine from the original set of 8 people).
If the two teams have to coordinate a lot and work on the same code base, is there still two teams?
To be independent, they would need to work on functionally different parts of the project. Not all projects have several independent parts, feature-wise.
What am I missing?
Unfortunately, the conditions where it works well can be difficult to set up. You need people who are into it and have similar schedules. And you don't want two people waiting for tests to run.
However.
I paired full-time, all day, at Pivotal, for 5 years. It was incredible. Truly amazing. The only time in my career when I really thrived. I miss it badly.
Pivotal Labs was a contracting firm that did it for years. They aren’t around anymore, but they had a good run:
Where I work, the structure is such that most parts of the codebase have a team that is responsible for it and does the vast majority of changes there. If any "outsider" plans a change, they come talk to the team and coordinate.
And we also have strong intra-team communication. It's clear who is working on what and we have design reviews to agree on the "how" within the team.
It's rare that what you describe happens. 95% of the code reviews I do are without comments or only with minor suggestions for improvement. Mainly because we have developed a culture of talking to each other about major things beforehand and writing the code is really just the last step in the process. We also have developed a somewhat consistent style within the teams. Not necessarily across the teams, but that's ok.
TL;DR: It's certainly possible to do things better than what you are experiencing. It's a matter of structure, communication and culture.
The solution here may be to add a midterm check. I think this is what you mean by a "design review."
In my experience, there are some rules that need to be followed for it to work.
- Keep the number of stakeholders involved in all decisions, including PR, as small as possible.
- Everyone involved should take part in this check. That way, no one will be surprised by the results.
- This check should be documented, e.g. in the ticket.
This can be used in any process where the result is only judged at the end.
When and how to do this check and how to handle disagreements depend on the task, culture, and personalities.
If you don't have a documented mid-term check, a vibe-coded PR might not be what you expected.
Even if you can't fix it this time, hopefully you've taught someone a better pattern. The direction of travel should still be positive.
On personal projects I've used architectural decision records, but I've never tried them with a team.
When we started graphite.dev years ago that was a workflow most developers had never heard of unless they had previously been at FB / Google.
Fun to see how fast code review can change over 3-4yrs :)
And I very much appreciate both the ambition and results that come from making it interop with PRs; it's a nightmare problem and it's pretty damned amazing it works at all, let alone most of the time.
I would strongly lobby for a prescriptive mode where Graphite initializes a repository with hardcore settings that would allow it to make more assumptions about the underlying repo (merge commits, you know the list better than I do).
I think that's what could let it be bulletproof.
It seems non-obvious that you would have to prohibit git commands in general, they're already "buyer beware" with the current tool (and arcanist for that matter). Certainly a "strict mode" where only well-behaved trees could interact with the tool creates scope for all kinds of performance and robustness optimizations (and with reflog bisecting it could even tell you where you went off script).
I was more referring to the compromises that gt has to make to cope with arbitrary GitHub PRs seem a lot more fiddly than directly invoking git, but that's your area of expertise and my anecdote!
Broad strokes I'm excited for the inevitable decoupling of gt from GitHub per se, it was clearly existential for zero to one, but you folks are a first order surface in 2025.
Keep it up!
So I’m really hoping something like Graphite becomes open-source, or integrated into GitHub.
Frequent, small changes are really a good practice.
Then we have things like trunk-based development and continuous integration.
Those are the only models I can think of, and it’s weird to advocate for having a variable-time asynchronous process in the middle of your code or review loops. Seems like you’re just handicapping your velocity for no reason.
Stacked PRs are precisely about factoring out small changes into individually reviewable commits that can be reviewed and landed independently, decoupling reviewer and developer while retaining good properties: small commits that the reviewer is going to do a better job on, larger single-purpose commits that the reviewer knows to spend more time on without getting overwhelmed dealing with unrelated noise, and the ability to see relationships between smaller commits and the bigger picture. Meanwhile the developer gets to land unobtrusive cleanups that serve a broader goal faster, avoiding merge conflicts, while getting quicker feedback on work towards a larger goal.
The only time stacked commits aren’t as useful is for junior devs who can’t organize themselves well enough to understand how to do this well (it’s an art you have to intentionally practice) and don’t generally have a good handle on the broader scope of what they’re working towards.
But combine it with TDD & pairing and it becomes a license to deliver robust features at warp speed.
I think stacked PRs are a symptom of the issues the underlying workflow (feature branches with blocking reviews) has.
Stacked pull requests can be an important tool to enable “frequent, small changes” IMO.
Sure, I can use a single pull request and a branch on top of that, but then it's harder for others to leave notes on the future, WIP steps.
A common situation is that during code review I create a few alternative WIP changes to communicate to a reviewer how I might resolve a comment; they can do the same, and share it with me. Discussion can fork to those change sets.
Gerrit is much closer to my desired workflow than GitHub PRs.
But, to me, "creating a few alternative WIP changes to communicate to a reviewer" indicates an issue with code reviews. I don't think code reviews are the time to propose alternative implementations, even if you have a "better" idea unless the code under review is broken.
The //actually better// workflows stacking enables are the same sort of workflows that `git add -p`, `git commit --fixup` and `git rebase` enable, just at a higher level of abstraction (PRs vs commits).
You can "merge as a stack" as you imply, but you can also merge in sub-chunks, or make a base 2-3 PRs in a stack that 4 other stacks build on top of. It allows you to confidently author the N+1th piece of work that you'd normally "defer" doing until after everything up to N has been reviewed.
An example: I add a feature flag, implement a divergent behavior behind a feature flag gate, delete the feature flag and remove the old behavior. I can do this in one "stack", in which I deploy the first two today and the last one next week.
I don't have to "come back" to this part of the codebase a week from now to implement removing the flag, I can just merge the last PR that I wrote while I had full context on this corner.
In theory you can do all of this stuff with vanilla git and GitHub. In non-stacking orgs, I'd regularly be the only person doing this, because I was the only one comfortable enough with git (and stacking) for it to not be toooo big a burden to my workflow. Graphite (and other stacking tools) make this workflow more accessible and intuitive to people, which is a big net win for reviewers imo.
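To make the "vanilla git" version of the feature-flag stack above concrete, here is a rough sketch; the branch names are invented, and the manual rebasing at the end is exactly the bookkeeping that stacking tools automate.

```
git checkout main && git pull
git checkout -b add-feature-flag          # PR 1: introduce the flag
# ...commit the flag...
git checkout -b new-behavior-behind-flag  # PR 2: gated new behavior, stacked on PR 1
# ...commit the behavior...
git checkout -b remove-flag-and-old-path  # PR 3: delete the flag and old path, stacked on PR 2
# ...commit the cleanup...
# Open PR 1 against main, PR 2 against add-feature-flag, and PR 3 against
# new-behavior-behind-flag. After PR 1 merges, rebase PR 2 and PR 3 and retarget
# their base branches -- the repetitive part that gt and similar tools handle.
```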
Empirically this is not true if you also control for review quality. If your code review is a rubber stamp then sure mega PRs win because you put up a PR and then merge. But why review then?
However, code review quality goes up when you break things down into smaller commits because the code reviewer can sanity check a refactor without going over each line (pattern matching) while spending more time on other PRs that do other things.
And if you are breaking things down, then stacked PRs are definitely faster at merged to master/unit of time. I introduced graphite to my team and whereas before we struggled to land a broken down PR of ~5 commits in one week, we’d regularly land ~10+ commit stacks every few days because most of the changes of a larger body of work got approved and merged (since often times the commit order isn’t even important, you can reorder the small commits), conditional approvals (ie cleanups needed) didn’t require further follow ups from the reviewer, and longer discussion PRs could stay open for longer without blocking progress and both developer and reviewer could focus their attention there.
Additionally, graphite is good about automatically merging a group of approved small individual commits from a larger set of changes automatically without you babysitting which is infinitely easier than managing this in GitHub and merging 1 commit, rebasing other PRs after a merge etc.
One thing I've found at $DAYJOB is that I have to set the PR's "base" branch to "main" before I push updated commits (and then switch it back to the parent after), otherwise CI thinks my PR contains everything on main and goes nuts emailing half the company to come review it.
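If it helps, at least the retargeting dance can be scripted with the GitHub CLI; the PR number and parent branch name below are placeholders, and this assumes `gh pr edit --base` is available in your `gh` version.

```
gh pr edit 1234 --base main            # retarget the PR at main before pushing
git push
gh pr edit 1234 --base parent-branch   # point it back at its parent afterwards
```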
I've played with git town which is great for what it is.
But at $DAYJOB we are now all on graphite and that stacking is super neat. The web part is frustratingly slow, but they got stacking working really well.
The worst offender is a slack notification[0] deep link into a PR I need to review.
It loads in stages, and the time from click to first diff is often so frustratingly long that I end up copying the PR ID and going to GitHub instead.
Sometimes I give up while Graphite is still loading and use the shortcut C-G to go to GitHub.
The second issue might be the landing page. I love what it shows compared to GitHub, but it's often slow to display loading blocks for things that haven’t even changed. Reloads are usually fast after that — until sometime later, maybe a day, when it slows down again.
I don't know why it feels worse than Linear, even though there are clearly many similarities in how it's supposed to load.
The guest instance isn’t so much about loading speed, but usage speed.
When I submit a stack of PRs, I get a nice carousel to fill in PR titles/descriptions and choose where to publish each PR. What’s missing for me there is access to files and diffs, so I can re-review before publishing. I often end up closing it and going back to the PR list instead.
[0] Thank God for those! You've made them much more useful than GitHub's. Also, the landing page is far more helpful in terms of what’s displayed.
What can be a very nice experiment to try something new can easily become a security headache to deal with.
I'd recommend giving it a try to see what it's like. The `gt`/onboarding tour is pretty edifying and brief.
It's likely that you'll find that `gt` is "enabling" workflows that you've already found efficient solutions for, because it's essentially an opinionated and productive subset of git+github. But it comes with some guardrails and bells and whistles that makes it both (1) easier for devs who are new to trunk-based dev to grok and (2) easier for seasoned devs to do essentially the same work they were already doing with fewer clicks and less `git`-fu.
Best AI code review, hands down. (And I’ve tried a few.)
Shockingly, the best code review tool I've ever used was Azure DevOps.
Javascript at scale combined with teams that have to move fast and ship features is a recipe for this.
At least it's not Atlassian.
You might be thinking of Fisheye/Crucible, which were acquisitions, and suffered the traditional fate of being sidelined.
(You are 100% correct that Stash/Bitbucket Server has also been sidelined, but that has everything to do with their cloud SaaS model generating more revenue than selling self-hosted licenses. The last time I used it circa 2024, it was still way faster than Bitbucket Cloud though.)
Source: worked at Atlassian for a long time but left a few years ago.
I use it every day and don't have any issues with the review system, but to me it's very similar to github. If anything, I miss being able to suggest changes and have people click a button to integrate them as commits.
So I'm back to liking dev-ops and github code reviews identically!
I didn't get why they stick with the requirement that a review is a single commit. To keep the git-review implementation simple?
I wonder if an approach where every reviewer commits their comments/fixes to the PR branch directly would work as well as I think it would. One might not even need any additional tools to make it convenient to work with. This idea seems like a hybrid of the traditional github flow and the way Linux development is organized via mailing lists and patches.
i've had team members edit a correction as a "suggestion" comment and i can approve it to be added as a commit on my branch.
Yeah that is pretty weird. If 5 people review my code, do they all mangle the same review commit? We don't do that with code either, feels like it's defeating the point.
Review would need to be commits on top of the reviewed commit. If there are 5 reviews of the same commit, then they all branch out from that commit. And to address them, there is another commit which also lives beside them. Each commit's change process becomes a branch, with stacked commits being branches chained on top of one another. Each of the commits in those chained branches then has comment commits attached. Those comment commits could even form chains if a discussion is happening. Then, when everybody is happy, each branch gets squashed into a single commit and those then get rebased on the main branch.
You likely want to make new commits for that though to preserve the discussions for a while. And that's the crux: That data lives outside the main branch, but needs to live somewhere.
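A sketch of what that could look like at the git level, using ordinary branches and empty commits to carry the discussion; every name and message here is invented to illustrate the idea, not an existing tool or convention.

```
# Suppose abc123 is the commit under review.
git checkout -b review/alice/abc123 abc123
git commit --allow-empty -m "review: extract this block into a helper?"
git commit --allow-empty -m "reply(bob): agreed, will do in the next revision"
git push origin review/alice/abc123   # the discussion lives outside main, but still in git
# Once everyone is happy, the change branch itself is squashed and rebased onto
# main, while the review/* branches can be kept for history or pruned later.
```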
This is eerily similar to how I review large changes that do not have a clear set of commits. The real problem is working with people that don’t realize that if you don’t break work down into small self contained units, everybody else is going to have to do it individually. Nobody can honestly say they can review tons of diffs to a ton of files and truly understand what they’ve reviewed.
The whole is more than just the sum of the parts.
```
review () {
    if [[ -n $(git status -s) ]]
    then
        echo 'must start with clean tree!'
        return 1
    fi

    git checkout pristine  # a branch that I never commit to
    git rebase origin/master

    branch="$1"
    git branch -D "$branch"
    git checkout "$branch"
    git rebase origin/master
    git reset --soft origin/master
    git reset

    nvim -c ':G'  # opens neovim with the fugitive plugin - replace with your favorite editor

    git reset --hard
    git status -s | awk '{ print $2 }' | xargs rm
    git checkout pristine
    git branch -D "$branch"
}
```
I use this as a PR review tool in neovim. It's basically vscode's diff tool UI-wise but integrates with vim's inbuilt diff mode.
Also, `git log -p --function-context` is very useful for less involved reviews.
- git itself won't go much further than the change-id, which is already a huge win (thanks to jj, git butler, gerrit and other teams)
- graphite and github clearly showed they are not interested in solving this for anyone but their userslaves and have obviously opposing incentives.
- there are dozens of semi abandoned cli tools trying this without any traction, a cli can be a part of a solution but is just a small part
What we need:
- usable fully local
- core team support for vscode not just a broken afterthought by someone from the broader community
- web UI for usecases where vscode does not fit (possibly via vscode web or other ways to reuse as much of the interface work that went into the vscode integration)
- the core needs to be usable from a cli or library with clear boundaries so other editor teams can build as great integrations as the reference but fitting their native ui concepts
- it needs to work for commits, branches, stacked commits and any snapshot an agent creates, as well as reviewing a dev's own work before pushing
- it needs to incorporate CI/CD signals natively, meta did great UI work on this and its crucial to not ignore all that progress but build on top of it
- it needs to be as fine grained as the situation requires, with editability at every step. Why can I just accept one line in cursor but there is nothing like that when reviewing a human's code? Why can I fix a typo without any effort when reviewing in cursor, when I have to go through at least 5 clicks to do the same when fixing a typo of a human?
- It needs to be fully incremental: when a PR is fixed there needs to be a simple way to review just the fix and not re-review the whole PR or the full file
Anyone tried something like this? How did it go?
Braintree was a pair programming company for example.
VSCode is open source, and there are plenty of IDEs...
I guess I'm just focused on different lock-in concerns than you are.
I suspect that since this is possible with VSCode/Github, its probably extensible to other providers editors.
When I started my career, no one did code review. I'm old.
At some point, my first company grew; we hired new people and started to offshore. Suddenly, you couldn't rely on developers having good judgement... or at least being responsible for fixing their own mess.
Code review was a tool I discovered and made mandatory.
A few years later, everyone converged on GitHub, PRs, and code review. What we were already doing now became the default.
Many, many years later, I work with a 100% remote team that is mostly experienced, and 75% or more of our work is writing code that looks like code we've already written. Most code review is low value. Yes, we do catch issues in review, especially with newer hires, but it's not obviously worth the delay of a review cycle.
Our current policy is to trust the author to opt-in for review. So far, this approach works, but I doubt it will scale.
My point? We have a lot of posts about code review and related tools and not enough about whether to review and how to make reviews useful.
I think it's easy to add processes under the good intention of "making the code more robust and clean", but I never hear anyone discuss the cost of this process to the team's efficiency.
I'm not a fan of automatic syntax formatting but you can have some degree of pre-commit checks.
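For instance, a minimal pre-commit hook of the non-formatting kind might look like the sketch below; the grep patterns and the commented-out commands are placeholders, not a recommendation of specific tools.

```
#!/bin/sh
# Save as .git/hooks/pre-commit and make it executable.
# Block obvious debugging leftovers in staged (added) lines:
if git diff --cached | grep '^+' | grep -E 'console\.log|debugger;|TODO: remove'; then
  echo "pre-commit: remove debugging leftovers before committing" >&2
  exit 1
fi
# Optionally run the project's fast checks here, e.g.:
# make lint && make test-fast
```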
1. It's easy to optimise for talented, motivated people in your team. You obviously want this, and it should be the standard, but you also want it to be the case that somebody who doesn't care about their work can't trash the codebase.
2. I find even people just leaving 'lgtm' style reviews for simple things, does a lot to make sure folks keep up with changes. Even if there's nothing caught, you still want to make sure there aren't changes that only one person knows about. That's how you wind up with stuff like, the same utility functions written 10 times.
(There should be breakglass mechanisms to bypass code reviews, sure. Just the default should always be to require reviews)
The owner is allowed to make changes without review.
To be fair, you don't know if a one-line change is going to absolutely compromise a flow. OSS needs to maintain a level of disconnect to be safe vs fast.
Sourcehut is missing in the list; it’s built on the classical concept of sending patches / issues / bugs / discussion threads via email and it integrates this concept into its mailing lists and ci solution that also sends back the ci status / log via email.
Drew Devault published helpful resources for sending and reviewing patches via email on git-send-email.io and git-am.io
Now there's official support and tooling for reviews (at least in IDEA, but probably in the others too), where you also get in-line highlighting of changed lines, comments, status checks, etc...
I feel sorry for anyone still using GitHub itself (or GitLab or whatever). It's horrible for anything more than a few lines of changes here and there.
A big issue is that every team has a slightly different workflow, with different rules and requirements. The way GitHub is structured is a result of how the GitHub team works. They built the best tool for themselves with their "just keep appending commits to a PR" workflow.
Either you need to have enough flexibility so that the tool can be adapted to everyone's existing workflow. Or you need to be opinionated about your workflow (GitHub) and force everyone to match it in some way. And in most cases this works very well, because people just want you to tell them the best way of doing things and not spend time figuring out what the best workflow would look like.
I was on the lookout for the best "precommit" review tool and zeroed in on Magit, gitui, Sublime Merge.
I am not an Emacs user, so I'll have to learn this.
I suggest `git-precom` for conciseness.
More often than not, it either doesn't exist, or turns into a kind of architecture fetishism that the lead devs/architects picked up from conferences or spaceship enterprise architecture.
Already without this garbage it feels so much better than arguing about SOLID, clean code, hexagonal architecture, member functions being prefixed with an underscore, explicit types or not, ...
I'm not convinced that review comments as commits make thing easier, but I think storing them in git in some way is a good idea (i.e. git annotations or in commit messages after merge etc)
GitLab enables this - make the suggestion in-line which the original dev can either accept or decline.
This doesn't seem like much of a problem, does it? It's a matter of alt-tab and a click or two.
Also, what is the point of having reviews in the git history?
This is a pretty cool tool for it: https://github.com/sindrets/diffview.nvim
On the branch that you are reviewing, you can do something like this:
:DiffviewOpen origin/HEAD...HEAD
The patchsets get stacked up and you know where you left off if there are different changes and that is very cool.
AI can already write very good code. I have led teams of senior+ software engineers for many years. AI can write better code than most of them can at this point.
Educational establishments MUST prioritize teaching code review skills, and other high-level leadership skills.
Debatable; with the same experience, it depends on the language, existing patterns, code base, base prompts, and complexity of the task.
For human written code, shape correlates somewhat with correctness, largely because the shape and the correctness are both driven by the human thought patterns generating the code.
LLMs are trained very well at reproducing the shape of expected outputs, but the mechanism is different than humans and not represented the same way in the shape of the outputs. So the correlation is, at best, weaker with the LLMs, if it is present at all.
This is also much the same effect that makes LLMs convincing purveyors of BS in natural language, but magnified for code because people are more used to people bluffing with shape using natural language, but churning out high-volume, well-shaped, crappy substance code is not a particularly useful skill for humans to develop, and so not a frequently encountered skill. And so, prior to AI code, reviewers weren't faced with it a lot.
I find that interesting. That has always been the case at most places my friends and I have worked at that have proper software engineering practices, companies both very large and very small.
> AI can already write very good code. I have led teams of senior+ software engineers for many years. AI can write better code than most of them can at this point.
I echo @ZYbCRq22HbJ2y7's opinion. For well defined refactoring and expanding on existing code in limited scope they do well, but I have not seen that for any substantial features especially full-stack ones, which is what most senior engineers I know are finding.
If you are really seeing that then I would either worry about the quality of those senior+ software engineers or the metrics you are using to assess the efficacy of AI vs. senior+ engineers. You don't have to even show us any code: just tell us how you objectively came to that conclusion and what framework you used to compare them.
> Educational establishments MUST prioritize teaching code review skills
Perhaps more is needed but I don't know about "prioritizing"? Code review isn't something you can teach as a self-contained skill.
> and other high-level leadership skills.
Not everyone needs to be a leader and not everyone wants to be a leader. What are leadership skills anyway? If you look around the world today, it looks like many people we call "leaders" are people accelerating us towards a dystopia.
If you’re going to use AI you have to be even more diligent and self-review your code, otherwise you’re being a shitty team mate.
It's also caused an uptick in inbound to dev tooling and CI teams, since AI can break things in strange ways because it lacks common sense.
AI assisted commits on my team are "precise".
There just hasn't been as many resources yet poured into improving AI code reviews as there has for writing code.
And in the end the whole paradigm itself may change.
So where are your 3 startups?
But it is just as unable to properly reason about anything slightly more complex as when writing code.
If your PR did not fix the issue or implement the feature, that's on you, not the reviewer.
Very different situation if it's open source or an external contribution, of course.
I'm not sure there's even a tech solution to this class of problems; it comes down to culture. LGTMs exist because they satisfy the "letter of the law" but not the spirit. Classic bureaucracy problem combined with classic engineer problems. It feels like there are simple solutions but LGTMs are a hack. You try to solve this by requiring reviews, but LGTMs are just a hack around that. Fundamentally you just can't measure the quality of a review[0].

Us techie types and bureaucrats have a similar failure mode: we like measurements. But a measurement of any kind is meaningless without context. Part of the problem is that businesses treat reviewing as a second class citizen. It's not "actual work" so shouldn't be given preference, which excuses the LGTM style reviews. Us engineers are used to looking at metrics without context and get lulled into a false sense of security, or convince ourselves that we can find a tech solution to this stuff.

I'm sure someone's going to propose an LLM reviewer and hey, it might help, but it won't address the root problems. The only way to get good code reviews is for them to be done by someone capable of writing the code in the first place. Until the LLMs can do all the coding they won't make this problem go away, even if they can improve upon the LGTM bar. But that's barely a bar, it's sitting on the floor.
The problem is cultural. Code reviews are just as essential to the process as writing the code itself, and they need to be treated that way. You'll notice that companies that do good code review already do this. Then it's about making it easier to do! Reducing friction is something that should happen and that we should work on, but you could make reviewing trivial and it still wouldn't make reviews better if they aren't treated as first class citizens.
So while I like the post and think the tech here is cool, you can't engineer your way out of a social problem. I'm not saying "don't solve engineering problems that exist in the same space", but I'm making this comment because I think it is easy to ignore the social problem by focusing on the engineering problem(s). I mean, the engineering problems are orders of magnitude easier lol. But let's be real, avoiding this problem, and others like it, only adds debt. I don't know what the solution is[1], but I think we need to talk about it.
[0] Then there's the dual to the LGTM! Reviews that exist and are detailed, but petty and overly nitpicky. This is also a hack, just in a very different way: it's a misunderstanding of what review (or quality control) is. There's always room for criticism, since nothing you do will ever be perfect, but finding problems is the easy part. The hard part is figuring out which problems are important and how to properly triage them. It doesn't take a genius to complain, but it does take an expert to critique. That's why this dual can be even more harmful: it slows progress needlessly and encourages the classic nerdy petty bickering over inconsequential nuances or unknowns (as opposed to important nuances and known unknowns). If QC sees its job as finding problems, and/or their bosses measure their performance by how many problems they find, then there's a steady state where devs write code with intentional errors for QC to pick up on: QC fulfills its metric of finding issues, and the issues are easy to fix. This also matches the letter but not the spirit. This is why AI won't be able to step in without having the capacity to write the code in the first place, which solves the entire problem by making it go away (even if agents are doing this process).
[1] Nothing said here actually presents a solution. Yes, I say "treat reviews as first class citizens", but that's not a solution. Anyone claiming that this, or anything similar, is a solution is refusing to look at all the complexities that exist. It's as obtuse as saying "creating a search engine is easy, all you need to do is index all (or most) of the sites across the web." There's so much more to the problem. It's easy to oversimplify these kinds of issues, which is a big part of why they still exist.
I've been out of the industry for a while but I felt this way years ago. As long as everybody on the team has coding tasks, their review tasks will be deprioritized. I think the solution is to make Code Reviewer a job and hire and pay for it, and if it's that valuable the industry will catch on.
I would guess that testing/QA followed a similar trajectory where it had to be explicitly invested in and made into a job to compete for or it wouldn't happen.
I also think there are benefits to review being done by devs. They're already deep in the code, and review has the side benefit of broadening that scope: it helps people know what others are doing, and can even serve as a way to learn and improve your own development. I guess the question is how valuable these things are.
As for prioritization... isn't it enough knowing that other people are blocked on your review? That's what incentivizes me to get to the reviews quickly.
I guess it's always going to depend a lot on your coworkers and your organization. If the culture is more about closing tickets than achieving some shared goal, I don't know what you could do to make things work.
WORKFLOW
Every repository is personal and the reviewer merges, kernel style. Merging is taking ownership: the reviewer merges into their own tree when they are happy and not before. By implication there is always one primary code reviewer; there is never a situation where someone picks three reviewers and they all wait for someone else to do the work. The primary reviewer is on the hook for the deliverable as much as the reviewee is.
There is no web based review tool. Git is managed by a server configured with Gitolite. Everyone gets their own git repository under their own name, into which they clone the product repository. Everyone can push into everyone else's repos, but only to branches matching /rr/{username}/something and this is how you open a pull request. Hydraulic is an IntelliJ shop and the JetBrains git UI is really good, so it's easy to browse open RRs (review requests) and check them out locally.
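For concreteness, here is a minimal sketch of what the Gitolite side of this could look like. The group, repo and user names below are made up, not taken from the actual setup; the point is just that Gitolite's CREATOR and USER keywords support exactly this kind of per-user repo and per-user branch rule:

    # gitolite.conf (hypothetical excerpt)
    @devs = alice bob mike

    # every developer gets a clone of the product repo under their own name
    repo clones/CREATOR/product
        C   = @devs          # any dev may create their own clone
        RW+ = CREATOR        # full control over your own tree
        R   = @devs          # everyone can fetch everyone else's tree
        # others may push into your repo, but only to review-request
        # branches under their own username (USER = the pushing user)
        RW  rr/USER/         = @devs

Opening a review request against alice's tree is then just a push (again, the names are illustrative):

    git remote add alice gitolite@server:clones/alice/product
    git push alice my-feature:rr/bob/my-feature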
Reviewing means pushing changes onto the rr branch. Either the reviewer makes the change directly (much faster than nitpicky comment roundtrips), or they add a //FIXME comment that IntelliJ is configured to render in lurid yellow and purple for visibility. It's up to the reviewee to clear all the FIXMEs before a change will be merged. Because IntelliJ is very good at refactoring, reviewers turn out to be willing to make much bigger improvements to a change than you'd normally get via web based review discussions. All the benefits the article discusses are there, only 100x stronger, because IntelliJ is so good at static analysis: a lot of bugs that sneak past regular code review are caught this way because reviewers can see live static analysis results.
Sometimes during a review you want to ask questions. 90% of the time, this is because the code isn't well documented enough and the solution is to put the question in a //FIXME that's cleared by adding more comments. Sometimes that would be inappropriate because the conversation would have no value to others, and it can be resolved via chat.
Both reviewee and reviewer are expected to properly squash and rebase things. It's usually easier to let commits pile up during the review so both sides have state on the changes, and the reviewer then squashes code review commits into the work before merging. To keep this easy most review requests should turn into one or two commits at most. There should not be cases where people are submitting an RR with 25 "WIP" commits that are all tangled up. So it does require discipline, but this isn't much different to normal development.
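To picture the reviewer side, here is a rough sketch of the git mechanics under the assumptions above (the remote, branch and user names are hypothetical): the reviewer checks out the RR branch from their own repo, pushes fixes or //FIXME comments back onto it, and once everything is cleared, squashes the review commits and merges into their own tree.

    # alice reviews bob's request from her own repository
    git fetch origin rr/bob/my-feature
    git checkout -b review/my-feature origin/rr/bob/my-feature

    # make fixes directly or add //FIXME comments, then push them back
    git commit -am "review: simplify error handling"
    git push origin HEAD:rr/bob/my-feature

    # once the FIXMEs are cleared: squash the review noise, take ownership, merge
    git rebase -i master
    git checkout master
    git merge --ff-only review/my-feature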
RATIONALE
1. Conventional code review can be an exhausting experience, especially for junior developers, who make more mistakes. Every piece of work comes back with dozens of nitpicky comments that don't seem important and are a lot of drudge work to apply. That leads to frustration, burnout and interpersonal conflict. Reviewees may not understand what is being asked of them, resulting in wasted time. So latency is often much lower if the reviewer just makes the changes directly in their IDE and pushes; people can then study the commits and learn from them.
2. Conventional projects can struggle to scale up because the codebase becomes a commons. Like in a communist state things degrade and litter piles up, because nobody is fully responsible. Junior developers or devs under time pressure quickly work out who will give them the easiest code review experience and send all the reviews to them. CODEOWNERS are the next step, but it's rare that the structure of your source tree matches the hierarchy of technical management in your organization so this can be a bad fit. Instead of improving widely shared code people end up copy/pasting it to avoid bringing in more mandatory reviewers. It's also easy for important but rarely changed directories to be left out, resulting in changes to core code slowing down because it'd require the founder of the company to approve a trivial refactoring PR.
FINDINGS
Well, it worked well for me at small scale (decent sized codebase but a small team). I never scaled it up to a big team although it was inspired by problems seen managing a big team.
Because most questions are answered by improving code comments rather than by replying in a web UI, the answers also end up helping LLMs. LLMs work really well in my codebase and I think that's partly due to the plentiful documentation.
Sometimes the lack of a web UI for browsing code was an issue. I experimented with using IntelliJ link format, but of course not everyone wants to use IntelliJ. I could have set up a web UI over git just for source browsing, without the full GitHub experience, but in the end never bothered.
Gitolite is a very UNIXy set of Perl scripts. You need a gray beard to use it well. I thought about SaaSifying this workflow but it never seemed worth it.
We’ve gone a slightly different route on my team. Instead of reinventing the workflow around Gitolite/IntelliJ, we layered in LiveReview (https://hexmos.com/livereview/). It’s not as hardcore, but it gives us a similar payoff: reviewers spend less time on drudge work because LiveReview auto-catches a ton of the small stuff (we’re seeing ~40% fewer prod bugs). That leaves humans free to focus on the bigger design and ownership questions, the stuff machines can’t solve.
Different tools, same philosophy: make review faster, saner, and more about code quality than bureaucracy.