This is indeed what's missing from the ecosystem at large. People seem to be under the impression that as soon as a new version of a piece of software/library/OS/application is released, you need to move to it today. They don't seem to actually look through the changes, only doing that if anything breaks, and then proceed to upgrade because "why not" or "it'll only get harder in the future", neither of which feels like a solid choice considering the trade-offs.
While we already seemed to know that staying at the bleeding edge of version numbers introduces massive churn and unneeded work, it seems like we're waking up to the realization that it is a security trade-off as well. Sadly, not enough tooling seems to take this into account (yet?).
What would happen from time to time was that an important reason did come up, but the team was now many releases behind. Whoever was unlucky enough to sign up for the project that needed the updated dependency now had to do all those updates of the dependency, including figuring out how they affected a bunch of software that they weren't otherwise going to work on. (e.g., for one code path, I need a bugfix that was shipped three years ago, but pulling that into my component affects many other code paths.) They now had to go figure out what would break, figure out how to test it, etc. Besides being awful for them, it creates bad incentives (don't sign up for those projects; put in hacks to avoid having to do the update), and it's also just plain bad for the business because it means almost any project, however simple it seems, might wind up running into this pit.
I now think of it this way: either you're on the dependency's release train or you jump off. If you're on the train, you may as well stay pretty up to date. It doesn't need to be every release the minute it comes out, but nor should it be "I'll skip months of work and several major releases until something important comes out". So if you decline to update to a particular release, you've got to ask: am I jumping off forever, or am I just deferring work? If you think you're just deferring the decision until you know if there's a release worth updating to, you're really rolling the dice.
(edit: The above experience was in Node.js. Every change in a dynamically typed language introduces a lot of risk. I'm now on a team that uses Rust, where knowing that the program compiles and passes all tests gives us a lot of confidence in the update. So although there's a lot of noise with regular dependency updates, it's not actually that much work.)
I don’t think this is specific to any one language or environment; it just gets more difficult the larger your project is and the longer you go without updating dependencies.
I’ve experienced this with NPM projects, with Android projects, and with C++ (neglecting to merge upstream changes from a private fork).
It does seem likely that dynamic languages make this problem worse, but I don’t think very strict statically typed languages completely avoid it.
Meanwhile, my recent legacy Java project migration from JDK 8 -> 21, plus a ton of dependency upgrades, has been a pretty smooth experience so far.
I don't like Java but sometimes I envy their ecosystem.
I'd prefer to upgrade around the time most of the nasty surprises have already been discovered by somebody else, preferably with workarounds developed.
At the same time, you don't want to be so far back that upgrading uncovers novel migration problems, or issues that nobody else cares about anymore.
That's been my experience as well. In addition, the ecosystem largely holds to semver, which means a non-major upgrade tends to be painless, and conversely, if there's a major upgrade, you know not to put it off for too long because it'll involve some degree of migration.
Although this is true, any large ecosystem will have some popular packages not holding to semver properly. Also, the biggest downside is when your `>=v1` dependency depends - usually indirectly - on a `v0` dependency, which is allowed to make breaking changes.
For instance if you use a package that provides a calendar widget and your app uses only the “western” calendar and there is a critical vulnerability that only manifests in the Islamic calendar, you have zero reason to worry about an exploit.
I see this as a reasonable stance.
pkg_staleness = (1 * missed_patch + 5 * missed_minor + 10 * missed_major) * (month_diff(installed_release_date, latest_release_date)/2)^2
proj_staleness = sum(pkg.staleness * pkg.weight for pkg in all_packages)

If you break my code I'm not wasting time fixing what you broke, I'm fixing the root cause of the bug: finding your replacement.
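A rough Python rendering of the staleness heuristic two comments up (the field names and weights simply mirror that formula; nothing here is prescriptive):

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Package:
    installed_release_date: date
    latest_release_date: date
    missed_patch: int   # patch releases behind
    missed_minor: int   # minor releases behind
    missed_major: int   # major releases behind
    weight: float = 1.0 # how much this package matters to the project

def month_diff(a: date, b: date) -> float:
    # Approximate number of months between two release dates.
    return (b - a).days / 30.0

def pkg_staleness(p: Package) -> float:
    behind = 1 * p.missed_patch + 5 * p.missed_minor + 10 * p.missed_major
    age = month_diff(p.installed_release_date, p.latest_release_date) / 2
    return behind * age ** 2

def proj_staleness(all_packages: list[Package]) -> float:
    return sum(pkg_staleness(p) * p.weight for p in all_packages)
```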
I think I wouldn't object to "Dependabot on a 2-week delay" as something that at least flags updates. However, working in Go more than anything else, I often found that dependency alerts were just an annoyance if they weren't tied to a security issue or something. Dynamic languages and static languages do not have the same risk profiles at all. The idea some people have that all dependencies are super vital to update all the time, and the casual expectation of a constant stream of vital security updates, is not a general characteristic of programming; it is a specific characteristic not just of certain languages but arguably of the community attached to those languages.
(What we really need is capabilities, even at a very gross level, so we can all notice that the supposed vector math library suddenly at version 1.43.2 wants to add network access, disk reading, command execution, and cryptography to the set of things it wants to do, which would raise all sorts of eyebrows immediately, even perhaps in an automated fashion. But that's a separate discussion.)
Doing updates on a regular basis (weekly to monthly) seems like a good idea so you don't forget how to do them and the work doesn't pile up. Also, it's easier to debug a problem when there are fewer changes at once.
But they could be rescheduled depending on what else is going on.
that's generally true, no?
of course waiting a few days/weeks should be the minimum unless there's a CVE (or equivalent) that applies
(Literally at one place we built a SPA frontend that was embedded in the device firmware as a static bundle, served to the client and would then talk to a small API server. And because these NodeJS types liked to have libraries reused for server and frontend, we would get endless "vulnerability reports" - but all of this stuff only ever ran in the client's browser!)
Most tooling (e.g. Dependabot) allows you to set an interval between version checks. What more could be done on that front exactly? Devs can already choose to check less frequently.
EDIT: Github supports this scenario too (as mentioned in the article):
https://github.blog/changelog/2025-07-01-dependabot-supports...
https://docs.github.com/en/code-security/dependabot/working-...
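For reference, a dependabot.yml cooldown block looks roughly like this (key names are taken from the changelog linked above, so double-check them against the docs before relying on this):

```yaml
version: 2
updates:
  - package-ecosystem: "npm"
    directory: "/"
    schedule:
      interval: "weekly"
    cooldown:
      default-days: 7        # wait a week before proposing any update
      semver-major-days: 30  # wait longer for riskier major bumps
      semver-minor-days: 14
      semver-patch-days: 7
```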
The whole premise of the article is that they’re substantially lower, because some time for the ecosystem of dependency scanners and users to detect and report is better than none.
The practical problem with this is that many large organizations have a security/infosec team that mandates a "zero CVE" posture for all software.
Where I work, if our infosec team's scanner detect a critical vulnerability in any software we use, we have 7 days to update it. If we miss that window we're "out of compliance" which triggers a whole process that no one wants to deal with.
The path of least resistance is to update everything as soon as updates are available. Consequences be damned.
Come 2027-12, the Cyber Resilience Act enters full enforcement. The CRA mandates a "duty of care" for the product's lifecycle, meaning if a company blindly updates a dependency to clear a dashboard and ships a supply-chain compromise, they are on the hook for fines up to €15M or 2.5% of global turnover.
At that point, yes, there is a sense in which the blind update strategy you described becomes a legal liability. But don't miss the forest for the trees, here. Most software companies are doing zero vetting whatsoever. They're staring at the comet tail of an oncoming mass extinction event. The fact that you are already thinking in terms of "assess impact" vs. "blindly patch" already puts your workplace significantly ahead of the market.
We had like 1 or 2 crash-patches in the past - Log4Shell was one of them, and blocking an API no matter what in a component was another one.
In a lot of other cases, you could easily wait a week or two for directly customer facing things.
The solution is to fire those teams.
What you should do instead is talk with them about SLAs and validation. For example, commit to patching CRITICAL within x days, HIGH with y, etc. but also have a process where those can be cancelled if the bug can be shown not to be exploitable in your environment. Your CISO should be talking about the risk of supply chain attacks and outages caused by rushed updates, too, since the latter are pretty common.
So it's more of a cost-cutting/cover-your-ass measure than an actual requirement.
Let's say the reg says you're liable for damages caused by software defects you ship due to negligence, giving you broad leeway in how to mitigate risks. The corporate policy then says "CVEs with score X must be fixed in Y days; OWASP best practices; infrastructure audits; MFA; yadda yadda". Finally, the enforcement is done by automated tooling like sonarqube, prisma, dependabot, burpsuite, ... and any finding must be fixed with little nuance, because the people doing the scans lack the time or expertise to assess whether any particular finding is actually security-relevant.
On the ground the automated, inflexible enforcement and friction then leads to devs choosing approaches that won't show up in scans, not necessarily secure ones.
As an example I witnessed recently: a cloud infra scanning tool highlighted that an AppGateway was used as a TLS-terminating reverse proxy, meaning it used HTTP internally. The tool says "HTTP bad", even when it's on an isolated private subnet. But the tool didn't understand Kubernetes clusters, so a public unencrypted ingress, i.e. public HTTP, didn't show up. The former was treated as a critical issue that must be fixed asap or the issue would get escalated up the management chain. The latter? Nobody cares.
Another time I got pressure to downgrade from Argon2 to SHA2 for password hashing because Argon2 wasn't on their whitelist. I resisted that change but it was a stressful bureaucratic process with some leadership being very unhelpful and suggesting "can't you just do the compliant thing and stop spending time on this?".
So I agree with GP that some security teams barely correlate with security, sometimes going into the negative. A better approach would be to integrate software security engineers into dev teams, but that'd be more expensive and less measurable than "tool says zero CVEs".
Browsers get a lot of unknown input, so they have to update often.
A Weather app is likely to only get input from one specific site (controlled by the app developers), so it should be relatively safe.
Yes
"Only then do you need to update that specific dependency right away."
Big no. If you do that it is guaranteed one day you miss a vulnerability that hurts you.
To frame it differently: What you propose sounds good in theory but in practice the effort to evaluate vulnerabilities against your product will be higher than the effort to update plus taking appropriate measures against supply chain attacks.
Nobody is proposing a system that utterly and completely locks you out of all updates if they haven't aged enough. There is always going to be an override switch.
Most of them assume that because they are working on some publicly accessible website, 99% of the people and orgs in the world are running nothing but publicly accessible websites.
Dependencies are good. Churn is bad.
It's a shame some ecosystems move waaay too fast, or don't have a good story for distro-specific packages. For example, I don't think there are Node.js libraries packaged for Debian that allow you to install them from apt and use them in projects. I might be wrong.
An ecosystem moving too quickly, when it isn't being fundamentally changed, isn't a sign of a healthy ecosystem, but of a pathological one.
No one can think that JS has progressed substantially in the last three years, yet trying to build any three-year-old project without updates is so hard that a rewrite is a reasonable solution.
Are we talking about the language, or the wider ecosystem?
If the latter, I think a lot of people would disagree. Bun is about three years old.
Other significant changes are Node.js being able to run TypeScript files without any optional flags, or being able to use require on ES Modules. I see positive changes in the ecosystem in recent years.
The point of javascript is to display websites in the browser.
Ask yourself, in the last three years has there been a substantial improvement in the way you access websites? Or have they gotten even slower, buggier and more annoying to deal with?
You are comparing the js ecosystem with bad project implementations/designs.
> Action vs motion
I think the main difference you mean is the motivation behind changes: is it a (re)action to achieve a measurable goal, is it a fix for a critical CVE, or just some dev having fun and pumping up the numbers?
GP mentioned the recent feature of executing TS, which is a reasonable goal imo, with a lot of beneficial effects down the line, but in the present just another hassle to take care of. So is this a pointless motion or a worthy action? Both statements can be correct, depending on your goals.
And after all, isn’t developer velocity (and investor benefits) really the only things that matter???
/sssss
Web search shows some: https://packages.debian.org/search?keywords=node&searchon=na... (but also shows "for optimizing reasons some results might have been suppressed" so might not be all)
Although probably different from other distros, Arch for example seems to have none.
apt-cache showpkg 'node-*' | grep ^Package:
which returns 4155 results, though 727 of them are type packages.

Using these in commonjs code is trivial; they are automatically found by `require`. Unfortunately, system-installed packages are yet another casualty of the ESM transition ... there are ways to make it work but it's not automatic like it used to be.
A small price to pay for the abundant benefits ESM brings.
In particular, the fact that Typescript makes it very difficult to write a project that uses both browser-specific and node-specific functionality is particularly damning.
It's comparing the likelihood of an update introducing a new vulnerability to the likelihood of it fixing a vulnerability.
While the article frames this problem in terms of deliberate, intentional supply chain attacks, I'm sure the majority of bugs and vulnerabilities were never supply chain attacks: they were just ordinary bugs introduced unintentionally in the normal course of software development.
On the unintentional bug/vulnerability side, I think there's a similar argument to be made. Maybe even SemVer can help as a heuristic: a patch version increment is likely safer (less likely to introduce new bugs/regressions/vulnerabilities) than a minor version increment, so a patch version increment could have a shorter cooldown.
If I'm currently running version 2.3.4, and there's a new release 2.4.0, then (unless there's a feature or bugfix I need ASAP), I'm probably better off waiting N days, or until 2.4.1 comes out and fixes the new bugs introduced by 2.4.0!
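A sketch of that heuristic as code; the day counts here are arbitrary placeholders, and it assumes plain MAJOR.MINOR.PATCH version strings:

```python
def cooldown_days(current: str, candidate: str) -> int:
    # Longer cooldowns for larger (riskier) version bumps.
    cur = [int(x) for x in current.split(".")]
    new = [int(x) for x in candidate.split(".")]
    if new[0] > cur[0]:
        return 30  # major bump: most likely to break or regress
    if new[0] == cur[0] and new[1] > cur[1]:
        return 14  # minor bump: new features, new bugs
    return 7       # patch bump: smallest change surface

# e.g. cooldown_days("2.3.4", "2.4.0") == 14
```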
> I'm sure the majority of bugs and vulnerabilities were never supply chain attacks: they were just ordinary bugs introduced unintentionally in the normal course of software development.
Yes, absolutely! The overwhelming majority of vulnerabilities stem from normal accidental bug introduction -- what makes these kinds of dependency compromises uniquely interesting is how immediately dangerous they are versus, say, a DoS somewhere in my network stack (where I'm not even sure it affects me).
From their point of view it is a trade-off between volume of vulnerable targets, management impatience and even the time value of money. Time to market probably wins a lot of arguments that it shouldn't, but that is good news for real people.
The cooldown security scheme appears like some inverse "security by obscurity": nobody could see a backdoor, therefore we can assume security. This scheme stands and falls with the assumed timelines. Once this assumption tumbles, picking a cooldown period becomes guesswork. (Or another compliance box ticked.)
On the other side, the assumption can very well be sound, maybe ~90% of future backdoors can be mitigated by it. But who can tell. This looks like the survivorship bias, because we are making decisions based on the cases we found.
Libraries themselves should perhaps also take a page from the book of Linux distributions and offer LTS (long term support) releases that are feature frozen and include only security patches, which are much easier to reason about and periodically audit.
Can you cite an example of a moderately-widely-used open source project or library that is pulling in code as a dependency that you feel it should have replicated itself?
What are some examples of "everything libraries" that you view as problematic?
So if you’re adding chalk, that generally means you don’t know jack about terminals.
If chalk emits sequences that aren't supported by your terminal, then that's a deficiency in chalk, not the programs that wanted to produce colored output. It's easier to fix chalk than to fix 50,000 separate would-be dependents of chalk.
The problem is also less about the implementation I want, it's about the 10,000 dependencies of things I don't really want. All of those are attack surface much larger than some simple function.
Of course, small libraries get a bad rap because they're often maintained by tons of different people, especially in less centralized ecosystems like npm. That's usually a fair assessment. But a single author will sometimes maintain 5, 10, or 20 different popular libraries, and adding another library of theirs won't really increase your social attack surface.
So you're right about "pull[ing] in universes [of package maintainers]". I just don't think complexity or number of packages are the metrics we should be optimizing. They are correlates, though.
(And more complex code can certainly contain more vulnerabilities, but that can be dealt with in the traditional ways. Complexity begets simplicity, yadda yadda; complexity that only begets complexity should obviously be eliminated)
Of course it's up to developers to weigh the tradeoffs and make reasonable choices, but now we have a lot more optionality. Reaching for a dependency no longer needs to be the default choice of a developer on a tight timeline/budget.
In many cases I was able to replace 10s of lines of code with a single function call to a dependency the project already had. In very few cases did I have to add a new dependency.
But directly relevant to this discussion is the story of the most copied code snippet on Stack Overflow of all time [1]. Turns out, it was buggy. And we had more than one copy of it. If it hadn't been for the due diligence effort, I'm 100% certain they would still be there.
* Introducing a library with two GitHub stars from an unknown developer
* Introducing a library that was last updated a decade ago
* Introducing a library with a list of aging unresolved CVEs
* Pulling in a million lines of code that you're reasonably confident you'll never have a use for 99% of
* Relying on an insufficiently stable API relative to the team's budget, which risks eventually becoming an obstacle to applying future security updates (if you're stuck on version 11.22.63 of a library with a current release of 20.2.5, you have a problem)
Each line of code included is a liability, regardless of whether that code is first-party or third-party. Each dependency in and of itself is also a liability and ongoing cost center.
Using AI doesn't magically make all first-party code insecure. Writing good code and following best practices around reviewing and testing is important regardless of whether you use AI. The point is that AI reduces the upfront cost of first-party code, thus diluting the incentive to make short-sighted dependency management choices.
I'd still rather have the original than the AI's un-attributed regurgitation. Of course the fewer users something has, the more scrutiny it requires, and below a certain threshold I will be sure to specify an exact version and leave a comment for the person bumping deps in the future to take care with these.
> Introducing a library that was last updated a decade ago
Here I'm mostly with you, if only because I will likely want to apply whatever modernisations were not possible in the language a decade ago. On the other hand, if it has been working without updates in a decade, and people are STILL using it, that sounds pretty damn battle-hardened by this point.
> Introducing a library with a list of aging unresolved CVEs
How common is this in practice? I don't think I've ever gone library hunting and found myself with a choice between "use a thing with unsolved CVEs" and "rewrite it myself". Normally the way projects end up depending on libraries with lists of unresolved CVEs is by adopting a library that subsequently becomes unmaintained. Obviously this is a painful situation to be in, but I'm not sure it's worse than if you had replicated the code instead.
> Pulling in a million lines of code that you're reasonably confident you'll never have a use for 99% of
It very much depends - not all imported-and-unused code is equal. Like yeah, if you have Flask for your web framework, SQLAlchemy for your ORM, Jinja for your templates, well you probably shouldn't pull in Django for your authentication system. On the other hand, I would be shocked if I had ever used more than 5% of the standard library in the languages I work with regularly. I am definitely NOT about to start writing my rust as no_std though.
> Relying on an insufficiently stable API relative to the team's budget, which risks eventually becoming an obstacle to applying future security updates (if you're stuck on version 11.22.63 of a library with a current release of 20.2.5, you have a problem)
If a team does not have the resources to keep up to date with their maintenance work, that's a problem. A problem that is far too common, and a situation that is unlikely to be improved by that team replicating the parts of the library they need into their own codebase. In my experience, "this dependency has a CVE and the security team is forcing us to update" can be one of the few ways to get leadership to care about maintenance work at all for teams in this situation.
> Each line of code included is a liability, regardless of whether that code is first-party or third-party. Each dependency in and of itself is also a liability and ongoing cost center.
First-party code is an individual liability. Third-party code can be a shared one.
If what you need happens to be exactly what the library provides — nothing more, less, or different — then I see where you're coming from. The drawback is that the dependency itself remains a liability. With such an obscure library, you'll have fewer eyes watching for supply chain attacks.
The other issues are that 1) an obscure library is more likely to suddenly become unmaintained; and 2) someone on the team has to remember to include it in scope of internal code audits, since it may be receiving little or no other such attention.
> On the other hand, I would be shocked if I had ever used more than 5% of the standard library in the languages I work with regularly.
Hence "non-core". A robust stdlib or framework is in line with what I'm suggesting, not a counterexample. I'm not anti-dependency, just being practical.
My point is that AI gives developers more freedom to implement more optimal dependency management strategies, and that's a good thing.
> unlikely to be improved by that team replicating the parts of the library they need into their own codebase
At no point have I advised copying code from libraries instead of importing them.
If you can implement a UI component that does exactly what you want and looks exactly how you want it to look in 200 lines of JSX with no dependencies, and you can generate and review the code in less than five minutes, why would you prefer to install a sprawling UI framework with one component that does something kind of similar that you'll still need to heavily customize? The latter won't even save you upfront time anymore, and in exchange you're signing up for years of breaking changes and occasional regressions. That's the best case scenario; worst case scenario it's suddenly deprecated or abandoned and you're no longer getting security updates.
It seems like you're taking a very black-and-white view in favor of outsourcing to dependencies. As with everything, there are tradeoffs that should be weighed on a case-by-case basis.
Limiting the number of dependencies, but then rewriting them in your own code, will also increase the maintenance burden and compile times.
Bottom line those security bugs are not all from version 1.0 , and when you update you may well just be swapping known bugs for unknown bugs.
As has been said elsewhere - sure monitor published issues and patch if needed but don't just blindly update.
One great example of that is log4shell. If you were still using version 1.0 (log4j 1.x), you were not vulnerable, since the bug was introduced in version 2.0 (log4j 2.x). There were some known vulnerabilities in log4j 1.x, but the most common configuration (logging only to a local file or to the console, no remote logging or other exotic stuff) was not affected by any of them.
These days it seems most software just changes mostly around the margins, and doesn't necessarily get a whole lot better. Perhaps this is also a sign I'm using boring and very stable software which is mostly "done"?
One of the classic scammer techniques is to introduce artificial urgency to prevent the victim from thinking clearly about a proposal.
I think this would be a weakness here as well: If enough projects adopt a "cooldown" policy, the focus of attackers would shift to manipulate projects into making an exception for "their" dependency and install it before the regular cooldown period elapsed.
How to do that? By playing the security angle once again: An attacker could make a lot of noise how a new critical vulnerability was discovered in their project and every dependant should upgrade to the emergency release as quickly as possible, or else - with the "emergency release" then being the actually compromised version.
I think a lot of projects could come under pressure to upgrade, if the perceived vulnerability seems imminent and the only argument for not upgrading is some generic cooldown policy.
If it's "only" technical access, it would probably be harder.
Ok, if this is amazing advice and the entire ecosystem does it: just wait .... then what? We wait even more to be sure someone else is affected first?
Every time I see people saying you need to wait to upgrade, it feels like accumulating tech debt: the more you wait, the more painful the upgrade will be. Just upgrade incrementally and be sure you have mitigations like zero trust or monitoring to cut off any weird behavior early.
There are a lot of companies out there that scan packages and analyze them. Maintainers might notice a compromise because a new release was published that they didn't authorize. Or just during development, by getting all their bitcoin stolen ;)
Not enough to accumulate tech debt, enough to mitigate the potential impact of any supply-chain vulnerability.
The attacker will try to figure out when they are the least available: during national holidays, when they sleep, during conferences they attend, when they are on sick leave, personal time off, ...
Many projects have only a few people, or even only a single person, who's going to notice. They are often from the same country (time zone) or even work at the same company (they might all attend the same conference or company retreat weekend).
Except if everyone does it, the chance of malicious things being spotted in source also drops, by virtue of fewer eyeballs.
Still helps, though, in cases where the maintainer spots it, etc.
I don't think the people automatically updating and getting hit with the supply chain attack are also scanning the code, I don't think this will impact them much.
If instead, updates are explicitly put on cooldowns, with the option of manually updating sooner, then there would be more eyeballs, not fewer, as people are more likely to investigate patch notes, etc., possibly even test in isolation...
The underlying premise here is that supply chain security vendors are honest in their claims about proactively scanning (and effectively detecting + reporting) malicious and compromised packages. In other words, it's not about eyeballs (I don't think people who automatically apply Dependabot bumps are categorically reading the code anyways), but about rigorous scanning and reporting.
Instead of a period where you don't use the new version, shouldn't we be promoting a best practice of not just blindly using a package or library in production? This "cooldown" should be a period of use in dev or QA environments while we take the time to investigate the libraries we use and their dependencies. I know this can be difficult in many languages and package managers, given the plethora of libraries and dependencies (I'm looking at you in particular, JavaScript). But "it's hard" shouldn't really be a good excuse for not making our best effort to maintain secure and stable applications.
If the code just sits there for a week without anyone looking at it, and is then considered cooled down just due to the passage of time, then the cool down hasn't done anything beneficial.
A form of cooldown that could work in terms of mitigating problems would be a gradual rollout. The idea is that the published change is somehow not visible to all downstreams at the same time.
Every downstream consumer declares a delay factor. If your delay factor is 15 days, then you see all new change publications 15 days later. If your delay factor is 0 days, you see everything as it is published, immediately. Risk-averse organizations configure longer delay factors.
This works because the risk-takers get hit with the problem, which then becomes known, protecting the risk-averse from being affected. Bad updates are scrubbed from the queue so those who have not yet received them due to their delay factor will not see those updates.
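A sketch of that gating logic, assuming the registry exposes publish timestamps and a way to scrub bad releases; all the names here are hypothetical:

```python
from datetime import datetime, timedelta, timezone

def visible_releases(releases, delay_days: int, now: datetime | None = None):
    # Only show releases that have aged past this consumer's delay factor
    # and have not been scrubbed from the queue in the meantime.
    # `releases` is assumed to be an iterable of objects with a
    # `published_at` datetime (UTC) and a `yanked` flag.
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=delay_days)
    return [
        r for r in releases
        if not r.yanked and r.published_at <= cutoff
    ]
```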
Although, I suppose we've probably updated the patch version.
My understanding of a cooldown, from video games, is a period of time after using an ability where you can't use it again. When you're firing a gun and it gets too hot, you have to wait while it cools down.
I was trying to apply this to the concept in the article and it wasn't quite making sense. I don't think "cooldown" is really used when taking a pie out of the oven, for example. I would call this more a trial period or verification window or something?
I can't in good conscience, say "Don't use dependencies," which solves a lot of problems, but I can say "Let's be careful, out there," to quote Michael Conrad.
I strongly suspect that a lot of dependencies get picked because they have a sexy Website, lots of GH stars and buzz, and cool swag.
I tend to research the few dependencies that I use. I don't depend lightly.
I'm also fortunate to be in the position where I don't need too many. I am quite aware that, for many stacks, there's no choice.
A sane "cooldown" is just for automated version updates relying on semantic versioning rules, which is a pretty questionable practice in the first place, but is indeed made a lot more safe this way.
You can still manually update your dependency versions when you learn that your code is exposed to some vulnerability that's purportedly been fixed. It's no different than manually updating your dependency version when you learn that there's some implementation bug or performance cliff that was fixed.
You might even still use an automated system to identify these kinds of "critical" updates and bring them to your attention, so that you can review them and can appropriately assume accountability for the choice to incorporate them early, bypassing the cooldown, if you believe that's the right thing to do.
Putting in that effort, having the expertise to do so, and assuming that accountability is kind of your "job" as a developer or maintainer. You can't just automate and delegate everything if you want people to be able to trust what you share with them.
I'm reminded of how new Setuptools versions are able to cause problems for so many people, basically because install tools default to setting up isolated build environments using the latest version of whatever compatible build backend is specified (which in turn defaults to "any version of Setuptools"). qv. my LWN article https://lwn.net/Articles/1020576/ .
There's no reason to pretend we live in a world where everyone is manually combing through the source of every dependency update.
The probability of introducing bugs is a function of the amount of development being done. Releasing less often doesn't change that. In fact, under that assumption, delaying releases strictly increases the amount of time users are affected by the average bug.
People who do this tell themselves the extra time allows them to catch more bugs. But in my experience that's a bedtime story; most bugs aren't noticed until after deployment anyway.
That's completely orthogonal to slowly rolling out changes, btw.
(This also doesn't apply to vulnerabilities per se, since known vulnerabilities typically aren't evaluated against cooldowns by tools like Dependabot.)
(Can you say more about what you found unclear in the post? The post definitely does not say "thou shall not update before the cooldown," the argument was that cooldowns are a great default. Engineers are fundamentally always expected to exercise discretion because, per the post, there's no single, sound, perfect solution to supply chain risks.)
^ This is what you wrote. I don't understand how that could possibly be interpreted any other way than I wrote above: an enforced delay on deploying the new code after upstream releases it.
> The post definitely does not say "thou shall not update before the cooldown," the argument was that cooldowns are a great default
Sorry, that is such a cop out. "I didn't actually mean you should do this, I mean you should consider if you should maybe do this and you are free to decide not to and don't argue with me if you disagree every case is different". Either take a stand or don't.
The argument advanced in the post is IMO clear: cooldowns are a sensible default to have, and empirically seem to be effective at mitigating the risk of compromised dependencies. I thought I took sufficient pains to be clear that they're not a panacea.
Me saying your proposed policy is bad is in no way predicated on any assumption you intended it to be "universal". Quite the opposite: the last thing anybody needs at work is yet another poorly justified bullshit policy they have to constantly request an "exception" to to do their job...
If you tell people that cooldowns are a type of test and that until the package exits the testing period, it's not "certified" [*] for production use, that might help with some organizations. Or rather, would give developers an excuse for why they didn't apply the tip of a dependency's dev tree to their PROD.
So... not complaining about cooldowns, just suggesting some verbiage around them to help contextualize the suitability of packages in the cooldown state for use in production. There are, unfortunately, several mid-level managers who are under pressure to close Jira tickets IN THIS SPRINT and will lean on the devs to cut whichever corners need to be cut to make it happen.
[*] for some suitable definition of the word "CERTIFIED."
$ npm i
78 packages upgraded
and upgrading just one dependency from 3.7.2_1 to 3.7.2_2, after carefully looking at the code of the bugfix.

The cooldown approach makes the automatic upgrades of the former kind much safer, while allowing for the latter approach when (hopefully rarely) you actually need a fix ASAP.
Austral[0] gets this right. I'm not a user, just memeing a good idea when I see it.
Most languages could be changed to be similarly secure. No global mutable state, no system calls without capabilities, no manual crafting of pointers. All the capabilities come as tokens or objects passed in to the main, and they can be given out down the call tree as needed. It is such an easy thing to do at the language level, and it doesn't require any new syntax, just a new parameter in main, and the removal of a few bad ideas.
If the module provides some small helper function like `appendSmileEmoji(x: String) -> String` then it shouldn't be accessing the network or the filesystem, and the only memory it should have access to is the string passed into the function. The programmer already provided enough information to explain to the compiler that the function should not do these things but poor language design allows them to happen anyways.
Components which do have access to the network should be vetted thoroughly because they can facilitate infiltration and exfiltration. Most supply chain attacks aren't introduced directly in a high profile project like a web framework, they are introduced deeper in the dependency chain in the long tail of transitively imported modules.
In Austral (or any language using these good ideas) a small poorly vetted function like `appendSmileEmoji` wouldn't be called with access to the filesystem or network. Without access to any resources, the worst it can do is append a frowning face or a null character or something annoying. You can tell by its function signature above, it doesn't take a capability for the network or filesystem, just a String. So trying to ship a change that secretly accesses the filesystem would produce a "variable not found" error. To even get the new version to compile, you would have to add an argument to take a filesystem capability. `appendSmileEmoji(x: String, fs: FS) -> String`. That breaks any dependent modules because they have the wrong number of arguments, and looks very suspicious.
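The same idea sketched outside Austral, in Python (which doesn't enforce capabilities; this just illustrates the explicit-capability style described above, with hypothetical types):

```python
class Filesystem:
    # Capability object: only code that is explicitly handed one can touch disk.
    def write(self, path: str, data: bytes) -> None: ...

def append_smile_emoji(x: str) -> str:
    # No Filesystem or Network parameter, so in a capability-enforcing
    # language this function simply cannot read files or phone home.
    return x + " :)"

def save_report(report: str, fs: Filesystem) -> None:
    # Needs disk access, so it must be handed the capability explicitly.
    fs.write("report.txt", append_smile_emoji(report).encode())

def main(fs: Filesystem) -> None:
    # All capabilities enter at main and are passed down the call tree.
    save_report("all systems nominal", fs)
```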
P.S. When I was working at Amazon, I remember that a good number of on-call tickets were about fixing dependencies (most of them about updating the outdated Scala Spark framework--I believe it was 2.1.x or older) and patching/updating OSes in our clusters. What the team should have done (I mentioned this to my manager) is create clusters dynamically (do not allow long-lived clusters even if the end users prefer it that way) and upgrade the Spark library. Of course, we had a bunch of other annual and quarterly OKRs (and KPIs) to meet, so updating Spark got the lowest of priorities...
- you are vulnerable for 7 days because of a now public update
- you are vulnerable for x (hours/days) because of a supply chain attack
I think the answer is rather simple: subscribe to a vulnerability feed, evaluate & update. The number of times automatic updates are necessary is near zero, speaking as someone who has run libraries that were at times 5 to 6 years out of date, exposed to the internet, without a single compromise; and it's not like these were random services, they were viewed by hundreds of thousands of unique addresses. There were only 3 times in the last 4 years where I had to perform updates because a vulnerability actually affected a publicly exposed service.
Okay, the never being compromised part is a lie because of PHP, it's always PHP (the monero miner I am sure everyone is familiar with). The solution for that was to stop using PHP and associated software.
Another one I had problems with was CveLab (GitLab if you couldn't tell): there have been so many critical updates pointing to highly exploitable CVEs that I decided to simply migrate off it.
In conclusion, avoiding bad software is just as important as updates; in my experience it lowers the need for quick and automated actions.
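A sketch of the "subscribe to a feed, evaluate, then update" workflow from the comment above, here querying the public osv.dev API (endpoint and JSON shape as I understand them; verify against the OSV docs before depending on this):

```python
import json
import urllib.request

def known_vulns(name: str, version: str, ecosystem: str = "npm") -> list:
    # Ask the OSV database whether this exact pinned version has known advisories.
    payload = json.dumps({
        "version": version,
        "package": {"name": name, "ecosystem": ecosystem},
    }).encode()
    req = urllib.request.Request(
        "https://api.osv.dev/v1/query",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp).get("vulns", [])

# Only when this comes back non-empty is there a reason to evaluate and
# consider updating ahead of any cooldown.
```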
Stacking up more sub-par tooling is not going to solve anything.
Fortunately this is a problem that doesn't even have to exist, and isn't one that anyone falls into naturally. It's a problem that you have to actively opt into by taking steps like adding things to .gitignore to exclude them from source control, downloading and using third-party tools in a way that introduces this and other problems, et cetera—which means you can avoid all of it by simply not taking those extra steps.
(Fun fact: on a touch-based QWERTY keyboard, the gesture to input "vendoring" by swiping overlaps with the gesture for "benefitting".)
The practice I described is neither "never upgrading" your dependencies nor predicated upon that. And your comment ignores that there's an entire other class of problem I alluded to (reproducibility/failure-to-reproduce) that normal, basic source control ("vendoring") sidesteps almost entirely, if not entirely in the majority of cases. That, too, is without addressing the fact that what you're tacitly treating as bedrock is not, in fact, bedrock (even if it can be considered the status quo within some subculture).
The problem of "supply chain attacks" (which, to reiterate, is coded language for "not exercising source control"—or, "pretending to, but not actually doing it") is a problem that people have to opt in to.
It takes extra tooling.
It takes extra steps.
You have to actively do things to get yourself into the situation where you're susceptible to it—and then it takes more tooling and more effort to try and get out. In contrast, you can protect yourself from it +and+ solve the reproducibility problem all by simply doing nothing[1]—not taking the extra steps to edit .gitignore to exclude third-party code (and, sometimes, first-party code) from the SCM history, not employing lock files to undo the consequences of selectively excluding key parts of the input to the build process (and your app's behavior).
All these lock file schemes that language package manager projects have belatedly and sheepishly begun to offer as a solution are inept attempts at doing what your base-level RCS already does for free (if only all the user guides for these package managers weren't all instructing people to subvert source control to begin with).
Every package manager lock file format (requirements file, etc.) is an inferior, ad hoc, formally-specified, error-prone, incompatible re-implementation of half of Git.
You cannot vendor yourself out of a nuclear waste pile that is the modern OSS ecosystem.
The only thing I've claimed is that keeping dependencies under source control neutralizes the supply chain attacks that the author of the post describes.
They each belong to two totally different genres of comment.
If you have something concrete to say about the relationship between supply chain attacks and dependencies being excluded from source control in lieu of late-fetching them right at/before build time, then go for it.
It helps a lot with these dev cycles.
Recently, I saw even EmberJS moved away from it. I still think they could've done it without leaving semver. Maybe I'm wrong.
* If everybody does it, it won't work so well
* I've seen cases where folks pinned their dependencies, and then used "npm install" instead of "npm ci", so the pinning was worthless. Guess they are the accidental, free beta testers for the rest of us.
* In some ecosystems, distributions (such as Debian) do both additional QA and also apply a cooldown. Now we try to retrofit some of that into our package managers.
Quoting from the docs:
> This command installs a package and any packages that it depends on. If the package has a package-lock, or an npm shrinkwrap file, or a yarn lock file, the installation of dependencies will be driven by that [..]
Things that everybody does: breathe. Eat. Drink. Sleep. And a few other things that are essential to being alive.
Things that not everybody does: EVERYTHING else.
Indeed, this is a complex problem to solve.
And the "it won't work so well" of this is probably a general chilling effect on trying to fix things because people won't roll them out fast enough anyway.
This may seem theoretical, but for example on websites where there are suppliers and customers, there's quite a chilling effect from any mechanism that encourages people to wait until a supplier has positive feedback; there are fewer and fewer people with low enough stakes who are willing to be early adopters in that situation.
What this means is that new suppliers often drop out too quickly, abandon platforms, work around those measures in a way that reduces the value of trust, and worse still there's a risk of bad reviews because of the reviewer's Dunning-Kruger etc.
I think the mechanism is important for people who really must use it, but there will absolutely be side effects that are hard to qualify/correct.
Something like: upgrade once there are N independent positive reviews AND fewer than M negative reviews (where you can configure which people or organisations you trust to audit). And of course you would be able to audit dependencies yourself (and make your review available for others).
Imagine a world where every software update has hundreds of signoffs from companies across the industry. That is achievable if we work together. For only a few minutes a day, you too can save a CVSS 10.0 vulnerability from going unpatched :)
Also, there should be a way to distinguish between security updates and normal updates for this. If there is, a cooldown is a useful idea in general for normal updates, since (presumably) the current version works and the new version may introduce bugs.
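A sketch of the review-threshold gate described above (the thresholds, verdict labels, and review feed are all hypothetical):

```python
def upgrade_allowed(reviews, trusted_auditors, n_positive=3, m_negative=1) -> bool:
    # Allow the upgrade once N trusted, independent positive reviews exist
    # and fewer than M trusted negative reviews do.
    # `reviews` is assumed to be an iterable of (auditor, verdict) pairs.
    trusted = [(who, verdict) for who, verdict in reviews if who in trusted_auditors]
    positives = sum(1 for _, verdict in trusted if verdict == "positive")
    negatives = sum(1 for _, verdict in trusted if verdict == "negative")
    return positives >= n_positive and negatives < m_negative
```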
The short time span isn’t just because exploits get attention: it’s to allow the groups which do automated analysis time to respond. Significantly increasing the challenge level for an attacker to introduce a vulnerability is a meaningful improvement even if it doesn’t prevent that class of attack entirely.
The article does not discuss this tradeoff.
Clearly I should have mentioned that Dependabot (and probably others) don't consider cooldown when suggesting security upgrades. That's documented here[1].
[1]: https://docs.github.com/en/code-security/dependabot/working-...
Instead of updating the cache of dependencies you have immediately, the suggestion is to use the cooldown to wait....
As you point out, this means that you can have a stale cache member that has a critical fix available upstream.
Next week's solution - have a dependency management tool that alerts you when critical fixes are created upstream for dependencies you have
Followed by - now the zero day authors are publishing their stuff as critical fixes...
Hilarity ensues
Copilot seems well placed with its GitHub integration here, it could review dependency suggestions, CVEs, etc and make pull requests.
I don't know, but it's the claimed truth from a lot of vendors! The value proposition for a lot of supply chain security products is a lot weaker if their proactive detection isn't as strong as claimed.
Though I guess it'd be hard to prove intent in this case.
I know Ubuntu and others do the same but I don't know what they call their STS equivalent.
If there's a bugfix or security patch that applies to how your application uses the dependency, then you review the changes, manually update your version if you feel comfortable with those changes, and accept responsibility for the intervention if it turns out you made a mistake and rushed in some malicious code.
Meanwhile, most of the time, most changes pushed to dependencies are not even in the execution path of any given application that integrates with them, and so don't need to be rushed in. And most others are "fixes" for issues that were apparently not presenting an imminent test failure or support crisis for your users and don't warrant being rushed in.
There's not really a downside here, for any software that's actually being actively maintained by a responsible engineer.
> Meanwhile, most of the time, most changes pushed to dependencies are not even in the execution path of any given application that integrates with them
Sorry, this is really ignorant. You don't appreciate how much churn there is in things like the kernel and glibc, even in stable branches.
You're correct, because it's completely neurotic to worry about phantom bugs that nobody is actually aware of but that must absolutely positively be resolved as soon as a candidate fix has been pushed.
If there's a zero day vulnerability that affects your system, which is a rare but real thing, you can be notified and bypass a cooldown system.
Otherwise, you've presumably either adapted your workflow to work around a bug or you never even recognized one was there. Either way, waiting an extra <cooldown> before applying a fix isn't going to harm you, but it will dampen the much more dramatic risk of instability and supply chain vulnerabilities associated with being on the bleeding edge.
Well, I've made a whole career out of fixing bugs like that. Just because you don't see them doesn't mean they don't exist.
It is shockingly common to see systems bugs that don't trigger for a long time by luck, and then suddenly trigger out of the blue everywhere at once. Typically it's caused by innocuous changes in unrelated code, which is what makes it so nefarious.
The most recent example I can think of was an uninitialized variable in some kernel code: hundreds of devices ran that code reliably for a year, but an innocuous change in the userland application made the device crash on startup almost 100% of the time.
The fix had been in stable for months, they just hadn't bothered to upgrade. If they had upgraded, they'd have never known the bug existed :)
I can tell dozens of stories like that, which is why I feel so strongly about this.
The point is that a change to a completely unrelated component caused a latent bug to make the device unusable, which ended up delaying a release for weeks and caused them to have to pay me a bunch of money to fix it for them.
If they'd been applying upgrades, they would have never even known it existed, and all that trouble would have been avoided.
I mean, I'm sort of working against my own interest here: arguably I should want people to delay upgrades so I can get paid to backport things for them :)
Of course we talk about OS security upgrades here, not library dependencies. But the attack vector is similar.
If upgrading like that scares you, your automated testing isn't good enough.
On average, the most bug free Linux experience is to run the latest version of everything. I wasted much more time backporting bugfixes before I started doing that, than I have spent on new bugs since.
Maybe your codebase truly is that riddled with flaws, but:
1) If so, updating will not save you from zero days, only from whatever bugs the developers have found.
2) Most updates are not zero day patches. They are as likely to (unintentionally) introduce zero days as they are to patch them.
3) In the case where a real issue is found, I can't imagine it's hard to use the aforementioned security vendors and their recommendations to force updates outside of a cooldown period.
Although, even for Operating Systems, cooldown periods on patches are not only a good thing, but something that e.g. a large org that can't afford downtime will employ (managing windows or linux software patches, e.g.). The reasoning is the same - updates have just as much chance to introduce bugs as fix them, and although you hope your OS vendor does adequate testing, especially in the case where you cannot audit their code, you have to wait so that either some 3rd party security vendor can assess system safety, or you are able to perform adequate testing yourself.
Some of these can be short-lived, existing only in one minor patch and fixed promptly in the next, but you’ll get hit by them if you blindly upgrade to the latest constantly.
There are always risks either way, but the latest version doesn’t mean the “best” version: mistakes, errors, performance degradation, etc. happen.
I think you're using a different definition of zero day than what is standard. Any zero day vulnerability is not going to have a patch you can get with an update.
You will not know how long it takes to get a zero day fixed, because zero in "zero day" ends when the vendor is informed:
> "A zero day vulnerability refers to an exploitable bug in software that is unknown to the vendor."
Yes. I'm saying that's wrong.
The default should always be to upgrade to new upstream releases immediately. Only in exceptional cases should things be held back.
Realistically, no one does full side-by-side tests with and without lockfiles, but it's a good idea to at least do a smoke test or two that way.
I've been working on automatic updates for some of my [very overengineered] homelab infra and one thing that I've found particularly helpful is to generate PRs with reasonable summaries of the updates with an LLM. it basically works by having a script that spews out diffs of any locks that were updated in my repository, while also computing things like `nix store diff-closures` for the before/after derivations. once I have those diffs, I feed them into claude code in my CI job, which generates a pull request with a nicely formatted output.
one thing I've been thinking is to lookup all of those dependencies that were upgraded and have the LLM review the commits. often claude already seems to lookup some of the commits itself and be able to give a high level summary of the changes, but only for small dependencies where the commit hash and repository were in the lock file.
it would likely not help at all with the xz utils backdoor, as IIRC the backdoor wasn't even in the git repo, but on the release tarballs. but I wonder if anyone is exploring this yet?
Totally agree that AI is great for this, it will work harder and go deeper and never gets tired of reading code or release notes or migration guides. What you want instead of summaries is to find the breaking changes, figure out if they matter, then comment on _that_.
For projects with hundreds or thousands of active dependencies, the feed of security issues would be a real fire hose. You’d want to use an LLM to filter the security lists for relevance before bringing them to the attention of a developer.
It would be more efficient to centralize this capability as a service so that 5000 companies aren’t all paying for an LLM to analyze the same security reports. Perhaps it would be enough for someone to run a service like cooldown.pypi.org that served only the most vetted packages to everyone.