What twenty years of DevOps has failed to do
69 points
22 hours ago
| 28 comments
| honeycomb.io
| HN
mosura
19 hours ago
[-]
It failed because there is an ongoing denial that development and operations are two distinct skillsets.

If you think 10x devs are unicorns, consider how much harder it is to get someone 10x at the intersection of both domains. (Personally, I have never met one.) You are far better off with people that can work together across the bridge, but that requires actual mutual trust and respect, and we’re not able to do that.

reply
whynotmaybe
4 hours ago
[-]
The goal of dev is to be able to change everything whenever they want.

The goal of ops is to have a strong infra that has the fewest changes possible.

They are opposites, and usually there are more devs than ops, but the first responders to an issue are ops.

You can only have devops if both roles are intertwined in the same team AND the organization understands the implications.

Everywhere I've been, devops was just an excuse to transfer ops responsibilities to dev because devs were cheaper. Devs became first responders without having knowledge of the infrastructure.

So devs insisted on having Docker so that they would be the ones managing the infra.

But everyone failed to see that whatever expensive tools you buy, the biggest issue was the lack of personal investment in solving a problem.

If you are a 1.5x dev in a 0.9x team, you get all the incidents, and are still expected to build new stuff.

And building new stuff is fun.

Spending two days analyzing a performance issue because a 0.3x dev found it easier to do a .sort() in LINQ instead of SQL is fun only once.
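To put the same anti-pattern in different terms, here's a minimal sketch in Python/SQLAlchemy (hypothetical Order model and in-memory SQLite; the original gripe was about C# LINQ, but the shape is identical):

    from datetime import datetime
    from sqlalchemy import create_engine, select, DateTime, Integer
    from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, Session

    class Base(DeclarativeBase):
        pass

    class Order(Base):
        __tablename__ = "orders"
        id: Mapped[int] = mapped_column(Integer, primary_key=True)
        created_at: Mapped[datetime] = mapped_column(DateTime)

    engine = create_engine("sqlite://")
    Base.metadata.create_all(engine)

    with Session(engine) as session:
        # Anti-pattern: fetch every row, then sort in application memory.
        rows = session.execute(select(Order)).scalars().all()
        latest = sorted(rows, key=lambda o: o.created_at, reverse=True)[:50]

        # Better: let the database sort and limit, using its own indexes.
        latest = session.execute(
            select(Order).order_by(Order.created_at.desc()).limit(50)
        ).scalars().all()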

reply
vee-kay
18 hours ago
[-]
From someone who has managed both Development teams and Operations teams for decades... trust me, they are different beasts and have to be handled/tackled differently.

Expecting Devs or Ops to do both types of work is usually asking for trouble, unless the organization is geared up from the ground up for such seamless work. It is more of a corporate problem than a problem of team working style or of work expectations and behavior.

The same goes for Agile vs Waterfall. Agile works well if the organization is inherently (or overhauled to be) agile; otherwise it doesn't.

reply
SkiFire13
7 hours ago
[-]
> Expecting Devs or Ops to do both types of work is usually asking for trouble, unless the organization is geared up from the ground up for such seamless work.

Could you expand on this? How would an organization be geared up for this?

reply
antod
13 hours ago
[-]
> You are far better off with people that can work together across the bridge, but that requires actual mutual trust and respect, and we’re not able to do that.

Wasn't that the original goal of DevOps? Getting dev and ops out of their silos and getting them collaborating? The "make devs do ops" definition seemed to come along later.

reply
Conan_Kudo
5 hours ago
[-]
The original goal of DevOps never happened. Companies immediately jumped on this with "rationalizations" and "integrations" to make it so fewer people were in charge of more things.
reply
DarkNova6
14 hours ago
[-]
But is DevOps a role or a principle?

The way I have seen it in my career is to have operational and development capabilities within the same team. And the idea of a "DevOps guy" is a guy "developing operations integrations".

As opposed to completely siloing ops and dev.

reply
pjmlp
13 hours ago
[-]
For most companies, it is a role, the new name for IT administrators.
reply
dsr_
14 hours ago
[-]
DevOps is the practice of using modern software methods to automate the tasks of operations work. That includes using version control, templating languages and various forms of role-based configuration automation.

Anyone who thinks they can hire a devop or declare that they do devops is as deluded as 97% of the folks who claim that they are doing Agile. (If you are firmly on the other side of each of the four values of the Agile Manifesto, you may or may not be doing great software development, but it's not Agile.)

The problem with the typical DevOps team is that there's no operations expertise.

reply
pjmlp
13 hours ago
[-]
My experience is that most companies don't do Agile, and DevOps is basically a sysadmin who also happens to own Jenkins or similar.
reply
tbrownaw
17 hours ago
[-]
> You are far better off with people that can work together across the bridge, but that requires actual mutual trust and respect, and we’re not able to do that.

Are you claiming it's fundamentally impossible for people to get along, or just that positive interpersonal relationships can't be reliably forced at scale?

reply
tsss
18 hours ago
[-]
You don't need 10x developers. You just need to avoid the 1/10 multiplier of pitting separate development and operations teams against each other.
reply
ramoz
18 hours ago
[-]
I mean, look at Kubernetes though. You have to understand both the application and the infrastructure in order to get the deployment right, especially in any instance of having to pin the runtime to a particular type of resource (specific disk I/O behavior, GPUs, etc.).
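Concretely, a pod spec ends up encoding both kinds of knowledge. A rough sketch with the official Kubernetes Python client (the image name, values, and node label are made up):

    from kubernetes import client

    container = client.V1Container(
        name="indexer",
        image="registry.example.com/indexer:latest",   # made-up image
        resources=client.V1ResourceRequirements(
            requests={"cpu": "4", "memory": "8Gi"},    # app knowledge: what it actually needs
            limits={"nvidia.com/gpu": "1"},            # infra knowledge: GPU via the device plugin
        ),
    )
    pod_spec = client.V1PodSpec(
        containers=[container],
        node_selector={"storage": "local-nvme"},       # infra knowledge: pin to fast local disk
    )

Get either side wrong and the deployment is either unschedulable or starved at runtime.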
reply
firesteelrain
18 hours ago
[-]
My experience has been that devs don’t understand their own app resource requirements
reply
ramoz
18 hours ago
[-]
This would be considered a failure, or are you saying they don't need to?
reply
firesteelrain
17 hours ago
[-]
I am saying that, in my experience, they get upset when the VM or container they provision blows up because it lacks enough resources, or because they do not place guardrails on their app and it ends up getting OOMKilled.
reply
silverquiet
14 hours ago
[-]
I think my favorite interaction with a dev around this was when I was explaining how his Java program looked like a big juicy target for the OOM killer, which had killed it in order to keep the system working. His response was, "I don't care about the system, I care about my program!" And he understood the irony of that, but it was a good reminder that we have somewhat different views and priorities.
reply
7bit
7 hours ago
[-]
A developer not caring about the system is why the file explorer is painfully slow today, compared to 15 years ago.

If a programmer doesn't care about the system, I already know he's shit at his job.

reply
Spivak
14 hours ago
[-]
And that frustration makes sense in the context of the article. Devs don't care about any of that stuff because they're customer facing; it's a distraction from their primary responsibility.

It would be like asking Amazon delivery drivers to care about oil changes and tire rotations. It's much easier to have a team of mechanics whose primary responsibility is enabling drivers to just drive and focus on delivering packages.

reply
verdverm
17 hours ago
[-]
That's not a Kubernetes-specific issue. If you run on VMs or at the edge, devs also need to know the resource requirements. If anything, k8s makes that consistent and as easy as setting a config section (assuming you have the observability to know what good values are). The default behavior I've seen is to set requests without limits, so you get scheduled but not OOMKilled.
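A minimal sketch of that "requests without limits" pattern, using the Kubernetes Python client (values made up):

    from kubernetes import client

    resources = client.V1ResourceRequirements(
        requests={"cpu": "500m", "memory": "512Mi"},  # the scheduler reserves this much
        limits=None,  # no memory cap, so no limit-triggered OOMKill (node pressure can still evict you)
    )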
reply
jhawk28
19 hours ago
[-]
DevOps is dead because it's run by a bunch of ops people who don't know how to do dev and a bunch of dev people who don't know how to do ops. The only tooling problem is that a bunch of companies created "DevOps tools" that teams are then told to use: K8s, Terraform, etc. The only way this works is if you build the application to fit within those frameworks, like an indexer that is massively parallel and mainly constrained by CPU/memory. Instead, you have devs building something that gets thrown over the fence to a devops team that then containerizes it and throws it on K8s. What happens if the application requires lots of IOPS or network bandwidth? K8s doesn't schedule applications that way. "Oh, you can customize the scheduler to take that into account." 2 years later, it's still not "customized", because they are ops people who don't know how to code. If you do customize it, the API is going to change in a few months, which will break when you upgrade.
reply
orsorna
18 hours ago
[-]
Would you say it's truly dead or that it fails to meet the performance bar you've described?

The reality is that most devs do not consider a holistic picture that includes the infrastructure they will be deploying to. In many cases, it's certainly a skill issue; good devs are hard to find. And to flip the coin, it's hard to find good ops people too.

The reason DevOps continues to linger, however vague a discipline it is, is that it allows the business to differentiate between revenue-generating roles and cost-center roles. You want your dev resources to prioritize feature work, at the behest of PMs or upper management, and let your "DevOps" resources be responsible for actually getting the product deployed.

In essence, it's a ploy to further commoditize engineering roles, because finding unicorns that understand the picture top-to-bottom is difficult (finding /top/ talent is difficult!). In this way, DevOps is alive and well, as a Romero zombie.

reply
BanAntiVaxxers
18 hours ago
[-]
There are not very many ops people who cannot code. Especially these days. I spent at least the last 20 years doing ops. Ops people are HIGHLY motivated to create things that DON’T FAIL. However, ops teams are often blocked by MANAGERS from doing essentially development in the prod environment. I’m talking about tools and scripts. At the places I’ve worked with the highest uptime, it was because ops had an unlimited, unfettered free hand.

Remove the handcuffs from your ops team and your reliability will SOAR.

reply
verdverm
17 hours ago
[-]
Average ops people have never been less capable of, and more averse to, programming than they are now. The problem is getting worse, not better. I know because I am in ops, one of the few who loves to code, and I entered the field accidentally.
reply
0xbadcafebee
14 hours ago
[-]
But the same is true of devs. Many of them are pretty clueless about coding. It's a whole generation of "bootcamp people" who were designers or bartenders and heard there were more lucrative jobs.
reply
rcoder
13 hours ago
[-]
I think that any kind of “modern ops” necessarily includes coding, even if there isn’t a ton of Python or Rust being generated as part of the workflow.

Kubernetes deployment configurations and Ansible playbooks are code. PromQL is code. Dockerfiles and cloud-init scripts are code. Terraform HCL is code.

It’s all code I personally hate writing, but that doesn’t make it less valid “software development” than (say) writing React code.

reply
CaveTech
13 hours ago
[-]
These things are not nearly equivalent. It's writing code, but it's not software engineering.
reply
SlightlyLeftPad
8 hours ago
[-]
Correct, it’s systems engineering.
reply
bigstrat2003
11 hours ago
[-]
No way. I have worked in ops for 20 years now; almost everyone knows how to code. Some enjoy it and some don't, but people are capable of it and will do it when needed.
reply
verdverm
9 hours ago
[-]
I agree many can code, though a subset are certainly doing more scripting than engineering (like a typical 3-tier app).

There is also a subset that is very allergic to coding at this point. I've interviewed enough to see people who only know HCL/YAML. There is enough need and work (waste?) in the space that roles like this can exist.

reply
mjr00
20 hours ago
[-]
> most orgs are used to responding to a daytime alert by calling out, “Who just shipped that change?” assuming that whoever merged the diff surely understands how it works and can fix it post-haste. What happens when nobody wrote the code you just deployed, and nobody really understands it?

I assume the first time this happens at any given company will be the moment they realize fully autonomous code changes made on production systems by agents is a terrible idea and every change needs a human to take responsibility for and ownership of it, even if the changes were written by an LLM.

reply
hippo22
20 hours ago
[-]
What happens if the person who wrote the code went on vacation? What happens if the code is many years old and no current team member has touched the code?

Understanding code you didn't personally write is part of the job.

reply
solid_fuel
13 hours ago
[-]
I agree that understanding legacy code and code by other people is part of the job, but I don't see how these points are related.

> What happens if the person who wrote the code went on vacation?

They get yelled at, because shipping code at 5 pm on Friday and then leaving for vacation is typically considered a "dick move".

> What happens if the code is many years old and no current team member has touched the code?

Then the issue probably isn't caused by a recent deployment?

reply
0xbadcafebee
13 hours ago
[-]
> every change needs a human to take responsibility for and ownership of it, even if the changes were written by an LLM

Actually it could be the opposite: they hold the LLM responsible. When the code change breaks production they'll just ask the LLM to fix it. If it can't? "Not my fault, the LLM wrote it not me! We just need to improve our prompting next time!" Never underestimate humans' capacity to avoid doing work.

reply
blutoot
19 hours ago
[-]
I think the opposite will happen - leadership will forgo this attitude of "reverse course on the first outage".

Teams will figure out how to mitigate such situations in the future without sacrificing the potential upside of "fully autonomous code changes made on production systems" (e.g. invest more in a production-like env for test coverage).

Software engineering purists have to get out of some of these religious beliefs.

reply
verdverm
17 hours ago
[-]
> Software engineering purists have to get out of some of these religious belief

To me, the Claude superfans like yourself are the religious ones, like how you run around proffering unsubstantiated claims like this and believing in / anthropomorphizing way too much. Is it because Anthropic is an abbreviation of Anthropomorphic?

reply
blutoot
16 hours ago
[-]
I would have been in the skeptics' camp 3-4 months ago. Opus-4.5 and GPT-5.2 have changed my mind. I'm not talking about mere code completion. I am talking about these models AND the corresponding agents playing a really, really capable software engineer + tester + SRE/Ops role.

The caveat is that we have to be fairly good at steering them in the right direction, as things stand today. It is exhausting to do it the right way.

reply
verdverm
15 hours ago
[-]
I agree the latest gen of models, Opus 4.5 and Gemini 3, are more capable. 5.2 is OpenAI squeezing as much as they can out of 4 because they haven't had a successful pre-training run since Ilya left.

I disagree that they are really, really capable engineers et al. They have moments where they shine like one. They also have moments where they perform worse than a new grad/hire. This is not what a really, really capable engineer looks like. I don't see this fundamentally changing, even with all the improvements we are seeing. It's lower level and more core than something that adding more layers on top can resolve; the layers only address it as best they can.

reply
throwaway7783
17 hours ago
[-]
In my own anecdotal experience, Claude Code found a bug in production faster than I could. I was the author of the said code, which was written 4 years ago by hand. GP's claim perhaps is not all that unsubstantiated. My role is moving more towards QA/PM nowadays.
reply
verdverm
17 hours ago
[-]
I have many wins with AI; I also have many hard fails. This experience helps me understand where their limits are.

Do you have hard fails to share along with your wins? Are we going to only share our wins like stonk hussies?

reply
throwaway7783
16 hours ago
[-]
For sure. Not hard fails, but bad fixes. It confidently thought it fixed a bug, but it really didn't. I could only tell (it was fairly complex) because I tried reproducing it before/after. Ultimately I believe there was not sufficient context provided to it. It has certainly failed to do what I asked it to do in round 1, round 2, but eventually got it right (a rendering issue for a barcode designer).

These incidents have become less and less frequent over the last year - switching to Opus reduced the failure frequency. Same thing for code reviews. Most of it is fluff, but it does give useful feedback if the instructions are good. For example, I asked for a blind code review of a PR ("Review this PR"), and it gave some generic commentary. I made the prompt more specific ("Follow the API changes across modules and see impact") - it found a serious bug.

The number of times I had to give up in frustration has been going down over the last year. So I tend to believe a swarm of agents could do a decent job of autonomous development/maintenance over the next few years.

reply
majormajor
12 hours ago
[-]
Leadership will do what customers demand, which in most cases won't be ship-constantly-and-just-mitigate.

How to find problems through testing before they happen is a decades-long unsolved problem, sadly.

reply
jauntywundrkind
15 hours ago
[-]
Even lesser agents are incredibly good and incredibly fast at using tools to inspect the system, come up with ideas for things to check, and check them. I absolutely agree: we will 100% give the agents far more power. A browser, a debugger for the server that works with that browser instance, a database tool, an OpenTelemetry tool.

Teams are going to figure out how to mitigate bad deploys by using even more AI and giving it even better information gathering.

reply
tarxvf
19 hours ago
[-]
If companies were generally capable of that level of awareness they would not operate the way that they do.
reply
verdverm
18 hours ago
[-]
YAML is my #1 failure in devops. That so many have resigned themselves to this limit and no longer seek to improve is disappointing. Our job is to make things run better and easier, yet so many won't recognize the biggest pains in their own work. Seriously, is text templating an invisibly scoped language really where you think the field has reached maturity?
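A contrived sketch of that failure mode with PyYAML (made-up keys): the templater is just pasting strings, so it knows nothing about YAML's scoping or typing.

    import yaml

    template = "replicas: {n}\nmotd: {msg}\n"

    yaml.safe_load(template.format(n=2, msg="hello"))
    # {'replicas': 2, 'motd': 'hello'}

    yaml.safe_load(template.format(n=2, msg="on"))
    # {'replicas': 2, 'motd': True}   <- the type silently changed

    yaml.safe_load(template.format(n=2, msg="note: restart at 10pm"))
    # parse error: the extra colon turns the value into a nested mapping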
reply
firesteelrain
18 hours ago
[-]
JSON is so much easier in my experience and less prone to error.
reply
verdverm
18 hours ago
[-]
JSON does not have comments, and no, JSON5 is not the answer either.

Think bigger; it's not something you are using today. The next config language should have schemas built in and support for modules/imports so we can do sharing/caring. It should look and feel like the config languages we currently use and interoperate with all of them. It will be a single configuration fabric across the SDLC.

This exists today for you to try, with CUE

I've been cooking up something the last few weeks for those interested, CUE + Dagger

https://github.com/hofstadter-io/hof/tree/_next/examples/env

reply
throwaway7783
17 hours ago
[-]
Like XML? :)
reply
lijok
17 hours ago
[-]
Like Python?
reply
verdverm
17 hours ago
[-]
CUE
reply
lijok
15 hours ago
[-]
Why not Python?
reply
NewJazz
12 hours ago
[-]
Typing is bolted on rather than a native concept, for one.
reply
lijok
7 hours ago
[-]
Why is that a problem?
reply
verdverm
10 hours ago
[-]
Invisible scoping, and it's Turing complete.

Python is better than bash for ops; I've been using more Go in this space.

Config is another beast and deserves its own languages.

reply
lijok
7 hours ago
[-]
I'm not sold that config is a complex enough domain to necessitate another language. What problems is CUE solving compared to Python, and why are those problems substantial enough to make it worth learning a new language?
reply
firesteelrain
5 hours ago
[-]
Given that we now have TOML, JSON, INI, CSV, YAML, etc., it seems we are converging on either JSON, YAML, or TOML. There is too much inertia behind those three and not much behind CUE right now.
reply
firesteelrain
17 hours ago
[-]
I genuinely despise the indenting requirements of YAML.

For comments, I use a _comment field in my custom JSON-reading apps.

reply
orev
14 hours ago
[-]
I dislike the idea of _comment because it’s something that is parsed and becomes part of the data structure in memory. Comments should be ignored and not parsed.
reply
firesteelrain
4 hours ago
[-]
When I wrote a custom deployment tool for some lab deployments, my Python-based tool used JSON as the config language; the comments were parsed, I guess, but not kept in my data structure. They were dropped.
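Roughly this pattern, as a minimal sketch (not the actual tool; keys are made up):

    import json

    raw = json.loads('{"_comment": "lab cluster only", "replicas": 3}')
    config = {k: v for k, v in raw.items() if k != "_comment"}
    # config == {'replicas': 3}; the comment never reaches the rest of the tool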
reply
verdverm
17 hours ago
[-]
Yeah, this is what I'm talking about: innovation has stopped and we do dirty hacks like `imports: [...]` in YAML and `_comment` in JSON.

How are people not embarrassed by this complete lack of quality in their work?

reply
firesteelrain
17 hours ago
[-]
I don't think we need anything formal resembling XML to replace JSON. It was originally meant for over-the-wire payloads, and people like myself use it for more than that.
reply
verdverm
17 hours ago
[-]
You're still thinking "good enough". I'm advocating for the "we can do so much better" attitude.

The current popular config choices cause a lot of extra work, bugs, and effort. Is improving the status quo not a worthy goal anymore? Are we at a point in history where we throw our hands up and say "meh, I'll deal with this"? That is basically where people are today. (I'm somewhat a believer in this based on anecdata and vibes.)

reply
firesteelrain
14 hours ago
[-]
The uncomfortable reality is that config formats don’t win by being best. They win by being:

1. already installed everywhere,

2. easy to parse in every language,

3. supported by editors/linters/CI tools,

4. stable enough that vendors bet on them.

reply
verdverm
10 hours ago
[-]
The config language we write does not have to be the same thing the programs read. It's the same analogy as compilers and assembly.
reply
dissent
16 hours ago
[-]
Write it in a higher level language and generate the YAML from that. See the YAML as a wire protocol, not something you author things in.
reply
verdverm
16 hours ago
[-]
Exactly, which is why interop with everything that exists today is important.

However, you don't want config being Turing complete; that creates a host of other problems at a layer where you don't want them.

reply
dissent
16 hours ago
[-]
I know what you mean, but there seems to be some kind of misplaced fear about this, which has led us down the garden path of unmaintainable config (or even trying to Jinja-template it!).

If your config is Turing complete and consumed as-is, then without a lot of discipline you can dig yourself into a hole, sure.

If you're producing YAML, which is not Turing complete, that constraint means you have to code in a way that produces deterministic output. It's actually very safe, and YAML maps 1:1 to types in something like Python.

My favourite go-to example is for AWS Cloudformation:

https://github.com/cloudtools/troposphere
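A tiny sketch of what that looks like (made-up resource names); the Python is the source of truth and the template is generated output:

    from troposphere import Output, Ref, Template
    from troposphere.s3 import Bucket

    t = Template()
    bucket = t.add_resource(Bucket("LogsBucket"))
    t.add_output(Output("LogsBucketName", Value=Ref(bucket)))

    print(t.to_yaml())   # or t.to_json(); either way, nobody hand-edits the output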

reply
hahahahhaah
7 hours ago
[-]
YAML has its place, and it is great for describing what your single microservice needs.
reply
acdha
4 hours ago
[-]
YAML is okay for writing structured prose for humans. It's terrible for anything consumed by programs, because even that single microservice has a high likelihood of some problem caused by YAML's magic typing, silent data loss due to indentation, etc., unless you pair it with a separate validation toolchain, which makes the argument for simplicity increasingly dubious.
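A few of the well-known PyYAML (YAML 1.1) surprises behind the magic typing complaint; every value here looks like a string to a human reader:

    import yaml

    yaml.safe_load("country: no")      # {'country': False}
    yaml.safe_load("version: 3.10")    # {'version': 3.1}
    yaml.safe_load("window: 10:30")    # {'window': 630}  (sexagesimal integer)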
reply
bmitch3020
18 hours ago
[-]
DevOps only failed in that so many don't know what it is.

DevOps isn't a tool, but there are lots of tools that make it easier to implement.

DevOps isn't how management can eliminate half the org and have one person do two roles; specialization is still valuable.

DevOps isn't an organization structure, though the wrong org structure can make it fail.

DevOps is collaboration. It's getting two distinct roles to better interoperate. The dev team that wants to push features fast. And the ops team that wants stability and uptime.

From the management side, if you aren't focused on building teams that work well together, eliminating conflicts, rewarding the team collectively for features and uptime, and giving them the resources to deliver, that's not a DevOps failure, that's a management failure.

reply
acdha
3 hours ago
[-]
I think this is the key insight: most of the problems are related to management decisions so they’re only DevOps failures to the extent that the movement failed to get political pressure to fix those.
reply
mgilroy
17 hours ago
[-]
I'd argue that it has failed in some organisations. DevOps for me is embedding operations within the development team. I still have operations specialists; however, they attend the development team stand-ups and help articulate the problems to the developers. They may have separate operations stand-ups and meetings to ensure the operations teams know what they are doing and share best practices. Developers learn about the operations side from those that understand it well, and the operations experts learn the limitations and needs of the developers. Occasionally I am fortunate to discover someone that can understand both areas incredibly well. Either way, this results in increased trust and closer working. You don't care about helping some random person on a ticket from a team you don't know. You do care about the person you work with daily and understand the problems they have.

If you can't account for someone spending x% of their time working with a team but for budgetary purposes belonging to a different team then sack your accountants.

DevOps, like agile, when done correctly should help create teams that understand complete systems or areas of a business and work more efficiently than standalone teams would. The other part of the puzzle is to include the QA team too, to ensure that the impact of full system, performance, and integration tests is understood by all and that everyone understands how their changes impact everything else.

Having the dev team build code that makes the test and ops teams life easier benefits everyone. Having the ops team provide solutions that support test and dev helps everyone. Having test teams build system that work best with the Dev and ops teams helps everyone.

Agile development should enable teams to work at a higher level of performance by granting them the agency to make the right decisions at the right time to deliver a better product by building what is needed in the correct timeline.

DevOps and agile fail where companies try to follow waterfall models whilst claiming agile processes. The goal with all these business and operating models is to improve efficiency. When that isn't happening then either you aren't applying the model correctly or you need to change the model.

reply
politelemon
19 hours ago
[-]
If your developers weren't looking at dashboards before, they won't use a chat interface to interrogate it either. That doesn't really bring it to them any more than their existing capabilities. There's also a worrying underlying assumption being made here that the answers your LLM will give you are accurate and trustworthy.
reply
verdverm
17 hours ago
[-]
My underlying assumption is that this is a content marketing piece to show managers / investors that "we are doing/thinking something in ai as a company"
reply
amtamt
19 hours ago
[-]
> There's also a worrying underlying assumption being made here that the answers your LLM will give you are accurate and trustworthy.

I saw firsthand, at AWS Dev Days, an AI giving SIGWINCH as the "root cause" of an Apache error in a containerized process running in EKS, for a backend FCGI process connection error. It has been extremely hard since that demo to trust any AI for system-level debugging.

reply
verdverm
17 hours ago
[-]
(1) When was that? If it was less than 6 months ago, the current gen of models is noticeably better.

(2) AWS is not a leader, if even a contender, in the AI space. I would not evaluate the potential based on a demo they produced

reply
temp0826
19 hours ago
[-]
If we were smart we'd use AI to grok a system in order to help us reduce its complexity. I don't think we're anywhere close to even being able to provide all the necessary context to solve problems like this.
reply
anonymars
19 hours ago
[-]
Am I the only one who remembers when DevOps meant "developers are responsible for dealing with the operational part of their software too, so that they don't just throw stuff over the wall for another team to deal with the 3AM pages"?

It seems to have become: "we turned ops into coding too, so now the ops team needs to be good at software engineering"

reply
vee-kay
18 hours ago
[-]
DevOps was (and is) merely an excuse for companies to replace Developers with cheaper Ops resources, while still expecting better services and better products from them.

My personal experience says that the best way is not to repurpose the Ops team as Developers, but rather to put experienced Developers into Production Support (incident management, which is intense Ops, working shifts and weekends, etc.), and rotate them whenever needed. Over a period of time, you'll invariably see fewer defects and issues percolating down from the Devs. Then, after both sides are stable and working well together with less friction and fewer open tickets, some more tech-savvy Ops members can be rotated into Development teams as rookie devs to help reduce costs a bit (there'll invariably be some natural attrition among the Devs and Ops, so this gives an alternative career path to the Ops team, who are usually paid less and more stressed, and pushes the Devs not to become complacent). Such an approach is doable and productive.

reply
solid_fuel
13 hours ago
[-]
> DevOps was (and is) merely an excuse for companies to replace Developers with cheaper Ops resources, while still expecting better services and better products from them.

Most places I've worked it was the even worse "we've laid off the ops team, now developers are responsible for both" followed by "no we can't hire any more developers, we have enough already".

reply
chanux
14 hours ago
[-]
> DevOps was (and is) merely an excuse for companies to replace Developers with cheaper Ops resources [..].

Like everything, the original intentions must have been noble. But as we can see, looking back, it got popular, popular enough to reach the enterprise types.

Nothing really survives that.

PS: I have witnessed a sysadmin team being renamed DevOps and then SRE without much other meaningful change. I couldn't believe it at the time.

reply
starkparker
11 hours ago
[-]
> Like everything, the original intentions must have been noble.

It was, ca. 2012-15. Sysadmins making automation tools so they could offload the horseshit, often batshit bash/perl scripting work, of manually provisioning dev environments (on VMs, or even basic configuration of new bare metal) to devs, who were already more comfortable with writing their own automation. Devs can unblock themselves, and devs hate relying on anyone else and everyone worships and fears the devs, so fine, give them the sysadmins' rope and rafters.

Moving to a "cattle not pets" mentality for servers well before the proliferation of containers and microservices, much less the mainstreaming of serverless workflows and cloud compute. CI/CD, to make software release processes scriptable, or even better declarative, tasks that could be tested and verified in version-controlled source _before_ being deployed, just like the software itself.

Better automation and better testing meant devs could ship safer and faster; devs owning pipelines meant devs could fix dev-related problems faster.

A lot of early devops tools were written by sysadmins who were tired of being buried by rapidly growing requests to unblock developers, who were outnumbering them by the hundreds or thousands to one at FANG companies (pre-FAANG, much less the big six).

Puppet attacked config management by turning it into declarative code, Ansible made that easier to deploy; Luke Kanies and Michael DeHaan came from sysadmin. HashiCorp made VM provisioning scalable; Armon Dadgar and Mitchell Hashimoto were compsci students who hated doing ops work with rudimentary early cloud services. Most of their early sales inroads into companies came from IT departments using their open-source products; most of their early evangelists were IT executives.

Google splintering devops into the SRE role they coined mostly reflected how they (thought they) had made the "devs unblocking themselves on provisioning" problem that had inspired a lot of foundation tools simply part of the dev culture, especially through GCS and k8s. They didn't think about "devops" anymore much like people don't think about breathing, and narrowed their focus onto uptime.

That was really the failure IMO, that the idea was mostly a cultural one: people working on a problem should also have a stake in, or ownership of, the things they need to unblock their work. A dev being "blocked" from dev work by IT because only IT can provision a piece of hardware or stand up a VM is a cultural problem; the largely open-source tools made by sysadmins and junior/student devs were a response to an entrenched enterprise culture that showed no interest in doing the work necessary to solve that problem.

The tools forced the culture change, but then the tools created their own culture, and the world that defined the culture also changed beneath them. But the companies built around those tools didn't want to die, so they turned devops into whatever might keep them alive.

The problem isn't that "devops" failed to do the job it set out to do (make sysadmins' lives easier), it's that the entire problem area changed so much, and so quickly, that its goal was no longer relevant. There were no "sysadmins" left to help; there are still systems, and there are still administrators, but their responsibilities have been diced up and tossed into the organizational winds.

Not quite as easy of a narrative for the founder of an ops company selling an ops product to frame in a company blog post, though. Not that things in the post are necessarily wrong, but IMO the problem isn't "devops failed", it's why the fuck are we still talking about devops? The word means nothing anymore, its massive overloading pollutes any discussion about who's having problems, what those problems are, and what the solutions to those problems might be.

Or, IMO the problem is that few to no people are asking the modern equivalent of "how do we make sysadmins' lives better?" They're instead chasing a ghost of a concept that peaked a decade ago, because that's easier than looking at an organization's failures from both a sufficiently high and low level to see the cracks that run all the way through them.

reply
Uvix
17 hours ago
[-]
We tried this, but we just got more defects, because the Devs lost what little Ops knowledge they had. Where previously Ops would have to involve Devs, now that Production Support has some Dev knowledge, suddenly they get the blame for everything. Devs no longer have interest in things like "reading log files"; they just ship any problems over to Production Support.
reply
vee-kay
9 hours ago
[-]
Any day, I (as a manager) would prefer to have an experienced Developer do a Production Support role, rather than a cheaply obtained non-engineering campus graduate hired as "Tech Ops" resource to do Production Support on complex, mission-critical systems.

It is a bad idea for a company to give shoddy after-sales support to customers, because they would then lose the customer's trust and the relationship in the long run. No customer wants to see their production systems have frequent incidents causing hours or days of outages.

Vendor companies that ignore investment and support for Production Support on their Products/Services do so at their own peril.

In fact, canny companies have realised the real money is not in the upfront cost, but in volume billing (billing/invoicing based on monthly transaction counts, number of users/licenses, and tiered rate cards), so they need to have adequate Production Support teams.

This is why companies are trying their level best to move existing customers to subscription services (e.g., Office 365 by Micro$oft).

reply
verdverm
17 hours ago
[-]
You can find examples that go both ways for both endeavors, anecdata...

The problem in your case is not the dev vs ops split, it's a company culture thing which I'm sure you see play out in more places than this current focus

reply
prmoustache
18 hours ago
[-]
I am with you.

DevOps is a methodology. DevOps as a role or team name is a fantasy from people who do not understand the methodology.

If you want DevOps to work, your Ops people must be members of the development team, take part in the sprints, etc. But many companies do not want to do that, because they want to separate ops and dev budget/accounting and do not want to hire enough people with ops skills.

reply
verdverm
17 hours ago
[-]
This is not true, you can make it work well either way. It's about people and processes, not about some specific setup or way of grouping people
reply
verdverm
17 hours ago
[-]
That was an ambition of devops at one point; it has not borne the fruit it promised. Dev teams are not positioned to do ops well. We have specializations for a reason.
reply
anonymars
14 hours ago
[-]
Indeed. Another comment brought up the comparison with the idea of "full stack" (https://news.ycombinator.com/item?id=46662777). Management would sure love it if we could all be interchangeable widgets, wouldn't they (with no pesky tribal knowledge either)
reply
antod
13 hours ago
[-]
I don't know of any other term in tech that people experience in so many different, often contradictory ways. It causes people to talk past each other, because they're all talking about different things or have been places that work so differently.
reply
hahahahhaah
7 hours ago
[-]
Agile is like this too.
reply
zug_zug
19 hours ago
[-]
"I think the entire DevOps movement was a mighty, ... it failed."

I'm so sick of this nonsense. "Devops" isn't failing, isn't an issue, you can rename it whatever you want, but throughout my career the devops engineers (the ones you don't skimp on) are the best, highest paid professionals at the company.

I don't know why I keep reading these completely crazy think-pieces hemming and hawing about a system (having a few engineers who master performance/backups/deployments/oncall/retros) that seems to be wildly successful. It would be nice if more engineers understood under-the-hood, but most companies choose not to exclusively hire at that caliber.

reply
verdverm
17 hours ago
[-]
For sure, I just turned down a gig because the company saw devops as an afterthought, not as something they would invest in. They wanted me to come in and "fix some issues quick" on a short-term contract. What they really need is 1-2 FTE ops people who think about their problems every day. If you are pushing past 3-4 devs to 10 and you have no intention of hiring a FTE ops person, you are not doing it right and shall reap what you have sown before long
reply
0xbadcafebee
14 hours ago
[-]
As a movement, DevOps failed a long time ago. Once the word completely lost its meaning, it was impossible to educate anyone about it. But as a business practice (which is mostly what it is), it's still a viable option that any business can implement. It just takes the right people rising into leadership positions to enact it.
reply
pjmlp
13 hours ago
[-]
It failed because for management and HR, DevOps means ops; for them it is the new buzzword for systems integration, integration engineers, sysadmin, ...

I have been foolish enough to accept a few project proposals with a DevOps role, which in the end meant ops work dealing with VMs, networking, and the like.

reply
blutoot
19 hours ago
[-]
My message to the CTO of Honeycomb.io (who apparently wrote this post): please avoid getting philosophical and controversial to gin up curiosity about your AI platform. If you want to highlight the benefits of your platform then do so earnestly and objectively. Please don't mask marketing with an excoriation of a profession that has never been well-defined (or has always been defined to fit into an organization's political landscape for the most part). And you guys (like every other SRE/Ops platform) capitalized on that structural divide and deservedly got rich by selling licenses to these teams. I don't think you can come in now with this holier-than-thou best practice messaging just because platforms like yours have zero moat in this post-CC/Codex world.

Hence my vitriol: https://news.ycombinator.com/item?id=46662287.

reply
TacticalCoder
19 hours ago
[-]
> avoid getting philosophical and controversial to gin up curiosity about your AI platform

Also: could he please avoid doing it by illustrating his nonsense with graphs that are both childish and nonsensical?

reply
maccard
18 hours ago
[-]
The CTO is a she.
reply
browningstreet
14 hours ago
[-]
I miss good ol’ classical sys admin.
reply
maerF0x0
14 hours ago
[-]
I'm surprised no one has commented that open loops plus closed loops is more or less how velcro works, and that stuff sticks.
reply
skybrian
18 hours ago
[-]
I don't understand these graphs. Why do the lines go back in time?
reply
verdverm
17 hours ago
[-]
feed"back" loops
reply
VerifiedReports
14 hours ago
[-]
Has this buzzword even been around 20 years?
reply
BobbyTables2
13 hours ago
[-]
Like other fads, it has certainly failed to go out of existence as soon as it should have.
reply
omnifischer
8 hours ago
[-]
From the article:

> What the devs care about is the ability to understand the product experience from the perspective of each customer. In practice, this can mean any combination or permutation of agent, user, mobile device type, laptop, desktop, point of sale device, and so on

Really? Any permutation?

Most (arsehole) devs:

- import the world, since it works on their latest 1TB machine or Mac Studio

- are always on the latest iPhone or Pixel

- add 100 trackers that work on their own machine

- POS device? They should ask some of the devs to go and work in their canteens that have POS terminals

reply
jbreckmckye
19 hours ago
[-]
Because the idea that you can have all aspects of maintaining a complex piece of technology handled by a single cross-skilled team of interchangeable cogs is utopian and unworkable past any reasonable level of scale.

DevOps, shift left, full stack dev, all reminds me of the Futurama episode where Hermes Conrad successfully reorgs the slave camp he's sent to, so that all physical labour is done by a single Australian man

Speaking darker, there is a kind of - well, perhaps not misanthropy, but certainly a not-so-well-meaning dismissiveness, to the "silo breaking" philosophy that looks at complex fields and says "well these should all just be lumped together as one thing, the important stuff is simple, I don't know why you're making all these siloes, man" - assuming that ops specialists, sysadmins, programmers, DBAs, frontend devs, mobile devs, data engineers and testers have just invented the breadth and depth and subtleties of their entire fields, only as a way of keeping everybody else out

But modern systems are complex, and they are only getting more so; the further you buy into the shift-left, everyone-is-everything, computer-jobs-are-all-the-same philosophy, the harder it will get to find employees who can straddle the exhausting range of knowledge they are expected to master.

reply
lll-o-lll
17 hours ago
[-]
> the "silo breaking" philosophy that looks at complex fields and says "well these should all just be lumped together as one thing, the important stuff is simple,

I don't think this is the right take. "Silos" is an ill-defined term, but let's look at a couple of the negative aspects: "lack of communication" and "lack of shared understanding" (or different models of the world). I'm going to use a different industry example, as I think it helps to think about the problem more abstractly.

In the world of biomedical engineering, the types of products you are making require the expertise of two very different groups of people: engineers and doctors. A member of either of these groups has an in-group language, and there is an inherent power differential between them. Doctors are more "important" than engineers. But to get anything made, you need the expertise of both.

One way to handle this is to keep the engineers and doctors separate and to communicate primarily via documents. The doctor will attempt to detail exactly how a certain component should work. The engineer will attempt to detail the constraints and request clarifications.

The problem with this approach is that the engineer cannot speak “doctorese” nor can the doctor speak “engineerese”; and the consequence is a model in each person’s head that differs significantly from the other. There is no shared model; and the real world product suffers as a result.

The alternative is to attempt to “break the silos”; force the engineers and doctors to sit with each other, learn each other’s language, and build a shared mental model of what is being created. This creates a far better product; one that is much closer to the “physical reality” it must inhabit.

The same is true across all kinds of business groups. If different groups of people are required to collaborate, in order to do something, those people are well served by learning each other’s languages and building a shared mental model. That’s what breaking silos is about. It is not “everyone is the same”, it’s “breaking down the communication barriers”.

reply
jbreckmckye
17 hours ago
[-]
I don't think that's like DevOps, though. A closer analogy would be a business that only hired EngDocs, doctors who had to be accredited engineers as well as vascular surgeons.

I don't think anyone thinks siloes are themselves a good thing, but they might be a necessary consequence of having specialists. Shift-left is mostly designed to reduce conversations between groups, by having individuals straddle across tasks. It's actually kind of anti-collaboration, or at least pessimistic that collaboration can happen

reply
lll-o-lll
17 hours ago
[-]
Oh, I completely agree! We created “EngDocs”, as you say, and simply made the situation worse. An EngDoc is an obviously ludicrous concept, on its face. But by breaking down the silo in the biomedical example, each engineer becomes a bit knowledgeable about an aspect of medicine and each doctor gains some knowledge about aspects of engineering.

I am arguing that all such people, whether developers or ops or UX designers or product managers, need to engage in this learning as they collaborate. This doesn't mean that we want the DevPM as a resultant title, just that siloing these different groups will lead to perverse outcomes.

Dev and ops have been traditionally siloed. DevOps was a silly attempt to address it.

reply
bravetraveler
18 hours ago
[-]
Scratching neck: come on... just one more vendor, bro
reply
hahahahhaah
7 hours ago
[-]
I am lucky that the article is alien to me. I work somewhere that does devops. I write code and tests etc., but I would be unlikely to go a day without observing something in prod, either by telemetry, dogfooding, or getting paged. It is a really cool way to work, understanding the whole system (within the team's remit, of course). And vaguely understanding neighbouring teams.

To the comments saying dev and ops are different: they are! I think the magic is massive platform team support too. I am not troubleshooting why Splunk indexes aren't indexing, for example.

reply
GiorgioG
18 hours ago
[-]
In my experience DevOps has little interest in doing actual DevOps - they just want to run ops. They want to advise (or tell us we’re holding it wrong) but not actually get their hands dirty. On the flip side, devs don’t want to spend a ton of time learning k8s or how to manage servers, cloud services, etc.

DevOps is a mess of our own making - embracing K8s created complexity for little gain for nearly all companies.

reply
iwontberude
13 hours ago
[-]
You made a graph with T being the x axis and then had lines which go backwards in time. I closed the window at this point.
reply
gardenhedge
19 hours ago
[-]
In my company, instead of relying on an ops team.. we rely on a devops team.
reply
kittikitti
13 hours ago
[-]
The vast majority of people who called themselves DevOps were opportunist corporatists looking to get a promotion. It became less Ops as in operations and more Ops as in opponent. These people were haphazardly given the keys to production, and I was often bullied by their ability to control it. Management needed to know if my application was running? They asked DevOps, who often came back with lies in order to fit their own career objectives.

I apologize if my words are sharp; many DevOps engineers were not mean to me. Perhaps I just had bad luck in dealing with ignorant gatekeepers to production. You already know if my opinion doesn't apply to you.

reply
blutoot
20 hours ago
[-]
I can't wait for indie developers to build super-agents that commoditize providers like Honeycomb.io and more importantly clone all their features and offer them up for free as OSS.
reply
verdverm
17 hours ago
[-]
Sounds like you don't know what a nightmare of version compat and bespokeness ops/observability is. This is going to be one of the harder things for LLMs to do, because everyone is running on some snowflake held together with duct tape.
reply
blutoot
15 hours ago
[-]
Fair point - my statement is more about stealing market share for simpler integrations by undercutting them on price.

And I don't want to trivialize the reality of enterprise platforms where bespoke connectors rule. I have dealt with migrations of business-critical platforms, and managing version compatibility and ensuring none of the integrations regressed was par for the course. I am not even saying that that makes me qualified to replicate Honeycomb.io. But I do think someone with a deep technical background in building observability platforms, armed with Claude Code or Codex and the right set of MCPs and all the necessary tooling, should be able to build a clone of Honeycomb.io.

Maybe it won't be a fast turnaround like a typical vibe-coded project, but even if it is a month-long project to get to 60% feature parity, these vendors will have to sit up and pay attention.

reply
verdverm
9 hours ago
[-]
> And I don't want to trivialize...

as you immediately trivialize something it seems you know very little about.

MCPs are outdated, btw; it's bad to attach a bunch of MCPs in with your messages, it pollutes the context. If you don't do this, you can build agents that are better than Copilot/Codex on gemini-3-flash. Claude Code is probably the leader here, but still definitely not capable of what you think it is.

reply
hahahahhaah
7 hours ago
[-]
I assume then you are retired or not a programmer, as you are wishing for the last bastions of companies that pay programmers to melt away with the ice sheets, leaving a desert of no paid coding work.
reply
alphazard
19 hours ago
[-]
DevOps only works when the developers are always right. What usually happens is the DevOps team thinks they know best (they are developers too, just not the ones using the tools), and they build a lot of garbage that no one wants to use, often making things more complicated than they were before.

Eventually a bureaucrat becomes the manager of the team, and seeks to expand the set of things under DevOps' control. This makes the team a single point of failure for more and more things, while driving more and more developer processes towards mediocrity. Velocity slows, while the DevOps bottlenecks are used as a reason to hire.

It's an organizational problem, not a talent or knowledge problem. Allowing a group to hire and grow within an organization, which is not directly accountable for the success of the other parts of the organization that it was intended to support, is creating a cancer, definitionally.

reply
verdverm
17 hours ago
[-]
Don't attribute internal dysfunction and mistakes to an entire field. I've worked in an org where the opposite is true. Blanket statements like these never hold up because they lack nuance and usually are inspired by frustration
reply