In response, there's been a wave of "serverless" startups because the idea of running anything yourself has become understood as (a) a time sink, (b) incredibly error prone, and (c) very likely to fail in production.
I think a Kubernetes 2.0 should consider what it would look like to have a deployment platform that engineers can easily adopt and feel confident running themselves – while the platform itself remains a small-ish core orchestrator with strong primitives.
I've been spending a lot of time building Rivet to scratch my own itch of an orchestrator & deployment platform that I can self-host and scale trivially: https://github.com/rivet-gg/rivet
We currently advertise as the "open-source serverless platform," but I often think of the problem as "what does Kubernetes 2.0 look like." People are already adopting it to push into things that Kubernetes would traditionally be good at. We've found the biggest strong point is that you're able to build roughly the equivalent of a Kubernetes controller trivially. This unlocks features like more complex workload orchestration (game servers, per-tenant deploys), multitenancy (vibe coding per-tenant backends, LLM code interpreters), per-tenant metered billing, more powerful operators, etc.
Someone decides X technology is too heavy-weight and wants to just run things simply on their laptop because "I don't need all that cruft". They spend time and resources inventing technology Y to suit their needs. Technology Y gets popular and people add to it so it can scale, because no one runs shit in production off their laptops. Someone else comes along and says, "damn, technology Y is too heavyweight, I don't need all this cruft..."
"There are neither beginnings nor endings to the Wheel of Time. But it was a beginning.”
A lot of times stuff gets added to simple systems because it's thought to be necessary for production systems, but as our experience grows we realize those additions were not necessary, were in the right direction but not quite right, were leaky abstractions, etc. Then, when the Senior Developers with 4 years of experience reinvent the simple solution, those additions get stripped out - which is a good thing. When the new simple system inevitably starts to grow in complexity, it won't include the cruft that we now know was bad cruft.
Freeing the new system to discover its own bad cruft, of course. But maybe also some new good additions, which we didn't think of the first time around.
I'm not a Kubernetes expert, or even a novice, so I have no opinions on necessary and unnecessary bits and bobs in the system. But I have to think that container orchestration is a new enough domain that it must have some stuff that seemed like a good idea but wasn't, some stuff that seemed like a good idea and was, and lacks some things that seem like a good idea after 10 years of working with containers.
I've grown to learn that the bulk of the criticism directed at Kubernetes does not actually reflect problems with Kubernetes. Instead, it underlines that the critics themselves are the problem, not Kubernetes. I mean, they mindlessly decided to use Kubernetes for tasks and purposes that made no sense, proceeded to get frustrated by the way they misused it, and blamed Kubernetes as the scapegoat.
Think about it for a second. Kubernetes is awesome in the following scenario:
- you have a mix of COTS bare metal servers and/or vCPUs lying around and you want to use them as infrastructure to run your jobs and services,
- you want to simplify the job of deploying said jobs and services to your heterogeneous ad-hoc cluster including support for rollbacks and blue-green deployments,
- you don't want developers to worry about details such as DNS and networking and topologies.
- you want to automatically scale up and down your services anywhere in your ad-hoc cluster without having anyone click a button or worry too much if a box dies.
- you don't want to be tied to a specific app framework.
If you take the ad-hoc cluster of COTS hardware out of the mix, odds are Kubernetes is not what you want. It's fine if you still want to use it, but odds are you have a far better fit elsewhere.
Did they need to know this before Kubernetes? I've been in the trade for over 20 years and the typical product developer never cared a bit about it anyway.
> - you don't want to be tied to a specific app framework.
Yes and no. K8s (and docker images) indeed helps you deploy different languages/frameworks more consistently, but the biggest factor against this is, in the end, still organizational rather than purely technical. (This is in an average product company with average developers, not a super-duper SV startup with world-class top-notch talent where each dev is fluent in at least 4 different languages and stacks.)
Yes? How do you plan to configure an instance of an internal service to call another service?
> I've been in the trade for over 20 years and the typical product developer never cared a bit about it anyway.
Do you work with web services? How do you plan to get a service to send requests to, say, a database?
This is a very basic and recurrent usecase. I mean, one of the primary selling points of tools such as Docker compose is how they handle networking. Things like Microsoft's Aspire were developed specifically to mitigate the pain points of this usecase. How come you believe that this is not an issue?
You are the ops. There are no sysadmins. Do you still live in the 90s? I mean, even Docker compose supports specifying multiple networks in which to launch your local services. Have you ever worked with web services at all?
Hell, in some places, Ops are pushing k8s partially because it makes DNS and TLS something that can be easily and reliably provided in a minimal amount of time, so you (as a dev) don't have a request for a DNS update waiting 5 weeks while Ops are fighting fires all the time.
The only developers who never had to know about the network are those who do not work with networks.
Ironically, that looks a lot like when k8s is managed by a dedicated infra team / cloud provider.
Whereas in most smaller shops that erroneously used k8s, management fell back on the same dev team also trying to ship a product.
Which I guess is reasonable: if you have a powerful enough generic container orchestration system, it's going to have enough configuration complexity to need specialists.
(Hence why the first wave of not-k8s were simplified k8s-on-rails, for common use cases)
There are multiple concerns at play:
- how to stitch together this cluster in a way that it can serve our purposes,
- how to deploy my app in this cluster so that it works and meets the definition of done.
There are some overlaps between both groups, but there are indeed some concerns that are still firmly in the platform team's wheelhouse. For example, should different development teams have access to other teams' resources? Should some services be allowed to deploy to specific nodes? If a node fails, should a team drop work on features to get everything provisioned again? If anyone answers "no" to any of these questions, it is a platform concern.
Which is a shame really because if you want something simple, learning Service, Ingress and Deployment is really not that hard and rewards years of benefits.
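For reference, a minimal sketch of those three resources looks roughly like this; the app name, image, and hostname below are placeholders, not anything prescribed by Kubernetes:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: myapp
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: myapp
      template:
        metadata:
          labels:
            app: myapp
        spec:
          containers:
            - name: myapp
              image: registry.example.com/myapp:1.0   # placeholder image
              ports:
                - containerPort: 8080
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: myapp
    spec:
      selector:
        app: myapp
      ports:
        - port: 80
          targetPort: 8080
    ---
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: myapp
    spec:
      rules:
        - host: myapp.example.com   # placeholder hostname
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: myapp
                    port:
                      number: 80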
There are plenty of PaaS providers who will run your cluster for cheap so you don't have to maintain it yourself, like OVH.
It really is an imaginary issue with terrible solutions.
Or... were instructed that they had to use it, regardless of the appropriateness of it, because a company was 'standardizing' all their infrastructure on k8s.
There's no such thing as "bad cruft" - all cruft is are features you don't use but are (or were) in all likelihood critical to someone else's workflow. Projects transform from minimal and lightning fast, to bloated, one well-reasoned-PR at a time; someone will try to use a popular project and figure "this would be perfect, if only it had feature X or supported scenario Y", multiplied by a few thousand PRs.
If you'll entertain my argument for a second:
The job of someone designing systems like this is to decide what are the correct primitives and invest in building a simple + flexible platform around those.
The original cloud primitives were VMs, block devices, LBs, and VPCs.
Kubernetes became popular because it standardized primitives (pods, PVCs, services, RBAC) that containerized applications needed.
Rivet's taking a different approach, investing in three different primitives based on how most organizations deploy their applications today:
- Stateless Functions (a la Fluid Compute)
- Stateful Workers (a la Cloudflare Durable Objects)
- Containers (a la Fly.io)
I fully expect to raise a few hackles by claiming these are the "new primitives" for modern applications, but our experience shows it's solving real problems for real applications today.
Edit: Clarified "original cloud primitives"
I think your take only reflects buzzword-driven development, and makes no sense beyond that point. A "stateless function" is at best a constrained service which supports a single event handler. What does that buy you over Kubernetes plain old vanilla deployments? Nothing.
To make matters worse, it doesn't seem that your concept was thought all the way through. I mean, two of your concepts (stateless functions and stateful workers) have no relationship with containers. Cloudflare has been telling everyone who will listen for years that they based their whole operation on tweaking the V8 engine to let multiple tenants run their code in as many V8 isolates as they want and need. Why do you think you need containers to run a handler? Why do you think you need a full-blown cluster orchestrating containers just to run a function? Does that make sense to you? It sounds like you're desperate to shoehorn the buzzword "Kubernetes" next to "serverless" in a way that serves absolutely no purpose beyond being able to ride a buzzword.
> Why do you think you need containers to run a handler?
You don't, but plenty of people don't care and ask for this shit. This is probably another way of saying "buzzword-driven" as people ask for "buzzwords". I've heard plenty of people say things like
We're looking for a container native platform
We're not using containers yet though.
We were hoping we can start now, and slowly containerize as we go
or "I want the option to use containers, but there is no business value in containers for me today. So I would rather have my team focus on the code now, and do containers later"

These are actual real positions by actual real CTOs commanding millions of dollars in potential contracts if you just say "ummm, sure.. I guess I'll write a Dockerfile template for you??"

> Why do you think you need a full blown cluster orchestrating containers just to run a function?
To scale. You need to solve the multi-machine story. Your system can't be a single-node system. So how do you solve that? You either roll up your sleeves and go learn how Kafka or Postgres do it for their clusters, or you let Kubernetes do most of that hard work and deploy your "handlers" on it.
> Does that make sense to you?
Well... I don't know. These types of systems (of which I have built 2) are extremely wasteful and bullshit by design. A design that there will never be a shortage of demand for.
It's a really strange pattern too. It has so many gotchas on cost, waste, efficiency, performance, code organization, etc. Whenever you look, whoever built these things either has a very limited system in functionality, or they have slowly reimplemented what a "Dockerfile" is, but "simpler", you know. It's "simple" because they know the ins and outs of it.
Why can't it be? How many customers do you have that you can't deploy a bunch of identical workers over a beefy database?
Companies spend so much time on this premature optimization, that they forget to actually write some features.
That's a fundamental problem with the approach OP is trying to sell. It's not solving any problem. It tries to sell a concept that is disconnected from technologies and real-world practices, requires layers of tech that solve no problem nor have any purpose, and doesn't even simplify anything at all.
I recommend OP puts aside 5 minutes to go through Cloudflare's docs on Cloudflare Workers that they released around a decade ago, and get up to speed on what it actually takes to put together stateless functions and durable objects. Dragging Kubernetes to the problem makes absolutely no sense.
It feels like you didn't really read his comment yet are responding with an awful lot of hostility.
You don't. A container only handles concerns such as deployment and configuration. Containers don't speak HTTP either: they open ports and route traffic at an OSI layer lower than HTTP's.
Containers can contain code which open arbitrary ports using the provided kernel interface whereas serverless workers cannot. Workers can only handle HTTP using the provided HTTP interface.
I don’t need a container, sure, I need a system with a network sockets API.
I don't have the links to Azure's or GCP's function emulation framework, but my recollection is that they behave similarly, for similar reasons
Funnily enough, the V8 isolates support stdio via WASM now.
Not true. Cloudflare Workers support Cron triggers and RPC calls in the form of service bindings. Also, Cloudflare Queues support consumer workers.
Claiming "Workers can only handle HTTP" is also meaningless as HTTP is also used to handle events. For example, Cloudflare Queues supports consumers using HTTP short polling.
But I still can't handle SSH or proxy WireGuard or anything like that (yet!)
Just because something’s complex doesn’t necessarily mean it has to be that complex.
I can assure you that trying to reproduce Kubernetes with a shitload of shell scripts, autoscaling groups, cloudwatch metrics, and hopes-and-prayers is too complex by my metric, at least within the audience of people who know Kubernetes.
The tools I've most enjoyed (including deployment tools) are those with a clear target group and vision, along with leadership that rejects anything that falls too far outside of it. Yes, it usually doesn't have all the features I want, but it also doesn't have a myriad of features I don't need
I don't think so. The original problem that the likes of Kubernetes solve is still the same: set up a heterogeneous cluster of COTS hardware and random cloud VMs to run and automatically manage the deployment of services.
The problem, if there is any, is that some people adopt Kubernetes for something Kubernetes was not designed to do. For example, do you need to deploy and run a service in multiple regions? That's not the problem that Kubernetes solves. Do you want to autoscale your services? Kubernetes might support that, but there are far easier ways to do that.
So people start to complain about Kubernetes because they end up having to use it for simple applications such as running a single service in a single region from a single cloud provider. The problem is not Kubernetes, but the decision to use Kubernetes for an application where running a single app service would do the same job.
Part of it comes from new generations not understanding the old technology well enough.
Part of it comes from the need to remake some of the most base assumptions, but nobody has the guts to redo Posix or change the abstractions available in libc. Everything these days is a layer or three of abstractions on top of unix primitives coming up with their own set of primitives.
This stupid need we have to create general purpose platforms is going to be the end of progress in this industry.
Just write what you need for the situation you’re in. Don’t use kubernetes and helm, use your own small thing that was written specifically to solve the problem you have; not a future problem you might not have, and not someone else’s problem. The problem that you have right now.
It takes much less code than you think it will, and after you’ve done it a few times, all other solutions look like enormous Rube Goldberg machines (because that’s what they are, really).
It is 1/100th of the complexity to just write your own little thing and maintain it than it is to run things in Kubernetes and to maintain that monster.
I’m not talking about writing monoliths again. I’m talking about writing only the tiny little bits of kubernetes that you really need to do what you need done, then deploying to that.
Don't limit yourself like that. A journey of a thousand miles begins with a single step. You will have your monolith in no time
Re-implement the little bits of kubernetes you need here and there. A script here, an env var there, a cron or a daemon to handle things. You'll have your very own marvelous creation in no time. Which is usually the perfect time to jump to a different company, or replace your 1.5 year old "legacy system". Best thing about it? No one really understands it but you, which is really all that matters.
The things like this that I write stay small. It is when others take them over from me that those things immediately bloat and people start extending them with crap they don’t need because they are so used to it that they don’t see it as a problem.
I am allergic to unnecessary complexity and I don’t think anyone else that I have ever worked with is. They seem drawn to it.
* single container
* docker compose
* manual deployment (with docker run commands)
But erm, realistically how is this a viable way to deploy a "serverless infrastructure platform" at any real scale?
My gut response would be ... how can I deploy Rivet on Kubernetes, either in containers or something like kube-virt to run this serverless platform across a bunch of physical/virtual machines? How is docker compose a better more reliable/scalable alternative to Kubernetes? So alternately then you sell a cloud service, but ... that's not a Kubernetes 2.0. If I was going to self-host Rivet I'd convert your docs so I could run it on Kubernetes.
If you're curious on the details, we've put a lot of work to make sure that there's as few moving parts as possible:
We have our own cloud VM-level autoscaler that's integrated with the core Rivet platform – no k8s or other orchestrators in between. You can see the meat of it here: https://github.com/rivet-gg/rivet/blob/335088d0e7b38be5d029d...
For example, Rivet has an API to dynamically spin up a cluster on demand: https://github.com/rivet-gg/rivet/blob/335088d0e7b38be5d029d...
Once you start the Rivet "seed" process with your API key, everything from there is automatic.
Therefore, self-hosted deployments usually look like one of:
- Plugging in your cloud API token in to Rivet for autoscaling (recommended)
- Fixed # of servers (hobbyist deployments that were manually set up, simple Terraform deployments, or bare metal)
- Running within Kubernetes (usually because it depends on existing services)
There's thousands of helm charts available that allow you to deploy even the most complicated databases within a minute.
Deploying your own service is also very easy as long as you use one of the popular helm templates.
Helm is by no means perfect, but it's great when you set it up the way you want. For example I have full code completion for values.yaml by simply having "deployment" charts which bundle the application database(s) and application itself into a single helm chart.
You can't just "jump into" kubernetes like you can with many serverless platforms, but spending a week banging your head to possibly save hundreds of thousands in a real production environment is a no-brainer to me.
And what about when some deployment hangs and can't be deleted but still allocates resources? This is a common issue.
Etcd can be replaced with postgres, which solves the majority of issues, but does require self-management. I've seen k3s clusters with 94 nodes chug away just fine in the default configuration though.
You can deploy to k8s with configuration that's just a little more complex than docker-compose. Having said that, perhaps for a majority of apps even that "a little more" is unnecessary - and docker-compose was what the long tail actually wanted. That docker-compose didn't get sustained traction is unfortunate.
This point bears repeating. OP's opinion and baseless assertions contrast with reality. Even with COTS hardware nowadays we have Kubernetes distributions which are trivial to setup and run, to the point that they only require a package being installed to get a node running, and running a single command to register the newly provisioned node in a cluster.
I wonder what's OP's first-hand experience with Kubernetes.
Now, in every thread, people replay arguments from close to a decade ago that reflect conditions 90% of companies wouldn't face today.
I agree, but it's important to stress the fact that things like etcd are only a pivotal component if you're putting together an ad-hoc cluster of unreliable nodes.
Let that sink in.
Kubernetes is a distributed system designed for high reliability, and depends on consensus algorithms to self-heal. That's a plausible scenario if you're running a cluster of cheap unreliable COTS hardware.
Is that what most people use? Absolutely not.
The whole point is that none of the main cloud providers runs containers directly on bare metal servers, and even their own VPS are resilient to their own hardware failures.
Then there's the whole debate on whether it's a good idea to put together a Kubernetes cluster running on, say, EC2 instances.
If people's concern is that they want a deployment platform that can be easily adopted and used, it's better to understand Kubernetes as the primitives on top of which the PaaS that people want can be built.
Having said all that, Rivet looks interesting. I recognize some of the ideas from the BEAM ecosystem. Some of the appeal to me has less to do with deploying at scale, and more to do with resiliency and local-first.
I agree, OP's points contrast with reality. Anyone can whip out a Kubernetes cluster with COTS hardware or cheap VMs within minutes. Take Canonical's microK8s distribution. Do a snap install to get a Kubernetes node up and running, and run a command to register it in a cluster. Does this pass nowadays as rocket science?
With established cloud providers it's even easier.
That’s only after you compile your Ubuntu kernel and all the software. Don’t even get me started on bad sectors on the floppy discs they give out at conferences!
No, that's simply wrong at so many levels.
Maybe you're thinking about the cheapest VPS possible, driven by something like cpanel. Those are not cloud providers. But usually you wouldn't choose them for a reliable service, because their whole model is overselling.
Maybe the problem (and strength) of Kubernetes is that it's design by committee or at least common-denominator agreement so everybody stays on board.
Any more clearly defined project would likely not become a de facto industry standard.
That said, the path to ease of use usually involves making sound decisions ahead of time for users and assuming 99% will stay on that paved path. This is how frameworks like Spring and Rails became ubiquitous.
However, you can simplify your setup by allowing, for example, a performant NFS filer as the storage provider (it can be a random machine with FreeNAS on top or, if you have small needs, even a random home NAS box) and one server as the "control plane"; the rest can be plugged in automatically (with nfs-provisioner handling simple storage management for you).
Second, I'm curious as to why dependencies are a thing in Helm charts and why dependency ordering is being advocated, as though we're still living in a world of dependency ordering and service-start blocking on Linux or Windows. One of the primary idioms in Kubernetes is looping: if the dependency's not available, your app is supposed to treat that is a recoverable error and try again until the dependency becomes available. Or, crash, in which case, the ReplicaSet controller will restart the app for you.
You can't have dependency conflicts in charts if you don't have dependencies (cue "think about it" meme here), and you install each chart separately. Helm does let you install multiple versions of a chart if you must, but woe be unto those who do that in a single namespace.
If an app truly depends on another app, one option is to include the dependency in the same Helm chart! Helm charts have always allowed you to have multiple application and service resources.
Indeed, working with kubernetes I would argue that the primary architectural feature of kubernetes is the "reconciliation loop". Observe the current state, diff it against a desired state, apply the diff. Over and over again. There is no "fail" or "success" state, only what we can observe and what we wish to observe. Any difference between the two is iterated away.
I think it's interesting that the dominant "good enough technology" of mechanical control, the PID feedback loop, is quite analogous to this core component of kubernetes.
But the principle applies to other things that aren't controllers. For example a common pattern is a workload which waits for a resource (e.g. a database) to be ready before becoming ready itself. In a webserver Pod, for example, you might wait for the db to become available, then check that the required migrations have been applied, then finally start serving requests.
So you're basically progressing from a "wait for db loop" to a "wait for migrations" loop then a "wait for web requests" loop. The final loop will cause the cluster to consider the Pod "ready" which will then progress the Deployment rollout etc.
[0] https://kubernetes.io/docs/concepts/architecture/controller/
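A hedged sketch of how that progression usually ends up expressed in a Pod spec; the /healthz endpoint and its "DB reachable, migrations applied" semantics are assumptions about the application, not something Kubernetes provides:

    apiVersion: v1
    kind: Pod
    metadata:
      name: webserver
    spec:
      containers:
        - name: web
          image: registry.example.com/web:1.0   # placeholder image
          ports:
            - containerPort: 8080
          readinessProbe:
            # Assumed app behavior: /healthz only returns 200 once the DB is
            # reachable and migrations are applied. Until then the Pod simply
            # stays NotReady and the Deployment rollout waits.
            httpGet:
              path: /healthz
              port: 8080
            periodSeconds: 5
            failureThreshold: 60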
we had integrated monitoring/log analysis to correlate failures with "things that happen"
I was part of an outage caused by a fail-closed behavior on a dependency that wasn't actually used and was being turned down.
Dependencies among servers are almost always soft. Just return a 500 if you can't talk to your downstream dependency. Let your load balancer route around unhealthy servers.
This does not work well enough. Right now I have an issue where Keycloak takes a minute to start and a dependent service, which crashes on start without Keycloak, takes like 5-10 minutes to start, because the ReplicaSet controller starts to throttle it and it'll wait for minutes for nothing, even after Keycloak has started. Eventually it'll work, but I don't want to wait 10 minutes if I could wait 1 minute.
I think the proper way to solve this issue is to develop an init container which would wait for the dependent service to be up before passing control to the main container. But I'd prefer for Kubernetes to allow me to explicitly declare start dependencies. My service WILL crash if that dependency is not up; what's the point of even trying to start it, just to throttle it a few tries later?
Dependency is dependency. You can't just close your eyes, pretending it does not exist.
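A minimal sketch of that init-container approach, assuming Keycloak is reachable through a Service named keycloak on port 8080 in the same namespace and that a curl image is an acceptable helper:

    apiVersion: v1
    kind: Pod
    metadata:
      name: dependent-service
    spec:
      initContainers:
        - name: wait-for-keycloak
          image: curlimages/curl:8.8.0   # assumed helper image
          command:
            - sh
            - -c
            # Block until Keycloak answers; only then does the main container start.
            - "until curl -sf http://keycloak:8080/; do echo waiting; sleep 5; done"
      containers:
        - name: app
          image: registry.example.com/dependent-service:1.0   # placeholder image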
You can make an opinionated platform that does things how you think is the best way to do them, and people will do it how they want anyway with bad results. Or you can add the features to make it work multiple ways and let people choose how to use it.
This is gonna sound stupid, but people see the initial error in their logs and freak out. Or your division's head sees your demo and says "Absolutely love it. Before I demo it though, get rid of that error". Then what are you gonna do? Or people keep opening support tickets saying "I didn't get any errors when I submitted the deployment, but it's not working. If it wasn't gonna work, why did you accept it"
You either do what one of my colleagues does: add some weird-ass logic of "store error logs and only display them if they fire twice (no, three... 4? scratch that, 5 times) with a 3 second delay in between, except for the last one, that can take up to 10 seconds; after that, if this was a network error, sleep for another 2 minutes", and at the very end make sure to have a `logger.Info("test1")`.
Or you say "fuck it" and introduce a dependency order. We know that it's stupid, but...
You're sniping at some absolutely irrelevant detail no one, absolutely no one cares about at all. Unbelievable
I suppose that's true in one sense - in that I'm using EKS heavily, and don't maintain cluster health myself (other than all the creative ways I find to fuck up a node). And perhaps in another sense: it'll try its hardest to run some containers no matter how many times I make it OOMkill itself.
Buttttttttt Kubernetes is almost pure maintenance in reality. Don't get me wrong, it's amazing to just submit some yaml and get my software out into the world. But the trade off is pure maintenance.
The workflows to setup a cluster, decide which chicken-egg trade-off you want to get ArgoCD running, register other clusters if you're doing a hub-and-spoke model ... is just, like, one single act in the circus.
Then there's installing all the operators of choice from https://landscape.cncf.io/. I mean that page is a meme, but how many of us run k8s clusters without at least 30 pods running "ancillary" tooling? (Is "ancillary" the right word? It's stuff we need, but it's not our primary workloads).
A repeat circus is spending hours figuring out just the right values.yaml (or, more likely, hours templating it, since we're ArgoCD'ing it all, right?)
> As an aside, I once spent HOURS figuring out how to (incorrectly) pass boolean values around from a Secrets Manager Secret, to a k8s secret - via External Secrets, another operator! - to an ArgoCD ApplicationSet definition, to another values.yaml file.
And then you have to operationalize updating your clusters - and all the operators you installed/painstakingly configured. Given the pace of releases, this is literally, pure maintenance that is always present.
Finally, if you're autoscaling (Karpenter in our case), there's a whole other act in the circus (wait, am I still using that analogy?) of replacing your nodes "often" without downtime, which gets fun in a myriad of interesting ways (running apps with state is fun in kubernetes!)
So anyway, there's my rant. Low fucking maintenance!
No, deployment of a distributed system itself is complicated, regardless of the platform you deploy it on. Kubernetes is only "complicated" because it can do all of the things you need to do to deploy software, in a standard way. You can simplify by not using Kube, but then you have to hand roll all of the capabilities that Kube just gives you for free. If you don't need a good hunk of those capabilities, you probably don't need Kube.
They want their system to be resilient to hardware failures. So when a server inevitably goes down some day, they want their website to continue to work. Very few people want their website to go down.
They want their system to scale. So when the sudden rise of popularity hits the load balancer, they want their website to continue to work.
In the past, the price to run a distributed system was too high, so most people accepted the downsides of running a simple system.
Nowadays the price to run a distributed system is so low, that it makes little sense to avoid it anymore, for almost any website, if you can afford more than $50/month.
But in a company that had properly reliable infrastructure, any system that moved to the new infra based on K8s needed much less maintenance, had much more standardized DevOps (which allowed people from other teams to chime in when needed), and had far fewer mistakes. There was no disagreement that K8s streamlined everything.
One could make the argument that deployments which necessitate K8s today are too complex - I think there's a more convincing argument there - but my previous company was extremely risk averse in architecture (no resumé-driven development) and eventually moved to K8s, and systems at my current company often end up being way simpler than anyone would expect, yet at scale the coordination without a K8s equivalent would be too much work.
Organisations with needs big and complex enough for something like Kubernetes are big enough to have dedicated people looking after it.
In fact, it was so low maintenance that I lost my SSH key for the master node and I had to reprovision the entire cluster. Took about 90 mins including the time spent updating my docs. If it was critical I could have got that down to 15 mins tops.
20€/mo for a k8s cluster using k3s, exclusively on ARM, 3 nodes 1 master, some storage, and a load balancer with automatic dns on cloudflare.
Anecdotes like these are not helpful.
I have thousands of services I'm running on ~10^5 hosts and all kinds of compliance and contractual requirements to how I maintain my systems. Maintenance pain is a very real table-stakes conversation for people like us.
Your comment is like pure noise in our space.
You're right: they're using managed Kubernetes instead - which covers 90% of the maintenance complexity.
But nowhere near the reductive bullshit comment of "there's no maintenance because my $20/mo stack says so".
Installing is super fast.
We don't back up the cluster, for example, for that reason (except databases etc.); we just reprovision the whole cluster.
I believe there's no need to go that path for most applications. A simple kustomize script already handles most of the non-esoteric needs.
Sometimes things are simple if that's how we want them to be.
Instead of editing some YAML files, in the "old" days these software vendors would've asked you to maintain a cronjob, ansible playbooks, systemd unit, bash scripts...
It's indeed a lot of maintenance to run things this way. You're no longer just operationalizing your own code; you're also operating (as you mentioned) CI/CD, secret management, logging, analytics, storage, databases, cron tasks, message brokers, etc. You're doing everything.
On one hand (if you're not doing anything super esoteric or super cloud specific), migrating kubernetes-based deployments between clouds has always been super easy for me. I'm currently managing a k3s cluster that's running a nodepool on AWS and a nodepool on Azure.
I'll admit I know very little about the history of kubernetes before ~2017, BUT 2017-present kubernetes is absolutely designed/capable of being your end to end solution for everything.
Take the random list I made and the meme page at:
- CI/CD [github, gitlab, circleci]: https://landscape.cncf.io/guide#app-definition-and-developme...
- secret management [IAM, SecretsManager, KeyVault]: https://landscape.cncf.io/guide#provisioning--key-management
- logging & analytics [CloudWatch, AppInsights, Splunk, Tablue, PowerBI] : https://landscape.cncf.io/guide#observability-and-analysis--...
- storage [S3, disks, NFS/SMB shares]: https://landscape.cncf.io/guide#runtime--cloud-native-storag...
- databases: https://landscape.cncf.io/guide#app-definition-and-developme...
- cron tasks: [Built-in]
- message brokers: https://landscape.cncf.io/guide#app-definition-and-developme...
The idea is wrap cloud provider resources in CRDs. So instead of creating an AWS ELB or an Azure SLB, you create a Kubernetes service of type LoadBalancer. Then kubernetes is extensible enough for each cloud provider to swap what "service of type LoadBalancer" means for them.
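In other words, roughly the following; the name and ports are placeholders:

    apiVersion: v1
    kind: Service
    metadata:
      name: my-frontend
    spec:
      type: LoadBalancer   # each cloud's controller turns this into its own LB (ELB, SLB, ...)
      selector:
        app: my-frontend
      ports:
        - port: 443
          targetPort: 8443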
For higher abstraction services (SaaS like ones mentioned above) the idea is similar. Instead of creating an S3 bucket, or an Azure Storage Account, you provision CubeFS on your cluster (So now you have your own S3 service) then you create a CubeFS Bucket.
You can replace all the services listed above, with free and open source (under a foundation) alternatives. As long as you can satisfy the requirements of CubeFS, you can have your own S3 service.
Of course you're now maintaining the equivalent of github, circleci, S3, ....
Kubernetes gives you a unified way of deploying all these things regardless of the cloud provider. Your target is Kubernetes, not AWS, Microsoft or Google.
The main benefit (to me) is with Kubernetes you get to choose where YOU want to draw the line of lock-in vs value. We all have different judgements after all
Do you see no value in running and managing Kafka? Maybe SQS is simple enough and cheap enough that you just use it. Replacing it with a compatible endpoint is cheap.
Are you terrified of building your entire event based application on top of SQS and Lambda? How about Kafka and ArkFlow?
Now you obviously trade one risk for another. You're trading the risk of vendor lock-in with AWS, but at the same time just because ArkFlow is free and open source, doesn't mean that it'll be as maintained in 8 years as AWS Lambda is gonna be. Maybe maybe not. You might have to migrate to another service.
On this we agree. That's a nontrivial amount of undifferentiated heavy lifting--and none is a core feature of K8S. You are absolutely right that you can use K8S CRDs to use K8S as the control plane and reduce the number of idioms you have to think about, but the dirty details are in the data plane.
> and none is a core feature of K8S
The core feature of k8s is "container orchestration", which is extremely broad - you can run whatever you can express as orchestrated containers, which is everything. The other core feature is extensibility and abstraction. So to me CRDs are as core to kubernetes as anything else, really. They are such a simple concept that custom vs built-in is sometimes only a matter of availability and quality.
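For reference, a minimal CRD really is just another API object; a sketch with made-up group and kind names:

    apiVersion: apiextensions.k8s.io/v1
    kind: CustomResourceDefinition
    metadata:
      name: widgets.example.com        # must be <plural>.<group>
    spec:
      group: example.com
      scope: Namespaced
      names:
        plural: widgets
        singular: widget
        kind: Widget
      versions:
        - name: v1
          served: true
          storage: true
          schema:
            openAPIV3Schema:
              type: object
              properties:
                spec:
                  type: object
                  properties:
                    size:
                      type: integer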
> That's a nontrivial amount of undifferentiated heavy lifting
Yes it is. Like I said, the benefit of kubernetes is that it gives you the choice of where you want to draw that line. Running and maintaining GitHub, CircleCI and S3 is a "nontrivial amount of undifferentiated heavy lifting" to you. The equation might be different for another business or organization. There is a popular "anti-corporation, pro big-government" sentiment on the internet today, right? Would it make sense for, say, an organization like the EU to take a hard dependency on GitHub or CircleCI? Or should they contract OVH and run their own GitHub and CircleCI instances?
People always complain about vendor-lock in, closed source services, bait and switch with services, etc. with Kubernetes, you get to choose what your anxieties are, and manage them yourself.
That is 100% not true and why different foundational services have (often vastly) different control planes. The Kubernetes control plane is very good for a lot of things, but not everything.
> People always complain about vendor-lock in, closed source services, bait and switch with services, etc. with Kubernetes, you get to choose what your anxieties are, and manage them yourself.
There is no such thing as zero switching costs (even if you are 100% on premise). Using Kubernetes can help reduce some of it, but you can't take a mature complicated stack running on AWS in EKS and port it to AKS or GKE or vice versa without a significant amount of effort.
The reality is yes, nothing has zero switching costs. There are plenty of best practices for how to utilize k8s for least-headache migrations. It's very doable and I see it done all the time.
If autoscaling doesn't save more $$ than the tech debt/maintenance burden, turn it off.
But I think a lot of people are in state where they need to run stuff the way it is because “just turn it off” won’t work.
Like a system that, after years on k8s, is coupled to its quirks. People not knowing how to set up and run stuff without k8s.
The trouble is you then start doing more. You start going way beyond what you were doing before. Like you ditch RDS and just run your DBs in cluster. You stop checking your pipelines manually because you implement auto-scaling etc.
It's not free, nobody ever said it was, but could you do all the stuff you mentioned on another system with a lower maintenance burden? I doubt it.
What it boils down to is running services has maintenance still, but it's hopefully lower than before and much of the burden is amortized across many services.
But you definitely need to keep an eye on things. Don't implement auto-scaling unless you're spending a lot of your time manually scaling. Otherwise you've now got something new to maintain without any payoff.
you do have to know what you're doing and not fall prey to the "install the cool widget" trap.
You can hide the ops person, but you can't remove them from the equation, which is what people seem to want.
Why not use protobuf, or similar interface definition languages? Then let users specify the config in whatever language they are comfortable with.
I see some value instead. Lately I've been working on Terraform code to bring up a whole platform in half a day (aws sub-account, eks cluster, a managed nodegroup for karpenter, karpenter deployment, ingress controllers, LGTM stack, public/private dns zones, cert-manager and a lot more) and I did everything in Terraform, including Kubernetes resources.
What I appreciated about creating Kubernetes resources (and helm deployments) in HCL is that it's typed and has a schema, so any ide capable of talking to an LSP (language server protocol - I'm using GNU Emacs with terraform-ls) can provide meaningful auto-completion as well proper syntax checking (I don't need to apply something to see it fail, emacs (via the language server) can already tell me what I'm writing is wrong).
I really don't miss having to switch between my ide and the Kubernetes API reference to make sure I'm filling each field correctly.
Not really. There are optional ways to add schema but nothing that's established enough to rely on.
Edit: apparently it just has popular ones built in: https://raw.githubusercontent.com/SchemaStore/schemastore/ma...
As much as Helm is prone to some type of errors, it will validate the schema before doing an install or upgrade so `version: 1.30` won't apply but `version: "1.30"` will.
In the age of AI, arguments about a configuration language are moot. Nobody is going to hand-craft those deployment configs anymore. The AIs can generate whatever weird syntax the underlying K8s machinery needs and do it more reliably than any human. If not today, then probably in 3 months or something.
If you want to dream big for K8s 2.0 let AI parse human intent to generate the deployments and the cluster admin. Call it vibe-config or something. Describe what you want in plain language. A good model will think about the edge cases and ask you some questions to clarify, then generate a config for your review. Or apply it automatically if you're an edgelord ops operator.
Let's face it, modern code generation is already heading mostly in that direction. You're interacting with an AI to create whatever application you have in your imagination. You still have to guide it to not make stupid choices, but you're not going to hand craft every function. Tell the AI "I want 4 pods with anti-affinity, rolling deployment, connected to the existing Postgres pod and autoscaling based on CPU usage. Think hard about any edge cases I missed." and get on with your life. That's where we're heading.
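(For reference, the manifest that prompt describes would come out roughly like the sketch below; the names, image, and thresholds are invented, and the Postgres connection is assumed to go through an existing Service.)

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: myapp
    spec:
      replicas: 4
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxUnavailable: 1
          maxSurge: 1
      selector:
        matchLabels:
          app: myapp
      template:
        metadata:
          labels:
            app: myapp
        spec:
          affinity:
            podAntiAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                - topologyKey: kubernetes.io/hostname
                  labelSelector:
                    matchLabels:
                      app: myapp
          containers:
            - name: myapp
              image: registry.example.com/myapp:1.0   # placeholder image
              env:
                - name: DATABASE_URL                  # assumes an existing "postgres" Service
                  value: postgres://postgres:5432/app
              resources:
                requests:
                  cpu: 250m
    ---
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: myapp
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: myapp
      minReplicas: 4
      maxReplicas: 8
      metrics:
        - type: Resource
          resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: 70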
I'm emotionally prepared for the roasting this comment will elicit so go for it. I genuinely want to hear pro and con arguments on this.
The idea that someone who works with JavaScript all day might find HCL confusing is hard for me to imagine.
To be clear, I am talking about the syntax and data types in HCL, not necessarily the way Terraform processes it, which I admit can be confusing/frustrating. But Kubernetes wouldn’t have those pitfalls.
outer {
example {
foo = "bar"
}
example {
foo = "baz"
}
}
it reminds me of the insanity of toml [lol]
[[whut]]
foo = "bar"
[[whut]]
foo = "baz"
only at least with toml I can $(python3.13 -c 'import tomllib, sys; print(tomllib.loads(sys.stdin.read()))') to find out, but with hcl too bad

server {
location / {
return 200 "OK";
}
location /error {
return 404 "Error";
}
}
server {
location / {
return 200 "OK";
}
location /error {
return 404 "Error";
}
}
But because we know that nginx is a configuration language for a server where we define the behavior we want through creating a set of resources (servers in this example, but "example"s or "whut"s in yours), the equivalent JSON would be:

{
"servers": [
{
"locations": [
{
"key": "/",
"return": {
"status": 200,
"content": "OK"
}
},
{
"key": "/error",
"return": {
"status": 404,
"content": "error"
}
}
]
},
{
"locations": [
{
"key": "/",
"return": {
"status": 200,
"content": "OK"
}
},
{
"key": "/error",
"return": {
"status": 404,
"content": "error"
}
}
]
}
]
}
I did see the "inspired by nginx" on hcl's readme, but that must be why you and Mitchell like hcl so much because I loathe nginx's config with all my heart
At the end of the day HCL isn't that much more complex than something like yaml. Saying "You don't want your developers learning HCL" is like saying "You don't want your developers learning Yaml. They send an email to the SRE guys, and the SRE guys are the Yaml experts".
People complain about YAML then go and use TOML.
$ hcl2json <<FOO | jq -r '"has \(length) keys"'
outer {
example {
foo = "bar"
}
example {
foo = "baz"
}
}
FOO
has 1 keys
Terraform converts HCL to JSON in a different way that makes a lot more sense, and hcldec provided by Hashicorp lets you define your own specification for how you want to use it.
I would like to point out that perfect compatibility with JSON is not a goal nor a requirement of a decent configuration language.
https://www.npmjs.com/package/@cdktf/hcl2json
https://github.com/hashicorp/terraform-cdk/tree/main/package...
I use the package to implement module dependency graph tracking across our terraform monorepo for things like code reviews and CI triggering.
I've found HCL perfectly adequate for vanilla Terraform, furthermore I've found no real reason to leverage CDK.
To be clear I am not saying HCL is amazing, merely that it's adequate and better than YAML, and that I'm not sure how it could be considered confusing. I personally think HashiCorp has jumped the shark since the IBM acquisition and that CDK will never gain mass adoption in a significant way.
As for why I linked to "some random guy," that's because the Hashicorp people in their infinite wisdom didn't ship any rendering binary so some kind soul had to glue one to the official sdk https://github.com/tmccombs/hcl2json/blob/v0.6.7/go.mod#L8
I hear you about JSON might not be a goal, but I can tell you that Terraform accepts .json files just as much as it accepts .hcl files so that sane people can generate them, because sane people cannot generate HCL https://developer.hashicorp.com/terraform/language/syntax/js...
Absolutely loathsome syntax IMO
variable "my_list" { default = ["a", "b"] }
resource whatever something {
for_each = var.my_list
}
The given "for_each" argument value is unsuitable: the "for_each" argument must be a map, or set of strings, and you have provided a value of type tuple.
Ansible's with_items: and loop: do the exact same thing with YAML.
HCL's looping and conditionals are a mess but they're wonderful in comparison to its facilities for defining and calling functions.
Good functional languages like Clojure make something like this awkward and painful to do, because it doesn't really make sense in a functional context.
It is claiming the provided thing is not a set of strings, it's a tuple.
The fix is
for_each = toset(var.my_list)
but who in their right mind would say "oh, I see you have a list of things, but I only accept unordered things"

zipmap(range(length(var.my_list)), var.my_list)

where you get {0 => item, 1 => item} and your resources will be named .0, .1, .2. I get the annoyance about the type naming but in HCL lists and tuples are the same thing.

Ok, so which one of ["a", "b"] in my example isn't unique?
And I definitely, unquestionably, never want resources named foo.0 and foo.1 when the input was foo.alpha and foo.beta
If you don't want .0, .1 then you agree with Terraform's design because that's the only way you could for_each over an arbitrary list that might contain duplicate values. Terraform making you turn your lists into sets gets you what you want.
This sounds a lot more like “I resented learning something new” than anything about HCL, or possibly confusing learning HCL simultaneously with something complex as a problem with the config language rather than the problem domain being configured.
The ideal solution would be to have an abstraction that is easy to use and does not require learning a whole new concept (especially one as ugly as hcl). Also, learning hcl is just the tip of the iceberg; then comes sinking into the dependencies between components, outputs read from a bunch of workspaces, etc. It is simply wasted time to have the devs keeping up with the whole terraform heap that SREs manage and keep evolving under the hood. The same dev time is better spent creating features.
If your argument is instead that they shouldn’t learn infrastructure, then the point is moot because that applies equally to every choice (knowing how to indent YAML doesn’t mean they know what to write). That’s also wrong as an absolute position but for different reasons.
I have no objections to protobufs, but I think that once you’re past a basic level of improvements over YAML (real typing, no magic, good validation and non-corrupting formatters) this matters less than managing the domain complexity. YAML is a bad choice for anything but text documents read by humans because it requires you to know how its magic works to avoid correct inputs producing unexpected outputs (Norway, string/float confusion, punctuation in string values, etc.) and every tool has to invent conventions for templating, flow control, etc. I think HCL is better across the board but am not dogmatic about it - I just want to avoid spending more time out of people’s lives where they realize that they just spent two hours chasing an extra space or not quoting the one value in a list which absolutely must be quoted to avoid misinterpretation.
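For reference, the classic YAML 1.1 gotchas being alluded to:

    # Each of these parses as something other than the string you probably meant:
    country: NO          # boolean false -- the "Norway problem"
    version: 1.30        # the float 1.3, not the string "1.30"
    window: 12:30        # a sexagesimal integer (750) in some YAML 1.1 parsers
    ---
    # Quoting gives you the strings you actually wanted:
    country: "NO"
    version: "1.30"
    window: "12:30"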
Given that its main purpose is to describe objects of various types, its facilities for describing types are pretty weak and its facilities for describing data are oddly inconsistent. Its features for loops and conditionals are weak and ad-hoc compared with a regular programming language. Identifier scope / imports are a disaster and defining a function is woeful.
Some of this is influenced by Terraform's problem domain but most of it isn't required by that (if it was required, people wouldn't have such an easy time doing their Terraform in python/go/rust).
Despite this, Terraform is overall good. But there's a reason there's a small industry of alternatives to HCL.
The actual format is binary and I'm not expecting people to pass around binary blobs that describe their deployment configuration.
To serialize it, you're writing code in some language... then what? You just want deployment as code? Because that seems like a separate paradigm.
What would you expect to be hearing different if HCL really was just an objectively bad language design?
I don't think I can take a JSON or YAML document and export it to an HCL that contains logic like that, but at the same time we shouldn't be doing that either IMHO. I very much do not want my configuration files to have side effects or behave differently based on the environment. "Where did this random string come from?" "It was created dynamically by the config file because this environment variable wasn't set" sounds like a nightmare to debug.
YAML is bad for a great many reasons and I agree with the author that we should get rid of it, but HCL is bad for this use case for the same reasons - it does too much. Unfortunately, I'm not sure if there's another configuration language that makes real honest sense in this use case that isn't full of nightmarish cruft also.
I'd agree that YAML isn't a good choice, but neither is HCL. Ever tried reading Terraform, yeah, that's bad too. Inherently we need a better way to configure Kubernetes clusters and changing out the language only does so much.
IPv6, YES, absolutely. Everything Docker, container and Kubernetes should have been IPv6-only internally from the start. Want IPv4? That should be handled by a special-case ingress controller.
The longer I look at k8s, the more I see it as "batteries not included" around storage, networking, etc., with the result being that the batteries come with a bill attached from AWS, GCP, etc. K8s is less of an open source project, and more a way to encourage dependency on these extremely lucrative gap-filler services from the cloud providers.
I invite anyone to try reading https://kubernetes.io/docs/setup/production-environment/. It starts with "there are many options". Inside every option there are even more options and considerations, half of which nobody cares about. Why would I ever care which container runtime or network stack is used? It's unnecessary complexity. Those are edge cases; give me something sane out of the box.
This is exactly what I love about k3s.
For me, the only thing that really changed was LLMs - chatgpt is exceptional at understanding and generating valid k8s configs (much more accurately than it can do coding). It's still complex, but it feels I have a second brain to look at it now
At some point you won't need a fully dedicated ops team. I think a lot of people in this discussion are oblivious to where this is heading.
I think programmers are more likely to go extinct before that version of reality materializes. That's my secret plan on how to survive the alleged AI apocalypse: AI ain't running its own data flow pipelines into its own training job clusters. As a non-confabulation, I give you https://status.openai.com. They have one of every color, they're collecting them!
Things we are aiming to improve:
* Globally distributed
* Lightweight, can easily run as a single binary on your laptop while still scaling to thousands of nodes in the cloud.
* Tailnet as the default network stack
* Bittorrent as the default storage stack
* Multi-tenant from the ground up
* Live migration as a first class citizen
Most of these needs were born out of building modern machine learning products, and the subsequent GPU scarcity. With ML taking over the world though this may be the norm soon.
> * Globally distributed

Non-requirement?
> * Tailnet as the default network stack
That would probably be the first thing I look to rip out if I ever was to use that.
Kubernetes assuming the underlying host only has a single NIC has been a plague for the industry, setting it back ~20 years and penalizing everyone that's not running on the cloud. Thank god there are multiple CNI implementation.
Only recently with Multus (https://www.redhat.com/en/blog/demystifying-multus) some sense seem to be coming back into that part of the infrastructure.
> * Multi-tenant from the ground up
How would this be any different from kubernetes?
> * Bittorrent as the default storage stack
Might be interesting, unless you also mean seeding public container images. Egress traffic is crazy expensive.
It is a requirement because you can't find GPUs in a single region reliably and Kubernetes doesn't run on multiple regions.
>> * Tailnet as the default network stack
> That would probably be the first thing I look to rip out if I ever was to use that.
This is fair, we find it very useful because it easily scales cross clouds and even bridges them locally. It was the simplest solution we could implement to get those properties, but in no way would we need to be married to it.
>> * Multi-tenant from the ground up
> How would this be any different from kubernetes?
Kubernetes is deeply not multi-tenant; anyone who has tried to make a multi-tenant solution on top of kube has dealt with this. I've done it at multiple companies now; it's a mess.
>> * Bittorrent as the default storage stack
> Might be interesting, unless you also mean seeding public container images. Egress traffic is crazy expensive.
Yeah, egress cost is a concern here, but it's lazy, so you don't pay for it unless you need it. This seemed like the lightest solution to sync data when you do live migrations across clouds. For instance, I need to move my dataset and ML model to another cloud, or just replicate it there.
How so? You can definitely use annotations in nodes and provision them in different regions.
But you're right. You can launch a node pretty much anywhere, as long as you have network connectivity (and you don't even need full network connectivity; a couple of open TCP ports are enough).
It's not really recommended (due to latency), but you can also run the control-plane across different regions.
Every time I’ve had multiple NICs on a server with different IPs, I’ve regretted it.
Network Policies are also defense in depth, since another Pod would need to know its sibling Pod's name or IP to reach it directly, the correct boundary for such things is not to expose management toys in the workload's Service, rather create a separate Service that just exposes those management ports
Akin to:
interface Awesome { String getFavoriteColor(); }
interface Management { void setFavoriteColor(String value); }
class MyPod implements Awesome, Management {}
but then only make either Awesome, or Management, available to the consumers of each behavior
> the first thing I look to rip out
This only shows how varied the requirements are across the industry. One size does not fit all, hence multiple materially different solutions spring up. This is only good.
Sooo… like what kubernetes does today?
Since k8s manifests are a language, there can be multiple implementations of it, and multiple dialects will necessarily spring up.
Also, ohgawd please never ever do this ever ohgawd https://github.com/agentsea/nebulous/blob/v0.1.88/deploy/cha...
If you mean Helm, yeah, I hate it, but it is the most common standard. Also, I'm not sure what you mean about the secret; that is secure.
I'd like to see more emphasis on UX for v2 for the most common operations, like deploying an app and exposing it, then doing things like changing service accounts or images without having to drop into kubectl edit.
Given that LLMs are it right now, this probably won't happen, but no harm in dreaming, right?
Even Terraform seems to live on just a single-layer and was relatively straight-forward to learn.
Yes, I am in the middle of learning K8s so I know exactly how steep the curve is.
Much of the complexity then comes from the enormous amount of resource types - including all the custom ones. But the basic idea is really pretty small.
I find terraform much more confusing - there’s a spec, and the real world.. and then an opaque blob of something I don’t understand that terraform sticks in S3 or your file system and then.. presumably something similar to a one-shot reconciler that wires that all together each time you plan and apply?
There's the state file (the base commit: what the system looked like the last time terraform successfully executed), the current system (the main branch, which might have changed since you "branched off"), and the terraform files (your branch).
Running terraform then merges your branch into main.
Now that I'm writing this down, I realize I never really checked whether this is accurate; tf apply works regardless, of course.
I don't know how to have a cute git analogy for "but first, git deletes your production database, and then recreates it, because some attribute changed that made the provider angry"
You skipped the { while true; do tofu plan; tofu apply; echo "well shit"; patch; done; } part since the providers do fuck-all about actually, no kidding, saying whether the plan could succeed
Declarative reconciliation is (very) nice but not irreplaceable (and actually not mandatory, e.g. kubectl run xyz)
You can invent a new resource type that spawns raw processes if you like, and then use k8s without pods or nodes, but if you take away the reconciliation system then k8s is just an idle etcd instance
Imagine if pods couldn't reach each other and you had to specify all networks and networking rules.
Or imagine that once you created a container you had to manually schedule it on a node. And when the node or pod crashes you have to manually schedule it somewhere else.
CRs, for example. CRs are an amazing design concept. The fact that everything is a CR, with some exceptions (like everything in the v1 API), and that their controllers are always Pods running somewhere is incredible. Debugging cluster irrops is so much easier once you internalize this.
However, this exemplifies my desire for better UX. It would be great if "kubectl logs" accepted a CR kind, which would have the API server automatically find its associated controller and work with the controller-manager to pump out its logs. Shit, I should make that PR, actually.
Even better would be a (much improved) built-in UI that UIops people can use to do something like this. This will become extremely important once the ex-VMware VI admins start descending onto Kubernetes clusters, which will absolutely happen given that k8s is probably the best vCenter alternative that exists (as non-obvious as that seems right now - though I work at Red Hat and am focused on OpenShift, so take that with two grains of salt).
I agree with you on a human level. Operators and controllers remind me of COM and CORBA, in a sense. They are highly abstract things that are intrinsically so flexible that they allow judgement (and misjudgement) in design.
For simple implementations, I'd want k8s-lite: more opinionated and less flexible, something that doesn't allow for as much shooting oneself in the foot. For very complex implementations, though, I've felt the existing abstractions to be limiting. There is a reason why a single cluster is sometimes the basis for cell boundaries in cellular architectures.
I sometimes wonder if one single system - kubernetes 2.0 or anything else - can encompass the full complexity of the problem space while being tractable to work with by human architects and programmers.
You seem to want something like https://skateco.github.io/ (still compatible to k8s manifests).
Or maybe even something like https://uncloud.run/
Or if you still want real certified Kubernetes, but small, there is https://k3s.io/
- I get HCL, types, resource dependencies, data structure manipulation for free
- I use a single `tf apply` to create the cluster, its underlying compute nodes, related cloud stuff like S3 buckets, etc; as well as all the stuff running on the cluster
- We use terraform modules for re-use and de-duplication, including integration with non-K8s infrastructure. For example, we have a module that sets up a Cloudflare ZeroTrust tunnel to a K8s service, so with 5 lines of code I can get a unique public HTTPS endpoint protected by SSO for whatever running in K8s. The module creates a Deployment running cloudflared as well as configures the tunnel in the Cloudflare API.
- Many infrastructure providers ship signed well documented Terraform modules, and Terraform does reasonable dependency management for the modules & providers themselves with lockfiles.
- I can compose Helm charts just fine via the Helm terraform provider if necessary. Many times I see Helm charts that are just "create namespace, create foo-operator deployment, create custom resource from chart values" (like Datadog). For these I opt to just install the operator & manage the CRD from terraform directly, or via a thin Helm pass-through chart that just echoes whatever HCL/YAML I put in from Terraform values.
Terraform’s main weakness is orchestrating the apply process itself, similar to k8s with YAML or whatever else. We use Spacelift for this.
We do sometimes have the mutating webhook stuff, for example when running 3rd party JVM stuff we tell the Datadog operator to inject JMX agents into applications using a mutating webhook. For those kinds of things we manage the application using the Helm provider pass-through I mentioned, so what Terraform stores in its state, diffs, and manages on change is the input manifests passed through Helm; for those resources it never inspects the Kubernetes API objects directly - it will just trigger a new helm release if the inputs change on the Terraform side or delete the helm release if removed.
This is not a beautiful solution but it works well in practice with minimal fuss when we hit those Kubernetes provider annoyances.
Last time I tried loading the OpenAPI schema in the Swagger UI on my work laptop (this was ~3-4 years ago, and I had an 8th-gen Core i7 with 16GB RAM), it hung my browser, leading to a tab crash.
4 and 5 probably don't require a 2.0 - they can easily be added within the existing API via a KEP (cri-o already does userns configuration based on annotations).
I maintained borgcfg 2015-2019
The biggest lesson k8s drew from borg is to replace bcl (borgcfg config language) with yaml (by Brian Grant)
Then this article suggests reversing that.
Yep, knowledge not experienced is just fantasy
The fact that YAML encourages you to never put double-quotes around strings means that (based on my experience at $DAYJOB) one gets turbofucked by YAML mangling input by converting something that was intended to be a string into another datatype at least once a year. On top of that, the whitespace-sensitive mode makes long documents extremely easy to get lost in, and hard to figure out how to correctly edit. On top of that, the fact that the default mode of operation for every YAML parser I'm aware of emits YAML in the whitespace-sensitive mode means that approximately zero of the YAML you will ever encounter is written in the (more sane) whitespace-insensitive mode. [0]
It may be that bcl is even worse than YAML. I don't know, as I've never worked with bcl. YAML might be manna from heaven in comparison... but that doesn't mean that it's actually good.
[0] And adding to the mess is the fact that there are certain constructions (like '|') that you can't use in the whitespace-insensitive mode... so some config files simply can't be straightforwardly expressed in the other mode. "Good" job there, YAML standard authors.
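To make the string-mangling concrete, here is a tiny example of what a YAML 1.1 parser (which is what most parsers in the wild still implement) does with unquoted scalars; the keys are made up:
country: NO            # parsed as the boolean false, not the string "NO"
debug: yes             # parsed as the boolean true
version: 1.20          # parsed as the float 1.2, silently dropping the zero
country_quoted: "NO"   # quoting keeps it the string "NO"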
The easy solution here is to generate the YAML from the HCL (or from helm, or whatever other abstraction you choose) and to commit and apply the YAML.
More broadly, I think Kubernetes has a bit of a marketing problem. There is a core 20% of the k8s API which is really good and then a remaining 80% of niche stuff which only big orgs with really complex deployments need to worry about. You likely don't need (and should not use) that cloud native database that works off CRDs. But if you acknowledge this aspect of its API and just use the 20%, then you will be happy.
A lot simpler hopefully. It never really took off but docker swarm had a nice simplicity to it. Right idea, but Docker Inc. mismanaged it.
Unfortunately, Kubernetes evolved into a bit of a monster. Designed to be super complicated. Full of pitfalls. In need of vast amounts of documentation, training, certification, etc. Layers and layers of complexity, hopelessly overengineered. I.e. lots of expensive hand-holding. My rule with technology is that if the likes of Red Hat, IBM, etc. get really excited, run away. Because they are seeing lots of dollars for exactly the kind of stuff I don't want in my life.
Leaving Kubernetes 2.0 to the people that did 1.0 is probably just going to lead to more of the same. The people behind it need it to be convoluted and hard to use. That's how they make money. If it were easy, they'd be out of business.
I use Terranix to make config.tf.json which means I have the NixOS module system that's composable enough to build a Linux distro at my fingertips to compose a great Terraform "state"/project/whatever.
It's great to be able to run some Python to fetch some data, dump it in JSON, read it with Terranix, generate config.tf.json and then apply :)
I couldn't tell you exactly, but modules always end up either not exposing enough or exposing too much. If I were to write my module with Terranix I can easily replace any value in any resource from the module I'm importing using "resource.type.name.parameter = lib.mkForce "overridenValue";" without having to expose that parameter in the module "API".
The nice thing is that it generates "Terraform"(config.tf.json) so the supremely awesome state engine and all API domain knowledge bound in providers work just the same and I don't have to reach for something as involved as Pulumi.
You can even mix Terranix with normal HCL since config.tf.json is valid in the same project as HCL. A great way to get started is to generate your provider config and other things where you'd reach to Terragrunt/friends. Then you can start making options that makes resources at your own pace.
The terraform LSP sadly doesn't read config.tf.json yet so you'll get warnings regarding undeclared locals and such but for me it's worth it, I generally write tf/tfnix with the provider docs open and the language (Nix and HCL) are easy enough to write without full LSP.
https://terranix.org/ says it better than me, but by doing it with Nix you get programmatic access to the biggest package library in the world to use at your discretion (build scripts to fetch values from weird places, run impure scripts with null_resource or its replacements) and an expressive functional programming language where you can do recursion and stuff; you can use derivations to run any language to transform strings with ANY tool.
It's like Terraform "unleashed" :) Forget "dynamic" blocks, bad module APIs and hacks (While still being able to use existing modules too if you feel the urge).
I'm not familiar with HCL so I'm struggling to find much here that would be conclusive, but a lot of this thread sounds like "HCL's features that YAML does not have are sub-par and not sufficient to let me only use HCL" and... yeah, you usually can't use YAML that way either, so I'm not sure why that's all that much of a downside?
I've been idly exploring config langs for a while now, and personally I tend to just lean towards JSON5 because comments are absolutely required... but support isn't anywhere near as good or automatic as YAML :/ HCL has been on my interest-list for a while, but I haven't gone deep enough into it to figure out any real opinion.
The setup with Terranix sounds cool! I am pretty interested in build system type things myself, I recently wrote a plan/apply system too that I use to manage SQL migrations.
I want to learn Nix, but I think that, like Rust, it's just a bit too wide/deep for me to approach on my own time without a tutor/co-worker or a forcing function like a work project to push me through the initial barrier.
Try using something like devenv.sh initially just to bring tools into $PATH in a distro agnostic & mostly-ish MacOS compatible way (so you can guarantee everyone has the same versions of EVERYTHING you need to build your thing).
Learn the language basics after it brings you value already, then learn about derivations and then the module system which is this crazy composable multilayer recursive magic merging type system implemented on top of Nix, don't be afraid to clone nixpkgs and look inside.
Nix derivations are essentially Dockerfiles on steroids, but Nix language brings /nix/store paths into the container, sets environment variables for you and runs some scripts, and all these things are hashed so if any input changes it triggers automatic cascading rebuilds, but also means you can use a binary cache as a kind of "memoization" caching thingy which is nice.
It's a very useful tool, it's very non-invasive on your system (other than disk space if you're not managing garbage collection) and you can use it in combination with other tools.
Makes it very easy to guarantee your DevOps scripts runs exactly your versions of all CLI tools and build systems and whatever even if the final piece isn't through Nix.
Look at "pgroll" for Postgres migrations :)
If the documentation and IDE story for kustomize was better, I'd be its biggest champion
And since it's Terraform, you can use any provider in the registry to create resources that correspond to your Kubernetes objects too; it can technically replace things like external-dns and similar controllers that create stuff in other clouds, but in a more "static configuration" way.
Edit: This works nicely with Gitlab Terraform state hosting thingy as well.
HCL, like YAML, doesn't even have a module system. It's a data serialization format.
1. Instead of recreating the "gooey internal network" anti-pattern with CNI, provide strong zero-trust authentication for service-to-service calls.
2. Integrate with public networks. With IPv6, there's no _need_ for an overlay network.
3. Interoperability between several K8s clusters. I want to run a local k3s controller on my machine to develop a service, but this service still needs to call a production endpoint for a dependent service.
I am not aware of any reason why you couldn't connect directly to any Pod, which necessarily includes the kube-apiserver's Pod, from your workstation except for your own company's networking policies
> I am not aware of any reason why you couldn't connect directly to any Pod, which necessarily includes the kube-apiserver's Pod, from your workstation except for your own company's networking policies
I don't think I can create a pod with EKS that doesn't have any private network?
K8s will happily let you run a pod with host network, and even the original "kubenet" network implementation allowed directly calling pods if your network wasn't more damaged than a brain after multiple strokes combined with a headshot from a tank cannon (aka most enterprise networks in my experience).
And, if it wasn't obvious: VPC-CNI isn't the only CNI, nor even the best CNI, since the number of ENIs that one can attach to a Node varies based on its instance type, which is just stunningly dumb IMHO. Using an overlay network allows all Pods that can fit upon a Node to run there.
From your lips to God's ears. And, as they correctly pointed out, this work is already done, so I just do not understand the holdup. Folks can continue using etcd if it's their favorite, but mandating it is weird. And I can already hear the butwhataboutism yet there is already a CNCF certification process and a whole subproject just for testing Kubernetes itself, so do they believe in the tests or not?
> The Go templates are tricky to debug, often containing complex logic that results in really confusing error scenarios. The error messages you get from those scenarios are often gibberish
And they left off that it is crazypants to use a textual templating language for a whitespace sensitive, structured file format. But, just like the rest of the complaints, it's not like we don't already have replacements, but the network effect is very real and very hard to overcome
That barrier of "we have nicer things, but inertia is real" applies to so many domains, it just so happens that helm impacts a much larger audience
I personally use TypeScript since it has unions and structural typing with native JSON support but really anything can work.
application/json
application/json;stream=watch
application/vnd.kubernetes.protobuf
application/vnd.kubernetes.protobuf;stream=watch
application/yaml
which I presume all get coerced into protobuf before being actually interpreted
1. Helm: make it official, ditch the text templating. The helm workflow is okay, but templating text is cumbersome and error-prone. What we should be doing instead is patching objects (see the kustomize-style sketch after this list). I don't know how, but I should be setting fields, not making sure my values contain text that is correctly indented (how many spaces? 8? 12? 16?)
2. Can we get a rootless kubernetes already, as a first-class citizen? This opens a whole world of possibilities. I'd love to have a physical machine at home where I'm dedicating only an unprivileged user to it. It would have limitations, but I'd be okay with it. Maybe some setuid-binaries could be used to handle some limited privileged things.
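For what it's worth, kustomize already gets at the patch-objects-not-text idea; a rough sketch, with the file and resource names made up:
# kustomization.yaml
resources:
  - deployment.yaml
patches:
  - path: replicas-patch.yaml

# replicas-patch.yaml - a strategic merge patch: you set fields, no indent math
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 5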
So, to circle back to your original point: rke2 (Apache 2) is a fantastic, airgap-friendly, intelligence community approved distribution, and pairs fantastic with rancher desktop (also Apache 2). It's not the kubernetes part of that story which is hard, it's the "yes, but" part of the lego build
- https://github.com/rancher/rke2/tree/v1.33.1%2Brke2r1#quick-...
- https://github.com/rancher-sandbox/rancher-desktop/releases
https://artifacthub.io/docs/topics/repositories/
You can do the same with just about any K8s related artifact. We always encourage projects to go through the process but sometimes they need help understanding that it exists in the first place.
Artifacthub is itself an incubating project in the CNCF, ideas around making this easier for everyone are always welcome, thanks!
(Disclaimer: CNCF Staff)
Including ingress-nginx? Per OP, it's not marked as verified. If even the official components don't bother, it's hard to recommend it to third parties.
Is that true? Is no one really using it?
I think one thing k8s needs is an obvious answer for stateful systems (at scale, not MySQL at a startup). I think there are some ways to do it? Where I work, basically everything is on k8s, but all the databases are on their own crazy special systems to support; they insist putting them on k8s is impossible and costs too much. I work in the worst of all worlds now, supporting this.
Re: the comments that k8s should just schedule pods - Mesos with Aurora or Marathon was basically that. If people wanted that, those would have done better. The biggest users of Mesos switched to k8s.
1. etcd did an fsync on every write and required all nodes to complete a write to report a write as successful. This was not configurable and far higher a guarantee than most use cases actually need - most Kubernetes users are fine with snapshot + restore an older version of the data. But it really severely impacts performance.
2. At the time, etcd had a hard limit of 8GB. Not sure if this is still there.
3. Vanilla etcd was overly cautious about what to do if a majority of nodes went down. I ended up writing a wrapper program to automatically recover from this in most cases, which worked well in practice.
In conclusion there was no situation where I saw etcd used that I wouldn't have preferred a highly available SQL DB. Indeed, k3s got it right using sqlite for small deployments.
Of course configurability is good (e.g. for automated fast tests you don't need it), but safe is a good default here, and if somebody sets up a Kubernetes cluster, they can and should afford enterprise SSDs where fsync of small data is fast and reliable (e.g. 1000 fsyncs/second).
I didn't! Our business DR plan only called for us to restore to an older version with short downtime, so fsync on every write on every node was a reduction in performance for no actual business purpose or benefit. IIRC we modified our database to run off ramdisk and snapshot every few minutes which ran way better and had no impact on our production recovery strategy.
> if somebody sets up a Kubernetes cluster, they can and should afford enterprise SSDs where fsync of small data is fast and reliable
At the time one of the problems I ran into was that public cloud regions in southeast asia had significantly worse SSDs that couldn't keep up. This was on one of the big three cloud providers.
1000 fsyncs/second is a tiny fraction of the real world production load we required. An API that only accepts 1000 writes a second is very slow!
Also, plenty of people run k8s clusters on commodity hardware. I ran one on an old gaming PC with a budget SSD for a while in my basement. Great use case for k3s.
If a ramdisk is sufficient for your use case, why would you use a Raft-based distributed networked consensus database like etcd in the first place?
Its whole point is to protect you not only from power failure (like fsync does) but even from entire machine failure.
And the network is usually higher latency than a local SSD anyway.
> 1000 fsyncs/second is a tiny fraction of the real world production load we required
Kubernetes uses etcd for storing the cluster state. Do you update the cluster state more than 1000 times per second? Curious what operation needs that.
Ooo, ooo, I know this one! It's for clusters with more than approximately 300 Nodes, as their status polling will actually knock over the primary etcd cluster. That's why kube-apiserver introduced this fun parameter: https://kubernetes.io/docs/reference/command-line-tools-refe... and the "but why" of https://kubernetes.io/docs/setup/best-practices/cluster-larg...
I've only personally gotten clusters up to about 550 so I'm sure the next scaling wall is hiding between that and the 5000 limit they advertise
Polling the status sounds like a read-only operation, why would it trigger an fsync?
So, I don't know right this second what the formal reconciliation loop is called for that Node->apiserver handshake, but it is a polling operation in that the connection isn't left open all the time, and it is a status-reporting operation. So that's how I ended up calling it "status polling." It is a write operation because whichever etcd member is assigned to track the current state of the Events needs to be mutated to record the current state of the Events.
It actually could be that even a smaller sized cluster could get into this same situation if there were a bazillion tiny Pods (e.g. not directly correlated with the physical size of the cluster) but since the limits say one cannot have more than 110 Pods per Node, I'm guessing the Event magnification is easier to see with physically wider clusters
Until your writes outrun the disk performance/page cache and your disk I/O performance spikes. Linux used to be really bad at this when memory cgroups were involved, until a couple of years ago.
> If a ramdisk is sufficient for your use case, why would you use a Raft-based distributed networked consensus database like etcd in the first place?
Because at the time Kubernetes required it. If the adapters to other databases existed at the time I would have tested them out.
> Kubernetes uses etcd for storing the cluster state. Do you update the cluster state more than 1000 times per second? Curious what operation needs that.
Steady state in a medium to large cluster exceeds that. At the time I was looking at these etcd issues I was running fleets of 200+ node clusters and hitting a scaling wall around 200-300. These days I use a major Kubernetes service that does not use etcd behind the scenes and my fleets can scale up to 15000 nodes at the extreme end.
Right now, running K8S on anything other than cloud providers and toys (k3s/minikube) is a disaster waiting to happen unless you're a really seasoned infrastructure engineer.
Storage/state is decidedly not a solved problem, debugging performance issues in your longhorn/ceph deployment is just pain.
Also, I don't think we should be removing YAML; we should instead get better at using it as an ILR (intermediate language representation) and generating the YAML that we want, instead of trying to do some weird in-place generation (Argo/Helm templating). Kubernetes sacrificed a lot of simplicity to be eventually consistent with manifests, and our response was to ensure we use manifests as little as possible, which feels incredibly bizarre.
Also, the design of k8s networking feels like it fits ipv6 really well, but it seems like nobody has noticed somehow.
My main pain point is, and always has been, helm templating. It's not aware of YAML or k8s schemas and puts the onus of managing whitespace and syntax onto the chart developer. It's pure insanity.
At one point I used a local Ansible playbook for some templating. It was great: it could load resource template YAMLs into a dict, read separately defined resource configs, and then set deeply nested keys in said templates and spit them out as valid YAML. No helm `indent` required.
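A rough sketch of that pattern - the module and filter names (lookup, from_yaml, combine, to_nice_yaml) are standard Ansible, everything else here is made up:
- name: Render a Deployment without text templating
  hosts: localhost
  tasks:
    - name: Load the base manifest and merge overrides onto it as data
      ansible.builtin.set_fact:
        deployment: >-
          {{ lookup('file', 'deployment-base.yaml') | from_yaml
             | combine(overrides, recursive=true) }}
      vars:
        overrides:
          spec:
            replicas: 3
    - name: Write it back out as valid YAML
      ansible.builtin.copy:
        dest: rendered/deployment.yaml
        content: "{{ deployment | to_nice_yaml }}"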
It's the "make break glass situation little easier" option, not the main mechanism.
Use a programming language, a dedicated DSL, or hell, a custom process involving CI/CD that generates JSON manifests. A bit of time with jsonnet gets you a template that people who have never written jsonnet (and barely ever touched k8s manifests) can use to set up new applications or modify existing ones in moments.
* Uses local-only storage provider by default for PVC
* Requires entire cluster to be managed by k3s, meaning no freebsd/macos/windows node support
* Master TLS/SSL Certs not rotated (and not talked about).
k3s is very much a toy - a nice toy though, very fun to play with.
If you deployed this to production, though, the problem is not that it's a toy: it's you not understanding what technical trade-offs they are making to give you an easy environment.
I can tell you’re defensive though, might be good to learn what I mean instead of trying to ram a round peg into a square hole.
k3s is definitely fine to run in production, hence it's not a toy. You just have to understand the tradeoffs they've made, when you should change the defaults, and where it's a reasonable choice compared to other alternatives.
A toy implementation of Kubernetes would be something like a personal hobby project made for fun or to learn, that comes with critical bugs and drawbacks. That's not what k3s is.
Do not listen to this person, and do not buy any products of theirs where they have operational control of production platforms.
If you want to run a program on a computer, the most basic way is to open a command line and invoke the program.
And that executes it on one computer, number of CPUs TBD.
But with modern networking primitives and foundations, why can I not open a command line and have a concise way of orchestrating and execution of a program across multiple machines?
I have tried several times to do this, writing utility code for Cassandra. I got, in my opinion, very temptingly close to being able to do it.
Likewise with docker, vagrant, and yes, kubernetes, with their CLI interfaces for running commands on containers, the CLI fundamentals are also there.
Others taking a shot at this are SaltStack, Ansible, and those types of things, but they really seem to be concerned mostly with enterprise contracts rather than the core of pure CLI execution.
Security is really a pain in the ass when it comes to things like this. Your CLI prompt has a certain security assurance with it. You've already logged in.
That's a side note. One of the frustrations I started running into as I was doing this is the enterprise obsession with requiring a manual login / TOTP code to access anything. Holy hell, do I have to jump through hoops in order to automate things across multiple machines when they have TOTP barriers for accessing them.
The original Kubernetes kind of handwaved a lot of this away by forcing the removal of jump boxes, a flat network plane, etc.
I wouldn't so much go HCL as something like Jsonnet, Pkl, Dhall, or even (inspiration, not recommendation) Nix - we need something that allows a schema for powering an LSP, with enough expressivity to obviate the need for Helm's templating monstrosity, and ideally with the ability for users to override things that library/package authors haven't provided explicit hooks for.
Does that exist yet? Probably not, but the above languages are starting to approach it.
it's okay to be declarative (the foundation layer), but people usually think in terms of operations (deploy this, install that, upgrade this, assign some vhost, filter that, etc.)
and even for the declarative stuff, a builder pattern works well; it can be super convenient, sufficiently terse, and typed, and easily composed (and assembled via normal languages, not templates)
...
well. anyway. maybe by the time k8s2 rolls around Go will have at least normal error handling.
While I agree type safety in HCL beats that of YAML (a low bar), it still leaves a LOT to be desired. If you're going to go through the trouble of considering a different configuration language anyway, let's do ourselves a favor and consider things like CUE[1] or Starlark[2] that offer either better type safety or much richer methods of composition.
1. https://cuelang.org/docs/introduction/#philosophy-and-princi...
2. https://github.com/bazelbuild/starlark?tab=readme-ov-file#de...
Every JSON Schema aware tool in the universe will instantly know this PodSpec is wrong:
kind: 123
metadata: [ {you: wish} ]
I think what is very likely happening is that folks are -- rightfully! -- angry about using a text templating language to try and produce structured files. If they picked jinja2 they'd have the same problem -- it does not consider any textual output as "invalid" so jinja2 thinks this is a-ok jinja2.Template("kind: {{ youbet }}").render(youbet=True)
I am aware that helm does *YAML* sanity checking, so one cannot just emit whatever crazypants yaml they wish, but it does not then go one step further to say "uh, your json schema is fubar, friend"
I disagree that YAML is so bad. I don't particularly like HCL. The tooling I use doesn't care though -- as long as I can still specify things in JSON, then I can generate (not template) what I need. It would be more difficult to generate HCL.
I'm not a fan of Helm, but it is the de facto package manager. The main reason I don't like Helm has more to do with its templating system. Templated YAML is very limiting, when compared to using a full-fledged language platform to generate a datastructure that can be converted to JSON. There are some interesting things you can do with that. (cdk8s is like this, but it is not a good example of what you can do with a generator).
On the other hand, if HCL allows us to use modules, scoping, and composition, then maybe it is not so bad after all.
As a die-hard fan of cdk8s (and the AWS CDK), I am curious to hear about this more. What do you feel is missing or could be done better?
I wrote https://github.com/matsuri-rb/matsuri ... I have not really promoted it. I tried cdk8s because the team I was working with used Typescript and not Ruby, and I thought cdk8s would work well, since it generates manifests instead of templating them.
Matsuri takes advantage of language features in Ruby not found in Typescript (and probably not Python) that allow for composing things together. Instead of trying to model objects, it is based around constructing a hash that is then converted to JSON. It uses fine-grained method overriding to allow for (1) mixins, and (2) configuration from default values. The result is that with very little ceremony, I can get something to construct the manifest I need. There was a lot of extra ceremony and boilerplate needed to do anything in the Typescript cdk8s.
While I can use class inheritance with Matsuri, over the years, I had moved away from it because it was not as robust as mixins (compositions). It was quite the shock to try to work with Typescript cdk8s and the limitations of that approach.
The main reason I had not promoted Matsuri is because this tool is really made for people who know Ruby well ... but that might have been a career mistake not to try. Instead of having 10 years to slowly get enough support behind it, people want something better supported such as cdk8s or Helmfiles.
Besides, easily half of this thread is whining about helm for which docker-compose has no answer whatsoever. There is no $(docker compose run oci://example.com/awesome --version 1.2.3 --set-string root-user=admin)
helm is an anti pattern, nobody should touch it with a 10 foot pole
It's Hashicorp, so you have to be wary, but Nomad fills this niche.
https://developer.hashicorp.com/terraform/language/expressio...
This actually makes me want to give HCL another chance
https://github.com/hashicorp/hcl/blob/v2.23.0/hclsyntax/spec... I believe is the actual language specification of heredocs
What actually makes Kubernetes hard to set up by yourself are a) CNIs, in particular if you intend to avoid cloud-provider-specific CNIs, support all networking (and security!) features, and still have high performance; and b) all the cluster PKI with all the certificates for all the different components, which Kubernetes made an absolute requirement because, well, production-grade security.
So if you think you're going to make an "easier" Kubernetes, I mean, you're avoiding all the lessons learned and why we got here in the first place. CNI is hardly the naive approach to the problem.
Complaining about YAML and Helm is dumb. Kubernetes doesn't force you to use either. The API server expects JSON in the end anyway. Use whatever you like.
I'm going out on a limb to say you've only ever used hosted Kubernetes, then. A sibling comment mentioned their need for vanity tooling to babysit etcd and my experience was similar.
If you are running single node etcd, that would also explain why you don't get it: you've been very, very, very, very lucky never to have that one node fail, and you've never had to resolve the very real problem of ending up with just two etcd nodes running
And you know... etcd supports five-node clusters, precisely to help support people who are paranoid about extended single node failure.
A big benefit could be for the infrastructure language to match the developer language. However, knowing software, reinventing something like Kubernetes is a bottomless pit type of task, best off just dealing with it and focusing on the Real Work (TM), right?
I want to be able to say in two or five lines of YAML:
- run this as 3 pods with a max of 5
- map port 80 to this load balancer
- use these environment variables
I don't really care if it's YAML or HCL. Moving from YAML to HCL is just going to trade an endless stream of "I missed an indent somewhere" for "I forgot to close a curly bracket somewhere".
& get automatic scaling out of the box etc. a more simplified flow rather than wrangling yaml or hcl
in short imaging if k8s was a 2-3 max 5 line docker compose like file
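Purely as an illustration of the wish (none of this syntax exists anywhere), the kind of file being imagined might look something like:
app: my-api
image: registry.example.com/my-api:1.4.2
replicas: { min: 3, max: 5 }
expose: { port: 80, loadBalancer: true }
env: { LOG_LEVEL: info }
Expanding that into a Deployment, Service, and autoscaler would be the platform's job, not the user's.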
What Kubernetes is missing most is a 10 year track record of simplicity/stability. What it needs most to thrive is a better reputation of being hard to foot-gun yourself with.
It's just not a compelling business case to say "Look at what you can do with Kubernetes - and you only need a full-time team of 3 engineers dedicated to this technology, at the cost of a million a year, to get bin-packing to the tune of $40k."
For the most part Kubernetes is becoming the common tongue, despite all the chaotic plugins and customizations that interact with each other in a combinatoric explosion of complexity/risk/overhead. A 2.0 would be what I'd propose if I were trying to kill Kubernetes.
Kubernetes 2.0 should be a boring pod scheduler with some RBAC around it. Let folks swap out the abstractions if they need it instead of having everything so tightly coupled within the core platform.
Kubernetes clusters shouldn't be bespoke and weird with behaviors that change based on what flavor of plugins you added. That is antithetical to the principal of the workloads you're trying to manage. You should be able to headshot the whole thing with ease.
Service discovery is just one of many things that should be a different layer.
Hard agree. It's like Jenkins: good idea, but it's not portable.
Having regretfully operated both k8s and Jenkins, I fully agree with this, they have some deep DNA in common.
Sure, but then one of those third party products (say, X) will catch up, and everyone will start using it. Then job ads will start requiring "10 year of experience in X". Then X will replace the core orchestrator (K8s) with their own implementation. Then we'll start seeing comments like "X is a horribly complex, bloated platform which should have been just a boring orchestrator" on HN.
Swiss Army Buggy Whips for Everyone!
I get that it's not for everyone, I'd not recommend it for everyone. But once you start getting a pretty diverse ecosystem of services, k8s solves a lot of problems while being pretty cheap.
Storage is a mess, though, and something that really needs to be addressed. I typically recommend people wanting persistence to not use k8s.
I have actually come to wonder if this is actually an AWS problem, and not a Kubernetes problem. I mention this because the CSI controllers seem to behave sanely, but they are only as good as the requests being fulfilled by the IaaS control plane. I secretly suspect that EBS just wasn't designed for such a hot-swap world
Now, I posit this because I haven't had to run clusters in Azure nor GCP to know if my theory has legs
I guess the counter-experiment would be to forego the AWS storage layer and try Ceph or Longhorn but no company I've ever worked at wants to blaze trails about that, so they just build up institutional tribal knowledge about treating PVCs with kid gloves
Or is your assertion that Kubernetes should be its own storage provider and EBS can eat dirt?
Having experience with both the former and the latter (in big tech), and then going on to write my own controllers and deal with fabric and overlay networking problems, I'm not sure I agree.
In 2025 engineers need to deal with persistence, they need storage, they need high performance networking, they need HVM isolation and they need GPUs. When a philosophy starts to get in the way of solving real problems and your business falls behind, that philosophy will be left on the side of the road. IMHO it's destined to go the way as OpenStack when someone builds a simpler, cleaner abstraction, and it will take the egos of a lot of people with it when it does.
My life experience so far is that "simpler and cleaner" tends to be mostly achieved by ignoring the harder bits of actually dealing with the real world.
I used kubernete's (lack of) support for storage as an example elsewhere, it's the same sort of thing where you can look really clever and elegant if you just ignore the 10% of the problem space that's actually hard.
The problem is that k8s is both an orchestration system and a service provider.
Grid/batch/tractor/cube are all much, much simpler to run at scale. Moreover, they can support complex dependencies (but mapping storage is harder).
but k8s fucks around with DNS and networking, disables swap.
Making a simple deployment is fairly simple.
But if you want _any_ kind of CI/CD you need Flux, and for any kind of config management you need Helm.
Absurdly wrong on both counts.
I get that in the early days such a fast paced release/EOL schedule made sense. But now something that operates at such a low level shouldn’t require non-security upgrades every 3 months and have breaking API changes at least once a year.
“Yaml doesn’t enforce types but HCL does”
Is the same schema-based validation that is 1) possible client-side with HCL and 2) enforced server-side by k8s not also trivial to enforce client-side in an IDE?
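One existing way to do exactly that: editors running the YAML language server validate a document against a schema declared in a modeline comment (the schema URL below is illustrative, not a real endpoint):
# yaml-language-server: $schema=https://example.com/schemas/deployment.json
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app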
You can be declarative and still have imperative constructs hidden behind pure functions, let's say. That's how Ansible does it - it is super useful to define a custom filter that's called like "{{ myvariable | my_filter }}", but underneath there is a bunch of imperative Python data structure wrangling (without visible side effects, of course - just in-memory stuff), with arbitrary code. Doing the same in HCL is, I believe, impossible in the general case.
- it requires too much RAM to run on small machines (1GB RAM). I want to start small but not have to worry about scalability. docker swarm was nice in this regard.
- use KCL lang or CUE lang to manage templates
Kubernetes is one of the few reasons I need to care about Go.
I have seen my fair share of golang panics in the Kubernetes ecosystem, but I can't recall very many memory corruption issues that would warrant porting off of golang
In Go circles, even a language like Java is considered a PhD-skill-level programming language; it is sooo hard.
Pascal- and C-like enumerations require a kludge with iota and const; dependencies were a whole show between the community and Google; generics took their sweet time and the implementation has a few gotchas; package imports make direct references to git repos and require tricks with HTML meta headers to redirect requests; if err != nil boilerplate, ...
Had it not been for Docker and Kubernetes adoption success, it might have not taken off.
Take a look at https://github.com/kubernetes/enhancements/tree/master/keps/..., which is hopefully landing as alpha in Kubernetes 1.34. It lets you run a controller that issues certificates, and the certificates get automatically plumbed down into pod filesystems, and refresh is handled automatically.
Together with ClusterTrustBundles (KEP 3257), these are all the pieces that are needed for someone to put together a controller that distributes certificates and trust anchors to every pod in the cluster.
As someone who has the same experience you described with janky sidecars blowing up normal workloads, I'm violently anti service-mesh. But, cert expiry and subjectAltName management is already hard enough, and you would want that to happen for every pod? To say nothing of the TLS handshake for every connection?
Hard pass. One of the big downsides to a DSL is that it's linguistic rather than programmatic. It depends on a human learning a language and figuring out how to apply it correctly.
I have written a metric shit-ton of terraform in HCL. Yet even I struggle to contort my brain into the shape it needs to think of how the fuck I can get Terraform to do what I want with its limiting logic and data structures. I have become almost completely reliant on saved snippet examples, Stackoverflow, and now ChatGPT, just to figure out how to deploy the right resources with DRY configuration in a multi-dimensional datastructure.
YAML isn't a configuration format (it's a data encoding format) but it does a decent job at not being a DSL, which makes things way easier. Rather than learn a language, you simply fill out a data structure with attributes. Any human can easily follow documentation to do that without learning a language, and any program can generate or parse it easily. (Now, the specific configuration schema of K8s does suck balls, but that's not YAML's fault)
> I still remember not believing what I was seeing the first time I saw the Norway Problem
It's not a "Norway Problem". It's a PEBKAC problem. The "problem" is literally that the user did not read the YAML spec, so they did not know what they were doing, then did the wrong thing, and blamed YAML. It's wandering into the forest at night, tripping over a stump, and then blaming the stump. Read the docs. YAML is not crazy, it's a pretty simple data format.
> Helm is a perfect example of a temporary hack that has grown to be a permanent dependency
Nobody's permanently dependent on Helm. Plenty of huge-ass companies don't use it at all. This is where you proved you really don't know what you're talking about. (besides the fact that helm is a joy to use compared to straight YAML or HCL)
Also, unless something has radically changed recently, ECS == Amazon Managed Docker (one can see a lot of docker-compose inspired constructs: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/... ) which, as this thread is continuously debating, reasonable people can differ on whether that's a limitation of a simplification
So here are my wishes:
1. Deliver Kubernetes as a complete immutable OS image. Something like Talos. It should auto-update itself.
2. Take opinionated approach. Do not allow for multiple implementations of everything, especially as basic as networking. There should be hooks for integration with underlying cloud platform, of course.
3. The system must work reliably out of the box. For example, kubeadm clusters are not set up properly when it comes to memory limits. You can easily make a node unresponsive by eating memory in your pod.
4. Implement built-in monitoring. Built-in centralized logs. Built-in UI. Right now, a kubeadm cluster is not usable. You need to spend a lot of time installing Prometheus, Loki, Grafana, configuring dashboards, configuring every piece of software. Those are very different pieces of software from different vendors. It's a mess. It requires a lot of processing power and RAM to work. It should not be like that.
5. Implement user management, with usernames and passwords. Right now you need to set up keycloak, configure oauth authentication, complex realm configuration. It's a mess. It requires a lot of RAM to work. It should not be like that.
6. Remove certificates and keys. The cluster should just work, with no need to refresh anything. Join a node and it stays there.
So basically I want something like Linux, which just works. I don't need to set up Prometheus to look at my 15-min load average or CPU consumption. I don't need to set up Loki to look at logs; I have journald, which is good enough for most tasks. I don't need to install a CNI to connect to the network. I don't need to install Keycloak to create a user. It won't stop working because some internal certificate has expired. I also want lower resource consumption. Right now Kubernetes is very hungry. I need to dedicate like 2 GiB RAM to the master node, probably more. I don't even want to know about master nodes. A basic Linux system eats like 50 MiB RAM. I can dedicate another 50 MiB to Kubernetes; the rest is for me, please.
Right now it feels that Kubernetes was created to create more jobs. It's a very necessary system, but it could be so much better.
yes please, and then later...
> "Allow etcd swap-out"
I don't have any strong opinions about etcd, but man... can we please just have one solution, neatly packaged and easy to deploy.
When your documentation is just a list of abstract interfaces, conditions, or "please refer to your distribution," it's no wonder that nobody wants to maintain a cluster on-prem.
Yaml is simple. A ton of tools can parse and process it. I understand the author's gripes (long files, indentation, type confusions) but then I would even prefer JSON as an alternative.
Just use better tooling that helps you address your problems/gripes. Yaml is just fine.
A guy can dream anyway.
* kpt is still not 1.0
* both kustomize and kpt require complex setups to programmatically generate configs (even for simple things like replicas = replicas x 2)
Maybe all of those "implementation details" are what you meant by "config files redone" and then perhaps by the follow-on "simple matter of programming" to teach C how to have new behaviors that implement those redone config files
the other parts don't exist because that's really only useful for distributed systems, and systemd is not that right now