The basic premise is to encode behavior, whether via lifecycle rules or a cron job, such that instances are cycled after at most 7 days, while ideally there is always some instance cycling (with a cooldown period, of course).
It has never failed to improve overall system stability, and in a few cases it even decreased costs significantly.
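The cycling policy described above can be sketched as a small scheduling function. This is a minimal illustration, not the commenter's actual setup: the parameter values, function name, and instance-id scheme are all made up for the example.

```python
from datetime import datetime, timedelta

# Hypothetical parameters, not from the comment: tune to taste.
MAX_AGE = timedelta(days=7)
COOLDOWN = timedelta(hours=1)

def next_instance_to_cycle(instances, last_cycle_time, now):
    """Return the id of the instance to replace next, or None.

    `instances` maps instance id -> launch time. We always cycle the
    oldest instance once it exceeds MAX_AGE, but never more often than
    once per COOLDOWN, so the fleet is replaced gradually rather than
    all at once.
    """
    if now - last_cycle_time < COOLDOWN:
        return None  # still cooling down from the previous replacement
    oldest_id = min(instances, key=instances.get)
    if now - instances[oldest_id] >= MAX_AGE:
        return oldest_id
    return None

# Example: three instances, one well past the 7-day limit.
now = datetime(2024, 1, 10)
fleet = {
    "i-aaa": datetime(2024, 1, 1),  # 9 days old -> due for cycling
    "i-bbb": datetime(2024, 1, 8),
    "i-ccc": datetime(2024, 1, 9),
}
print(next_instance_to_cycle(fleet, last_cycle_time=datetime(2024, 1, 9, 22), now=now))
```

In a real deployment this logic would live in a lifecycle rule (e.g. an autoscaling group's max instance lifetime) rather than hand-rolled code; the sketch just shows the two constraints interacting: a hard age ceiling and a cooldown between replacements.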
Pets vs. Cattle seems much clearer: cattle are there to be culled. You feed them and look after them, but you don't get attached. If the herd has a weak member, you kill it.
I'd be a heartless farmer, but that analogy radically improved my infrastructure.
I remember when I had my first desktop PC at home (Windows 95) and it would need a fresh install of Windows every so often as things went off the rails.
Ten years ago, I think, the rule of thumb was an uptime of no greater than 6 months, but for different reasons. (Windows Server...)
On Solaris, Linux, the BSDs, etc., it's only necessary for maintenance. Literally. I think my longest-uptime production system was a SPARC Postgres box under sustained high load, with an uptime of around 6 years.
With cloud infra, people have forgotten just how stable the Unixen are.
Windows XP largely made that irrelevant, and Windows 7 made it almost completely irrelevant.
Manage these things, and any stateful distributed service can run easily in Kubernetes.
When I update the Kubernetes or Talos version, new nodes are created, and after the existing pods are rescheduled onto the new nodes, the old nodes are deleted.
Works pretty well.
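The create-then-delete ordering in that upgrade flow is the important part: a replacement node comes up and absorbs the pods before the old node goes away, so capacity never dips. A toy simulation of that sequence, with an invented naming scheme and data model (none of this is Talos or Kubernetes API code):

```python
def rolling_replace(nodes, pods_by_node, new_version):
    """Replace every node with a fresh one running `new_version`, one at a time.

    `nodes` maps node name -> version; `pods_by_node` maps node name -> list
    of pod names. For each old node we first bring up a replacement, move its
    pods over, and only then delete the old node, so no pod is ever left
    without a node to run on.
    """
    for i, old in enumerate(list(nodes)):   # snapshot: don't iterate new nodes
        new = f"node-v{new_version}-{i}"    # hypothetical naming scheme
        nodes[new] = new_version            # 1. create the replacement node
        pods_by_node[new] = pods_by_node.pop(old)  # 2. reschedule the pods
        del nodes[old]                      # 3. delete the drained old node
    return nodes, pods_by_node

# Example: upgrade a two-node cluster from 1.28 to 1.29.
nodes, pods = rolling_replace(
    {"worker-a": "1.28", "worker-b": "1.28"},
    {"worker-a": ["db-0"], "worker-b": ["db-1"]},
    "1.29",
)
```

In the real system the "reschedule" step is, of course, Kubernetes draining the node and the scheduler placing the pods; the sketch only shows why the old node can be treated as disposable cattle.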