For a critical control plane component like this, I tend to prefer a constant work pattern [0], to avoid metastable failures [1], e.g. periodically pull the data instead of relying on notifications.
[0] https://aws.amazon.com/builders-library/reliability-and-cons...
What are some use cases that you found are useful?
I’ve seen them used for traffic routing, storage system metadata systems, distributed cache etc
Some of the key examples highlighted on our blog are Unity Catalog, which is essentially the metadata layer for Databricks, our Query Orchestration Engine, and our distributed remote cache. See the blog post for more!