Lifecycle Best Practices
Support window
Refer to Supported tracks to choose the right track for your needs. Note that different tracks may have different Ubuntu bases or minimum Juju version requirements.
Certificate rotation
Use blackbox-exporter to monitor the validity of TLS certificates.
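As a rough illustration of what to do with the probe result: blackbox-exporter reports the earliest certificate expiry as a Unix timestamp in the `probe_ssl_earliest_cert_expiry` metric. The sketch below converts such a timestamp into days remaining. The function names and the 30-day default threshold are hypothetical choices for this example, not part of COS Lite.

```shell
# Sketch only: turn a certificate expiry timestamp into "days remaining".
# The equivalent PromQL expression would be:
#   (probe_ssl_earliest_cert_expiry - time()) / 86400

# days_until_expiry EXPIRY_EPOCH [NOW_EPOCH]
# Prints the whole number of days between now and the expiry time.
days_until_expiry() {
  expiry_epoch="$1"
  now_epoch="${2:-$(date +%s)}"
  echo $(( (expiry_epoch - now_epoch) / 86400 ))
}

# cert_expiring_soon EXPIRY_EPOCH [NOW_EPOCH] [THRESHOLD_DAYS]
# Succeeds (exit 0) when fewer than THRESHOLD_DAYS remain.
# The 30-day default is an arbitrary value chosen for illustration.
cert_expiring_soon() {
  remaining="$(days_until_expiry "$1" "${2:-$(date +%s)}")"
  [ "$remaining" -lt "${3:-30}" ]
}
```

In practice the same comparison would more likely live in an alert rule evaluated by Prometheus; the shell form is shown only to make the arithmetic explicit.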
Maintenance
Before restarting a Kubernetes node that hosts COS Lite applications, cordon and drain it so that the StatefulSet pods are rescheduled to another node. This keeps downtime to a minimum.
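A minimal sketch of the cordon-and-drain step, assuming `kubectl` is configured for the cluster. The node name is a placeholder, and the drain flags shown (`--ignore-daemonsets`, `--delete-emptydir-data`) are common choices that may not suit every cluster. Setting `DRY_RUN=1` prints the commands instead of running them.

```shell
# Sketch only: prepare a node for a planned restart.

# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# drain_node NODE
# Marks the node unschedulable, then evicts its pods.
drain_node() {
  node="$1"
  run kubectl cordon "$node"
  run kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data
}

# restore_node NODE
# Makes the node schedulable again once it is back up.
restore_node() {
  run kubectl uncordon "$1"
}
```

For example, `DRY_RUN=1 drain_node worker-1` shows the two commands that would run against a hypothetical node named `worker-1`; after the restart, `restore_node worker-1` lets pods be scheduled there again.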
If a node goes down unexpectedly and cannot be recovered, you can manually recover the COS Lite units by force-deleting their pods and any volume attachments that existed on the inaccessible node. The pods will then be rescheduled to a working node.
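The manual recovery could look roughly like the sketch below, assuming `kubectl` access to the cluster. The namespace, pod, and node names are placeholders. Volume attachments are matched to the lost node through the `spec.nodeName` field of each VolumeAttachment object. Setting `DRY_RUN=1` prints the delete commands instead of running them.

```shell
# Sketch only: recover pods stranded on a node that will not come back.

# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# force_delete_pod NAMESPACE POD
# Removes the pod object without waiting for the unreachable kubelet.
force_delete_pod() {
  namespace="$1"
  pod="$2"
  run kubectl delete pod "$pod" --namespace "$namespace" --force --grace-period=0
}

# list_node_volume_attachments NODE
# Prints the names of VolumeAttachment objects bound to NODE.
list_node_volume_attachments() {
  node="$1"
  kubectl get volumeattachments --no-headers \
    -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName |
    awk -v node="$node" '$2 == node { print $1 }'
}

# delete_volume_attachment NAME
delete_volume_attachment() {
  run kubectl delete volumeattachment "$1"
}
```

A typical sequence would be to call `force_delete_pod` for each stranded pod, then pass every name printed by `list_node_volume_attachments` to `delete_volume_attachment`. Reviewing the output with `DRY_RUN=1` first is advisable, since force deletion skips the usual graceful shutdown.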
Known issues
High availability during maintenance is only possible on clusters that use distributed storage, such as MicroCeph.
All COS Lite applications use StatefulSets, so their pods do not self-heal: they are not automatically redeployed to another node.
The Juju controller must be up for the pods to start. Otherwise, the charm container fails and the pod goes into a crash loop.
Upgrading
Remember to run juju refresh with --trust. If you omit it, you will need to run juju trust X --scope=cluster afterwards, where X is the application name.
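As a sketch of the upgrade step, the helper below refreshes each named application with `--trust`, and a second helper grants trust after the fact for an application that was refreshed without it. The application names in the usage note are examples only, and `DRY_RUN=1` prints the commands instead of running them.

```shell
# Sketch only: refresh applications with trust granted.

# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# refresh_with_trust APP...
# Refreshes every listed application, granting trust in the same step.
refresh_with_trust() {
  for app in "$@"; do
    run juju refresh "$app" --trust
  done
}

# grant_trust APP
# Fallback for an application that was refreshed without --trust.
grant_trust() {
  run juju trust "$1" --scope=cluster
}
```

For instance, `DRY_RUN=1 refresh_with_trust prometheus loki` prints one `juju refresh` command per application, which can be reviewed before running the upgrade for real.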