Key rotation¶
A Charmed HPC deployment uses two keys: the internal authentication key, to ensure communications “received by Slurm are coming from trusted sources”, and a standalone JWT key used for API token generation.
Both keys are generated by the slurmctld charm during installation and are securely
distributed to the related Slurm charms using Juju Secrets. The authentication key is distributed to the sackd, slurmd,
slurmdbd, and slurmrestd charms, while the JWT key is distributed only to slurmdbd.
The slurmctld application is the owner of the key charm secrets while the related Slurm charms are
secret observers.
It is good security practice to replace keys regularly or in response to security incidents to ensure malicious users cannot use a leaked key to exert control over the cluster.
Authentication key rotation process¶
The Rotate the Slurm authentication key how-to provides the steps to initiate the key rotation process, using the rotate-auth-key action. The action triggers a cluster-wide rotation of the authentication key. Once the process is complete, a new key is in
place and the previous key can no longer be used to authenticate Slurm communication.
The full rotation process is as follows:
1. A Charmed HPC cluster administrator runs:
   juju run slurmctld/leader rotate-auth-key
2. The slurmctld leader generates a new cryptographically secure key and adds it to its slurm.jwks file alongside the old key. To ensure this key has a unique identifier, its key ID, or kid, value is set to a newly generated Universally Unique Identifier (UUID).
3. With the new key in place, the slurmctld leader performs an scontrol reconfigure to reload configuration across the cluster. At this point, both the new and old keys are valid and can be used for authentication.
4. The slurmctld leader updates the Juju secret authentication key to a new revision with the new key.
5. The revision update triggers a SecretChangedEvent in each unit of the related Slurm charms (the secret observers).
6. In the SecretChangedEvent, units read the new revision of the Juju secret, then replace their local key file with the new key and reload their service configuration. This ensures the new key is in use and the old key is no longer valid on that unit.
7. Once all units of related Slurm charms have updated to the new key, the slurmctld leader receives a SecretRemoveEvent.
8. In the SecretRemoveEvent, the slurmctld leader removes the old key from the slurm.jwks file and performs another scontrol reconfigure to ensure the old key is no longer valid in the cluster and only the new key is trusted for future authentication.
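The overlap window in this process, where the old and new keys are both trusted until every observer has switched, can be sketched in a few lines of Python. This is only an illustration, not the charm's implementation: the helper names (new_key, add_key, remove_key) are hypothetical, and the entry layout follows the general JSON Web Key conventions for symmetric keys, so the exact fields the charm writes to slurm.jwks may differ.

```python
import base64
import json
import secrets
import uuid


def new_key() -> dict:
    """Return a fresh symmetric key entry identified by a UUID key ID."""
    raw = secrets.token_bytes(32)  # 256 bits from the OS CSPRNG
    return {
        "alg": "HS256",
        "kty": "oct",
        "kid": str(uuid.uuid4()),  # unique identifier for this key
        "k": base64.urlsafe_b64encode(raw).decode("ascii").rstrip("="),
    }


def add_key(jwks: dict, key: dict) -> dict:
    """Return a key set with `key` appended; existing keys stay trusted."""
    return {"keys": [*jwks["keys"], key]}


def remove_key(jwks: dict, kid: str) -> dict:
    """Return a key set without the entry whose key ID is `kid`."""
    return {"keys": [k for k in jwks["keys"] if k["kid"] != kid]}


# Key set before rotation: only the old key is present.
old = new_key()
jwks = {"keys": [old]}

# Rotation starts: the new key is added alongside the old one.
new = new_key()
overlap = add_key(jwks, new)

# Rotation ends: once no observer needs the old key, it is dropped.
final = remove_key(overlap, old["kid"])

print(json.dumps([k["kid"] for k in final["keys"]], indent=2))
```

Keeping both keys in the set during the overlap means units that have not yet processed the secret update can still authenticate, which is why the old key is removed only at the very end.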
JWT key rotation process¶
The Rotate the JWT authentication key how-to provides the steps
to initiate the key rotation process, using the rotate-jwt-key action. The action triggers a
rotation of the JWT key on both the slurmctld controller and slurmdbd database units. Once the
process is complete, a new key is in place and the previous key can no longer be used to generate
valid tokens.
The full rotation process is as follows:
1. A Charmed HPC cluster administrator runs:
   juju run slurmctld/leader rotate-jwt-key
2. The slurmctld leader generates a new cryptographically secure key and replaces its existing jwt_hs256.key.
3. With the new key in place, the slurmctld leader performs an scontrol reconfigure to reload configuration across the cluster. slurmdbd is now using a stale key and its API endpoints are inaccessible.
4. The slurmctld leader updates the Juju secret JWT key to a new revision with the new key.
5. The revision update triggers a SecretChangedEvent in the slurmdbd unit.
6. In the SecretChangedEvent, the slurmdbd unit reads the new revision of the Juju secret, then replaces its local key file with the new key and reloads its service configuration. This ensures the new key is in use and the old key is no longer valid. The API endpoints for slurmdbd are now accessible again.
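Unlike the authentication key, the JWT key is replaced outright, so there is a short period in which slurmctld and slurmdbd hold different keys. The following Python sketch shows, under simplifying assumptions, why a token signed with one HS256 key is rejected by a verifier that still holds another. The sign and verify helpers and the claim used here are hypothetical and written only for this illustration; they are not how Slurm or the charms implement token handling.

```python
import base64
import hashlib
import hmac
import json
import secrets


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign(claims: dict, key: bytes) -> str:
    """Build a compact HS256 token from `claims` using `key`."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    sig = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(sig)}"


def verify(token: str, key: bytes) -> bool:
    """Check only the HMAC signature of `token` against `key`."""
    signing_input, _, sig = token.rpartition(".")
    expected = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return hmac.compare_digest(b64url(expected), sig)


old_key = secrets.token_bytes(32)  # key still held by the stale verifier
new_key = secrets.token_bytes(32)  # key generated during rotation

# A token issued after rotation is signed with the new key.
token = sign({"sub": "example-user"}, new_key)

print(verify(token, new_key))  # verifier that has picked up the new key
print(verify(token, old_key))  # verifier still holding the stale key
```

Once the verifier replaces its local key with the new one, the same token validates again, which mirrors the point in the process where the slurmdbd endpoints become accessible.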