Grafana dashboards
This page gives an overview of the charms used in Charmed HPC that provide dashboards for Grafana. Grafana acts as a web interface for visualizing data from aggregators such as Prometheus or Loki.
See Integrate with Canonical Observability Stack for more information.
Panel query
Any panel can be inspected using the panel inspect view to see the exact query used to provide the panel with data.
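Once a query has been copied from the panel inspect view, it can also be run outside Grafana. The following is a minimal sketch, assuming the panel is backed by Prometheus and that the Prometheus HTTP API is reachable at a base URL you supply. The helper names, the example base URL, and the `up` query are illustrative and are not taken from any Charmed HPC dashboard. Panels backed by Loki use a different query language and endpoint, which this sketch does not cover.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_instant_query_url(base_url: str, promql: str) -> str:
    """Build the URL for a Prometheus instant query (GET /api/v1/query)."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def extract_samples(payload: dict) -> list:
    """Convert an instant-vector response into (labels, value) pairs."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected an instant vector, got {data['resultType']}")
    # Each result holds a label set and a [timestamp, "value"] pair.
    return [(item["metric"], float(item["value"][1])) for item in data["result"]]


def run_query(base_url: str, promql: str, timeout: float = 10.0) -> list:
    """Run the query against a live Prometheus server (needs network access)."""
    with urlopen(build_instant_query_url(base_url, promql), timeout=timeout) as resp:
        return extract_samples(json.load(resp))


if __name__ == "__main__":
    # Hypothetical address; replace with the address of your Prometheus instance.
    print(build_instant_query_url("http://prometheus.example:9090", "up"))
```

Calling `run_query` with the query text copied from a panel returns one `(labels, value)` pair per time series, which can help confirm that a panel shows the data you expect.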
Slurmctld
The dashboards from the slurmctld charm display information about the entire cluster, each partition, and each node.
Cluster Overview
The “Cluster Overview” dashboard provides a display of cluster-level metrics such as:
Total resource utilization
Job status distribution
Node state distribution
Scheduler metrics

Partition Overview
The “Partition Overview” dashboard provides a display of partition-level metrics such as:
Total nodes and jobs in the partition
Total resource utilization for the partition
Job status distribution for jobs in the partition
Node state distribution for all nodes in the partition

Node Overview
The “Node Overview” dashboard provides a display of node-level metrics such as:
Available resources that are allocatable for jobs
Total resource utilization on the node

MySQL
The dashboard from the mysql charm displays metrics for the storage database of Slurmdbd:
Uptime
Queries per second
Current cache size
Maximum number of concurrent connections
Thread resource usage
Network traffic statistics

Traefik K8s
The dashboard from the traefik-k8s charm displays metrics about the reverse proxy used for communication between the compute plane cluster and the monitoring/identity k8s clusters. This includes:
Uptime
Response times
HTTP response code statistics
Open connection statistics
Raw logs for every proxied endpoint
