Prometheus metrics
This guide presents an overview of the Charmed Kubeflow (CKF) charms that provide Prometheus monitoring metrics.
All metrics can be accessed using the Prometheus or Grafana User Interface (UI). See Integrate with COS for more information.
Argo controller
See Argo controller upstream documentation for more information on provided metrics.
You can check its metrics through the Prometheus or Grafana UI using the following query:
{juju_charm="argo-controller"}
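The same selector can also be sent to the Prometheus HTTP API instead of being typed into the UI. The following is a minimal sketch, assuming a Prometheus instance reachable at a base URL; `PROMETHEUS_URL` and the helper names `charm_selector` and `instant_query_url` are hypothetical and not part of Charmed Kubeflow. It only builds the request URL for the instant query endpoint (`/api/v1/query`) and does not contact any server.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def charm_selector(charm: str) -> str:
    """Return the PromQL selector used in this guide for a given charm name."""
    return '{juju_charm="%s"}' % charm


def instant_query_url(base_url: str, promql: str) -> str:
    """Build a URL for the Prometheus instant query endpoint (/api/v1/query)."""
    return base_url.rstrip("/") + "/api/v1/query?" + urlencode({"query": promql})


# Hypothetical address: replace with the address of your own Prometheus instance.
PROMETHEUS_URL = "http://prometheus.example:9090"

url = instant_query_url(PROMETHEUS_URL, charm_selector("argo-controller"))
print(url)
```

The resulting URL can be opened with any HTTP client that has access to Prometheus. Swapping the charm name gives the equivalent request for any other charm in this guide.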
Dex Auth
The dex-auth charm provides:
A custom metric counting HTTP requests. See Dex Auth source code for more details.
Go runtime and process metrics for monitoring the controller.
gRPC server metrics.
You can check its metrics through the Prometheus or Grafana UI using the following query:
{juju_charm="dex-auth"}
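When a charm exposes several families of metrics, as dex-auth does, it can help to group the returned series by metric-name prefix. The sketch below assumes the JSON layout returned by the Prometheus instant query API (`data.result[].metric.__name__`). The `sample_payload` is a handwritten illustration, not real output from dex-auth, and the prefixes `go_`, `process_` and `grpc_` are common client-library conventions rather than something this guide guarantees. The helper name `group_metric_names` is hypothetical.

```python
from collections import defaultdict


def group_metric_names(payload: dict) -> dict:
    """Group distinct metric names in an instant-query response by their prefix."""
    groups = defaultdict(set)
    for sample in payload["data"]["result"]:
        name = sample["metric"].get("__name__", "")
        prefix = name.split("_", 1)[0]  # text before the first underscore
        groups[prefix].add(name)
    return {prefix: sorted(names) for prefix, names in groups.items()}


# Handwritten sample in the shape of a Prometheus instant-query response.
sample_payload = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "go_goroutines", "juju_charm": "dex-auth"},
             "value": [1700000000, "12"]},
            {"metric": {"__name__": "process_cpu_seconds_total", "juju_charm": "dex-auth"},
             "value": [1700000000, "3.5"]},
            {"metric": {"__name__": "grpc_server_handled_total", "juju_charm": "dex-auth"},
             "value": [1700000000, "7"]},
        ],
    },
}

groups = group_metric_names(sample_payload)
for prefix, names in sorted(groups.items()):
    print(prefix, names)
```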
Envoy
The envoy charm provides the following metrics:
You can check its metrics through the Prometheus or Grafana UI using the following query:
{juju_charm="envoy"}
Istio pilot
See Istio pilot upstream documentation for more information on provided metrics.
You can check its metrics through the Prometheus or Grafana UI using the following query:
{juju_charm="istio-pilot"}
Istio gateway
See Istio gateway upstream documentation for more information on provided metrics.
You can check its metrics through the Prometheus or Grafana UI using the following query:
{juju_charm="istio-gateway"}
Jupyter controller
The jupyter-controller provides the following metrics:
Custom notebook-related metrics. See Jupyter controller source code for more details.
Go runtime metrics for monitoring the controller.
Controller runtime metrics.
You can check its metrics through the Prometheus or Grafana UI using the following query:
{juju_charm="jupyter-controller"}
Katib controller
The katib-controller provides the following metrics:
Custom experiment-related metrics. See Katib controller source code for more details.
Go runtime metrics for monitoring the controller.
Controller runtime metrics.
You can check its metrics through the Prometheus or Grafana UI using the following query:
{juju_charm="katib-controller"}
Kfp api
The kfp-api provides the following metrics:
Custom metrics related to its several components. See its source code for more details.
Go runtime and process metrics for monitoring the controller.
You can check its metrics through the Prometheus or Grafana UI using the following query:
{juju_charm="kfp-api"}
Knative eventing
The knative-eventing metrics come from the knative-operator charm, which deploys otel-collector.
See Knative eventing upstream documentation for more details.
You can check its metrics through the Prometheus or Grafana UI using the following query:
{juju_charm="knative-operator", namespace_name="knative-eventing"}
Knative serving
The knative-serving metrics come from the knative-operator charm, which deploys otel-collector.
See Knative serving upstream documentation for more details.
You can check its metrics through the Prometheus or Grafana UI using the following query:
{juju_charm="knative-operator", namespace_name="knative-serving"}
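The Knative queries differ from the others in that they combine two label matchers: the charm and the namespace. As a minimal sketch, a selector with any number of equality matchers could be assembled as follows; the helper name `build_selector` is hypothetical, and the escaping covers only backslashes and double quotes in label values.

```python
def build_selector(labels: dict) -> str:
    """Build a PromQL selector from label names and exact-match values."""
    parts = []
    for name, value in labels.items():
        # Escape backslashes first, then double quotes, so the value stays
        # a valid double-quoted PromQL string.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        parts.append(f'{name}="{escaped}"')
    return "{" + ", ".join(parts) + "}"


serving = build_selector(
    {"juju_charm": "knative-operator", "namespace_name": "knative-serving"}
)
eventing = build_selector(
    {"juju_charm": "knative-operator", "namespace_name": "knative-eventing"}
)
print(serving)
print(eventing)
```

Dropping the `namespace_name` entry yields the broader selector that matches every series coming from the knative-operator charm.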
Knative operator
See Knative operator upstream documentation for more information on provided metrics.
You can check its metrics through the Prometheus or Grafana UI using the following query:
{juju_charm="knative-operator"}
Metacontroller operator
The metacontroller-operator provides the following metrics:
Custom metrics. See Metacontroller source code for more details.
Go runtime and process metrics for monitoring the controller.
Controller runtime metrics.
You can check its metrics through the Prometheus or Grafana UI using the following query:
{juju_charm="metacontroller-operator"}
Minio
See Minio upstream documentation for more information on provided metrics.
You can check its metrics through the Prometheus or Grafana UI using the following query:
{juju_charm="minio"}
Seldon controller manager
See Seldon controller manager upstream documentation for more information on provided metrics.
You can check its metrics through the Prometheus or Grafana UI using the following query:
{juju_charm="seldon-controller-manager"}
Training operator
The training-operator provides the following metrics:
Custom job-related metrics. See Training operator source code for more details.
Go runtime and process metrics for monitoring the controller.
Controller runtime metrics.
You can check its metrics through the Prometheus or Grafana UI using the following query:
{juju_charm="training-operator"}
Pvcviewer operator
The pvcviewer-operator provides the following metrics:
Go runtime and process metrics for monitoring the controller.
Controller runtime metrics.
You can check its metrics through the Prometheus or Grafana UI using the following query:
{juju_charm="pvcviewer-operator"}
Kserve controller
The kserve-controller provides the following metrics:
Go runtime and process metrics for monitoring the controller.
Controller runtime metrics.
You can check its metrics through the Prometheus or Grafana UI using the following query:
{juju_charm="kserve-controller"}
Kubeflow profiles
The kubeflow-profiles charm manages two Pebble services: profile-controller and kfam.
Profile controller
The profile-controller provides the following metrics:
Custom job-related metrics. See Profile controller source code for more details.
Go runtime and process metrics for monitoring the controller.
You can check its metrics through the Prometheus or Grafana UI using the following query:
{juju_charm="kubeflow-profiles"}
Kfam
The kfam service provides the following metrics:
Custom job-related metrics. See Kfam source code for more details.
Go runtime and process metrics for monitoring the controller.
You can check its metrics through the Prometheus or Grafana UI using the following query:
{juju_charm="kubeflow-profiles"}
Tensorboard controller
The tensorboard-controller provides the following metrics:
Go runtime and process metrics for monitoring the controller.
Controller runtime metrics.
You can check its metrics through the Prometheus or Grafana UI using the following query:
{juju_charm="tensorboard-controller"}