How to instrument machine charms¶
This guide shows you how to integrate a charm deployed on a machine substrate with the Canonical Observability Stack running on Kubernetes.
The OpenTelemetry Collector machine charm uses Juju to handle installation, configuration, and Day 2 operations for the OpenTelemetry Collector. The charm is designed to run in virtual machines as a subordinate.
This guide uses COS Lite as the example, but the same steps apply to COS (HA) deployments.
Prerequisites¶
A charmed application that is running in a virtual (or physical) machine.
The Canonical Observability Stack, running on Kubernetes.
Note
Application units typically run in an isolated container on a machine, with no knowledge of or access to other applications deployed on the same machine.
When you relate a subordinate charm to a principal one, the subordinate will be deployed on the same machine on which the principal is running.
Subordinate units scale together with their principal.
Ensure COS Lite is up and running¶
Confirm that the Observability Stack is up and running in its model (cos-lite in this example) on a Kubernetes controller (see getting started with COS Lite):
juju status --relations
Your output should show all applications active:
Model     Controller  Cloud/Region  Version  SLA          Timestamp
cos-lite  ck8s        k8s           3.6.19   unsupported  18:07:14-03:00

App           Version  Status  Scale  Charm             Channel        Rev  Address         Exposed  Message
alertmanager  0.28.0   active      1  alertmanager-k8s  2/stable       191  10.152.183.36   no
catalogue              active      1  catalogue-k8s     2/stable       113  10.152.183.195  no
grafana       12.0.2   active      1  grafana-k8s       2/stable       180  10.152.183.90   no
loki          2.9.15   active      1  loki-k8s          2/stable       217  10.152.183.57   no
prometheus    2.53.3   active      1  prometheus-k8s    2/stable       287  10.152.183.172  no
traefik       2.11.0   active      1  traefik-k8s       latest/stable  281  10.152.183.221  no       Serving at http://192.168.1.200

Unit             Workload  Agent  Address     Ports  Message
alertmanager/0*  active    idle   10.1.0.87
catalogue/0*     active    idle   10.1.0.33
grafana/0*       active    idle   10.1.0.214
loki/0*          active    idle   10.1.0.178
prometheus/0*    active    idle   10.1.0.90
traefik/0*       active    idle   10.1.0.35          Serving at http://192.168.1.200

Offer                            Application   Charm             Rev  Connected  Endpoint              Interface                Role
alertmanager-karma-dashboard     alertmanager  alertmanager-k8s  191  0/0        karma-dashboard       karma_dashboard          provider
grafana-dashboards               grafana       grafana-k8s       180  2/2        grafana-dashboard     grafana_dashboard        requirer
loki-logging                     loki          loki-k8s          217  2/2        logging               loki_push_api            provider
prometheus-metrics-endpoint      prometheus    prometheus-k8s    287  0/0        metrics-endpoint      prometheus_scrape        requirer
prometheus-receive-remote-write  prometheus    prometheus-k8s    287  2/2        receive-remote-write  prometheus_remote_write  provider

Integration provider                Requirer                     Interface              Type     Message
alertmanager:alerting               loki:alertmanager            alertmanager_dispatch  regular
alertmanager:alerting               prometheus:alertmanager      alertmanager_dispatch  regular
alertmanager:grafana-dashboard      grafana:grafana-dashboard    grafana_dashboard      regular
alertmanager:grafana-source         grafana:grafana-source       grafana_datasource     regular
alertmanager:replicas               alertmanager:replicas        alertmanager_replica   peer
alertmanager:self-metrics-endpoint  prometheus:metrics-endpoint  prometheus_scrape      regular
catalogue:catalogue                 alertmanager:catalogue       catalogue              regular
catalogue:catalogue                 grafana:catalogue            catalogue              regular
catalogue:catalogue                 prometheus:catalogue         catalogue              regular
catalogue:replicas                  catalogue:replicas           catalogue_replica      peer
grafana:grafana                     grafana:grafana              grafana_peers          peer
grafana:metrics-endpoint            prometheus:metrics-endpoint  prometheus_scrape      regular
grafana:replicas                    grafana:replicas             grafana_replicas       peer
loki:grafana-dashboard              grafana:grafana-dashboard    grafana_dashboard      regular
loki:grafana-source                 grafana:grafana-source       grafana_datasource     regular
loki:metrics-endpoint               prometheus:metrics-endpoint  prometheus_scrape      regular
loki:replicas                       loki:replicas                loki_replica           peer
prometheus:grafana-dashboard        grafana:grafana-dashboard    grafana_dashboard      regular
prometheus:grafana-source           grafana:grafana-source       grafana_datasource     regular
prometheus:prometheus-peers         prometheus:prometheus-peers  prometheus_peers       peer
traefik:ingress                     alertmanager:ingress         ingress                regular
traefik:ingress                     catalogue:ingress            ingress                regular
traefik:ingress-per-unit            loki:ingress                 ingress_per_unit       regular
traefik:ingress-per-unit            prometheus:ingress           ingress_per_unit       regular
traefik:metrics-endpoint            prometheus:metrics-endpoint  prometheus_scrape      regular
traefik:peers                       traefik:peers                traefik_peers          peer
traefik:traefik-route               grafana:ingress              traefik_route          regular
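The "all applications active" check can also be scripted. The following is a minimal sketch, assuming the output of `juju status --format=json` exposes each application's state under `applications.<name>.application-status.current`; the helper name `non_active_apps` and the sample data are made up for illustration.

```python
import json


def non_active_apps(status: dict) -> dict:
    """Return {application: status} for every application that is not active."""
    problems = {}
    for name, app in status.get("applications", {}).items():
        current = app.get("application-status", {}).get("current", "unknown")
        if current != "active":
            problems[name] = current
    return problems


# Hand-written sample in the assumed JSON shape (not real output).
sample = json.loads("""
{
  "applications": {
    "grafana": {"application-status": {"current": "active"}},
    "loki": {"application-status": {"current": "active"}},
    "prometheus": {"application-status": {"current": "waiting"}}
  }
}
""")

print(non_active_apps(sample))  # {'prometheus': 'waiting'}
```

An empty result means every application reported as active. To use it against a live model, the status JSON could be read from standard input with `json.load(sys.stdin)`.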
Add the required integrations to the charm¶
This example uses Zookeeper as the machine charm to integrate with COS Lite.
Obtain the cos_agent library¶
Execute the following command to have Charmcraft fetch the required library from Charmhub.
charmcraft fetch-lib charms.grafana_agent.v0.cos_agent
Add the needed provider¶
In the metadata.yaml of the Zookeeper charm, add the cos-agent relation to the provides section.
[...]
 provides:
   zookeeper:
     interface: zookeeper
+  cos-agent:
+    interface: cos_agent
+    limit: 1
[...]
Integrate the library in the charm code¶
In src/charm.py, import the library.
from charms.grafana_agent.v0.cos_agent import COSAgentProvider
Instantiate the COSAgentProvider object in the charm’s __init__ method.
# ...
self._grafana_agent = COSAgentProvider(
    self,
    metrics_endpoints=[
        {"path": "/metrics", "port": NODE_EXPORTER_PORT},
        {"path": "/metrics", "port": JMX_PORT},
        {"path": "/metrics", "port": METRICS_PROVIDER_PORT},
    ],
    metrics_rules_dir="./src/alert_rules/prometheus",
    logs_rules_dir="./src/alert_rules/loki",
    dashboard_dirs=["./src/grafana_dashboards"],
    log_slots=["charmed-zookeeper:logs"],
)
# ...
As part of this constructor call, you may change the paths where metrics alert rules, log alert rules, and Grafana dashboard files are stored.
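Because these paths are plain strings, a typo is easy to miss. The following is a minimal sketch of a pre-pack check that reports which of the directories passed to the constructor do not exist in the charm tree. The helper `missing_cos_dirs` is hypothetical and not part of the cos_agent library, and how the library itself reacts to a missing directory is not covered here.

```python
import tempfile
from pathlib import Path

# Directories as passed to COSAgentProvider in the snippet above.
COS_DIRS = (
    "src/alert_rules/prometheus",
    "src/alert_rules/loki",
    "src/grafana_dashboards",
)


def missing_cos_dirs(charm_root, dirs=COS_DIRS):
    """Return the entries of `dirs` that are not directories under `charm_root`."""
    root = Path(charm_root)
    return [d for d in dirs if not (root / d).is_dir()]


# Demo against a throwaway tree that lacks the dashboards directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "src/alert_rules/prometheus").mkdir(parents=True)
    (Path(tmp) / "src/alert_rules/loki").mkdir(parents=True)
    demo_missing = missing_cos_dirs(tmp)

print(demo_missing)  # ['src/grafana_dashboards']
```

If you change the paths in the constructor call, update `COS_DIRS` to match.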
Note
To learn how to craft alert rules and dashboards, check these examples.
Pack the charm¶
Pack the charm using Charmcraft:
charmcraft pack
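The refresh step later in this guide uses the glob `./*.charm`, which assumes a single packed charm file in the working directory. The following is a minimal sketch of a guard that confirms this before refreshing; the helper name and the demo file name are made up for illustration.

```python
import tempfile
from pathlib import Path


def find_single_charm(directory="."):
    """Return the only *.charm file in `directory`, or raise if there is not exactly one."""
    charms = sorted(Path(directory).glob("*.charm"))
    if len(charms) != 1:
        names = [c.name for c in charms]
        raise RuntimeError(
            f"expected exactly one .charm file, found {len(charms)}: {names}"
        )
    return charms[0]


# Demo in a throwaway directory; the file name is hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "zookeeper_example.charm").touch()
    demo_name = find_single_charm(tmp).name

print(demo_name)  # zookeeper_example.charm
```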
Refresh the Zookeeper charm¶
Switch to the machine model and refresh the Zookeeper charm with the newly built charm file:
juju switch lxd:admin/zoo # or wherever your zookeeper charm is deployed
juju refresh zookeeper --path ./*.charm
Juju will do an in-place upgrade of the charm, adding the cos-agent relation. To check Zookeeper’s status:
juju status zoo
The status for Zookeeper should be active:
Model  Controller  Cloud/Region         Version  SLA          Timestamp
zoo    lxd         localhost/localhost  3.6.19   unsupported  18:24:39-03:00

App        Version  Status  Scale  Charm      Channel   Rev  Exposed  Message
zookeeper  3.9.2    active      1  zookeeper  3/stable  163  no

Unit          Workload  Agent  Machine  Public address  Ports  Message
zookeeper/0*  active    idle   1        10.72.158.122

Machine  State    Address        Inst id        Base          AZ            Message
1        started  10.72.158.122  juju-50c528-1  ubuntu@22.04  charm-dev-36  Running
Deploy the OpenTelemetry Collector machine charm¶
Deploy the OpenTelemetry Collector machine charm:
juju deploy opentelemetry-collector otelcol --channel 2/stable --base=ubuntu@22.04
Check the status to verify the deployment:
juju status
The otelcol charm should now be listed:
Model  Controller  Cloud/Region         Version  SLA          Timestamp
zoo    lxd         localhost/localhost  3.6.19   unsupported  18:25:34-03:00

App        Version  Status   Scale  Charm                    Channel   Rev  Exposed  Message
otelcol             unknown      0  opentelemetry-collector  2/stable  248  no
zookeeper  3.9.2    active       1  zookeeper                3/stable  163  no

Unit          Workload  Agent  Machine  Public address  Ports  Message
zookeeper/0*  active    idle   1        10.72.158.122

Machine  State    Address        Inst id        Base          AZ            Message
1        started  10.72.158.122  juju-50c528-1  ubuntu@22.04  charm-dev-36  Running
At this point, there is one zookeeper unit in active state, and the otelcol application is in unknown state with no units. This is expected: opentelemetry-collector is a subordinate charm, so it gets units only once it is related to a principal application.
Integrate the charms¶
Integrate zookeeper with opentelemetry-collector over the cos-agent relation:
juju integrate zookeeper otelcol:cos-agent
Once the relation is established, otelcol is deployed alongside the zookeeper unit, on the same machine. To check the model status:
juju status
Your output should show the otelcol unit deployed alongside zookeeper, in a blocked state:
Model  Controller  Cloud/Region         Version  SLA          Timestamp
zoo    lxd         localhost/localhost  3.6.19   unsupported  18:28:50-03:00

App        Version  Status   Scale  Charm                    Channel   Rev  Exposed  Message
otelcol    0.130.0  blocked      1  opentelemetry-collector  2/stable  248  no       ['cloud-config']|['grafana-dashboards-provider']|['send-loki-logs']|['send-remote-write'] for cos-agent
zookeeper  3.9.2    active       1  zookeeper                3/stable  163  no

Unit          Workload  Agent  Machine  Public address  Ports  Message
zookeeper/0*  active    idle   1        10.72.158.122
  otelcol/1*  blocked   idle            10.72.158.122          ['cloud-config']|['grafana-dashboards-provider']|['send-loki-logs']|['send-remote-write'] for cos-agent

Machine  State    Address        Inst id        Base          AZ            Message
1        started  10.72.158.122  juju-50c528-1  ubuntu@22.04  charm-dev-36  Running
Although otelcol is deployed and collecting telemetry, it is not forwarding that telemetry anywhere yet, because it is not related to the corresponding components in the Observability Stack.
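The blocked message names the otelcol endpoints that are still unrelated. The following is a minimal sketch that extracts those endpoint names from the message text. It assumes the message keeps the `['name']|['name'] for cos-agent` shape shown in the status output; the reading of `|` as "any of these" is an interpretation, not something the status output states.

```python
import re


def missing_relations(message: str) -> list[str]:
    """Extract endpoint names from a message like "['a']|['b'] for cos-agent"."""
    return re.findall(r"\['([^']+)'\]", message)


# Status message copied from the juju status output in this guide.
msg = (
    "['cloud-config']|['grafana-dashboards-provider']|"
    "['send-loki-logs']|['send-remote-write'] for cos-agent"
)

print(missing_relations(msg))
# ['cloud-config', 'grafana-dashboards-provider', 'send-loki-logs', 'send-remote-write']
```

Three of these endpoints (send-remote-write, send-loki-logs, and grafana-dashboards-provider) are the ones related to COS Lite in the next section.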
Relate OpenTelemetry Collector to COS Lite¶
Relate OpenTelemetry Collector to the following COS Lite components:
Prometheus for the metrics,
Loki for the logs, and
Grafana for the dashboards.
From the application model, verify the offers COS Lite is exposing:
juju find-offers -m ck8s:admin/cos-lite
The output lists the available offers:
Store  URL                                             Access  Interfaces
ck8s   admin/cos-lite.prometheus-metrics-endpoint      admin   prometheus_scrape:metrics-endpoint
ck8s   admin/cos-lite.prometheus-receive-remote-write  admin   prometheus_remote_write:receive-remote-write
ck8s   admin/cos-lite.alertmanager-karma-dashboard     admin   karma_dashboard:karma-dashboard
ck8s   admin/cos-lite.grafana-dashboards               admin   grafana_dashboard:grafana-dashboard
ck8s   admin/cos-lite.loki-logging                     admin   loki_push_api:logging
Consume the offers:
juju consume ck8s:admin/cos-lite.prometheus-receive-remote-write
juju consume ck8s:admin/cos-lite.loki-logging
juju consume ck8s:admin/cos-lite.grafana-dashboards
Verify the model status:
juju status
The model status now shows a SAAS section:
Model  Controller  Cloud/Region         Version  SLA          Timestamp
zoo    lxd         localhost/localhost  3.6.19   unsupported  18:31:41-03:00

SAAS                             Status  Store  URL
grafana-dashboards               active  ck8s   admin/cos-lite.grafana-dashboards
loki-logging                     active  ck8s   admin/cos-lite.loki-logging
prometheus-receive-remote-write  active  ck8s   admin/cos-lite.prometheus-receive-remote-write

App        Version  Status   Scale  Charm                    Channel   Rev  Exposed  Message
otelcol    0.130.0  blocked      1  opentelemetry-collector  2/stable  248  no       ['cloud-config']|['grafana-dashboards-provider']|['send-loki-logs']|['send-remote-write'] for cos-agent
zookeeper  3.9.2    active       1  zookeeper                3/stable  163  no

Unit          Workload  Agent  Machine  Public address  Ports  Message
zookeeper/0*  active    idle   1        10.72.158.122
  otelcol/1*  blocked   idle            10.72.158.122          ['cloud-config']|['grafana-dashboards-provider']|['send-loki-logs']|['send-remote-write'] for cos-agent

Machine  State    Address        Inst id        Base          AZ            Message
1        started  10.72.158.122  juju-50c528-1  ubuntu@22.04  charm-dev-36  Running
The SAAS section lists the offers consumed from applications running in other models. Relate OpenTelemetry Collector to these three applications:
juju integrate otelcol prometheus-receive-remote-write
juju integrate otelcol loki-logging
juju integrate otelcol grafana-dashboards
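The consume and integrate commands follow one pattern per offer, so they can be generated rather than typed. The following is a minimal sketch that builds the same six command lines from the offer names. The function name `cmr_commands` is made up, and the defaults simply mirror the controller, model, and application names used in this guide.

```python
OFFERS = ["prometheus-receive-remote-write", "loki-logging", "grafana-dashboards"]


def cmr_commands(offers, controller="ck8s", model="admin/cos-lite", app="otelcol"):
    """Build `juju consume` and `juju integrate` argument lists for each offer."""
    commands = []
    for offer in offers:
        commands.append(["juju", "consume", f"{controller}:{model}.{offer}"])
    for offer in offers:
        commands.append(["juju", "integrate", app, offer])
    return commands


for cmd in cmr_commands(OFFERS):
    print(" ".join(cmd))
```

Each entry is an argument list, so it could be passed to `subprocess.run(cmd, check=True)` instead of being printed.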
Verify the three new integrations in the model status:
juju status --relations
All three should appear in the Integration provider section:
Model  Controller  Cloud/Region         Version  SLA          Timestamp
zoo    lxd         localhost/localhost  3.6.19   unsupported  18:33:53-03:00

SAAS                             Status  Store  URL
grafana-dashboards               active  ck8s   admin/cos-lite.grafana-dashboards
loki-logging                     active  ck8s   admin/cos-lite.loki-logging
prometheus-receive-remote-write  active  ck8s   admin/cos-lite.prometheus-receive-remote-write

App        Version  Status  Scale  Charm                    Channel   Rev  Exposed  Message
otelcol    0.130.0  active      1  opentelemetry-collector  2/stable  248  no
zookeeper  3.9.2    active      1  zookeeper                3/stable  163  no

Unit          Workload  Agent  Machine  Public address  Ports  Message
zookeeper/0*  active    idle   1        10.72.158.122
  otelcol/1*  active    idle            10.72.158.122

Machine  State    Address        Inst id        Base          AZ            Message
1        started  10.72.158.122  juju-50c528-1  ubuntu@22.04  charm-dev-36  Running

Integration provider                                  Requirer                              Interface                Type         Message
loki-logging:logging                                  otelcol:send-loki-logs                loki_push_api            regular
otelcol:grafana-dashboards-provider                   grafana-dashboards:grafana-dashboard  grafana_dashboard        regular
otelcol:peers                                         otelcol:peers                         otelcol_replica          peer
prometheus-receive-remote-write:receive-remote-write  otelcol:send-remote-write             prometheus_remote_write  regular
zookeeper:cluster                                     zookeeper:cluster                     cluster                  peer
zookeeper:cos-agent                                   otelcol:cos-agent                     cos_agent                subordinate
zookeeper:restart                                     zookeeper:restart                     rolling_op               peer
zookeeper:upgrade                                     zookeeper:upgrade                     upgrade                  peer
Verify that metrics and logs reach Prometheus and Loki¶
With the cross-model relations established, verify that the metrics Zookeeper exposes are reaching Prometheus:
curl -s http://192.168.1.200/cos-lite-prometheus-0/api/v1/query\?query\=zookeeper_DataDirSize | jq
The output should confirm that Zookeeper metrics are being received:
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "zookeeper_DataDirSize",
          "instance": "localhost:9998",
          "job": "zookeeper_0",
          "juju_application": "zookeeper",
          "juju_model": "zoo",
          "juju_model_uuid": "8bc6571a-11d5-4c14-84a7-7c7c8e50c528",
          "juju_unit": "zookeeper/0",
          "memberType": "Leader",
          "replicaId": "1"
        },
        "value": [
          1776288972.212,
          "551"
        ]
      }
    ]
  }
}
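The same check can be done from Python instead of curl and jq. The following is a minimal sketch that builds the query URL for the Prometheus HTTP API and reduces a response of the shape shown above to one value per Juju unit. The function names are made up for illustration, and the base URL is the Traefik address from this example deployment.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def series_values(response: dict) -> dict:
    """Map each juju_unit label to its sample value in an instant-query response."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response}")
    return {
        item["metric"].get("juju_unit", "unknown"): float(item["value"][1])
        for item in response["data"]["result"]
    }


def fetch(base_url: str, promql: str, timeout: float = 10.0) -> dict:
    """Run the query over HTTP (requires network access to the ingress)."""
    with urlopen(query_url(base_url, promql), timeout=timeout) as resp:
        return json.load(resp)


# Offline demo using a trimmed copy of the response shown above.
sample = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {"__name__": "zookeeper_DataDirSize", "juju_unit": "zookeeper/0"},
                "value": [1776288972.212, "551"],
            }
        ],
    },
}

print(query_url("http://192.168.1.200/cos-lite-prometheus-0", "zookeeper_DataDirSize"))
print(series_values(sample))  # {'zookeeper/0': 551.0}
```

An empty dictionary from `series_values` would mean the query succeeded but no Zookeeper series has reached Prometheus yet.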
Log in to Grafana and use the Explore tab to verify that logs are arriving, or check the list of dashboards for the ZooKeeper dashboards.
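Logs can also be checked without the Grafana UI by querying Loki's HTTP API directly. The following is a minimal sketch that only builds the request URL. It rests on two assumptions not confirmed by this guide: that Loki is reachable through Traefik at `/cos-lite-loki-0` (by analogy with the Prometheus path used earlier), and that log streams carry a `juju_application` label like the metrics do.

```python
from urllib.parse import urlencode


def loki_query_url(base_url: str, logql: str, limit: int = 10) -> str:
    """Build a range-query URL for the Loki HTTP API."""
    params = urlencode({"query": logql, "limit": limit})
    return f"{base_url.rstrip('/')}/loki/api/v1/query_range?{params}"


# Base URL and label selector are assumptions; adjust them to your deployment.
url = loki_query_url(
    "http://192.168.1.200/cos-lite-loki-0",
    '{juju_application="zookeeper"}',
)
print(url)
```

The resulting URL can be fetched with curl or `urllib.request.urlopen`; a non-empty `data.result` list in the JSON response would indicate that Zookeeper logs are reaching Loki.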