Integrating COS Lite with uncharmed applications¶
The COS Lite bundle is designed to be deployed and operated using Juju. However, not all workloads that you may want to monitor will be. The good news is that you can use COS Lite to monitor workloads that are not charmed (aka “not managed by Juju”). The bad news is that it’s relatively straightforward to do so. Not bad at all.
Prerequisites¶
This how-to assumes that you already have a working deployment of COS Lite. If that is not the case, we recommend you first follow our tutorial for getting started with COS Lite.
The first step will be to get a hold of a machine, somewhere, and follow this guide on how to get started with COS lite on MicroK8s.
Unless you’re also planning to monitor some charmed applications with this COS Lite deployment, you will not need to use the offers
overlay.
Deploy Grafana Agent¶
Grafana Agent will act as an intermediary between the applications you want to monitor and the COS Lite stack. It will gather telemetry from your applications and send them to COS Lite, where you will be able to inspect them through the Grafana dashboards.
We recommend to host Grafana agent as close as possible to the workloads you intend to monitor, to minimize the risk of network faults and the resulting gaps in telemetry collection.
We recommend to install Grafana Agent via a handy snap we maintain:
$ sudo snap install grafana-agent
Note
Grafana Agent is also available as a single Go binary, and you are free to install it and run it the way you like. See the official documentation for the publisher’s recommendations and guides. We also have it containerized and petrified.
Now that you have Grafana Agent up and running, you will need to configure it.
Get the API endpoints¶
COS Lite includes a Traefik instance that takes care of load balancing and providing ingress capabilities to the various observability components of the stack. Since COS Lite runs on Kubernetes, this allows you to talk to them via Traefik over a stable URL.
Caution
Before you can use Traefik from an external service such as Grafana agent, you will need to ensure that the Traefik URL is routable from the service host, and that the address is stable. (e.g., not a dynamic IP).
In other words, Traefik’s own URL needs to be stable.
In the Juju model where COS Lite is deployed, run the command below to find out the URL to the proxied endpoint.
$ juju run traefik/0 show-proxied-endpoints
Assuming you have configured the Traefik charm to use an external host name, for example "traefik.url"
, you will see something like:
proxied-endpoints: '{
"prometheus/0": {"url": "https://traefik.url/mymodel-prometheus-0"},
"loki/0": {"url": "https://traefik.url/mymodel-loki-0"},
"alertmanager": {"url": "https://traefik.url/mymodel-alertmanager"},
"catalogue": {"url": "https://traefik.url/mymodel-catalogue"},
}'
If you prefer to explore the deployment in a graphical manner, you can also
open https://traefik.url/mymodel-catalogue
in a browser for a list of all the
user interfaces of the components included.
At this point you will need to follow the documentation on how to configure Grafana Agent. Use the URLs you obtained from Traefik to tell the agent where to send its telemetry.
Once you’ve written your finished configuration to /etc/grafana-agent.yaml
, you’ll
be able to restart the snap using the following command:
$ sudo snap restart grafana-agent
And with that, you are done! Good job, you got this!
Diving deeper¶
If completing this how-to guide made you crave for more, feel free to continue on with one of the following extracurriculars.
Custom dashboards and alerts¶
If you want to add your own dashboards and alerts to COS Lite you may extend your COS Lite deployment with the COS Configuration charm.
This charm allow you to use a GitOps-type of workflow for continuously feeding your COS deployment with your latest alert and dashboard definitions.
See this guide for more information.
Using TLS¶
To enable secure communications with (and within) COS Lite, deploy COS Lite with the TLS overlay. You can follow this guide to enable TLS in Traefik and COS Lite.
Grafana Agent snap as a client¶
As a client (e.g. scraping /metrics
endpoint), Grafana Agent must trust the CA that signed the COS charms (or the COS
ingress charm).
For example, to obtain the CA certificate from the self-signed-certificates charm,
juju run ssc/0 get-ca-certificate --format=yaml \
| yq '.ssc/0.results.ca-certificate'
Next, you need to add the certificate to the root store.
Note: After running
update-ca-certificates
and restarting thegrafana-agent
snap service, check the Grafana Agent logs to confirm there are no log lines such as:
msg="Failed to send batch, retrying" err="Post \"https://.../api/v1/write\": tls: failed to verify certificate: x509: certificate signed by unknown authority"
Grafana Agent as a server¶
To have a TLS handshake with incoming connections (e.g. if you push to Grafana Agent via remote-write), you need to have a private key and a certificate, and add to the Grafana Agent config as follows:
server:
log_level: info
+ grpc_tls_config:
+ cert_file: /home/path/to/grafana-agent.pem
+ key_file: /home/path/to/grafana-agent.key
+ http_tls_config:
+ cert_file: /home/path/to/grafana-agent.pem
+ key_file: /home/path/to/grafana-agent.key
Using the Prometheus Scrape Target charm¶
In some rare circumstances, you might prefer to use Prometheus Scrape Target instead of Grafana Agent. Namely:
When you only need metrics (no logs, traces, etc…)
When you’d rather make the necessary firewall changes in the workload you want to monitor, than ingress cos-lite
When you lack permissions to install snaps on the server hosting the workload you want to monitor
If this describes your situation, you can opt for deploying the Prometheus Scrape Target charm instead, configuring it to scrape your workload.