# Tier OpenTelemetry Collector with different pipelines per data stream
By design, the charmed OpenTelemetry Collector (otelcol) forwards data from all receivers to all exporters. For this reason, mimicking the wide range of architectures that the pipeline config supports may require deploying multiple OpenTelemetry Collector charms in a tiered topology. One such use case is processing data differently per receiver or exporter.
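
As a minimal sketch of the pattern, two collector applications can be chained so that the first tier forwards everything it receives to the second. The `opentelemetry-collector-k8s` charm name and the application names here are assumptions; adjust them to your deployment:

```bash
# Deploy the collector charm twice, under distinct application names
juju deploy opentelemetry-collector-k8s first-tier
juju deploy opentelemetry-collector-k8s second-tier

# Chain the tiers; if Juju cannot infer the endpoints between two
# identical charms, name them explicitly (endpoint names vary by charm)
juju integrate first-tier second-tier
```
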
## Tiering outgoing data streams
One imaginable scenario is splitting a log stream into hot and cold data based on log levels. For compliance reasons, we may also want a redaction processor to remove sensitive data, and the batch processor improves the efficiency of both log streams via compression. Low-severity levels like TRACE, DEBUG, and INFO are usually the most frequent in a log stream and indicate normal workload operation. They can be filtered out of the stream sent to long-term (cold) storage to minimize cost while maintaining compliance. Conversely, the hot storage can include INFO logs, since storage is short-term, while still filtering out TRACE and DEBUG logs.
To understand how to filter telemetry with otelcol, refer to the selectively drop telemetry documentation or see the examples for log-level filtering.

```mermaid
flowchart TB
    flog[flog] --> fan-out
    fan-out["opentelemetry-collector<br>(redact & batch)"]
    fan-out --> warn
    fan-out --> info
    warn["opentelemetry-collector<br>(cold filter)"] --> loki-cold
    info["opentelemetry-collector<br>(hot filter)"] --> loki-hot
    loki-hot["loki<br>(hot storage)"]
    loki-cold["loki<br>(cold storage)"]

    class fan-out,warn,info thickStroke;
    classDef thickStroke stroke-width:2px, stroke:#FFA500;
```
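
A deployment matching this diagram could look like the following sketch, assuming the `flog-k8s`, `opentelemetry-collector-k8s`, and `loki-k8s` charm names; the application names mirror the diagram nodes:

```bash
# Log source and the fan-out tier (redact & batch)
juju deploy flog-k8s flog
juju deploy opentelemetry-collector-k8s fan-out
juju integrate flog fan-out

# Per-stream filter tiers behind the fan-out
juju deploy opentelemetry-collector-k8s warn   # cold filter
juju deploy opentelemetry-collector-k8s info   # hot filter
juju integrate fan-out warn
juju integrate fan-out info

# Hot and cold Loki backends
juju deploy loki-k8s loki-cold
juju deploy loki-k8s loki-hot
juju integrate warn loki-cold
juju integrate info loki-hot
```
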
With Juju config, we use the otelcol `processors` config option to:
- Set the minimum severity level to `WARNING`:

  ```yaml
  cold-filter:
    options:
      processors: |-
        filter:
          logs:
            log_record:
              - ContainsValue(Keys(ParseJSON(body)), "level") and
                (ParseJSON(body)["level"] == "INFO" or
                ParseJSON(body)["level"] == "DEBUG" or
                ParseJSON(body)["level"] == "TRACE")
  ```
- Set the minimum severity level to `INFO`:

  ```yaml
  hot-filter:
    options:
      processors: |-
        filter:
          logs:
            log_record:
              - ContainsValue(Keys(ParseJSON(body)), "level") and
                (ParseJSON(body)["level"] == "DEBUG" or
                ParseJSON(body)["level"] == "TRACE")
  ```
- Redact sensitive log messages and batch:

  ```yaml
  redact-and-batch:
    options:
      processors: |
        batch:
        redaction:
          blocked_values:
            - "(dolorem|facilis|quo) .* (corporis|debitis|quis)"
  ```
## Tiering incoming data streams
Another imaginable scenario is classifying log streams prior to ingestion into a common storage destination. Each flog log source gets its own downstream data processing, which is useful for classifying and identifying the source environment. Both data streams benefit from the redact & batch otelcol, which uses the redaction processor for compliance reasons and the batch processor for efficiency. Additionally, each stream passes through an attributes processor, uniquely configured to label the logging source environment.

```mermaid
flowchart TB
    flog-dev["flog<br>(dev)"] --> dev
    dev["opentelemetry-collector<br>(dev attributes)"] --> fan-in
    flog-prod["flog<br>(prod)"] --> prod
    prod["opentelemetry-collector<br>(prod attributes)"] --> fan-in
    fan-in["opentelemetry-collector<br>(redact & batch)"] --> loki[loki]

    class fan-in,dev,prod thickStroke;
    classDef thickStroke stroke-width:2px, stroke:#FFA500;
```
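
As before, the diagram translates into a deployment along these lines (a sketch; the charm names are assumptions and the application names mirror the diagram nodes):

```bash
# One flog source per environment
juju deploy flog-k8s flog-dev
juju deploy flog-k8s flog-prod

# Per-environment classification tiers
juju deploy opentelemetry-collector-k8s dev
juju deploy opentelemetry-collector-k8s prod
juju integrate flog-dev dev
juju integrate flog-prod prod

# Shared fan-in tier and common storage
juju deploy opentelemetry-collector-k8s fan-in
juju deploy loki-k8s loki
juju integrate dev fan-in
juju integrate prod fan-in
juju integrate fan-in loki
```
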
With Juju config, we use the otelcol `processors` config option to:
- Label the log stream as `development` and originating from `region-a`:

  ```yaml
  dev-attributes:
    options:
      processors: |-
        attributes/dev:
          actions:
            - key: "region-a.environment"
              value: "dev"
              action: upsert
  ```
- Label the log stream as `production` and originating from `region-a`:

  ```yaml
  prod-attributes:
    options:
      processors: |-
        attributes/prod:
          actions:
            - key: "region-a.environment"
              value: "prod"
              action: upsert
  ```
- Redact sensitive log messages and batch:

  ```yaml
  redact-and-batch:
    options:
      processors: |-
        batch:
        redaction:
          allow_all_keys: true
          blocked_values:
            - "(dolorem|facilis|quo) .* (corporis|debitis|quis)"
  ```