Configure Datadog metrics collection for Consul on Kubernetes
This page describes the processes for integrating Datadog metrics collection in your Consul on Kubernetes deployment. The Helm chart includes automated configuration options to simplify the integration process.
Datadog Metrics Integration Methods
Choose the one integration method from the three described below that best suits your intent for metrics collection. The DogStatsD, Consul Integration, and OpenMetrics Prometheus methods of integration are mutually exclusive.
Reasoning: The consul-k8s Helm chart's automated configuration implements Datadog's Consul Integration method using the use_prometheus_endpoint configuration parameter. By design, the DogStatsD, Consul Integration, and OpenMetrics Prometheus methods share the same metric name syntax for collection, so enabling more than one would cause a conflict. Both the consul.py integration source code and the consul-k8s Helm chart prohibit enabling more than one integration at a time.
DogStatsD
This method of implementation leverages the hashicorp/go-metrics DogStatsD client library to manage metrics collection. Metrics are aggregated and sent via UDP or UDS transports to a Datadog Agent that runs on the same Kubernetes node as the Consul servers.
Enabling this method of metrics collection lets Consul control the delivery of metrics traffic directly to a Datadog Agent, rather than having a Datadog Agent reach Consul and scrape the /v1/agent/metrics API endpoint.
This is accomplished by updating the telemetry stanza in each server agent's configuration.
Helm Chart Configuration
Consul Helm Chart Overrides
Resulting server agent telemetry configuration
UDS/UDP Advantages and Disadvantages
This integration method accomplishes metrics collection by leveraging either Unix Domain Sockets (UDS) or User Datagram Protocol (UDP) transport. Practitioners who manage their Kubernetes infrastructure and/or service-mesh should take into account the implications outlined in the tables below.
UDS
Packet Transport: Unix Domain Socket File
Advantages | Disadvantages |
---|---|
No IP or DNS resolution requirement for Datadog Agent | Requires hostPath Volume attachment |
Improved network performance | Datadog Agent must run on every host you send metrics from |
Higher throughput capacity | |
Packet error handling | |
Automatic container ID tagging | |
UDP
Packet Transport:
- Kubernetes Service IP:Port
- Container Host Port
Advantages | Disadvantages |
---|---|
Does not require hostPath Volume attachment | No packet error handling |
(KubeDNS) Does not require Hostport exposure if accessible from cluster | (Hostport) Requires a networking provider that adheres to the CNI specification, such as Calico, Canal, or Flannel. |
Similar IP:Port configuration as Virtual Machine hosts | (Hostport) Requires port to be exposed on host using hostNetwork |
| | (Hostport) Requires firewall access controls to permit access |
| | (Hostport) Network Namespace sharing is required |
Verifying DogstatsD Metric Collection
To confirm that your Datadog Agent is receiving traffic, run the status subcommand on the Datadog Agent that is expected to receive DogStatsD traffic from Consul.
Once the configuration is properly established, the resulting output should show an increase in either UDP or UDS packet counts.
Transport | Command | Pod | Container |
---|---|---|---|
UDP or UDS | agent status | datadog-agent | agent |
Traffic verification can also be accomplished using the netstat command line utility from a consul-server that is expected to be submitting metrics data to Datadog.
Note
Using netstat requires privileged container permissions to install open-bsd networking tools on the consul-server for testing.
Transport | Command | Pod | Container |
---|---|---|---|
UDP or UDS | netstat | consul-server | consul |
UDS additionally supports verification by sending a test metrics packet to the configured Unix socket.
Note
Using netcat (nc) requires privileged container permissions to install open-bsd networking tools on the consul-server for testing.
Transport | Command | Pod | Container |
---|---|---|---|
UDS | nc | consul-server | consul |
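The test-packet check above can also be sketched in Python for environments where installing netcat is not practical. This is a minimal sketch, not part of the Helm chart or the Datadog Agent: the helper names and the metric name consul.test.metric are hypothetical, and the payload assumes the plain-text DogStatsD datagram format (name:value|type, optionally followed by |#tags).

```python
import socket


def format_dogstatsd(name, value, metric_type="c", tags=None):
    """Build one DogStatsD datagram: <name>:<value>|<type>[|#tag1,tag2]."""
    payload = f"{name}:{value}|{metric_type}"
    if tags:
        payload += "|#" + ",".join(tags)
    return payload.encode("utf-8")


def send_uds(socket_path, datagram):
    """Send a datagram to a Unix datagram socket (UDS transport)."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram, socket_path)


def send_udp(host, port, datagram):
    """Send a datagram to host:port (UDP transport)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram, (host, port))


# Hypothetical usage; substitute the socket path configured for your agent:
#   packet = format_dogstatsd("consul.test.metric", 1, tags=["source:manual-test"])
#   send_uds("/path/to/dsd.socket", packet)
```

After sending a packet, the packet counts reported by the agent status subcommand should increase for the matching transport.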
Use Case
DogStatsD integration provides full-scope metrics collection from Consul and minimizes access control configuration requirements, because traffic flows outbound (toward the Datadog Agent) rather than inbound (toward the /v1/agent/metrics API endpoint).
Metrics Data Collected
- The full list of metrics sent via DogStatsD is available in the Agent Telemetry documentation.
Datadog Checks: Official Consul Integration
The Datadog Agent package includes official third-party integrations for built-in availability upon agent deployment.
The Consul Integration's Datadog checks provide additional verification checks that leverage Consul's built-in feature set and help monitor Consul during normal operation beyond what Consul's own metrics offer.
See the table below for an outline of the features added by the official integration.
Note
Currently, the annotations configured by the Helm overrides with Consul RPC TLS enabled assume that the server and CA certificate secrets are shared with the Datadog Agent release namespace and mounted as the valid tls.crt, tls.key, and ca.crt secret volumes at the /etc/datadog-agent/conf.d/consul.d/certs path in the agent container of the Datadog Agent.
Helm Chart Configuration
Consul Helm Chart Overrides
Consul server-statefulset.yaml annotations
Additional Integration Checks Performed
Consul Component | Description | API Endpoint(s) |
---|---|---|
Agent | Agent Metadata (i.e., version) | /v1/agent/self |
Metrics | Prometheus formatted metrics | /v1/agent/metrics |
Serf | Events and Membership Flaps | /v1/health/service/consul /v1/agent/self |
Raft | Monitors Raft peer information and leader elections | /v1/status/leader /v1/status/peers |
Catalog Services | Service Health Status and Node Count | /v1/catalog/services /v1/health/state/any |
Catalog Nodes | Node Service Count and Health Status | /v1/health/state/any /v1/health/service/<service> |
Consul Latency | Consul LAN + WAN Coordinate Latency Calculations | /v1/agent/self /v1/coordinate/nodes /v1/coordinate/datacenters |
Use Case
This integration is primarily for basic Consul monitoring with a focus on service discovery.
Metrics Data Collected
Consul's Prometheus metrics that are scraped and mapped by this method are listed in the latest metrics.py of the integration source code.
To understand how Consul Latency metrics are calculated, review the Consul Network Coordinates documentation.
Review the Datadog documentation for the full description of the metrics data collected.
OpenMetrics Prometheus
For Datadog Agents at or above v6.5.0, OpenMetrics and Prometheus checks are available to scrape Kubernetes application Prometheus endpoints.
This method implements collection via OpenMetrics, which fully supports the Prometheus text format, and is configured using pod annotations as demonstrated below.
Note
Enabling OpenMetrics collection via Datadog removes, by design, the prometheus.io/path and prometheus.io/port annotations from the consul-server statefulset deployment so that Datadog can scrape the agent's metrics API endpoint using RPC TLS and Consul ACLs as necessary.
Note
Currently, the annotations configured by the Helm overrides with Consul RPC TLS enabled assume that the server and CA certificate secrets are shared with the Datadog Agent release namespace and mounted as the valid tls.crt, tls.key, and ca.crt secret volumes at the /etc/datadog-agent/conf.d/consul.d/certs path in the agent container of the Datadog Agent.
Helm Chart Configuration
Consul Helm Chart Overrides
Consul server-statefulset.yaml annotations
Use Case
This method of integration is useful for Prometheus-enabled scrapes with further customization of the collected data.
By default, this method scrapes Consul metrics using the /v1/agent/metrics?format=prometheus API query, and all metrics pulled this way are considered custom metrics.
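To inspect what a scrape of this endpoint returns, the query can be reproduced by hand. The following is a rough sketch, not the Datadog check's implementation: the function names are hypothetical, the base URL is whatever address your Consul agent's HTTP API listens on, and the parser is a simplification that assumes sample lines of the form `name{labels} value` without trailing timestamps.

```python
import urllib.parse
import urllib.request


def metrics_url(base_url):
    """Return the Prometheus-format metrics URL for a Consul agent."""
    query = urllib.parse.urlencode({"format": "prometheus"})
    return f"{base_url.rstrip('/')}/v1/agent/metrics?{query}"


def fetch_metrics(base_url, token=None, timeout=5):
    """Fetch the raw Prometheus text; an ACL token, if given, is sent as X-Consul-Token."""
    request = urllib.request.Request(metrics_url(base_url))
    if token:
        request.add_header("X-Consul-Token", token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_samples(text):
    """Map each series ('name' or 'name{labels}') to its float value.

    Comment lines (# HELP, # TYPE) and blank lines are skipped.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        series, value = line.rsplit(None, 1)
        samples[series] = float(value)
    return samples
```

Calling `parse_samples(fetch_metrics(base_url))` against a reachable agent gives a quick view of the series that Datadog would see when it scrapes the same endpoint.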
Metrics collected with this method map to Datadog as described in Mapping Prometheus Metrics to Datadog Metrics. The following table summarizes how these metrics map to each other.
OpenMetrics metric type | Datadog metric type |
---|---|
Gauge | gauge |
Counter | count |
Histogram: _count | count.count |
Histogram: _sum | count.sum |
Histogram: _bucket | count.bucket or distribution |
Summary: _count | count.count |
Summary: _sum | count.sum |
Summary: sample | gauge.quantile |
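The table can be expressed as a small lookup, which may help when predicting how a scraped series appears in Datadog. This sketch reads each entry such as count.sum as a Datadog metric type followed by a name suffix, which is an interpretation of the table rather than something the source states; the function name is hypothetical.

```python
def to_datadog(openmetrics_type, sample_suffix=""):
    """Return (datadog_type, name_suffix) for an OpenMetrics sample, per the table.

    sample_suffix is "_count", "_sum", "_bucket", or "" for the plain sample.
    """
    mapping = {
        ("gauge", ""): ("gauge", ""),
        ("counter", ""): ("count", ""),
        ("histogram", "_count"): ("count", ".count"),
        ("histogram", "_sum"): ("count", ".sum"),
        # Per the table, buckets may alternatively be sent as a distribution.
        ("histogram", "_bucket"): ("count", ".bucket"),
        ("summary", "_count"): ("count", ".count"),
        ("summary", "_sum"): ("count", ".sum"),
        ("summary", ""): ("gauge", ".quantile"),
    }
    key = (openmetrics_type.lower(), sample_suffix)
    if key not in mapping:
        raise ValueError(f"no mapping for {openmetrics_type!r} with suffix {sample_suffix!r}")
    return mapping[key]
```

For example, `to_datadog("Histogram", "_sum")` yields a count type with a `.sum` suffix, matching the corresponding table row.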
Metrics Data Collected
The integration, by default, uses a wildcard (".*") to collect all metrics emitted from the /v1/agent/metrics endpoint.
Refer to the Agent Telemetry documentation for a full list and description of the metrics data collected.
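To see the effect of the ".*" wildcard compared with a narrower pattern, the selection can be sketched with ordinary regular expressions. This only illustrates regex semantics and assumes full-string matching; it is not how the Datadog check is implemented, and the function name and the sample metric names are illustrative.

```python
import re


def select_metrics(names, patterns=(".*",)):
    """Return the names that fully match at least one regular expression."""
    compiled = [re.compile(pattern) for pattern in patterns]
    return [name for name in names if any(c.fullmatch(name) for c in compiled)]


# Illustrative names only; the real list comes from the metrics endpoint.
names = ["consul_raft_apply", "consul_runtime_alloc_bytes", "go_goroutines"]

everything = select_metrics(names)                    # default wildcard keeps all
raft_only = select_metrics(names, ["consul_raft_.*"])  # narrower pattern
```

Because every metric collected this way counts as a custom metric, a narrower pattern is one way to limit what is collected.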