Audit log telemetry
Audit log telemetry provides information on the health of your configured audit devices.
Default metrics
vault.audit.log_request_failure
Metric type | Value | Description |
---|---|---|
counter | number | The number of audit log request failures across all devices |
The number of request failures is a crucial metric.
When using Prometheus sink use rate
or irate
to convert this into the number
of failures over a specific time period.
When using Vault's built-in /metrics
output format, counters are reported
aggregated over the metrics interval which defaults to 10 seconds. Due to
historical reasons, this counter is recorded in a way that makes the count
field misleading - it counts every request whether it failed or not. The mean
value however will correctly record the normalized per-second rate at which
audit errors have occurred over the interval.
Any increase in this counter indicates that all the configured audit devices failed to log a request (or response). If Vault cannot properly audit a request, or the response to a request, the original request will fail.
Refer to the Vault logs and any device-specific metrics to troubleshoot the failing audit log device.
vault.audit.log_request
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete all audit log requests across all audit log devices |
vault.audit.log_response_failure
Metric type | Value | Description |
---|---|---|
counter | number | The number of audit log response failures across all devices |
The number of response failures is a crucial metric.
When using Prometheus sink use rate
or irate
to convert this into the number
of failures over a specific time period.
When using Vault's built-in /metrics
output format, counters are reported
aggregated over the metrics interval which defaults to 10 seconds. Due to
historical reasons, this counter is recorded in a way that makes the count
field misleading - it counts every request whether it failed or not. The mean
value however will correctly record the normalized per-second rate at which
audit errors have occurred over the interval.
Any increase in this counter indicates that all the configured audit devices failed to log a request (or response). If Vault cannot properly audit a request, or the response to a request, the original request will fail.
Refer to the Vault logs and any device-specific metrics to troubleshoot the failing audit log device.
vault.audit.log_response
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete audit log responses across all audit log devices |
vault.audit.sink.success
Metric type | Value | Description |
---|---|---|
counter | number | Number of times an audit device was written to successfully |
vault.audit.sink.failure
Metric type | Value | Description |
---|---|---|
counter | number | Number of times an audit device encountered an error while writing |
vault.audit.fallback.success Enterprise
Metric type | Value | Description |
---|---|---|
counter | number | Number of times the fallback audit device was written to |
vault.audit.fallback.miss Enterprise
Metric type | Value | Description |
---|---|---|
counter | number | Number of times Vault filtered out an audit entry such that no devices were written to |
Audit device metrics
Device-specific metrics for each enabled audit device. For example, if you
enable a file audit device, the related metrics are:
vault.audit.file.log_request
and vault.audit.file.log_response
.
vault.audit.{DEVICE}.log_request
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete all audit log requests across the device |
vault.audit.{DEVICE}.log_response
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete all audit log responses across the device |