Troubleshoot your Consul datacenter with the hcdiag tool
HashiCorp Diagnostics (hcdiag) is a troubleshooting data-gathering tool that you can use to collect and archive important data from Consul, Nomad, Vault, and Terraform Enterprise (TFE) server environments. The information gathered by hcdiag is well-suited for sharing with teams during incident response and troubleshooting.
In this tutorial, you will:
- Run a Consul server in "dev" mode, inside an Ubuntu Docker container
- Install hcdiag from the official HashiCorp Ubuntu package repository
- Execute basic hcdiag commands against this Consul service
- Explore the contents of files created by the hcdiag tool
- Learn about additional hcdiag features and how to use a custom configuration file with hcdiag
Prerequisites
You need a local installation of Docker running on your machine for this tutorial. Installation instructions are available in the official Docker documentation.
Set up the environment
Run an ubuntu Docker container in detached mode with the -d flag. The --rm flag instructs Docker to delete the container once it has been stopped, and the -t flag allocates a pseudo-tty, which keeps the container running until it is stopped manually.
Open an interactive shell session in the container with the -it flags.
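The two steps above might look like the following sketch. The container name hcdiag and the ubuntu:22.04 image tag are assumptions made for illustration; the text only specifies an Ubuntu image and the -d, --rm, -t, and -it flags.

```shell
# Assumed names: neither the container name nor the image tag comes from the tutorial text.
CONTAINER=hcdiag
IMAGE=ubuntu:22.04

# -d: detached, --rm: delete the container once stopped, -t: allocate a pseudo-tty.
docker run -d --rm -t --name "$CONTAINER" "$IMAGE" \
  || echo "could not start the container; is Docker running?" >&2

# -it: interactive shell session with a tty inside the running container.
docker exec -it "$CONTAINER" /bin/bash \
  || echo "could not open a shell in $CONTAINER" >&2
```

Naming the container makes it easier to refer to it later when you stop it.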
Tip
Your terminal prompt will now look different, showing that you are in a shell in the Ubuntu container. For example, it may look something like root@a931b3c8ca00:/#. Run the rest of the commands in this tutorial in this Ubuntu container shell.
Update the apt-get package index and install the necessary dependencies.
Create a working directory and change into it.
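A minimal sketch of these two steps follows. It assumes that wget and gpg are the dependencies in question (they are what is typically needed to add a signed apt repository) and uses a hypothetical working directory path.

```shell
# Refresh the package index, then install the tools needed to add a signed apt repository.
# Assumption: wget and gpg are the dependencies in question.
apt-get update && apt-get install -y wget gpg \
  || echo "apt-get failed; are you root inside the container?" >&2

# Hypothetical working directory; any writable path works.
mkdir -p /tmp/consul-hcdiag && cd /tmp/consul-hcdiag
```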
Install and start Consul
Add the HashiCorp repository.
Install the consul package.
Start Consul as a background process, with all of its output redirected to a consul.log file in your current directory.
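The three steps above could be sketched as follows. This assumes a root shell in the Ubuntu container with wget and gpg installed. The keyring path and the install_consul helper name are illustrative choices, not something the tutorial prescribes.

```shell
# Assumptions: root shell in the Ubuntu container, wget and gpg already installed.
# The keyring path is a common convention, not something this tutorial prescribes.
KEYRING=/usr/share/keyrings/hashicorp-archive-keyring.gpg

# Hypothetical helper that groups the repository setup and the package install.
install_consul() {
  # Import the HashiCorp package signing key.
  wget -O- https://apt.releases.hashicorp.com/gpg | gpg --dearmor > "$KEYRING" &&
  # Read the Ubuntu release codename (for example "jammy") from /etc/os-release.
  codename=$(. /etc/os-release && echo "$VERSION_CODENAME") &&
  # Register the repository, then install the consul package from it.
  echo "deb [signed-by=$KEYRING] https://apt.releases.hashicorp.com $codename main" \
    > /etc/apt/sources.list.d/hashicorp.list &&
  apt-get update && apt-get install -y consul
}
install_consul || echo "Consul installation failed" >&2

# Start a dev-mode agent in the background; stdout and stderr both go to ./consul.log.
consul agent -dev > consul.log 2>&1 &
```

If the agent does not come up, consul.log is the first place to look.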
Access the Consul client
Set the required environment variable that points hcdiag to your Consul service. In this case, because the Consul agent is running in dev mode, the address is http://127.0.0.1:8500.
Run the consul members command to confirm Consul is running.
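As a sketch, and assuming the variable in question is CONSUL_HTTP_ADDR (the standard variable that Consul's CLI and API clients read for the agent address), these two steps might look like this:

```shell
# Assumption: CONSUL_HTTP_ADDR is the variable in question; it is the standard
# way to tell Consul's CLI and API clients where the agent listens.
export CONSUL_HTTP_ADDR=http://127.0.0.1:8500

# Lists the agents known to the datacenter; a dev-mode agent shows a single server.
consul members || echo "Consul is not reachable at $CONSUL_HTTP_ADDR" >&2
```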
Install and run the hcdiag tool
Install the latest hcdiag release from the HashiCorp repository.
This is a minimal environment, so ensure you set the SHELL environment variable.
Run hcdiag with the -consul flag. This will gather your available environment and Consul product information.
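The install and run steps might look like the sketch below. The value /bin/sh for SHELL is an assumption; the text only says that the variable must be set.

```shell
# Install hcdiag from the HashiCorp apt repository added earlier.
apt-get install -y hcdiag || echo "hcdiag installation failed" >&2

# Assumption: /bin/sh is a suitable value; the text only says SHELL must be set.
export SHELL=/bin/sh

# Collect host information plus Consul-specific diagnostics.
hcdiag -consul || echo "hcdiag reported an error" >&2
```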
Tip
This is an extremely minimal environment that lacks some of the system services hcdiag uses to gather information, so seeing a few errors in the output is normal.
You can also invoke hcdiag without options to gather all available environment and product information. To learn about all available options, run hcdiag -h.
Examine the results
List the directory for .tar.gz archive files to discover the file that hcdiag created.
Tip
The extracted directory uses a timestamp as part of its name, so the name you see on your local machine will differ from any name referenced in this tutorial.
Unpack the archive to further examine its contents.
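Because the bundle name contains a timestamp, a small helper can locate it for you. The sketch below assumes bundles are named hcdiag-<timestamp>.tar.gz and introduces a hypothetical latest_bundle function.

```shell
# Hypothetical helper: print the newest hcdiag bundle in the current directory.
# Assumes bundles are named hcdiag-<timestamp>.tar.gz.
latest_bundle() {
  ls -t hcdiag-*.tar.gz 2>/dev/null | head -n 1
}

bundle=$(latest_bundle)
if [ -n "$bundle" ]; then
  # -v prints each file as it is unpacked, which doubles as a listing of the bundle.
  tar -xzvf "$bundle"
else
  echo "no hcdiag bundle found in $(pwd)" >&2
fi
```

If only one bundle exists, tar -xzf with the literal file name from the directory listing works just as well.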
The archive extracts several files and directories:
- Manifest.json contains information describing the hcdiag run, including configuration options used, run duration, and a count of any errors encountered.
- Results.json contains information about the environment and the output from an invocation of consul debug.
- ConsulDebug/ is a directory that contains the output from invoking consul debug.
Inspect the bundle to ensure it contains only information that is appropriate to share based on your use case or situation. If you need to obscure secrets or sensitive information that might be contained in an hcdiag bundle, please refer to the hcdiag redactions documentation.
Consul Enterprise users
If you are a Consul Enterprise user, please share the output from hcdiag with HashiCorp Customer Support to reduce the amount of information gathering required for a support request.
The tool only works locally and does not export or share the diagnostic bundle with anyone. You must use other tools to transfer it to a secure location so you can share it with specific support staff who need to view it.
Configuration file
You can configure hcdiag's behavior with a HashiCorp Configuration Language (HCL) formatted file. Using this file, you can configure behavior by adding your own custom runners, redacting sensitive content using regular expressions, excluding commands, and more.
Tip
This minimal environment doesn't ship with most common command-line text editors, so you will want to install one with apt-get install nano or apt-get install vim.
Create a file named diag.hcl with the following contents. This file does two things:
- It adds an agent-level (global) redaction which instructs hcdiag to redact all sensitive content in the format PASSWORD=sensitive. This is a contrived example; please refer to the official hcdiag documentation for more detailed information about how redactions work and how to use them.
- It instructs hcdiag to exclude the consul debug command.
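A configuration matching the two points above could look like the sketch below, written with a shell here-document so that no text editor is needed. The block and attribute names follow the hcdiag configuration format as best understood here; treat the exact regular expression and the exclusion string as assumptions and check them against the hcdiag documentation.

```shell
# The quoted 'EOF' keeps the backslashes in the regular expression intact.
cat > diag.hcl <<'EOF'
# Agent-level (global) redaction: mask anything of the form PASSWORD=<value>.
agent {
  redact "regex" {
    match   = "PASSWORD=\\S*"
    replace = "<PASSWORD REDACTED>"
  }
}

# Skip the "consul debug" runner for the Consul product.
product "consul" {
  excludes = ["consul debug"]
}
EOF
```

The exclusion must match the runner's identifier as hcdiag records it, so compare it with the keys in Results.json if the runner is still executed.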
Run the hcdiag command with the diag.hcl configuration file and review the output.
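A sketch of that invocation, assuming diag.hcl sits in the current directory and that only Consul data is wanted:

```shell
# Pass the configuration file to hcdiag with the -config flag.
CONFIG=diag.hcl
hcdiag -consul -config "$CONFIG" || echo "hcdiag run with $CONFIG failed" >&2
```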
Unlike the previous hcdiag run, this output does not contain Consul agent metrics information.
Additionally, any runner output that might capture and expose passwords in the redacted format would show <PASSWORD REDACTED> in place of this sensitive content.
Cleanup
Exit the Ubuntu container to return to your terminal prompt.
Stop the Docker container. Docker will automatically delete the container because of the --rm flag passed to the docker run command used at the beginning of the tutorial.
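As a sketch, and assuming the container was started with the hypothetical name hcdiag, cleanup might look like this:

```shell
# Inside the container shell, type `exit` to return to the host prompt.
# Then, on the host, stop the container (assumed here to be named "hcdiag").
CONTAINER=hcdiag
docker stop "$CONTAINER" || echo "could not stop $CONTAINER; is it still running?" >&2
```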
Production usage tips
By default, the hcdiag tool includes files from up to 72 hours before the current time. You can specify the desired time range using the -include-since flag.
If you are concerned about impacting the performance of your Consul servers, you can use the -serial flag so that the runners are invoked serially instead of concurrently.
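Both flags can be combined in one run. The sketch below assumes a 24-hour window, written as a Go-style duration:

```shell
# Hypothetical choice: limit collection to the last 24 hours.
SINCE=24h

# -include-since narrows the time range; -serial runs the runners one after another.
hcdiag -consul -include-since "$SINCE" -serial || echo "hcdiag run failed" >&2
```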
Deploying hcdiag in production involves a workflow similar to the following:
- Place the hcdiag binary on the Consul system in scope. This could be a Consul server or a Consul client.
- When running with a configuration file and the -config flag, ensure that the specified configuration file is readable by the user that executes hcdiag.
- Ensure that the current directory (or the destination directory you have chosen with the -dest flag) is writable by the user that executes hcdiag.
- Ensure connectivity to the HashiCorp products that hcdiag needs to connect to during the run. Export any required environment variables for establishing connections or passing authentication tokens as necessary.
- Decide on a time range for information gathering, noting that the default is to gather up to 72 hours back in server log output. Adjust the range as necessary with the -include-since flag. For example, to include only 24 hours of log output, invoke hcdiag with -include-since 24h.
- Limit what is gathered with the -includes flag. For example, -includes /var/log/consul-*,/var/log/nomad-* instructs hcdiag to only gather logs matching the specified Consul and Nomad filename patterns.
- Use redactions to prevent sensitive information like keys or passwords from reaching hcdiag's output or the generated bundle files.
- Use the -dryrun flag to test your configuration and options by observing what hcdiag would do without actually running anything.
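The file and directory checks from this workflow can be scripted before hcdiag is invoked. The sketch below uses hypothetical paths and a hypothetical preflight helper; only the hcdiag flags themselves come from the list above.

```shell
# Hypothetical paths; adjust them to your environment.
CONFIG=/etc/hcdiag/diag.hcl
DEST=/var/tmp/hcdiag

# Hypothetical helper: verify that the config file is readable and the
# destination is a writable directory for the current user.
preflight() {
  if [ ! -r "$1" ]; then
    echo "config file is not readable: $1" >&2
    return 1
  fi
  if [ ! -d "$2" ] || [ ! -w "$2" ]; then
    echo "destination is not a writable directory: $2" >&2
    return 1
  fi
}

if preflight "$CONFIG" "$DEST"; then
  # Dry run first to review what would be collected, then the real run.
  hcdiag -consul -config "$CONFIG" -dest "$DEST" -include-since 24h -dryrun &&
    hcdiag -consul -config "$CONFIG" -dest "$DEST" -include-since 24h
fi
```

Running the dry run first keeps a misconfigured invocation from producing a partial bundle on a production host.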
Summary
In this tutorial, you ran a Consul server in dev mode inside an Ubuntu Docker container and used that environment to explore the hcdiag tool in the context of gathering information from a running Consul environment.
You also learned about the available configuration flags, the configuration file, and production-specific tips for using hcdiag.
Next Steps
For additional information about the tool, check out the hcdiag GitHub repository.
There are also hcdiag guides for other HashiCorp tools including Vault, Terraform, and Nomad.
Help and Reference
Feel free to explore the following resources for additional help with troubleshooting your Consul environment.