Detecting Drift
One of the core challenges of infrastructure as code is keeping an up-to-date record of all deployed infrastructure and their properties. Terraform manages this by maintaining state information in a single file, called the state file.
Terraform uses declarative configuration files to define the infrastructure
resources to provision. This configuration serves as the target source of truth
for what exists on the backend API. Changes to Infrastructure outside of
Terraform will be detected as deviation by Terraform and shown as a diff in
future runs of terraform plan
. This type of change is referred to as "drift",
and its detection is an important responsibility of Terraform in order to inform
users of changes in their infrastructure. Here are a few techniques for
developers to ensure drift is detected.
Capture all state in READ
A provider's READ
method is where state is synchronized from the remote API to
Terraform state. It's essential that all attributes defined in the schema are
recorded and kept up-to-date in state. Consider this provider code:
As defined in the schema, the type
attribute is optional, now consider this
config:
Even though type
is omitted from the config, it is vital that we record it
into state in the READ
function, as the backend API could set it to a default
value. To illustrate the importance of capturing all state consider a
configuration that interpolates the optional value into another resource:
Update state after modification
A provider's CREATE
and UPDATE
functions will create or modify resources on
the remote API. APIs might perform things like provide default values for
unspecified attributes (as described in the above example config/provider code),
or normalize inputs (lower or upper casing all characters in a string). The end
result is a backend API containing modified versions of values that Terraform
has in its state locally. Immediately after creation or updating of a resource,
Terraform will have a stale state, which will result in a detected deviation on
subsequent plan
or apply
s, as Terraform refreshes its state and wants to
reconcile the diff. Because of this, it is standard practice to call READ
at
the end of all modifications to synchronize immediately and avoid that diff.
Error checking aggregate types
Terraform schema is defined using primitive types and aggregate types.
The preceding examples featured primitive types which don't require error
checking. Aggregate types on the other hand, schema.TypeList
,
schema.TypeSet
, and schema.TypeMap
, are converted to key/value pairs when
set into state. As a result the Set
method must be error checked, otherwise
Terraform will think it's operation was successful despite having broken state.
The same can be said for error checking API responses.
Use Schema Helper methods
As mentioned, remote APIs can often perform mutations to the attributes of a
resource outside of Terraform's control. Common examples include data containing
uppercase letters and being normalized to lowercase, or complex defaults being
set for unset attributes. These situations expectedly result in drift, but can
be reconciled by using Terraform's schema functions, such as
DiffSuppressFunc
or DefaultFunc
.