Data tokenization with transform secrets engine
Vault Enterprise feature
The transform secrets engine requires a Vault Enterprise Advanced Data Protection (ADP) license.
What is data tokenization?
Data tokenization replaces sensitive data with unique values (tokens) that are unrelated to the original value in any algorithmic sense. As a result, the tokens themselves cannot expose the plaintext, which satisfies the PCI-DSS guidance.
Vault's transform secrets engine has a data transformation method to tokenize sensitive data stored outside of Vault.
Tokenization versus FPE
When encrypting sensitive data, preservation of the original data format or length may be required to meet certain industry standards such as HIPAA or PCI. One method to fulfill this requirement is to use format preserving encryption (FPE).
However, some organizations care more about the irreversibility of the tokenized data than about preserving the original data format. For them, the transform secrets engine's FPE transformation may not fit their governance, risk and compliance (GRC) strategy because FPE relies on reversible cryptography.
Characteristics of the tokenization transformation:
- Non-reversible identification: Protects data pursuant to requirements for data irreversibility (PCI-DSS, GDPR, etc.)
- Integrated metadata: Supports metadata for identifying data type and purpose
- Extreme scale and performance: Supports managing billions of tokens efficiently across clouds as well as on-premises
Prerequisites
To perform the tasks described in this tutorial, you need the following:
- Access to a Vault Enterprise license with the ADP module to run Vault in dev mode, or an HCP Vault Dedicated Plus tier cluster.
- Vault binary installed
- Docker installed
- jq installed
Policy requirements
For the purpose of this tutorial, you can use a root token to work with Vault. However, root tokens should only be used for just enough initial setup or in emergencies. As a best practice, use tokens with an appropriate set of policies based on your role in the organization.
To perform all tasks demonstrated in this tutorial, your policy must include the following permissions:
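A minimal sketch of such a policy follows. It assumes the transform secrets engine is mounted at transform/ (as it is later in this tutorial), and the file name tokenization-admin.hcl is hypothetical. Treat it as a starting point rather than the definitive policy for your environment.

```shell
# Write an example policy to a local file (the file name is hypothetical).
cat > tokenization-admin.hcl <<'EOF'
# Work with the transform secrets engine mounted at transform/
path "transform/*" {
  capabilities = ["create", "read", "update", "delete", "list"]
}

# Enable the secrets engine
path "sys/mounts/*" {
  capabilities = ["create", "read", "update", "delete", "list"]
}

# List enabled secrets engines
path "sys/mounts" {
  capabilities = ["read", "list"]
}
EOF
```

You could then load the file with the vault policy write command and attach the resulting policy to the token you use for the rest of the tutorial.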
If you are not familiar with policies, refer to the policies tutorial.
Lab setup
Note
If you do not have access to an HCP Vault Dedicated cluster, visit the Create a Vault Cluster on HCP tutorial.
Launch the HCP Portal and log in.
Click Vault in the left navigation pane.
In the Vault clusters pane, click vault-cluster.
Under Cluster URLs, click Public Cluster URL.
In a terminal, set the VAULT_ADDR environment variable to the copied address.
Return to the Overview page and click Generate token.
Within a few moments, a new token will be generated.
Copy the Admin Token.
Return to the terminal and set the VAULT_TOKEN environment variable.
Set the VAULT_NAMESPACE environment variable to admin.
The admin namespace is the top-level namespace automatically created by HCP Vault. All CLI operations default to use the namespace defined in this environment variable.
Type vault status to verify your connectivity to the Vault cluster.
The Vault Dedicated server is ready.
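The environment variables from the steps above can be set as in the following sketch. The address and token values are placeholders (assumptions), not real values; substitute the Public Cluster URL and Admin Token you copied from the HCP Portal.

```shell
# Placeholder values: replace them with the Public Cluster URL and the
# Admin Token copied from the HCP Portal.
export VAULT_ADDR="<public-cluster-url>"
export VAULT_TOKEN="<admin-token>"

# The admin namespace is the top-level namespace on HCP Vault.
export VAULT_NAMESPACE="admin"
```

With these variables exported, vault status should report the state of your cluster.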
Set up the transform secrets engine
Create a role named mobile-pay which is attached to a transformation named credit-card. The tokenized value will have a fixed maximum time-to-live (TTL) of 24 hours.
Enable the transform secrets engine at transform/.
Create a role named mobile-pay with a transformation named credit-card.
Create a transformation named credit-card which sets the generated token's maximum time-to-live (TTL) to 24 hours.
The max_ttl is an optional parameter which allows you to control how long the token should stay valid.
You can set the allowed_roles parameter to a wildcard (*) to allow all roles, or use a glob at the end for pattern matching (e.g. mobile-*).
Display details about the credit-card transformation.
The type is set to tokenization.
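The setup steps above can be sketched with the Vault CLI as follows. This assumes VAULT_ADDR and VAULT_TOKEN point at a cluster licensed for the transform secrets engine and that nothing is mounted at transform/ yet.

```shell
# Enable the transform secrets engine at its default path, transform/.
vault secrets enable transform

# Create the mobile-pay role and attach the credit-card transformation to it.
vault write transform/role/mobile-pay transformations=credit-card

# Create the credit-card tokenization transformation with a 24-hour max TTL.
vault write transform/transformations/tokenization/credit-card \
    allowed_roles=mobile-pay \
    max_ttl=24h

# Display the transformation details; the type should be tokenization.
vault read transform/transformations/tokenization/credit-card
```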
Tokenize secrets
The Vault client applications must have the following in their policy to perform tokenization transformation using the transform secrets engine enabled at transform/.
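A minimal sketch of a client policy for these operations is shown here. The file name tokenization-client.hcl is hypothetical, and the wildcard paths grant access to every role; replace the wildcard with a role name such as mobile-pay to narrow the scope.

```shell
# Write an example client policy to a local file (the file name is hypothetical).
cat > tokenization-client.hcl <<'EOF'
# Request data encoding using any of the roles
path "transform/encode/*" {
  capabilities = ["update"]
}

# Request data decoding using any of the roles
path "transform/decode/*" {
  capabilities = ["update"]
}

# Validate a token
path "transform/validate/*" {
  capabilities = ["update"]
}

# Retrieve the metadata that belongs to a token
path "transform/metadata/*" {
  capabilities = ["update"]
}

# Check whether a value has already been tokenized
path "transform/tokenized/*" {
  capabilities = ["update"]
}
EOF
```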
Encode data using the mobile-pay role and store it in a variable.
The ttl value is an optional parameter. Remember that the max_ttl was set to 24 hours when you created the credit-card transformation. You can override that value to make the token's TTL shorter. In addition, you can set optional metadata about the data.
Retrieve the metadata of the token.
Notice the expiration_time value. Since you overrode the max_ttl, the ttl is set to 8 hours.
Validate the token value.
Validate that the credit card number has been tokenized already.
Retrieve the original plaintext credit card value.
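The tokenization steps in this section can be sketched as follows. The card number and metadata values are illustrative sample data, and the variable name TOKEN_VALUE is an assumption chosen to match the name used in the key rotation section. jq extracts the token from the JSON response.

```shell
# Encode a sample card number with an 8-hour TTL and some metadata,
# and keep only the returned token.
TOKEN_VALUE=$(vault write -format=json transform/encode/mobile-pay \
    value=1111-2222-3333-4444 \
    transformation=credit-card \
    ttl=8h \
    metadata="Organization=HashiCorp" \
    metadata="Purpose=Travel" \
    metadata="Type=AMEX" | jq -r ".data.encoded_value")

# Retrieve the metadata stored with the token.
vault write transform/metadata/mobile-pay \
    value="$TOKEN_VALUE" transformation=credit-card

# Validate the token value.
vault write transform/validate/mobile-pay \
    value="$TOKEN_VALUE" transformation=credit-card

# Check whether the plaintext card number has already been tokenized.
vault write transform/tokenized/mobile-pay \
    value=1111-2222-3333-4444 transformation=credit-card

# Retrieve the original plaintext value.
vault write transform/decode/mobile-pay \
    value="$TOKEN_VALUE" transformation=credit-card
```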
Convergent tokenization
Requirement
This feature requires Vault 1.11.0 or later.
If you encode the same value multiple times with the credit-card transformation, Vault returns a different encoded value each time.
In some use cases, you may want to have the same encoded value for a given input so that you can query your database to count the number of entries for a given secret.
Key derivation is supported to allow the same key to be used for multiple purposes by deriving a new key based on a user-supplied context value. In this mode, convergent encryption can optionally be supported, which allows the same input values to produce the same ciphertext.
Update the mobile-pay role to include a convergent transformation named credit-card-convergent.
Create a transformation named credit-card-convergent which enables convergent encryption. When you define the transformation, set convergent=true.
Encode a value using the credit-card-convergent transformation.
Run the command again.
It returns the same encoded value.
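The convergent tokenization steps can be sketched as follows, assuming the mobile-pay role and the transform/ mount from the earlier setup. The card number is illustrative sample data.

```shell
# Attach both transformations to the mobile-pay role.
vault write transform/role/mobile-pay \
    transformations=credit-card,credit-card-convergent

# Create a tokenization transformation with convergent mode enabled.
vault write transform/transformations/tokenization/credit-card-convergent \
    allowed_roles="*" \
    convergent=true

# Encode the same value twice; both calls should return the same token.
vault write transform/encode/mobile-pay \
    value=5555-6666-7777-8888 transformation=credit-card-convergent
vault write transform/encode/mobile-pay \
    value=5555-6666-7777-8888 transformation=credit-card-convergent
```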
Lookup token
When the transformation is configured with convergent encryption, you can look up the tokenized value (token).
Encode the value using the credit-card-convergent transformation with a time-to-live (TTL) of 8 hours.
The encoded value (token) is longer than the one without a TTL.
Look up the token for card number "5555-6666-7777-8888".
Look up with expiration of "any".
This returns two space-separated tokens. In the absence of the expiration parameter, the command returns the token with no expiration. When the expiration is set to "any", it returns tokens with any expiration.
Look up tokens that have an expiration within a given range using min_expiration and max_expiration, which are RFC3339-formatted date and time values.
Any token that expires within the provided date range is displayed.
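The lookup steps above can be sketched as follows. The one-day range and the variable names MIN_EXP and MAX_EXP are assumptions made for illustration; any range that contains the 8-hour expiration would work.

```shell
# Encode the sample value again, this time with an 8-hour TTL.
vault write transform/encode/mobile-pay \
    value=5555-6666-7777-8888 transformation=credit-card-convergent ttl=8h

# Look up the token that has no expiration.
vault write transform/tokens/mobile-pay \
    value=5555-6666-7777-8888 transformation=credit-card-convergent

# Look up tokens with any expiration.
vault write transform/tokens/mobile-pay \
    value=5555-6666-7777-8888 transformation=credit-card-convergent \
    expiration=any

# Build an RFC3339 range from now until one day from now.
# GNU date accepts -d; BSD/macOS date accepts -v. Try one, then the other.
MIN_EXP=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
MAX_EXP=$(date -u -d "+1 day" +"%Y-%m-%dT%H:%M:%SZ" 2>/dev/null \
    || date -u -v+1d +"%Y-%m-%dT%H:%M:%SZ")

# Look up tokens that expire within the range.
vault write transform/tokens/mobile-pay \
    value=5555-6666-7777-8888 transformation=credit-card-convergent \
    min_expiration="$MIN_EXP" max_expiration="$MAX_EXP"
```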
Key rotation
Note
The automatic key rotation requires Vault Enterprise 1.12.0 or later.
Rotating keys regularly limits the amount of information produced by a key if that key ever becomes compromised. In this section, you are going to enable automatic key rotation for your tokenization keys.
Read the key information for the credit-card transformation.
Notice that the latest_version is 1.
Rotate the key for the credit-card transformation.
Read the key information again.
The latest_version is now 2.
Configure the key to be automatically rotated every 90 days to reduce operational overhead.
Note
The minimum permitted value for the auto_rotate_period is 1 hour.
Verify the configuration.
If the key becomes compromised, you can rotate the key using the transform/tokenization/keys/<transformation_name>/rotate endpoint, and then set the min_decryption_version to the latest key version so that the older (possibly compromised) key can no longer be used to decode the data.
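The rotation steps above can be sketched as follows, assuming the credit-card transformation created earlier in this tutorial.

```shell
# Read the key information; latest_version should be 1.
vault read transform/tokenization/keys/credit-card

# Rotate the key manually (-force sends the write without any parameters).
vault write -force transform/tokenization/keys/credit-card/rotate

# Read the key information again; latest_version should now be 2.
vault read transform/tokenization/keys/credit-card

# Rotate the key automatically every 90 days.
vault write transform/tokenization/keys/credit-card/config \
    auto_rotate_period=90d

# Verify the configuration.
vault read transform/tokenization/keys/credit-card
```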
Because the minimum rotation period you can set is 1 hour, you will need to come back later to see that the key is rotated.
Set the rotation period to 1 hour.
Encode another value for testing.
The value, 1234-5678-9012-3456, is encoded with version 2 of the credit-card key, and the returned encoded value is stored in the TOKEN_VALUE_2 environment variable.
You can make sure that the environment variable holds the encoded value.
Wait for at least 1 hour to see that the key has been rotated.
You can test that the data encoded by version 1 of the key can still be decoded because min_decryption_version is 1.
Also, you should be able to decode TOKEN_VALUE_2.
If you change the min_decryption_version to 2, you will be able to decode TOKEN_VALUE_2 but not TOKEN_VALUE.
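The automatic rotation test can be sketched as follows. It assumes TOKEN_VALUE still holds a token that was encoded with version 1 of the credit-card key, and it uses jq to extract the new token from the JSON response.

```shell
# Set the shortest permitted rotation period.
vault write transform/tokenization/keys/credit-card/config \
    auto_rotate_period=1h

# Encode a sample value with the current key version and keep the token.
TOKEN_VALUE_2=$(vault write -format=json transform/encode/mobile-pay \
    value=1234-5678-9012-3456 \
    transformation=credit-card | jq -r ".data.encoded_value")
echo "$TOKEN_VALUE_2"

# After at least one hour, confirm that latest_version has increased.
vault read transform/tokenization/keys/credit-card

# Both tokens should decode while min_decryption_version is 1.
vault write transform/decode/mobile-pay \
    transformation=credit-card value="$TOKEN_VALUE"
vault write transform/decode/mobile-pay \
    transformation=credit-card value="$TOKEN_VALUE_2"

# Raise the minimum version; tokens encoded with version 1 stop decoding.
vault write transform/tokenization/keys/credit-card/config \
    min_decryption_version=2
```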
Set up external token storage
Vault Dedicated does not currently support external token storage.
Bring your own key (BYOK)
Note
This section is provided to help you understand the process of generating and using a data key without sending data to Vault. A complete end-to-end scenario cannot be replicated in the tutorial.
When your use case requires an external key, users of Vault version 1.12.0 or greater can use BYOK functionality to import an existing encryption key that was generated outside Vault.
The target key for import can originate from an HSM or other external source, and must be prepared according to its origin before you can import it.
Note
Tokenization transformations with imported keys do not currently support convergent tokenization.
The example shown here will use a 256-bit AES key, referred to as the target key. To successfully import the target key, you must perform the following operations to prepare it.
1. Generate an ephemeral 256-bit AES key.
2. Wrap the target key using the ephemeral AES key with AES-KWP.
3. Wrap the AES key under the Vault wrapping key using RSAES-OAEP with MGF1 and either SHA-1, SHA-224, SHA-256, SHA-384, or SHA-512.
4. Delete the ephemeral AES key.
5. Append the wrapped target key to the wrapped AES key.
6. Base64 encode the result.
A specific code example for preparing and wrapping the key for import is beyond the scope of this tutorial. For more details about wrapping the key for import, including instructions for wrapping a key from an HSM, refer to the key wrapping guide.
Before you can wrap the key for import, you must read the wrapping key from Vault so that it can be used to prepare your key.
The output is the (4096-bit RSA) wrapping key.
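Reading the wrapping key can be sketched as follows. Saving it to a local file named wrapping_key.pem is an assumption made for convenience; the file name is hypothetical.

```shell
# Read the public wrapping key from the transform secrets engine and
# save it locally so it can be used when wrapping the ephemeral AES key.
vault read -field=public_key transform/wrapping_key > wrapping_key.pem
```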
Use the wrapping key value at step 3 in the previously detailed preparation steps. Once you have prepared and base64 encoded the ciphertext, export the value to the environment variable IMPORT_CIPHERTEXT.
Create a new transformation role named legacy-system to use for the transformation that you will import the key into.
Import the key into the application-form transformation. Imported keys do not support rotation by default, so include the allow_rotation parameter and set its value to true so that you can also try rotating the imported key. Add the allowed_roles parameter and specify the legacy-system role.
Try using the newly imported key to encode a value and some metadata.
The imported key is working, and the encoded value returned by the application-form transformation is using the imported key.
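The import steps above can be sketched as follows. The ciphertext is a placeholder for the base64-encoded output of your own wrapping procedure, and the encoded value and metadata are illustrative sample data (assumptions, not values from this tutorial).

```shell
# Placeholder: replace with the base64-encoded wrapped key you prepared.
export IMPORT_CIPHERTEXT="<base64-encoded-ciphertext>"

# Create the role that will use the imported key.
vault write transform/role/legacy-system transformations=application-form

# Import the key into the application-form transformation and allow
# Vault to rotate it later.
vault write transform/transformations/tokenization/application-form/import \
    ciphertext="$IMPORT_CIPHERTEXT" \
    allowed_roles=legacy-system \
    allow_rotation=true

# Encode a sample value with some metadata using the imported key.
vault write transform/encode/legacy-system \
    value=123-45-6789 \
    transformation=application-form \
    metadata="Purpose=Testing"
```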
Note
To import subsequent versions of the key, you must use the import_version API endpoint.
Review the key information.
The key's latest_version is currently 1.
Rotate the key.
Note
Once an imported key is rotated within Vault, it will no longer support importing key material with the import_version endpoint.
Check the key information once more.
The key's latest_version is now 2, and you can no longer import external versions of the key as it is now internally maintained by Vault.
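Reviewing and rotating the imported key can be sketched as follows, assuming the key was imported with allow_rotation set to true.

```shell
# Review the key information; latest_version should be 1.
vault read transform/tokenization/keys/application-form

# Rotate the imported key inside Vault.
vault write -force transform/tokenization/keys/application-form/rotate

# Check the key information once more; latest_version should be 2.
vault read transform/tokenization/keys/application-form
```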
Clean up
Unset the VAULT_TOKEN environment variable.
Unset the VAULT_ADDR environment variable.
Unset the IMPORT_CIPHERTEXT environment variable.
Stop and remove the Postgres container.
Stop the Vault Enterprise dev mode container.
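The environment variable cleanup can be sketched as follows. The container commands are omitted because the container names depend on how you started them; docker ps lists the running containers if you need to find them.

```shell
# Remove the variables set during this tutorial from the current shell.
unset VAULT_TOKEN
unset VAULT_ADDR
unset IMPORT_CIPHERTEXT
```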
Summary
The transform secrets engine provides a tokenization transformation that replaces sensitive data with unique values (tokens) that are unrelated to the original value in any algorithmic sense. This can help organizations meet certain industry standards.
If retaining the original data format is important, refer to the Transform Secrets Engine to learn about the format preserving encryption (FPE) transformation.