Scale Terraform Enterprise instances on Kubernetes

This topic describes how to increase the number of replica instances of your Terraform Enterprise deployment on Kubernetes so that your deployment can scale to meet demand.

Introduction

To handle increased computational workload requirements of your organization, you can scale Terraform Enterprise horizontally by increasing the number of terraform-enterprise pods and terraform-enterprise agent pods.

Impact on external resources

Scaling your deployment up to meet larger demands may impact the following external resources.

PostgreSQL: In our testing, CPU utilization and memory usage increased when scaling Terraform Enterprise pods with increasing capacity concurrency. We recommend closely monitoring both CPU utilization and memory usage and provide additional resources as necessary.
Redis: In our testing, Redis CPU or memory spiked in some cases. You should monitor your Redis server and provide additional resources as necessary.

Requirements

The requirements for your Kubernetes cluster depend on the specific workloads. Update CPU, RAM, storage, and network capacity according to your applications demands.

CPU requirements

The following minimum requirements for a Terraform Enterprise pod are appropriate for most initial production deployments and for development and test environments:

CPU	Memory
2 core	3 GB RAM

We tested with a minimum memory of 2.5GB and 0.75 vCPU and scaled up to maximum of 7.5GB and 4 vCPU.

resources:
   requests:
     memory: "2500Mi"
     cpu: "750m"
   limits:
     memory: "7500Mi"
     cpu: "4000m"

Database requirements

The following table describes the recommended minimum memory and CPU requirements for the PostgreSQL database. You may require additional resources depending on the anticipated demands on Terraform Enterprise within your organization:

Type	CPU	Memory	Storage
Minimum	4 core	16 GB RAM	50 GB
Scaling	8 core	32 GB RAM	50 GB

Redis cache requirements

We successfully tested Terraform Enterprise with the following data store configuration:

6GB Redis cache.
Multi-AZ enabled.
Automated failover enabled.

Increase `terraform-enterprise` pods

Kubernetes creates a Deployment object that manages the terraform-enterprise pods through a ReplicaSet. Refer to the Kubernetes documentation to learn about Deployments and ReplicaSets.

The deployment.yaml file generated during the Terraform Enterprise deployment process includes the replicas field, which determines how many terraform-enterprise pods Kubernetes should create.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: terraform-enterprise
  labels:
    app: terraform-enterprise
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: terraform-enterprise
  template:
    metadata:
      annotations:
    {{- if .Values.pod.annotations }}
    {{ toYaml .Values.pod.annotations | indent 8 }}
    {{ end }}
      labels:
        app: terraform-enterprise

Complete the following steps to increase the number of terraform-enterprise pods for your deployment:

Update the value of replicaCount in the values file of your Helm chart. The replicaCount value maps to the spec.replicas value in the deployment file. The maximum number of terraform-enterprise pods you can instruct Kuberneteds to create is 5.
Run the helm upgrade command to update the deployment.

In the following example, the replicaCount is set to 3, which sets the spec.replicas value to 3 after running helm ugrade:

  replicaCount: 3
  ...

Increase Terraform Enterprise run capacity

Terraform Enterprise executes runs by creating agent jobs in a different namespace, which in turn creates agent pods. Each run executes in its own agent pod. When the run is completed, the agent job and agent pod are automatically cleaned up. Increasing the maximum number of concurrent agent pods can reduce run queue lengths, resulting in reduced wait time for runs to begin execution.

The TFE_CAPACITY_CONCURRENCY setting determines the maximum number of agent jobs that can be created by the Terraform Enterprise pod at a given time. The default concurrency is set at 10. TFE_CAPACITY_CONCURRENCY applies to each terraform-enterprise pod. For example, if you have three terraform-enterprise pods, and TFE_CAPACITY_CONCURRENCY is 10, the maximum number of agent pods for Terraform Enterprise will be 30.

You can also configure memory and cpu limits for individual agent pods. The memory and cpu configuration for agent pods is determined by the TFE_CAPACITY_MEMORYand TFE_CAPACITY_CPU values on the helm chart.

  env:
  ...
  variables:
    TFE_CAPACITY_CONCURRENCY: "10"     # Set the maximum number of concurrent runs, eg: 10
    TFE_CAPACITY_CPU: "0"              # Set the maximum CPU utilization. "0" equals unlimited.
    TFE_CAPACITY_MEMORY: "2048"        # Set the maximum memory utilization, eg: "2048" equals 2048Mi.

To increase the number of agent pods that can be run concurrently, update the TFE_CAPACITY_CONCURRENCY value on the values file on the helm chart and do a helm upgrade to update the deployment.

The maximum value of TFE_CAPACITY_CONCURRENCY supported is 50.