Terraform: benefits and setup

Introduction

Within the world of software tools and frameworks, there is a category dedicated to defining physical infrastructure (servers, databases…) using code.

What makes this possible is relying on an Infrastructure as a Service (IaaS) provider that exposes the APIs we need to communicate with it programmatically and tell it which resources we want deployed.

This post aims to cover the basics of one of these Infrastructure as Code tools, Terraform, as well as its benefits in the long run.

What is Terraform? 🏗️

Terraform is a command line tool that allows developers to communicate with IaaS cloud providers in order to manage resources within those providers. The Terraform team uses resource management to mean both the creation of resources and their evolution over time, covering every stage of their lifecycle: creation, modification, and teardown.

The supported cloud providers, alongside their documentation, are:

For a deeper explanation, watch the Introduction video by Armon Dadgar (HashiCorp CTO).

How to use Terraform?

Terraform relies on resources being defined in a dedicated language called HashiCorp Configuration Language (HCL). This language gives developers a single syntax for defining resources, regardless of the cloud provider.

It is said to be “cloud agnostic”.

HCL uses sets of key-value pairs to define particular properties within each cloud resource. The resource type dictates which keys are accepted, as different providers allow different ways of configuring their resources.

For instance, this is an SQL Database definition within GCP:

resource "google_sql_database" "us-database" {
  project  = "dummy-project"                # The GCP project targeted
  name     = "us-db"                        # The DB name we selected
  charset  = "UTF8"                         # The charset within the DB
  instance = "project-database-instance"    # Assuming this DB instance exists
}
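
The comment on `instance` assumes that the DB instance already exists. As a minimal sketch of how that instance could be declared in the same configuration, the snippet below uses the `google_sql_database_instance` resource. The instance name comes from the example above; the local name `main`, the region, the database version, and the machine tier are illustrative assumptions, not values from the original setup.

```hcl
# Hypothetical instance backing the "us-db" database defined above.
resource "google_sql_database_instance" "main" {
  project          = "dummy-project"              # Same GCP project as the database
  name             = "project-database-instance"  # Name referenced by the database
  region           = "us-central1"                # Assumed region
  database_version = "POSTGRES_12"                # Assumed engine and version

  settings {
    tier = "db-f1-micro"                          # Assumed small shared-core tier
  }
}
```

With both resources in the same configuration, the database could reference `google_sql_database_instance.main.name` instead of a hard-coded string, which lets Terraform infer the creation order.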

Once a set of resources has been defined for a specific provider, the terraform CLI can be used to create a state file. This is the most important of all the files involved in using the tool.

The state file (terraform.tfstate) maps real world resources to your configuration, and it is used by Terraform when developers want to evolve an already existing set of resources. In short, it is the source of truth of what has been created within a cloud project.

If something is not in the state file, then it does not exist to Terraform.

Once a set of HCL files has been declared, using the Terraform CLI involves:

  1. Creating / getting the initial state: terraform init
  2. Checking the diff between the HCL files and the state: terraform plan
  3. Applying the diff between the HCL files and the state: terraform apply
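
As a rough sketch of what `terraform init` works from: besides preparing the state backend, it downloads the provider plugins that the configuration declares. The block below pins the Google provider; the version constraints are assumptions chosen for illustration and should be adapted to your project.

```hcl
terraform {
  # Assumed minimum CLI version; the "source" syntax below needs 0.13 or later
  required_version = ">= 0.13"

  required_providers {
    google = {
      source  = "hashicorp/google"  # Provider address in the public registry
      version = "~> 3.0"            # Assumed constraint: any 3.x release
    }
  }
}
```

Pinning versions this way keeps `plan` and `apply` reproducible between a developer's machine and any other environment running the same files.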

Benefits of using Terraform

You may be wondering why anyone would put this much effort into defining cloud resources as code, given that the seemingly easiest way to manage them is the Web UI that all these providers offer.

Well, using the UI is great for exploring and discovering products, but terrible when it comes to change traceability and evolution control.

Change traceability

As with anything associated with billing, it is important to know when infrastructure changes occur, and who made them.

The bad news is that your cloud provider most likely does not offer a complete history of changes for each resource, let alone the name of the user who made each change.

The good news is that we do not need it. Thanks to Terraform and HCL, we can keep the HCL files in a Git repository, which gives us the complete list of changes since the creation of the repo. Git to the rescue! 🚀

Evolution control

Given the source of truth role of the terraform.tfstate file, it is very common to place this file in a centralized bucket (remote location), controlling who can and cannot modify it.

For example, this is how to do it with a GCP bucket:

terraform {
  backend "gcs" {
    bucket  = "terraform-bucket"        # Desired bucket name
    prefix  = "state"                   # Folders before the .tfstate file
  }
}

This centralized way of operating with Terraform allows teams to control how changes are introduced by setting up a connection between their Git platform (e.g. GitHub) and the access to the remote bucket, so that only PR-reviewed changes translate into real world infrastructure evolution (and into .tfstate changes along the way).

Trust me, centralized infrastructure control is a nice property to have 😌.

Automatic provisioning 🤖

The logical next step after placing the Terraform resources in a Git repository is to implement some kind of Continuous Integration system that automates how code changes are propagated to real world infrastructure.

This automation is usually referred to as “Automatic provisioning”.

Automatic provisioning poses two main challenges:

  1. How is the Terraform repo C.I. going to authenticate with the cloud provider?
  2. How is the Terraform repo C.I. going to be defined?

Setup guide - CI Authentication

This section describes how to authenticate GitHub Actions runners with a Google Cloud Platform project. Note that authentication may work differently with other cloud providers.

  1. Create a GCP project.
  2. Create a Service account (S.A.) within that project.
  3. Give project Owner permissions to that S.A.
  4. Create a Service account key for that S.A.
  5. Create a GitHub secret containing the key from step 4.

Later on, that secret will be written to a file, and an env. variable called GOOGLE_APPLICATION_CREDENTIALS will point to it. CLI tools use this variable to authenticate with a GCP project, according to the GCP Authentication documentation.
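
On the Terraform side, no key needs to appear in the HCL files: the Google provider looks up `GOOGLE_APPLICATION_CREDENTIALS` on its own when no explicit credentials are configured. A minimal provider block, assuming the `dummy-project` project from the earlier database example and a hypothetical default region, could look like this:

```hcl
provider "google" {
  project = "dummy-project"  # Default project for resources that omit "project"
  region  = "us-central1"    # Hypothetical default region

  # No "credentials" argument on purpose: the provider falls back to the
  # key file that GOOGLE_APPLICATION_CREDENTIALS points to.
}
```

Keeping the key out of the configuration means the same files work unchanged on a developer's machine and on a CI runner, as long as each environment sets the variable.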

Setup guide - CI workflow

The last step involves creating a GitHub Actions workflow that automatically validates and applies Terraform changes upon PR merges.

A real world example may look like this:

name: Terraform apply CI

on:
  push:
    branches:
      - master
    paths:
      - "project/**/*.tf"

env:
  HASHICORP_REPO_KEY: "https://apt.releases.hashicorp.com/gpg"
  HASHICORP_REPO_NAME: "deb [arch=amd64] https://apt.releases.hashicorp.com focal main"
  TERRAFORM_LOCAL_PATH: "./project"
  TERRAFORM_VERSION: "0.13.4"


jobs:

  check:
    needs: []
    runs-on: ubuntu-20.04
    steps:

      # ------ Set up Terraform CLI ------
      - name: "Set up GitHub Actions"
        uses: actions/checkout@v2
      - name: "Add HashiCorp GPG key"
        run: curl -fsSL ${HASHICORP_REPO_KEY} | sudo apt-key add -
      - name: "Add HashiCorp repository"
        run: sudo apt-add-repository "${HASHICORP_REPO_NAME}"
      - name: "Install Terraform package"
        run: sudo apt-get --yes install terraform=${TERRAFORM_VERSION}

      # ------ Check Terraform format -----
      - name: "Check Terraform files"
        run: terraform fmt -check -recursive ${TERRAFORM_LOCAL_PATH}
  
  validate:
    needs: [check]
    runs-on: ubuntu-20.04
    steps:

      # ------ Set up Terraform CLI ------
      - name: "Set up GitHub Actions"
        uses: actions/checkout@v2
      - name: "Add HashiCorp GPG key"
        run: curl -fsSL ${HASHICORP_REPO_KEY} | sudo apt-key add -
      - name: "Add HashiCorp repository"
        run: sudo apt-add-repository "${HASHICORP_REPO_NAME}"
      - name: "Install Terraform package"
        run: sudo apt-get --yes install terraform=${TERRAFORM_VERSION}

      # ------ Validate Terraform syntax -----
      - name: "Initialize Terraform state"
        run: terraform init -backend=false ${TERRAFORM_LOCAL_PATH}
      - name: "Validate Terraform files"
        run: terraform validate ${TERRAFORM_LOCAL_PATH}

  apply:
    needs: [validate]
    runs-on: ubuntu-20.04
    env:
      PROJECT_SECRET_KEY: "${{ secrets.GCP_PROJECT_SA_KEY }}"
      PROJECT_SECRET_PATH: "./project_key.json"

      # This env. variable is set at job level so that
      # both 'init' and 'apply' commands can authenticate with GCP
      # Ref: https://cloud.google.com/docs/authentication/production
      GOOGLE_APPLICATION_CREDENTIALS: "./project_key.json"

    steps:

      # ------ Set up Terraform CLI ------
      - name: "Set up GitHub Actions"
        uses: actions/checkout@v2
      - name: "Add HashiCorp GPG key"
        run: curl -fsSL ${HASHICORP_REPO_KEY} | sudo apt-key add -
      - name: "Add HashiCorp repository"
        run: sudo apt-add-repository "${HASHICORP_REPO_NAME}"
      - name: "Install Terraform package"
        run: sudo apt-get --yes install terraform=${TERRAFORM_VERSION}

      # ------ Set up GCP credentials ------
      - name: "Write GCP Service Account key to disk"
        run: echo "${PROJECT_SECRET_KEY}" > "${PROJECT_SECRET_PATH}"

      # ------ Apply Terraform changes -----
      - name: "Initialize Terraform state"
        run: terraform init ${TERRAFORM_LOCAL_PATH}
      - name: "Apply Terraform changes"
        run: terraform apply -auto-approve ${TERRAFORM_LOCAL_PATH}

Clarifications for the workflow:

Advanced topics

This blog post is limited and does not cover all the features of Terraform.

As an advanced topic, I would suggest taking a look at the concept of a module. A module is nothing more than a packaged version of a resource or set of resources, which can be imported from a parent module in order to avoid duplicated definitions.

Modules become useful when defining very similar resources over and over again, as they help developers extract the common specification and reduce code duplication.

In addition, there is a public registry of Terraform modules, where different companies publish their curated ones (comparable to what Docker Hub is for Docker). Modules from the public registry are often complex definitions of multiple resources that together form a good architectural pattern (e.g. a UI-Backend-DB pattern).
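
To make the idea concrete, here is a minimal sketch of a local module wrapping the SQL database resource shown earlier, and of a parent module using it twice. The directory layout, the module names, the variable names, and the `eu-db` database are all hypothetical.

```hcl
# ---- File: modules/database/main.tf (hypothetical child module) ----

variable "project" {
  type        = string
  description = "GCP project that owns the database"
}

variable "name" {
  type        = string
  description = "Name of the database"
}

variable "instance" {
  type        = string
  description = "Name of an existing Cloud SQL instance"
}

resource "google_sql_database" "this" {
  project  = var.project
  name     = var.name
  charset  = "UTF8"        # Shared setting, defined once for every caller
  instance = var.instance
}

# ---- File: main.tf (parent module) ----

module "us_database" {
  source   = "./modules/database"
  project  = "dummy-project"
  name     = "us-db"
  instance = "project-database-instance"
}

module "eu_database" {
  source   = "./modules/database"
  project  = "dummy-project"
  name     = "eu-db"
  instance = "project-database-instance"
}
```

Each `module` block only states what differs between the two databases, while the shared settings live in one place.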

Summary

By combining what this post covers with the official Terraform documentation, technical readers can achieve a good understanding of what Terraform is, what its main benefits are, and how to set up a basic pipeline to apply automatic provisioning in their cloud projects.

Terraform involves a time investment at first, but it pays off in the long run. I hope this post helped you make that decision easier!

Sinclert Pérez

Sr Software Engineer @ Shopify
