GitOps: Detecting Drift in Your Terraform / Terragrunt Infrastructure

Hello everyone.





Disclaimer: I am writing this article on the fly. The code in it works, but it does not claim to be best practice, so please do not nitpick :) The purpose of the article is to convey the general principles to interested Russian-speaking readers, and perhaps to spark enough interest for you to dig into the topic yourself and build something much better and more interesting. So let's go!





Let's say you work with Terraform / Terragrunt (the latter is not essential here, but it is worth studying if you are not using it yet) and automate infrastructure, for example in AWS (but not necessarily AWS). The infrastructure lives as code in a repository and is deployed from it, so it would seem that this is GitOps happiness :)





Everything goes well until some user changes something by hand through the console / UI and, of course, forgets to tell anyone about it. Or even does something bad on purpose. And there is your drift: code and infrastructure no longer match! :(





To at least find out about this in time, the automation needs a little refinement.





As usual, there are many ways to get what you want. For example, a fast-developing utility has recently appeared on the horizon, https://github.com/cloudskiff/driftctl, which may well do more than what I present below. At the time of writing, however, driftctl at least does not support AWS provider v2 and cannot handle multiple regions, which rules it out for most serious projects. The authors promise to finish this within a month or two.





In the meantime, I will describe a very simple scheme and give a small code example for it:





  1. we create a pipeline that runs terraform plan either on a schedule (in GitLab you can use Pipeline schedules) or in a loop





  2. when changes (a diff) are found, the pipeline sends them somewhere, for example to Slack.
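The two steps above depend on telling "no changes" apart from "changes found". Here is a minimal sketch, assuming the plan is run with the -detailed-exitcode flag of terraform plan (0 means no changes, 1 means an error, 2 means changes are present). The function name classify_plan_exit is made up for illustration.

```shell
# Hypothetical helper: map the exit code of `terraform plan -detailed-exitcode`
# to a short status word.
classify_plan_exit() {
  case "$1" in
    0) echo "clean" ;;   # plan succeeded, nothing to change
    2) echo "drift" ;;   # plan succeeded, changes are present
    *) echo "error" ;;   # 1 or anything else: the plan itself failed
  esac
}

# Usage sketch (commented out so the snippet runs without terraform installed):
# ec=0; terraform plan -detailed-exitcode -lock=false >/dev/null || ec=$?
# [ "$(classify_plan_exit "$ec")" = "drift" ] && echo "notify Slack here"
```

Treating every unexpected code as an error is a deliberate choice in this sketch: a plan that could not run should never be reported as "no drift".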





If you want, the pipeline can go further than a notification: create an issue, call an API, run apply, or bring the state in line.





Suppose the live repository has the following structure:





account_1/
├── eu-central-1
│   ├── dev
│   │   ├── eks
│   │   │   ├── terragrunt.hcl
│   │   │   └── values.yaml
│   │   └── s3-bucket
│   │       ├── terragrunt.hcl
│   │       └── values.yaml
│   ├── prod
│   │   ├── eks
│   │   │   ├── terragrunt.hcl
│   │   │   └── values.yaml
│   │   └── s3-bucket
│   │       ├── terragrunt.hcl
│   │       └── values.yaml
│   └── staging
│       ├── eks
│       │   ├── terragrunt.hcl
│       │   └── values.yaml
│       └── s3-bucket
│           ├── terragrunt.hcl
│           └── values.yaml
├── us-east-1
│   ├── dev
│   │   ├── eks
│   │   │   ├── terragrunt.hcl
│   │   │   └── values.yaml
│   │   └── s3-bucket
│   │       ├── terragrunt.hcl
│   │       └── values.yaml
│   ├── prod
│   │   ├── eks
│   │   │   ├── terragrunt.hcl
│   │   │   └── values.yaml
│   │   └── s3-bucket
│   │       ├── terragrunt.hcl
│   │       └── values.yaml
│   └── staging
│       ├── eks
│       │   ├── terragrunt.hcl
│       │   └── values.yaml
│       └── s3-bucket
│           ├── terragrunt.hcl
│           └── values.yaml
└── terragrunt.hcl
      
      



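Inside account_1 there are 2 directories, us-east-1 and eu-central-1, named after AWS regions. In Terragrunt a folder/directory name can be picked up in the configuration with, for example, "${basename(get_terragrunt_dir())}"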







Each environment, in turn, has 2 components: eks and s3-bucket







So the path to each component's files looks like <account_name>/<region>/<environment>/<component>/*
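Because the path layout is fixed, a script can read the account, region, environment, and component straight from a path, for instance to put them into a notification. A minimal sketch, assuming paths are relative to the repository root. The function and variable names are hypothetical.

```shell
# Hypothetical helper: split "<account_name>/<region>/<environment>/<component>/..."
# into four variables. Anything after the component lands in _rest.
split_component_path() {
  # IFS=/ applies only to this read, so the caller's IFS is untouched
  IFS=/ read -r ACCOUNT REGION ENVIRONMENT COMPONENT _rest <<EOF
$1
EOF
}

split_component_path "account_1/eu-central-1/dev/s3-bucket/terragrunt.hcl"
echo "account=$ACCOUNT region=$REGION env=$ENVIRONMENT component=$COMPONENT"
# prints: account=account_1 region=eu-central-1 env=dev component=s3-bucket
```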







i.e. as a "mask": */*/*/<component>/*
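To see what such a mask actually matches, you can try it on a throwaway copy of the layout. The sketch below recreates part of the tree from the example in a temporary directory (nothing real is touched) and lists the directories that the s3-bucket mask hits. Note that the mask matches the files inside each component directory, so dirname is used to get the directory itself.

```shell
# Recreate a slice of the example layout under a temporary directory.
demo=$(mktemp -d)
for region in eu-central-1 us-east-1; do
  for env in dev prod staging; do
    for component in eks s3-bucket; do
      mkdir -p "$demo/account_1/$region/$env/$component"
      touch "$demo/account_1/$region/$env/$component/terragrunt.hcl"
    done
  done
done

# Expand the mask and reduce the matched files to their unique parent directories.
matches=$(cd "$demo" && for f in */*/*/s3-bucket/*; do dirname "$f"; done | sort -u)
echo "$matches"

rm -rf "$demo"
```

With this layout the mask yields six directories, one per region and environment, and skips every eks directory as well as the top-level terragrunt.hcl.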







As an example, let's take the s3-bucket component (any other component works the same way, of course).





Enable Incoming WebHooks in Slack and get a Webhook URL. Documentation: https://api.slack.com/messaging/webhooks
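A webhook expects a JSON body, and a terraform diff is full of quotes and newlines that break JSON if pasted in raw. Below is a minimal sketch of one way to escape the text first. The helper name json_escape and the WEBHOOK_URL variable are assumptions for illustration, and the sketch handles only backslashes, double quotes, and newlines.

```shell
# Hypothetical helper: escape a string for use inside a JSON string literal.
# Covers backslashes, double quotes and newlines (a trailing newline is dropped);
# tabs and other control characters are not handled in this sketch.
json_escape() {
  printf '%s' "$1" \
    | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' \
    | awk 'NR > 1 { printf "%s", "\\n" } { printf "%s", $0 }'
}

message='resource "aws_s3_bucket" changed
~ acl = "private"'
payload="{\"text\": \"DRIFT DETECTED!\\n$(json_escape "$message")\"}"
printf '%s\n' "$payload"

# Sending it (commented out; WEBHOOK_URL is a placeholder for your own URL):
# curl -X POST -H 'Content-type: application/json' --data "$payload" "$WEBHOOK_URL"
```

If jq is available in your pipeline image, building the payload with it is likely the more robust option; the pure-shell version is shown only because it needs nothing beyond sed and awk.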





Then a simple script like this one, run in the pipeline, will send the diffs to Slack:





#!/bin/bash

ROOT_DIR=$(pwd)

plan () {
  echo -e "$(date +'%H-%M-%S %d-%m-%Y') $F"

  CURRENT_DIR=$(pwd)
  PLAN=$CURRENT_DIR/plan.tfplan

  # -detailed-exitcode: 0 = no changes, 1 = error, 2 = changes present
  ec=0    # must be set explicitly: "|| ec=$?" only fires on a non-zero exit code
  terragrunt run-all plan --terragrunt-non-interactive -lock=false -detailed-exitcode -out="$PLAN" 2>/dev/null || ec=$?

  case $ec in
    0) echo "No Changes Found"; exit 0;;
    1) printf '%s\n' "Command exited with non-zero"; exit 1;;
    2) echo "Changes Found! Reporting!";

       MESSAGE=$(terragrunt show -no-color "$PLAN" | sed "s/\"/'/g");    # let's replace the double quotes from the diff with single as double quotes "break" the payload
       curl -X POST --data-urlencode "payload={\"channel\": \"#your-slack-channel-here\", \"username\": \"webhookbot\", \"text\": \"DRIFT DETECTED!!!\n ${MESSAGE}\", \"icon_emoji\": \":ghost:\"}" https://hooks.slack.com/services/YOUR/WEBHOOK/URL_HERE;;
  esac
}

N="$(($(grep -c ^processor /proc/cpuinfo)*4))"    # any number suitable for your situation goes here

for F in */*/*/s3-bucket/; do    # trailing slash: match the component directories, not the files inside them
  ((i=i%N)); ((i++==0)) && wait    # let's run only N jobs in parallel to speed up the process
  cd "$ROOT_DIR/$F" || continue    # skip anything we cannot enter
  plan &    # send the job to background to start the new one
done

wait    # let the last batch of background jobs finish before the script exits
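A script that sends jobs to the background does not see their exit codes by default, so a failed plan can go unnoticed by the pipeline. Here is a minimal sketch of one way to collect them, assuming you want the pipeline job to fail if any background job failed. The check function is a hypothetical stand-in for the real plan function.

```shell
# Hypothetical stand-in for the real plan function: succeeds for "ok-*" names.
check() {
  case "$1" in
    ok-*) return 0 ;;
    *)    return 1 ;;
  esac
}

pids=""
for name in ok-dev ok-staging bad-prod; do
  check "$name" &          # run in the background, like "plan &"
  pids="$pids $!"          # remember the PID of the job just started
done

failed=0
for pid in $pids; do
  wait "$pid" || failed=$((failed + 1))   # wait returns that job's exit status
done

echo "failed jobs: $failed"
# prints: failed jobs: 1
```

In a real pipeline the last line could be followed by a check such as [ "$failed" -eq 0 ] || exit 1, so that the job turns red when a plan could not run.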
      
      



After that, all that is left is to run it from the pipeline :)





That's it!





If you have questions or suggestions, feel free to reach out: @vainkop







PS: IMHO the https://github.com/cloudskiff/driftctl project is genuinely useful, solves the right problem, and has no good analogues, so please support its authors and, if you can, do your bit for open source.





Good mood to you all!







