Infrastructure as code is not documentation

2020-08-25

There's no lack of opinions on the role of comments in a codebase. From undergraduate computer science classes to grumpy IRC discussions, opinions range from the Gospel of Ludicrous Commentiquette to considering the mere existence of comments a code smell.

This is also an issue for systems engineers who are writing shell scripts, glue utilities, and thousands of lines of configuration management language. With tools like Pulumi, sysadmins are truly coding their infrastructure. So what is the role of code comments for sysadmins in this world? Is formalized documentation necessary if you can just read the Ansible and Terraform for a given environment?

I contend that Infrastructure as Code is not documentation, and comments are still necessary. Comments explain the why, and documentation communicates the intended state.

Explain the "why"

Let's use a small snippet of an Ansible playbook as an example. Here, we see Docker's upstream installation instructions for a CentOS machine, expressed as a set of tasks.

tasks:
  - name: Ensure old versions of Docker and utils are not installed
    yum:
      name:
        - docker
        - docker-client
        - docker-client-latest
        - docker-common
        - docker-latest
        - docker-latest-logrotate
        - docker-logrotate
        - docker-engine
      state: absent

  - name: Add Docker repository
    yum_repository:
      name: Docker
      description: Docker CentOS repo
      baseurl: https://download.docker.com/linux/centos/$releasever/$basearch/stable

  - name: Install Docker
    yum:
      name:
        - docker-ce
        - docker-ce-cli
        - containerd.io
      state: latest

The technical functionality here is clear. We remove old Docker versions if they are installed, add the Docker repository, and install the new Docker packages. However, the playbook leaves implementation questions unanswered. Why are we installing Docker from upstream rather than using our distribution's version? A quick comment here clarifies this for the reader and provides useful context that is not captured by the code itself.
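As a rough sketch of what such a comment might look like, here is the install task again with its rationale attached. The reason given in the comment is hypothetical, since the playbook above doesn't state one; substitute whatever actually drove the decision in your environment.

```yaml
- name: Install Docker
  # Hypothetical rationale, for illustration only: we install from Docker's
  # upstream repository because the distribution's package lags behind the
  # release our tooling expects.
  yum:
    name:
      - docker-ce
      - docker-ce-cli
      - containerd.io
    state: latest
```

The task itself is unchanged. The only addition is the context a future reader would otherwise have to reconstruct from chat logs or memory.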

Here's another example, where we're downloading a tarball of Harbor from GitHub.

- name: Download Harbor
  get_url:
    url: https://github.com/goharbor/harbor/releases/download/v2.0.1/harbor-onlineinstaller-v2.0.1.tgz
    dest: /root/harbor.tgz

Again, the technical implementation is clear, but the reasoning is not. Why are we downloading a tarball, and not installing from a package? The answer: There isn't an upstream Harbor package, so we're grabbing release tarballs from GitHub. This is useful information to place in a comment. When a system package becomes available, a different engineer can easily swap out this implementation and be confident that there isn't an additional technical reason for this decision.
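With that reasoning written down, the task might look something like the sketch below. The comment wording is mine, but the reason is the one given above.

```yaml
# Harbor does not publish an upstream package, so we download the release
# tarball from GitHub. If a system package becomes available, this task can
# be swapped out; there is no other technical reason for using the tarball.
- name: Download Harbor
  get_url:
    url: https://github.com/goharbor/harbor/releases/download/v2.0.1/harbor-onlineinstaller-v2.0.1.tgz
    dest: /root/harbor.tgz
```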

This idea extends far beyond software installation. Comments explain why a given load balancer is configured for transparent balancing rather than using X-Forwarded-For or why ip_conntrack_max is set to a specific value. Use comments to explain implementation choices that are not clearly communicated by the code itself.
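The same pattern applies to a kernel tunable. The following is a minimal sketch using Ansible's sysctl module. The key name (net.netfilter.nf_conntrack_max, the name used on newer kernels in place of ip_conntrack_max), the value, and the stated reason are all assumptions chosen for illustration, not recommendations.

```yaml
- name: Raise the connection tracking table limit
  # Hypothetical rationale, for illustration only: this host fronts many
  # short-lived connections and the default table size was being exhausted
  # at peak, causing dropped packets. The value below is an example figure.
  sysctl:
    name: net.netfilter.nf_conntrack_max
    value: "262144"
    state: present
```

Without the comment, the next engineer sees only a magic number and has no way to tell whether it is safe to change.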

Documentation and intended state

What do I mean when I say "intended state"? While it is the role of tools like Terraform and Ansible to enforce state, they only enforce the state that is defined by the codebase. Unless nobody in your organization ever makes a mistake, you can't assume that the state enforced by these tools is necessarily what was intended. This is where documentation comes in. Besides being a great way of forcing engineers to think holistically about a system, thorough documentation also defines the state the developer intended to enforce. This is distinct from the system's actual state, as enforced by the Infrastructure as Code tools.

Let's say you are working on a load balancer configuration. This load balancer has multiple internal and external interfaces, one for each backend service. When making some changes, you notice an internal backend service with a public IP address. This is the only service with that configuration. Without documentation, you have no way of knowing whether this is an intentional choice, or if someone accidentally added a public IP address to the internal interface configuration stanza.
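To make the ambiguity concrete, here is a hypothetical set of variables that a load balancer role might consume. The variable name, service names, and addresses are all invented for illustration; the addresses come from private and documentation-reserved ranges.

```yaml
# Hypothetical load balancer variables; every name and address here is
# illustrative and does not come from a real environment.
loadbalancer_backends:
  - name: billing-api
    scope: internal
    address: 10.0.12.21
  - name: inventory
    scope: internal
    address: 10.0.12.22
  - name: reporting
    scope: internal
    # Intentional, or a copy-and-paste mistake? The code alone cannot say.
    address: 203.0.113.40
```

The tooling will faithfully enforce all three entries. Only documentation of the intended design can tell you whether the third one is correct.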

Conclusion

Comments and documentation are part of being a good team member. Though you may never need to review them, you are doing a service to your teammates and anyone who might work on that codebase in the future. Use comments to explain technical choices that are not explained by the code, and use documentation to articulate the intended state of a system.