
AI-Assisted IaC Reviews: Using Claude and Copilot to Audit Your Azure Terraform

Published · 12 min read

Executive technology leader responsible for platform reliability, cloud operations, security posture, and enterprise technology risk within an investor-backed fintech environment. I lead technology operations at the intersection of engineering execution, governance, and business outcomes — ensuring platforms are scalable, resilient, and trusted by investors, regulators, and clients.

Currently VP of DevOps at InvestorFlow, where I focus on building board-ready technology operations, strengthening risk and resilience, and shaping long-term platform strategy to support growth and regulatory confidence.

If you've been writing Terraform for Azure for any length of time, you'll know that code reviews are where most of the real quality work happens. A plan might be clean, the pipeline might be green, and yet the pull request can still contain a storage account with public network access enabled, a key vault without purge protection, or a resource sitting in the wrong subscription. These are the things a second pair of eyes catches, assuming that second pair of eyes has the time and context to actually look.

The reality for most teams is that reviewers are stretched. You end up approving things quickly because the author is blocked, or because the diff is 400 lines and you trust the person who wrote it. This is the gap that AI-assisted reviews are starting to fill, and over the last year I've been leaning on Claude and GitHub Copilot as a genuine review layer rather than just an autocomplete tool. In this post I'll walk through how I use them to audit Terraform for Azure, where they add real value, and where I'd still be cautious about trusting them.

Where AI fits into the review workflow

The way I think about this is in layers. You already have terraform validate, terraform fmt, and a plan output. On top of that you likely have static analysis from something like Checkov, tfsec, or Snyk. These tools are excellent at pattern matching against a known ruleset, and they should stay in the pipeline.

What AI brings is the layer above that, the reasoning layer. A static scanner can tell you that public_network_access_enabled = true is flagged by rule AZURE-STORAGE-002. What it can't do is look at your whole configuration, notice that you've also defined a private endpoint on the same resource, and work out whether the flag is a genuine misconfiguration or a transitional state during a migration. That kind of contextual reading is where LLMs are actually useful.
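To make that concrete, here's a sketch of the situation described above (the resource names and referenced resources are hypothetical): a rule-based scanner flags the first resource every time, but the private endpoint defined alongside it is what an LLM can take into account.

```hcl
resource "azurerm_storage_account" "data" {
  name                     = "stdataprduks"
  resource_group_name      = azurerm_resource_group.main.name
  location                 = azurerm_resource_group.main.location
  account_tier             = "Standard"
  account_replication_type = "LRS"

  # Flagged by static analysis, but deliberately left on
  # during a migration window while clients move to the
  # private endpoint below.
  public_network_access_enabled = true
}

resource "azurerm_private_endpoint" "data" {
  name                = "pe-stdataprduks"
  resource_group_name = azurerm_resource_group.main.name
  location            = azurerm_resource_group.main.location
  subnet_id           = azurerm_subnet.private.id

  private_service_connection {
    name                           = "psc-stdataprduks"
    private_connection_resource_id = azurerm_storage_account.data.id
    subresource_names              = ["blob"]
    is_manual_connection           = false
  }
}
```

Whether the flag on the first resource is a finding or a known transitional state is exactly the judgement call a scanner can't make and a model can at least reason about.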

I use AI reviews in three places in the workflow:

  1. Before I commit, inside VS Code using Copilot Chat or Claude

  2. On a pull request, either through GitHub Copilot's code review or a custom workflow calling Claude's API

  3. Ad-hoc, when I want a deeper look at a module before it gets reused across environments

Setting up Copilot for Terraform reviews in VS Code

Copilot's chat panel is the quickest way to get started. Assuming you already have the GitHub Copilot extension installed and you're signed in with a licence that includes Copilot Chat, you can open the chat pane and point it at an open file or a selection.

The trick that makes the output actually useful is being specific with the prompt. "Review this Terraform" will give you a generic response. Something like this gives you a useful one:

```text
You are reviewing Terraform for Azure. Check the open file for:

- Security misconfigurations (public access, weak TLS, missing encryption)
- Azure naming convention violations against CAF
- Missing tags required by our policy (environment, owner, cost-centre)
- Resources that should be using Azure Verified Modules
- Any hardcoded values that should be variables

Report findings grouped by severity and reference the line number.
```

If I run that against a typical storage account block, Copilot will flag things like min_tls_version not being set, the absence of shared_access_key_enabled = false, and missing tags, and it'll point at the actual lines. It won't catch everything, but what it does catch is directionally useful.
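For illustration, here's a hypothetical example of the kind of block that prompt surfaces findings on (names and values are made up; the comments mark what gets flagged):

```hcl
resource "azurerm_storage_account" "example" {
  name                     = "stexampledev"
  resource_group_name      = "rg-example-dev" # hardcoded - should be a variable
  location                 = "uksouth"        # hardcoded - should be a variable
  account_tier             = "Standard"
  account_replication_type = "LRS"

  # min_tls_version not set - older provider versions defaulted
  # to allowing TLS 1.0/1.1
  # shared_access_key_enabled not set - defaults to true
  # no tags block - misses environment, owner, cost-centre
}
```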

Using Claude for deeper module audits

Where Claude tends to earn its keep is on a full module rather than a single file. Because the context window is larger, I can paste in a main.tf, variables.tf, outputs.tf, and even the versions.tf and ask for a holistic review.

The prompt structure I've settled on looks something like this:

```text
Act as a senior Azure platform engineer reviewing a Terraform module for a production deployment. The module provisions [brief description].

Please review against:
1. Azure Well-Architected Framework pillars (security, reliability, cost)
2. Terraform best practices (variable validation, locals, module structure)
3. Azure CAF naming and tagging
4. Secrets handling - flag anything that could end up in state or logs
5. Anything that would fail a Checkov or tfsec scan

Be specific with file and resource references. Do not repeat the code back to me, just give the findings and the suggested fix.
```

The "do not repeat the code back" line matters more than it seems. Without it, you end up with a 4,000-word response that's mostly a reprint of what you already gave it, and the actual findings get buried.

A real example: last month I was reviewing a module that provisioned an Azure Function App with a linked storage account. Claude flagged that the storage account used for the function runtime had public_network_access_enabled = true and pointed out that while this was acceptable for the function to work, I should either lock it down with a service endpoint and a network rule, or move the function to a Premium plan with VNet integration and use a private endpoint. None of the static scanners I had in the pipeline made that connection between the two resources; they just flagged the storage account in isolation.
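The first of those two fixes, sketched with hypothetical names and a referenced subnet that isn't shown here, looks roughly like this:

```hcl
resource "azurerm_storage_account" "func" {
  name                     = "stfuncprduks"
  resource_group_name      = azurerm_resource_group.main.name
  location                 = azurerm_resource_group.main.location
  account_tier             = "Standard"
  account_replication_type = "LRS"

  # Still reachable over the public endpoint, but constrained
  # by the network rules below.
  public_network_access_enabled = true

  network_rules {
    default_action = "Deny"
    bypass         = ["AzureServices"]

    # The subnet must have the Microsoft.Storage service
    # endpoint enabled for this to take effect.
    virtual_network_subnet_ids = [azurerm_subnet.functions.id]
  }
}
```

The Premium plan with VNet integration and a private endpoint is the stronger option, but this is the smaller change when you need the function working today.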

Wiring Copilot into your PR process

If you want this as part of your PR process rather than something you run manually, GitHub Copilot code review is the path of least resistance. There is no Actions workflow to maintain, no API key to rotate, and the output lands as inline PR comments in the same review surface your team already uses.

The first thing to know, and this tripped me up when I first tried to set it up, is that Copilot cannot be added to a CODEOWNERS file. CODEOWNERS only accepts GitHub users and team slugs, and Copilot is neither. If you try to add it, the validation just ignores the entry. The correct mechanism is a branch ruleset that automatically requests a Copilot review when a PR is opened against a protected branch.

Enabling Copilot code review on the repository

You can turn this on at the repository level or across an organisation. For a single repo, it's under Settings > Rules > Rulesets > New branch ruleset. Give the ruleset a name, set the target branch to main (or whatever your default is), and under Branch rules tick Automatically request Copilot code review.

There are two options worth turning on:

  1. Review new pushes - so Copilot re-reviews when the author pushes additional commits, rather than just the initial PR.

  2. Review draft pull requests - useful if you want feedback before you mark the PR as ready for human review.

Once the ruleset is saved, any PR against the target branch will have Copilot added as a reviewer automatically. You'll see it appear in the reviewers list alongside any human reviewers assigned by your existing CODEOWNERS.

Making Copilot Azure-aware with custom instructions

Out of the box, Copilot's review is generic. It knows it's looking at Terraform, but it doesn't know anything about your Azure standards, your tagging policy, or the modules your team has agreed to use. This is where custom instructions come in, and it's where most of the actual value lives.

Copilot reads a file at .github/copilot-instructions.md for repository-wide guidance. There's a 4,000 character limit so you have to be deliberate about what goes in it. Here's the instructions file I use on Azure Terraform repositories:

```markdown
# Copilot Code Review Instructions

This repository contains Terraform code that deploys infrastructure to Azure. When performing a code review, apply the following checks.

## Security
- Flag any resource with `public_network_access_enabled = true` unless a private endpoint is also defined in the same file.
- Flag storage accounts without `min_tls_version = "TLS1_2"` and `shared_access_key_enabled = false`.
- Flag Key Vaults without `purge_protection_enabled = true` and `soft_delete_retention_days` set to at least 7.
- Flag any hardcoded secrets, connection strings, or subscription IDs. These should be variables or Key Vault references.
- Flag any resource using `azurerm_role_assignment` with a built-in role that grants broader access than needed (e.g. Owner or Contributor at subscription scope).

## Naming and tagging
- Resource names must follow the Cloud Adoption Framework naming convention: `<resource-abbreviation>-<workload>-<environment>-<region>`.
- All resources must include tags for `environment`, `owner`, and `cost-centre`. Flag any resource missing one or more of these.

## Terraform best practice
- Flag variables without a `description` or without `validation` blocks where one would catch common input errors.
- Flag provider version constraints that use `>=` without an upper bound. Prefer `~>` to pin minor versions.
- Prefer Azure Verified Modules (AVM) over raw `azurerm_` resources where an equivalent AVM exists.
- Flag outputs that expose sensitive values without `sensitive = true`.

## Report format
Group findings by severity (High, Medium, Low). Reference the file and resource address. Do not repeat code back in the review.
```

The thing I'd stress here is that custom instructions work best when they're specific and actionable. Telling Copilot to "review for security" is useless. Telling it to "flag any resource with public_network_access_enabled = true unless a private endpoint is defined in the same file" gives it something it can actually check against.
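As a quick illustration of what the "Terraform best practice" rules accept, here's a hypothetical variable and provider block that would pass them (the allowed environments and the version number are made-up examples):

```hcl
variable "environment" {
  type        = string
  description = "Deployment environment, used in resource names and tags."

  validation {
    condition     = contains(["dev", "test", "prod"], var.environment)
    error_message = "environment must be one of: dev, test, prod."
  }
}

terraform {
  required_providers {
    azurerm = {
      source = "hashicorp/azurerm"
      # Pinned to a minor version; a bare ">= 3.0" would be flagged
      # as having no upper bound.
      version = "~> 3.100"
    }
  }
}
```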

Path-scoped instructions for modules

If your repository has a mix of Terraform and other code, or you want different rules for different directories, you can use path-scoped instructions. These live in .github/instructions/*.instructions.md and use an applyTo frontmatter block with glob patterns.

For example, you might have a modules/ directory where you want stricter rules than in your root configuration:

```markdown
---
applyTo: "modules/**/*.tf"
---

# Module-specific review rules

- Every module must have `variables.tf`, `outputs.tf`, `main.tf`, and `versions.tf`.
- All variables must have a `description` and a `type`.
- All outputs must have a `description`.
- Modules must not hardcode resource group names, locations, or subscription IDs. These should come in as variables.
- `required_providers` in `versions.tf` must pin the provider to a specific minor version using `~>`.
```

Copilot applies the repo-wide instructions plus any path-scoped files that match the changed files in the PR.

Have AI help

Sometimes it's hard to know where to start with an instructions file, or how to update an existing one. I've found that describing in plain language what I want the agent to review or validate, and running that through something like Claude, speeds up the process and gives me a starting template I can then tighten up to match exactly how I want reviews done.

One gotcha worth knowing

Custom instructions for automated PR reviews are still rolling out, and the behaviour isn't quite consistent with Copilot Chat reviews yet. I've had PRs where the instructions file was clearly not applied, because the findings didn't match any of the specific rules I'd written. The workaround, which feels daft but does work, is to remove Copilot as a reviewer and re-add it after a few minutes. This seems to force it to re-read the instructions file. GitHub has this on the roadmap to unify, but worth knowing if your reviews look generic when they shouldn't.

What AI is genuinely good at finding

After running these reviews across a few hundred PRs now, there are patterns in what the models catch well and what they miss.

They're good at things like:

  • Missing min_tls_version on storage accounts and app services

  • Hardcoded secrets or connection strings that should be Key Vault references

  • Resources without tags when the rest of the codebase has a tagging standard

  • Inconsistent use of azurerm versus Azure Verified Modules

  • Variables without validation blocks where one would catch common mistakes

  • Outputs marked as sensitive incorrectly, or not marked when they should be

  • Provider version constraints that are too loose

What they still miss

Where they fall over, and where I'd still want human review, is anything that needs context from outside the repo. If you've got a resource that looks misconfigured but is actually following a pattern agreed with your security team six months ago, the model doesn't know that. It'll flag it every time.

They also struggle with anything that depends on runtime state. A plan output tells you what's about to happen in Azure. A static review of the code can't always predict that, especially when for_each loops or dynamic blocks are involved. I've seen Claude confidently describe a loop as iterating over a list when it's actually iterating over a map, because the variable definition wasn't in the context I'd given it.
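Here's a small illustration of that list-versus-map confusion, with hypothetical names. The variable definition is what disambiguates the loop; leave it out of the pasted context and the model has to guess what each.key and each.value mean:

```hcl
# Without this definition in context, a reviewer (human or model)
# can't tell whether the for_each below iterates a list or a map.
variable "subnets" {
  type = map(object({
    address_prefix = string
  }))
}

resource "azurerm_subnet" "this" {
  for_each             = var.subnets
  name                 = each.key # only meaningful because this is a map
  resource_group_name  = var.resource_group_name
  virtual_network_name = var.vnet_name
  address_prefixes     = [each.value.address_prefix]
}
```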

And then there's the hallucination problem. Every so often a model will suggest a resource argument that doesn't exist, or recommend a provider feature that was removed two versions ago. This happens less now than it did a year ago, but it still happens, and if you act on the suggestion without checking, you end up with a configuration that doesn't even validate.

The rule I apply is: treat AI review findings like findings from a junior engineer who has read the AzureRM provider docs cover to cover but has never actually deployed anything. The pattern recognition is strong, the context is weak, and the confidence level doesn't always match the accuracy.

My thoughts on AI in the review loop

AI-assisted reviews are not replacing your static analysis tools, and they are not replacing human reviewers. What they are doing is catching a class of issue that sits in between those two, the contextual stuff that a rule-based scanner will never catch and that a time-pressured human reviewer might miss on a busy day.

The teams I see getting the most value from this are the ones treating it as an additional gate rather than a replacement for anything. Checkov or Snyk runs on every commit. The AI review runs on every PR. A human reviewer still has to approve the merge. Each layer catches something the others don't, and the cost of adding the AI layer is surprisingly low once the workflow is in place.
