Reusable Terraform Modules
Author: Venkata Sudhakar
A Terraform module is a self-contained package of Terraform configuration files in a single directory that represents a reusable unit of infrastructure. Instead of copy-pasting the same Cloud SQL or GKE configuration across multiple projects, you write it once as a module with well-defined input variables and output values, then call the module from different root configurations with different arguments. This is the infrastructure equivalent of writing a library function instead of duplicating code.

Every Terraform configuration is technically a module: the "root module" is the directory where you run terraform apply, and child modules are referenced from it with a module block. A module can be sourced from a local directory path, a generic Git repository URL, a GitHub/GitLab URL, or a Terraform Registry address (registry.terraform.io).

The public Terraform Registry hosts thousands of community modules for common patterns. For enterprise teams, however, it is best practice to maintain internal private modules (in a private registry such as Terraform Cloud or Spacelift, or in a Git repository) to enforce company standards on tagging, networking, and security settings.

The following example builds a reusable GCS data lake module and then calls it from two different environments (dev and prod) with different variable values.
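A minimal sketch of such a module and its two callers might look like the following. The directory layout, the variable names (project_id, environment, location, data_engineer_member), the viewer role, and the group address are assumptions chosen to be consistent with the resource addresses and bucket URLs in the output that follows; treat it as an illustration rather than a definitive implementation.

```hcl
# ---- modules/gcs_data_lake/variables.tf (hypothetical layout) ----
variable "project_id" {
  type        = string
  description = "GCP project that owns the bucket."
}

variable "environment" {
  type        = string
  description = "Environment name, e.g. dev or prod."
}

variable "location" {
  type        = string
  default     = "US"
  description = "Bucket location (assumed default)."
}

variable "data_engineer_member" {
  type        = string
  description = "IAM member granted read access, e.g. group:name@example.com."
}

# ---- modules/gcs_data_lake/main.tf ----
resource "google_storage_bucket" "this" {
  # Yields e.g. my-gcp-project-dev-data-lake
  name                        = "${var.project_id}-${var.environment}-data-lake"
  project                     = var.project_id
  location                    = var.location
  uniform_bucket_level_access = true

  labels = {
    environment = var.environment
  }
}

resource "google_storage_bucket_iam_member" "data_engineer_read" {
  bucket = google_storage_bucket.this.name
  role   = "roles/storage.objectViewer" # assumed read-only role
  member = var.data_engineer_member
}

# ---- modules/gcs_data_lake/outputs.tf ----
output "bucket_url" {
  # The provider exposes the gs://<bucket-name> form as the url attribute.
  value = google_storage_bucket.this.url
}

# ---- environments/dev/main.tf (root module for dev) ----
module "dev_data_lake" {
  source = "../../modules/gcs_data_lake"

  project_id           = "my-gcp-project"
  environment          = "dev"
  data_engineer_member = "group:data-engineers@example.com" # placeholder
}

output "dev_bucket_url" {
  value = module.dev_data_lake.bucket_url
}

# ---- environments/prod/main.tf (root module for prod) ----
module "prod_data_lake" {
  source = "../../modules/gcs_data_lake"

  project_id           = "my-gcp-project"
  environment          = "prod"
  data_engineer_member = "group:data-engineers@example.com" # placeholder
}

output "prod_bucket_url" {
  value = module.prod_data_lake.bucket_url
}
```

Each environment directory is its own root module, so terraform init and terraform apply are run once in environments/dev and once in environments/prod. Provider configuration and state backends are omitted here for brevity.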
It produces the following output:
# Dev apply:
module.dev_data_lake.google_storage_bucket.this: Creating...
module.dev_data_lake.google_storage_bucket.this: Creation complete after 2s
module.dev_data_lake.google_storage_bucket_iam_member.data_engineer_read: Creating...
Apply complete! Resources: 2 added, 0 changed, 0 destroyed.
Outputs:
dev_bucket_url = "gs://my-gcp-project-dev-data-lake"
# Prod apply:
Apply complete! Resources: 2 added, 0 changed, 0 destroyed.
Outputs:
prod_bucket_url = "gs://my-gcp-project-prod-data-lake"
# Same module, different configs - zero code duplication.
# Updating the module once updates both dev and prod on next apply.
Module best practices:
Keep each module focused on one purpose - A gcs_data_lake module should only create GCS resources (the bucket and its IAM bindings). Combining GCS, Cloud SQL, and GKE in one module makes it hard to reuse any one part on its own.
Version your modules - Use Git tags (v1.0.0, v1.1.0) and reference modules by tag: source = "git::https://github.com/myorg/terraform-modules.git//gcs_data_lake?ref=v1.2.0". Pinning a tag prevents unintended changes when the module is updated; callers pick up a new version only when they change the ref.
Always define outputs - Every module should output the IDs, names, and connection strings of the resources it creates, so callers can chain modules together.
Use for_each with modules - A single module block can be instantiated multiple times (Terraform 0.13 and later): for_each = var.environments creates one bucket per environment from a single module call, provided var.environments is a map or a set of strings.
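The version-pinning and for_each practices can be combined in one root configuration, sketched below. The module's input names (project_id, environment, data_engineer_member) and its bucket_url output are assumptions about the gcs_data_lake module's interface, and the group address is a placeholder; the source URL is the one quoted above.

```hcl
variable "environments" {
  # for_each on a module needs a map or a set of strings.
  type    = set(string)
  default = ["dev", "prod"]
}

module "data_lake" {
  # Pinned to a Git tag, so module updates are picked up only when ref changes.
  source   = "git::https://github.com/myorg/terraform-modules.git//gcs_data_lake?ref=v1.2.0"
  for_each = var.environments

  project_id           = "my-gcp-project"
  environment          = each.key                           # "dev", then "prod"
  data_engineer_member = "group:data-engineers@example.com" # placeholder
}

output "bucket_urls" {
  # Map of environment name to bucket URL, built from each module instance.
  value = { for env, lake in module.data_lake : env => lake.bucket_url }
}
```

With for_each, the instances are addressed as module.data_lake["dev"] and module.data_lake["prod"]. Both live in one state, which is a different trade-off from keeping a separate root configuration per environment.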