Infrastructure as Code for Data Engineering
Terraform fundamentals, provisioning Snowflake warehouses, S3 buckets, Airflow environments, IAM roles, and managing data infrastructure with state, modules, and CI/CD.
Infrastructure as Code — Why It Matters for Data Platforms
A data platform is built on infrastructure: S3 buckets, Snowflake warehouses, IAM roles, Airflow environments, VPCs, security groups, and dozens of other cloud resources. When this infrastructure is created manually through cloud consoles, it becomes invisible — nobody knows exactly what exists, who created it, or why. The dev environment drifts from production. A new environment takes a week to set up. A misconfigured IAM role exposes PII to the wrong team.
Infrastructure as Code treats cloud resources like software: defined in version-controlled files, reviewed through pull requests, tested in CI, and deployed through an automated pipeline. A data engineer who can write Terraform to provision a complete data platform environment in one command has a significant operational advantage over one who creates resources manually.
Terraform Fundamentals — The Core Concepts Every Data Engineer Needs
Terraform is the dominant IaC tool in 2026. It has providers for every major cloud (AWS, Azure, GCP) and for data tools like Snowflake, Databricks, and Confluent. Understanding the core concepts — providers, resources, state, plan, and apply — is sufficient to manage most data platform infrastructure.
TERRAFORM WORKFLOW:
terraform init ← download provider plugins, initialise backend
terraform plan ← show what will change (no actual changes made)
terraform apply ← apply the changes (creates/updates/destroys resources)
terraform destroy ← tear down all resources in the configuration
PROVIDER: the plugin that talks to a cloud or service API
# providers.tf
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
snowflake = {
source = "Snowflake-Labs/snowflake"
version = "~> 0.87"
}
}
# Remote state backend (required for team use):
backend "s3" {
bucket = "freshmart-terraform-state"
key = "data-platform/terraform.tfstate"
region = "ap-south-1"
encrypt = true
dynamodb_table = "freshmart-terraform-locks" # prevents concurrent applies
}
}
provider "aws" {
region = var.aws_region
}
provider "snowflake" {
account = var.snowflake_account
username = var.snowflake_user
password = var.snowflake_password
role = "SYSADMIN"
}
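The `var.snowflake_password` referenced above is never declared elsewhere in this configuration. A minimal sketch of the missing declaration, assuming the credential is injected through the environment (Terraform reads any `TF_VAR_<name>` environment variable into the matching input variable):

```hcl
# Declared once; populated via `export TF_VAR_snowflake_password=...`
# in CI or a local shell — never committed to a .tf or .tfvars file.
variable "snowflake_password" {
  description = "Password for the Terraform service user"
  type        = string
  sensitive   = true # redacted in plan and apply output
}
```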
RESOURCE: a single infrastructure object (S3 bucket, Snowflake warehouse, etc.)
# Each resource block = one real cloud resource
resource "aws_s3_bucket" "data_lake" {
bucket = "freshmart-data-lake-${var.environment}" # freshmart-data-lake-prod
tags = local.common_tags
}
STATE: Terraform's record of what it has created
Stored in: S3 (remote, for teams) or locally (terraform.tfstate in the working directory)
Contains: mapping from resource blocks → actual cloud resource IDs
Critical: never edit manually, never delete. If lost, every resource must be re-imported by hand — expensive to recover.
Remote state: use S3 + DynamoDB lock (prevents two engineers applying simultaneously)
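If state is lost, resources must be re-adopted one at a time. A sketch using the declarative `import` block (Terraform 1.5+); the bucket name is assumed to match the examples below:

```hcl
# Re-adopt an existing bucket into state without recreating it.
# `terraform plan` previews the import; `terraform apply` records it in state.
import {
  to = aws_s3_bucket.data_lake
  id = "freshmart-data-lake-prod" # the real bucket name in AWS
}
```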
PLAN OUTPUT (what to read before every apply):
terraform plan
# Terraform will perform the following actions:
# aws_s3_bucket.data_lake will be created (+)
+ resource "aws_s3_bucket" "data_lake" {
+ bucket = "freshmart-data-lake-prod"
+ id = (known after apply)
}
# aws_s3_bucket.staging will be destroyed (-)
- resource "aws_s3_bucket" "staging" {
- bucket = "freshmart-staging-old"
- id = "freshmart-staging-old"
}
# aws_snowflake_warehouse.analytics will be updated in-place (~)
~ resource "snowflake_warehouse" "analytics" {
~ warehouse_size = "SMALL" → "MEDIUM"
}
Plan: 1 to add, 1 to change, 1 to destroy.
READ THE PLAN CAREFULLY before apply.
(-) destroy: something will be permanently deleted. Understand why.
(~) update: in-place change. Usually safe.
(+) create: new resource. Check the configuration.
-/+ replace: resource must be destroyed and recreated (data loss risk).
Variables, Outputs, and Locals — Making Terraform Reusable
# variables.tf — declare inputs
variable "environment" {
description = "Deployment environment: dev, staging, or prod"
type = string
validation {
condition = contains(["dev", "staging", "prod"], var.environment)
error_message = "Environment must be dev, staging, or prod."
}
}
variable "aws_region" {
description = "AWS region for all resources"
type = string
default = "ap-south-1"
}
variable "snowflake_account" {
description = "Snowflake account identifier"
type = string
sensitive = true # marked sensitive: not shown in plan output
}
variable "data_retention_days" {
description = "Number of days to retain data in S3 Standard before transition"
type = number
default = 90
}
# locals.tf — computed values used throughout the configuration
locals {
name_prefix = "freshmart-${var.environment}"
common_tags = {
Environment = var.environment
Project = "freshmart-data-platform"
ManagedBy = "terraform"
Team = "data-engineering"
}
# Size overrides per environment:
snowflake_warehouse_size = {
dev = "X-SMALL"
staging = "SMALL"
prod = "MEDIUM"
}
}
# outputs.tf — values to expose after apply (useful for other modules)
output "data_lake_bucket_name" {
description = "Name of the S3 data lake bucket"
value = aws_s3_bucket.data_lake.id
}
output "data_lake_bucket_arn" {
description = "ARN of the S3 data lake bucket"
value = aws_s3_bucket.data_lake.arn
}
output "snowflake_pipeline_role" {
description = "Name of the Snowflake role for pipeline service accounts"
value = snowflake_role.pipeline.name
}
# ENVIRONMENT-SPECIFIC VARIABLE FILES:
# terraform/environments/prod.tfvars
environment = "prod"
aws_region = "ap-south-1"
data_retention_days = 365
# terraform/environments/dev.tfvars
environment = "dev"
aws_region = "ap-south-1"
data_retention_days = 30
# Deploy to prod: terraform apply -var-file=environments/prod.tfvars
# Deploy to dev: terraform apply -var-file=environments/dev.tfvars
Provisioning an S3 Data Lake — Complete Terraform Configuration
The S3 data lake is the foundation of the Medallion Architecture. Its Terraform configuration covers the bucket, encryption, versioning, lifecycle policies, access logging, and the bucket policies that implement zone-based access control — all in version-controlled code.
# s3.tf — data lake bucket with all production settings
resource "aws_s3_bucket" "data_lake" {
bucket = "${local.name_prefix}-data-lake"
force_destroy = var.environment == "dev" # only allow destroy in dev
tags = local.common_tags
}
# Encryption: all objects encrypted with KMS key
resource "aws_s3_bucket_server_side_encryption_configuration" "data_lake" {
bucket = aws_s3_bucket.data_lake.id
rule {
apply_server_side_encryption_by_default {
sse_algorithm = "aws:kms"
kms_master_key_id = aws_kms_key.data_lake.arn
}
bucket_key_enabled = true # reduces KMS API calls and cost
}
}
# Block all public access (critical for data lakes)
resource "aws_s3_bucket_public_access_block" "data_lake" {
bucket = aws_s3_bucket.data_lake.id
block_public_acls = true
block_public_policy = true
ignore_public_acls = true
restrict_public_buckets = true
}
# Versioning: enables recovery from accidental deletion
resource "aws_s3_bucket_versioning" "data_lake" {
bucket = aws_s3_bucket.data_lake.id
versioning_configuration {
status = "Enabled"
}
}
# Lifecycle: transition objects through storage tiers automatically
resource "aws_s3_bucket_lifecycle_configuration" "data_lake" {
bucket = aws_s3_bucket.data_lake.id
depends_on = [aws_s3_bucket_versioning.data_lake]
# Landing zone: short-lived raw files
rule {
id = "landing-zone-expiry"
status = "Enabled"
filter { prefix = "landing/" }
expiration { days = 30 }
# Files older than 30 days deleted automatically
}
# Bronze: transition to cheaper storage after 90 days
rule {
id = "bronze-tiering"
status = "Enabled"
filter { prefix = "bronze/" }
transition {
days = 90
storage_class = "STANDARD_IA"
}
transition {
days = 365
storage_class = "GLACIER"
}
noncurrent_version_expiration {
noncurrent_days = 30 # delete old versions after 30 days
}
}
# Silver and Gold: Standard-IA after 180 days. An `or` filter block cannot
# repeat `prefix`, so each zone gets its own rule:
rule {
id = "silver-tiering"
status = "Enabled"
filter { prefix = "silver/" }
transition {
days = 180
storage_class = "STANDARD_IA"
}
}
rule {
id = "gold-tiering"
status = "Enabled"
filter { prefix = "gold/" }
transition {
days = 180
storage_class = "STANDARD_IA"
}
}
}
# Access logging: who accessed what, for GDPR audit
resource "aws_s3_bucket" "access_logs" {
bucket = "${local.name_prefix}-data-lake-logs"
tags = local.common_tags
}
resource "aws_s3_bucket_logging" "data_lake" {
bucket = aws_s3_bucket.data_lake.id
target_bucket = aws_s3_bucket.access_logs.id
target_prefix = "s3-access-logs/"
}
# Bucket notification: trigger Lambda on new landing files
resource "aws_s3_bucket_notification" "data_lake" {
bucket = aws_s3_bucket.data_lake.id
lambda_function {
lambda_function_arn = aws_lambda_function.bronze_ingestion_trigger.arn
events = ["s3:ObjectCreated:*"]
filter_prefix = "landing/"
}
}
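One easy-to-miss dependency: S3 can only invoke the Lambda if the function's resource policy allows it, and without the permission the notification silently delivers nothing. A sketch of the required resource, assuming the `bronze_ingestion_trigger` function referenced above:

```hcl
# Grant S3 permission to invoke the ingestion Lambda for this bucket.
resource "aws_lambda_permission" "allow_s3_invoke" {
  statement_id  = "AllowS3Invoke"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.bronze_ingestion_trigger.function_name
  principal     = "s3.amazonaws.com"
  source_arn    = aws_s3_bucket.data_lake.arn # only this bucket may invoke
}
```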
# KMS key for encryption
resource "aws_kms_key" "data_lake" {
description = "FreshMart data lake encryption key"
deletion_window_in_days = 30
enable_key_rotation = true
tags = local.common_tags
}
resource "aws_kms_alias" "data_lake" {
name = "alias/${local.name_prefix}-data-lake"
target_key_id = aws_kms_key.data_lake.key_id
}IAM for Data Platforms — Least Privilege as Code
IAM is the access control layer for all AWS resources. For a data platform, four IAM roles cover the primary access patterns: ingestion pipelines (write to landing/bronze), transformation pipelines (read bronze, write silver/gold), analyst access (read silver/gold only), and the CI service account (read all, create/delete staging environments). Defining these in Terraform ensures the principle of least privilege is enforced consistently and that every change to it is reviewable.
# iam.tf — roles for the data platform access patterns
# ── INGESTION PIPELINE ROLE ────────────────────────────────────────────────────
resource "aws_iam_role" "pipeline_ingestion" {
name = "${local.name_prefix}-pipeline-ingestion"
assume_role_policy = data.aws_iam_policy_document.lambda_assume.json
tags = local.common_tags
}
resource "aws_iam_policy" "pipeline_ingestion" {
name = "${local.name_prefix}-pipeline-ingestion-policy"
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Sid = "WriteLanding"
Effect = "Allow"
Action = ["s3:PutObject", "s3:GetObject"]
Resource = [
"${aws_s3_bucket.data_lake.arn}/landing/*",
"${aws_s3_bucket.data_lake.arn}/bronze/*",
]
},
{
Sid = "ListBucket"
Effect = "Allow"
Action = ["s3:ListBucket", "s3:GetBucketLocation"]
Resource = aws_s3_bucket.data_lake.arn
Condition = {
StringLike = { "s3:prefix" = ["landing/*", "bronze/*"] }
}
},
{
Sid = "UseKMS"
Effect = "Allow"
Action = ["kms:GenerateDataKey", "kms:Decrypt"]
Resource = aws_kms_key.data_lake.arn
}
]
})
}
resource "aws_iam_role_policy_attachment" "ingestion_policy" {
role = aws_iam_role.pipeline_ingestion.name
policy_arn = aws_iam_policy.pipeline_ingestion.arn
}
# ── TRANSFORMATION PIPELINE ROLE (dbt, Spark) ─────────────────────────────────
resource "aws_iam_role" "pipeline_transform" {
name = "${local.name_prefix}-pipeline-transform"
assume_role_policy = data.aws_iam_policy_document.ec2_assume.json
tags = local.common_tags
}
resource "aws_iam_policy" "pipeline_transform" {
name = "${local.name_prefix}-pipeline-transform-policy"
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Sid = "ReadBronze"
Effect = "Allow"
Action = ["s3:GetObject"]
Resource = ["${aws_s3_bucket.data_lake.arn}/bronze/*"]
},
{
Sid = "ListPrefixes"
Effect = "Allow"
Action = ["s3:ListBucket"]
Resource = aws_s3_bucket.data_lake.arn
Condition = {
# prefix-scoped: this role cannot even enumerate landing/ keys (raw PII)
StringLike = { "s3:prefix" = ["bronze/*", "silver/*", "gold/*"] }
}
},
{
Sid = "WriteTransformed"
Effect = "Allow"
Action = ["s3:PutObject", "s3:DeleteObject", "s3:GetObject"]
Resource = [
"${aws_s3_bucket.data_lake.arn}/silver/*",
"${aws_s3_bucket.data_lake.arn}/gold/*",
]
},
# NO access to: landing/ (raw PII, ingestion only)
]
})
}
# ── ANALYST ROLE (read silver and gold, no raw PII) ───────────────────────────
resource "aws_iam_role" "analyst" {
name = "${local.name_prefix}-analyst"
assume_role_policy = data.aws_iam_policy_document.federated_assume.json
tags = local.common_tags
}
resource "aws_iam_policy" "analyst" {
name = "${local.name_prefix}-analyst-policy"
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Sid = "ReadAnalyticsLayers"
Effect = "Allow"
Action = ["s3:GetObject"]
Resource = [
"${aws_s3_bucket.data_lake.arn}/silver/*",
"${aws_s3_bucket.data_lake.arn}/gold/*",
]
# Explicit deny for PII columns handled at Athena/LakeFormation level
},
{
Sid = "ListForAnalytics"
Effect = "Allow"
Action = ["s3:ListBucket"]
Resource = aws_s3_bucket.data_lake.arn
Condition = {
StringLike = {
"s3:prefix" = ["silver/*", "gold/*"]
}
}
}
]
})
}
# Assume role policy documents:
data "aws_iam_policy_document" "lambda_assume" {
statement {
actions = ["sts:AssumeRole"]
principals {
type = "Service"
identifiers = ["lambda.amazonaws.com"]
}
}
}
data "aws_iam_policy_document" "ec2_assume" {
statement {
actions = ["sts:AssumeRole"]
principals {
type = "Service"
identifiers = ["ec2.amazonaws.com"]
}
}
}
data "aws_iam_policy_document" "federated_assume" {
statement {
actions = ["sts:AssumeRoleWithWebIdentity"]
principals {
type = "Federated"
identifiers = ["arn:aws:iam::${data.aws_caller_identity.current.account_id}:oidc-provider/accounts.google.com"]
}
}
}
Snowflake Infrastructure — Warehouses, Roles, and Databases in Terraform
Snowflake has a first-class Terraform provider maintained by Snowflake Labs. This allows the complete Snowflake account configuration — warehouses, databases, schemas, roles, grants, users, and resource monitors — to be managed as code. When a new analyst joins, adding them to the Snowflake platform is a one-line PR rather than a console click sequence.
# snowflake.tf — complete Snowflake infrastructure
# ── WAREHOUSES ────────────────────────────────────────────────────────────────
resource "snowflake_warehouse" "dbt_pipeline" {
name = "${upper(var.environment)}_DBT_PIPELINE_WH"
warehouse_size = lookup(local.snowflake_warehouse_size, var.environment, "SMALL")
auto_suspend = 300 # 5 min idle → suspend
auto_resume = true
max_cluster_count = 1
comment = "dbt transformation pipeline warehouse - ${var.environment}"
}
resource "snowflake_warehouse" "analyst" {
name = "${upper(var.environment)}_ANALYST_WH"
warehouse_size = "SMALL"
auto_suspend = 600 # 10 min idle
auto_resume = true
max_cluster_count = var.environment == "prod" ? 3 : 1
scaling_policy = var.environment == "prod" ? "ECONOMY" : "STANDARD"
comment = "Analyst self-service queries - ${var.environment}"
}
resource "snowflake_warehouse" "dashboard" {
name = "${upper(var.environment)}_DASHBOARD_WH"
warehouse_size = "X-SMALL"
auto_suspend = 60
auto_resume = true
comment = "BI tool service account warehouse - ${var.environment}"
}
# ── RESOURCE MONITOR — prevent runaway cost ───────────────────────────────────
resource "snowflake_resource_monitor" "monthly_limit" {
name = "${upper(var.environment)}_MONTHLY_MONITOR"
credit_quota = var.environment == "prod" ? 1000 : 100
notify_triggers = [75, 90] # alert at 75% and 90% of quota
suspend_triggers = [100] # suspend warehouses at 100%
suspend_immediately_triggers = [110] # hard stop at 110%
notify_users = ["data-team-lead@freshmart.com"]
}
# Attach the monitor via the warehouse's resource_monitor attribute.
# NOTE: add it to the snowflake_warehouse.dbt_pipeline resource above —
# a second resource block creating the same warehouse name would fail on apply:
#
#   resource "snowflake_warehouse" "dbt_pipeline" {
#     ...
#     resource_monitor = snowflake_resource_monitor.monthly_limit.name
#   }
# ── DATABASES AND SCHEMAS ─────────────────────────────────────────────────────
resource "snowflake_database" "freshmart" {
name = "FRESHMART_${upper(var.environment)}"
data_retention_time_in_days = var.environment == "prod" ? 30 : 1
comment = "FreshMart data platform - ${var.environment}"
}
resource "snowflake_schema" "bronze" {
database = snowflake_database.freshmart.name
name = "BRONZE"
data_retention_time_in_days = var.environment == "prod" ? 30 : 1
}
resource "snowflake_schema" "silver" {
database = snowflake_database.freshmart.name
name = "SILVER"
}
resource "snowflake_schema" "gold" {
database = snowflake_database.freshmart.name
name = "GOLD"
}
resource "snowflake_schema" "monitoring" {
database = snowflake_database.freshmart.name
name = "MONITORING"
}
# ── ROLES ─────────────────────────────────────────────────────────────────────
resource "snowflake_role" "pipeline" {
name = "${upper(var.environment)}_PIPELINE_ROLE"
comment = "dbt and Spark pipeline service accounts"
}
resource "snowflake_role" "analyst" {
name = "${upper(var.environment)}_ANALYST_ROLE"
comment = "Analyst read access to silver and gold"
}
resource "snowflake_role" "bi_service" {
name = "${upper(var.environment)}_BI_SERVICE_ROLE"
comment = "Metabase/Tableau service account - read gold only"
}
# ── GRANTS ────────────────────────────────────────────────────────────────────
# Pipeline role: read bronze, write silver + gold
resource "snowflake_schema_grant" "pipeline_bronze_read" {
database_name = snowflake_database.freshmart.name
schema_name = snowflake_schema.bronze.name
privilege = "USAGE"
roles = [snowflake_role.pipeline.name]
}
# (CREATE TABLE alone is not enough — the role also needs USAGE on the
# schema, granted analogously to the bronze USAGE grant above)
resource "snowflake_schema_grant" "pipeline_silver_write" {
database_name = snowflake_database.freshmart.name
schema_name = snowflake_schema.silver.name
privilege = "CREATE TABLE"
roles = [snowflake_role.pipeline.name]
}
# Analyst role: read silver + gold, NOT bronze
resource "snowflake_schema_grant" "analyst_silver" {
database_name = snowflake_database.freshmart.name
schema_name = snowflake_schema.silver.name
privilege = "USAGE"
roles = [snowflake_role.analyst.name]
}
resource "snowflake_table_grant" "analyst_silver_select" {
database_name = snowflake_database.freshmart.name
schema_name = snowflake_schema.silver.name
privilege = "SELECT"
roles = [snowflake_role.analyst.name]
on_future = true # applies to all future tables automatically
}
resource "snowflake_warehouse_grant" "analyst_warehouse" {
warehouse_name = snowflake_warehouse.analyst.name
privilege = "USAGE"
roles = [snowflake_role.analyst.name]
}
# ── USERS ─────────────────────────────────────────────────────────────────────
# Manage Snowflake users from a variable-driven config:
variable "snowflake_analysts" {
description = "List of analyst email addresses"
type = list(string)
default = []
}
resource "snowflake_user" "analysts" {
for_each = toset(var.snowflake_analysts)
name = replace(each.value, "@freshmart.com", "")
email = each.value
login_name = each.value
default_role = snowflake_role.analyst.name
default_warehouse = snowflake_warehouse.analyst.name
must_change_password = true
}
resource "snowflake_role_grants" "analysts" {
for_each = toset(var.snowflake_analysts)
role_name = snowflake_role.analyst.name
users = [replace(each.value, "@freshmart.com", "")]
enable_multiple_grants = true # each instance grants one user without revoking the others
depends_on = [snowflake_user.analysts]
}
Terraform Modules — Reusable Infrastructure Components
A Terraform module is a reusable, parameterised configuration for a set of related resources. For a data platform, common modules include: data_lake (S3 bucket with all production settings), snowflake_env (one Snowflake database + schemas + roles per environment), and airflow_env (MWAA or self-hosted Airflow deployment). Using modules means dev and prod use the same tested configuration with different variable values — no environment drift.
# MODULE STRUCTURE:
# modules/
# data_lake/
# main.tf ← resource definitions
# variables.tf ← input variables
# outputs.tf ← output values
# snowflake_env/
# main.tf
# variables.tf
# outputs.tf
# airflow_mwaa/
# main.tf
# variables.tf
# outputs.tf
# modules/data_lake/variables.tf
variable "environment" { type = string }
variable "aws_region" { type = string }
variable "retention_days_landing" {
type = number
default = 30
}
variable "retention_days_bronze" {
type = number
default = 365
}
variable "enable_versioning" {
type = bool
default = true
}
variable "tags" {
type = map(string)
default = {}
}
# modules/data_lake/main.tf — all S3 resources from Part 03
resource "aws_s3_bucket" "data_lake" {
bucket = "freshmart-${var.environment}-data-lake"
tags = merge(var.tags, { Environment = var.environment })
}
# ... (all the lifecycle, encryption, versioning resources)
# modules/data_lake/outputs.tf
output "bucket_id" { value = aws_s3_bucket.data_lake.id }
output "bucket_arn" { value = aws_s3_bucket.data_lake.arn }
output "kms_key_id" { value = aws_kms_key.data_lake.id }
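Outputs are how other stacks consume this module's resources. A sketch of reading them from another root module via remote state; the bucket and key are assumptions matching the backend configured earlier, and only outputs declared in the root module appear in its state (the root must re-export module outputs):

```hcl
# Read the data-platform stack's outputs from its remote state.
data "terraform_remote_state" "data_platform" {
  backend = "s3"
  config = {
    bucket = "freshmart-terraform-state"
    key    = "data-platform/terraform.tfstate"
    region = "ap-south-1"
  }
}

# e.g. wire the bucket name into another resource:
#   data.terraform_remote_state.data_platform.outputs.bucket_id
```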
# ROOT MODULE — uses modules for each environment:
# environments/prod/main.tf
module "data_lake_prod" {
source = "../../modules/data_lake"
environment = "prod"
aws_region = "ap-south-1"
retention_days_landing = 30
retention_days_bronze = 730 # 2 years for prod
enable_versioning = true
tags = {
Environment = "prod"
CostCenter = "data-platform"
}
}
module "snowflake_prod" {
source = "../../modules/snowflake_env"
environment = "prod"
snowflake_account = var.snowflake_account
warehouse_size_pipeline = "MEDIUM"
warehouse_size_analyst = "SMALL"
analyst_cluster_count = 3
data_retention_days = 30
analysts = [
"priya@freshmart.com",
"rahul@freshmart.com",
"ananya@freshmart.com",
]
}
# environments/dev/main.tf — SAME MODULES, different variables:
module "data_lake_dev" {
source = "../../modules/data_lake"
environment = "dev"
aws_region = "ap-south-1"
retention_days_landing = 7 # shorter retention in dev
retention_days_bronze = 30
enable_versioning = false # cheaper: no versioning in dev
tags = { Environment = "dev" }
}
module "snowflake_dev" {
source = "../../modules/snowflake_env"
environment = "dev"
snowflake_account = var.snowflake_account
warehouse_size_pipeline = "X-SMALL" # smaller for dev
warehouse_size_analyst = "X-SMALL"
analyst_cluster_count = 1
data_retention_days = 1
analysts = [] # dev uses personal credentials
}
CI/CD for Terraform — Safe Infrastructure Changes
Infrastructure changes carry higher risk than code changes — a wrong Terraform apply can delete a production S3 bucket or an IAM role that pipelines depend on. The CI/CD pipeline for Terraform must require a human review of the plan before any apply, and must apply in a controlled way that prevents concurrent runs.
# .github/workflows/terraform.yml
name: Terraform
on:
  pull_request:
    paths: ['terraform/**']
  push:
    branches: [main]
    paths: ['terraform/**']

env:
  AWS_REGION: ap-south-1
  TF_WORKING_DIR: terraform/environments/prod

jobs:
  terraform-plan:
    name: "Terraform Plan"
    runs-on: ubuntu-latest
    if: github.event_name == 'pull_request'
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: "1.7.0"
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_TERRAFORM_ROLE_ARN }}
          aws-region: ap-south-1
      - name: Terraform Init
        working-directory: ${{ env.TF_WORKING_DIR }}
        run: terraform init
      - name: Terraform Format Check
        run: terraform fmt -check -recursive terraform/
      - name: Terraform Validate
        working-directory: ${{ env.TF_WORKING_DIR }}
        run: terraform validate
      - name: Terraform Plan
        id: plan
        working-directory: ${{ env.TF_WORKING_DIR }}
        run: |
          terraform plan \
            -var-file=../../environments/prod.tfvars \
            -out=tfplan \
            -detailed-exitcode \
            2>&1 | tee plan_output.txt
        continue-on-error: true
      - name: Post Plan to PR
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const plan = fs.readFileSync('terraform/environments/prod/plan_output.txt', 'utf8');
            const truncated = plan.length > 60000 ? plan.slice(-60000) : plan;
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `## Terraform Plan
            <details><summary>Show Plan</summary>

            \`\`\`
            ${truncated}
            \`\`\`
            </details>`,
            });
      - name: Fail if plan errored
        if: steps.plan.outputs.exitcode == '1'
        run: exit 1

  terraform-apply:
    name: "Terraform Apply"
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main' && github.event_name == 'push'
    environment: production # requires manual approval in GitHub Environments
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
        with: { terraform_version: "1.7.0" }
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_TERRAFORM_ROLE_ARN }}
          aws-region: ap-south-1
      - name: Terraform Init
        working-directory: ${{ env.TF_WORKING_DIR }}
        run: terraform init
      - name: Terraform Apply
        working-directory: ${{ env.TF_WORKING_DIR }}
        run: |
          terraform apply \
            -var-file=../../environments/prod.tfvars \
            -auto-approve \
            -input=false
# KEY SAFETY FEATURES:
# 1. Plan on PR: shows what will change BEFORE merge
# 2. Human approval: GitHub Environments 'production' requires approval
# 3. DynamoDB locking: only one apply can run at a time (state locking)
# 4. Role assumption: CI uses a restricted IAM role, not admin keys
# 5. exit code check: plan exit code 1 = error, exit code 2 = changes (expected)
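Safety feature 4 (role assumption instead of long-lived keys) is usually implemented with GitHub's OIDC provider. A sketch of the CI role's trust policy; the account ID and repository name below are placeholders, not values from this configuration:

```hcl
# Trust policy for the CI role: only workflows from one repository may
# assume it, via GitHub's OIDC identity provider (no stored access keys).
data "aws_iam_policy_document" "github_actions_assume" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]
    principals {
      type        = "Federated"
      identifiers = ["arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com"]
    }
    condition {
      test     = "StringLike"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:freshmart/data-platform:*"] # restrict to this repo
    }
  }
}
```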
# PREVENTING ACCIDENTAL DESTROYS:
# lifecycle block prevents Terraform from destroying critical resources:
resource "aws_s3_bucket" "data_lake_prod" {
# ... bucket config ...
lifecycle {
prevent_destroy = true # terraform destroy will fail with an error
# To actually destroy: remove this block, plan, review, apply
}
}
# Same for production Snowflake database:
resource "snowflake_database" "freshmart_prod" {
name = "FRESHMART_PROD"
lifecycle {
prevent_destroy = true
}
}
Onboarding a New Data Engineer in 30 Minutes With IaC
Rahul Sharma joins FreshMart as a data engineer. Before IaC, onboarding took 3-5 days: manually creating an S3 prefix, requesting Snowflake access from IT, waiting for IAM role creation, configuring dbt profiles with manual credential lookup. With IaC, the entire environment is ready in 30 minutes with one PR.
# PR TITLE: feat(infra): add dev environment for rahul.sharma
# Step 1: Add new analyst to Snowflake users list
# terraform/environments/dev/main.tf — update analysts variable:
module "snowflake_dev" {
source = "../../modules/snowflake_env"
environment = "dev"
analysts = [
"priya@freshmart.com",
"rahul.sharma@freshmart.com", # ← ADD THIS LINE
]
}
# Step 2: Add developer S3 access prefix
# modules/s3_developer_access/main.tf:
resource "aws_iam_policy" "dev_s3_access" {
for_each = toset(var.developer_emails)
name = "freshmart-dev-${replace(each.value, "@freshmart.com", "")}-s3"
policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Effect = "Allow"
Action = ["s3:GetObject", "s3:PutObject", "s3:ListBucket"]
Resource = [
"arn:aws:s3:::freshmart-dev-data-lake/dev/${replace(each.value, "@freshmart.com", "")}/*",
"arn:aws:s3:::freshmart-dev-data-lake",
]
}]
})
}
# Step 3: Open PR → CI runs terraform plan → plan shows:
# + snowflake_user.analysts["rahul.sharma@freshmart.com"] will be created
# + snowflake_role_grants.analysts["rahul.sharma@freshmart.com"] will be created
# + aws_iam_policy.dev_s3_access["rahul.sharma@freshmart.com"] will be created
# Plan: 3 to add, 0 to change, 0 to destroy.
# Step 4: PR reviewed and merged → terraform apply runs
# → Snowflake user created with analyst role, temp password, MUST_CHANGE_PASSWORD=true
# → IAM policy created and attached to Rahul's AWS identity
# RAHUL'S ONBOARDING CHECKLIST (30 minutes total):
# [x] Data engineering lead opens PR with Rahul's email
# [x] PR reviewed, merged — Snowflake access provisioned automatically
# [x] Rahul receives email with temp Snowflake password (changes on first login)
# [x] Rahul clones the dbt repository
# [x] Rahul runs: export DBT_DEV_SCHEMA=dev_rahul_first_task
# [x] Rahul runs: dbt run --target dev --select +silver.orders (first dbt run)
# [x] Rahul queries his dev schema in Snowflake — data there immediately
# CONTRAST WITH MANUAL ONBOARDING (before IaC):
# Day 1: Submit Jira ticket for Snowflake access to IT helpdesk
# Day 2: Follow up on Jira ticket
# Day 3: IT creates Snowflake user (wrong role — analyst not pipeline)
# Day 3: Email IT to change role
# Day 4: Submit AWS access request form to security team
# Day 4: dbt setup fails — no Snowflake credentials documentation
# Day 5: Everything finally working — 5 days of frustration
# Manual: 5 days, 6 Slack messages, 2 Jira tickets, 1 frustrated engineer
# WITH IaC: 30 minutes, 1 PR, zero tickets, zero Slack messages.
The IaC approach also means Rahul's offboarding is equally simple: a PR removing his email from the analysts list. Terraform applies, the Snowflake user is destroyed, the IAM policy is deleted, and all access is revoked in one automated step. No forgotten accounts, no manual cleanup, no security audit findings.
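An offboarding PR produces the mirror image of the onboarding plan (hypothetical output, following the resource names used in the onboarding example):

```
# terraform plan after removing rahul.sharma@freshmart.com from the list:
# - snowflake_user.analysts["rahul.sharma@freshmart.com"] will be destroyed
# - snowflake_role_grants.analysts["rahul.sharma@freshmart.com"] will be destroyed
# - aws_iam_policy.dev_s3_access["rahul.sharma@freshmart.com"] will be destroyed
# Plan: 0 to add, 0 to change, 3 to destroy.
```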
🎯 Key Takeaways
- ✓Infrastructure as Code treats cloud resources — S3 buckets, Snowflake warehouses, IAM roles — like software: defined in version-controlled files, reviewed in PRs, deployed through CI/CD. The benefits: reproducibility, auditability, drift prevention, cost visibility, security by default, and environment parity between dev and prod.
- ✓Terraform core workflow: init (download providers, initialise backend) → plan (show what will change, no changes made) → apply (make the changes). Always read the plan before apply. The three change types: + (create), ~ (update in-place), - (destroy). Any destroy operation requires deliberate review.
- ✓Terraform state maps resource blocks to real cloud resource IDs. Remote state (S3 + DynamoDB) is mandatory for teams: S3 provides durability and sharing, DynamoDB prevents concurrent applies from corrupting state. Never edit state manually. If state is lost: expensive to recover via terraform import.
- ✓Variables make Terraform reusable across environments. Use sensitive = true on credential variables — they are redacted in plan output. Use validation blocks to enforce valid values. Use locals for computed values used throughout the configuration. Use .tfvars files per environment (prod.tfvars, dev.tfvars) to separate configuration from code.
- ✓Terraform modules encapsulate related resources as reusable components. A data_lake module wraps the S3 bucket, encryption, versioning, lifecycle policies, and access logging. A snowflake_env module wraps databases, schemas, roles, warehouses, and grants. Both prod and dev call the same module with different variable values — guaranteeing structural consistency.
- ✓S3 lifecycle policies must always specify a filter prefix. A lifecycle rule without a filter applies to ALL objects in the bucket. A 30-day deletion rule intended for landing/ applied without a prefix filter will delete all Silver and Gold data. Review every lifecycle rule in CI for a mandatory filter block.
- ✓The prevent_destroy lifecycle block prevents Terraform from destroying a critical resource. Terraform refuses to apply any plan that would destroy a resource with this flag. To remove a resource intentionally: remove the lifecycle block in a separate PR, review that intent explicitly, then delete. Apply this to all production databases, schemas, and S3 buckets.
- ✓IAM roles for data platforms follow least privilege: ingestion pipeline (write landing/bronze only), transformation pipeline (read bronze, write silver/gold), analyst (read silver/gold only, no bronze PII), BI service account (read gold only). Define every FUTURE GRANT in Terraform so new tables automatically inherit the correct permissions without manual grants.
- ✓Snowflake resource monitors set credit quotas per warehouse per month. Notify at 75% and 90%, suspend at 100%. Without a resource monitor, a runaway analyst query or a misconfigured pipeline can exhaust the entire monthly Snowflake compute budget in one day. Define resource monitors in Terraform so they are always present in all environments.
- ✓Onboarding a new engineer with IaC: add their email to the analysts variable list, open a PR, CI runs terraform plan showing the user creation, merge after review, Terraform provisions the Snowflake user with correct roles, IAM policies, and dev S3 access in minutes. Offboarding is the reverse: remove the email, PR, merge, access revoked automatically. Zero tickets, zero forgotten accounts.