Prerequisites
Good one 👍 Before managing AWS using Terraform, enterprises (or even individual engineers) usually set up a few prerequisites so things don’t break later.
Here’s a structured checklist for you:
🔹 1. AWS Side Prerequisites
AWS Account(s) → sandbox, dev, stage, prod (large orgs use AWS Organizations).
IAM User / Role for Terraform
Should have programmatic access (Access Key + Secret Key).
Recommended: use an IAM role with least-privilege policies.
Permissions → depends on scope:
Admin privileges (for POC).
Granular policies (for production), e.g. `AmazonVPCFullAccess`, `AmazonEKSClusterPolicy`, `AmazonS3FullAccess`.
S3 Bucket for State Storage (backend).
DynamoDB Table for State Locking (to prevent race conditions in teams).
Tagging Policy → Enterprises define required tags (e.g., `Environment`, `Owner`, `CostCenter`).
🔹 2. Local / Developer Machine
Terraform CLI installed → latest stable version.
AWS CLI installed & configured → with profiles in `~/.aws/credentials` (`aws configure --profile dev`).
Access to Git → store Terraform code in a repo (GitHub/GitLab/Bitbucket).
Basic Editor Setup → VS Code with Terraform extension (syntax + linting).
🔹 3. Terraform-Specific Setup
Remote Backend Configuration
Example: store state in S3 + DynamoDB.
Workspaces or Environment Strategy
Decide how you’ll separate dev/stage/prod (workspaces, separate state files, or separate repos).
Module Registry or Folder Structure
Reusable modules for `networking/`, `compute/`, `security/`.
Variables & tfvars → define defaults and environment-specific overrides.
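As a sketch of how defaults and per-environment overrides might be wired (variable names and values are illustrative placeholders, not from any real project):

```hcl
# variables.tf - defaults shared by all environments
variable "environment" {
  type = string
}

variable "instance_type" {
  type    = string
  default = "t3.micro" # sane default; overridden per environment
}

# dev.tfvars - environment-specific overrides (shown here as comments)
# environment   = "dev"
# instance_type = "t3.small"
```

Each environment then supplies its own `*.tfvars` file at plan/apply time, keeping the core configuration identical across dev/stage/prod.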
🔹 4. Team / Enterprise Practices
Version Control (Git) → infra changes go via Pull Requests.
CI/CD Integration → run `terraform plan` and `terraform apply` in pipelines.
Secrets Management
Never hardcode AWS keys in `.tf` files. Use AWS Vault, SSM, or Secrets Manager.
Policies & Guardrails
Use Terraform Cloud/Sentinel or OPA to enforce compliance.
🔹 5. Knowledge Prerequisites
Basic AWS concepts → VPC, IAM, EC2, S3, RDS.
Terraform basics → providers, resources, variables, outputs, modules.
Git workflow → branching, PR reviews.
✅ Minimum for a POC:
AWS IAM user with programmatic access.
Terraform + AWS CLI installed.
S3 bucket + DynamoDB for state.
✅ For Enterprise Use:
Role-based access, centralized remote backend, reusable modules, CI/CD pipelines, policy enforcement.
👉 Pavan, do you want me to draft a step-by-step setup guide (commands + config files) for initializing Terraform with AWS (including remote S3 + DynamoDB backend) so you can use it as a starter template?
Questions before setup
How Enterprises Separate Dev / Stage / Prod
Solid question 👌 This is one of the most debated topics in enterprises when adopting Terraform for AWS. Let’s break it down:
Note: I recommend two repositories for two environments (e.g. dev and prod), but that limits module reuse — so prefer environment files (tfvars) to separate environments within one repo, and a separate project repo for each AWS account.
🔹 How Enterprises Separate Dev / Stage / Prod
There are 3 common strategies:
1. Separate Environment Files (.tfvars)
One Terraform project, with different `*.tfvars` files per environment:
```
variables.tf
main.tf
dev.tfvars
stage.tfvars
prod.tfvars
```
Deploy by selecting the file:
```
terraform apply -var-file=dev.tfvars
terraform apply -var-file=prod.tfvars
```
✅ Pros:
Simple, easy to start.
Minimal duplication.
❌ Cons:
Risky: one mistake (wrong `-var-file`) can mess up prod.
Harder to enforce isolation between environments.
2. Terraform Workspaces
Single codebase, multiple workspaces → `default`, `dev`, `stage`, `prod`.
```
terraform workspace new dev
terraform workspace select prod
terraform apply
```
State files are automatically separated by workspace.
✅ Pros:
Cleaner separation of states.
Useful for short-lived environments (feature branches).
❌ Cons:
Harder to manage in CI/CD (workspaces aren’t very Git-friendly).
Not popular for long-lived environments in enterprises.
3. Separate State Backends / Separate Repos (Enterprise Standard)
Each environment has its own state backend (S3 bucket prefix or separate workspace in Terraform Cloud).
Often managed in separate Git repos or directories:
```
networking/
  dev/
  stage/
  prod/
eks/
  dev/
  stage/
  prod/
```
Each environment runs independently with its own backend:
```hcl
backend "s3" {
  bucket         = "my-tf-states"
  key            = "eks/dev/terraform.tfstate"
  region         = "us-east-1"
  dynamodb_table = "terraform-locks"
}
```
✅ Pros:
Strong isolation (no chance of overwriting prod by mistake).
Easy integration with CI/CD (different pipelines per env).
Fits well with Terraform Cloud workspaces or S3 state backends.
❌ Cons:
More boilerplate code.
Harder to share infra between environments (unless modularized).
🔹 Terraform Cloud State vs S3 State

| Feature | Terraform Cloud | S3 + DynamoDB |
| --- | --- | --- |
| State Storage | Managed by HashiCorp | Self-managed (S3) |
| State Locking | Automatic | DynamoDB required |
| Collaboration | Built-in UI, VCS integration | Needs CI/CD setup |
| Policy as Code | Sentinel policies | Use OPA/third-party |
| Audit & History | Built-in | Need S3 versioning/CloudTrail |
| Cost | Paid for teams (free for personal) | S3+DynamoDB (cheap) |
| Adoption | Easier for enterprises new to Terraform | Preferred by AWS-heavy enterprises |
🔑 What Enterprises Actually Do
Small teams / startups → use `.tfvars` files or workspaces.
Mature enterprises → use separate state backends + Git repos per env, often with Terraform Cloud or S3+DynamoDB.
Hybrid → modules are shared, but envs (dev/stage/prod) are completely isolated in terms of state and pipelines.
✅ Best Practice (for your SRE/DevOps setup):
Use shared modules (`networking/`, `eks/`, `iam/`).
Keep separate states per environment (`dev`, `stage`, `prod`) → either in Terraform Cloud workspaces OR in S3+DynamoDB with prefixes.
Drive everything from Git + CI/CD pipelines → no manual `terraform apply` in prod.
👉 Do you want me to sketch a sample enterprise folder structure (modules + envs + backend configs) that big companies use, so you can copy it as a reference for your projects?
Where to store state s3 vs tf cloud
Great one 👌 This is exactly the kind of decision enterprises debate when standardizing Terraform. Both S3+DynamoDB and Terraform Cloud are valid options, but they serve different needs.
🔹 Option 1: S3 + DynamoDB (Self-Managed Remote State)
Where state lives → S3 bucket.
How locking works → DynamoDB table.
Setup:
```hcl
terraform {
  backend "s3" {
    bucket         = "my-tf-states"
    key            = "networking/dev/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"
  }
}
```
✅ Pros
Cheap (literally cents/month for S3 + DynamoDB).
Full control (you own data, audit via CloudTrail).
Well-integrated with AWS-heavy enterprises.
Easy to scale across many environments.
❌ Cons
You must manage backend setup (S3 bucket policies, DynamoDB locks, versioning).
No native UI for state/history → you rely on S3 versioning.
No built-in RBAC (access controlled via IAM).
Need CI/CD integration manually.
🔹 Option 2: Terraform Cloud (Managed Remote State)
Where state lives → HashiCorp’s managed service.
How locking works → handled automatically.
Setup:
```hcl
terraform {
  cloud {
    organization = "my-org"
    workspaces {
      name = "dev"
    }
  }
}
```
✅ Pros
Fully managed (no need to set up S3/DynamoDB).
Built-in state management, locking, history, audit logs.
Easy collaboration (UI, CLI, API, notifications).
Policy enforcement with Sentinel.
Native VCS integration (GitHub/GitLab → auto plan/apply).
Secret management built-in.
❌ Cons
Cost (Free for small, paid for teams at scale).
State is stored outside AWS → some enterprises don’t like that.
Vendor lock-in (HashiCorp service).
If internet/HashiCorp Cloud is down → can block applies (rare, but possible).
🔑 What Enterprises Actually Do
AWS-Only Enterprises → usually prefer S3 + DynamoDB, because it keeps everything inside AWS.
Multi-Cloud or Compliance-Heavy Teams → often pick Terraform Cloud (or Terraform Enterprise on-prem) for its governance, audit, and collaboration features.
Small/Medium Teams → love Terraform Cloud Free/Team edition because it’s simple and “just works.”
✅ My Recommendation for You (Pavan)
Since you:
Already work with EKS, automation, CI/CD
Want enterprise-ready practices (SRE mindset)
Also mentioned Terraform Cloud workspace confusion
👉 If your infra is 100% AWS and you want max control/cost efficiency → go with S3 + DynamoDB. 👉 If you want better collaboration, UI, history, policies, less ops overhead → go with Terraform Cloud.
Would you like me to make a side-by-side setup guide for both (S3 backend + Terraform Cloud backend) so you can literally copy/paste and try both approaches in your EKS project?
Local vs Cloud workspace
Nice catch 👍 This is a very common confusion when people first start with Terraform Cloud + local workspaces.
Here’s the deal:
🔹 Workspaces in Local Terraform (CLI)
When you run:
```
terraform workspace new dev
terraform workspace list
```
→ It just creates a local workspace (the selected workspace is tracked under `.terraform/`; with the local backend, each workspace's state lives in `terraform.tfstate.d/`).
Each workspace keeps its own state (in the local backend, or in S3 if configured with `workspace_key_prefix`).
These are local to your machine (or pipeline), not visible in Terraform Cloud.
🔹 Workspaces in Terraform Cloud
In Terraform Cloud, a “workspace” means a completely separate state + config execution environment hosted by HashiCorp.
You create them via the UI, the API, or automatically when you run `terraform init` against a `cloud` block (after `terraform login`).
Cloud workspaces are centralized, team-accessible, and locked by Terraform Cloud.
⚠️ What Happens If You Mix Them?
If you created a Terraform Cloud workspace (say `dev`) and then locally run:
```
terraform workspace new dev
```
→ You won't be using the Cloud workspace. Instead, Terraform will create a local workspace named `dev`, which is unrelated to the Cloud one.
This can cause two different states for the same environment:
Cloud: `dev` (remote state in Terraform Cloud)
Local: `dev` (local state in `.terraform/` or S3, depending on your backend config)
🔑 Best Practice
Pick one workspace strategy and stick to it:
If using Terraform Cloud → manage workspaces in Cloud only. Don’t create local ones.
If using S3/DynamoDB → you can use local workspaces to separate states.
In Terraform Cloud, environment separation is usually done by creating separate workspaces in Cloud (e.g., `dev`, `stage`, `prod`) rather than using CLI workspaces locally.
✅ Example for Terraform Cloud (with separate workspaces):
```hcl
terraform {
  cloud {
    organization = "my-org"
    workspaces {
      name = "dev"
    }
  }
}
```
✅ Example for Local/S3 with workspaces:
```hcl
terraform {
  backend "s3" {
    bucket               = "my-tf-states"
    key                  = "app/terraform.tfstate"
    region               = "us-east-1"
    dynamodb_table       = "terraform-locks"
    workspace_key_prefix = "app"
  }
}
```
Note: backend blocks cannot use interpolation such as `${terraform.workspace}`. With the S3 backend, non-default workspace states are stored automatically under `<workspace_key_prefix>/<workspace>/<key>`, so each workspace (`dev`, `prod`, etc.) still gets its own state file.
👉 So Pavan, if you’re already using Terraform Cloud, better to avoid `terraform workspace new` locally and instead create/manage workspaces in the Terraform Cloud UI or API.
Do you want me to show you how to map one codebase to multiple Terraform Cloud workspaces (dev/stage/prod) properly so you don’t risk state conflicts?
Other Questions
Perfect 💯 — I’ll cover the first 20 enterprise-level Terraform + AWS questions with short, crisp answers. Later, I’ll give you the remaining ones.
🔹 Before Moving to Terraform (Planning Stage)
1. Should we use Terraform Open Source, Cloud, or Enterprise?
Small teams → OSS
Mid-size → Terraform Cloud
Large enterprises (compliance-heavy) → Terraform Enterprise
2. Repo structure: monorepo vs multi-repo?
Multi-repo per stack/env is common in enterprises; monorepo works for small infra.
3. Multi-account AWS setup?
Use AWS Organizations + Control Tower + one Terraform root module per account.
4. Registry modules or custom enterprise modules?
Start with Registry modules → later build internal standardized modules.
5. How to enforce tagging standards?
Define required tags in modules + policy checks (OPA/Sentinel).
6. Naming conventions for resources?
Standardize (e.g., `app-env-region-resource`). Helps with clarity + cost.
7. Store state in S3+DynamoDB or Terraform Cloud?
S3+DynamoDB if AWS-only; Terraform Cloud if you need governance/UI.
8. Use workspaces or separate backends per environment?
Separate backends (or Cloud workspaces). Safer for long-lived environments.
9. Who bootstraps the backend?
A one-time manual/admin Terraform run (or bootstrap script) creates S3 bucket/DynamoDB/Cloud org.
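A minimal sketch of that one-time bootstrap, run with local state before any other stack can use the remote backend (bucket and table names are placeholders; the S3 backend's lock table requires a string partition key named exactly `LockID`):

```hcl
# One-time bootstrap: creates the state bucket and lock table.
resource "aws_s3_bucket" "tf_state" {
  bucket = "my-tf-states" # placeholder name; must be globally unique
}

resource "aws_s3_bucket_versioning" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id
  versioning_configuration {
    status = "Enabled" # lets you recover earlier state versions
  }
}

resource "aws_dynamodb_table" "tf_locks" {
  name         = "terraform-locks"
  billing_mode = "PAY_PER_REQUEST" # cheap for occasional locking
  hash_key     = "LockID"          # required key name for the S3 backend

  attribute {
    name = "LockID"
    type = "S"
  }
}
```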
10. How to secure state files?
Enable S3 encryption + restrict IAM access + enable versioning.
11. How to secure IAM credentials for Terraform?
Use roles/STS with CI/CD pipelines. No hardcoded keys.
12. How to enforce security best practices?
Use Terraform policies + modules with defaults (encryption, logging enabled).
13. Policy-as-code options?
Sentinel (Terraform Cloud/Enterprise) or OPA (open source).
14. How to handle secrets?
Use AWS Secrets Manager/SSM Parameter Store. Never hardcode.
15. CI/CD integration?
Run `terraform plan` on PR, `apply` after approval.
16. Manual apply vs pipeline?
Best practice → only CI/CD applies. Manual only for break-glass.
17. How to enforce PR reviews?
Git policy: all infra changes via PR + plan output in comments.
18. How to manage multiple teams?
Split by stacks/modules, separate backends per team. Use IAM for access control.
19. How to detect drift?
Run scheduled `terraform plan` jobs → alert on changes.
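A sketch of such a scheduled drift check, assuming `terraform` is available in the job environment; `-detailed-exitcode` returns 2 when the plan succeeds but changes (drift) exist:

```
# Scheduled drift check (cron/pipeline job)
terraform plan -detailed-exitcode -input=false
case $? in
  0) echo "no drift" ;;
  2) echo "drift detected - send alert" ;;  # exit code 2 = plan has changes
  *) echo "plan failed" ;;
esac
```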
20. How to roll back if apply breaks prod?
Rollback via Git → previous commit + re-apply. Also enable state versioning in backend.
Would you like me to now continue with Q21–Q35 (after moving to Terraform: drift, scaling, multi-region, compliance, upgrades, costs, etc.) in the same short Q&A format?
Great 👍 Let’s continue with the remaining Q21–Q35 in the same short Q&A style.
🔹 After Moving to Terraform (Operational Stage)
21. How to handle disaster recovery?
Rebuild infra from Terraform code + remote state in new region/account.
22. How to manage large state files?
Split infra into multiple state files/modules (networking, compute, data).
23. How to organize Terraform for hundreds of AWS accounts?
Use AWS Organizations + per-account state files + standardized modules.
24. How to split stacks (networking, compute, security)?
Separate repos/backends for each → enables independent updates.
25. How to import existing resources into Terraform?
Use `terraform import` + manually write resource code.
26. How to manage multi-region deployments?
Separate state files per region, or parameterize region in modules.
27. Least-privilege IAM for Terraform?
Create dedicated IAM roles for Terraform with only needed permissions.
28. How to audit infra changes?
Git history + Terraform Cloud logs + S3/DynamoDB versioning.
29. How to rotate AWS credentials in pipelines?
Use short-lived STS tokens from IAM roles or AWS SSO.
30. How to enforce tagging for cost allocation?
Add required tags in modules + enforce via policy checks.
31. How to handle Terraform version upgrades?
Test upgrade in dev → run `terraform init -upgrade` → roll out per env.
32. Should we use Terragrunt?
Yes, if you need DRY structure, dependency management, and multi-env orchestration.
33. How to train developers to use Terraform safely?
Give them module-level access, enforce CI/CD, and peer review PRs.
34. What if Registry modules update/break?
Pin versions (`version = "x.y.z"`) and test updates in dev first.
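For example, pinning both a Git-sourced internal module and a Registry module (sources and version numbers are illustrative placeholders):

```hcl
# Git source pinned to a release tag
module "vpc" {
  source = "git::https://github.com/my-org/infra-modules.git//vpc?ref=v1.0.0"
  # ...module inputs...
}

# Registry source pinned to an exact version
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "1.2.3" # placeholder; pin exactly and test upgrades in dev first
  # ...module inputs...
}
```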
35. How to estimate costs before apply?
Use `terraform plan` with cost-estimation tools (Infracost, Terraform Cloud).
✅ That covers Q1–Q35 — the most common questions asked before and after moving to Terraform in AWS enterprises.
👉 Pavan, would you like me to now combine all 35 Q&A into a single enterprise-ready cheat sheet (so you can keep it handy for migration discussions and interviews)?
After finishing the prerequisites - how to start
Awesome — you’ve got the essentials in place. Here’s a clean, enterprise-style way to modularize VPC, EKS, RDS, and EC2 for two environments (dev, prod) using Terraform OSS with Terraform Cloud (TFC) for state.
I’ll give you:
Repo layout (modules + live)
Terraform Cloud workspace plan
Minimal, production-ready module skeletons
Environment wiring (providers, variables, backends)
End-to-end workflow (local + CI/CD)
1) Repos & Folder Structure
Use a two-repo pattern:
infra-modules (pure, reusable modules)
infra-live (environment wiring that calls modules)
Why this split?
Modules repo stays stable & versioned (tagged releases).
Live repo is per-env wiring, easy RBAC, easy blast-radius control.
2) Terraform Cloud Workspaces
Create one workspace per stack per environment:
`vpc-dev`, `eks-dev`, `rds-dev`, `ec2-dev`
`vpc-prod`, `eks-prod`, `rds-prod`, `ec2-prod`
In each live stack directory, pin the workspace:
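For example, the `vpc` stack in dev might carry a `terraform` block like this (organization name is a placeholder):

```hcl
terraform {
  cloud {
    organization = "my-org" # your TFC organization

    workspaces {
      name = "vpc-dev" # one workspace per stack per environment
    }
  }
}
```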
Important: Don’t use local CLI workspaces. Use TFC workspaces only.
3) Minimal, Production-lean Module Skeletons
3.1 VPC Module (infra-modules/vpc)
variables.tf (essentials)
main.tf (outline)
outputs.tf
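A minimal sketch of what these files might contain (CIDR math, the single public subnet, and all names are illustrative assumptions, not a full production module):

```hcl
# variables.tf
variable "name" {
  type = string
}

variable "cidr_block" {
  type = string
}

variable "tags" {
  type    = map(string)
  default = {}
}

# main.tf (outline)
resource "aws_vpc" "this" {
  cidr_block           = var.cidr_block
  enable_dns_support   = true
  enable_dns_hostnames = true
  tags                 = merge(var.tags, { Name = var.name })
}

resource "aws_subnet" "public" {
  vpc_id     = aws_vpc.this.id
  cidr_block = cidrsubnet(var.cidr_block, 8, 0) # first /24 slice, as an example
  tags       = merge(var.tags, { Name = "${var.name}-public-a" })
}

# outputs.tf
output "vpc_id" {
  value = aws_vpc.this.id
}

output "public_subnet_ids" {
  value = [aws_subnet.public.id]
}
```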
3.2 EKS Module (infra-modules/eks)
variables.tf
main.tf (outline; use AWS EKS + IAM + Node groups)
outputs.tf
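A minimal sketch for the EKS module (IAM role creation and node groups omitted; the default version and variable names are illustrative):

```hcl
# variables.tf
variable "cluster_name" {
  type = string
}

variable "cluster_version" {
  type    = string
  default = "1.29" # placeholder; pin to your tested version
}

variable "cluster_role_arn" {
  type = string # IAM role for the control plane, created elsewhere
}

variable "subnet_ids" {
  type = list(string)
}

# main.tf (outline)
resource "aws_eks_cluster" "this" {
  name     = var.cluster_name
  version  = var.cluster_version
  role_arn = var.cluster_role_arn

  vpc_config {
    subnet_ids = var.subnet_ids
  }
}

# outputs.tf
output "cluster_name" {
  value = aws_eks_cluster.this.name
}

output "cluster_endpoint" {
  value = aws_eks_cluster.this.endpoint
}
```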
3.3 RDS Module (infra-modules/rds)
variables.tf
main.tf (outline with subnet group + SG + instance/cluster)
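A minimal sketch for the RDS module (engine, class, and sizes are illustrative; the security group is omitted for brevity):

```hcl
# variables.tf
variable "identifier" {
  type = string
}

variable "subnet_ids" {
  type = list(string)
}

variable "db_password" {
  type      = string
  sensitive = true # supply via a sensitive TFC variable, never in code
}

# main.tf (outline)
resource "aws_db_subnet_group" "this" {
  name       = var.identifier
  subnet_ids = var.subnet_ids
}

resource "aws_db_instance" "this" {
  identifier           = var.identifier
  engine               = "postgres"     # illustrative engine
  instance_class       = "db.t3.medium" # illustrative class
  allocated_storage    = 20
  username             = "app"
  password             = var.db_password
  db_subnet_group_name = aws_db_subnet_group.this.name
  storage_encrypted    = true
  skip_final_snapshot  = true # dev-only convenience; disable in prod
}
```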
3.4 EC2 Module (infra-modules/ec2)
variables.tf
main.tf
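A minimal sketch for the EC2 module (variable names are illustrative; `http_tokens = "required"` enforces IMDSv2):

```hcl
# variables.tf
variable "name" {
  type = string
}

variable "ami_id" {
  type = string
}

variable "subnet_id" {
  type = string
}

variable "instance_type" {
  type    = string
  default = "t3.micro"
}

# main.tf
resource "aws_instance" "this" {
  ami           = var.ami_id
  instance_type = var.instance_type
  subnet_id     = var.subnet_id

  metadata_options {
    http_tokens = "required" # enforce IMDSv2
  }

  tags = { Name = var.name }
}
```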
4) Environment Wiring (dev/prod)
4.1 Shared versions/tags (infra-live/global)
versions.tf
tags.tf
4.2 Dev VPC (infra-live/envs/dev/vpc)
main.tf
variables.tf
dev.auto.tfvars
backend.tf
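A sketch of what these dev VPC wiring files might contain (org, repo URL, CIDR, and tags are placeholders; the tfvars content is shown as comments since the files are combined into one block here):

```hcl
# main.tf - calls the versioned module
module "vpc" {
  source = "git::https://github.com/my-org/infra-modules.git//vpc?ref=v1.0.0"

  name       = "dev-vpc"
  cidr_block = var.cidr_block
  tags       = var.tags
}

# variables.tf
variable "cidr_block" {
  type = string
}

variable "tags" {
  type    = map(string)
  default = {}
}

# dev.auto.tfvars
# cidr_block = "10.10.0.0/16"
# tags       = { Environment = "dev", Owner = "platform" }

# backend.tf - pins the TFC workspace
terraform {
  cloud {
    organization = "my-org"
    workspaces {
      name = "vpc-dev"
    }
  }
}
```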
Prod mirrors dev with different CIDRs, tags, and workspace name (`vpc-prod`).
4.3 Dev EKS (infra-live/envs/dev/eks)
main.tf
variables.tf
dev.auto.tfvars (fill from VPC outputs you copied or inject via TFC variables)
`backend.tf` → workspace `eks-dev`.
Alternative: Use Terraform Cloud run tasks/variables to pass VPC outputs to EKS. Avoid hard-coding IDs in files; store them as TFC workspace variables (safest).
4.4 Dev RDS & EC2 stacks
Follow the same pattern:
Inputs come from VPC (vpc_id, subnet_ids, SGs).
Different `backend.tf` mapping to `rds-dev`, `ec2-dev`.
5) End-to-End Steps
Create TFC workspaces
`vpc-dev`, `eks-dev`, `rds-dev`, `ec2-dev`, and same for prod.
Set workspace variables (e.g., `AWS_REGION`, `TF_VAR_*`, sensitive DB password, etc.).
Configure VCS connection to `infra-live`, or run local CLI with `terraform login`.
Bootstrap order (dev)
`vpc-dev`: `terraform init && terraform apply`
Put resulting VPC ID + subnet IDs into TFC variables for `eks-dev`, `rds-dev`, `ec2-dev`.
`eks-dev`: apply
`rds-dev`: apply
`ec2-dev`: apply
Repeat for prod with prod values (CIDRs, sizes, versions, instance classes).
Lock module versions
Tag modules in `infra-modules` (e.g., `v1.0.0`) and reference with `?ref=v1.0.0`.
Secrets & creds
Never commit passwords/keys. Use TFC sensitive variables for DB passwords, etc.
Use AWS SSO/assume-role in CI, or `TFC_AWS_PROVIDER_AUTH` (TFC dynamic provider credentials), for safe auth.
CI/CD (recommended)
On PR: run `terraform fmt`, `validate`, and `plan` (TFC posts the plan to the PR).
On merge to `main`: `apply` (requires manual TFC approval for prod).
Tagging & governance
Enforce tags via module `var.tags` + `locals.base_tags`.
Add Sentinel/OPA policies in TFC for guardrails (e.g., encryption required).
Drift & hygiene
Schedule nightly `plan` runs in TFC for each workspace; alert on drift.
Enable destroy protection on prod workspaces.
Version-pin providers/TF in `versions.tf`.
Quick Commands (local runs)
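The commands here presumably looked like the following, run from a stack directory such as `infra-live/envs/dev/vpc` (a sketch; with a `cloud` block, plans and applies execute remotely in TFC):

```
terraform login            # authenticate to Terraform Cloud once
terraform init             # connect the directory to its TFC workspace
terraform fmt -recursive   # format code
terraform validate         # static checks
terraform plan             # review proposed changes
terraform apply            # apply after review/approval
```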
Pro Tips
Keep state blast radius small: one workspace per stack per env.
Pass cross-stack IDs via TFC variables (or remote state data sources if you must).
For EKS, add-ons you’ll likely need next: VPC CNI, CoreDNS, KubeProxy, ALB Controller, Karpenter, Cluster Autoscaler, Metrics Server, EBS CSI (model them as sub-modules later).
For RDS prod: enable Performance Insights, backups, deletion protection.
For EC2: standardize launch templates, SSM access, and IMDSv2.
If you want, I can drop in ready-to-run files (copy/pasteable main.tf/variables.tf/outputs.tf) for each stack in infra-live and infra-modules customized to your defaults (region, naming, tags).