Today’s session was one of those “lightbulb days” in Terraform, the point where things finally click. Day 13 was all about Terraform Data Sources, and honestly, this is one of the core concepts that separates beginners from engineers who actually understand infrastructure workflows.
Data Sources: The Missing Puzzle Piece
I’ve been creating resources in Terraform for days now: EC2 instances, S3 buckets, IAM, and more. But today I finally understood why Data Sources matter so much. While resources are things Terraform creates, data sources are things Terraform reads from AWS. This difference is massive. Data sources allow you to use existing AWS components instead of hardcoding IDs or manually copying values from your console. That means:
- No more manually typing VPC IDs.
- No more guessing subnets.
- No more hunting for AMI IDs.
- No more breaking infrastructure because an ID changed.

Terraform stays synced with AWS, and your code becomes smarter, more reusable, and more consistent.
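To make that distinction concrete, here’s a minimal sketch contrasting the two block types (the bucket names are just placeholders, not from my project):

# A resource is something Terraform creates and manages.
resource "aws_s3_bucket" "example" {
  bucket = "my-example-bucket" # hypothetical name
}

# A data source only reads something that already exists in AWS.
data "aws_s3_bucket" "existing" {
  bucket = "a-bucket-created-outside-terraform" # hypothetical name
}

Same HCL syntax, completely different behavior: the first block is owned by Terraform, the second is just looked up.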
What I Built Today:
1. Fetch the Default VPC:
data "aws_vpc" "vpc_name" {
filter {
name = "tag:Name"
values = ["default"]
}
}
Instead of hardcoding a VPC ID, this fetches the one tagged “default.”
A small change, but a big improvement in reliability.
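One caveat: this only works if the VPC actually carries a Name tag with the value “default.” If it doesn’t, the aws_vpc data source also accepts a default argument, which is a common alternative for grabbing the account’s default VPC without relying on tags:

data "aws_vpc" "default" {
  default = true # matches the account's default VPC directly, no tags needed
}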
2. Fetch a Subnet Inside that VPC:
data "aws_subnet" "shared" {
filter {
name = "tag:Name"
values = ["subneta"]
}
vpc_id = data.aws_vpc.vpc_name.id
}
This pulls the subnet tagged “subneta” within the VPC I retrieved earlier.
Now Terraform can dynamically locate a subnet without me touching the AWS console.
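If I ever need more than one subnet, the plural aws_subnets data source follows the same pattern. A quick sketch, reusing the VPC lookup from step 1:

data "aws_subnets" "in_vpc" {
  filter {
    name   = "vpc-id"
    values = [data.aws_vpc.vpc_name.id]
  }
}

data.aws_subnets.in_vpc.ids then gives back the full list of subnet IDs in that VPC.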
3. Fetch the Latest Amazon Linux 2 AMI:
data "aws_ami" "linux2" {
owners = ["amazon"]
most_recent = true
filter {
name = "name"
values = ["amzn2-ami-hvm-*-x86_64-gp2"]
}
filter {
name = "virtualization-type"
values = ["hvm"]
}
}
This is the cleanest way to ensure your instance always uses the latest stable Amazon Linux 2 image.
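One thing to keep in mind: because of most_recent = true, the resolved AMI can change between runs, and a new AMI ID will force the instance to be replaced on the next apply. A small optional output makes it easy to see which image a given plan actually picked:

output "resolved_ami_id" {
  description = "The Amazon Linux 2 AMI ID resolved for this run"
  value       = data.aws_ami.linux2.id
}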
4. Deploy an EC2 Instance Using ONLY Dynamic Values:
resource "aws_instance" "example" {
ami = data.aws_ami.linux2.id
instance_type = "t2.micro"
subnet_id = data.aws_subnet.shared.id
tags = var.tags
}
No hardcoded IDs.
No guesswork.
No outdated AMIs.
Everything is tied to real AWS values at the time of deployment.
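The tags = var.tags line assumes a tags variable is declared somewhere in the configuration; I haven’t shown mine here, but a minimal sketch would look like this (the values are just examples):

variable "tags" {
  description = "Common tags applied to every resource"
  type        = map(string)

  default = {
    Project     = "terraform-journey" # hypothetical values
    Environment = "dev"
  }
}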
Backend Configuration (S3 Remote State)
I also configured my backend for state storage:
backend "s3" {
bucket = "devopswithzacks-terraform-state"
key = "dev/terraform.tfstate"
region = "us-east-1"
encrypt = true
use_lockfile = true
}
This moved my Terraform state from local to AWS S3, enabling:
- Team collaboration.
- State locking.
- No more risk of losing state on my local machine.
- Better security and versioning.

This is how real-world environments are structured.
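Two small notes on this setup: if you already had local state, terraform init will offer to migrate it to the new backend (or you can pass -migrate-state explicitly), and the versioning benefit only kicks in once versioning is enabled on the bucket itself. A sketch of the latter, assuming the state bucket is managed in a separate bootstrap configuration:

resource "aws_s3_bucket_versioning" "state" {
  bucket = "devopswithzacks-terraform-state"

  versioning_configuration {
    status = "Enabled"
  }
}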
Commands I Used:
Same workflow:
terraform init
terraform plan
terraform apply --auto-approve
terraform destroy --auto-approve

Smooth, clean, and predictable, especially now that everything is dynamic.
My Experience Today
Day 13 was surprisingly straightforward compared to the last few days.
No major blockers.
No syntax confusion.
No AWS permission issues.
Just clean logic and a clearer understanding of how Terraform interacts with existing AWS infrastructure.
Data sources are the bridge between what you already have and what you're trying to build. Today made that crystal clear.
Final Thoughts
Data sources are easily one of the most powerful features in Terraform. They eliminate hardcoding, reduce errors, and make infrastructure code scalable and environment-aware.
Today felt like a big step forward: my Terraform code finally behaves like it belongs in production.