If you’ve moved past provisioning your first S3 bucket, you’ve likely noticed a curious file pop up in your directory: terraform.tfstate. This file, known as the Terraform State File, is arguably the most critical component governing your infrastructure and is essential for reliable cloud management.
The Role of the Terraform State File
The state file tracks the actual state of the infrastructure you have deployed. Its primary purpose is to allow Terraform to compare your desired state (defined in your configuration files, like main.tf) with the actual state of your deployed resources (like your VPCs, EC2 servers, or S3 buckets).
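When you run terraform plan or terraform apply, Terraform does not scan your whole cloud account to discover what exists. Instead, it relies on the state file as its record of the resources it manages, and it checks only those tracked resources against the provider API before working out what to change.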
1. Desired State vs. Actual State: If your configuration defines three resources (an S3 bucket, a VPC, and an EC2 server) but the state file shows zero resources, Terraform knows it must create those three missing resources to match the desired state with the actual state.
2. Handling Changes: If you later remove one resource from your main.tf, the desired state changes. Terraform compares this new configuration to the state file and knows it must destroy the corresponding resource in the environment.
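Without the state file, Terraform would have no mapping between the resources in your configuration and the real objects in the cloud. It would have to search the provider API on every run, which is slow and unreliable, and it would lose track of what it owns and manages.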
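To make the comparison concrete, here is a minimal sketch of a main.tf matching the three-resource example above. The resource names, bucket name, CIDR range, and AMI ID are placeholders assumed for illustration, not values from the original setup.

```hcl
# main.tf: desired state with three resources (all names and IDs are hypothetical)

resource "aws_s3_bucket" "demo" {
  bucket = "tech-tutorials-demo-bucket" # placeholder; bucket names must be globally unique
}

resource "aws_vpc" "demo" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_instance" "demo" {
  ami           = "ami-0123456789abcdef0" # placeholder AMI ID; replace with a real one for your region
  instance_type = "t3.micro"
}
```

With an empty state file, terraform plan should propose creating all three resources. If you then delete the aws_instance block and plan again after an apply, Terraform should propose destroying that one instance, because it is still recorded in the state but no longer appears in the configuration.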
The Dangers of Local State
While crucial, the state file presents a major security risk when stored locally. It can contain sensitive data in plain text, such as your AWS account ID, generated passwords or keys, and other resource attributes. It should not be left on your local machine or a shared server where it could be accidentally exposed, and it should never be committed to version control.
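As an illustration of how a secret can end up in the state file, consider this small sketch. It assumes the hashicorp/random provider, and the resource and output names are hypothetical.

```hcl
terraform {
  required_providers {
    random = {
      source = "hashicorp/random"
    }
  }
}

# Generates a password; the generated value is recorded in the state file
resource "random_password" "db" {
  length  = 24
  special = true
}

# sensitive = true hides the value in CLI output only
output "db_password" {
  value     = random_password.db.result
  sensitive = true
}
```

Marking the output as sensitive redacts it in the terminal, but the password is still written to terraform.tfstate in plain text, which is why access to the state file itself has to be protected.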
Furthermore, local state files make collaboration impractical. Each teammate ends up with a separate copy of the state, and if multiple users manage the same infrastructure at the same time, those copies can drift apart or become corrupted, leading to infrastructure inconsistencies.
The Solution: Remote Backend Configuration
To address these security and collaboration issues, Terraform supports remote backends. A remote backend stores the state file remotely, typically in a secure location like an AWS S3 bucket. Other cloud providers offer similar services, such as Azure Blob Storage or Google Cloud Storage.
With a remote backend configured, every Terraform command that needs the state reads it from the S3 bucket, compares it with your configuration and your actual infrastructure, makes the necessary changes, and writes the updated state back to the bucket.
Code Example: Configuring the S3 Remote Backend
You must configure the backend within the terraform block of your configuration file. Note that the S3 bucket hosting the state file should be created manually or via a separate script before running terraform init.
terraform {
  required_providers {
    # ... required providers configuration
  }

  # Define the remote backend configuration
  backend "s3" {
    # Unique name of the S3 bucket to store the state file
    bucket = "tech-tutorials-my-unique-state-bucket"
    # Key defines the path/folder within the bucket.
    # Recommended practice is to isolate by environment.
    key = "dev/terraform.tfstate"
    # Region of the bucket (placeholder value; use your bucket's region).
    # Required unless set through AWS_REGION or AWS_DEFAULT_REGION.
    region = "us-east-1"
    # Encryption is highly recommended for security
    encrypt = true
  }
}
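After adding this configuration, you must run terraform init again. It will recognize the new backend configuration and confirm that the backend has been successfully configured to use S3. If you already have a local state file, Terraform can move it to the new backend during initialization (see the -migrate-state option of terraform init).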
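One way to apply the per-environment key recommended in the example, sketched here as an assumption rather than as the author's setup, is Terraform's partial backend configuration. The backend block is left empty (backend "s3" {}) and the environment-specific settings come from a file passed at init time. The file name prod.s3.tfbackend and all of its values are hypothetical.

```hcl
# prod.s3.tfbackend (hypothetical file): backend settings for the prod environment
bucket  = "tech-tutorials-my-unique-state-bucket"
key     = "prod/terraform.tfstate"
region  = "us-east-1" # placeholder region
encrypt = true
```

You would then initialize with terraform init -backend-config=prod.s3.tfbackend. A matching file with key = "dev/terraform.tfstate" would point the same configuration at the dev state. When switching between such files in one working directory, you will likely need the -reconfigure option so Terraform does not try to migrate state between environments.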
State Locking for Integrity
When using a remote backend, state locking is vital. State locking is a mechanism that prevents multiple users or processes from executing terraform apply on the same infrastructure simultaneously. If a user is running a command, the state file is locked, and only when the process completes will the lock be released, preventing file corruption.
Earlier setups required a separate DynamoDB table to provide locking for the S3 backend. Recent Terraform releases (1.10 and later) support native S3 locking, which is enabled with the use_lockfile argument in the backend block.
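A minimal sketch of enabling native S3 locking is shown below. It assumes Terraform 1.10 or later, where the use_lockfile argument is available, and reuses the bucket name from the earlier example. The region and the DynamoDB table name in the comment are placeholders.

```hcl
terraform {
  backend "s3" {
    bucket  = "tech-tutorials-my-unique-state-bucket"
    key     = "dev/terraform.tfstate"
    region  = "us-east-1" # placeholder region
    encrypt = true

    # Native S3 locking: Terraform creates a lock file next to the state object
    # (assumes Terraform 1.10 or later)
    use_lockfile = true

    # Older approach for earlier Terraform versions (hypothetical table name):
    # dynamodb_table = "terraform-locks"
  }
}
```

With locking enabled, a second terraform apply started while the first is still running should fail to acquire the lock instead of writing to the same state file.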
Best Practices for State Management
To maintain a healthy Terraform environment, follow these best practices:
• Do Not Edit Manually: Never manually delete or edit the state file, as this can corrupt it and leave Terraform out of sync with your real infrastructure. Use the dedicated state commands instead, such as terraform state list, terraform state mv, and terraform state rm.
• Isolate Environments: Create separate state files for different environments (Dev, Test, Prod) by adjusting the key parameter in your backend configuration.
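• Regular Backups: Implement regular backups of your state file so you can recover if the file is accidentally corrupted or lost. With an S3 backend, enabling bucket versioning keeps earlier versions of the state file available for recovery.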
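For the backup practice above, one option is to enable versioning on the bucket that holds the state. The sketch below assumes AWS provider version 4 or later and would live in a separate bootstrap configuration, since the bucket has to exist before terraform init can use it as a backend. The resource names are hypothetical.

```hcl
# Bootstrap configuration for the state bucket (kept separate from the main project)

resource "aws_s3_bucket" "state" {
  bucket = "tech-tutorials-my-unique-state-bucket"
}

# Keep earlier versions of the state file so a bad write can be rolled back
resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Block all public access to the state bucket
resource "aws_s3_bucket_public_access_block" "state" {
  bucket = aws_s3_bucket.state.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```

With versioning enabled, each write of the state file creates a new object version, so a corrupted or accidentally deleted state can be restored from a previous version.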
By properly securing and managing your state file using a remote backend, you ensure that your infrastructure is secure, consistent, and ready for collaborative development.
Video from original challenge
The video below, by @piyushsachdeva, covers the setup and importance of the Terraform State File and Remote Backend.



