
Spacelift team for Spacelift

Originally published at spacelift.io


Terraform Archive_File Data Source

The archive_file data source in Terraform creates and manages archive files dynamically within infrastructure as code. It automates packaging files into ZIP or TAR formats, making it easier to integrate with deployments, configuration management, and cloud storage solutions.

In this article, we will show you how to use the archive_file data source to create archives from single or multiple files and address common issues related to it.

What is the archive_file in Terraform?

The archive_file data source in Terraform creates compressed archive files (e.g., .zip or .tar.gz) from local files or directories. It is commonly used to package application code, configuration files, or other assets for deployment.

archive_file is particularly useful in automated deployment workflows where Terraform needs to bundle files before uploading them to cloud storage or deployment services (e.g., AWS S3, Lambda, or Azure Storage).

Here's a basic syntax for the archive_file block:

data "archive_file" "example" {
  type        = "zip"             # Archive format: "zip" or "tar.gz"
  source_dir  = "path/to/source"  # Directory to archive
  output_path = "path/to/output.zip" # Output archive path
}

Or, if you need to archive a single file:

data "archive_file" "example_file" {
  type        = "zip"
  source_file = "path/to/file.txt"
  output_path = "path/to/output.zip"
}

Where:

  • type - Specifies the archive format (zip or tar.gz)
  • source_dir - Defines the directory whose contents should be archived
  • source_file - Specifies a single file to be archived
  • output_path - Determines the location where the archive file will be created

Note: You can use either source_dir or source_file, but not both in a single archive_file data block.

The archive_file data source supports different compression formats and allows you to specify input files, output paths, and file types.

The archive_file data source can be useful for:

  • Deploying function code - AWS Lambda deployments often require function code and dependencies to be zipped before uploading.
  • Packaging configuration files - This includes Kubernetes manifests, Helm charts, or other configuration files that need to be archived before deployment.
  • Bundling multiple files - This is useful when packaging multiple files into a single archive before uploading to cloud storage services like AWS S3, Google Cloud Storage, or Azure Blob Storage.

Learn more: How to utilize Terraform data sources

How to create an archive from a single file

Here's how to use the Terraform archive_file to create a ZIP archive from a single file:

data "archive_file" "example" {
  type        = "zip"
  source_file = "example.txt"
  output_path = "example.zip"
}

output "archive_checksum" {
  value = data.archive_file.example.output_base64sha256
}

The output_base64sha256 attribute provides a SHA-256 checksum of the generated archive in Base64 encoding. This can be used to verify file integrity and detect changes in Terraform runs. However, Terraform does not automatically track changes inside the source file; to force updates, consider using filemd5() on the source file.
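For example, here is a minimal sketch that feeds the source file's hash into a terraform_data resource, so editing example.txt forces a replacement on the next plan:

resource "terraform_data" "source_tracker" {
  # Replaced whenever the tracked file's MD5 changes
  triggers_replace = filemd5("${path.module}/example.txt")
}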

Example: Creating a ZIP archive for an AWS Lambda function

For this example, let's assume you have a Python script (lambda_function.py) you want to compress and deploy as an AWS Lambda function.

Start by defining the archive_file data source:

data "archive_file" "lambda_zip" {
  type        = "zip"
  source_file = "lambda_function.py"
  output_path = "lambda_function.zip"
}

Now, use the archive in an AWS Lambda deployment:

resource "aws_lambda_function" "my_lambda" {
  function_name    = "my_lambda_function"
  role             = aws_iam_role.lambda_role.arn
  runtime          = "python3.12"
  handler          = "lambda_function.lambda_handler"

  filename         = data.archive_file.lambda_zip.output_path
  source_code_hash = data.archive_file.lambda_zip.output_base64sha256
}
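Note: The function above references aws_iam_role.lambda_role, which the snippet does not define. A minimal sketch of such a role (the role name is illustrative):

resource "aws_iam_role" "lambda_role" {
  name = "my-lambda-role"

  # Allow the Lambda service to assume this role
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "lambda.amazonaws.com" }
    }]
  })
}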

Note: This example may not work if AWS Lambda expects a package with dependencies. If your Lambda function imports external libraries, you must zip the entire directory, not just the script.
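If your function does pull in libraries, one approach is to stage the code and its dependencies into a single directory first (for example, with pip install --target) and archive that directory; the build directory name below is illustrative:

data "archive_file" "lambda_bundle" {
  type        = "zip"
  source_dir  = "${path.module}/build"             # staged code plus dependencies
  output_path = "${path.module}/lambda_bundle.zip"
}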


How to create archives from multiple files

Terraform's archive_file does not support specifying multiple individual files directly: you cannot list more than one source_file entry in a single block.

Because archive_file only accepts either source_file (for a single file) or source_dir (for an entire directory), one workaround is to first gather all the files into a temporary directory. This can be done manually or with a terraform_data resource that runs a local-exec script, as shown below.

Here is a workaround using terraform_data:

resource "terraform_data" "prepare_files" {
 provisioner "local-exec" {
   command = <<EOT
     mkdir -p temp_folder
     cp ${path.module}/file1.txt temp_folder/
     cp ${path.module}/file2.txt temp_folder/
     cp ${path.module}/file3.txt temp_folder/
   EOT
 }
}

data "archive_file" "multiple_files" {
 type        = "zip"
 source_dir  = "${path.module}/temp_folder"
 output_path = "${path.module}/multiple_files.zip"
 depends_on  = [terraform_data.prepare_files]
}
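Alternatively, the provider supports repeated source blocks, which assemble an archive from inline content without a staging directory. Here's a sketch that reads each file with the built-in file() function (this works for UTF-8 text files; the paths are illustrative):

data "archive_file" "multiple_files_inline" {
  type        = "zip"
  output_path = "${path.module}/multiple_files.zip"

  source {
    content  = file("${path.module}/file1.txt")  # file contents embedded directly
    filename = "file1.txt"                       # name inside the archive
  }

  source {
    content  = file("${path.module}/file2.txt")
    filename = "file2.txt"
  }
}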

To archive an entire directory, use source_dir:

data "archive_file" "example" {
  type        = "zip"
  source_dir  = "${path.module}/my_folder"
  output_path = "${path.module}/example.zip"
}
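The data source also accepts an excludes argument that skips the listed paths (relative to source_dir) when archiving a directory. A sketch with illustrative file names:

data "archive_file" "example_filtered" {
  type        = "zip"
  source_dir  = "${path.module}/my_folder"
  output_path = "${path.module}/example.zip"
  excludes    = ["secrets.txt", "notes/draft.md"]  # paths relative to source_dir
}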

Example: Archiving a directory and uploading it to Azure Storage

In this example, we will create a ZIP archive of a directory and upload it to an Azure Storage Blob.

The Terraform configuration below compresses all files within a directory (my_app_folder) into a ZIP archive:

data "archive_file" "app_package" {
  type        = "zip"
  source_dir  = "${path.module}/my_app_folder"
  output_path = "${path.module}/app_package.zip"
}

Now, we'll create an Azure Storage Account and Storage Container:

resource "azurerm_storage_account" "example" {
  name                     = "mystorageacct"
  resource_group_name      = "my-resource-group"
  location                 = "East US"
  account_tier             = "Standard"
  account_replication_type = "LRS"
}

resource "azurerm_storage_container" "example" {
  name                  = "mycontainer"
  storage_account_name  = azurerm_storage_account.example.name
  container_access_type = "private"
}
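Note: The storage account above references a resource group by name. If that group is not managed elsewhere, you would define it as well; a minimal sketch:

resource "azurerm_resource_group" "example" {
  name     = "my-resource-group"
  location = "East US"
}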

To upload the archive:

resource "azurerm_storage_blob" "example" {
  name                   = "app_package.zip"
  storage_account_name   = azurerm_storage_account.example.name
  storage_container_name = azurerm_storage_container.example.name
  type                   = "Block"
  source                 = data.archive_file.app_package.output_path
  depends_on             = [data.archive_file.app_package]
}

Troubleshooting common issues with archive_file

Below are some common issues you may hit with the archive_file data source in Terraform, and how to troubleshoot them.

1. Incorrect source path

One of the most common issues when using the archive_file data source is specifying an incorrect file or directory path, which leads to errors like:

Error: failed to read source file: no such file or directory

To fix this:

  • Verify the file or directory exists before running Terraform (see the precondition sketch after this list).
  • Use an absolute path if necessary.
  • If using relative paths, ensure they are correct by referencing ${path.module}.
  • Check file permissions to ensure Terraform can read the file.
  • Run terraform plan to confirm Terraform correctly identifies the file.
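One way to fail fast with a clear message is a custom precondition on the data source (available in Terraform 1.2+), using the built-in fileexists() function. A sketch with an illustrative file name:

data "archive_file" "checked" {
  type        = "zip"
  source_file = "${path.module}/example.txt"
  output_path = "${path.module}/example.zip"

  lifecycle {
    precondition {
      condition     = fileexists("${path.module}/example.txt")
      error_message = "example.txt is missing, so the archive cannot be created."
    }
  }
}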

2. Missing required dependencies

The archive_file data source builds archives natively inside the provider, so it does not require an external zip binary. However, workflows that shell out to external tools (such as the local-exec staging script shown earlier, or scripts that call zip directly) will fail if the required binary is not installed or not available in the system's PATH.

If zip is missing, install it via:

  • Linux: sudo apt install zip (Debian/Ubuntu)
  • macOS: brew install zip
  • Windows: Ensure zip.exe is in the system PATH.

3. Unchanged archive not triggering updates

If files inside source_dir change, Terraform may not detect updates in some cases. Use filemd5 to track file modifications:

output "archive_hash" {
  value = filemd5(data.archive_file.example.output_path)
}
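For stronger change detection, a sketch of an alternative: hash every file under the directory with fileset() and feed the combined hash into a terraform_data resource, so any content change forces a replacement at plan time (the directory name is illustrative):

locals {
  # Combined hash of all files under my_folder, sorted for a stable order
  dir_hash = md5(join("", [
    for f in sort(fileset("${path.module}/my_folder", "**")) :
    filemd5("${path.module}/my_folder/${f}")
  ]))
}

resource "terraform_data" "dir_tracker" {
  triggers_replace = local.dir_hash  # replaced whenever any file changes
}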

4. Incorrect file permissions

Permission errors may arise if Terraform creates an archive, but deployment tools (e.g., AWS Lambda, Docker containers, or remote servers) fail due to missing execution permissions. This commonly occurs when archiving shell scripts (.sh), executables, or other files that require specific permissions to run in a deployment environment.

When trying to deploy an AWS Lambda function or an executable script, you might encounter an error such as:

Error: permission denied

or

/bin/sh: ./script.sh: Permission denied

This happens because Terraform archives files with their existing permissions, and some deployment tools require explicit execution rights.

To avoid permission issues, modify the file permissions before Terraform creates the archive:

chmod +x my_script.sh
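If uniform permissions are acceptable, the provider's output_file_mode argument offers another option: it stamps a single octal mode onto every archived file, sidestepping inconsistent local permissions. A sketch (the scripts directory is illustrative):

data "archive_file" "scripts" {
  type             = "zip"
  source_dir       = "${path.module}/scripts"     # illustrative directory of shell scripts
  output_path      = "${path.module}/scripts.zip"
  output_file_mode = "0755"                       # applied to every file in the archive
}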

5. Incorrect path resolution inside a module

When using Terraform modules, file paths are resolved relative to the module's directory, not the root module. This can cause issues if the archive_file data source references a path incorrectly.

Error: failed to read source file: no such file or directory

This typically happens when source_dir or source_file references files assuming a root module path instead of the module's own directory.

To fix this issue, always use path.module instead of path.root to ensure Terraform correctly resolves paths relative to the module.

The correct way to define paths inside a module:

data "archive_file" "lambda_package" {
  type        = "zip"
  source_dir  = "${path.module}/app"  # Ensures correct relative path
  output_path = "${path.module}/lambda.zip"
}

Key points

The archive_file data source in Terraform helps create ZIP and tar.gz archives from single files, multiple files, or entire directories. In this guide, we covered syntax, usage, and troubleshooting issues such as incorrect paths, permissions, and missing files.

We encourage you to explore how Spacelift makes it easy to work with Terraform. If you need help managing your Terraform infrastructure, building more complex workflows on top of it, or handling AWS credentials per run instead of a static pair on your local machine, Spacelift is a fantastic tool for the job.

To learn more about Spacelift, create a free account today or book a demo with one of our engineers.

Written by Mariusz Michalowski
