DocumentDB with Terraform: A Production-Grade Deep Dive
Pressure to ship features faster often produces data models that change frequently. Traditional relational databases can become bottlenecks in that situation, and many teams end up needing a flexible, scalable, schema-less database. DocumentDB, today the API for NoSQL in Azure Cosmos DB (the Terraform provider still uses the GlobalDocumentDB account kind), addresses this need. This post details how to manage DocumentDB within a Terraform-centric infrastructure as code (IaC) workflow, focusing on practical implementation for engineers building and operating production systems. It assumes familiarity with Terraform fundamentals and Azure. The service fits into a platform engineering stack as a managed database offering: provisioned and governed through IaC, integrated with CI/CD pipelines, and monitored via observability tools.
What is "DocumentDB" in Terraform Context?
Within Terraform, DocumentDB is managed through the azurerm provider. The core resource is azurerm_cosmosdb_account, which represents the Cosmos DB account itself; setting kind = "GlobalDocumentDB" on this resource selects the DocumentDB API. Terraform handles the lifecycle of the account, including creation, updates, and deletion.
The azurerm provider exhibits typical Terraform behavior: state management is crucial, and dependencies should be expressed through resource references or depends_on. A key caveat is how long Cosmos DB account operations take. The provider waits for the create call to finish, so an apply can block for many minutes. If dependent resources still fail right after creation, a time_sleep resource is a blunt but common workaround. Furthermore, changes to account-level settings such as the capabilities block or automatic failover can trigger lengthy operations.
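As a rough illustration of the points above, the following sketch selects the DocumentDB API through kind and raises the operation timeouts to allow for slow provisioning. The resource names, the serverless capability, and the timeout values are assumptions chosen for illustration, not recommendations.

```hcl
resource "azurerm_resource_group" "sketch" {
  name     = "cosmosdb-sketch-rg" # hypothetical name
  location = "East US"
}

resource "azurerm_cosmosdb_account" "sketch" {
  name                = "cosmosdb-sketch-account" # hypothetical; must be globally unique
  location            = azurerm_resource_group.sketch.location
  resource_group_name = azurerm_resource_group.sketch.name
  offer_type          = "Standard"
  kind                = "GlobalDocumentDB" # selects the DocumentDB API

  consistency_policy {
    consistency_level = "Session"
  }

  geo_location {
    location          = azurerm_resource_group.sketch.location
    failover_priority = 0
  }

  # Example capability, assumed for this sketch only.
  capabilities {
    name = "EnableServerless"
  }

  # Assumed values; give long-running account operations room to finish.
  timeouts {
    create = "60m"
    update = "60m"
  }
}
```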
Azure Cosmos DB Terraform Documentation
Use Cases and When to Use
DocumentDB excels in scenarios where flexibility and scalability are paramount:
- Personalization Engines: Storing user profiles, preferences, and behavioral data. The schema-less nature allows for evolving data structures without costly migrations. SREs benefit from the auto-scaling capabilities to handle peak loads during marketing campaigns.
- Content Management Systems (CMS): Managing articles, blog posts, and multimedia content. DocumentDB’s ability to store complex nested data structures simplifies content modeling. DevOps teams can automate deployments of CMS infrastructure.
- IoT Data Ingestion: Handling high-velocity data streams from IoT devices. DocumentDB’s global distribution and low latency are critical for real-time analytics. Infrastructure architects can design multi-region deployments for high availability.
- E-commerce Catalogs: Storing product information, including variations, attributes, and reviews. The flexible schema accommodates diverse product types.
- Gaming Leaderboards & Player Data: Storing player profiles, scores, and game state. DocumentDB’s low latency and scalability are essential for a responsive gaming experience.
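For the multi-region scenarios mentioned above, a minimal sketch of an account replicated to a second region might look like the following. The account name, the resource group, and the choice of regions are assumptions; enable_automatic_failover is the argument name used by the 3.x provider series.

```hcl
resource "azurerm_cosmosdb_account" "multi_region" {
  name                      = "cosmosdb-multiregion-sketch" # hypothetical; must be globally unique
  location                  = "East US"
  resource_group_name       = "my-resource-group" # assumed to exist already
  offer_type                = "Standard"
  kind                      = "GlobalDocumentDB"
  enable_automatic_failover = true

  consistency_policy {
    consistency_level = "Session"
  }

  # Write region.
  geo_location {
    location          = "East US"
    failover_priority = 0
  }

  # Read replica and failover target (assumed region).
  geo_location {
    location          = "West Europe"
    failover_priority = 1
  }
}
```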
Key Terraform Resources
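- azurerm_cosmosdb_account: The core resource for creating and managing a Cosmos DB account.
resource "azurerm_cosmosdb_account" "example" {
  name                      = "my-cosmosdb-account" # must be globally unique
  location                  = "East US"
  resource_group_name       = azurerm_resource_group.example.name
  offer_type                = "Standard" # required
  kind                      = "GlobalDocumentDB"
  enable_automatic_failover = true
  # consistency_level belongs inside the required consistency_policy block.
  consistency_policy {
    consistency_level = "Session"
  }
  # At least one geo_location block is required.
  geo_location {
    location          = "East US"
    failover_priority = 0
  }
}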
- azurerm_resource_group: Essential for organizing resources.
resource "azurerm_resource_group" "example" {
  name     = "my-resource-group"
  location = "East US"
}
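- azurerm_cosmosdb_sql_container: Creates a container (similar to a table) within the database.
resource "azurerm_cosmosdb_sql_container" "example" {
  name                = "my-container"
  resource_group_name = azurerm_resource_group.example.name # required
  account_name        = azurerm_cosmosdb_account.example.name
  # Referencing the database resource lets Terraform order the creation.
  database_name       = azurerm_cosmosdb_sql_database.example.name
  partition_key_path  = "/city"
}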
- azurerm_cosmosdb_sql_database: Creates a database within the Cosmos DB account.
resource "azurerm_cosmosdb_sql_database" "example" {
  name                = "my-database"
  resource_group_name = azurerm_resource_group.example.name # required
  account_name        = azurerm_cosmosdb_account.example.name
}
- azurerm_role_assignment: Grants permissions to manage the Cosmos DB account.
data "azuread_user" "example" {
  user_principal_name = "user@example.com"
}
resource "azurerm_role_assignment" "example" {
  scope                = azurerm_cosmosdb_account.example.id
  role_definition_name = "Cosmos DB Operator"
  # Role assignments expect the directory object ID of the principal.
  principal_id         = data.azuread_user.example.object_id
}
- data.azuread_client_config: Retrieves the tenant, client, and object IDs of the identity Terraform is running as.
data "azuread_client_config" "current" {}
- time_sleep: Adds a fixed delay after Cosmos DB account creation (provided by the hashicorp/time provider).
resource "time_sleep" "wait_for_cosmosdb" {
  depends_on      = [azurerm_cosmosdb_account.example]
  create_duration = "60s" # Adjust as needed
}
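- virtual_network_rule: Configures virtual network access to the Cosmos DB account. This is a block inside azurerm_cosmosdb_account rather than a standalone resource, and it takes a subnet ID.
# Inside resource "azurerm_cosmosdb_account" "example" { ... }
is_virtual_network_filter_enabled = true
virtual_network_rule {
  # The subnet needs the Microsoft.AzureCosmosDB service endpoint enabled.
  id = "/subscriptions/YOUR_SUBSCRIPTION_ID/resourceGroups/YOUR_RESOURCE_GROUP/providers/Microsoft.Network/virtualNetworks/YOUR_VNET_NAME/subnets/YOUR_SUBNET_NAME"
}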
Common Patterns & Modules
Using for_each with azurerm_cosmosdb_sql_container allows for dynamic container creation based on a map of container configurations. Remote backends (e.g., Terraform Cloud, Azure Blob Storage) are essential for state locking and collaboration.
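A minimal sketch of the for_each pattern follows. The variable shape, the container names, and the partition keys are hypothetical, and the sketch assumes the resource group, account, and database resources named example that appear elsewhere in this post.

```hcl
variable "containers" {
  description = "Map of container name to its settings (hypothetical shape)."
  type = map(object({
    partition_key_path = string
  }))
  default = {
    profiles = { partition_key_path = "/userId" }
    orders   = { partition_key_path = "/customerId" }
  }
}

resource "azurerm_cosmosdb_sql_container" "this" {
  for_each = var.containers

  name                = each.key
  resource_group_name = azurerm_resource_group.example.name
  account_name        = azurerm_cosmosdb_account.example.name
  database_name       = azurerm_cosmosdb_sql_database.example.name

  # Argument name as used with the 3.x provider pinned in this post;
  # check the provider documentation if you are on a newer major version.
  partition_key_path = each.value.partition_key_path
}
```

Each map key becomes a stable resource address, for example azurerm_cosmosdb_sql_container.this["orders"], so adding or removing one entry does not disturb the other containers.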
A layered module structure is recommended: a core cosmosdb module handles account creation, and separate modules manage databases and containers. Monorepos are suitable for larger projects, while environment-based directories (e.g., environments/dev/cosmosdb) work well for simpler deployments.
Terraform Azure Cosmos DB Module Example
Hands-On Tutorial
This example creates a Cosmos DB account, a database, and a container.
1. Provider Setup:
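terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0"
    }
    # Needed for the time_sleep resource used below.
    time = {
      source  = "hashicorp/time"
      version = "~> 0.9"
    }
  }
}

provider "azurerm" {
  features {}
}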
2. Resource Configuration:
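resource "azurerm_resource_group" "example" {
  name     = "cosmosdb-example-rg"
  location = "East US"
}

resource "azurerm_cosmosdb_account" "example" {
  name                = "cosmosdb-example-account" # must be globally unique
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
  offer_type          = "Standard"
  kind                = "GlobalDocumentDB"

  consistency_policy {
    consistency_level = "Session"
  }

  geo_location {
    location          = azurerm_resource_group.example.location
    failover_priority = 0
  }
}

# Optional buffer between account creation and the database.
resource "time_sleep" "wait_for_cosmosdb" {
  depends_on      = [azurerm_cosmosdb_account.example]
  create_duration = "60s"
}

resource "azurerm_cosmosdb_sql_database" "example" {
  name                = "example-database"
  resource_group_name = azurerm_resource_group.example.name
  account_name        = azurerm_cosmosdb_account.example.name
  depends_on          = [time_sleep.wait_for_cosmosdb] # makes the delay take effect
}

resource "azurerm_cosmosdb_sql_container" "example" {
  name                = "example-container"
  resource_group_name = azurerm_resource_group.example.name
  account_name        = azurerm_cosmosdb_account.example.name
  database_name       = azurerm_cosmosdb_sql_database.example.name
  partition_key_path  = "/id"
}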
3. Apply & Destroy:
terraform init
terraform plan
terraform apply
terraform destroy
terraform plan shows the resources to be created, terraform apply provisions them, and terraform destroy removes them.
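To make the result of the apply easier to consume, outputs along the following lines could be added to the same configuration. This is a sketch; the output names are arbitrary, and it assumes the account resource named example from step 2.

```hcl
output "cosmosdb_endpoint" {
  description = "Endpoint used by clients to connect to the account."
  value       = azurerm_cosmosdb_account.example.endpoint
}

output "cosmosdb_primary_key" {
  description = "Primary key of the account; kept out of CLI output."
  value       = azurerm_cosmosdb_account.example.primary_key
  sensitive   = true
}
```

After an apply, terraform output cosmosdb_endpoint prints the endpoint. The key is still stored in the state file, which is one more reason to keep state in a protected remote backend.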
Enterprise Considerations
Large organizations leverage Terraform Cloud/Enterprise for state management, remote runs, and collaboration. Sentinel or Azure Policy can enforce compliance rules. IAM design should follow the principle of least privilege, using Azure AD groups and role assignments. State locking is critical to prevent concurrent modifications. Costs can be significant, especially with global distribution and high throughput. Multi-region deployments require careful planning to minimize latency and ensure data consistency.
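For teams that keep state in Azure Blob Storage rather than Terraform Cloud, a backend block roughly like the one below provides remote state with locking handled through blob leases. All names here are hypothetical, and the storage account and container are assumed to exist before terraform init runs.

```hcl
terraform {
  backend "azurerm" {
    resource_group_name  = "tfstate-rg"        # hypothetical
    storage_account_name = "tfstateexample001" # hypothetical; must be globally unique
    container_name       = "tfstate"           # hypothetical
    key                  = "cosmosdb/prod.terraform.tfstate"
  }
}
```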
Security and Compliance
Enforce least privilege using azurerm_role_assignment. Implement RBAC to control access to Cosmos DB resources. Use Azure Policy to enforce tagging policies (e.g., environment, owner) for cost tracking and governance. Drift detection (using Terraform Cloud or custom scripts) identifies unauthorized changes. Audit logs should be enabled and integrated with a SIEM system.
data "azuread_group" "example" {
  display_name = "CosmosDBAdmins"
}

resource "azurerm_role_assignment" "cosmosdb_contributor" {
  scope                = azurerm_cosmosdb_account.example.id
  # Built-in role for managing Cosmos DB accounts.
  role_definition_name = "DocumentDB Account Contributor"
  # Role assignments expect the directory object ID of the group.
  principal_id         = data.azuread_group.example.object_id
}
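The role assignment above governs management operations on the account. For access to the data itself, Cosmos DB has its own role assignments, which the provider exposes as azurerm_cosmosdb_sql_role_assignment. The sketch below grants read-only data access to a group; the group name is hypothetical, the role definition ID is assumed to be the built-in data reader role, and the resource group and account named example are assumed from earlier in this post.

```hcl
data "azuread_group" "readers" {
  display_name = "CosmosDBReaders" # hypothetical group
}

resource "azurerm_cosmosdb_sql_role_assignment" "readers" {
  resource_group_name = azurerm_resource_group.example.name
  account_name        = azurerm_cosmosdb_account.example.name

  # Assumed to be the built-in "Cosmos DB Built-in Data Reader" definition.
  role_definition_id = "${azurerm_cosmosdb_account.example.id}/sqlRoleDefinitions/00000000-0000-0000-0000-000000000001"

  principal_id = data.azuread_group.readers.object_id

  # Account-wide scope; a narrower database or container scope is also possible.
  scope = azurerm_cosmosdb_account.example.id
}
```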
Integration with Other Services
- Azure Functions: Triggered by document changes in Cosmos DB.
- Azure Logic Apps: Orchestrating workflows based on Cosmos DB data.
- Azure Event Hubs: Streaming Cosmos DB change feed events.
- Azure Synapse Analytics: Analyzing Cosmos DB data for business intelligence.
- Azure App Service: Applications reading and writing data to Cosmos DB.
graph LR
A[Terraform] --> B(Cosmos DB);
B --> C{Azure Functions};
B --> D{Azure Logic Apps};
B --> E{Azure Event Hubs};
B --> F{Azure Synapse Analytics};
B --> G{Azure App Service};
Module Design Best Practices
Abstract DocumentDB into reusable modules with well-defined input variables (e.g., name, location, resource_group_name, consistency_level). Use output values to expose key attributes (e.g., account_id, endpoint). Use locals for complex configurations. Document the module thoroughly with examples and usage instructions. Consider a private module registry, such as the one in Terraform Cloud, for versioning and collaboration.
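One possible shape for such a module is sketched below, collapsed into a single listing for brevity. The file layout, variable names, and defaults are illustrative; the resource group is passed by name because azurerm_cosmosdb_account takes resource_group_name.

```hcl
# modules/cosmosdb/variables.tf (hypothetical layout)
variable "name" {
  type = string
}

variable "location" {
  type = string
}

variable "resource_group_name" {
  type = string
}

variable "consistency_level" {
  type    = string
  default = "Session"

  validation {
    condition = contains(
      ["Strong", "BoundedStaleness", "Session", "ConsistentPrefix", "Eventual"],
      var.consistency_level
    )
    error_message = "The consistency_level value must be a valid Cosmos DB consistency level."
  }
}

# modules/cosmosdb/main.tf
resource "azurerm_cosmosdb_account" "this" {
  name                = var.name
  location            = var.location
  resource_group_name = var.resource_group_name
  offer_type          = "Standard"
  kind                = "GlobalDocumentDB"

  consistency_policy {
    consistency_level = var.consistency_level
  }

  geo_location {
    location          = var.location
    failover_priority = 0
  }
}

# modules/cosmosdb/outputs.tf
output "account_id" {
  value = azurerm_cosmosdb_account.this.id
}

output "endpoint" {
  value = azurerm_cosmosdb_account.this.endpoint
}
```

The validation block rejects a mistyped consistency level at plan time instead of letting the Azure API return the error midway through an apply.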
CI/CD Automation
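# .github/workflows/cosmosdb-deploy.yml
name: Deploy Cosmos DB
on:
  push:
    branches:
      - main
jobs:
  deploy:
    runs-on: ubuntu-latest
    env:
      # Service principal credentials read by the azurerm provider.
      # The secret names are placeholders; define them in the repository settings.
      ARM_CLIENT_ID: ${{ secrets.ARM_CLIENT_ID }}
      ARM_CLIENT_SECRET: ${{ secrets.ARM_CLIENT_SECRET }}
      ARM_SUBSCRIPTION_ID: ${{ secrets.ARM_SUBSCRIPTION_ID }}
      ARM_TENANT_ID: ${{ secrets.ARM_TENANT_ID }}
    steps:
      - uses: actions/checkout@v3
      - uses: hashicorp/setup-terraform@v2
      - run: terraform fmt -check
      - run: terraform init -input=false
      - run: terraform validate
      - run: terraform plan -input=false -out=tfplan
      - run: terraform apply -input=false tfplan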
Pitfalls & Troubleshooting
- Provisioning Delays: Cosmos DB account creation can take a long time. Use time_sleep or polling.
- Partition Key Selection: Incorrect partition key leads to uneven data distribution and performance issues.
- Throttling: Exceeding RU/s limits results in throttling. Monitor RU/s consumption and adjust accordingly.
- Consistency Level: Choosing the wrong consistency level impacts data consistency and latency.
- State Corruption: Ensure proper state locking and backup.
- Network Connectivity: Virtual network rules can block access if not configured correctly.
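For the throttling pitfall in particular, one option is autoscale throughput on the container, so that provisioned RU/s follow demand up to a ceiling. The sketch below is illustrative only: the container name, the partition key, and the 4000 RU/s ceiling are assumptions, and it relies on the resource group, account, and database named example from the tutorial.

```hcl
resource "azurerm_cosmosdb_sql_container" "autoscaled" {
  name                = "events" # hypothetical container
  resource_group_name = azurerm_resource_group.example.name
  account_name        = azurerm_cosmosdb_account.example.name
  database_name       = azurerm_cosmosdb_sql_database.example.name

  # A key with many distinct values helps spread load across partitions.
  partition_key_path = "/deviceId"

  # Throughput scales automatically up to this ceiling (assumed value).
  autoscale_settings {
    max_throughput = 4000
  }
}
```

Autoscale raises the cost ceiling along with the throughput ceiling, so it is worth pairing with RU/s monitoring rather than treating it as a replacement for it.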
Pros and Cons
Pros:
- Scalability and Flexibility
- Global Distribution
- Schema-less Data Model
- Strong Integration with Azure Services
- Terraform-managed lifecycle
Cons:
- Cost can be high
- Asynchronous provisioning requires workarounds
- Partition key design is critical
- Complexity of consistency levels
- Potential for throttling
Conclusion
DocumentDB, managed through Terraform, provides a powerful and flexible data storage solution for modern applications. By embracing IaC principles and leveraging Terraform’s capabilities, engineers can automate deployments, enforce compliance, and scale their infrastructure efficiently. Start by building a simple module, integrating it into your CI/CD pipeline, and evaluating the cost and performance implications for your specific use case. Focus on proper partition key design and consistency level selection to maximize the benefits of this service.