Motivation
Since 2023, Codecov has maintained an official self-hosted repository with Docker Compose examples and some basic guidance. It's a reasonable starting point, but it falls short in a few important ways:
- Documentation gaps: The official docs cover the happy path (GitHub SaaS + Docker Compose on a single VM) but leave a lot of blanks for anything beyond that.
- No GitLab self-hosted coverage: Our entire engineering workflow runs on a self-hosted GitLab instance. Getting Codecov to integrate with it (OAuth app creation, the right environment variables, the correct redirect URLs) required piecing together information from GitHub issues, the Codecov community forum, and the Codecov Python source itself (to figure out how the environment variables should be named), plus a lot of trial and error. None of it is documented in one place.
- Everything as code: At our company, we don't click through cloud consoles or run one-off scripts to provision infrastructure. Every resource (DNS records, OAuth applications, IAM roles, database instances) is managed through Terraform and reviewed like any other code change. The official self-hosted guide doesn't reflect this approach at all.
This article documents what we actually built: a production-grade, fully Terraform-managed Codecov deployment on AWS ECS Fargate, integrated with a self-hosted GitLab instance, with no manual steps after terraform apply.
Table of Contents
- Architecture Overview
- Prerequisites
- Repository Structure
- providers.tf
- variables.tf
- locals.tf
- data.tf
- Networking: nat-gateway.tf
- Networking: vpc-endpoints.tf
- Networking: service-discovery.tf
- security-groups.tf
- ecs-cluster.tf
- iam.tf
- Data Layer: rds.tf
- Data Layer: elasticache.tf
- Data Layer: efs-codecov-timescale.tf
- Data Layer: s3-codecov.tf
- secrets.tf
- GitLab OAuth: gitlab-oauth-codecov-app.tf
- ECS Services
- autoscaling.tf
- alb.tf
- acm.tf
- dns.tf
- outputs.tf
- CI/CD Pipeline
- Deploying
- Day-2 Operations
Architecture Overview
The deployment uses a private-first network model. All Codecov services run in private subnets with no public IP assignment. External traffic enters through a public Application Load Balancer (ALB) with HTTPS termination, and internal services communicate via AWS Cloud Map service discovery.
Internet
│
▼
Cloudflare (DNS)
│
▼
Application Load Balancer (public subnets, HTTPS 443)
│
▼ (private subnets)
┌──────────────────────────────────────────────────────────────┐
│ ECS Fargate Cluster │
│ │
│ ┌─────────┐ ┌──────────┐ ┌─────┐ ┌────────┐ ┌────┐ │
│ │ Gateway │→ │ Frontend │ │ API │ │ Worker │ │ IA │ │
│ └─────────┘ └──────────┘ └──┬──┘ └────────┘ └────┘ │
│ │ │
│ Internal DNS: mycompany-tooling.local │
│ (AWS Cloud Map) │
└──────────────────────────────────────────────────────────────┘
│ │ │ │
▼ ▼ ▼ ▼
RDS Redis TimescaleDB S3
(PostgreSQL) (Cache) (ECS+EFS) (Coverage data)
Key design decisions:
- No public IPs on tasks; all egress goes through multi-AZ NAT gateways
- VPC endpoints for ECR, S3, Secrets Manager, CloudWatch Logs, and KMS keep traffic off the public internet and reduce NAT costs
- AWS Cloud Map for internal service discovery instead of hardcoded IPs or environment variable injection
- Secrets Manager for sensitive runtime values; SSM Parameter Store for config values
- Deployment circuit breaker with rollback on every ECS service
- GitLab OAuth application provisioned by Terraform itself; no manual setup in the GitLab UI
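The deployment circuit breaker from the list above is a one-block addition to each aws_ecs_service. A minimal sketch (the cluster and task definition references are illustrative; the full service definitions appear later in the ecs-service-*.tf files):

```hcl
# Sketch: every ECS service enables the deployment circuit breaker so a
# failed deployment stops early and rolls back to the last healthy task set.
resource "aws_ecs_service" "example" {
  name            = "example"
  cluster         = aws_ecs_cluster.this.id
  task_definition = aws_ecs_task_definition.example.arn # hypothetical reference
  desired_count   = 1
  launch_type     = "FARGATE"

  deployment_circuit_breaker {
    enable   = true
    rollback = true
  }

  # network_configuration, service_registries, etc. elided
}
```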
Prerequisites
- Terraform >= 1.10
- AWS CLI configured with sufficient permissions
- Cloudflare API token with DNS edit rights on your zone
- GitLab token with admin-level scope (for managing OAuth applications via the GitLab Terraform provider)
- An existing VPC with subnets whose Name tags match the *-private-* and *-public-* patterns
- An S3 bucket for Terraform remote state
Repository Structure
terraform/
├── providers.tf # Backend + provider versions
├── variables.tf # All input variables
├── locals.tf # Computed values, tags, image versions
├── data.tf # Data sources (VPC, subnets, AZs)
├── outputs.tf # Useful post-apply outputs
│
├── nat-gateway.tf # Multi-AZ NAT gateways + route tables
├── vpc-endpoints.tf # Interface and Gateway VPC endpoints
├── service-discovery.tf # AWS Cloud Map namespace + services
├── security-groups.tf # SGs for ECS, RDS, Redis, EFS
│
├── ecs-cluster.tf # Fargate cluster + CloudWatch log group
├── iam.tf # Task execution + task roles
│
├── rds.tf # PostgreSQL 17 (Multi-AZ)
├── elasticache.tf # Redis 7.x
├── efs-codecov-timescale.tf # EFS for TimescaleDB persistence
├── s3-codecov.tf # Coverage data bucket
│
├── secrets.tf # Secrets Manager + SSM Parameters
├── gitlab-oauth-codecov-app.tf # GitLab OAuth application
│
├── ecs-service-codecov-timescale.tf # TimescaleDB on Fargate
├── ecs-service-codecov-gateway.tf # Reverse proxy
├── ecs-service-codecov-frontend.tf # Web UI
├── ecs-service-codecov-api.tf # Backend API
├── ecs-service-codecov-worker.tf # Background worker
├── ecs-service-codecov-ai.tf # AI service
│
├── autoscaling.tf # CPU-based auto-scaling for API + Worker
├── alb.tf # ALB (HTTP→HTTPS redirect)
├── acm.tf # ACM certificate + Cloudflare DNS validation
└── dns.tf # Cloudflare CNAME → ALB
providers.tf
terraform {
required_version = ">= 1.10.0"
backend "s3" {
bucket = "mycompany-tf-state"
key = "tooling/terraform.tfstate"
region = "eu-central-1"
use_lockfile = true
encrypt = true
}
required_providers {
aws = {
source = "hashicorp/aws"
version = ">= 6.0"
}
cloudflare = {
source = "cloudflare/cloudflare"
version = ">= 5.0"
}
random = {
source = "hashicorp/random"
version = ">= 3.0"
}
gitlab = {
source = "gitlabhq/gitlab"
version = ">= 18.0"
}
}
}
provider "aws" {
region = var.region
}
provider "cloudflare" {
alias = "main"
api_token = var.cloudflare_api_token
}
provider "gitlab" {
base_url = "${var.gitlab_url}/api/v4"
token = var.gitlab_token
}
variables.tf
################################################################################
# General
################################################################################
variable "region" {
description = "AWS region"
type = string
default = "eu-central-1"
}
variable "environment" {
description = "Environment name"
type = string
default = "tooling"
}
variable "vpc_name" {
description = "Name of the existing VPC to look up via data source"
type = string
default = "mycompany-vpc-tooling"
}
################################################################################
# Cloudflare
################################################################################
variable "cloudflare_api_token" {
description = "Cloudflare API token"
type = string
sensitive = true
}
variable "cloudflare_zone_id" {
description = "Cloudflare zone ID for your domain"
type = string
}
################################################################################
# Codecov Configuration
################################################################################
variable "codecov_domain" {
description = "Domain name for Codecov"
type = string
default = "codecov.example.com"
}
variable "codecov_enterprise_license" {
description = "Codecov enterprise license key (default 50-user community license is included in the image)"
type = string
default = ""
sensitive = true
}
variable "codecov_admins" {
description = "List of GitLab usernames to designate as Codecov install admins"
type = list(string)
default = ["alice", "bob", "carol"]
}
variable "codecov_upload_token" {
description = "Global upload token for Codecov CI integration (auto-generated if empty)"
type = string
default = ""
sensitive = true
}
################################################################################
# GitLab
################################################################################
variable "gitlab_url" {
description = "GitLab instance URL"
type = string
default = "https://git.example.com"
}
variable "gitlab_token" {
description = "GitLab API token with admin scope for managing OAuth applications"
type = string
sensitive = true
}
################################################################################
# RDS
################################################################################
variable "rds_instance_class" {
description = "RDS instance class"
type = string
default = "db.t3.small"
}
variable "rds_allocated_storage" {
description = "Initial allocated storage in GB"
type = number
default = 20
}
variable "rds_max_allocated_storage" {
description = "Maximum allocated storage in GB for autoscaling"
type = number
default = 100
}
################################################################################
# TimescaleDB
################################################################################
variable "timescale_image" {
description = "TimescaleDB docker image (pin this!)"
type = string
default = "timescale/timescaledb:2.25.1-pg17"
}
variable "timescale_db_username" {
description = "TimescaleDB username"
type = string
default = "codecov"
}
variable "timescale_db_name" {
description = "TimescaleDB database name"
type = string
default = "codecov_timeseries"
}
################################################################################
# ElastiCache
################################################################################
variable "redis_node_type" {
description = "ElastiCache Redis node type"
type = string
default = "cache.t3.micro"
}
locals.tf
locals {
name_prefix = "mycompany-tooling"
tags = {
Project = "mycompany-tooling"
Environment = var.environment
ManagedBy = "terraform"
Service = "codecov"
}
# Codecov container images pin API and Worker to a specific calver release.
  # Gateway, Frontend, and IA don't yet have 26.2.2 tags, so latest-calver is used.
codecov_version = "26.2.2"
codecov_gateway_version = "latest-calver"
codecov_frontend_version = "latest-calver"
codecov_ia_version = "latest-calver"
codecov_images = {
gateway = "codecov/self-hosted-gateway:${local.codecov_gateway_version}"
frontend = "codecov/self-hosted-frontend:${local.codecov_frontend_version}"
ia = "codecov/self-hosted-api-umbrella:${local.codecov_ia_version}"
api = "codecov/self-hosted-api:${local.codecov_version}"
worker = "codecov/self-hosted-worker:${local.codecov_version}"
}
# Default 50-user community license bundled in the self-hosted image.
# Override by setting var.codecov_enterprise_license.
codecov_default_license = "F5O0Fu5ASFTPtWXM51BK8YQlq7IM2s+8TBGULrf9Um7wHjfPwI+Z3E4PfF/dPs6Uc5A+MLti+2etHq5dnFEfZgoiIVCLZ8x+0BVmUSWwPS42vJXnf1veY9Bglang4mDIhmfWfp5l6AT6cxmAVFpGrwobiK6OcN9pjWx4iWabazmsOiF9LM++v0WtuHNvhgzRcKmnJPgqahEB7qqF6KQ1hg=="
codecov_license = var.codecov_enterprise_license != "" ? var.codecov_enterprise_license : local.codecov_default_license
}
data.tf
################################################################################
# General Data Sources
################################################################################
data "aws_region" "current" {}
data "aws_caller_identity" "current" {}
data "aws_availability_zones" "available" {}
################################################################################
# VPC Data Sources (from existing VPC)
################################################################################
data "aws_vpc" "main" {
filter {
name = "tag:Name"
values = [var.vpc_name]
}
}
# Subnets are tagged with Name pattern: {vpc_name}-{az}-private-{env}
data "aws_subnets" "private" {
filter {
name = "vpc-id"
values = [data.aws_vpc.main.id]
}
filter {
name = "tag:Name"
values = ["*-private-*"]
}
}
# Subnets are tagged with Name pattern: {vpc_name}-{az}-public-{env}
data "aws_subnets" "public" {
filter {
name = "vpc-id"
values = [data.aws_vpc.main.id]
}
filter {
name = "tag:Name"
values = ["*-public-*"]
}
}
data "aws_internet_gateway" "this" {
filter {
name = "attachment.vpc-id"
values = [data.aws_vpc.main.id]
}
}
# Fetch subnet details to map AZ -> subnet ID
data "aws_subnet" "public" {
for_each = toset(data.aws_subnets.public.ids)
id = each.value
}
data "aws_subnet" "private" {
for_each = toset(data.aws_subnets.private.ids)
id = each.value
}
Networking: nat-gateway.tf
One NAT gateway per availability zone ensures that a single AZ failure doesn't break outbound connectivity for the rest of the cluster. Each private subnet gets its own route table pointing at the NAT gateway in the same AZ. IPv6 egress is handled by an egress-only internet gateway, which is needed for pulling images from Docker Hub over IPv6.
################################################################################
# Multi-AZ NAT Gateway (one per AZ)
################################################################################
locals {
# Map: AZ -> public subnet id (one per AZ)
public_subnet_by_az = {
for subnet_id, s in data.aws_subnet.public :
s.availability_zone => subnet_id
}
# Map: private subnet id -> AZ
private_az_by_subnet = {
for subnet_id, s in data.aws_subnet.private :
subnet_id => s.availability_zone
}
}
# One EIP per AZ
resource "aws_eip" "nat" {
for_each = local.public_subnet_by_az
domain = "vpc"
tags = merge(local.tags, {
Name = "${local.name_prefix}-nat-eip-${each.key}"
})
}
# One NAT GW per AZ, in the public subnet for that AZ
resource "aws_nat_gateway" "this" {
for_each = local.public_subnet_by_az
allocation_id = aws_eip.nat[each.key].id
subnet_id = each.value
tags = merge(local.tags, {
Name = "${local.name_prefix}-nat-${each.key}"
})
depends_on = [data.aws_internet_gateway.this]
}
# A private route table per AZ
resource "aws_route_table" "private" {
for_each = local.public_subnet_by_az
vpc_id = data.aws_vpc.main.id
tags = merge(local.tags, {
Name = "${local.name_prefix}-private-rt-${each.key}"
})
}
# Default IPv4 route per AZ -> NAT GW in that AZ
resource "aws_route" "private_default_v4" {
for_each = local.public_subnet_by_az
route_table_id = aws_route_table.private[each.key].id
destination_cidr_block = "0.0.0.0/0"
nat_gateway_id = aws_nat_gateway.this[each.key].id
}
# Route S3 traffic through the Gateway endpoint (keeps S3 traffic private)
resource "aws_vpc_endpoint_route_table_association" "s3_private" {
for_each = local.public_subnet_by_az
vpc_endpoint_id = aws_vpc_endpoint.s3.id
route_table_id = aws_route_table.private[each.key].id
}
# Associate each private subnet to the private route table for its AZ
resource "aws_route_table_association" "private" {
for_each = local.private_az_by_subnet
subnet_id = each.key
route_table_id = aws_route_table.private[each.value].id
}
# Egress-only IGW for IPv6
resource "aws_egress_only_internet_gateway" "this" {
vpc_id = data.aws_vpc.main.id
tags = merge(local.tags, {
Name = "${local.name_prefix}-eigw"
})
}
resource "aws_route" "private_default_ipv6" {
for_each = aws_route_table.private
route_table_id = each.value.id
destination_ipv6_cidr_block = "::/0"
egress_only_gateway_id = aws_egress_only_internet_gateway.this.id
}
# Allow IPv6 egress from ECS tasks (needed for Docker Hub pulls over IPv6)
resource "aws_security_group_rule" "ecs_ipv6_egress_all" {
type = "egress"
security_group_id = aws_security_group.ecs_services.id
from_port = 0
to_port = 0
protocol = "-1"
ipv6_cidr_blocks = ["::/0"]
description = "Allow IPv6 egress (needed for Docker Hub pulls over IPv6)"
}
Networking: vpc-endpoints.tf
VPC endpoints cut NAT gateway data processing costs for high-volume traffic (ECR image pulls, CloudWatch log shipping, Secrets Manager calls) and keep that traffic inside the AWS network.
Note: This file does not include ssm or ssmmessages VPC endpoints. If your VPC doesn't already have them from another deployment, add them here; they're required for ECS Exec and for SSM parameter access from tasks running in private subnets.
################################################################################
# VPC Endpoints so Fargate in private subnets can reach AWS APIs
################################################################################
# SG for Interface Endpoints (allow HTTPS from ECS tasks)
resource "aws_security_group" "vpc_endpoints" {
name = "${local.name_prefix}-vpc-endpoints"
description = "Allow ECS tasks to reach VPC interface endpoints"
vpc_id = data.aws_vpc.main.id
ingress {
description = "HTTPS from ECS services SG"
from_port = 443
to_port = 443
protocol = "tcp"
security_groups = [
aws_security_group.ecs_services.id,
aws_security_group.timescale.id,
]
}
egress {
description = "All egress"
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = merge(local.tags, { Component = "vpc-endpoints" })
}
locals {
vpce_subnets = data.aws_subnets.private.ids
vpce_sg = [aws_security_group.vpc_endpoints.id]
}
# CloudWatch Logs for ECS task log shipping
resource "aws_vpc_endpoint" "logs" {
vpc_id = data.aws_vpc.main.id
service_name = "com.amazonaws.${var.region}.logs"
vpc_endpoint_type = "Interface"
subnet_ids = local.vpce_subnets
security_group_ids = local.vpce_sg
private_dns_enabled = true
tags = merge(local.tags, { Name = "${local.name_prefix}-vpce-logs" })
}
# Secrets Manager for pulling secrets at task startup
resource "aws_vpc_endpoint" "secretsmanager" {
vpc_id = data.aws_vpc.main.id
service_name = "com.amazonaws.${var.region}.secretsmanager"
vpc_endpoint_type = "Interface"
subnet_ids = local.vpce_subnets
security_group_ids = local.vpce_sg
private_dns_enabled = true
tags = merge(local.tags, { Name = "${local.name_prefix}-vpce-secrets" })
}
# ECR API for image manifest lookups
resource "aws_vpc_endpoint" "ecr_api" {
vpc_id = data.aws_vpc.main.id
service_name = "com.amazonaws.${var.region}.ecr.api"
vpc_endpoint_type = "Interface"
subnet_ids = local.vpce_subnets
security_group_ids = local.vpce_sg
private_dns_enabled = true
tags = merge(local.tags, { Name = "${local.name_prefix}-vpce-ecr-api" })
}
# ECR DKR for actual image layer pulls
resource "aws_vpc_endpoint" "ecr_dkr" {
vpc_id = data.aws_vpc.main.id
service_name = "com.amazonaws.${var.region}.ecr.dkr"
vpc_endpoint_type = "Interface"
subnet_ids = local.vpce_subnets
security_group_ids = local.vpce_sg
private_dns_enabled = true
tags = merge(local.tags, { Name = "${local.name_prefix}-vpce-ecr-dkr" })
}
# KMS required because Secrets Manager uses KMS for encryption
resource "aws_vpc_endpoint" "kms" {
vpc_id = data.aws_vpc.main.id
service_name = "com.amazonaws.${var.region}.kms"
vpc_endpoint_type = "Interface"
subnet_ids = local.vpce_subnets
security_group_ids = local.vpce_sg
private_dns_enabled = true
tags = merge(local.tags, { Name = "${local.name_prefix}-vpce-kms" })
}
# S3 Gateway for ECR image layers (stored in S3) + Codecov S3 bucket
resource "aws_vpc_endpoint" "s3" {
vpc_id = data.aws_vpc.main.id
service_name = "com.amazonaws.${var.region}.s3"
vpc_endpoint_type = "Gateway"
route_table_ids = [for rt in aws_route_table.private : rt.id]
tags = merge(local.tags, { Name = "${local.name_prefix}-vpce-s3" })
}
Networking: service-discovery.tf
AWS Cloud Map provides internal DNS for service-to-service communication. Each Codecov component registers under a shared private namespace, resolving to <service>.mycompany-tooling.local.
################################################################################
# AWS Cloud Map - Service Discovery Namespace
################################################################################
resource "aws_service_discovery_private_dns_namespace" "this" {
name = "${local.name_prefix}.local"
description = "Service discovery namespace for ${local.name_prefix} ECS services"
vpc = data.aws_vpc.main.id
tags = local.tags
}
resource "aws_service_discovery_service" "api" {
name = "api"
dns_config {
namespace_id = aws_service_discovery_private_dns_namespace.this.id
routing_policy = "MULTIVALUE"
dns_records {
ttl = 10
type = "A"
}
}
tags = local.tags
}
resource "aws_service_discovery_service" "frontend" {
name = "frontend"
dns_config {
namespace_id = aws_service_discovery_private_dns_namespace.this.id
routing_policy = "MULTIVALUE"
dns_records {
ttl = 10
type = "A"
}
}
tags = local.tags
}
resource "aws_service_discovery_service" "timescale" {
name = "timescale"
dns_config {
namespace_id = aws_service_discovery_private_dns_namespace.this.id
routing_policy = "MULTIVALUE"
dns_records {
ttl = 10
type = "A"
}
}
tags = local.tags
}
resource "aws_service_discovery_service" "ia" {
name = "ia"
dns_config {
namespace_id = aws_service_discovery_private_dns_namespace.this.id
routing_policy = "MULTIVALUE"
dns_records {
ttl = 10
type = "A"
}
}
tags = merge(local.tags, { Component = "ia" })
}
security-groups.tf
################################################################################
# Security Group - ECS Services
################################################################################
resource "aws_security_group" "ecs_services" {
name = "${local.name_prefix}-ecs-services"
description = "Security group for ECS Fargate services"
vpc_id = data.aws_vpc.main.id
tags = merge(local.tags, {
Name = "${local.name_prefix}-ecs-services"
})
}
resource "aws_vpc_security_group_ingress_rule" "ecs_from_alb" {
security_group_id = aws_security_group.ecs_services.id
description = "Allow traffic from ALB"
from_port = 0
to_port = 65535
ip_protocol = "tcp"
referenced_security_group_id = module.alb.security_group_id
}
# Allow ECS services to communicate with each other (Cloud Map service discovery)
resource "aws_vpc_security_group_ingress_rule" "ecs_from_ecs" {
security_group_id = aws_security_group.ecs_services.id
description = "Allow inter-service communication"
from_port = 0
to_port = 65535
ip_protocol = "tcp"
referenced_security_group_id = aws_security_group.ecs_services.id
}
resource "aws_vpc_security_group_egress_rule" "ecs_to_internet" {
security_group_id = aws_security_group.ecs_services.id
description = "Allow outbound for S3, ECR, Secrets Manager, and container image pulls"
ip_protocol = "-1"
cidr_ipv4 = "0.0.0.0/0"
}
################################################################################
# Security Group - RDS
################################################################################
resource "aws_security_group" "rds" {
name = "${local.name_prefix}-rds"
description = "Security group for RDS PostgreSQL"
vpc_id = data.aws_vpc.main.id
tags = merge(local.tags, {
Name = "${local.name_prefix}-rds"
})
}
resource "aws_vpc_security_group_ingress_rule" "rds_from_ecs" {
security_group_id = aws_security_group.rds.id
description = "PostgreSQL from ECS services"
from_port = 5432
to_port = 5432
ip_protocol = "tcp"
referenced_security_group_id = aws_security_group.ecs_services.id
}
################################################################################
# Security Group - Redis
################################################################################
resource "aws_security_group" "redis" {
name = "${local.name_prefix}-redis"
description = "Security group for ElastiCache Redis"
vpc_id = data.aws_vpc.main.id
tags = merge(local.tags, {
Name = "${local.name_prefix}-redis"
})
}
resource "aws_vpc_security_group_ingress_rule" "redis_from_ecs" {
security_group_id = aws_security_group.redis.id
description = "Redis from ECS services"
from_port = 6379
to_port = 6379
ip_protocol = "tcp"
referenced_security_group_id = aws_security_group.ecs_services.id
}
################################################################################
# Security Group - TimescaleDB (ECS)
################################################################################
resource "aws_security_group" "timescale" {
name = "${local.name_prefix}-timescale"
description = "Security group for TimescaleDB ECS service"
vpc_id = data.aws_vpc.main.id
tags = merge(local.tags, {
Name = "${local.name_prefix}-timescale"
})
}
resource "aws_vpc_security_group_ingress_rule" "timescale_from_ecs" {
security_group_id = aws_security_group.timescale.id
description = "PostgreSQL from ECS services"
from_port = 5432
to_port = 5432
ip_protocol = "tcp"
referenced_security_group_id = aws_security_group.ecs_services.id
}
resource "aws_vpc_security_group_egress_rule" "timescale_to_any" {
security_group_id = aws_security_group.timescale.id
description = "Egress"
ip_protocol = "-1"
cidr_ipv4 = "0.0.0.0/0"
}
################################################################################
# Security Group - EFS (TimescaleDB persistence)
################################################################################
resource "aws_security_group" "efs_timescale" {
name = "${local.name_prefix}-efs-timescale"
description = "EFS SG for TimescaleDB"
vpc_id = data.aws_vpc.main.id
tags = merge(local.tags, { Name = "${local.name_prefix}-efs-timescale" })
}
resource "aws_vpc_security_group_ingress_rule" "efs_timescale_from_ecs" {
security_group_id = aws_security_group.efs_timescale.id
description = "NFS from ECS tasks"
from_port = 2049
to_port = 2049
ip_protocol = "tcp"
referenced_security_group_id = aws_security_group.ecs_services.id
}
resource "aws_vpc_security_group_ingress_rule" "efs_timescale_from_timescale" {
security_group_id = aws_security_group.efs_timescale.id
description = "NFS from Timescale task SG"
from_port = 2049
to_port = 2049
ip_protocol = "tcp"
referenced_security_group_id = aws_security_group.timescale.id
}
resource "aws_vpc_security_group_egress_rule" "efs_timescale_to_any" {
security_group_id = aws_security_group.efs_timescale.id
description = "Egress"
ip_protocol = "-1"
cidr_ipv4 = "0.0.0.0/0"
}
ecs-cluster.tf
################################################################################
# ECS Cluster
################################################################################
resource "aws_ecs_cluster" "this" {
name = "${local.name_prefix}-ecs"
configuration {
execute_command_configuration {
logging = "OVERRIDE"
log_configuration {
cloud_watch_log_group_name = aws_cloudwatch_log_group.ecs_cluster.name
}
}
}
setting {
name = "containerInsights"
value = "enabled"
}
tags = local.tags
}
resource "aws_ecs_cluster_capacity_providers" "this" {
cluster_name = aws_ecs_cluster.this.name
capacity_providers = ["FARGATE"]
default_capacity_provider_strategy {
base = 1
weight = 100
capacity_provider = "FARGATE"
}
}
resource "aws_cloudwatch_log_group" "ecs_cluster" {
name = "/aws/ecs/${local.name_prefix}-ecs"
retention_in_days = 90
tags = merge(local.tags, {
Name = "/aws/ecs/${local.name_prefix}-ecs"
})
}
iam.tf
Two roles are required:
- Task Execution Role: used by the ECS agent to pull images, read secrets from Secrets Manager and SSM, and write logs to CloudWatch.
- Task Role: used by the running application containers; here, to read/write the S3 coverage bucket using IAM auth (no static credentials needed).
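In the task definitions (shown later in the ecs-service-*.tf files), the two roles attach to different arguments. A sketch with assumed cpu/memory values and a trimmed container definition:

```hcl
# Sketch: execution_role_arn is assumed by the ECS agent (image pulls,
# secrets, logs); task_role_arn is assumed by the running containers (S3).
resource "aws_ecs_task_definition" "worker" {
  family                   = "${local.name_prefix}-codecov-worker"
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc"
  cpu                      = 1024 # illustrative sizing
  memory                   = 2048
  execution_role_arn       = aws_iam_role.ecs_task_execution.arn
  task_role_arn            = aws_iam_role.ecs_task_codecov.arn

  container_definitions = jsonencode([{
    name      = "worker"
    image     = local.codecov_images.worker
    essential = true
    # environment, secrets, logConfiguration elided
  }])
}
```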
################################################################################
# ECS Task Execution Role
# Used by ECS agent to pull images, access secrets, write logs
################################################################################
resource "aws_iam_role" "ecs_task_execution" {
name = "${local.name_prefix}-ecsTaskExecutionRole"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Action = "sts:AssumeRole"
Effect = "Allow"
Principal = { Service = "ecs-tasks.amazonaws.com" }
}
]
})
tags = merge(local.tags, { Purpose = "ECS Task Execution" })
}
resource "aws_iam_role_policy_attachment" "ecs_task_execution_managed" {
role = aws_iam_role.ecs_task_execution.name
policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}
resource "aws_iam_role_policy" "ecs_task_execution_secrets" {
name = "${local.name_prefix}-ecs-task-execution-secrets"
role = aws_iam_role.ecs_task_execution.id
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Sid = "SecretsManagerAccess"
Effect = "Allow"
Action = ["secretsmanager:GetSecretValue"]
Resource = [
"arn:aws:secretsmanager:${var.region}:${data.aws_caller_identity.current.account_id}:secret:${local.name_prefix}/codecov/*"
]
},
{
Sid = "SSMParameterAccess"
Effect = "Allow"
Action = ["ssm:GetParameters"]
Resource = [
"arn:aws:ssm:${var.region}:${data.aws_caller_identity.current.account_id}:parameter/${local.name_prefix}/codecov/*"
]
},
{
Sid = "CloudWatchLogs"
Effect = "Allow"
Action = [
"logs:CreateLogGroup",
"logs:CreateLogStream",
"logs:PutLogEvents"
]
Resource = [
"arn:aws:logs:${var.region}:${data.aws_caller_identity.current.account_id}:log-group:/aws/ecs/${local.name_prefix}-*",
"arn:aws:logs:${var.region}:${data.aws_caller_identity.current.account_id}:log-group:/aws/ecs/${local.name_prefix}-*:log-stream:*"
]
}
]
})
}
resource "aws_iam_role_policy" "ecs_task_execution_efs" {
name = "${local.name_prefix}-ecs-task-execution-efs"
role = aws_iam_role.ecs_task_execution.id
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Sid = "EfsMount"
Effect = "Allow"
Action = [
"elasticfilesystem:ClientMount",
"elasticfilesystem:ClientWrite",
"elasticfilesystem:ClientRootAccess"
]
Resource = [
aws_efs_file_system.timescale.arn,
aws_efs_access_point.timescale.arn
]
}
]
})
}
resource "aws_iam_role_policy_attachment" "ecs_task_execution_ssm" {
role = aws_iam_role.ecs_task_execution.name
policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}
################################################################################
# ECS Task Role (Application-level permissions)
# Used by Codecov containers to access S3
################################################################################
resource "aws_iam_role" "ecs_task_codecov" {
name = "${local.name_prefix}-codecov-taskRole"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Action = "sts:AssumeRole"
Effect = "Allow"
Principal = { Service = "ecs-tasks.amazonaws.com" }
}
]
})
tags = merge(local.tags, { Purpose = "ECS Task - Codecov Application" })
}
resource "aws_iam_role_policy" "ecs_task_codecov_s3" {
name = "${local.name_prefix}-codecov-s3-access"
role = aws_iam_role.ecs_task_codecov.id
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Sid = "S3CodecovStorageAccess"
Effect = "Allow"
Action = [
"s3:GetObject",
"s3:PutObject",
"s3:DeleteObject",
"s3:ListBucket",
"s3:AbortMultipartUpload",
"s3:ListMultipartUploadParts",
"s3:GetBucketLocation",
"s3:HeadBucket",
"s3:ListBucketVersions"
]
Resource = [
aws_s3_bucket.codecov_storage.arn,
"${aws_s3_bucket.codecov_storage.arn}/*"
]
}
]
})
}
Data Layer: rds.tf
################################################################################
# RDS PostgreSQL 17 for Codecov
################################################################################
resource "aws_db_subnet_group" "codecov" {
name = "${local.name_prefix}-postgres"
subnet_ids = data.aws_subnets.private.ids
tags = merge(local.tags, {
Name = "${local.name_prefix}-postgres"
})
}
resource "random_password" "rds_password" {
length = 32
special = false # Avoid URL-encoding issues in connection strings
}
resource "aws_db_instance" "codecov" {
identifier = "${local.name_prefix}-postgres"
engine = "postgres"
engine_version = "17"
instance_class = var.rds_instance_class
allocated_storage = var.rds_allocated_storage
max_allocated_storage = var.rds_max_allocated_storage
storage_type = "gp3"
storage_encrypted = true
db_name = "codecov"
username = "db_admin"
password = random_password.rds_password.result
multi_az = true
publicly_accessible = false
db_subnet_group_name = aws_db_subnet_group.codecov.name
vpc_security_group_ids = [aws_security_group.rds.id]
backup_retention_period = 7
backup_window = "03:00-04:00"
maintenance_window = "sun:04:00-sun:05:00"
skip_final_snapshot = false
final_snapshot_identifier = "${local.name_prefix}-postgres-final"
deletion_protection = true
performance_insights_enabled = true
tags = merge(local.tags, {
Name = "${local.name_prefix}-postgres"
})
}
Data Layer: elasticache.tf
################################################################################
# ElastiCache Redis for Codecov
################################################################################
resource "aws_elasticache_subnet_group" "codecov" {
name = "${local.name_prefix}-codecov"
subnet_ids = data.aws_subnets.private.ids
tags = local.tags
}
resource "aws_elasticache_cluster" "codecov" {
cluster_id = "${local.name_prefix}-codecov-redis"
engine = "redis"
engine_version = "7.1"
node_type = var.redis_node_type
num_cache_nodes = 1
parameter_group_name = "default.redis7"
port = 6379
subnet_group_name = aws_elasticache_subnet_group.codecov.name
security_group_ids = [aws_security_group.redis.id]
snapshot_retention_limit = 7
tags = merge(local.tags, {
Name = "${local.name_prefix}-codecov-redis"
})
}
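A single cache node is adequate for our load, but it is a single point of failure for Codecov's job queues. If you need Redis HA, a replication group with automatic failover is close to a drop-in swap; a sketch reusing the same subnet and security groups (the `codecov_ha` name is hypothetical):

```hcl
# Hypothetical HA variant: one primary plus one replica with automatic
# failover across AZs, replacing the single-node cluster above.
resource "aws_elasticache_replication_group" "codecov_ha" {
  replication_group_id       = "${local.name_prefix}-codecov-redis"
  description                = "Codecov Redis with automatic failover"
  engine                     = "redis"
  engine_version             = "7.1"
  node_type                  = var.redis_node_type
  num_cache_clusters         = 2
  automatic_failover_enabled = true
  multi_az_enabled           = true
  port                       = 6379
  subnet_group_name          = aws_elasticache_subnet_group.codecov.name
  security_group_ids         = [aws_security_group.redis.id]
  snapshot_retention_limit   = 7
  tags                       = local.tags
}
```

With this variant, the Redis URL in secrets.tf would point at `primary_endpoint_address` on the replication group instead of `cache_nodes[0].address` on the cluster.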
Data Layer: efs-codecov-timescale.tf
Codecov uses TimescaleDB for time-series metrics (coverage trends over time). Instead of running a separate managed database, we deploy it on ECS Fargate with EFS-backed persistent storage. The EFS access point enforces UID/GID 999, which matches the postgres user inside the TimescaleDB container.
resource "aws_efs_file_system" "timescale" {
encrypted = true
throughput_mode = "bursting"
tags = merge(local.tags, {
Name = "${local.name_prefix}-timescale-efs"
Component = "timescale"
})
}
resource "aws_efs_mount_target" "timescale" {
for_each = toset(data.aws_subnets.private.ids)
file_system_id = aws_efs_file_system.timescale.id
subnet_id = each.value
security_groups = [aws_security_group.efs_timescale.id]
}
resource "aws_efs_access_point" "timescale" {
file_system_id = aws_efs_file_system.timescale.id
posix_user {
uid = 999
gid = 999
}
root_directory {
path = "/pgdata"
creation_info {
owner_uid = 999
owner_gid = 999
permissions = "0750"
}
}
tags = merge(local.tags, {
Name = "${local.name_prefix}-timescale-ap"
Component = "timescale"
})
}
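Unlike RDS, the EFS file system above gets no backups by default. If you want the TimescaleDB data covered by AWS Backup, it is a one-resource addition (a sketch; enable it only if AWS Backup fits your retention model):

```hcl
# Opt the TimescaleDB file system into EFS automatic backups (AWS Backup).
resource "aws_efs_backup_policy" "timescale" {
  file_system_id = aws_efs_file_system.timescale.id

  backup_policy {
    status = "ENABLED"
  }
}
```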
Data Layer: s3-codecov.tf
################################################################################
# S3 Bucket for Codecov Coverage Data
################################################################################
resource "aws_s3_bucket" "codecov_storage" {
bucket = "mycompany-codecov-storage"
tags = merge(local.tags, {
Name = "mycompany-codecov-storage"
})
}
resource "aws_s3_bucket_versioning" "codecov_storage" {
bucket = aws_s3_bucket.codecov_storage.id
versioning_configuration {
status = "Enabled"
}
}
resource "aws_s3_bucket_server_side_encryption_configuration" "codecov_storage" {
bucket = aws_s3_bucket.codecov_storage.id
rule {
apply_server_side_encryption_by_default {
sse_algorithm = "AES256"
}
}
}
resource "aws_s3_bucket_public_access_block" "codecov_storage" {
bucket = aws_s3_bucket.codecov_storage.id
block_public_acls = true
block_public_policy = true
ignore_public_acls = true
restrict_public_buckets = true
}
resource "aws_s3_bucket_policy" "codecov_storage_https_only" {
bucket = aws_s3_bucket.codecov_storage.id
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Sid = "EnforceHTTPS"
Effect = "Deny"
Principal = "*"
Action = "s3:*"
Resource = [
aws_s3_bucket.codecov_storage.arn,
"${aws_s3_bucket.codecov_storage.arn}/*"
]
Condition = {
Bool = { "aws:SecureTransport" = "false" }
}
}
]
})
}
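One consequence of enabling versioning with no expiry: every overwritten report keeps its old versions forever. A companion lifecycle rule worth considering expires noncurrent versions (the 30-day window below is our assumption, not a Codecov requirement):

```hcl
# Expire noncurrent object versions so the versioned bucket does not grow
# without bound. 30 days is an assumed retention window; adjust as needed.
resource "aws_s3_bucket_lifecycle_configuration" "codecov_storage" {
  bucket = aws_s3_bucket.codecov_storage.id

  rule {
    id     = "expire-noncurrent-versions"
    status = "Enabled"

    # Empty filter applies the rule to the whole bucket.
    filter {}

    noncurrent_version_expiration {
      noncurrent_days = 30
    }
  }
}
```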
secrets.tf
Secrets Manager stores all sensitive runtime values. SSM Parameter Store holds configuration that needs central management: the OAuth client ID as a plain `String`, and the license key and upload token as `SecureString` parameters.
Note on `ignore_changes`: The `cookie_secret` and `upload_token` use `lifecycle { ignore_changes = [...] }`, which pins the initially applied value so later plans never overwrite it. To rotate the cookie secret manually, taint both the password and the stored version (tainting only the version would re-create it with the same value): `terraform taint random_password.cookie_secret`, then `terraform taint aws_secretsmanager_secret_version.codecov_cookie_secret`, then `apply`.
################################################################################
# Random password generation
################################################################################
resource "random_password" "cookie_secret" {
length = 64
special = false
}
resource "random_password" "upload_token" {
length = 40
special = false
}
resource "random_password" "timescale_password" {
length = 32
special = false
}
################################################################################
# Secrets Manager - Database URLs
################################################################################
resource "aws_secretsmanager_secret" "codecov_database_url" {
name = "${local.name_prefix}/codecov/database-url"
tags = local.tags
}
resource "aws_secretsmanager_secret_version" "codecov_database_url" {
secret_id = aws_secretsmanager_secret.codecov_database_url.id
secret_string = format(
"postgres://%s:%s@%s:%s/%s",
aws_db_instance.codecov.username,
random_password.rds_password.result,
aws_db_instance.codecov.address,
aws_db_instance.codecov.port,
aws_db_instance.codecov.db_name
)
}
resource "aws_secretsmanager_secret" "codecov_timeseries_database_url" {
name = "${local.name_prefix}/codecov/timeseries-database-url"
tags = local.tags
}
resource "aws_secretsmanager_secret_version" "codecov_timeseries_database_url" {
secret_id = aws_secretsmanager_secret.codecov_timeseries_database_url.id
secret_string = format(
"postgres://%s:%s@%s:%s/%s",
var.timescale_db_username,
random_password.timescale_password.result,
"timescale.${local.name_prefix}.local",
"5432",
var.timescale_db_name
)
}
################################################################################
# Secrets Manager - Redis URL
################################################################################
resource "aws_secretsmanager_secret" "codecov_redis_url" {
name = "${local.name_prefix}/codecov/redis-url"
tags = local.tags
}
resource "aws_secretsmanager_secret_version" "codecov_redis_url" {
secret_id = aws_secretsmanager_secret.codecov_redis_url.id
secret_string = "redis://${aws_elasticache_cluster.codecov.cache_nodes[0].address}:${aws_elasticache_cluster.codecov.cache_nodes[0].port}"
}
################################################################################
# Secrets Manager - GitLab OAuth
################################################################################
resource "aws_secretsmanager_secret" "codecov_gitlab_oauth" {
name = "${local.name_prefix}/codecov/gitlab-oauth-secret"
tags = local.tags
}
resource "aws_secretsmanager_secret_version" "codecov_gitlab_oauth" {
secret_id = aws_secretsmanager_secret.codecov_gitlab_oauth.id
secret_string = gitlab_application.codecov.secret
}
################################################################################
# Secrets Manager - GitLab Bot Token
# Used by the Worker to post PR comments and status checks back to GitLab
################################################################################
resource "aws_secretsmanager_secret" "codecov_gitlab_bot_token" {
name = "${local.name_prefix}/codecov/gitlab-bot-token"
tags = local.tags
}
resource "aws_secretsmanager_secret_version" "codecov_gitlab_bot_token" {
secret_id = aws_secretsmanager_secret.codecov_gitlab_bot_token.id
secret_string = var.gitlab_token
}
################################################################################
# Secrets Manager - Cookie Secret
################################################################################
resource "aws_secretsmanager_secret" "codecov_cookie_secret" {
name = "${local.name_prefix}/codecov/cookie-secret"
tags = local.tags
}
resource "aws_secretsmanager_secret_version" "codecov_cookie_secret" {
secret_id = aws_secretsmanager_secret.codecov_cookie_secret.id
secret_string = random_password.cookie_secret.result
lifecycle {
ignore_changes = [secret_string]
}
}
################################################################################
# Secrets Manager - TimescaleDB Password
################################################################################
resource "aws_secretsmanager_secret" "timescale_password" {
name = "${local.name_prefix}/codecov/timescale-password"
tags = local.tags
}
resource "aws_secretsmanager_secret_version" "timescale_password" {
secret_id = aws_secretsmanager_secret.timescale_password.id
secret_string = random_password.timescale_password.result
}
################################################################################
# SSM Parameters - Non-sensitive configuration
################################################################################
resource "aws_ssm_parameter" "codecov_license" {
name = "/${local.name_prefix}/codecov/enterprise-license"
type = "SecureString"
value = local.codecov_license
tags = local.tags
}
resource "aws_ssm_parameter" "codecov_gitlab_client_id" {
name = "/${local.name_prefix}/codecov/gitlab-oauth-client-id"
type = "String"
value = gitlab_application.codecov.application_id
tags = local.tags
}
resource "aws_ssm_parameter" "codecov_upload_token" {
name = "/${local.name_prefix}/codecov/upload-token"
type = "SecureString"
value = var.codecov_upload_token != "" ? var.codecov_upload_token : random_password.upload_token.result
lifecycle {
ignore_changes = [value]
}
tags = local.tags
}
GitLab OAuth: gitlab-oauth-codecov-app.tf
Terraform provisions the GitLab OAuth application directly. The generated credentials are stored back into SSM/Secrets Manager for the ECS services to consume.
resource "gitlab_application" "codecov" {
name = "Codecov"
redirect_url = "https://${var.codecov_domain}/login/gle"
scopes = ["api"]
confidential = true
}
The credentials from `gitlab_application.codecov` are written to AWS secrets in secrets.tf (see `codecov_gitlab_oauth` and `codecov_gitlab_client_id` above).
ECS Services
Component Overview
| Service | CPU | Memory | Tasks | Role |
|---|---|---|---|---|
| TimescaleDB | 1024 | 2 GB | 1 (fixed) | Time-series database with EFS persistence |
| Gateway | 256 | 512 MB | 2 (fixed) | Reverse proxy routing to API/Frontend/IA |
| Frontend | 256 | 512 MB | 2 (fixed) | Web UI |
| API | 512 | 1 GB | 2–6 (auto-scaling) | Backend REST/GraphQL API |
| Worker | 512 | 1 GB | 1–4 (auto-scaling) | Background job processor |
| IA | 512 | 1 GB | 1 (fixed) | AI-assisted code review service |
Each service creates its own CloudWatch log group to keep logs isolated and independently configurable for retention.
ecs-service-codecov-timescale.tf
TimescaleDB runs as a Fargate task with an EFS volume for data persistence. An init sidecar waits for PostgreSQL to be ready and then runs `CREATE EXTENSION IF NOT EXISTS timescaledb` (idempotent, so safe to run on every restart).
Important: The `PGDATA` env var is set to a subdirectory of the mount point (`/var/lib/postgresql/data/data`). Without this, the PostgreSQL entrypoint tries to `chown` the EFS mount root, which fails because EFS access points enforce ownership.
resource "aws_cloudwatch_log_group" "timescale" {
name = "/aws/ecs/${local.name_prefix}-timescale"
retention_in_days = 30
tags = merge(local.tags, { Component = "timescale" })
}
resource "aws_ecs_task_definition" "timescale" {
family = "${local.name_prefix}-timescale"
requires_compatibilities = ["FARGATE"]
network_mode = "awsvpc"
cpu = 1024
memory = 2048
execution_role_arn = aws_iam_role.ecs_task_execution.arn
task_role_arn = aws_iam_role.ecs_task_execution.arn
volume {
name = "timescale-data"
efs_volume_configuration {
file_system_id = aws_efs_file_system.timescale.id
transit_encryption = "ENABLED"
authorization_config {
access_point_id = aws_efs_access_point.timescale.id
iam = "ENABLED"
}
}
}
container_definitions = jsonencode([
{
name = "timescale"
image = var.timescale_image
essential = true
# IMPORTANT: run as postgres UID/GID matching the EFS access point
user = "999:999"
portMappings = [
{ containerPort = 5432, hostPort = 5432, protocol = "tcp" }
]
environment = [
{ name = "POSTGRES_USER", value = var.timescale_db_username },
{ name = "POSTGRES_DB", value = var.timescale_db_name },
# IMPORTANT: use a subdirectory so the entrypoint doesn't try to chown the EFS mount root
{ name = "PGDATA", value = "/var/lib/postgresql/data/data" }
]
secrets = [
{ name = "POSTGRES_PASSWORD", valueFrom = aws_secretsmanager_secret.timescale_password.arn }
]
mountPoints = [
{ sourceVolume = "timescale-data", containerPath = "/var/lib/postgresql/data", readOnly = false }
]
healthCheck = {
command = ["CMD-SHELL", "pg_isready -h 127.0.0.1 -p 5432 -U ${var.timescale_db_username} -d ${var.timescale_db_name} || exit 1"]
interval = 10
timeout = 5
retries = 6
}
logConfiguration = {
logDriver = "awslogs"
options = {
"awslogs-group" = aws_cloudwatch_log_group.timescale.name
"awslogs-region" = var.region
"awslogs-stream-prefix" = "timescale"
}
}
},
# One-shot init sidecar: enables the timescaledb extension (idempotent)
{
name = "timescale-init"
image = "postgres:17"
essential = false
dependsOn = [
{ containerName = "timescale", condition = "START" }
]
environment = [
{ name = "PGHOST", value = "127.0.0.1" },
{ name = "PGPORT", value = "5432" },
{ name = "PGUSER", value = var.timescale_db_username },
{ name = "PGDATABASE", value = var.timescale_db_name }
]
secrets = [
{ name = "PGPASSWORD", valueFrom = aws_secretsmanager_secret.timescale_password.arn }
]
command = [
"bash", "-lc",
"for i in $(seq 1 60); do pg_isready -h 127.0.0.1 -p 5432 && break; sleep 2; done; pg_isready -h 127.0.0.1 -p 5432 || exit 1; psql -v ON_ERROR_STOP=1 -c \"CREATE EXTENSION IF NOT EXISTS timescaledb;\""
]
logConfiguration = {
logDriver = "awslogs"
options = {
"awslogs-group" = aws_cloudwatch_log_group.timescale.name
"awslogs-region" = var.region
"awslogs-stream-prefix" = "timescale-init"
}
}
}
])
tags = merge(local.tags, { Component = "timescale" })
}
resource "aws_ecs_service" "timescale" {
name = "${local.name_prefix}-timescale"
cluster = aws_ecs_cluster.this.id
task_definition = aws_ecs_task_definition.timescale.arn
desired_count = 1
launch_type = "FARGATE"
enable_execute_command = true
network_configuration {
subnets = data.aws_subnets.private.ids
security_groups = [aws_security_group.timescale.id]
assign_public_ip = false
}
service_registries {
registry_arn = aws_service_discovery_service.timescale.arn
}
deployment_circuit_breaker {
enable = true
rollback = true
}
tags = merge(local.tags, { Component = "timescale" })
}
ecs-service-codecov-gateway.tf
The gateway is the only Codecov service registered with the ALB. It is a Traefik-based reverse proxy that routes traffic to the frontend, API, and IA services based on path prefix.
Important env var naming: The gateway uses `CODECOV_DEFAULT_HOST` (not `CODECOV_FRONTEND_HOST`) to route to the frontend. This is not obvious from the docs.
resource "aws_cloudwatch_log_group" "codecov_gateway" {
name = "/aws/ecs/${local.name_prefix}-codecov-gateway"
retention_in_days = 30
tags = merge(local.tags, { Component = "gateway" })
}
resource "aws_ecs_task_definition" "codecov_gateway" {
family = "${local.name_prefix}-codecov-gateway"
requires_compatibilities = ["FARGATE"]
network_mode = "awsvpc"
cpu = 256
memory = 512
execution_role_arn = aws_iam_role.ecs_task_execution.arn
task_role_arn = aws_iam_role.ecs_task_codecov.arn
container_definitions = jsonencode([
{
name = "gateway"
image = local.codecov_images.gateway
essential = true
portMappings = [
{ containerPort = 8080, protocol = "tcp" }
]
environment = [
# Disable the MinIO sidecar; we use S3 directly
{ name = "CODECOV_GATEWAY_MINIO_ENABLED", value = "false" },
# API upstream (internal Cloud Map name)
{ name = "CODECOV_API_HOST", value = "api.${local.name_prefix}.local" },
{ name = "CODECOV_API_PORT", value = "8000" },
# Frontend upstream (NOTE: DEFAULT_*, not FRONTEND_*)
{ name = "CODECOV_DEFAULT_HOST", value = "frontend.${local.name_prefix}.local" },
{ name = "CODECOV_DEFAULT_PORT", value = "8080" },
# IA service
{ name = "CODECOV_IA_HOST", value = "ia.${local.name_prefix}.local" },
{ name = "CODECOV_IA_PORT", value = "8000" },
]
logConfiguration = {
logDriver = "awslogs"
options = {
"awslogs-group" = aws_cloudwatch_log_group.codecov_gateway.name
"awslogs-region" = var.region
"awslogs-stream-prefix" = "gateway"
}
}
healthCheck = {
command = ["CMD-SHELL", "bash -c '</dev/tcp/127.0.0.1/8080' || exit 1"]
interval = 30
timeout = 5
retries = 3
startPeriod = 300
}
}
])
tags = merge(local.tags, { Component = "gateway" })
}
resource "aws_ecs_service" "codecov_gateway" {
name = "${local.name_prefix}-codecov-gateway"
cluster = aws_ecs_cluster.this.id
task_definition = aws_ecs_task_definition.codecov_gateway.arn
desired_count = 2
launch_type = "FARGATE"
enable_execute_command = true
lifecycle {
ignore_changes = [desired_count]
}
network_configuration {
subnets = data.aws_subnets.private.ids
security_groups = [aws_security_group.ecs_services.id]
assign_public_ip = false
}
load_balancer {
target_group_arn = module.alb.target_groups["codecov_gateway"].arn
container_name = "gateway"
container_port = 8080
}
deployment_circuit_breaker {
enable = true
rollback = true
}
tags = merge(local.tags, { Component = "gateway" })
depends_on = [module.alb]
}
ecs-service-codecov-frontend.tf
The frontend needs to know the GitLab instance URL to render the login button correctly. Both `CODECOV_GLE_CLIENT_ID` and `GITLAB_ENTERPRISE_CLIENT_ID` are set to the same value, because different parts of the frontend code read different names.
resource "aws_cloudwatch_log_group" "codecov_frontend" {
name = "/aws/ecs/${local.name_prefix}-codecov-frontend"
retention_in_days = 30
tags = merge(local.tags, { Component = "frontend" })
}
resource "aws_ecs_task_definition" "codecov_frontend" {
family = "${local.name_prefix}-codecov-frontend"
requires_compatibilities = ["FARGATE"]
network_mode = "awsvpc"
cpu = 256
memory = 512
execution_role_arn = aws_iam_role.ecs_task_execution.arn
task_role_arn = aws_iam_role.ecs_task_codecov.arn
container_definitions = jsonencode([
{
name = "frontend"
image = local.codecov_images.frontend
essential = true
portMappings = [
{ containerPort = 8080, protocol = "tcp" }
]
environment = [
{ name = "CODECOV_BASE_HOST", value = var.codecov_domain },
{ name = "CODECOV_API_HOST", value = var.codecov_domain },
{ name = "CODECOV_SCHEME", value = "https" },
# Required to show the GitLab Enterprise login button in the UI
{ name = "CODECOV_GLE_HOST", value = var.gitlab_url },
]
secrets = [
# Both names are read by different parts of the frontend code
{
name = "GITLAB_ENTERPRISE_CLIENT_ID"
valueFrom = aws_ssm_parameter.codecov_gitlab_client_id.arn
},
{
name = "CODECOV_GLE_CLIENT_ID"
valueFrom = aws_ssm_parameter.codecov_gitlab_client_id.arn
}
]
logConfiguration = {
logDriver = "awslogs"
options = {
"awslogs-group" = aws_cloudwatch_log_group.codecov_frontend.name
"awslogs-region" = var.region
"awslogs-stream-prefix" = "frontend"
}
}
healthCheck = {
command = ["CMD-SHELL", "wget --no-verbose --tries=1 --spider http://localhost:8080/ || exit 1"]
interval = 30
timeout = 5
retries = 3
startPeriod = 60
}
}
])
tags = merge(local.tags, { Component = "frontend" })
}
resource "aws_ecs_service" "codecov_frontend" {
name = "${local.name_prefix}-codecov-frontend"
cluster = aws_ecs_cluster.this.id
task_definition = aws_ecs_task_definition.codecov_frontend.arn
desired_count = 2
launch_type = "FARGATE"
lifecycle {
ignore_changes = [desired_count]
}
network_configuration {
subnets = data.aws_subnets.private.ids
security_groups = [aws_security_group.ecs_services.id]
assign_public_ip = false
}
service_registries {
registry_arn = aws_service_discovery_service.frontend.arn
}
deployment_circuit_breaker {
enable = true
rollback = true
}
tags = merge(local.tags, { Component = "frontend" })
}
ecs-service-codecov-api.tf
This is the most configuration-heavy service. A few things worth calling out:
- Dual env var paths for GitLab: Each GitLab setting is set twice: once under `SETUP__GITLAB_ENTERPRISE__*` (feeds the runtime YAML config, used by internal config loading) and once under `GITLAB_ENTERPRISE__*` (the direct config path, required by the GraphQL `loginProviders` resolver and the OAuth callback handler). We discovered this by reading the Codecov Python source.
- S3 via `SERVICES__MINIO__*`: Despite the name, these env vars configure S3, not MinIO. Setting `IAM_AUTH = true` makes Codecov use the task's IAM role for S3 authentication; no static credentials are needed.
- `JSONCONFIG___` prefix: Variables with this prefix are parsed as JSON by Codecov's config loader. It's the only way to set list-valued config (like `admins`) through environment variables.
resource "aws_cloudwatch_log_group" "codecov_api" {
name = "/aws/ecs/${local.name_prefix}-codecov-api"
retention_in_days = 30
tags = merge(local.tags, { Component = "api" })
}
resource "aws_ecs_task_definition" "codecov_api" {
family = "${local.name_prefix}-codecov-api"
requires_compatibilities = ["FARGATE"]
network_mode = "awsvpc"
cpu = 512
memory = 1024
execution_role_arn = aws_iam_role.ecs_task_execution.arn
task_role_arn = aws_iam_role.ecs_task_codecov.arn
container_definitions = jsonencode([
{
name = "api"
image = local.codecov_images.api
essential = true
portMappings = [{ containerPort = 8000, protocol = "tcp" }]
environment = [
{ name = "RUN_ENV", value = "ENTERPRISE" },
# External URLs
{ name = "CODECOV_URL", value = "https://${var.codecov_domain}" },
{ name = "SETUP__CODECOV_URL", value = "https://${var.codecov_domain}" },
{ name = "SETUP__CODECOV_API_URL", value = "https://${var.codecov_domain}/api" },
# Enable TimescaleDB time-series features
{ name = "SETUP__TIMESERIES__ENABLED", value = "true" },
{ name = "SETUP__TA_TIMESERIES__ENABLED", value = "true" },
# GitLab Enterprise SETUP__* path (runtime YAML config)
{ name = "SETUP__GITLAB_ENTERPRISE__ENABLED", value = "true" },
{ name = "SETUP__GITLAB_ENTERPRISE__URL", value = var.gitlab_url },
{ name = "SETUP__GITLAB_ENTERPRISE__CLIENT_ID", value = tostring(gitlab_application.codecov.application_id) },
{ name = "SETUP__GITLAB_ENTERPRISE__GLOBAL_UPLOAD_TOKEN_ENABLED", value = "true" },
# GitLab Enterprise direct path (required for loginProviders GraphQL resolver and OAuth callback)
# get_config("gitlab_enterprise", "url") reads GITLAB_ENTERPRISE__URL
{ name = "GITLAB_ENTERPRISE__URL", value = var.gitlab_url },
{ name = "GITLAB_ENTERPRISE__CLIENT_ID", value = tostring(gitlab_application.codecov.application_id) },
{ name = "GITLAB_ENTERPRISE__REDIRECT_URI", value = "https://${var.codecov_domain}/login/gle" },
{ name = "GITLAB_ENTERPRISE__API_URL", value = "${var.gitlab_url}/api/v4" },
# S3 storage uses MINIO-style env vars regardless of the actual provider
{ name = "SERVICES__MINIO__HOST", value = "s3.${var.region}.amazonaws.com" },
{ name = "SERVICES__MINIO__BUCKET", value = aws_s3_bucket.codecov_storage.id },
{ name = "SERVICES__MINIO__REGION", value = var.region },
{ name = "SERVICES__MINIO__VERIFY_SSL", value = "true" },
{ name = "SERVICES__MINIO__IAM_AUTH", value = "true" }, # Use task IAM role, no credentials needed
# Admins JSONCONFIG___ prefix triggers json.loads() in the config loader
{
name = "JSONCONFIG___SETUP__ADMINS"
value = jsonencode([
for username in var.codecov_admins : {
service = "gitlab_enterprise"
username = username
}
])
},
# Disable guest/anonymous access
{ name = "SETUP__GUEST_ACCESS", value = "off" },
]
secrets = [
{ name = "SERVICES__DATABASE_URL", valueFrom = aws_secretsmanager_secret.codecov_database_url.arn },
{ name = "SERVICES__TIMESERIES_DATABASE_URL", valueFrom = aws_secretsmanager_secret.codecov_timeseries_database_url.arn },
{ name = "SERVICES__TA_TIMESERIES_DATABASE_URL", valueFrom = aws_secretsmanager_secret.codecov_timeseries_database_url.arn },
{ name = "SERVICES__REDIS_URL", valueFrom = aws_secretsmanager_secret.codecov_redis_url.arn },
{ name = "SETUP__HTTP__COOKIE_SECRET", valueFrom = aws_secretsmanager_secret.codecov_cookie_secret.arn },
{ name = "SETUP__ENTERPRISE_LICENSE", valueFrom = aws_ssm_parameter.codecov_license.arn },
# GitLab OAuth client secret set on both paths
{ name = "SETUP__GITLAB_ENTERPRISE__CLIENT_SECRET", valueFrom = aws_secretsmanager_secret.codecov_gitlab_oauth.arn },
{ name = "GITLAB_ENTERPRISE__CLIENT_SECRET", valueFrom = aws_secretsmanager_secret.codecov_gitlab_oauth.arn },
# Bot token for posting status checks back to GitLab
{ name = "GITLAB_ENTERPRISE__BOT__KEY", valueFrom = aws_secretsmanager_secret.codecov_gitlab_bot_token.arn },
# Global upload token
{ name = "SETUP__GITLAB_ENTERPRISE__GLOBAL_UPLOAD_TOKEN", valueFrom = aws_ssm_parameter.codecov_upload_token.arn },
]
logConfiguration = {
logDriver = "awslogs"
options = {
"awslogs-group" = aws_cloudwatch_log_group.codecov_api.name
"awslogs-region" = var.region
"awslogs-stream-prefix" = "api"
}
}
healthCheck = {
command = ["CMD-SHELL", "bash -c '</dev/tcp/127.0.0.1/8000' || exit 1"]
interval = 30
timeout = 5
retries = 3
startPeriod = 120
}
}
])
tags = merge(local.tags, { Component = "api" })
}
resource "aws_ecs_service" "codecov_api" {
name = "${local.name_prefix}-codecov-api"
cluster = aws_ecs_cluster.this.id
task_definition = aws_ecs_task_definition.codecov_api.arn
desired_count = 2
launch_type = "FARGATE"
enable_execute_command = true
lifecycle {
ignore_changes = [desired_count]
}
network_configuration {
subnets = data.aws_subnets.private.ids
security_groups = [aws_security_group.ecs_services.id]
assign_public_ip = false
}
service_registries {
registry_arn = aws_service_discovery_service.api.arn
}
deployment_circuit_breaker {
enable = true
rollback = true
}
tags = merge(local.tags, { Component = "api" })
}
ecs-service-codecov-worker.tf
The worker exposes no inbound ports; it pulls jobs from Redis queues. It needs the same GitLab and S3 config as the API, since it processes coverage uploads and posts results back to GitLab.
resource "aws_cloudwatch_log_group" "codecov_worker" {
name = "/aws/ecs/${local.name_prefix}-codecov-worker"
retention_in_days = 30
tags = merge(local.tags, { Component = "worker" })
}
resource "aws_ecs_task_definition" "codecov_worker" {
family = "${local.name_prefix}-codecov-worker"
requires_compatibilities = ["FARGATE"]
network_mode = "awsvpc"
cpu = 512
memory = 1024
execution_role_arn = aws_iam_role.ecs_task_execution.arn
task_role_arn = aws_iam_role.ecs_task_codecov.arn
container_definitions = jsonencode([
{
name = "worker"
image = local.codecov_images.worker
essential = true
# Worker has no inbound ports; it pulls work from Redis queues
environment = [
{ name = "RUN_ENV", value = "ENTERPRISE" },
{ name = "CODECOV_URL", value = "https://${var.codecov_domain}" },
{ name = "SETUP__CODECOV_URL", value = "https://${var.codecov_domain}" },
{ name = "SETUP__TIMESERIES__ENABLED", value = "true" },
{ name = "SETUP__TA_TIMESERIES__ENABLED", value = "true" },
# GitLab Enterprise
{ name = "GITLAB_ENTERPRISE__URL", value = var.gitlab_url },
{ name = "GITLAB_ENTERPRISE__API_URL", value = "${var.gitlab_url}/api/v4" },
# S3 storage
{ name = "SERVICES__MINIO__HOST", value = "s3.${var.region}.amazonaws.com" },
{ name = "SERVICES__MINIO__BUCKET", value = aws_s3_bucket.codecov_storage.id },
{ name = "SERVICES__MINIO__REGION", value = var.region },
{ name = "SERVICES__MINIO__VERIFY_SSL", value = "true" },
{ name = "SERVICES__MINIO__IAM_AUTH", value = "true" },
]
secrets = [
{ name = "SERVICES__DATABASE_URL", valueFrom = aws_secretsmanager_secret.codecov_database_url.arn },
{ name = "SERVICES__TIMESERIES_DATABASE_URL", valueFrom = aws_secretsmanager_secret.codecov_timeseries_database_url.arn },
{ name = "SERVICES__TA_TIMESERIES_DATABASE_URL", valueFrom = aws_secretsmanager_secret.codecov_timeseries_database_url.arn },
{ name = "SERVICES__REDIS_URL", valueFrom = aws_secretsmanager_secret.codecov_redis_url.arn },
{ name = "SETUP__ENTERPRISE_LICENSE", valueFrom = aws_ssm_parameter.codecov_license.arn },
{ name = "GITLAB_ENTERPRISE__CLIENT_ID", valueFrom = aws_ssm_parameter.codecov_gitlab_client_id.arn },
{ name = "GITLAB_ENTERPRISE__CLIENT_SECRET", valueFrom = aws_secretsmanager_secret.codecov_gitlab_oauth.arn },
{ name = "GITLAB_ENTERPRISE__BOT__KEY", valueFrom = aws_secretsmanager_secret.codecov_gitlab_bot_token.arn },
]
logConfiguration = {
logDriver = "awslogs"
options = {
"awslogs-group" = aws_cloudwatch_log_group.codecov_worker.name
"awslogs-region" = var.region
"awslogs-stream-prefix" = "worker"
}
}
}
])
tags = merge(local.tags, { Component = "worker" })
}
resource "aws_ecs_service" "codecov_worker" {
name = "${local.name_prefix}-codecov-worker"
cluster = aws_ecs_cluster.this.id
task_definition = aws_ecs_task_definition.codecov_worker.arn
desired_count = 1
launch_type = "FARGATE"
lifecycle {
ignore_changes = [desired_count]
}
network_configuration {
subnets = data.aws_subnets.private.ids
security_groups = [aws_security_group.ecs_services.id]
assign_public_ip = false
}
deployment_circuit_breaker {
enable = true
rollback = true
}
tags = merge(local.tags, { Component = "worker" })
}
ecs-service-codecov-ai.tf
The IA (AI-assisted code review) service is required by newer gateway releases. It listens on port 8000 and registers with Cloud Map.
resource "aws_cloudwatch_log_group" "codecov_ia" {
name = "/aws/ecs/${local.name_prefix}-codecov-ia"
retention_in_days = 30
tags = merge(local.tags, { Component = "ia" })
}
resource "aws_ecs_task_definition" "codecov_ia" {
family = "${local.name_prefix}-codecov-ia"
requires_compatibilities = ["FARGATE"]
network_mode = "awsvpc"
cpu = 512
memory = 1024
execution_role_arn = aws_iam_role.ecs_task_execution.arn
task_role_arn = aws_iam_role.ecs_task_codecov.arn
container_definitions = jsonencode([
{
name = "ia"
image = local.codecov_images.ia
essential = true
portMappings = [
{ containerPort = 8000, protocol = "tcp" }
]
environment = [
{ name = "RUN_ENV", value = "ENTERPRISE" },
{ name = "CODECOV_URL", value = "https://${var.codecov_domain}" },
{ name = "SETUP__CODECOV_URL", value = "https://${var.codecov_domain}" },
{ name = "SETUP__TIMESERIES__ENABLED", value = "true" },
{ name = "SETUP__TA_TIMESERIES__ENABLED", value = "true" },
{ name = "SERVICES__MINIO__HOST", value = "s3.${var.region}.amazonaws.com" },
{ name = "SERVICES__MINIO__BUCKET", value = aws_s3_bucket.codecov_storage.id },
{ name = "SERVICES__MINIO__REGION", value = var.region },
{ name = "SERVICES__MINIO__VERIFY_SSL", value = "true" },
{ name = "SERVICES__MINIO__IAM_AUTH", value = "true" },
]
secrets = [
{ name = "SERVICES__DATABASE_URL", valueFrom = aws_secretsmanager_secret.codecov_database_url.arn },
{ name = "SERVICES__TIMESERIES_DATABASE_URL", valueFrom = aws_secretsmanager_secret.codecov_timeseries_database_url.arn },
{ name = "SERVICES__TA_TIMESERIES_DATABASE_URL", valueFrom = aws_secretsmanager_secret.codecov_timeseries_database_url.arn },
{ name = "SERVICES__REDIS_URL", valueFrom = aws_secretsmanager_secret.codecov_redis_url.arn },
{ name = "SETUP__ENTERPRISE_LICENSE", valueFrom = aws_ssm_parameter.codecov_license.arn },
]
logConfiguration = {
logDriver = "awslogs"
options = {
"awslogs-group" = aws_cloudwatch_log_group.codecov_ia.name
"awslogs-region" = var.region
"awslogs-stream-prefix" = "ia"
}
}
healthCheck = {
command = ["CMD-SHELL", "bash -c '</dev/tcp/127.0.0.1/8000' || exit 1"]
interval = 30
timeout = 5
retries = 3
startPeriod = 120
}
}
])
tags = merge(local.tags, { Component = "ia" })
}
resource "aws_ecs_service" "codecov_ia" {
name = "${local.name_prefix}-codecov-ia"
cluster = aws_ecs_cluster.this.id
task_definition = aws_ecs_task_definition.codecov_ia.arn
desired_count = 1
launch_type = "FARGATE"
enable_execute_command = true
lifecycle { ignore_changes = [desired_count] }
network_configuration {
subnets = data.aws_subnets.private.ids
security_groups = [aws_security_group.ecs_services.id]
assign_public_ip = false
}
service_registries {
registry_arn = aws_service_discovery_service.ia.arn
}
deployment_circuit_breaker {
enable = true
rollback = true
}
tags = merge(local.tags, { Component = "ia" })
}
autoscaling.tf
CPU-based auto-scaling keeps the cluster right-sized. The API scales between 2 and 6 tasks; the Worker scales between 1 and 4. Both trigger at 70% average CPU utilization.
################################################################################
# ECS Auto Scaling - API
################################################################################
resource "aws_appautoscaling_target" "api" {
max_capacity = 6
min_capacity = 2
resource_id = "service/${aws_ecs_cluster.this.name}/${aws_ecs_service.codecov_api.name}"
scalable_dimension = "ecs:service:DesiredCount"
service_namespace = "ecs"
}
resource "aws_appautoscaling_policy" "api_cpu" {
name = "${local.name_prefix}-codecov-api-cpu"
policy_type = "TargetTrackingScaling"
resource_id = aws_appautoscaling_target.api.resource_id
scalable_dimension = aws_appautoscaling_target.api.scalable_dimension
service_namespace = aws_appautoscaling_target.api.service_namespace
target_tracking_scaling_policy_configuration {
predefined_metric_specification {
predefined_metric_type = "ECSServiceAverageCPUUtilization"
}
target_value = 70
}
}
################################################################################
# ECS Auto Scaling - Worker
################################################################################
resource "aws_appautoscaling_target" "worker" {
max_capacity = 4
min_capacity = 1
resource_id = "service/${aws_ecs_cluster.this.name}/${aws_ecs_service.codecov_worker.name}"
scalable_dimension = "ecs:service:DesiredCount"
service_namespace = "ecs"
}
resource "aws_appautoscaling_policy" "worker_cpu" {
name = "${local.name_prefix}-codecov-worker-cpu"
policy_type = "TargetTrackingScaling"
resource_id = aws_appautoscaling_target.worker.resource_id
scalable_dimension = aws_appautoscaling_target.worker.scalable_dimension
service_namespace = aws_appautoscaling_target.worker.service_namespace
target_tracking_scaling_policy_configuration {
predefined_metric_specification {
predefined_metric_type = "ECSServiceAverageCPUUtilization"
}
target_value = 70
}
}
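If worker load turns out to be memory-bound rather than CPU-bound (large report processing tends that way), a second target-tracking policy can be layered onto the same scalable target, since Application Auto Scaling evaluates multiple policies independently and scales to the highest resulting capacity. A sketch, not something from the deployment above; the 75% target is an assumption:

```hcl
# Optional: memory-based target tracking on the same worker target.
# Attaches alongside the CPU policy; ECS scales out when either fires.
resource "aws_appautoscaling_policy" "worker_memory" {
  name               = "${local.name_prefix}-codecov-worker-memory"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.worker.resource_id
  scalable_dimension = aws_appautoscaling_target.worker.scalable_dimension
  service_namespace  = aws_appautoscaling_target.worker.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageMemoryUtilization"
    }
    target_value = 75 # assumed threshold; tune against real worker behavior
  }
}
```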
alb.tf
################################################################################
# Application Load Balancer
################################################################################
module "alb" {
source = "terraform-aws-modules/alb/aws"
version = "~> 10.0"
name = "${local.name_prefix}-alb"
load_balancer_type = "application"
vpc_id = data.aws_vpc.main.id
subnets = data.aws_subnets.public.ids
enable_deletion_protection = true
security_group_ingress_rules = {
all_http = {
from_port = 80
to_port = 80
ip_protocol = "tcp"
cidr_ipv4 = "0.0.0.0/0"
}
all_https = {
from_port = 443
to_port = 443
ip_protocol = "tcp"
cidr_ipv4 = "0.0.0.0/0"
}
}
security_group_egress_rules = {
vpc_outbound = {
description = "Allow outbound to VPC for ECS health checks and container communication"
ip_protocol = "-1"
cidr_ipv4 = data.aws_vpc.main.cidr_block
}
}
listeners = {
# HTTP listener redirects all traffic to HTTPS
http = {
port = 80
protocol = "HTTP"
redirect = {
port = "443"
protocol = "HTTPS"
status_code = "HTTP_301"
}
}
# HTTPS listener routes to Gateway target group
https = {
port = 443
protocol = "HTTPS"
ssl_policy = "ELBSecurityPolicy-TLS13-1-2-Res-PQ-2025-09"
certificate_arn = aws_acm_certificate_validation.codecov.certificate_arn
forward = {
target_group_key = "codecov_gateway"
}
}
}
target_groups = {
codecov_gateway = {
backend_protocol = "HTTP"
backend_port = 8080
target_type = "ip"
deregistration_delay = 30
load_balancing_cross_zone_enabled = true
health_check = {
enabled = true
healthy_threshold = 3
interval = 30
matcher = "200-399"
path = "/"
port = "traffic-port"
protocol = "HTTP"
timeout = 5
unhealthy_threshold = 3
}
create_attachment = false
}
}
tags = local.tags
}
acm.tf
ACM certificates require DNS validation. The Cloudflare provider creates the validation records automatically, and the aws_acm_certificate_validation resource blocks until the certificate is issued, so downstream resources (like the ALB listener) only ever reference a valid certificate.
In our case an internal module handles all of these steps; here are all the resources needed to achieve the same result:
################################################################################
# ACM Certificate with Cloudflare DNS validation
################################################################################
resource "aws_acm_certificate" "codecov" {
domain_name = var.codecov_domain
validation_method = "DNS"
lifecycle {
create_before_destroy = true
}
tags = local.tags
}
resource "cloudflare_dns_record" "acm_validation" {
provider = cloudflare.main
for_each = {
for dvo in aws_acm_certificate.codecov.domain_validation_options : dvo.domain_name => {
name = dvo.resource_record_name
type = dvo.resource_record_type
value = dvo.resource_record_value
}
}
zone_id = var.cloudflare_zone_id
name = each.value.name
type = each.value.type
content = each.value.value
ttl = 60
}
resource "aws_acm_certificate_validation" "codecov" {
certificate_arn = aws_acm_certificate.codecov.arn
validation_record_fqdns = [for r in cloudflare_dns_record.acm_validation : r.hostname]
}
dns.tf
################################################################################
# Cloudflare DNS Record codecov.example.com -> ALB
################################################################################
resource "cloudflare_dns_record" "codecov" {
provider = cloudflare.main
zone_id = var.cloudflare_zone_id
name = var.codecov_domain
type = "CNAME"
content = module.alb.dns_name
ttl = 60
proxied = false
}
outputs.tf
################################################################################
# ECS Cluster
################################################################################
output "ecs_cluster_name" {
description = "Name of the ECS cluster"
value = aws_ecs_cluster.this.name
}
output "ecs_cluster_arn" {
description = "ARN of the ECS cluster"
value = aws_ecs_cluster.this.arn
}
################################################################################
# Networking
################################################################################
output "alb_dns_name" {
description = "DNS name of the Application Load Balancer"
value = module.alb.dns_name
}
output "codecov_url" {
description = "URL to access Codecov"
value = "https://${var.codecov_domain}"
}
output "vpc_id" {
description = "VPC ID used by the ECS cluster"
value = data.aws_vpc.main.id
}
################################################################################
# Data Layer
################################################################################
output "rds_endpoint" {
description = "RDS PostgreSQL endpoint"
value = aws_db_instance.codecov.endpoint
sensitive = true
}
output "redis_endpoint" {
description = "ElastiCache Redis endpoint"
value = aws_elasticache_cluster.codecov.cache_nodes[0].address
sensitive = true
}
output "codecov_storage_bucket" {
description = "S3 bucket name for Codecov storage"
value = aws_s3_bucket.codecov_storage.id
}
################################################################################
# IAM
################################################################################
output "ecs_task_execution_role_arn" {
description = "ARN of the ECS task execution role"
value = aws_iam_role.ecs_task_execution.arn
}
output "ecs_task_role_arn" {
description = "ARN of the Codecov ECS task role"
value = aws_iam_role.ecs_task_codecov.arn
}
################################################################################
# GitLab Application
################################################################################
output "codecov_gitlab_client_id" {
value = gitlab_application.codecov.application_id
sensitive = true
}
output "codecov_gitlab_client_secret" {
value = gitlab_application.codecov.secret
sensitive = true
}
CI/CD Pipeline
The GitLab CI pipeline enforces a safe plan-before-apply workflow. Every merge request gets a visible terraform plan diff as a pipeline artifact, and production applies require a manual button click in the GitLab UI.
# .gitlab-ci.yml
stages:
- test
- secret-detection
- plan
- apply
variables:
TF_ROOT: terraform
TF_STATE_NAME: tooling
TF_VAR_gitlab_token: $DEVOPS_SA_GITLAB_TOKEN
include:
- template: Jobs/SAST.gitlab-ci.yml
- template: Jobs/Secret-Detection.gitlab-ci.yml
terraform_plan:
stage: plan
script:
- cd $TF_ROOT
- terraform init
- terraform plan -out=tfplan -no-color | tee plan.txt
artifacts:
paths:
- $TF_ROOT/tfplan
- $TF_ROOT/plan.txt
expire_in: 1 week
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
terraform_apply:
stage: apply
script:
- cd $TF_ROOT
- terraform init
- terraform apply -auto-approve tfplan
dependencies:
- terraform_plan
rules:
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
when: manual # Requires explicit button click in GitLab UI
Deploying
First-Time Setup
- Configure variables: create a terraform.tfvars file (keep it out of git; it's listed in .gitignore):
region = "eu-central-1"
environment = "tooling"
vpc_name = "mycompany-vpc-tooling"
codecov_domain = "codecov.example.com"
cloudflare_zone_id = "your-cloudflare-zone-id"
cloudflare_api_token = "..." # preferably via TF_VAR_cloudflare_api_token env var
codecov_enterprise_license = "" # leave empty to use the bundled 50-user community license
gitlab_token = "..." # GitLab admin token
codecov_admins = ["alice", "bob", "carol"]
gitlab_url = "https://git.example.com"
rds_instance_class = "db.t3.small"
rds_allocated_storage = 20
redis_node_type = "cache.t3.micro"
- Initialize Terraform:
terraform init
- Review the plan:
terraform plan -out=tfplan
- Apply:
terraform apply tfplan
The first apply takes 10–20 minutes, mostly waiting for RDS Multi-AZ provisioning and ACM certificate DNS validation.
Pre-Commit Hooks
Install the hooks to catch formatting and secret issues before they reach CI:
pip install pre-commit
pre-commit install
A minimal .pre-commit-config.yaml:
repos:
- repo: https://github.com/antonbabenko/pre-commit-terraform
rev: v1.105.0
hooks:
- id: terraform_fmt
- id: terraform_validate
- repo: https://github.com/gitleaks/gitleaks
rev: v8.30.0
hooks:
- id: gitleaks
Day-2 Operations
Viewing Logs
The Codecov ECS services log to CloudWatch under the /aws/ecs/{name_prefix}-codecov-{service} naming pattern; the TimescaleDB service uses /aws/ecs/{name_prefix}-timescale:
aws logs tail /aws/ecs/mycompany-tooling-codecov-api --follow
aws logs tail /aws/ecs/mycompany-tooling-codecov-worker --follow
aws logs tail /aws/ecs/mycompany-tooling-timescale --follow
Restarting a Service
Force a new deployment (replaces all running tasks with new ones):
aws ecs update-service \
--cluster mycompany-tooling-ecs \
--service mycompany-tooling-codecov-api \
--force-new-deployment
Upgrading Codecov
Update the version in locals.tf:
codecov_version = "26.3.0"
Run terraform plan to confirm that only the task definitions change, then apply. ECS performs a rolling deployment; the circuit breaker rolls back automatically if health checks fail.
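Pinning the version in a single local means one edit bumps every container image at once. A sketch of how that local can fan out (the image names below follow the official codecov/self-hosted-* Docker repositories, but verify them against your own task definitions):

```hcl
# Illustrative only: one codecov_version local feeding every container
# image reference consumed by the task definitions.
locals {
  codecov_version = "26.3.0"

  codecov_images = {
    api      = "codecov/self-hosted-api:${local.codecov_version}"
    worker   = "codecov/self-hosted-worker:${local.codecov_version}"
    frontend = "codecov/self-hosted-frontend:${local.codecov_version}"
    gateway  = "codecov/self-hosted-gateway:${local.codecov_version}"
  }
}
```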
Scaling Manually
Auto-scaling handles normal load. To override the desired count:
aws ecs update-service \
--cluster mycompany-tooling-ecs \
--service mycompany-tooling-codecov-worker \
--desired-count 3
Note: lifecycle { ignore_changes = [desired_count] } in the service resources prevents Terraform from reverting manual scaling on the next apply.
Connecting GitLab CI to Codecov
After deployment, get the upload token:
aws ssm get-parameter \
--name /mycompany-tooling/codecov/upload-token \
--with-decryption \
--query Parameter.Value \
--output text
Add this to each project's .gitlab-ci.yml:
upload_coverage:
stage: test
image: python:3.12-slim
script:
- pip install codecov-cli
- codecovcli upload-process
--token $CODECOV_TOKEN
--codecov-yaml-path .codecov.yml
coverage: '/TOTAL.*\s+(\d+%)$/'
variables:
CODECOV_URL: "https://codecov.example.com"
Set CODECOV_TOKEN as a CI/CD variable in the project or group settings.
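In keeping with the everything-as-code approach, the token can also be provisioned as a CI/CD variable by Terraform instead of through the GitLab UI, since the GitLab provider is already configured for the OAuth application. A hedged sketch; the group path "engineering" is an assumption, and the SSM parameter name is the one shown above:

```hcl
# Read the upload token written to SSM during deployment.
data "aws_ssm_parameter" "codecov_upload_token" {
  name            = "/mycompany-tooling/codecov/upload-token"
  with_decryption = true
}

# Hypothetical: expose it to every project in a group as a masked
# CI/CD variable, so pipelines get CODECOV_TOKEN with no manual setup.
resource "gitlab_group_variable" "codecov_token" {
  group     = "engineering" # assumed group path
  key       = "CODECOV_TOKEN"
  value     = data.aws_ssm_parameter.codecov_upload_token.value
  protected = false
  masked    = true
}
```

Use gitlab_project_variable instead if the token should be scoped to a single project rather than a whole group.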
Summary
This setup gives you a production-ready Codecov deployment with:
- High availability: Multi-AZ RDS, multi-AZ NAT gateways, 2+ replicas on gateway/frontend, circuit-breaker rollback on all services
- Security: No public IPs on workloads, VPC endpoints, Secrets Manager for all sensitive values, S3 HTTPS-only policy, IAM-based S3 auth (no static credentials), least-privilege roles
- Cost awareness: cache.t3.micro Redis, db.t3.small RDS, VPC endpoints reducing NAT data transfer costs, Fargate pay-per-use
- GitLab integration fully automated: OAuth application, credentials, and env vars all provisioned by Terraform, no manual steps in the GitLab UI
- Operational simplicity: GitOps via Terraform, centralized secrets, CloudWatch logging per service, CPU-based auto-scaling
The full Terraform state covers ~60 resources. A terraform destroy cleans up everything except the S3 state bucket, the RDS final snapshot, and any secrets protected with ignore_changes; all of these are intentional safeguards against accidental data loss.