Table Of Contents
- The Goal
- Architecture
- End-to-End Flow
- Deployment Items
- Terraform Code
- Verification
- Troubleshooting
- Security
- Change Log
The New Zenler is moving to a shiny new platform (at the time of this writing), where we decided to use FluxCD as our Continuous Delivery tool. The other day, one of our developers reported that FluxCD had suddenly stopped pulling the latest image from ECR. When I checked, I saw this error in the log:
failed to configure authentication options: operation error ECR: GetAuthorizationToken, get identity: get credentials: failed to refresh cached credentials, no EC2 IMDS role found, operation error ec2imds: GetMetadata, canceled, context deadline exceeded
At that moment I knew the reason for the image-pull failure: Flux fell back to the default AWS credential chain, which ends at the EC2 Instance Metadata Service (IMDS), but IMDS was not reachable from the pod, so no instance-profile credentials could be retrieved. Without credentials, Flux couldn't call the ECR API to get an authorization token, hence the failure.
As I was already using IRSA for the CNI, CSI, etc., I thought this was a great opportunity to move Flux from IMDS to IRSA as well.
🎯 Goal
Deploy FluxCD (Operator + Flux Instance) on Amazon EKS with IAM Roles for Service Accounts (IRSA) so Flux Image Reflector/Automation can authenticate to Amazon ECR without node credentials, and auto-deploy images from ECR.
🏗️ High-Level Architecture
The diagram above shows how Bitbucket, FluxCD controllers, and AWS ECR interact:

| Component | Description |
|---|---|
| 🧩 Bitbucket Git Repo | Holds your GitOps manifests, synced by Flux Source Controller |
| ⚙️ Flux Source Controller | Pulls manifests from Bitbucket repo |
| 🧱 Flux Kustomize Controller | Applies manifests to your cluster |
| 🔁 Flux Image Automation Controller | Watches image updates and commits manifest changes back to Git |
| 🔍 Flux Image Reflector Controller | Scans ECR for image tags and metadata |
| 🔐 EKS OIDC Provider | Issues tokens used by IRSA authentication |
| 🧾 STS AssumeRoleWithWebIdentity | Exchanges OIDC token for temporary IAM credentials |
| 🧠 IRSA IAM Role | Grants scoped ECR access (read + token retrieval) |
| 🐳 Amazon ECR Repository | Stores application images, queried by Image Reflector |
🔄 End-to-End Flow
- Pod starts with SA `image-reflector-controller`.
- EKS injects an OIDC web identity token.
- AWS STS exchanges it for temporary IAM creds.
- Flux Image Reflector uses them to call ECR APIs.
- Reflector lists tags → Image Automation commits new manifests → Kustomize applies updates.
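The token exchange in the flow above depends on the IAM role's trust policy. The sketch below shows roughly what such a trust policy looks like when written by hand; the IRSA module used later in this post generates an equivalent for you. `var.eks_oidc_issuer` is a hypothetical input (the issuer URL without `https://`), not something the original setup defines.

```hcl
# Hypothetical input: OIDC issuer without the https:// prefix.
variable "eks_oidc_issuer" {
  type = string
}

data "aws_iam_policy_document" "flux_irsa_trust" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]

    # The EKS OIDC provider is the federated principal.
    principals {
      type        = "Federated"
      identifiers = [var.eks_oidc_provider] # OIDC provider ARN
    }

    # The token audience must be STS.
    condition {
      test     = "StringEquals"
      variable = "${var.eks_oidc_issuer}:aud"
      values   = ["sts.amazonaws.com"]
    }

    # Only this ServiceAccount may assume the role.
    condition {
      test     = "StringEquals"
      variable = "${var.eks_oidc_issuer}:sub"
      values   = ["system:serviceaccount:flux-system:image-reflector-controller"]
    }
  }
}
```

The `aud` and `sub` values here are the same claims you can later read from the projected token during verification.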
📦 Deployment Items
- IRSA IAM role for Flux image controllers:
  - Module: `terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts` (v6.2.1)
  - Trusted via your EKS OIDC provider
  - Managed policy granting ECR read + `GetAuthorizationToken`
- ServiceAccount annotations (Kubernetes) for:
  - `flux-system/image-reflector-controller`
  - `flux-system/image-automation-controller`
- Flux Operator Helm chart (v0.32.0)
- Flux Instance Helm chart (distribution v2.4.0), Git sync to your Bitbucket repo
⚙️ Putting It in Terraform
1️⃣ IAM Policy for ECR (least privilege)
# ------------------------------------------------------
# IRSA policy document
# ------------------------------------------------------
data "aws_iam_policy_document" "flux_ecr_read" {
  # GetAuthorizationToken is account-wide and cannot be scoped to a repository
  statement {
    sid       = "ECRAuthToken"
    effect    = "Allow"
    actions   = ["ecr:GetAuthorizationToken"]
    resources = ["*"]
  }

  # Read-only access to repositories in this account and region
  statement {
    sid    = "ECRRead"
    effect = "Allow"
    actions = [
      "ecr:BatchCheckLayerAvailability",
      "ecr:GetDownloadUrlForLayer",
      "ecr:BatchGetImage",
      "ecr:DescribeImages",
      "ecr:DescribeRepositories",
      "ecr:ListImages",
      "ecr:DescribeRegistry",
    ]
    resources = [
      "arn:aws:ecr:${var.aws_region}:${var.aws_acc_id}:repository/*"
    ]
  }
}

# Render the policy document
resource "aws_iam_policy" "flux_ecr_ro" {
  name   = "${var.name_prefix}-flux-ecr-ro"
  policy = data.aws_iam_policy_document.flux_ecr_read.json
}
⚠️ Common pitfall: Avoid something like `repository/<apps>/*`. ECR isn't hierarchical; that pattern doesn't match e.g. `apps/mainapp` and causes `AccessDeniedException`.
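If you would rather not grant read access to every repository, one option is to list full repository names explicitly and build the ARNs from them. This is a minimal sketch, not the setup used in this post: `var.flux_ecr_repos` and `local.flux_ecr_repo_arns` are hypothetical names, and `zenler/mainapp` is only an example repository name.

```hcl
# Hypothetical list of full ECR repository names Flux is allowed to scan.
variable "flux_ecr_repos" {
  type    = list(string)
  default = ["zenler/mainapp"]
}

locals {
  # One exact ARN per repository, with no wildcard in the name.
  flux_ecr_repo_arns = [
    for repo in var.flux_ecr_repos :
    "arn:aws:ecr:${var.aws_region}:${var.aws_acc_id}:repository/${repo}"
  ]
}
```

The read statement could then use `resources = local.flux_ecr_repo_arns` in place of the `repository/*` wildcard, while `ecr:GetAuthorizationToken` stays on `*`.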
2️⃣ IRSA Role for Flux Controllers
# ------------------------------------------------------
# Flux Image-Controller IRSA
# ------------------------------------------------------
module "flux_irsa" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts"
  version = "6.2.1"

  name = "${var.name_prefix}-flux-imgctrl-irsa"

  # Both image controllers are trusted to assume the same role
  oidc_providers = {
    automation = {
      provider_arn               = var.eks_oidc_provider
      namespace_service_accounts = ["flux-system:image-automation-controller"]
    }
    reflector = {
      provider_arn               = var.eks_oidc_provider
      namespace_service_accounts = ["flux-system:image-reflector-controller"]
    }
  }

  policies = {
    ecr_ro = aws_iam_policy.flux_ecr_ro.arn
  }
}
💡 One IRSA role for both controllers simplifies management; separation is optional.
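If you do want the separation, a rough sketch is to instantiate the same module once per controller with `for_each`. The module name `flux_irsa_per_controller` and the local `flux_image_controllers` are made up for illustration; the inputs mirror the shared-role module.

```hcl
locals {
  # Suffixes of the two image controllers' ServiceAccount names.
  flux_image_controllers = toset(["reflector", "automation"])
}

module "flux_irsa_per_controller" {
  for_each = local.flux_image_controllers

  source  = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts"
  version = "6.2.1"

  name = "${var.name_prefix}-flux-image-${each.key}-irsa"

  # Each role trusts exactly one ServiceAccount.
  oidc_providers = {
    main = {
      provider_arn               = var.eks_oidc_provider
      namespace_service_accounts = ["flux-system:image-${each.key}-controller"]
    }
  }

  policies = {
    ecr_ro = aws_iam_policy.flux_ecr_ro.arn
  }
}
```

With this layout, each ServiceAccount annotation would point at `module.flux_irsa_per_controller["reflector"].arn` or `module.flux_irsa_per_controller["automation"].arn`, and each role's policies can be tightened independently.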
3️⃣ Annotate Flux ServiceAccounts
# ------------------------------------------------------
# Controller annotations for SA
# ------------------------------------------------------
resource "kubernetes_annotations" "flux_irsa_sa" {
  for_each = toset(["reflector", "automation"])

  api_version = "v1"
  kind        = "ServiceAccount"

  metadata {
    name      = "image-${each.value}-controller"
    namespace = "flux-system"
  }

  # Point the ServiceAccount at the IRSA role
  annotations = {
    "eks.amazonaws.com/role-arn" = module.flux_irsa.arn
  }
}
4️⃣ Use Bitbucket credential
# ------------------------------------------------------
# FluxCD namespace
# ------------------------------------------------------
resource "kubernetes_namespace_v1" "flux_system" {
  metadata {
    name = var.flux_namespace
    annotations = {
      name                              = var.flux_namespace
      "kustomize.toolkit.fluxcd.io/ssa" = "Ignore"
    }
  }

  depends_on = [null_resource.kube_config]

  lifecycle {
    # Flux adds its own labels and annotations; don't fight over them
    ignore_changes = [
      metadata[0].labels,
      metadata[0].annotations,
    ]
    create_before_destroy = false
  }
}

# ------------------------------------------------------
# Inject BB credential
# ------------------------------------------------------
resource "kubernetes_secret_v1" "bb_passwd" {
  metadata {
    name      = "bb-${var.bb_user}-passwd"
    namespace = kubernetes_namespace_v1.flux_system.id
  }

  data = {
    username = var.bb_user
    password = var.bb_user_secrets.apwd
  }

  type = "Opaque"
}
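The secret above reads from two variables whose declarations are not shown. A possible declaration is sketched below; the types and descriptions are assumptions inferred from how `var.bb_user` and `var.bb_user_secrets.apwd` are used, and marking the secret as `sensitive` only hides it from plan output (it is still stored in state).

```hcl
# Assumed shapes; only the names bb_user and bb_user_secrets.apwd
# come from the resources above.
variable "bb_user" {
  description = "Bitbucket username Flux uses for Git access"
  type        = string
}

variable "bb_user_secrets" {
  description = "Bitbucket secrets; apwd is assumed to be an app password"
  type = object({
    apwd = string
  })
  sensitive = true
}
```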
5️⃣ Flux Operator & Flux Instance Helm releases
# ------------------------------------------------------
# Flux Operator and Flux Instance Helm charts
# ------------------------------------------------------
resource "helm_release" "flux_operator" {
  repository       = "oci://ghcr.io/controlplaneio-fluxcd/charts"
  chart            = "flux-operator"
  version          = "0.32.0"
  name             = "flux-operator"
  namespace        = kubernetes_namespace_v1.flux_system.id
  create_namespace = false
  wait             = true
}

resource "helm_release" "flux_instance" {
  depends_on = [helm_release.flux_operator]

  repository = "oci://ghcr.io/controlplaneio-fluxcd/charts"
  chart      = "flux-instance"
  name       = "flux"
  namespace  = kubernetes_namespace_v1.flux_system.id

  # Flux components and kustomize patches
  values = [
    file("${path.module}/values/components.yaml")
  ]

  # Flux distribution
  set {
    name  = "installCRDs"
    value = true
  }
  set {
    name  = "instance.distribution.version"
    value = var.flux_version
  }
  set {
    name  = "instance.distribution.registry"
    value = var.flux_registry
  }

  # Configure Flux Git sync
  set {
    name  = "instance.sync.kind"
    value = "GitRepository"
  }
  set {
    name  = "instance.sync.url"
    value = "${var.bb_base_url}/${var.bb_flex_repo}.git"
  }
  set {
    name  = "instance.sync.path"
    value = var.git_path # e.g. "/"
  }
  set {
    name  = "instance.sync.ref"
    value = var.git_ref # e.g. "refs/heads/main"
  }
  set {
    name  = "instance.sync.pullSecret"
    value = kubernetes_secret_v1.bb_passwd.metadata[0].name
  }
}
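For reference, the Flux-related variables used by the release could be declared along these lines. This is only a sketch: the defaults are assumptions based on the versions and examples mentioned in this post (`ghcr.io/fluxcd` is assumed to be the upstream registry), so adjust them to your environment.

```hcl
variable "flux_version" {
  description = "Flux distribution version for the Flux instance"
  type        = string
  default     = "2.4.0" # assumed from the distribution version listed above
}

variable "flux_registry" {
  description = "Container registry hosting the Flux controller images"
  type        = string
  default     = "ghcr.io/fluxcd" # assumption: upstream Flux registry
}

variable "git_path" {
  description = "Path inside the Git repo that Flux reconciles"
  type        = string
  default     = "/"
}

variable "git_ref" {
  description = "Git reference that Flux tracks"
  type        = string
  default     = "refs/heads/main"
}
```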
✅ Verification Checklist
1️⃣ IRSA projected token exists
~$ kubectl exec -n flux-system deploy/image-reflector-controller -- \
    ls -l /var/run/secrets/eks.amazonaws.com/serviceaccount
total 0
lrwxrwxrwx 1 root 1337 12 Oct 27 19:49 token -> ..data/token
2️⃣ Token audience is STS
~$ kubectl exec -n flux-system deploy/image-reflector-controller -- \
cat /var/run/secrets/eks.amazonaws.com/serviceaccount/token \
| jq -R 'split(".") | .[1] | @base64d | fromjson' | jq '.aud,.sub'
[
"sts.amazonaws.com"
]
"system:serviceaccount:flux-system:image-reflector-controller"
3️⃣ ImageRepository status
~$ kubectl get imagerepository -n flux-system mainapp -o yaml | yq '.status.conditions'
[
{
"lastTransitionTime": "2025-10-27T18:48:23Z",
"message": "successful scan: found 2 tags",
"observedGeneration": 1,
"reason": "Succeeded",
"status": "True",
"type": "Ready"
}
]
🧰 Troubleshooting Highlights
| Symptom | Cause | Fix |
|---|---|---|
| `no EC2 IMDS role found` | IRSA not active | Annotate SA + restart pod |
| Only `token` file exists | Normal (modern IRSA token projection) | Decode token for `aud: sts.amazonaws.com` |
| `AccessDenied: ecr:GetAuthorizationToken` | Missing token permission | Add `ecr:GetAuthorizationToken` on `*` |
| `AccessDenied` on repo read | Wrong ARN pattern | Use `repository/zenler/mainapp` or `repository/*` |
| SA annotation missing | FluxInstance CRD type is string list | Patch SAs with Terraform annotations |
🔒 Security Notes
- Scope ECR access narrowly (`repository/<repo>` preferred)
- Keep `ecr:GetAuthorizationToken` unscoped
- IRSA isolates pod access: no node IAM exposure
🕑 Change Log
- Operator: v0.32.0
- Flux: v2.4.0
- IAM module: v6.2.1
- Shared IRSA for image controllers
- Correct repo ARN (no nested wildcard)