Table Of Contents
- The Goal
- Architecture
- End-to-End Flow
- Deployment Overview
- Terraform Implementation
- Verification
- Troubleshooting
- Security
- Change Log
The New Zenler is moving to a shiny new platform (at the time of this writing), where we decided to use FluxCD as our Continuous Delivery tool. The other day, one of our developers reported that FluxCD had suddenly stopped pulling the latest image from ECR. When I checked, I saw this error in the log:
failed to configure authentication options: operation error ECR: GetAuthorizationToken, get identity: get credentials: failed to refresh cached credentials, no EC2 IMDS role found, operation error ec2imds: GetMetadata, canceled, context deadline exceeded
At that moment I knew the reason for the image-pull failure: Flux could not retrieve credentials from the EC2 Instance Metadata Service (IMDS) via the instance profile on the EC2 nodes. The code walked the default credential chain (which includes IMDS), but IMDS wasn't available or accessible. As a result, Flux had no credentials to call the ECR APIs and couldn't get an authorization token, hence the failure.
As I was already using IRSA for the CNI, CSI, etc., I thought this was a great opportunity to move Flux from IMDS to IRSA as well. This document captures my journey from that thought to the finish line: the issues I experienced and how I solved them to reach the end goal.
🎯 Goal
Deploy FluxCD (Operator + Flux Instance) on Amazon EKS with IAM Roles for Service Accounts (IRSA), so Flux Image Reflector/Automation can authenticate to Amazon ECR without node credentials, and auto-deploy images from ECR.
🏗️ High-Level Architectural View
The diagram below shows how Bitbucket, the FluxCD controllers, and AWS ECR interact: the Flux controllers sync manifests from Bitbucket and access images in ECR securely using IRSA.

| Component | Description |
|---|---|
| 🧩 Bitbucket Git Repo | Holds our GitOps manifests, synced by Flux Source Controller |
| ⚙️ Flux Source Controller | Pulls manifests from that Bitbucket repo |
| 🧱 Flux Kustomize Controller | Applies manifests to our cluster |
| 🔁 Flux Image Automation Controller | Watches image updates and commits manifest changes back to Git |
| 🔍 Flux Image Reflector Controller | Scans ECR for image tags and metadata |
| 🔐 EKS OIDC Provider | Issues tokens used by IRSA authentication |
| 🧾 STS AssumeRoleWithWebIdentity | Exchanges OIDC token for temporary IAM credentials |
| 🧠 IRSA IAM Role | Grants scoped ECR access (read + token retrieval) |
| 🐳 Amazon ECR Repository | Stores application images, queried by Image Reflector |
🔄 End-to-End Flow
- Pod starts with SA image-reflector-controller.
- EKS injects an OIDC web identity token.
- AWS STS exchanges it for temporary IAM creds.
- Flux Image Reflector uses them to call ECR APIs.
- Reflector lists tags → Image Automation commits new manifests → Kustomize applies updates.
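The first two steps of the flow can be sanity-checked from a shell in any pod that runs under the annotated ServiceAccount. When IRSA is active, the EKS pod identity webhook sets the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE environment variables in the container. The function name check_irsa_env and its messages are my own illustration (not part of Flux or the AWS tooling), so treat this as a minimal sketch:

```shell
# Hypothetical helper: report whether the IRSA environment looks complete.
# Relies only on the two variables the EKS pod identity webhook injects.
check_irsa_env() {
  if [ -z "${AWS_ROLE_ARN:-}" ]; then
    echo "AWS_ROLE_ARN is not set"
    return 1
  fi
  if [ -z "${AWS_WEB_IDENTITY_TOKEN_FILE:-}" ] || [ ! -r "${AWS_WEB_IDENTITY_TOKEN_FILE}" ]; then
    echo "web identity token file is missing or unreadable"
    return 1
  fi
  echo "IRSA environment looks complete for ${AWS_ROLE_ARN}"
}
```

If the first message appears, the pod was most likely created before the ServiceAccount carried the role annotation, which is the situation that makes the SDK fall back to IMDS.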
📦 Deployment Overview
- IRSA IAM role for Flux image controllers:
  - Module: terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts (v6.2.1)
  - Trusted via your EKS OIDC provider
  - Managed policy granting ECR read + GetAuthorizationToken
- ServiceAccount annotations (Kubernetes) for:
  - flux-system/image-reflector-controller
  - flux-system/image-automation-controller
- Flux Operator Helm chart (v0.32.0)
- Flux Instance Helm chart (distribution v2.4.0), Git sync to your Bitbucket repo
⚙️ Terraform Implementation
1️⃣ IAM Policy for ECR (least privilege)
# ------------------------------------------------------
# IRSA policy document
# ------------------------------------------------------
data "aws_iam_policy_document" "flux_ecr_read" {
  # Token retrieval is a registry-level call and cannot be scoped to a repository
  statement {
    sid       = "GetAuthTokenForECR"
    effect    = "Allow"
    actions   = ["ecr:GetAuthorizationToken"]
    resources = ["*"]
  }
  # Read-only access to the repositories in this account and region
  statement {
    sid    = "ReadOnlyAccessToECR"
    effect = "Allow"
    actions = [
      "ecr:BatchCheckLayerAvailability",
      "ecr:GetDownloadUrlForLayer",
      "ecr:BatchGetImage",
      "ecr:DescribeImages",
      "ecr:DescribeRepositories",
      "ecr:ListImages",
      "ecr:DescribeRegistry",
    ]
    resources = [
      "arn:aws:ecr:${var.aws_region}:${var.aws_acc_id}:repository/*"
    ]
  }
}

# Create the managed policy from the rendered policy document
resource "aws_iam_policy" "flux_ecr_ro" {
  name   = "${var.name_prefix}-flux-ecr-ro"
  policy = data.aws_iam_policy_document.flux_ecr_read.json
}
📝 Note: ecr:GetAuthorizationToken must always be unscoped (Resource="*"), since this API isn’t repository-specific.
⚠️ Common pitfall: Avoid a nested pattern such as repository/<apps>/*. ECR repository names are flat strings, not a directory hierarchy; in my setup that pattern did not match repositories such as apps/mainapp and caused AccessDeniedException.
2️⃣ IRSA Role for Flux Controllers
# ------------------------------------------------------
# Flux Image-Controller IRSA
# ------------------------------------------------------
module "flux_irsa" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts"
  version = "6.2.1"
  name    = "${var.name_prefix}-flux-imgctrl-irsa"

  # Trust both image controllers' ServiceAccounts via the EKS OIDC provider
  oidc_providers = {
    automation = {
      provider_arn               = var.eks_oidc_provider
      namespace_service_accounts = ["flux-system:image-automation-controller"]
    }
    reflector = {
      provider_arn               = var.eks_oidc_provider
      namespace_service_accounts = ["flux-system:image-reflector-controller"]
    }
  }

  policies = {
    ecr_ro = aws_iam_policy.flux_ecr_ro.arn
  }
}
💡 One IRSA role for both controllers simplifies management; separation is optional.
3️⃣ Annotate Flux ServiceAccounts
# ------------------------------------------------------
# Controller annotations for SA
# ------------------------------------------------------
resource "kubernetes_annotations" "flux_irsa_sa" {
  for_each    = toset(["reflector", "automation"])
  api_version = "v1"
  kind        = "ServiceAccount"
  metadata {
    name      = "image-${each.value}-controller"
    namespace = "flux-system"
  }
  annotations = {
    "eks.amazonaws.com/role-arn" = module.flux_irsa.arn
  }
}
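The IRSA environment is injected when a pod is created, so controller pods that were already running before the annotation was applied keep using the old credential chain until they are replaced. A minimal sketch of restarting both image controllers follows; the function name restart_image_controllers and the KUBECTL override variable are assumptions of mine, added so the commands can be previewed without touching the cluster:

```shell
# Hypothetical helper: restart both Flux image controllers so that new pods
# are created after the ServiceAccount annotation exists.
# KUBECTL is an assumed override (defaults to kubectl) used for dry runs.
restart_image_controllers() {
  local kubectl="${KUBECTL:-kubectl}" controller
  for controller in reflector automation; do
    # Trigger a rolling restart, then wait for the new pod to become ready
    "$kubectl" rollout restart "deployment/image-${controller}-controller" -n flux-system
    "$kubectl" rollout status "deployment/image-${controller}-controller" -n flux-system --timeout=120s
  done
}
```

Running `KUBECTL=echo restart_image_controllers` prints the four kubectl invocations instead of executing them.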
4️⃣ Use of Bitbucket credential
# ------------------------------------------------------
# FluxCD namespace
# ------------------------------------------------------
resource "kubernetes_namespace_v1" "flux_system" {
  metadata {
    name = var.flux_namespace
    annotations = {
      name                              = var.flux_namespace
      "kustomize.toolkit.fluxcd.io/ssa" = "Ignore"
    }
  }
  depends_on = [null_resource.kube_config]
  lifecycle {
    # Flux manages labels and annotations on its own namespace after bootstrap
    ignore_changes = [
      metadata[0].labels,
      metadata[0].annotations,
    ]
    create_before_destroy = false
  }
}
# ------------------------------------------------------
# Inject BB credential
# ------------------------------------------------------
resource "kubernetes_secret_v1" "bb_passwd" {
  metadata {
    name      = "bb-${var.bb_user}-passwd"
    namespace = kubernetes_namespace_v1.flux_system.id
  }
  data = {
    username = var.bb_user
    password = var.bb_user_secrets.apwd
  }
  type = "Opaque"
}
5️⃣ Flux Operator & Flux Instance Helm releases
# ------------------------------------------------------
# Flux Operators and Instance helm-chart
# ------------------------------------------------------
resource "helm_release" "flux_operator" {
  repository       = "oci://ghcr.io/controlplaneio-fluxcd/charts"
  chart            = "flux-operator"
  version          = "0.32.0"
  name             = "flux-operator"
  namespace        = kubernetes_namespace_v1.flux_system.id
  create_namespace = false
  wait             = true
}

resource "helm_release" "flux_instance" {
  depends_on = [helm_release.flux_operator]
  repository = "oci://ghcr.io/controlplaneio-fluxcd/charts"
  chart      = "flux-instance"
  name       = "flux"
  namespace  = kubernetes_namespace_v1.flux_system.id

  # Flux components and kustomize patches
  values = [
    file("${path.module}/values/components.yaml")
  ]

  # Flux distribution
  set {
    name  = "installCRDs"
    value = true
  }
  set {
    name  = "instance.distribution.version"
    value = var.flux_version
  }
  set {
    name  = "instance.distribution.registry"
    value = var.flux_registry
  }

  # Configure Flux Git sync
  set {
    name  = "instance.sync.kind"
    value = "GitRepository"
  }
  set {
    name  = "instance.sync.url"
    value = "${var.bb_base_url}/${var.bb_flex_repo}.git"
  }
  set {
    name  = "instance.sync.path"
    value = var.git_path # e.g. "/"
  }
  set {
    name  = "instance.sync.ref"
    value = var.git_ref # e.g. "refs/heads/main"
  }
  set {
    name  = "instance.sync.pullSecret"
    value = kubernetes_secret_v1.bb_passwd.metadata[0].name
  }
}
📝 Note: installCRDs=true ensures proper CRD registration before Flux Instance is applied.
📝 Note: instance.distribution.version can be safely upgraded once IRSA/ECR is stable.
✅ Verification Checklist
1️⃣ IRSA projected token exists
kubectl exec -n flux-system deploy/image-reflector-controller -- \
ls -l /var/run/secrets/eks.amazonaws.com/serviceaccount
total 0
lrwxrwxrwx 1 root 1337 12 Oct 27 19:49 token -> ..data/token
2️⃣ Token audience is STS
kubectl exec -n flux-system deploy/image-reflector-controller -- \
cat /var/run/secrets/eks.amazonaws.com/serviceaccount/token \
| jq -R 'split(".") | .[1] | @base64d | fromjson' | jq '.aud,.sub'
[
"sts.amazonaws.com"
]
"system:serviceaccount:flux-system:image-reflector-controller"
3️⃣ ImageRepository status
kubectl get imagerepository -n flux-system mainapp -o yaml | yq '.status.conditions'
[
{
"lastTransitionTime": "2025-10-27T18:48:23Z",
"message": "successful scan: found 2 tags",
"observedGeneration": 1,
"reason": "Succeeded",
"status": "True",
"type": "Ready"
}
]
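For the token audience check in step 2, note that JWT segments are base64url-encoded without padding, so a decoder that expects the standard base64 alphabet may fail on a payload that happens to contain - or _. As a hedged alternative, the sketch below converts the alphabet and restores padding before decoding; decode_jwt_payload is a hypothetical helper name, and it assumes a base64 binary that accepts -d:

```shell
# Hypothetical helper: print the decoded payload (second segment) of a JWT.
decode_jwt_payload() {
  local token="$1" payload pad
  # Take the payload segment and map the base64url alphabet to standard base64
  payload=$(printf '%s' "$token" | cut -d '.' -f 2 | tr '_-' '/+')
  # Restore the '=' padding that JWT encoding strips
  pad=$(( (4 - ${#payload} % 4) % 4 ))
  while [ "$pad" -gt 0 ]; do
    payload="${payload}="
    pad=$((pad - 1))
  done
  printf '%s' "$payload" | base64 -d
}
```

It can be fed with the output of the same kubectl exec ... cat command used in step 2, and the resulting JSON can then be piped to jq '.aud,.sub' as before.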
🧰 Troubleshooting
| Symptom | Probable Cause | Fix |
|---|---|---|
| no EC2 IMDS role found | IRSA not active or the Pod still not using it | Annotate SA + restart pod |
| Only token file exists | Normal (modern IRSA token projection) | Decode token for aud: sts.amazonaws.com |
| AccessDenied: ecr:GetAuthorizationToken | Role missing token permission | Add ecr:GetAuthorizationToken on * |
| AccessDenied on repo read | Wrong ARN pattern | Use exact ARN (e.g. repository/<prefix>/<app_name>) or allow repository/* |
| SA annotation missing | FluxInstance CRD type is list of strings | Patch SAs with Terraform annotations |
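The log-based rows of the table can be turned into a quick triage helper. This is only a sketch: diagnose_flux_ecr_error is a hypothetical name, and the matching is limited to the substrings listed in the table, so real log lines may need extra patterns:

```shell
# Hypothetical helper: map a controller log line to the fix from the table.
diagnose_flux_ecr_error() {
  case "$1" in
    *"no EC2 IMDS role found"*)
      echo "IRSA not active for this pod: annotate the ServiceAccount and restart the pod" ;;
    *AccessDenied*"ecr:GetAuthorizationToken"*)
      echo "Role is missing the token permission: allow ecr:GetAuthorizationToken on *" ;;
    *AccessDenied*)
      echo "Repository ARN pattern is probably wrong: use the exact ARN or repository/*" ;;
    *)
      echo "No known match: inspect the controller logs" ;;
  esac
}
```

Passing the error from the beginning of this post to the function selects the first branch, because the line contains "no EC2 IMDS role found".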
🔒 Security Notes
- Scope ECR access to specific repositories whenever possible (e.g. repository/<repo>)
- Keep ecr:GetAuthorizationToken unscoped
- IRSA isolates pod access: no node IAM exposure
- Rotate IAM policies periodically and audit for unused repositories
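Scoping the policy to specific repositories means listing each full repository ARN in the policy's resources. As a small sketch, the helper below assembles that ARN from region, account ID, and repository name, which can be handy for double-checking values before they go into Terraform. The name ecr_repo_arn and the 12-digit account check are my own additions for illustration:

```shell
# Hypothetical helper: build the ARN of a single ECR repository.
ecr_repo_arn() {
  local region="$1" account="$2" repo="$3"
  # AWS account IDs are 12-digit numbers; reject anything else early
  case "$account" in
    ""|*[!0-9]*)
      echo "account id must be numeric" >&2
      return 1 ;;
  esac
  if [ "${#account}" -ne 12 ]; then
    echo "account id must have 12 digits" >&2
    return 1
  fi
  printf 'arn:aws:ecr:%s:%s:repository/%s\n' "$region" "$account" "$repo"
}
```

The repository name is passed through as one flat string, including any slashes it contains.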
🏁 Conclusion
By combining FluxCD with IRSA, we’ve achieved:
- Pod-scoped ECR authentication (no IMDS dependency)
- Least-privilege IAM access
- Fully automated GitOps workflow via Terraform
This setup ensures that your CI/CD pipeline runs natively within EKS — secure, scalable, and elegant.
🕑 Change Log
- Operator: v0.32.0
- Flux: v2.4.0
- IAM module: v6.2.1
- Shared IRSA for image controllers
- Correct repo ARN (no nested wildcard)