I stopped putting AWS keys in GitHub Secrets. Here's what I do instead.

#aws #terraform #githubactions #iam

For a while my deploy pipelines all worked the same way. Generate an IAM user, copy its access key and secret into GitHub repo secrets, and let the workflow use them. It deploys fine. It also means a long-lived credential to my AWS account is sitting in a third party's vault, valid until I remember to rotate it, which I never do.

For Wingman I cut that out. The GitHub Actions workflow holds zero AWS keys. It asks GitHub for a short-lived token at runtime, hands that token to AWS, and AWS gives back temporary credentials that expire on their own (after an hour by default with the action I use), whether or not I remember them. This is OIDC federation. Below is how it's wired, the parts that are easy to get wrong, and one decision I'm still not happy about.

The problem with the old way

An IAM user access key is a static secret. Once it exists, anyone who can read it can use it from anywhere, forever, until it's rotated or deleted. Storing it in GitHub Secrets means the blast radius of a GitHub breach now includes my AWS account. It also means I own a rotation chore I will forget.

What I actually want: GitHub should be able to deploy this one repo to this one account, prove who it is on each run, and never hold a credential longer than the run takes.

The approach: trust GitHub's identity, not a stored key

OIDC flips the model. Instead of AWS trusting a secret that GitHub stores, AWS trusts GitHub's identity provider directly. On each run GitHub mints a signed JWT describing the job (which repo, which branch, which environment). The workflow passes that JWT to AWS STS. AWS checks the signature against GitHub's public keys, checks the token's claims against a trust policy I wrote, and if both pass, returns temporary credentials.

Two pieces make this work: a permission in the workflow, and a trust policy in AWS.

The workflow needs permission to request the token at all:

permissions:
  id-token: write   # required for OIDC
  contents: read

Miss this and the token request fails before AWS is ever contacted. It's the first thing to check when nothing else looks wrong.

Then the credential step, which carries no key, only a role ARN:

- name: Configure AWS credentials (OIDC — no long-lived keys)
  uses: aws-actions/configure-aws-credentials@v4.1.0
  with:
    role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
    aws-region: ${{ env.AWS_REGION }}

AWS_ROLE_ARN is not a secret in the credential sense. It's just an identifier. There's nothing in this repo's secrets that lets you authenticate to AWS on its own.

The trust policy is where the real work is

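On the AWS side, GitHub has to be registered as an OIDC provider in the account (the Terraform below looks up an existing registration through a data source), and I write a role that only GitHub Actions can assume, and only from my repo. This is the Terraform: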

data "aws_iam_openid_connect_provider" "github" {
  url = "https://token.actions.githubusercontent.com"
}

resource "aws_iam_role" "github_actions" {
  name = "${var.project_name}-github-actions"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Principal = {
        Federated = data.aws_iam_openid_connect_provider.github.arn
      }
      Action = "sts:AssumeRoleWithWebIdentity"
      Condition = {
        StringLike = {
          "token.actions.githubusercontent.com:sub" = "repo:${var.github_org}/${var.github_repo}:*"
        }
        StringEquals = {
          "token.actions.githubusercontent.com:aud" = "sts.amazonaws.com"
        }
      }
    }]
  })
}

The sub condition is the load-bearing line. It says: only tokens whose subject matches repo:my-org/wingman:* may assume this role. Without it, any GitHub repo on the platform can assume the role. With a loose wildcard such as repo:my-org/*, every repo in the org can. Both are real misconfigurations people ship, and they turn "keyless" into "anyone's keys."

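The :* at the end matches anything after the repo name: any branch, tag, environment, or pull-request run in that repo. If you want to lock deploys to main only, you tighten it to repo:my-org/wingman:ref:refs/heads/main. I kept it open across refs deliberately, since I run the same role from workflow_dispatch too. Worth knowing: a workflow_dispatch run carries the ref it was started from, so a manual run from main would still pass the main-only pattern. The tighter form only blocks runs from other refs.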
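For comparison, here is a minimal sketch of a main-only trust policy. It uses the aws_iam_policy_document data source instead of jsonencode, which is a stylistic swap on my part, and the name github_actions_trust_main_only is hypothetical. The variables and the provider data source are the ones from the role above.

```hcl
# Sketch only: a main-only variant of the trust policy.
# "github_actions_trust_main_only" is a hypothetical name.
data "aws_iam_policy_document" "github_actions_trust_main_only" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [data.aws_iam_openid_connect_provider.github.arn]
    }

    # Exact match on the subject: only runs on refs/heads/main in this repo.
    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:${var.github_org}/${var.github_repo}:ref:refs/heads/main"]
    }

    # Exact match on the audience, same as in the original role.
    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }
  }
}
```

To use it, the role would set assume_role_policy = data.aws_iam_policy_document.github_actions_trust_main_only.json. One caveat: jobs that declare a GitHub environment get a subject of the form repo:ORG/REPO:environment:NAME rather than a ref, so they would need their own entry in values.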

The aud check pins the audience to sts.amazonaws.com. Keep it as a StringEquals, not a StringLike. Exact match on audience is one of those small things that quietly closes a door.

Scope the role to exactly what the deploy touches

Authentication gets GitHub in the door. Authorization decides what it can do once inside. The deploy does four things, so the role grants four things and stops:

resource "aws_iam_role_policy" "github_actions" {
  name = "deploy-permissions"
  role = aws_iam_role.github_actions.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      # ECR: push container images
      {
        Effect = "Allow"
        Action = [
          "ecr:GetAuthorizationToken",
          "ecr:BatchCheckLayerAvailability",
          "ecr:PutImage",
          "ecr:InitiateLayerUpload",
          "ecr:UploadLayerPart",
          "ecr:CompleteLayerUpload",
        ]
        Resource = "*"
      },
      # Lambda: update function code and config (scoped to the one function)
      {
        Effect = "Allow"
        Action = [
          "lambda:UpdateFunctionCode",
          "lambda:UpdateFunctionConfiguration",
          "lambda:GetFunction",
          "lambda:GetFunctionConfiguration",
        ]
        Resource = aws_lambda_function.backend.arn
      },
      # S3: sync frontend assets (scoped to the one bucket)
      {
        Effect   = "Allow"
        Action   = ["s3:PutObject", "s3:DeleteObject", "s3:ListBucket"]
        Resource = [aws_s3_bucket.frontend.arn, "${aws_s3_bucket.frontend.arn}/*"]
      },
      # CloudFront: invalidate after deploy (scoped to the one distribution)
      {
        Effect   = "Allow"
        Action   = ["cloudfront:CreateInvalidation"]
        Resource = aws_cloudfront_distribution.main.arn
      },
    ]
  })
}

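Lambda, S3, and CloudFront are pinned to specific ARNs. ECR is the exception. ecr:GetAuthorizationToken needs Resource = "*" because it is account-scoped and won't accept a resource constraint. That's an AWS quirk, not laziness. The push actions in the same statement do accept a repository ARN, though, and the auth token doesn't narrow anything by itself, so as written this role can push to any ECR repository in the account. Splitting the statement and pinning the push actions to the one repository is the tighter version.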
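A tighter take on the ECR statement, as a sketch: keep the wildcard only on the auth-token call and pin the push actions to one repository. The resource aws_ecr_repository.backend and the policy name are assumptions for illustration; the post doesn't show how the repository is defined.

```hcl
# Sketch only: ECR permissions split by what each action can be scoped to.
# aws_ecr_repository.backend is a hypothetical resource name.
resource "aws_iam_role_policy" "github_actions_ecr" {
  name = "ecr-push"
  role = aws_iam_role.github_actions.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      # Account-scoped call: does not accept a resource constraint.
      {
        Effect   = "Allow"
        Action   = ["ecr:GetAuthorizationToken"]
        Resource = "*"
      },
      # Push actions: limited to the one repository the deploy writes to.
      {
        Effect = "Allow"
        Action = [
          "ecr:BatchCheckLayerAvailability",
          "ecr:PutImage",
          "ecr:InitiateLayerUpload",
          "ecr:UploadLayerPart",
          "ecr:CompleteLayerUpload",
        ]
        Resource = aws_ecr_repository.backend.arn
      },
    ]
  })
}
```

If this were adopted, the ECR statement in deploy-permissions would be removed so the wildcard grant doesn't linger alongside the scoped one.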

What I'd flag if I were reviewing this

It mostly came together without drama, but two things are worth being honest about.

The OIDC provider is effectively a one-per-account resource: AWS allows only one provider per issuer URL in an account. The first time you set this up there's a decent chance one already exists for GitHub, from another repo or an earlier experiment, and terraform apply will fail rather than create a duplicate. I handle it with a data source that reads the existing provider, plus a commented-out resource block as the fallback for a truly fresh account:

# If the OIDC provider doesn't exist yet, comment out the data source
# and uncomment this, then re-run apply:
# resource "aws_iam_openid_connect_provider" "github" {
#   url             = "https://token.actions.githubusercontent.com"
#   client_id_list  = ["sts.amazonaws.com"]
#   thumbprint_list = ["6938fd4d98bab03faadb97b34396831e3780aea1"]
# }

It's a small thing, but it's the difference between a clean apply and ten minutes of confusion about why a global resource conflicts.
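If commenting and uncommenting blocks feels fragile, one alternative is a boolean toggle. This is a sketch under assumptions, not what the repo does: the variable create_github_oidc_provider and the local github_oidc_provider_arn are hypothetical names, and the one() function needs Terraform 0.15 or later.

```hcl
# Sketch only: choose between creating the provider and reading an existing one.
variable "create_github_oidc_provider" {
  description = "Set to true only in an account that has no GitHub OIDC provider yet."
  type        = bool
  default     = false
}

# Created only when the toggle is on (fresh account).
resource "aws_iam_openid_connect_provider" "github" {
  count           = var.create_github_oidc_provider ? 1 : 0
  url             = "https://token.actions.githubusercontent.com"
  client_id_list  = ["sts.amazonaws.com"]
  thumbprint_list = ["6938fd4d98bab03faadb97b34396831e3780aea1"]
}

# Read only when the toggle is off (provider already exists).
data "aws_iam_openid_connect_provider" "github" {
  count = var.create_github_oidc_provider ? 0 : 1
  url   = "https://token.actions.githubusercontent.com"
}

locals {
  # Exactly one of the two lists has an element, so one() returns that ARN.
  github_oidc_provider_arn = one(concat(
    aws_iam_openid_connect_provider.github[*].arn,
    data.aws_iam_openid_connect_provider.github[*].arn,
  ))
}
```

With this in place, the role's Federated principal would reference local.github_oidc_provider_arn instead of the data source directly. The trade-off is a little more indirection in exchange for not editing the file per account.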

The part I'm not satisfied with: the workflow still injects third-party API keys (OpenAI, Serper, Pushover) into the Lambda's environment at deploy time, pulled from GitHub Secrets and pushed in with update-function-configuration. So I solved the AWS-credential problem cleanly and then left application secrets living in GitHub and in Lambda environment variables, which are encrypted at rest but readable in plaintext by any principal with lambda:GetFunctionConfiguration, this deploy role included. The right move is Secrets Manager or SSM Parameter Store, with the Lambda reading them at cold start and the GitHub role granted nothing more than permission to trigger a deploy. I know the fix. I just haven't done it yet, and pretending otherwise would be dishonest.
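For the record, the Terraform side of that fix could be fairly small. A minimal sketch, assuming Secrets Manager and a Lambda execution role defined as aws_iam_role.lambda_exec; both that name and the secret name are hypothetical, since the post doesn't show the Lambda's role.

```hcl
# Sketch only: one secret container and read access for the Lambda, not for CI.
# aws_iam_role.lambda_exec and the secret name are hypothetical.
resource "aws_secretsmanager_secret" "app" {
  name = "${var.project_name}/app-secrets"
}

resource "aws_iam_role_policy" "lambda_read_secrets" {
  name = "read-app-secrets"
  role = aws_iam_role.lambda_exec.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["secretsmanager:GetSecretValue"]
      Resource = aws_secretsmanager_secret.app.arn
    }]
  })
}
```

The secret's value would be set out of band (console or CLI) rather than in Terraform, so it stays out of state and out of GitHub. The GitHub Actions role gets no Secrets Manager permissions at all, which is the point.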

Takeaway

OIDC removes the standing AWS credential from your CI, but the security only holds if the trust policy's sub claim is scoped to your exact repo. Get that line right and the rest is just IAM.