I didn’t start backing up my GitHub repositories because I distrusted GitHub.
I started because I realized something uncomfortable: GitHub had become a single point of failure for work I actually cared about.
Between long-lived projects, experiments I might want years later, and repositories that quietly became important, I didn’t like the idea that a deleted repo, a locked account, or a bad force-push could wipe everything out.
I wanted an off-platform, boring, automated backup.
Amazon S3 fit that mental model perfectly:
- Independent of GitHub
- Cheap
- Extremely durable
- Built for long-term storage
What sounded simple turned out to be very easy to get wrong.
This article documents the approach that finally worked — including the mistakes.
What this article covers
This guide shows how to:
- Back up multiple GitHub repositories
- Run backups weekly
- Preserve full Git history (branches + tags)
- Avoid AWS access keys
- Use OIDC + temporary credentials
- Store backups safely in Amazon S3
This is not a ZIP download tutorial.
This is a real backup.
High-level architecture (correct model)
Architecture flow
- GitHub Actions runs on a schedule
- GitHub issues an OIDC identity token
- AWS STS validates the token
- AWS issues temporary credentials
- The workflow uploads backups to S3
No IAM users.
No static secrets.
Nothing long-lived.
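For readers who want to see what the credentials step does on their behalf, the exchange above can be sketched by hand. This is a rough illustration under assumptions, not the action's actual implementation: `ROLE_ARN`, the session name, and both function names are placeholders of mine, and it assumes `curl`, `jq`, and the AWS CLI are available on the runner. The two `ACTIONS_ID_TOKEN_REQUEST_*` variables are only set when the job has `id-token: write`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Step 1: ask GitHub for a short-lived OIDC token scoped to the STS audience.
request_oidc_token() {
  curl -sS \
    -H "Authorization: bearer ${ACTIONS_ID_TOKEN_REQUEST_TOKEN}" \
    "${ACTIONS_ID_TOKEN_REQUEST_URL}&audience=sts.amazonaws.com" \
    | jq -r '.value'
}

# Step 2: trade that token for temporary AWS credentials.
# Only the expiry time is printed here, so no secret lands in the log.
assume_backup_role() {
  local token="$1"
  aws sts assume-role-with-web-identity \
    --role-arn "${ROLE_ARN}" \
    --role-session-name "github-backup" \
    --web-identity-token "${token}" \
    --duration-seconds 900 \
    --query 'Credentials.Expiration' \
    --output text
}

# ROLE_ARN is a hypothetical variable holding the backup role's ARN.
if [ -n "${ACTIONS_ID_TOKEN_REQUEST_URL:-}" ] && [ -n "${ROLE_ARN:-}" ]; then
  assume_backup_role "$(request_oidc_token)"
else
  echo "Not inside a GitHub Actions job with id-token: write; skipping."
fi
```

In the actual workflow none of this is written by hand; seeing it spelled out mainly helps when debugging a trust policy that rejects the token.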
Why git bundle (and not ZIP files)
ZIP files look tempting — until you need to restore.
ZIP backups:
- ❌ Lose commit history
- ❌ Drop branches and tags
- ❌ Are painful to restore correctly
A git bundle is different. It contains:
- All commits
- All branches
- All tags
- In a single portable file
Creating a bundle
git bundle create repo-backup.bundle --all
If your backup can’t restore history, it’s not a backup.
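A quick way to check that claim is to inspect a bundle right after creating it. The sketch below builds a throwaway repository (the `demo` name, the `feature` branch, and the `v1` tag are made up for illustration), bundles it, and lists the refs the bundle carries. Which refs end up in a bundle depends on the clone it was created from, so listing them is worth doing on a real backup too.

```shell
#!/usr/bin/env bash
set -euo pipefail

work="$(mktemp -d)"
cd "$work"

# Throwaway repository with one commit, one extra branch, and one tag.
git init -q demo
cd demo
git -c user.name="Backup Demo" -c user.email="demo@example.com" \
  commit -q --allow-empty -m "initial commit"
git branch feature
git tag v1

# Bundle every ref, then confirm the file is a valid, complete bundle.
git bundle create ../demo.bundle --all
git bundle verify ../demo.bundle

# Show exactly which refs a restore would have available.
refs="$(git bundle list-heads ../demo.bundle)"
printf '%s\n' "$refs"
```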
The IAM problem that caused most of the pain
The hardest part wasn’t GitHub Actions.
It was AWS permissions.
The confusing part
AWS uses two different policy types:
| Policy type | Used for | Requires Principal |
|---|---|---|
| IAM role policy | Identity permissions | ❌ No |
| S3 bucket policy | Resource permissions | ✅ Yes |
They look similar.
They behave very differently.
Why “invalid principal” kept appearing
At one point, everything looked correct — but AWS kept returning:
Invalid principal
The reason:
- An IAM policy was pasted into an S3 bucket policy
- Or the principal ARN didn’t match the actual role
The rule that finally made it click
- IAM role policies never define a Principal
- S3 bucket policies must define who is allowed access
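To make the second rule concrete, here is a minimal sketch of a bucket policy that names the backup role as its Principal. The account ID `123456789012` and the `Sid` are placeholders I made up; the role and bucket names match the ones used in the Terraform further down. The script only writes the policy to a temporary file, and the command that would apply it is left as a comment.

```shell
#!/usr/bin/env bash
set -euo pipefail

policy_file="$(mktemp)"

# A bucket policy is a resource policy, so it must say WHO gets access.
cat > "$policy_file" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowBackupRoleUploads",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:role/github-actions-s3-backup"
      },
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::example-backup-bucket/*"
    }
  ]
}
EOF

echo "Bucket policy written to $policy_file"

# To apply it (needs credentials allowed to manage the bucket):
#   aws s3api put-bucket-policy \
#     --bucket example-backup-bucket \
#     --policy "file://$policy_file"
```

If the role ARN in `Principal` does not point at an existing role, S3 rejects the policy, which is one way to end up with the "Invalid principal" error described above.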
S3 authorization model (the missing mental model)
This diagram explains the core issue that caused most confusion.
Key idea
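An S3 upload succeeds only if the request is allowed and nothing denies it:
- The IAM role policy allows the action
- The S3 bucket policy allows the same role
When the role and the bucket live in different AWS accounts, both must be true. Within a single account, an Allow on either side is enough, but an explicit Deny on either side always wins.
If either side denies it → AccessDenied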
The GitHub Actions workflow (clean and boring)
Once the security model was clear, the workflow itself became simple.
name: Weekly S3 Repo Backup

on:
  schedule:
    - cron: "15 3 * * 0" # Weekly: Sundays at 03:15 UTC
  workflow_dispatch: {}

permissions:
  id-token: write # lets the job request an OIDC token
  contents: read

jobs:
  backup:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout full history
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Create git bundle
        run: |
          set -e
          REPO_NAME="${GITHUB_REPOSITORY#*/}"
          TS="$(date -u +%Y-%m-%dT%H-%M-%SZ)"
          mkdir -p backups
          git bundle create "backups/${REPO_NAME}-${TS}.bundle" --all
          sha256sum "backups/${REPO_NAME}-${TS}.bundle" > "backups/${REPO_NAME}-${TS}.sha256"

      - name: Configure AWS credentials (OIDC)
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: <IAM_ROLE_ARN>
          aws-region: <AWS_REGION>

      - name: Upload to S3
        run: |
          aws s3 cp backups/ \
            "s3://<bucket-name>/github-backups/${GITHUB_REPOSITORY}/" \
            --recursive
Nothing clever.
Nothing hidden.
That’s intentional.
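The workflow above backs up the repository it runs in. One possible way to cover several repositories from a single job is to mirror-clone each one and bundle the mirror, which also records every branch under `refs/heads/`. The sketch below is my own illustration, not part of the original setup: `backup_repo` is a hypothetical helper, and the demo uses a local throwaway repository in place of real clone URLs. Cloning other private repositories would additionally need a token with read access, which this article does not cover.

```shell
#!/usr/bin/env bash
set -euo pipefail

# backup_repo <clone-url-or-path> <output-dir>
# Mirror-clones one repository and writes a timestamped bundle plus checksum.
backup_repo() {
  local url="$1" outdir="$2"
  local name ts tmp
  name="$(basename "${url%.git}")"
  ts="$(date -u +%Y-%m-%dT%H-%M-%SZ)"
  tmp="$(mktemp -d)"

  git clone --quiet --mirror "$url" "$tmp/$name.git"

  mkdir -p "$outdir"
  outdir="$(cd "$outdir" && pwd)" # absolute path, since git -C changes directory
  git -C "$tmp/$name.git" bundle create "$outdir/$name-$ts.bundle" --all
  (cd "$outdir" && sha256sum "$name-$ts.bundle" > "$name-$ts.sha256")

  rm -rf "$tmp"
}

# Demo: a local repository stands in for a list of GitHub clone URLs.
demo="$(mktemp -d)"
git init -q "$demo/source"
git -C "$demo/source" -c user.name="Backup Demo" -c user.email="demo@example.com" \
  commit -q --allow-empty -m "initial commit"
git -C "$demo/source" branch feature

for repo in "$demo/source"; do
  backup_repo "$repo" "$demo/backups"
done

bundle="$(ls "$demo"/backups/source-*.bundle)"
git -C "$demo/source" bundle list-heads "$bundle"
```

The resulting `backups/` directory can be uploaded with the same `aws s3 cp --recursive` step as before.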
Terraform setup (AWS side)
This is a minimal Terraform configuration — no extras.
GitHub OIDC provider
resource "aws_iam_openid_connect_provider" "github" {
  url = "https://token.actions.githubusercontent.com"

  client_id_list = [
    "sts.amazonaws.com"
  ]

  thumbprint_list = [
    "6938fd4d98bab03faadb97b34396831e3780aea1"
  ]
}
IAM role for GitHub Actions
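resource "aws_iam_role" "github_backup" {
  name = "github-actions-s3-backup"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = "sts:AssumeRoleWithWebIdentity"
      Principal = {
        Federated = aws_iam_openid_connect_provider.github.arn
      }
      Condition = {
        StringEquals = {
          "token.actions.githubusercontent.com:aud" = "sts.amazonaws.com"
        }
        StringLike = {
          # Replace <github-owner> with your own user or organization name.
          # A pattern like "repo:*/*:*" would let any repository on GitHub assume this role.
          "token.actions.githubusercontent.com:sub" = "repo:<github-owner>/*:*"
        }
      }
    }]
  })
}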
IAM role policy (S3 upload and list only, no read or delete)
resource "aws_iam_role_policy" "s3_backup" {
  role = aws_iam_role.github_backup.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
        Action   = ["s3:ListBucket"]
        Resource = "arn:aws:s3:::example-backup-bucket"
      },
      {
        Effect = "Allow"
        Action = [
          "s3:PutObject",
          "s3:AbortMultipartUpload"
        ]
        Resource = "arn:aws:s3:::example-backup-bucket/*"
      }
    ]
  })
}
Restoring from a backup
Restoring is refreshingly simple.
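git clone repo-backup.bundle restored-repo
cd restored-repo
# After cloning, origin points at the bundle file, so repoint it at the new remote first
git remote set-url origin <new-repo-url>
# --all pushes local branches (refs/heads/*) only
git push --all origin
git push --tags origin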
No GitHub API.
No special tooling.
Just Git.
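Before restoring, it is worth confirming the downloaded bundle matches the checksum the workflow stored next to it. The sketch below uses a dummy file in place of a real bundle (`demo.bundle` is a made-up name) and assumes GNU `sha256sum`, as on the `ubuntu-latest` runner. One detail to watch: the checksum file records the path as it was when the checksum was written (`backups/...` in the workflow), so the check has to run from a directory where that relative path resolves.

```shell
#!/usr/bin/env bash
set -euo pipefail

work="$(mktemp -d)"
cd "$work"

# Stand-in for a bundle and checksum downloaded from S3 into ./backups/.
mkdir -p backups
printf 'not a real bundle\n' > backups/demo.bundle
sha256sum backups/demo.bundle > backups/demo.sha256

# The .sha256 file contains "hash  backups/demo.bundle", so run the check
# from the parent directory of backups/.
if sha256sum --check --quiet backups/demo.sha256; then
  status="ok"
else
  status="corrupt"
fi
echo "checksum status: $status"
```

On a real backup, running `git bundle verify` from inside any Git repository is a sensible second check before cloning from the bundle.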
Lessons learned
- Sketch trust relationships before writing policies
- Don’t trust AWS error messages blindly
- Never use root as a bucket principal
- Test with one repo before scaling
- Keep backups boring
Final thoughts
This setup isn’t flashy — and that’s the point.
A good backup system is something you forget about until the day you need it.
And when that day comes, it should just work.


