TL;DR: The safest way to make changes to your Helm Charts and Kustomize Overlays is to let Argo CD render them for you. This can be done by spinning up an ephemeral cluster in your automated pipelines. This article presents a tool (argocd-diff-preview) for rendering manifest changes on pull requests. The rendered output is similar to what Atlantis creates for Terraform.
Problem
In the Kubernetes world, we often use templating tools like Kustomize and Helm to generate our Kubernetes manifests. These tools make maintaining and streamlining configuration easier across applications and environments. However, they also make it harder to visualize the application's actual configuration in the cluster.
Mentally parsing Helm templates and Kustomize patches is hard without rendering the actual output. Thus, making mistakes while modifying an application's configuration is relatively easy.
In the field of GitOps and infrastructure as code, all configurations are checked into Git and modified through PRs. The code changes in the PR are reviewed by a human, who needs to understand the changes made to the configuration. This is hard when the configuration is generated through templating tools like Kustomize and Helm.
If you are interested in a more detailed walkthrough of this problem, I recommend watching Nicholas Morey's talk at KubeCon 2024: "The Rendered Manifests Pattern: Reveal Your True Desired State".
This article introduces the tool argocd-diff-preview, which solves this problem by rendering manifest changes directly on pull requests.
... but first, let's go through two simple examples where not rendering manifests can result in misconfiguration:
Helm misconfiguration example
Here we see an example of a developer trying to override the replica count on an Argo CD application:
This PR may look correct, but as a reviewer, you do not know if the value specified in the Helm Chart is named replicas or replicaCount. The code change has no effect if the value name is incorrect. Without rendering the Helm templates, the likelihood of these errors going to production is high.
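To make the failure mode concrete, here is a toy model in Python. It is not Helm itself: the chart default and the render_replicas helper are hypothetical stand-ins for a template that only reads the key its author chose.

```python
# Toy stand-in for a Helm template (NOT real Helm).
# Assumption: the hypothetical chart reads only the key "replicaCount".
CHART_DEFAULTS = {"replicaCount": 1}


def render_replicas(overrides: dict) -> int:
    """Merge user overrides into the chart defaults and return the rendered replica count."""
    values = {**CHART_DEFAULTS, **overrides}
    # An unknown key such as "replicas" is merged in but never read by the template.
    return values["replicaCount"]


wrong_key = render_replicas({"replicas": 5})       # silently ignored
right_key = render_replicas({"replicaCount": 5})   # takes effect
```

Both overrides look equally plausible in a PR diff, yet only one of them changes the rendered manifest. That difference is visible only in the rendered output.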
Kustomize misconfiguration example
Here we see an example of a developer trying to set the replica count for both staging and production:
Again, this PR may look correct because the change happens in a base folder, so the change applies to all overlays (production and staging). But as a reviewer, you do not know if this value is overridden later down the chain of overlays.
~/someApp
├── base
│   ├── deployment.yaml ⬅️ File changed in Pull Request
│   ├── kustomization.yaml
│   └── service.yaml
└── overlays
    ├── staging
    │   ├── cpu_count.yaml
    │   └── kustomization.yaml
    └── production
        ├── cpu_count.yaml
        ├── kustomization.yaml
        └── replica_count.yaml ⬅️ replicaCount overwritten here
This unintended result might not have been caught without rendering the final output for staging and production.
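The same effect can be sketched with a simplified model of overlays in Python. This is not how Kustomize is implemented: apply_overlay and all values below are hypothetical, and patches are reduced to flat dictionary updates.

```python
def apply_overlay(base: dict, patches: list) -> dict:
    """Apply each patch on top of the base, later patches winning (simplified model)."""
    rendered = dict(base)
    for patch in patches:
        rendered.update(patch)
    return rendered


# The PR changes the base replica count to 5 (hypothetical values).
base = {"replicas": 5, "cpu": "100m"}

overlays = {
    "staging": [{"cpu": "250m"}],                        # cpu_count.yaml only
    "production": [{"cpu": "500m"}, {"replicas": 2}],    # cpu_count.yaml + replica_count.yaml
}

rendered = {env: apply_overlay(base, patches) for env, patches in overlays.items()}
```

In this model, staging picks up the new base value while production keeps the value from its own patch, which is exactly the kind of surprise a rendered diff would reveal.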
Other solutions to the problem
This problem has been pointed out many times in articles and tech talks about GitOps and infrastructure as code.
If you are interested in different approaches to solving the problem and their limitations, check out Kostis Kapelonis's article on the topic.
argocd-diff-preview is not the first tool that tries to tackle this problem. Other open-source repos include quizlet/argocd-diff-action and zapier/kubechecks.
quizlet/argocd-diff-action generates an Argo CD diff between the current PR and the current state of the cluster using the argocd app diff command. Thus, this tool needs the Argo CD applications to already be in sync with Git to be helpful. Applications that are out-of-sync on the Argo CD instance will be rendered as a diff on every PR. Additionally, you need to provide your CI pipeline with credentials to your Argo CD server, which may not be possible or desirable.
zapier/kubechecks is a system that you install on your cluster, which may not be desirable for organizations with strict security restrictions. The tool is complex but has many interesting features. Again, this tool requires access to your running Argo CD instance, which may not be possible or desirable.
argocd-diff-preview was created to avoid installing a tool directly on a cluster or providing it with credentials to your live Argo CD instance.
New solution: argocd-diff-preview
Goal
Create a tool that works like Atlantis for Terraform but for Argo CD. The tool should render a reliable diff of the configuration changes directly on the PR. Additionally, it should work without needing access to your existing infrastructure.
Instead of writing scripts that try to mimic how Argo CD renders the manifests, why not let Argo CD render them itself? This ensures that the rendered manifests match exactly what Argo CD would produce.
How it works
argocd-diff-preview spins up a local cluster, installs Argo CD, applies the applications to the cluster, extracts the rendered manifests from Argo CD, and compares them to the manifests rendered from the main branch.
This tool runs an ephemeral local cluster inside Docker, so it does not need access to your infrastructure. It only needs read access to the Git repository and your Helm Charts (either stored in Git or a registry).
In other words, it follows these 10 steps:
1. Start a local cluster
2. Install Argo CD
3. Add the required credentials (Git credentials, image pull secrets, etc.)
4. Fetch all Argo CD application files on your PR branch
   - Point their targetRevision to the Pull Request branch
   - Remove the syncPolicy from the application (to prevent the application from syncing locally)
5. Apply the modified applications to the cluster
6. Let Argo CD do its magic
7. Extract the rendered manifests from the Argo CD server
8. Repeat steps 4–7 for the base branch (main branch)
9. Create a diff between the manifests rendered from each branch
10. Display the diff in the PR
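The patching in step 4 can be sketched in a few lines of Python. This is a minimal illustration on an Application manifest that has already been parsed into a dictionary, not the tool's actual implementation. The patch_application helper and the repo_url filter are assumptions: the filter is there so that sources pointing elsewhere (for example, an external Helm Chart whose targetRevision is a chart version) are left untouched.

```python
import copy


def patch_application(app: dict, repo_url: str, branch: str) -> dict:
    """Return a copy of an Argo CD Application pointed at `branch`, without a syncPolicy."""
    patched = copy.deepcopy(app)
    spec = patched.get("spec", {})

    # Single-source applications use spec.source; multi-source ones use spec.sources.
    sources = list(spec.get("sources", []))
    if "source" in spec:
        sources.append(spec["source"])

    for source in sources:
        # Assumption: only sources from the repository under review are re-pointed.
        if source.get("repoURL") == repo_url:
            source["targetRevision"] = branch

    # Without a syncPolicy, Argo CD renders the manifests but does not apply them.
    spec.pop("syncPolicy", None)
    return patched


# Hypothetical Application, already loaded from YAML into a dict.
app = {
    "kind": "Application",
    "spec": {
        "source": {
            "repoURL": "https://example.com/org/repo.git",
            "targetRevision": "HEAD",
            "path": "apps/my-app",
        },
        "syncPolicy": {"automated": {}},
    },
}

patched = patch_application(app, "https://example.com/org/repo.git", "my-feature")
```

The original dictionary is left unmodified, so the same input can be patched once for the pull request branch and once for the base branch.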
Example
If you are asked for a review on a PR that looks like this:
Then you can verify that it is configured correctly by checking the output generated by argocd-diff-preview. The output would look similar to this:
Pros
- Always renders the correct difference between branches because it is rendered by Argo CD itself.
- Fully ephemeral cluster.
- Does not access any of your existing infrastructure. It only requires read access to the Git repository and your Helm Charts.
- Can be run locally before you open the pull request.
- Supports multi-source applications.
- Supports Argo CD Config Management Plugins (CMP).
- Renders changes in resources from external sources (e.g., Helm Charts). For example, when you update the Helm Chart version of nginx, you can see exactly what changed (PR example).
Cons
- It is slow. Spinning up a cluster and installing Argo CD takes about a minute on each run, before any application is rendered (see table below).
Comparing desired states - Not actual state
An important point to understand is that, unlike Atlantis or the argocd app diff CLI command, this approach doesn't compare the desired state in Git with the actual state in Kubernetes. Instead, it compares the desired state of the two branches stored in Git. I would argue that this is better than comparing Git with the actual state in Kubernetes because the actual state can change, resulting in non-deterministic output. The actual state in Kubernetes can temporarily go out-of-sync with Git, and we don't want this to be highlighted in our diff preview. Developers who work with Atlantis experience this a lot: each time you run atlantis plan, it may produce a different result if the infrastructure changes often.
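A branch-to-branch comparison is just a text diff of two rendered outputs, which is why it is deterministic. The sketch below uses Python's standard difflib to illustrate the idea. It is not the tool's own diff code, and the manifests are made-up examples.

```python
import difflib


def diff_manifests(base: str, target: str) -> str:
    """Return a unified diff between two rendered manifests (empty string if identical)."""
    lines = difflib.unified_diff(
        base.splitlines(),
        target.splitlines(),
        fromfile="base-branch",
        tofile="target-branch",
        lineterm="",
    )
    return "\n".join(lines)


# Hypothetical rendered output from each branch.
base_manifest = "kind: Deployment\nreplicas: 1\nimage: nginx"
target_manifest = "kind: Deployment\nreplicas: 5\nimage: nginx"

diff = diff_manifests(base_manifest, target_manifest)
```

Because both inputs come from Git, running this twice on the same two commits always gives the same diff, regardless of what is happening in the live cluster.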
How to use it in GitHub Actions
Here is an example of how you would trigger argocd-diff-preview on your pull requests in GitHub Actions:
name: Argo CD Diff Preview
on:
  pull_request:
    branches:
      - main
jobs:
  render-diff:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write
    steps:
      - uses: actions/checkout@v4
        with:
          path: pull-request
      - uses: actions/checkout@v4
        with:
          ref: main
          path: main
      - name: Generate Diff
        run: |
          docker run \
            --network=host \
            -v /var/run/docker.sock:/var/run/docker.sock \
            -v $(pwd)/main:/base-branch \
            -v $(pwd)/pull-request:/target-branch \
            -v $(pwd)/output:/output \
            -e TARGET_BRANCH=${{ github.head_ref }} \
            -e REPO=${{ github.repository }} \
            dagandersen/argocd-diff-preview:v0.0.23
      - name: Post diff as comment
        run: |
          gh pr comment ${{ github.event.number }} --repo ${{ github.repository }} --body-file output/diff.md --edit-last || \
          gh pr comment ${{ github.event.number }} --repo ${{ github.repository }} --body-file output/diff.md
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
Handling credentials
In the simple code example above, I do not provide argocd-diff-preview with any credentials, which only works if the Helm Chart registry and the Git repository are public. If you want to use this tool in a private repository, you need to provide the tool with the required credentials. More details can be found in the GitHub repository.
Output
On a successful run, the tool prints the following output:
✨ Running with:
✨ - base-branch: main
✨ - target-branch: helm-example-3
✨ - repo: dag-andersen/argocd-diff-preview
✨ - timeout: 180
🚀 Creating cluster...
🚀 Cluster created successfully
🦑 Installing Argo CD...
...
🤖 Patching applications for branch: main
🤖 Patching applications for branch: helm-example-3
🌚 Getting resources for base-branch
🌚 Getting resources for target-branch
...
🔮 Generating diff between main and helm-example-3
🙏 Please check the ./output/diff.md file for differences
If something is wrong with your configuration, it prints the Argo CD Application error message:
...
🤖 Patching 4 Argo CD Application[Sets] for branch: helm-example-3
🌚 Getting resources for target-branch
⏳ Waiting for 4 out of 4 applications to become 'OutOfSync'. Retrying in 5 seconds. Timeout in 180 seconds...
❌ Failed to process application, my-app, with error:
Failed to load target state: failed to generate manifest for source 2 of 2: rpc error: code = Unknown desc = authentication required
Speed
The table below shows how the number of applications correlates with the time it takes to render them all:
| Number of applications | 1 | 50 | 250 | 500 |
|---|---|---|---|---|
| Seconds** | 80 | 100 | 210 | 330 |
Creating a cluster and installing Argo CD on it takes around 1 minute, which is why rendering a single application takes over a minute.
**The speed varies depending on the mix of applications that use Kustomize, Helm, and raw manifests. These results are based on a codebase consisting mainly of Helm Charts.
Speeding up the rendering process
Rendering the manifests generated by all applications in the repository for each pull request can be slow. The tool offers various options to limit the number of applications rendered on each PR. You can choose applications based on label selectors, file paths, or by tracking specific file changes. For more information: [docs]
Conclusion
In conclusion, tackling the challenge of accurately visualizing Kubernetes configuration changes within GitOps workflows is essential for ensuring smooth operations and minimizing errors.
argocd-diff-preview works like Atlantis for Terraform. The tool lets you render the diff on PRs, making it easier to review the changes made to the configuration. Since the diff is rendered by Argo CD itself, it is as accurate as possible.
In contrast to other existing solutions, argocd-diff-preview works without direct access to your infrastructure, which can be desirable for organizations with strict security requirements.
If you experience any issues with the tool, please open an issue on the repository.
Top comments (4)
This is doooope! Good stuff!!!
Thank you! Any feedback on the tool would be highly appreciated, especially if you encounter any issues 🙌🏻
Hi thanks for this. Most of my argo workflows have embedded python workflow code either as artifacts or in config maps. Can you render the diff on PRs to highlight changes in python code using this?
I am not sure I understand your setup correctly :) However, anything that ArgoCD controls and renders in your existing live cluster can be rendered by this tool. If you make changes to your Python code stored in a ConfigMap, it will be highlighted in the preview. I hope this answers your question. Otherwise, feel free to provide more context, and I may be able to give a better answer ⭐️ I am not very familiar with Argo Workflows.