Part of the "Stop Lying to Your Stack" series
In the age of AI-assisted development, a new service can be scaffolded in minutes. Spring Boot project structure, security configuration, database migrations, REST endpoints, everything generated, reviewed, and committed at blazing speed.
But here is what I see teams consistently treat as an afterthought: the CI pipeline.
It gets added last, often copied from another service in the same organisation, rarely analysed, and almost never treated as the architectural artefact it actually is.
The result is a pipeline that appears to work. Tests go green, PRs merge, deployments happen. But underneath, the pipeline is carrying assumptions that were never validated, permissions that were never justified, and coverage gaps that nobody noticed.
This article is about what I have learned from setting up and reviewing PR verification pipelines for Spring Boot services, and the patterns that mature teams follow after they learn these lessons the hard way.
The Problem With Copying Pipelines
When a new service spins up inside an existing organisation, the obvious move is to find the nearest sibling service and copy its workflow files. Same stack, same team conventions, same cloud provider.
Most of the time, it mostly works. But "mostly works" is not the same as "correct."
What copying preserves is not just the decisions that were made deliberately. It also preserves the assumptions that were never questioned, the permissions that were granted lazily, the placeholders that were never updated, and the coverage gaps that nobody noticed because the tests still went green.
You should treat a CI workflow review the same way you treat a dependency audit. Read it line by line. Understand what each piece does. Ask whether it belongs for this specific service.
Test Before You Trust: Running CI Locally With act
Before pushing a new workflow to a real repository and waiting for GitHub to run it, there is a better path. Run it locally.
act is an open-source tool by nektos that executes GitHub Actions workflows on your machine using Docker. The GitHub Actions runner environment is simulated inside a container. Steps execute in order, logs appear in your terminal, and you find the problems before the PR does.
The critical decision when using `act` is which Docker image to use for the simulated `ubuntu-latest` runner. There are three options, and the tradeoffs are real.
Micro (~200MB): Barely anything. Shell tools and not much else. Useful only for workflows with zero dependencies on pre-installed tooling. Not suitable for Maven builds or anything that touches Node.
Medium (~500MB to 1GB): A reasonable development environment: git, curl, Java, common build tools. Fast to pull. Low disk footprint. Suitable for many CI jobs.
Note: what it does not have is Node.js, and this catches teams off guard.
A Spring Boot service with a React frontend built via frontend-maven-plugin might seem like it handles Node internally, so Node being absent in the runner should not matter.
In practice, the Maven plugin downloads its own Node runtime, but the download process depends on the network environment, SSL certificates, and system-level dependencies that may differ between the medium image and what GitHub's real runner provides.
Hiccups happen. A build that works perfectly on the real runner can fail in the medium image in ways that are difficult to diagnose. That is the justification for the third option.
Large (~17GB+): This mirrors GitHub's actual `ubuntu-latest` runner almost exactly. Node is present, Docker is present, and every tool that GitHub pre-installs on its runners is there. What works locally with the large image will work in CI.
The cost is the size. Pulling 17GB once is acceptable, but keeping it around is a standing commitment of disk space. On a developer workstation with limited storage, this is a real constraint.
The practical guidance
If you are validating a new pipeline for the first time and want confidence that what you see locally reflects what CI will do, use the large image. The disk cost is a one-time investment.
If you are making small iterative changes and have already validated the pipeline end-to-end, the medium image is fast enough for quick feedback on configuration changes, as long as you know its gaps.
Never use the micro image for a full Maven build with a frontend.
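To make the choice concrete: `act` maps runner labels to Docker images via the `-P` flag, and the mapping can be persisted in an `.actrc` file so every invocation uses it. This is a sketch; the image tags below are the community-maintained `catthehacker` images that the act documentation commonly references, so verify the current tags against the act docs before relying on them.

```text
# .actrc -- act reads this from the repository root or your home directory.
# Map ubuntu-latest to the large image for full-fidelity validation:
-P ubuntu-latest=catthehacker/ubuntu:full-latest

# For quick iteration, a one-off run with the medium image instead:
#   act pull_request -P ubuntu-latest=catthehacker/ubuntu:act-latest
```

Switching the mapping per phase (large image for initial validation, medium for iteration) follows the guidance above without committing to 17GB on every run.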
Over-Permissioning: Not Just Bad Practice
The most common issue in copied workflows is excessive permissions on the GITHUB_TOKEN. This is easy to overlook because it does not break anything.
The workflow passes. No error is thrown. The permissions are simply broader than necessary, and nobody notices until something goes wrong.
What "goes wrong" in this context is worth understanding in concrete terms.
A typical PR verification job in a copied workflow often looks like this:
permissions:
  actions: write
  packages: write
  contents: write
  checks: write
  pull-requests: write
Each of these permissions is a real capability granted to a short-lived token that executes inside your build. Consider what each one enables:
- `contents: write` allows pushing commits and creating releases. A workflow step with this permission can commit code to your repository without a pull request.
- `packages: write` allows pushing images to GitHub Container Registry under your organisation's namespace.
- `actions: write` allows cancelling workflow runs, re-running failed jobs, and modifying workflow dispatch inputs.
- `packages: write` and `contents: write` together on a PR build mean that any compromised dependency in your Maven build, any malicious action that slips into your transitive chain, or any step that is manipulated via a malicious pull request from a fork has the credentials to write to your repository and push images to your registry.
This is the supply chain attack surface. It is not hypothetical. GitHub Actions supply chain attacks have occurred against real organisations. The mitigation is not complex detection tooling. It is the least-privilege principle: simply not granting permissions that are not needed.
The correct permissions for a job that runs Maven tests and posts a PR comment are three:
permissions:
  contents: read        # checkout requires this
  checks: write         # publish-report creates check run results
  pull-requests: write  # publish-report posts PR comments
Nothing more. The job functions identically and the attack surface is reduced to what is actually required.
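As a sketch of where this block sits, permissions can be declared at the workflow level or per job. Scoping them to the job means any job added to the same file later starts from the default rather than inheriting write access. The job and step names here are illustrative, not prescriptive:

```yaml
name: PR Verification

on:
  pull_request:
    branches:
      - main

jobs:
  verify:
    runs-on: ubuntu-latest
    # Job-level scope: other jobs in this workflow do not inherit these grants
    permissions:
      contents: read        # checkout requires this
      checks: write         # publish-report creates check run results
      pull-requests: write  # publish-report posts PR comments
    steps:
      - uses: actions/checkout@v6
      - run: ./mvnw --batch-mode verify
```

Job-level scoping also makes a later review easier: each job's declared permissions can be checked against what its steps actually do.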
There is a deeper point here about how teams think about CI security. Most security attention in a repository goes toward the application code: dependency audits, SAST scanning, secret detection. The CI configuration itself is treated as infrastructure, not as code that runs with credentials. That asymmetry is exactly what attackers exploit.
Pin Actions to a Commit SHA, Not a Tag
Related to over-permissioning is a practice that takes one extra minute to implement and significantly improves your pipeline's security posture.
When a workflow references an action by a floating tag:
uses: actions/checkout@v6
the tag v6 is a mutable pointer. The action maintainer can move it to any commit at any time.
If an attacker compromises the action repository and moves the tag to a commit containing malicious code, your workflow pulls and executes that code on the next CI run. You have not changed anything and your workflow file looks the same. But what executes is not what you reviewed.
This is tag hijacking. It is a real attack vector.
The defence is to pin to a specific commit SHA:
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v6
This reference is immutable. Moving the v6 tag has no effect on your workflow. The comment documents which version the SHA corresponds to, so the intent remains readable.
Pinning also solves a stability problem. Maintainers occasionally move tags intentionally: a patch release gets tagged under the same major version, and v6 now points to a commit with different behaviour than it did yesterday.
Teams discover this during post-incident reviews when CI behaviour changed without anyone modifying the workflow file. With a pinned SHA, your CI is stable. Changes to the action are opt-in, not automatic.
The operational concern is keeping SHAs up to date. The answer is Dependabot.
When configured for GitHub Actions, Dependabot automatically submits PRs to update pinned SHAs when new versions of actions are released. You get security reviews, a documented changelog, and a deliberate merge decision instead of silent auto-updates via mutable tags.
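The configuration is a few lines in `.github/dependabot.yml`. The weekly cadence below is a choice, not a requirement; adjust it to your team's review rhythm:

```yaml
version: 2
updates:
  - package-ecosystem: "github-actions"
    directory: "/"          # scans workflows under .github/workflows
    schedule:
      interval: "weekly"
```

With this in place, a tag move on a pinned action never reaches your CI silently; it arrives as a PR with a diff you can read.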
Pinning to SHAs is a small change. It takes seconds per action. Teams that add it after their first pipeline setup rarely remove it.
The Missing Integration Test Report
This is a simple oversight. Easy to miss. Easy to fix.
Spring Boot services under Maven run two test categories: Surefire for unit tests, Failsafe for integration tests. A workflow that only reports Surefire:
report-path: '**/target/surefire-reports/TEST*.xml'
silently discards integration test results. The tests run. They pass or fail. Nobody sees the outcome in the PR.
The fix is a second report step pointing at Failsafe:
report-path: '**/target/failsafe-reports/TEST*.xml'
I have seen this missed repeatedly because the integration tests rarely fail locally and the pipeline "works."
When integration tests do fail in CI, engineers look at the job log, not the missing PR comment, and eventually find the failure. The gap becomes visible only when someone asks why there is no integration test summary on the PR.
Two report steps, two report paths. It takes two minutes to add and closes a coverage gap that has caused confusion on more than one team I have worked with.
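Side by side, the two steps differ only in report path and comment header. This is a sketch using the action and fields shown elsewhere in this article; check the action's README for the full schema. The `if: ${{ always() }}` condition matters: it publishes the report even when the tests fail, which is exactly when you want it.

```yaml
- name: Publish Unit Test Report
  uses: turing85/publish-report@v2.3.1
  if: ${{ always() }}
  with:
    comment-header: pr-unit-test-results
    report-path: '**/target/surefire-reports/TEST*.xml'

- name: Publish Integration Test Report
  uses: turing85/publish-report@v2.3.1
  if: ${{ always() }}
  with:
    comment-header: pr-integration-test-results
    report-path: '**/target/failsafe-reports/TEST*.xml'
```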
Placeholders as Technical Debt Markers
The turing85/publish-report action has a comment-header field. Its purpose is to identify the comment uniquely across CI runs so the action can update the same comment rather than creating a new one each time.
A value like my-comment-header is the default placeholder from the documentation. It was never meant to stay.
In a mature codebase, descriptive values are a must:
comment-header: pr-unit-test-results
comment-header: pr-integration-test-results
This is not cosmetic. When two separate test report steps both use my-comment-header, the second step overwrites the first. One report disappears. The PR shows a single comment that alternates between unit test results and integration test results depending on execution order.
Placeholders left in production configuration are a reliable signal. They tell you how the pipeline was built and whether it was reviewed after the initial setup. In a mature repository, there are none.
Patterns Mature Teams Add After the First Pipeline
The patterns below are not prominently documented. They are things teams add after running a pipeline for a few weeks and observing what actually happens in practice.
Sticky PR Comments
Without this, every CI run on a PR creates a new comment. Push five commits while fixing a failing test, and the PR fills with five separate test report comments. The PR timeline becomes noise.
Add one line to each report step:
- name: Publish Unit Test Report
  uses: turing85/publish-report@v2.3.1
  if: ${{ always() }}
  with:
    sticky-comment: true
    comment-header: pr-unit-test-results
The first CI run creates the comment. Every subsequent run on the same PR updates it in place. The PR always shows one current test report per category, not a history of every run.
The comment-header value is what makes this work. It is the key the action uses to find and update the existing comment. This is another reason the placeholder value my-comment-header is a problem: if multiple steps share the same header, the sticky logic collapses them into one comment instead of maintaining separate unit and integration test reports.
Cancel Redundant Runs
When a developer pushes a commit, realises there is a typo, and pushes a second commit immediately, two CI runs start. The first run will never matter. It will complete, consume minutes from your CI quota, and potentially block the PR check until it finishes.
Concurrency groups cancel the older run as soon as the newer one starts:
concurrency:
  group: pr-${{ github.ref }}
  cancel-in-progress: true
The group key is the branch reference. Two pushes to the same branch are in the same group. The second push cancels the first run automatically. On an active PR with multiple review cycles, this can save meaningful CI minutes and keeps the PR checks reflecting the current commit rather than a stale one.
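One refinement worth considering, sketched here as an assumption rather than a requirement: include the workflow name in the group key. If the repository later gains a second workflow that also triggers on the branch, a key built only from `github.ref` would put both workflows in the same group and let them cancel each other.

```yaml
concurrency:
  # Workflow name in the key keeps separate workflows in separate groups
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
```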
Timeout Guards
A test that hangs (a database connection that never times out, a Testcontainers startup that stalls) will cause a CI job to run until GitHub's default job timeout of six hours. That is six hours of consumed runner minutes, and a PR that appears to be running indefinitely.
Add an explicit timeout at the job level:
jobs:
  test:
    runs-on: ubuntu-latest
    timeout-minutes: 30
Thirty minutes is generous for a Spring Boot build with integration tests. If the job exceeds it, it fails fast with a clear timeout message rather than hanging indefinitely. The failure is immediately diagnosable.
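Timeouts also work at the step level, which narrows the diagnosis further when one step is the usual suspect. A sketch, with an illustrative step name:

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    timeout-minutes: 30         # hard ceiling for the whole job
    steps:
      - name: Run integration tests
        run: ./mvnw --batch-mode verify
        timeout-minutes: 20     # tighter bound on the step most likely to stall
```

When the step timeout fires, the failure points at the step rather than the job, which is one less thing to work out from the logs.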
Manual Trigger for Debugging
Add workflow_dispatch to the trigger block:
on:
pull_request:
branches:
- main
workflow_dispatch:
This allows any team member to trigger the workflow manually from the GitHub Actions UI without pushing a commit. Useful for debugging CI failures that do not reproduce locally, re-running after fixing a flaky external dependency, or validating a workflow change on a branch before raising a PR.
It costs nothing to add and has saved debugging time more than once.
Emoji Shortcodes Over Unicode
A small decision that matters for rendering reliability.
Using Unicode emoji directly in workflow YAML:
comment-message-success: "🥳 {0} passed | ✅ {1} | ❌ {2} | ⚠️ {3}"
This works most of the time. But Unicode emoji render inconsistently across markdown processors, email clients that deliver GitHub notifications, and Slack integrations that mirror PR comments. The rendering depends on the system font, the platform's emoji version, and the markdown parser in use.
GitHub emoji shortcodes resolve through GitHub's own rendering engine:
comment-message-success: ":partying_face: {0} passed | :white_check_mark: {1} | :x: {2} | :warning: {3}"
Wherever GitHub Markdown is rendered, whether in PR comments, check run summaries, or email notifications, the shortcodes produce consistent output because GitHub controls the rendering. This is not a style preference. It is a rendering contract.
Know What Your Service Does Not Need
Copying a workflow from a sibling service also copies its context. Environment variables referencing external APIs the new service does not call.
Secrets that need to be provisioned even though they are never read. Setup steps that configure toolchains the service manages internally.
The discipline here is subtraction. Before asking what to add, ask what in the copied workflow does not belong to this service.
This is harder in an AI-assisted context because AI tooling is additive. It optimises for completeness and includes patterns from similar services because they are plausible.
The engineer's role is to evaluate each inclusion critically. What does this env var point to? Does this service call that API? Is this setup step actually needed, or does the build tool handle it internally?
None of the unnecessary inclusions break the build. They add weight: configuration that misleads future readers, secrets that need rotation, steps that consume time. In a payment processing service, clarity about what runs in the build environment is not optional.
Summary
| Pattern | Why It Matters |
|---|---|
| Least-privilege `GITHUB_TOKEN` permissions | Reduces supply chain attack surface; each permission is a real credential capability |
| Pin actions to commit SHA | Prevents tag hijacking and silent behaviour changes from tag moves |
| Separate Surefire and Failsafe reports | Unit and integration failures signal different root causes; surface them separately |
| Meaningful `comment-header` values | Required for sticky comments to work correctly; placeholders cause reports to overwrite each other |
| `sticky-comment: true` | Keeps PR timeline clean; one updated comment instead of one per CI run |
| Concurrency group with `cancel-in-progress` | Stops redundant runs when new commits arrive; saves CI minutes |
| `timeout-minutes` | Prevents hanging tests from consuming hours of runner time |
| `workflow_dispatch` trigger | Enables manual re-runs for debugging without requiring a commit push |
| GitHub emoji shortcodes over Unicode | Consistent rendering across PR comments, email, and Slack integrations |
| Remove irrelevant env vars and setup steps | Keeps the pipeline aligned with what the service actually does |
Final Thoughts
The conversation in software engineering has shifted. Velocity is no longer the constraint. AI handles the implementation baseline quickly. What remains scarce is judgment. Knowing what to build, what to include, and what each decision does to the system at runtime and in production.
A CI pipeline is a small system. It holds credentials. It runs code. It produces the signals your team acts on after every pull request. Treating it with the same care you give to a service's security configuration or schema design is not over-engineering.
It is engineering.
Set it up correctly once. Every pull request after that benefits from it.
If this resonated, share it with a teammate who has a `my-comment-header` somewhere in production. The conversation is worth having before the supply chain review, not after.
More is coming in the "Stop Lying to Your Stack" series.