In today's fast-paced development environment, modern CI/CD pipelines have become essential for delivering high-quality software efficiently. By automating key processes like testing and deployment, they enable teams to focus on innovation while minimizing risks.
Before we jump into how to design our pipelines, let's go through some concepts to make sure we have a comprehensive understanding.
CI/CD Pipeline
A CI/CD pipeline is an automated workflow that integrates Continuous Integration (CI) and Continuous Delivery/Deployment (CD) practices. It automates the process of building, testing, and deploying code changes, ensuring that software is delivered reliably and efficiently. The pipeline typically includes stages for code integration, automated testing, and deployment to production.
Continuous Integration
It's a practice where code is frequently integrated into a shared branch and each integration is tested to detect issues early, ensuring that new code doesn't break existing functionality. This practice promotes early bug detection, strengthens collaboration among developers, and accelerates the development process by streamlining integration and testing.
Continuous Delivery / Deployment
There is a subtle distinction between Continuous Delivery and Continuous Deployment, but both involve automating the deployment process to reduce manual intervention. The main difference is that Continuous Delivery includes a manual approval step before deploying code to the production environment, while Continuous Deployment automates the entire process, either with no approval step at all or with an automated one. Automated approval can involve integration with tools like ServiceNow, which track changes to production and collect the necessary approvals from stakeholders.
Now that we have a clear picture of a CI/CD pipeline, let's discuss the steps we must include in the process in order to have a reliable pipeline and guarantee good quality for our code.
A well-designed CI/CD pipeline should include steps to ensure that the code builds successfully without any issues, incorporates thorough testing, and, most importantly, prioritizes security. Below, we will outline the process step by step.
Build and Restore
This is where dependencies are downloaded and the code is compiled or assembled into executable software. The build step ensures that the code is correctly structured and ready for further stages like testing and deployment. Restore refers to resolving dependencies: .NET, for example, has an explicit command to download only the dependencies before compilation, and the same goes for Java build tools. Not every stack needs a separate restore step, so whether you include it depends entirely on your toolchain.
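As an illustration, here is a minimal GitHub Actions sketch of a restore-and-build job. It assumes a .NET project; the .NET version and build configuration are placeholders, so adapt them to your own stack.

```yaml
name: build
on: push

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup .NET
        uses: actions/setup-dotnet@v4
        with:
          dotnet-version: '8.0.x'

      # Restore: download only the dependencies
      - name: Restore dependencies
        run: dotnet restore

      # Build: compile the code without restoring again
      - name: Build
        run: dotnet build --configuration Release --no-restore
```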
Tests
This step runs various tests to ensure code quality. These can include unit tests, integration tests, smoke tests, and so on. It's not mandatory to have all of these types of tests; it depends a lot on the context of the project, but at least unit tests should be present.
I would like to say that this is a mandatory step, but unfortunately not all projects have tests to run (I won't go into that discussion here), so if tests are present, this step is mandatory and the pipeline should fail if any test suite fails.
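Continuing the .NET sketch from the build step, a test step could look like the fragment below; the pipeline fails automatically when any test suite returns a non-zero exit code.

```yaml
      # Run all test suites; a single failing test fails this step and therefore the pipeline
      - name: Run tests
        run: dotnet test --configuration Release --no-build --verbosity normal
```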
Security
This is the kind of step that teams always forget. In general, we don't focus on security, especially when it comes to Docker images, but it's really important to have a step dedicated to it. In the security step, we must cover vulnerabilities in our code, in our dependencies, and especially in the base image we use to generate our application's image.
This step ensures vulnerabilities are identified and addressed before deployment, reducing risks in production environments.
Vulnerabilities are classified by severity levels and scores (thinking specifically of the Snyk tool), and not all of them should fail the pipeline. We can think of it as follows:
- Critical vulnerabilities: must fail the pipeline.
- High vulnerabilities: must fail the pipeline.
- Medium and low vulnerabilities: do not need to fail the pipeline, but they should be addressed proactively. Teams should monitor these vulnerabilities weekly to ensure their number does not increase over time. Regular reviews and tracking help prevent the accumulation of vulnerabilities that could potentially escalate into more significant risks.
Here's a simple example of a GitHub Actions job where I defined two different steps for security.
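A minimal sketch of what those two steps can look like, assuming the official Snyk action for .NET and a `SNYK_TOKEN` secret (the project type and severity threshold are just examples):

```yaml
      # Step 1: scan open-source dependencies; fail only on high and critical issues
      - name: Snyk dependency scan
        uses: snyk/actions/dotnet@master
        env:
          SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
        with:
          args: --severity-threshold=high

      # Step 2: static analysis of our own source code with Snyk Code
      - name: Snyk code scan
        uses: snyk/actions/dotnet@master
        env:
          SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
        with:
          command: code test
          args: --severity-threshold=high
```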
Image security must be included in our pipeline in order to guarantee reliable software. A vulnerability may not originate from the application or its dependencies but from the base image itself, which is why it's important to have a dedicated step for it.
One of the tools I like to use is Trivy, an open-source security scanner that can detect vulnerabilities in container images and Dockerfiles. Like Snyk, Trivy classifies vulnerabilities by severity, and a threshold can be defined above which the pipeline fails.
Here is a simple example of a security pipeline I have implemented before. It contains both Snyk and Trivy steps.
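Something along these lines (a job sketch only; the image name, action references, and secrets are placeholders):

```yaml
  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Code and dependency vulnerabilities with Snyk
      - name: Snyk scan
        uses: snyk/actions/dotnet@master
        env:
          SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
        with:
          args: --severity-threshold=high

      # Build the image locally so Trivy can scan the base image layers
      - name: Build image for scanning
        run: docker build -t my-app:${{ github.sha }} .

      # Image vulnerabilities with Trivy; critical/high findings break the pipeline
      - name: Trivy image scan
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: my-app:${{ github.sha }}
          severity: CRITICAL,HIGH
          exit-code: '1'
          ignore-unfixed: true
```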
Artifact
This step is responsible for generating the versioned artifact that will be deployed later, whether it’s an executable binary or a container image. The specific implementation depends on whether your application is containerized or relies on binary code.
For Containerized Applications
For containerized applications, this step involves creating a Docker image using Docker commands and pushing it to a container registry. Examples of popular registries include:
- Azure Container Registry
- Private Docker Hub
- JFrog Artifactory
The Docker image serves as the deployable artifact, encapsulating all dependencies and configurations needed to run the application reliably across environments.
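As a sketch, building and pushing to Azure Container Registry could look like the steps below; the registry name, credentials, and tag strategy are assumptions.

```yaml
      - name: Log in to the container registry
        uses: docker/login-action@v3
        with:
          registry: myregistry.azurecr.io
          username: ${{ secrets.ACR_USERNAME }}
          password: ${{ secrets.ACR_PASSWORD }}

      # Build the image and tag it with the commit SHA so every build is traceable
      - name: Build and push image
        run: |
          docker build -t myregistry.azurecr.io/my-app:${{ github.sha }} .
          docker push myregistry.azurecr.io/my-app:${{ github.sha }}
```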
For Non-Containerized Applications
This step is responsible for generating binaries (e.g., DLLs, JARs). In this case, the binaries are saved to a specific versioned directory in an artifact repository like Artifactory, which ensures that each build is uniquely identifiable and traceable.
Versioning
Versioning is essential for both container images and binary artifacts. It can be implemented during this step or earlier in the pipeline by defining a version based on specific information, such as:
- Labels defined in pull requests.
- Commit messages or titles.
Proper versioning ensures traceability and consistency across deployments, reducing the risk of deploying incorrect artifacts.
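As a rough sketch, the version can be derived earlier in the pipeline and reused when publishing the binaries; here it comes from the latest Git tag, falling back to the short commit SHA (this assumes the full history and tags were fetched during checkout, and the artifact name and path are placeholders).

```yaml
      - name: Determine version
        id: version
        run: |
          # Prefer the latest tag; otherwise fall back to the short commit SHA
          VERSION=$(git describe --tags --abbrev=0 2>/dev/null || git rev-parse --short HEAD)
          echo "version=$VERSION" >> "$GITHUB_OUTPUT"

      # Publish the binaries under that version so each build stays uniquely identifiable
      - name: Upload versioned binaries
        uses: actions/upload-artifact@v4
        with:
          name: my-app-${{ steps.version.outputs.version }}
          path: ./publish
```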
Deployment
Deployment is the next critical step in the CI/CD pipeline, representing the Continuous Delivery or Continuous Deployment phase. This is where the binary code or container image generated during the artifact stage is used to deploy the application to environments such as Azure, AWS, GCP, or even on-premises infrastructure.
For most environments—like development, testing, and staging—this step typically runs automatically without requiring manual intervention. However, for production environments, an additional layer of control may be necessary. In such cases, an approval gateway might be implemented to ensure compliance with organizational policies before deployment proceeds.
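For example, a staging deployment to an Azure Web App could be a single job like the sketch below; the app name, credentials, and image reference are assumptions.

```yaml
  deploy-staging:
    runs-on: ubuntu-latest
    needs: artifact   # waits for the artifact stage described above
    steps:
      - name: Azure login
        uses: azure/login@v2
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}

      # Deploy the container image produced in the artifact stage
      - name: Deploy to staging
        uses: azure/webapps-deploy@v3
        with:
          app-name: my-app-staging
          images: myregistry.azurecr.io/my-app:${{ github.sha }}
```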
Approval Gateway
An approval gateway is a manual step in CI/CD pipelines, primarily used in production environments to ensure compliance with organizational policies and mitigate risks. In many companies, deploying to production requires approvals from managers or stakeholders, often collected through emails or meetings. This reliance on manual intervention makes full automation impossible. However, modern tools like ServiceNow can streamline this process by integrating directly into the pipeline. For example, the pipeline can automatically create a change request in ServiceNow and pause execution until the request is approved. This integration balances automation with oversight, ensuring deployments meet compliance requirements while maintaining efficiency.
In contrast, approval gateways are unnecessary when manual approvals are not required. In such cases, pipelines can automatically deploy applications to production after passing all predefined stages, such as rigorous testing and post-deployment validations. This approach is common in Continuous Deployment scenarios, where speed and efficiency are prioritized. However, it requires well-defined pipeline steps and robust testing to minimize risks. For non-production environments like development or staging, direct deployments without manual intervention should always be encouraged to maintain agility and accelerate feedback cycles.
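In GitHub Actions, one way to model an approval gateway is a deployment environment with required reviewers; the reviewers themselves are configured in the repository settings, not in the workflow file. A sketch:

```yaml
  deploy-production:
    runs-on: ubuntu-latest
    needs: deploy-staging
    # Execution pauses here until the reviewers configured for the
    # "production" environment approve the deployment
    environment: production
    steps:
      - name: Deploy to production
        run: echo "Deploy the approved build to production here"
```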
Post-Deployment Testing
Post-deployment testing is a crucial stage in the CI/CD pipeline, designed to validate the functionality and stability of an application after deployment. This stage can be executed after deploying to a staging environment or even after a production deployment. It involves running various test cases, such as end-to-end (E2E) tests, smoke tests, regression tests, and others, to ensure that the application operates as expected with the newly implemented features. While this stage is not mandatory, its inclusion depends on the project’s context and the agreements made within the team.
Typically, post-deployment testing is most effective when performed after deployment to a pre-production environment. This allows teams to detect and address potential issues before deploying to production. However, in Continuous Deployment scenarios, where changes are automatically pushed to production, post-deployment testing can also occur after production deployment. In such cases, having a robust validation suite is essential to catch any issues early and trigger an automated rollback process if necessary.
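A very small post-deployment check could be a smoke test like the one below, assuming the application exposes a health endpoint (the URL is a placeholder).

```yaml
  post-deployment-tests:
    runs-on: ubuntu-latest
    needs: deploy-staging
    steps:
      # Smoke test: fail the job if the health endpoint does not return a success status
      - name: Smoke test
        run: curl --fail --silent --show-error https://my-app-staging.example.com/health
```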
Rollback Mechanisms
Rollback mechanisms are an essential part of a CI/CD pipeline, designed to maintain system stability in case of deployment or post-deployment test failures. When an issue is detected—either during automated testing or through monitoring in production—the pipeline can automatically revert the application to the last known stable version. This process minimizes downtime and mitigates the impact of faulty deployments.
Automated rollback typically works by leveraging versioned artifacts or container images stored in registries or repositories. For instance, if a deployment fails, the pipeline can redeploy the previous version of the application. Rollbacks can also be triggered by monitoring tools that detect anomalies, such as increased error rates or degraded performance. Integrating rollback mechanisms ensures a safer deployment process, especially in Continuous Deployment scenarios where changes are pushed to production frequently.
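One simple way to wire this up is a step that runs only when an earlier deployment or validation step has failed and redeploys the previous known-good image; the tag handling below is purely illustrative.

```yaml
      # Runs only if a previous step in this job has failed
      - name: Roll back to the last stable version
        if: failure()
        uses: azure/webapps-deploy@v3
        with:
          app-name: my-app-staging
          images: myregistry.azurecr.io/my-app:${{ vars.LAST_STABLE_TAG }}
```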
Flaky Tests
Flaky tests are tests that produce inconsistent results, passing sometimes and failing others, despite no changes having been made to the underlying code. These unpredictable outcomes make it difficult to identify the root cause of the failures. Addressing flaky tests is crucial because they can erode trust in the testing pipeline, reduce productivity, obscure real issues, and delay deployments. To mitigate these problems, teams should focus on reviewing test cases, maintaining clean and well-organized test suites, and refactoring brittle or unreliable tests. These practices help ensure the reliability of automated testing, fostering a robust and efficient pipeline.
Below is an example of common flaky test behavior.
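The step below is purely illustrative: it passes or fails at random, which is exactly how a timing- or order-dependent test looks from the pipeline's perspective, even though nothing in the code has changed.

```yaml
      - name: Simulated flaky test
        run: |
          # Roughly half of the runs will fail for no code-related reason
          if [ $((RANDOM % 2)) -eq 0 ]; then
            echo "Tests passed"
          else
            echo "Tests failed (simulated race condition)" >&2
            exit 1
          fi
```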
Optimization Techniques
Pipeline performance is a crucial factor in ensuring efficient and timely software delivery. A poorly optimized pipeline can lead to delays, wasted resources, and frustration for development teams. Here, we’ll explore some important optimization techniques to improve the performance and reliability of your CI/CD pipeline.
Cache Dependencies
One of the most important aspects to consider in pipeline execution is managing the dependencies required to perform its steps. Without a caching mechanism, the pipeline will repeatedly download these dependencies for every execution, significantly delaying completion and wasting resources. The same principle applies to tools like Trivy, where each execution downloads a new vulnerability database. By caching these dependencies and databases, we can reuse them in subsequent runs, reducing build times, optimizing resource usage, and improving overall efficiency. Caching not only accelerates pipelines but also ensures consistency across builds by reusing the same versions of dependencies.
Below are examples of caching dependencies in GitHub Actions to enhance pipeline efficiency. The first caches the Trivy vulnerability database to reduce scan times, and the second caches Gradle dependencies to accelerate builds. Caching improves speed and optimizes CI workflows.
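Both sketches use `actions/cache`; the cache paths and keys are assumptions based on the tools' default locations, so adjust them to your setup.

```yaml
      # Reuse the Trivy vulnerability database between runs
      - name: Cache Trivy DB
        uses: actions/cache@v4
        with:
          path: ~/.cache/trivy
          key: trivy-db-${{ runner.os }}

      # Reuse downloaded Gradle dependencies, keyed on the build files
      - name: Cache Gradle dependencies
        uses: actions/cache@v4
        with:
          path: |
            ~/.gradle/caches
            ~/.gradle/wrapper
          key: gradle-${{ runner.os }}-${{ hashFiles('**/*.gradle*', '**/gradle-wrapper.properties') }}
          restore-keys: |
            gradle-${{ runner.os }}-
```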
Applying Parallelism: When and Why?
Parallelism is a concept that can help improve pipeline performance, but we don't always need to apply it; it depends a lot on the context.
Bringing parallelism to the CI/CD pipeline, we can apply the concept at various points, such as when running tests at the beginning of the execution. We don't need to run the tests sequentially, that is, first the unit tests, then the integration tests, then the vulnerability scans; doing so degrades the pipeline's performance and needlessly increases the execution time. Parallelism fits very well here: everything runs at the same time, the failure of one check doesn't block the execution of another, and at the end we have a complete picture of everything that needs to be fixed. With parallel execution, the total time is determined by the process that takes the longest.
Sequential execution applies when we need to wait for the result of a previous step. A good example involves the initial tests mentioned above, which can run in parallel: the binary or container image generation step must wait for their results, in this case the unit, integration, and security checks. We should not generate artifacts that contain vulnerabilities, are of low quality, or are full of bugs; it wastes time and processing, and it opens the possibility of that code being deployed due to some human error.
Below is an example of parallelism applied to the security step, enabling multiple security checks to run simultaneously.
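For instance, the Snyk and Trivy checks can run as independent jobs, with the artifact job waiting for both; this is a sketch, and the job names and references are assumptions.

```yaml
jobs:
  snyk-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Same Snyk step used in the security example above
      - uses: snyk/actions/dotnet@master
        env:
          SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
        with:
          args: --severity-threshold=high

  trivy-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Scan the repository and Dockerfile for vulnerabilities and misconfigurations
      - uses: aquasecurity/trivy-action@master
        with:
          scan-type: fs
          scan-ref: .
          severity: CRITICAL,HIGH
          exit-code: '1'

  # Runs only after both scans succeed, so no vulnerable artifact is ever published
  artifact:
    runs-on: ubuntu-latest
    needs: [snyk-scan, trivy-scan]
    steps:
      - run: echo "Build and publish the versioned artifact here"
```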
Conclusion
A well-designed CI/CD pipeline is not just a technical necessity but a strategic advantage in delivering reliable software quickly. By incorporating thorough testing, security measures, and optimization techniques, teams can ensure smooth deployments while maintaining high quality.
Did you enjoy this post? Interested in discussing CI/CD further? Feel free to share your questions and ideas in the comments or connect with me on LinkedIn.