DEV Community

Mustafa ERBAY

Posted on • Originally published at mustafaerbay.com.tr

Dependency Security: Stopping the Build or Warning?

Dependency management looks easy at first glance, but it becomes complex once security is involved. Once you start using a few libraries, and those libraries have their own dependencies, you quickly find yourself managing hundreds, even thousands, of packages. This is where dependency security raises a fundamental question: "Should we stop the build, or just issue a warning?"

Over the years, I've encountered this dilemma many times, both in large corporate projects and in my own side projects. Both approaches have their advantages and disadvantages. As a pragmatic systems engineer, what's important to me is to keep the risk at an acceptable level without completely killing development speed. In this post, I'll share the points I consider when making this decision and the experiences I've gained in the field.

Why Does Dependency Security Constantly Cause Headaches?

Dependencies in our projects are the libraries we use and their own dependencies. Modern software development is unthinkable without these packages, as writing everything from scratch is both time-consuming and inefficient. However, this convenience brings serious security risks.

A few years ago, while working on the backend of an e-commerce site, we had a constantly updated stack of packages. When we ran the npm audit command, the results sometimes showed 20-30 "High" level CVEs. Most of these were not directly related to our code; they had entered the system through transitive dependencies. For a publicly exposed system, that meant significant potential exposure: every new vulnerability in an open-source library could directly affect our project.

ℹ️ Transitive Dependencies

Transitive dependencies are other libraries used by a library that your project directly uses. This layered structure makes it difficult to trace security vulnerabilities and can lead to problems emerging from unexpected places.

One of the main reasons for this constant headache is the complexity of the dependency tree. If a library has 5-10 dependencies, and those also have their own dependencies, the chain quickly extends. Manually checking the security of each dependency is almost impossible. That's why we need automated tools, but how these tools should act becomes a critical question.
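To get a feel for how quickly the chain extends, here is a minimal Python sketch that walks a nested dependency tree and separates direct packages from transitive ones. The tree shape mirrors what `npm ls --all --json` prints (a `dependencies` object nested inside each package); the sample data and package names are invented for illustration.

```python
def count_dependencies(tree):
    """Return (direct, transitive) sets of "name@version" strings."""
    direct, transitive = set(), set()

    def walk(node, depth):
        for name, info in node.get("dependencies", {}).items():
            key = f"{name}@{info.get('version', '?')}"
            (direct if depth == 0 else transitive).add(key)
            walk(info, depth + 1)

    walk(tree, 0)
    # A package can appear at both levels; count it once, as direct.
    return direct, transitive - direct


# Hypothetical tree: package names and versions are made up.
sample_tree = {
    "name": "shop-backend",
    "dependencies": {
        "web-framework": {
            "version": "4.1.0",
            "dependencies": {
                "body-parser-lib": {
                    "version": "1.2.0",
                    "dependencies": {"qs-lib": {"version": "6.0.1"}},
                },
                "cookie-lib": {"version": "0.5.0"},
            },
        },
        "cookie-lib": {"version": "0.5.0"},
    },
}

direct, transitive = count_dependencies(sample_tree)
print(f"direct: {len(direct)}, transitive: {len(transitive)}")
```

On a real project the transitive set is usually far larger than the direct one, which is why manual review does not scale.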

Stopping the Build: A Zero-Tolerance Approach to Security

Stopping the build, or applying the "fail-fast" principle, is a zero-tolerance approach to security. In this method, when your CI/CD pipeline detects a vulnerability, the build fails and the code cannot be deployed. The basic argument is that keeping vulnerable code out of the production environment from the outset is much cheaper than dealing with the consequences later.

We adopted this approach for a service we developed for an internal banking platform. The security team demanded that the build absolutely fail if any "High" or "Critical" level CVE was detected. Initially, it sounded logical: clean code, secure system. However, this led to significant friction within the development team. We were getting an average of 12 build failures a day. Most of the time, the entire deployment process would stop due to a vulnerability in a small library's function that we weren't even directly using.

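# Example CI/CD pipeline step (pseudo-code, GitLab CI-style syntax)
security_scan:
  stage: test
  script:
    # npm audit exits non-zero when it finds anything at or above --audit-level.
    # Most CI runners end the job at the first failing script line, so the message
    # is attached with || instead of a separate $? check that would never run.
    # --omit=dev replaces the deprecated --production flag.
    - |
      npm audit --omit=dev --audit-level=high || {
        echo "Critical or High-level dependency vulnerabilities found. Stopping the build!"
        exit 1
      }
  allow_failure: false # This is critical, it stops the build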

The biggest advantage of this approach is that it minimizes the risk of security vulnerabilities leaking into the production environment. Every error is immediately visible and must be fixed. However, its disadvantages are also considerable. Developers can lose motivation due to constantly broken builds and develop a kind of "blindness" to security scanners. In addition, not every dependency vulnerability poses an immediate risk, and this approach does not distinguish between those that do and those that don't.

Issuing a Warning: Flexibility or Risk Postponement?

Issuing only a warning instead of stopping the build is a more flexible approach. In this scenario, dependency scanners detect and report vulnerabilities, but the CI/CD pipeline continues to run. The goal is to inform developers and provide security teams with a list to track.

In one of my side projects, I initially preferred this method. I didn't want to slow development down, and I found it unnecessary for "Medium" or "Low" level vulnerabilities to stop the build immediately. At first, everything was fine; we occasionally reviewed the warnings and fixed the critical ones. However, about 6 months later, more than 40 medium-level CVEs had accumulated, which made me seriously reconsider. Most of these vulnerabilities were not directly related to our own code, but together they were starting to pose a significant overall risk.

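# Example CI/CD pipeline step (pseudo-code, GitLab CI-style syntax)
security_scan:
  stage: test
  script:
    # --audit-level=info makes npm audit exit non-zero for any finding, so the
    # reminder is attached with || (a separate script line would never run).
    - |
      npm audit --omit=dev --audit-level=info || {
        echo "Dependency vulnerabilities detected. Please review the report."
        exit 1
      }
  allow_failure: true # This is important, it does not stop the build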

The main advantage of this approach is that the development flow is not interrupted. Developers are informed about security issues but are not required to make an immediate fix. This can be preferred in projects requiring rapid delivery. However, the risk is that these warnings are gradually ignored and security debt accumulates. Eventually the accumulated warnings become "noise," and even a truly critical vulnerability can get lost in it.

⚠️ Warnings Getting Lost

Too many warnings, just like too many logs, can cause important information to be overlooked. Development teams may eventually start to disregard constant warnings, which can lead to serious security vulnerabilities going unnoticed.

Criticality Levels and the Role of Automated Fixes

Not all dependency vulnerabilities are equal. Criticality levels such as "Critical," "High," "Medium," and "Low" indicate the potential impact and exploitability of a vulnerability. Taking action based on this distinction offers a more balanced approach. For example, it makes more sense to stop the build only for "Critical" or "High" level vulnerabilities than to stop it for every "Low" level finding.

In an ERP project for a manufacturing company, we adjusted our security policy according to these criticality levels. We decided to stop the build only for "Critical" and "High" level CVEs. This reduced the number of build failures by 75% and allowed developers to deal with fewer "false positives." For "Medium" and "Low" level vulnerabilities, we created a separate security dashboard and tracked them regularly.
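A severity threshold like this can be expressed as a small gate over the scanner's summary counts. The sketch below assumes the `metadata.vulnerabilities` block that `npm audit --json` emits; the function name `should_fail_build` and the sample report are hypothetical. Note that npm labels the middle tier "moderate" where this post says "Medium."

```python
import json

# npm reports "moderate" where many other tools say "medium".
SEVERITY_ORDER = ["info", "low", "moderate", "high", "critical"]


def should_fail_build(audit_report, fail_at="high"):
    """Return True if any vulnerability is at or above the `fail_at` severity."""
    counts = audit_report.get("metadata", {}).get("vulnerabilities", {})
    blocking = SEVERITY_ORDER[SEVERITY_ORDER.index(fail_at):]
    return any(counts.get(level, 0) > 0 for level in blocking)


# Trimmed, made-up report in the shape of `npm audit --json` output.
report = json.loads("""
{"metadata": {"vulnerabilities":
  {"info": 0, "low": 3, "moderate": 7, "high": 0, "critical": 0, "total": 10}}}
""")

print(should_fail_build(report))                 # policy: block on high/critical only
print(should_fail_build(report, fail_at="low"))  # stricter policy: block on low and above
```

Keeping the threshold as a parameter makes it easy to run a strict policy on one project and a looser one on another.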

💡 Automated Remediation Bots

Tools like Dependabot or Renovate can help remediate vulnerabilities by automatically updating your dependencies. These bots create pull requests for secure updates and reduce developer workload. However, it's important to remember that automated updates don't always work flawlessly and can sometimes lead to breaking changes.

Automated dependency updaters also play an important role in this process. When a new security vulnerability is detected, these bots can automatically open a pull request that moves the dependency to a patched version, which significantly reduces the manual tracking developers have to do. Because such updates can introduce breaking changes or incompatibilities, they must pass through the CI/CD pipeline and be tested like any other change.

My Preference: A Context-Based Hybrid Approach

Years of experience have shown me that a "one-size-fits-all" solution does not exist for dependency security. Every project, every team, and every organization has its unique risk tolerance and development culture. Therefore, my clear position is a context-based, hybrid approach.

The strategy we applied in an internal banking platform was completely different from the strategy I applied in my Android spam application. In the bank, even the slightest vulnerability carried significant financial and reputational risks; therefore, stopping the build for "Critical" and "High" level vulnerabilities was mandatory. In my Android application, being a less risky project, I only monitored "Medium" level vulnerabilities and intervened manually periodically.

I generally implement my hybrid approach with the following steps:

  1. Stop the Build for Critical and High Level Vulnerabilities: I always stop the CI/CD pipeline for any CVE marked as "Critical" or "High." This is the fastest way to eliminate the most urgent and potentially most destructive risks.
  2. Warn and Track for Medium and Low Level Vulnerabilities: I do not stop the build for "Medium" and "Low" level vulnerabilities. Instead, I track these vulnerabilities on a separate security dashboard (e.g., via Slack integration or Jira tickets). This keeps developers informed without disrupting their flow.
  3. Use Automated Updates: I try to automatically integrate patched dependency versions using tools like Dependabot or Renovate. These pull requests pass through the test pipeline like other code changes.
  4. Periodic Manual Review and Risk Assessment: Every quarter or before a major release, I manually review accumulated "Medium" and "Low" level vulnerabilities. During this review, I assess how much risk the vulnerability poses in the project's real-world usage scenario. Sometimes a vulnerability may not affect the module we are using, and in this case, it can be added to an exception list.
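The steps above can be sketched as a small triage function: Critical and High findings block the build, everything else goes to a tracking list, and findings on the exception list are skipped. This is a minimal illustration rather than a real scanner integration; the `Finding` type, the advisory identifiers, and the package names are all hypothetical.

```python
from dataclasses import dataclass

BLOCKING = {"critical", "high"}


@dataclass(frozen=True)
class Finding:
    advisory_id: str  # e.g. a CVE or advisory identifier
    package: str
    severity: str     # "critical", "high", "medium" or "low"


def triage(findings, excepted_ids=frozenset()):
    """Split findings into those that block the build and those to track."""
    block, track = [], []
    for f in findings:
        if f.advisory_id in excepted_ids:
            continue  # documented exception from the periodic manual review (step 4)
        (block if f.severity.lower() in BLOCKING else track).append(f)
    return block, track


# Made-up findings for illustration.
findings = [
    Finding("ADV-0001", "xml-parser-lib", "critical"),
    Finding("ADV-0002", "date-utils-lib", "medium"),
    Finding("ADV-0003", "image-resize-lib", "high"),
    Finding("ADV-0004", "color-lib", "low"),
]

block, track = triage(findings, excepted_ids={"ADV-0004"})
for f in track:
    print(f"TRACK {f.advisory_id} ({f.severity}) in {f.package}")
if block:
    print(f"BLOCK: {len(block)} critical/high finding(s)")
exit_code = 1 if block else 0  # a CI wrapper would pass this to sys.exit()
```

The tracked list is what would feed the dashboard, Slack message, or Jira tickets mentioned in step 2.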

This approach allows for a delicate balance between security and development speed. We eliminate the most critical risks and prevent developers from constantly struggling with build failures. My experiences with [related: Observability in Software Development] have repeatedly shown me how important these tracking processes are. Similarly, I detailed these automation steps in a post I wrote on [related: CI/CD Pipeline Security].

Considerations and Metrics in Practice

When implementing a hybrid dependency security strategy, there are a few important points to consider. First, the "false positive" rate needs to be managed well. Sometimes security scanners can issue warnings for situations that do not actually pose a risk. In such cases, it is important to carefully evaluate whether the vulnerability is truly exploitable in the project's context and, if necessary, add it to an exception list. However, these exceptions must be used very carefully and documented.

In an ERP for a manufacturing company, we received a "false positive" warning for a "Medium" level CVE in a specific library for 6 months. Constantly seeing the same warning caused the team to become desensitized to other critical warnings. Then we realized that in our use case, this did not pose a risk because we never called the vulnerable function. In such situations, creating a decision log and clearly stating why an exception was made is vital.

🔥 Exception Lists and Risks

While exception lists are useful for managing "false positive" situations, they can create security gaps if misused. Every exception should be made with a detailed risk assessment and security team approval, and also reviewed regularly.
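One way to keep exceptions documented and regularly reviewed is to store each one with its reason, its approver, and a review date, and to stop honoring it once that date has passed. The sketch below assumes this record layout; the `SecurityException` type, the identifiers, and the dates are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SecurityException:
    advisory_id: str
    reason: str       # why the finding is not exploitable in this project
    approved_by: str  # who on the security team signed off
    review_by: date   # date after which the exception must be re-assessed


def split_exceptions(exceptions, today):
    """Return (active_ids, expired) so stale exceptions stop suppressing findings."""
    active = {e.advisory_id for e in exceptions if e.review_by >= today}
    expired = [e for e in exceptions if e.review_by < today]
    return active, expired


# Made-up entries for illustration.
exceptions = [
    SecurityException("ADV-0002", "Vulnerable function is never called",
                      "security-team", date(2025, 6, 30)),
    SecurityException("ADV-0007", "Dev-only tooling, not shipped",
                      "security-team", date(2024, 12, 31)),
]

active, expired = split_exceptions(exceptions, today=date(2025, 3, 1))
print("suppressed:", sorted(active))
for e in expired:
    print(f"re-review needed: {e.advisory_id} (was due {e.review_by})")
```

Because an expired exception no longer suppresses its finding, the warning comes back on its own and forces the review to happen.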

Second, it's important to track the right metrics to measure security performance. Some key metrics I track include:

  • New Vulnerabilities Per Sprint: The number of new "Critical" and "High" level vulnerabilities detected at the end of each sprint.
  • Critical Vulnerability Mean Time To Resolve (MTTR): The average time from detection to resolution of a "Critical" or "High" marked vulnerability.
  • Build Failure Rate Due to Security: The percentage of builds that fail due to security vulnerabilities, relative to the total number of builds.
  • Security Debt: The total number of accumulated "Medium" and "Low" level vulnerabilities.

| Metric Name | Definition | Target Value |
| --- | --- | --- |
| New Critical Vulnerabilities / Sprint | Number of new Critical/High vulnerabilities emerging each sprint | < 1 |
| Critical Vulnerability MTTR | Time from detection to remediation of a Critical/High vulnerability | < 24 hours |
| Security Build Failure Rate | Ratio of builds failing due to security scans | < 5% |
| Security Debt (Medium/Low) | Total number of accumulated Medium/Low vulnerabilities | < 50 (varies by project) |
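As a rough illustration of how these metrics can be computed and compared with their targets, here is a short Python sketch. The input numbers are invented; in practice they would come from your scanner history and CI system.

```python
from datetime import datetime, timedelta


def mean_time_to_resolve(resolved):
    """Average of (resolved_at - detected_at) over resolved Critical/High findings."""
    if not resolved:
        return timedelta(0)
    total = sum((fixed - found for found, fixed in resolved), timedelta(0))
    return total / len(resolved)


def build_failure_rate(security_failures, total_builds):
    """Share of all builds that failed because of the security scan."""
    return security_failures / total_builds if total_builds else 0.0


# Made-up numbers for one sprint: (detected_at, resolved_at) pairs.
resolved = [
    (datetime(2025, 3, 3, 9, 0), datetime(2025, 3, 3, 15, 0)),   # 6 hours
    (datetime(2025, 3, 5, 10, 0), datetime(2025, 3, 6, 4, 0)),   # 18 hours
]
mttr = mean_time_to_resolve(resolved)
rate = build_failure_rate(security_failures=4, total_builds=120)
security_debt = 37  # open Medium/Low findings

print(f"MTTR: {mttr} (target met: {mttr < timedelta(hours=24)})")
print(f"Security build failure rate: {rate:.1%} (target met: {rate < 0.05})")
print(f"Security debt: {security_debt} (target met: {security_debt < 50})")
```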

These metrics allow us to see trends over time and understand whether our security posture is improving. For example, if the critical vulnerability resolution time is increasing, this could indicate a workload issue for the security team or developers.

Finally, creating and maintaining an SBOM (Software Bill of Materials) provides transparency in dependency security. An SBOM is a list of all dependencies used in your project and their versions. This list helps you quickly identify which of your projects are affected when a new CVE is published.
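To make the lookup concrete, here is a minimal sketch that searches several SBOMs for a package at a known-vulnerable version. It assumes a simplified structure with a `components` list of `name`/`version` entries, similar to CycloneDX JSON; the project names, packages, and versions are invented. Real advisories usually specify version ranges, so exact matching is a simplification.

```python
def affected_projects(sboms, package, vulnerable_versions):
    """Return names of projects whose SBOM lists `package` at a vulnerable version."""
    hits = []
    for project, sbom in sboms.items():
        for component in sbom.get("components", []):
            if (component.get("name") == package
                    and component.get("version") in vulnerable_versions):
                hits.append(project)
                break  # one match is enough to flag the project
    return sorted(hits)


# Hypothetical SBOMs, trimmed to the fields used here.
sboms = {
    "shop-backend": {"components": [
        {"name": "xml-parser-lib", "version": "2.3.1"},
        {"name": "color-lib", "version": "1.0.0"},
    ]},
    "erp-api": {"components": [{"name": "xml-parser-lib", "version": "2.4.0"}]},
    "admin-ui": {"components": [{"name": "xml-parser-lib", "version": "2.3.0"}]},
}

print(affected_projects(sboms, "xml-parser-lib", {"2.3.0", "2.3.1"}))
```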

Conclusion

Dependency security is an inevitable reality of modern software projects. Choosing between stopping the build and just issuing a warning depends on the project's context, risk tolerance, and team culture. In my experience, the most effective way is to implement a hybrid strategy that balances these two approaches.

Showing zero tolerance for critical vulnerabilities while providing flexibility for lower-level issues is key to both maintaining security and preserving development speed. Let's remember that security is a journey and requires continuous adaptation and learning. The important thing is to understand the risks, use the right tools, and keep the team's security awareness high.
