<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Thomas Segura</title>
    <description>The latest articles on DEV Community by Thomas Segura (@segudev).</description>
    <link>https://dev.to/segudev</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F141347%2Fb91f2fe0-45b4-43a1-8002-190823082da8.png</url>
      <title>DEV Community: Thomas Segura</title>
      <link>https://dev.to/segudev</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/segudev"/>
    <language>en</language>
    <item>
      <title>Pipeline Integrity and Security in DevSecOps</title>
      <dc:creator>Thomas Segura</dc:creator>
      <pubDate>Wed, 05 Jun 2024 13:58:30 +0000</pubDate>
      <link>https://dev.to/gitguardian/pipeline-integrity-and-security-in-devsecops-2p81</link>
      <guid>https://dev.to/gitguardian/pipeline-integrity-and-security-in-devsecops-2p81</guid>
      <description>&lt;p&gt;This is the third blog post in a series that is taking a deep dive into DevSecOps program architecture. The goal of this series is to provide a holistic overview of DevSecOps as a collection of technology-driven, automated processes. Make sure to check out the first and second parts too!&lt;/p&gt;

&lt;p&gt;At this point in the series, we have covered how to manage existing vulnerabilities and how to prevent the introduction of new vulnerabilities. We now have a software development lifecycle (SDLC) that produces software that is secure-by-design. All we have left to cover is how to enforce and protect this architecture that we’ve built.&lt;/p&gt;

&lt;p&gt;In this article, we will be learning how to add integrity and security to the systems involved in our SDLC. These processes should be invisible to our software engineers. They should simply exist as guardrails that ensure the rest of our architecture is utilized and not interfered with.&lt;/p&gt;

&lt;p&gt;With that in mind, I want to reiterate my DevSecOps mission statement:&lt;/p&gt;

&lt;p&gt;“My job is to implement a secure-by-design software development process that empowers engineering teams to own the security for their own digital products. We will ensure success through controls and training, and we will reduce friction and maintain velocity through a technology-driven, automated DevSecOps architecture.”&lt;/p&gt;

&lt;h2&gt;Threat landscape&lt;/h2&gt;

&lt;p&gt;Whenever we talk about securing something, we need to answer the question, “From what?” Threat modeling is the practice of identifying things that could go wrong based on what we are trying to protect and who/what we are protecting it from. We can’t possibly cover every scenario, but in the context of our DevSecOps architecture there are a handful of threats that we should be considering broadly.&lt;/p&gt;

&lt;p&gt;Below is a threat model of the software development process that I find particularly useful:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh7-us.googleusercontent.com%2FfkX6Cjuinitl_dpwApv7AOEPfCyaK3V9l6UggI7gyA3-V-99tfYfMNlIVvzZq3rRYznH_bEHn2FXrcZQQnGRIwq7vy-diJ1XI2J_WYZB1KwHm6XXmxIhXf2fv6y32jYGy7apfQEbMO1cAuwkZPftsQ" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh7-us.googleusercontent.com%2FfkX6Cjuinitl_dpwApv7AOEPfCyaK3V9l6UggI7gyA3-V-99tfYfMNlIVvzZq3rRYznH_bEHn2FXrcZQQnGRIwq7vy-diJ1XI2J_WYZB1KwHm6XXmxIhXf2fv6y32jYGy7apfQEbMO1cAuwkZPftsQ" alt="Supply Chain Threats v1.0 (source: OpenSSF)"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Of the threats listed above, “use compromised dependency” (Threat D) is the most challenging to mitigate. We usually have little control over the external dependencies that we use in our code. The XZ Utils backdoor shone an eye-opening spotlight on the widespread impact that a single compromised dependency can have.&lt;/p&gt;

&lt;p&gt;Unfortunately, malicious insiders in the open-source ecosystem are an unsolved problem at this time. Personally, I don’t think anything will improve until well-resourced consumers of open-source software become more involved in improving the security and support of their open-source dependencies.&lt;/p&gt;

&lt;p&gt;In this article, we will be focusing on the things that we can control. For dependency threats, we can look out for malicious look-alike packages, and we can use SCA tools to identify when we are using outdated, vulnerable versions of our dependencies.&lt;/p&gt;

&lt;p&gt;In the following sections, we will explore ways to mitigate source threats and build threats through integrity checks. Then, we will examine the assumptions we are making about the integrity checks and discuss how we can use security to build trust in those assumptions.&lt;/p&gt;

&lt;h2&gt;Pipeline integrity&lt;/h2&gt;

&lt;p&gt;When we think about integrity in software, we often default to thinking of it as signing binaries. In our DevSecOps architecture, we go beyond verifying individual software artifacts. We need to be able to verify the integrity of our pipeline. You might be wondering, “We set up the software development pipeline ourselves… Why would we need to verify its integrity?” It turns out that the assumptions we make about our SDLC can be wrong. We have security gates, but that doesn’t guarantee that we have no gaps.&lt;/p&gt;

&lt;p&gt;In software development environments, there are usually ways to skip steps or bypass controls. For one, it’s common for software engineers to be able to publish artifacts like container images directly to our registry. Even with innocent intentions, this lets vulnerabilities slip past security checks that would otherwise have caught them. In a worse scenario, a compromised developer account could allow a threat actor to push backdoored packages directly to our registry.&lt;/p&gt;

&lt;p&gt;To verify that our software artifacts are the product of the DevSecOps systems that we have in place, we must improve the integrity of our software development pipeline.&lt;/p&gt;

&lt;h3&gt;Branch protection&lt;/h3&gt;

&lt;p&gt;One of the most important controls in our software development pipeline is branch protection. Branch protection rules defend against the source threats in our threat model (Figure 1).&lt;/p&gt;

&lt;p&gt;By requiring a Pull Request (PR) to merge code into our production branch, we are ensuring that humans are authorizing changes (Threat A) and verifying that the source code is free from vulnerabilities and backdoors (Threat B). We can also trigger automatic builds when there are changes to our production branch, which will produce builds that come from the source code that has been reviewed (Threat C).&lt;/p&gt;
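&lt;p&gt;As a rough sketch, here is how such a branch protection configuration could be expressed as a payload for GitHub’s REST API (the status check context and reviewer count below are hypothetical examples, not a prescribed setup):&lt;/p&gt;

```python
import json

def branch_protection_payload(required_reviews=1):
    """Build a protection payload that forces all changes through a PR."""
    return {
        # Require CI status checks to pass before merging (Threat C).
        "required_status_checks": {"strict": True, "contexts": ["ci/build"]},
        # Require human review of every pull request (Threats A and B).
        "required_pull_request_reviews": {
            "dismiss_stale_reviews": True,
            "required_approving_review_count": required_reviews,
        },
        # Apply the rules to admins too, and forbid rewriting history.
        "enforce_admins": True,
        "restrictions": None,
        "allow_force_pushes": False,
        "allow_deletions": False,
    }

payload = branch_protection_payload(required_reviews=2)
print(json.dumps(payload, indent=2))
```

&lt;p&gt;The payload would be sent with a PUT request to the repository’s branch protection endpoint; the important part is that merging requires both passing checks and human approval.&lt;/p&gt;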

&lt;h3&gt;Reproducible builds&lt;/h3&gt;

&lt;p&gt;In the SolarWinds supply chain attack, it was the compromise of SolarWinds’ build servers that led to the injection of the Sunburst backdoor. Injecting malicious code late in the software development pipeline is an effective way to reduce the chance that a human will catch the backdoor.&lt;/p&gt;

&lt;p&gt;In the long run, the best mitigation strategy we have against compromised builds (Threat E) is to make our software builds reproducible. Having reproducible builds means that we can run the same steps against the same source code on a different system and end up with the exact same binary result. We would then have a way to verify that a binary was not tampered with. If we rebuilt and the resulting binary was different, it would raise some questions and warrant investigation.&lt;/p&gt;
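&lt;p&gt;The verification itself can be as simple as comparing digests. A minimal sketch, assuming we have the official artifact and an independently rebuilt copy as byte strings:&lt;/p&gt;

```python
import hashlib

def sha256_of(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def verify_reproducible(official_build: bytes, rebuilt: bytes) -> bool:
    """True if an independent rebuild produced a bit-identical artifact."""
    return sha256_of(official_build) == sha256_of(rebuilt)

# A matching rebuild verifies; a single changed byte warrants investigation.
print(verify_reproducible(b"binary-v1.0", b"binary-v1.0"))  # True
print(verify_reproducible(b"binary-v1.0", b"binary-v1.O"))  # False
```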

&lt;h3&gt;Artifact signing&lt;/h3&gt;

&lt;p&gt;Signing software has been a common practice for a long time because it provides a way for consumers to verify that the software came from us. This protects software consumers even if our package registry is compromised (Threat G).&lt;/p&gt;

&lt;p&gt;Unfortunately, this still leaves us with a lot of assumptions because a binary signature doesn’t say anything about how the software was built. For example, a software engineer or threat actor might have write access to our container registry and the ability to sign container images. By pushing a locally built container image directly to our registry, they would still be bypassing the automated checks and human reviews that happen in our PRs.&lt;/p&gt;

&lt;h3&gt;SLSA framework&lt;/h3&gt;

&lt;p&gt;To provide a way for us to verify how and where a piece of software was built, the Open Source Security Foundation (OpenSSF) created the Supply-chain Levels for Software Artifacts (SLSA) framework. What does SLSA do for us in practice? If we reuse the earlier example of a software engineer pushing a container directly to our registry, we could have a verification step before deployment that would detect that the container wasn’t built in our CI pipeline.&lt;/p&gt;

&lt;p&gt;SLSA ranks our software build process on a scale of 0 to 3 according to how verifiable it is. A whole article could be written about SLSA, but to keep things short, here is a summary of the four levels and what they aim to protect against:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Level 0&lt;/strong&gt; – Nothing is done to verify the build process. We don’t have any way to verify who built the software artifact or how they built it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Level 1&lt;/strong&gt; – Software artifacts are distributed with a provenance that contains detailed information about the build process. Before using or deploying the software, we can use the information in the provenance to make sure that the components, tools, and steps used in the build are what we expect.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Level 2&lt;/strong&gt; – The provenance is generated at build time by a dedicated build platform that also signs the provenance. Adding a signature to the provenance allows us to verify that the documentation came from our build platform and hasn’t been forged or tampered with.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Level 3&lt;/strong&gt; – The build platform is hardened to prevent the build process from having access to the secret used to sign the provenance. This means that a tampered build process cannot modify the provenance in a way that would hide its anomalous characteristics.&lt;/p&gt;
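&lt;p&gt;To make the verification step concrete, here is a simplified sketch of a provenance check before deployment. Real verification relies on signed attestations and dedicated tooling such as slsa-verifier; the builder ID and artifact below are hypothetical:&lt;/p&gt;

```python
import hashlib

TRUSTED_BUILDER = "https://github.com/example-org/ci"  # hypothetical builder ID

def verify_provenance(artifact: bytes, provenance: dict) -> bool:
    """Check that the artifact digest appears in the provenance and that
    the provenance claims our trusted CI platform as the builder."""
    digest = hashlib.sha256(artifact).hexdigest()
    subjects = provenance.get("subject", [])
    digest_ok = any(s.get("digest", {}).get("sha256") == digest for s in subjects)
    builder = provenance.get("predicate", {}).get("builder", {}).get("id")
    return digest_ok and builder == TRUSTED_BUILDER

image = b"container-image-bytes"
prov = {
    "subject": [{"name": "app:1.0", "digest": {"sha256": hashlib.sha256(image).hexdigest()}}],
    "predicate": {"builder": {"id": TRUSTED_BUILDER}},
}
print(verify_provenance(image, prov))        # True
print(verify_provenance(b"tampered", prov))  # False: digest mismatch
```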

&lt;p&gt;At SLSA level 3, we have a way to verify that we aren’t falling victim to build threats E-H in our threat model (Figure 1). However, you might notice that we start placing some trust in the build platform to be hardened in an adequately secure way.&lt;/p&gt;

&lt;p&gt;Trust in the platforms that make up our SDLC is one of the guiding principles of SLSA. The purpose of SLSA is to verify that our software artifacts came from the expected systems and processes rather than individuals with write access to our package registries. How do we build trust in the systems that produce our software? By securing them.&lt;/p&gt;

&lt;h2&gt;Pipeline security&lt;/h2&gt;

&lt;p&gt;Our DevSecOps architecture is technology-driven, which means there are multiple systems that compose our software development pipeline. At this point in our DevSecOps journey, we have confidence that our SDLC is producing reasonably secure software, and we have ways to verify that our pipeline is being used and not skipping steps. The final threat we have to deal with is the compromise of our deployed services or the systems involved in our software development pipeline.&lt;/p&gt;

&lt;h3&gt;Securing development systems&lt;/h3&gt;

&lt;p&gt;If a system involved in development or CI gets compromised by a threat actor, they may be able to inject backdoors into our software, steal secrets from our development environment, or even steal data from the downstream consumers of our software by injecting malicious logic. It’s common for us to put a lot of energy into securing the production systems that we deploy our software to, but we need to treat the systems that build our software like they are also production systems.&lt;/p&gt;

&lt;h3&gt;Workstation security&lt;/h3&gt;

&lt;p&gt;On the “left” side of our SDLC, our developers are writing code on their workstations. This isn’t an article about enterprise security, but endpoint protection solutions such as antivirus and EDR play an important role in securing these systems. If we are concerned about our source code being leaked or exfiltrated, we might also consider data-loss prevention (DLP) tools or user and entity behavior analytics (UEBA).&lt;/p&gt;

&lt;h3&gt;Remote development&lt;/h3&gt;

&lt;p&gt;If we want to go a step further in protecting our source code, we can create remote development environments that our developers use to write the code. Development tools like Visual Studio Code and JetBrains IDEs support connecting to remote systems for remote development. This is not the same thing as dev containers, which can run on our local host. Remote development refers to connecting our IDE to a completely separate development server that hosts our source code.&lt;/p&gt;

&lt;p&gt;This isolation of the software development process separates our source code from high-risk activity like email and browsing the internet. We can combine remote development with a zero-trust networking solution that requires human verification (biometrics, hardware keys, etc.) to connect to the remote development environment. If a developer’s main device gets compromised, remote development makes it much harder to steal or tamper with the source code they have access to.&lt;/p&gt;

&lt;p&gt;Remote development obviously adds friction to the software development process, but if our threat model requires it, this is a very powerful way to protect our source code at the earliest stages of development.&lt;/p&gt;

&lt;h3&gt;Build platform hardening&lt;/h3&gt;

&lt;p&gt;The SolarWinds supply chain attack that we covered earlier is a prime example of why we need to treat build systems with great scrutiny. Reproducible builds are a way to verify the integrity of our build platform, but we still want to secure these systems to the best of our ability.&lt;/p&gt;

&lt;p&gt;Similarly to workstation security, endpoint protection and other enterprise security solutions can help monitor and protect our build platform. We can also take additional steps like limiting administrator access to the build platform and restricting file system permissions.&lt;/p&gt;

&lt;h3&gt;Securing deployment systems&lt;/h3&gt;

&lt;p&gt;If our software is deployed as a service for others to use, we need to make sure that we are securing our deployment systems. A compromised service can leak information about our users and allow a threat actor to pivot to other systems.&lt;/p&gt;

&lt;h3&gt;Zero-trust networking&lt;/h3&gt;

&lt;p&gt;A powerful control against the successful exploitation of our applications is restricting outbound network access. Historically, it’s been very common for public-facing applications to sit in a DMZ: a segregated network segment that can’t initiate outbound connections to the internet or to the rest of our network (except for maybe a few necessary services). Inbound connections from our users are allowed through, but in the event of a remote code execution exploit, the server is unable to download malware or run a reverse shell.&lt;/p&gt;

&lt;p&gt;If we use Kubernetes for our container workloads, we can utilize modern zero-trust networking tools like Cilium to connect our services and disallow everything else. Cilium comes with a UI called Hubble that visualizes our services in a diagram to assist us in building and troubleshooting our network policies.&lt;/p&gt;
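&lt;p&gt;As an illustration, a Cilium policy that only allows traffic from a frontend service might be constructed like this (labels and names are hypothetical; once a policy selects an endpoint, Cilium denies all other ingress traffic to it by default):&lt;/p&gt;

```python
def allow_only_frontend(app_label):
    """Build a CiliumNetworkPolicy dict that admits only frontend traffic."""
    return {
        "apiVersion": "cilium.io/v2",
        "kind": "CiliumNetworkPolicy",
        "metadata": {"name": f"{app_label}-allow-frontend"},
        "spec": {
            # The policy applies to pods carrying this label...
            "endpointSelector": {"matchLabels": {"app": app_label}},
            # ...and only endpoints labeled "frontend" may reach them.
            "ingress": [{"fromEndpoints": [{"matchLabels": {"app": "frontend"}}]}],
        },
    }

policy = allow_only_frontend("orders-api")
print(policy["metadata"]["name"])  # orders-api-allow-frontend
```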

&lt;h3&gt;Privilege dropping and seccomp&lt;/h3&gt;

&lt;p&gt;If we run our services inside Linux containers, we can easily limit their access to various system resources. Seccomp is a Linux kernel feature that works with container runtimes to restrict the syscalls that can be made to the kernel. By default, Kubernetes runs workloads in “Unconfined” (seccomp disabled) mode unless a profile is explicitly configured.&lt;/p&gt;

&lt;p&gt;At a minimum, we can use the “RuntimeDefault” seccomp profile in Kubernetes workloads to apply the default seccomp profile of our container runtime. Docker’s default profile, for example, blocks dozens of syscalls, most of them to prevent privilege escalation from the container to the host. There may be certain low-level or observability workloads that do need to run “Unconfined,” but in general, the default seccomp profile is intended to be safe for most applications.&lt;/p&gt;
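&lt;p&gt;As a sketch, enforcing this in a pod spec is a one-field change (the pod below is a toy example; the field names follow the Kubernetes securityContext schema):&lt;/p&gt;

```python
def apply_runtime_default_seccomp(pod_spec: dict) -> dict:
    """Set the pod-level seccomp profile to the runtime's default."""
    ctx = pod_spec.setdefault("securityContext", {})
    ctx["seccompProfile"] = {"type": "RuntimeDefault"}
    return pod_spec

pod = {"containers": [{"name": "app", "image": "app:1.0"}]}
hardened = apply_runtime_default_seccomp(pod)
print(hardened["securityContext"]["seccompProfile"]["type"])  # RuntimeDefault
```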

&lt;p&gt;If we wanted to be even more restrictive, we could create our own seccomp filters that only allow the syscalls needed by our application. In some cases, restricting syscalls at this level could even prevent the successful exploitation of a vulnerable system. I did a talk on this back in 2022 that explains how to automate the creation of seccomp allowlist filters in a way that fits nicely into existing DevOps workflows. Be aware, however, that seccomp allowlist filters can introduce instability into our application if we aren’t performing the necessary testing when creating the filter.&lt;/p&gt;

&lt;h3&gt;Container drift monitoring&lt;/h3&gt;

&lt;p&gt;Another powerful security feature that container deployments enable is container drift monitoring. Many container applications are “stateless,” which means that we wouldn’t expect them to be changing in any way. We can take advantage of this expectation and monitor our stateless containers for any drift from their default state using tools like Falco. When a stateless container starts doing things that it wouldn’t normally do, it could indicate that our app has been exploited.&lt;/p&gt;
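&lt;p&gt;Conceptually, drift detection boils down to comparing observed behavior against a baseline. A toy illustration (real tools like Falco do this with kernel-level visibility; the process names below are made up):&lt;/p&gt;

```python
# Baseline of processes we expect inside the stateless container.
BASELINE = {"nginx", "nginx-worker"}

def detect_drift(observed_processes):
    """Return the set of unexpected processes (empty means no drift)."""
    return set(observed_processes) - BASELINE

print(detect_drift(["nginx", "nginx-worker"]))        # set()
print(detect_drift(["nginx", "nginx-worker", "sh"]))  # {'sh'}
```

&lt;p&gt;A shell suddenly spawning inside a web server container is the classic signal that something has been exploited.&lt;/p&gt;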

&lt;h2&gt;Identity&lt;/h2&gt;

&lt;p&gt;Lastly, let’s look at a few identity-related practices that can meaningfully improve the security of the systems in our software development pipeline.&lt;/p&gt;

&lt;h3&gt;Secrets management&lt;/h3&gt;

&lt;p&gt;There is a lot of complexity in DevSecOps around identity and access management because we are dealing with both human and machine identities at multiple stages of our SDLC. When our services talk to one another, they need access to credentials that will let them in.&lt;/p&gt;

&lt;p&gt;Managing the lifecycle of these credentials is a bigger topic than what we will be covering here, but having a strategy for secret management is one of the most important things we can do for the security of the systems in our SDLC. For detailed advice on this topic, check out GitGuardian’s secret management maturity model whitepaper.&lt;/p&gt;

&lt;h3&gt;Leaked secret prevention&lt;/h3&gt;

&lt;p&gt;No matter how mature our secret management process is, secrets always seem to find a way into places they shouldn’t be. Whether they are in source code, Jira tickets, chats, or anywhere else, it’s impossible to prevent all our secrets from ever being exposed. For that reason, it’s important to be able to find secrets where they shouldn’t be and have a process to rotate leaked secrets so they are no longer valid.&lt;/p&gt;

&lt;h3&gt;Honeytokens&lt;/h3&gt;

&lt;p&gt;Leaked secrets are a very sought-after target for threat actors because of their prevalence and impact. We can take advantage of this temptation and intentionally leak special secrets called honeytokens that would never be used except by malicious actors who are looking for them. By putting honeytokens in convincing locations like source code and Jira tickets, we are setting deceptive traps with high-fidelity alerts that catch even the stealthiest attackers.&lt;/p&gt;
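&lt;p&gt;To illustrate the idea, here is a sketch of generating an AWS-style decoy credential. Real honeytokens are backed by an alerting service that fires when the token is used; this only shows the shape of the bait:&lt;/p&gt;

```python
import secrets
import string

def make_honeytoken():
    """Generate a decoy that looks like an AWS access key pair."""
    key_alphabet = string.ascii_uppercase + string.digits
    key_id = "AKIA" + "".join(secrets.choice(key_alphabet) for _ in range(16))
    secret_alphabet = string.ascii_letters + string.digits
    secret_key = "".join(secrets.choice(secret_alphabet) for _ in range(40))
    return key_id, secret_key

key_id, secret_key = make_honeytoken()
# The decoy pair is then planted in source code, tickets, or config files.
assert key_id.startswith("AKIA") and len(key_id) == 20
```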

&lt;h3&gt;Others&lt;/h3&gt;

&lt;p&gt;We could list more ways to secure our infrastructure, but, like I said earlier, this isn’t an article about enterprise security. The topics we covered were included because of the special considerations we must take in the context of the software development environment.&lt;/p&gt;

&lt;p&gt;Ultimately, collaboration between product and enterprise security is an important factor in protecting the integrity of our DevSecOps architecture. It is our collective duty to prevent threats from impacting us and those downstream who use our software products.&lt;/p&gt;

&lt;h2&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;DevSecOps architecture is driven by the technologies involved in the development process. Securing these systems builds trust in our software development pipeline, and adding ways to verify the integrity of our pipeline is what ultimately allows us to mitigate many of the supply-chain threats that we and our customers face.&lt;/p&gt;

&lt;p&gt;There will always be new things to learn and new ways to iterate on these strategies. The DevSecOps architecture described in this series is meant to provide a holistic and modern approach to application security that can be built upon. I hope that you are leaving with goals that will set your development teams up for success in securing their digital products.&lt;/p&gt;

</description>
      <category>devsecops</category>
      <category>security</category>
      <category>cybersecurity</category>
      <category>tooling</category>
    </item>
    <item>
      <title>Secure-by-Design Software in DevSecOps</title>
      <dc:creator>Thomas Segura</dc:creator>
      <pubDate>Wed, 22 May 2024 12:56:46 +0000</pubDate>
      <link>https://dev.to/gitguardian/secure-by-design-software-in-devsecops-4459</link>
      <guid>https://dev.to/gitguardian/secure-by-design-software-in-devsecops-4459</guid>
      <description>&lt;p&gt;This is the second blog post in a series that is taking a deep dive into DevSecOps program architecture. The goal of this series is to provide a &lt;strong&gt;holistic&lt;/strong&gt; overview of DevSecOps as a collection of &lt;strong&gt;technology-driven, automated processes&lt;/strong&gt;. If you didn’t read &lt;a href="https://blog.gitguardian.com/vulnerability-management-lifecycle-in-devsecops/"&gt;&lt;u&gt;the first blog post&lt;/u&gt;&lt;/a&gt;, make sure to check that out too! &lt;/p&gt;

&lt;p&gt;This entry will be less about the “decision-making” side of things, and more about the developer experience. We will learn how to equip our software engineers with the tools they need to successfully own the security for their own code, and how to support them through automation. Before getting started here, I want to reiterate my mission statement for DevSecOps:&lt;/p&gt;

&lt;p&gt;“My job is to implement a secure-by-design software development process that empowers engineering teams to own the security for their own digital products. We will ensure success through controls and training, and we will reduce friction and maintain velocity through a technology-driven, automated DevSecOps architecture.”&lt;/p&gt;

&lt;h2&gt;What does “secure-by-design” mean?&lt;/h2&gt;

&lt;p&gt;In this blog post, we will be constructing a process that makes our applications &lt;strong&gt;“secure-by-design”&lt;/strong&gt;. This means that our software development lifecycle (SDLC) includes security as &lt;em&gt;part of the development process&lt;/em&gt;, which guarantees a minimum level of security for every software product it produces.&lt;/p&gt;

&lt;p&gt;Through this strategy, our digital products will have gone through rigorous security testing by the time they are published, and we will have caught many vulnerabilities before anything is made available to customers.&lt;/p&gt;

&lt;p&gt;In DevSecOps, there are two key components in our software development pipeline that support our secure-by-design strategy: Security Gates and Automation. These two components work together to enforce a security baseline while preserving efficiency. Just like last time, we will be building a diagram to visualize how this process works.&lt;/p&gt;

&lt;h2&gt;Software development pipeline&lt;/h2&gt;

&lt;p&gt;In DevSecOps, we build our security practices on top of our existing software development practices. For the purposes of this blog post, we will say that our example organization produces container-based apps deployed through Kubernetes. These are the high-level artifacts or steps in our example pipeline:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--jiNhN-GA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://blog.gitguardian.com/content/images/2024/04/Vulnerability-Management-Lifecycle-in-DevSecOps-2-1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--jiNhN-GA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://blog.gitguardian.com/content/images/2024/04/Vulnerability-Management-Lifecycle-in-DevSecOps-2-1.png" alt="Example pipeline" width="800" height="243"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Your development process may look different, but as we go, you’ll find that many of the concepts will be transferable to whatever you’re working with. Let’s break each stage down briefly before moving on.&lt;/p&gt;

&lt;h3&gt;Uncommitted Code&lt;/h3&gt;

&lt;p&gt;The software development pipeline begins with the software engineer drafting &lt;strong&gt;new code&lt;/strong&gt; or modifying the application. This typically happens on the developer’s local workstation. From there, the code is added to the git history using the “&lt;strong&gt;git commit&lt;/strong&gt;” command.&lt;/p&gt;

&lt;h3&gt;Committed Code&lt;/h3&gt;

&lt;p&gt;Once the code has been committed to git, it effectively exists in our repository’s history forever. We can make additional changes, but the old commits still record how our code changed over time. This becomes important later. At some point, we will “&lt;strong&gt;git push&lt;/strong&gt;” our code to our git hosting provider. In this case, that’s GitHub.&lt;/p&gt;

&lt;h3&gt;GitHub Dev Branch&lt;/h3&gt;

&lt;p&gt;It’s not good practice to make changes directly on your production (prod) branch, so we have a different development (dev) branch to push our code to first. Once our code is ready to be published, we will open a &lt;strong&gt;Pull Request&lt;/strong&gt; to review and merge the changes from the dev branch into our prod branch.&lt;/p&gt;

&lt;h3&gt;GitHub Prod Branch&lt;/h3&gt;

&lt;p&gt;If the code passes review in the Pull Request from the dev branch, it is ready to be published as a container. We will &lt;strong&gt;build a container image&lt;/strong&gt; with our code and publish the image to our container registry.&lt;/p&gt;

&lt;h3&gt;Container Registry&lt;/h3&gt;

&lt;p&gt;In the container registry, our software sits frozen as an image waiting to be deployed. Our example organization uses Kubernetes for orchestration to &lt;strong&gt;deploy the container(s)&lt;/strong&gt; to run our app.&lt;/p&gt;

&lt;h3&gt;Deployment&lt;/h3&gt;

&lt;p&gt;Once we’ve deployed our app, it is running and exposed to the internet. Years ago, this was where most organizations would start doing vulnerability assessments of their applications. This means any major security issue would require that we start all the way back from the beginning and go through the entire process again.&lt;/p&gt;

&lt;p&gt;That’s obviously not very efficient, so over time, new tools were created to help developers shift security “left” in the pipeline and catch vulnerabilities earlier. Unfortunately, these tools have only been marginally effective, and vulnerable software is still extremely common. If we have the tools to tell us what is vulnerable and things still aren’t getting better, what are we doing wrong?&lt;/p&gt;

&lt;p&gt;I theorize that our problem with secure software development lies with our requirements. &lt;strong&gt;The ONLY requirement for software development has traditionally been that the app meets our functionality needs&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;To make software that is secure-by-design, &lt;strong&gt;we need to redefine what “production-ready” means for our digital products&lt;/strong&gt;. We must agree upon some minimum level of security and guarantee that we are meeting those standards through &lt;strong&gt;Security Gates&lt;/strong&gt; that prevent us from moving on until we have met our security requirements. This leads us to the first key component of our DevSecOps pipeline.&lt;/p&gt;

&lt;h2&gt;Security gates&lt;/h2&gt;

&lt;p&gt;To guarantee security in our software development pipeline, we will be adding scans or checks to each step of the process. Each of these scans will be automated to make sure that they can’t be forgotten.&lt;/p&gt;

&lt;h3&gt;Scanner Types&lt;/h3&gt;

&lt;p&gt;In the &lt;a href="https://blog.gitguardian.com/vulnerability-management-lifecycle-in-devsecops/"&gt;&lt;u&gt;last blog post&lt;/u&gt;&lt;/a&gt;, we covered many types of vulnerability scanners in the “Identification” stage of our process diagram. The same types of scanners will be used in the security gates we are about to cover, though we use them differently here.&lt;/p&gt;

&lt;p&gt;In the &lt;strong&gt;vulnerability management lifecycle&lt;/strong&gt; from last time, our security scanners were cataloging &lt;em&gt;historical, previously introduced vulnerabilities&lt;/em&gt; in our software. In our &lt;strong&gt;security gates&lt;/strong&gt;, we are mostly concerned with preventing the introduction of &lt;em&gt;new vulnerabilities&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;There’s one more distinction in how we are implementing the scanners here that I talked about briefly in the last post. Rather than making our software engineers context-switch and look at some web dashboards to see their vulnerabilities, we will be integrating our security scanners with the tools they already use. By doing this, we are showing them vulnerabilities &lt;em&gt;where they’re relevant&lt;/em&gt;, and &lt;em&gt;when they’re relevant&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;With that out of the way, let’s jump into the different types of security gates that we can set up in our pipeline. Below is the next part of our diagram which shows the interactions between our traditional pipeline and our security gates.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--q40iU_RL--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://blog.gitguardian.com/content/images/2024/04/Vulnerability-Management-Lifecycle-in-DevSecOps-1-1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--q40iU_RL--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://blog.gitguardian.com/content/images/2024/04/Vulnerability-Management-Lifecycle-in-DevSecOps-1-1.png" alt="Security gates along the development pipeline&amp;lt;br&amp;gt;
" width="800" height="1215"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As you can see, rather than moving directly from one stage of our pipeline to the next, our software now has to pass an automated security check before it can progress. We can set whatever we want as our threshold for failure, but ultimately, a failure will require us to fix vulnerabilities before we can move on.&lt;/p&gt;

&lt;p&gt;Now, let’s go into detail on what each type of security gate looks like. We’re going to start slightly out of order, but it will be clear why in a moment.&lt;/p&gt;

&lt;h3&gt;Pull Request checks&lt;/h3&gt;

&lt;p&gt;Pull Request (PR) checks are one of the most critical pieces in our secure-by-design SDLC because &lt;strong&gt;PRs are the first place in our pipeline where we can truly guarantee that security checks are happening&lt;/strong&gt;. Prior to the code being in GitHub, everything is happening on our local workstations. We trust our software engineers, but we can never be 100% sure that they aren’t disabling security checks or using personal devices to write code. We &lt;em&gt;do&lt;/em&gt; know that we have full control of the GitHub repository, though.&lt;/p&gt;

&lt;p&gt;By configuring &lt;a href="https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/managing-protected-branches/managing-a-branch-protection-rule?ref=blog.gitguardian.com"&gt;&lt;u&gt;branch protection rules&lt;/u&gt;&lt;/a&gt;, we can force all code to go through a Pull Request before transitioning from our “development” branch to our “production” branch. In DevSecOps, we use PRs as an opportunity to run various security checks and require them to pass before the code can be merged.&lt;/p&gt;
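&lt;p&gt;For illustration, the request body for GitHub's branch protection endpoint (&lt;code&gt;PUT /repos/{owner}/{repo}/branches/{branch}/protection&lt;/code&gt;) might be built like this; the status check names in &lt;code&gt;contexts&lt;/code&gt; are hypothetical examples, so substitute the names your own CI jobs and scanners report:&lt;/p&gt;

```python
import json

# Sketch of a branch protection payload for GitHub's REST API. The
# check names are placeholders for your actual required security checks.
def branch_protection_payload(required_checks, min_approvals=1):
    return {
        # Require the listed status checks to pass on an up-to-date branch
        "required_status_checks": {"strict": True, "contexts": required_checks},
        # Apply the rules to admins too, so nobody bypasses the gate
        "enforce_admins": True,
        "required_pull_request_reviews": {
            "required_approving_review_count": min_approvals
        },
        "restrictions": None,  # no extra push restrictions in this sketch
    }

payload = branch_protection_payload(["sast-scan", "secret-scan", "sca-scan"])
print(json.dumps(payload, indent=2))
```

&lt;p&gt;The important part is &lt;code&gt;required_status_checks&lt;/code&gt;: once a scan is listed there, a failing scan makes the merge button unavailable.&lt;/p&gt;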

&lt;h3&gt;
  
  
  Pre-commit and pre-push hooks
&lt;/h3&gt;

&lt;p&gt;Even though the Pull Request is the first enforceable security gate, we should still try to help our developers out by shifting checks further “left” to catch things even earlier in the process. Pre-commit and pre-push hooks are automated actions that run when we execute a “git commit” or “git push” command.&lt;/p&gt;

&lt;p&gt;As a security gate on our software engineer’s workstations, we can configure pre-commit &lt;em&gt;or&lt;/em&gt; pre-push hooks. I emphasize the word “or” because the two hook types are redundant if all we want to accomplish is passing security checks before pushing our code to GitHub.&lt;/p&gt;

&lt;p&gt;There are reasons to use one or the other depending on what type of scan we’re talking about. For example, we can &lt;a href="https://blog.gitguardian.com/setting-up-a-pre-commit-git-hook-with-gitguardian-shield-to-scan-for-secrets/"&gt;set up a hook for discovering leaked secrets&lt;/a&gt;. A &lt;em&gt;pre-push&lt;/em&gt; hook won’t detect a leaked secret until it has already been committed to the git history. At that point, it might be difficult to undo our commits and we may have to revoke and rotate the secret.&lt;/p&gt;

&lt;p&gt;We can save ourselves from all that extra work by configuring our secret scanner as a &lt;em&gt;pre-commit&lt;/em&gt; hook, which prevents our code from entering the git history until it is free from secrets. This is a great example of &lt;strong&gt;why we need security gates at multiple steps of the software development pipeline&lt;/strong&gt;. We can’t guarantee or perform some scans until later, but it saves us time to catch things as early as possible.&lt;/p&gt;
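&lt;p&gt;A minimal pre-commit secret check can be sketched in a few lines. Real scanners such as ggshield use hundreds of detectors and contextual validation; the two token shapes below are only illustrative:&lt;/p&gt;

```python
import re

# Toy pre-commit secret check: look for two common token shapes.
# This is a sketch, not a substitute for a real secret scanner.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),     # AWS access key id shape
    re.compile(r"ghp_[A-Za-z0-9]{36}"),  # GitHub personal token shape
]

def find_secrets(text):
    """Return every substring that matches a known secret pattern."""
    return [m.group(0) for p in SECRET_PATTERNS for m in p.finditer(text)]

# Sample "staged" content (key split so this file isn't itself flagged)
staged = 'aws_key = "AKIA' + "ABCDEFGHIJKLMNOP" + '"'
hits = find_secrets(staged)
if hits:
    print(f"Blocking commit: {len(hits)} potential secret(s) found")
```

&lt;p&gt;In an actual hook, the script would read the staged changes (&lt;code&gt;git diff --cached&lt;/code&gt;) and exit nonzero to abort the commit before the secret ever enters the git history.&lt;/p&gt;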

&lt;p&gt;On the other hand, pre-push hooks might make more sense for SAST or SCA scanners. One of the software engineers I work with once gave me feedback that overdoing pre-commit hooks can leave you unable to commit the code you are working on, especially when you are committing a lot of new code at once. By using pre-push hooks instead of pre-commit, we can still commit code with vulnerabilities and use the version control functionality of git. Then, we just fix any issues before pushing to GitHub.&lt;/p&gt;

&lt;p&gt;Outside of secret scans, there is no clear winner in the pre-commit versus pre-push debate. My recommendation is to have security and software engineering teams collaborate and agree on what makes the most sense. The key thing to remember is that &lt;strong&gt;performing the same scan at different stages of the SDLC is helpful&lt;/strong&gt;, not unnecessary.&lt;/p&gt;

&lt;h3&gt;
  
  
  Container image scans
&lt;/h3&gt;

&lt;p&gt;Now let’s get back on track in our software development pipeline. We’ve merged a Pull Request that passed the security gates in our PR checks. Now the code has been merged into our “prod” branch, and we are ready to build the container image that will run our application. If you have multiple environments, this same thing might occur in your “dev” branch. This security gate will look the same either way.&lt;/p&gt;

&lt;p&gt;In this example, we will build and publish our container image using an automated workflow. After the step of our workflow that builds the image, we can run a container scanner that will fail the build if it finds vulnerable dependencies in our image that exceed our severity threshold. If the scan passes, our workflow publishes the image to our container registry, where it waits to be deployed.&lt;/p&gt;
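&lt;p&gt;The gating step of that workflow boils down to parsing the scanner's report and failing the job when the threshold is exceeded. In this sketch, the report shape is a simplified stand-in rather than the exact output of any particular scanner:&lt;/p&gt;

```python
# Simplified build-step gate: after the image is built, a scanner writes
# a report; the workflow step blocks publishing when the report contains
# findings at or above the threshold. The report format is illustrative.
def should_fail_build(report, threshold=frozenset({"HIGH", "CRITICAL"})):
    return any(v["severity"] in threshold for v in report["vulnerabilities"])

report = {
    "image": "registry.example.com/app:1.4.2",  # hypothetical image name
    "vulnerabilities": [
        {"id": "CVE-2023-1111", "severity": "MEDIUM"},
        {"id": "CVE-2023-2222", "severity": "HIGH"},
    ],
}
blocked = should_fail_build(report)
if blocked:
    # In a real CI step this would be sys.exit(1), which fails the
    # workflow before the publish step can run.
    print("Image has HIGH/CRITICAL vulnerabilities; blocking publish")
```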

&lt;h3&gt;
  
  
  Kubernetes admission controller
&lt;/h3&gt;

&lt;p&gt;The last security gate in our example pipeline is a &lt;a href="https://kubernetes.io/blog/2019/03/21/a-guide-to-kubernetes-admission-controllers/?ref=blog.gitguardian.com"&gt;&lt;u&gt;Kubernetes admission controller&lt;/u&gt;&lt;/a&gt;. Admission controllers are just plugins that validate or even modify the instructions given to our Kubernetes cluster. The capabilities of the admission controller depend on the plugins.&lt;/p&gt;

&lt;p&gt;In our example pipeline, our admission controller is scanning the container images being deployed for critical dependency vulnerabilities that may have been introduced since the container was built. We could also enforce configuration policies that prevent unsafe settings from our infrastructure-as-code (IaC).&lt;/p&gt;
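&lt;p&gt;A validating admission webhook ultimately echoes the request's &lt;code&gt;uid&lt;/code&gt; back with an allowed/denied verdict. Here is a minimal sketch of that decision logic, with a made-up "no &lt;code&gt;:latest&lt;/code&gt; tag" policy standing in for a real image scan:&lt;/p&gt;

```python
# Sketch of a validating admission webhook's decision. Kubernetes sends
# an AdmissionReview; the webhook replies with the same uid and an
# allowed=True/False verdict. `scan_image` is a stub standing in for a
# real scanner or policy engine.
def scan_image(image):
    return ["tag 'latest' is not allowed"] if image.endswith(":latest") else []

def admission_response(review):
    request = review["request"]
    containers = request["object"]["spec"]["containers"]
    problems = [p for c in containers for p in scan_image(c["image"])]
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": {
            "uid": request["uid"],  # must echo the request uid back
            "allowed": not problems,
            "status": {"message": "; ".join(problems) or "ok"},
        },
    }

review = {"request": {"uid": "123", "object": {"spec": {"containers": [
    {"image": "registry.example.com/app:latest"}]}}}}
print(admission_response(review)["response"]["allowed"])  # False
```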

&lt;p&gt;If our admission controller plugins pass all their validations, our application finally reaches deployment. Having gone through each of the security gates we covered along the way, we can now be confident that our application meets the minimum security level that we configured in our security gates.&lt;/p&gt;

&lt;h3&gt;
  
  
  Assisting with automation
&lt;/h3&gt;

&lt;p&gt;At this point, you’re probably thinking, “This is adding a lot of hassle to getting to deployment.” You’re not wrong, but you should remember that we are redefining what “production-ready” means so we can improve the security of our apps. Luckily for us, this isn’t the end of the post. Now comes the part where we look at the other key component of our DevSecOps pipeline that aims to assist developers and protect our velocity: &lt;strong&gt;Continuous Automation&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Our goal for Continuous Automation is to provide software engineers with assistive technology that produces automated feedback and security fixes. This will &lt;strong&gt;reduce the time and cognitive load&lt;/strong&gt; needed to meet our new security requirements. Just like our security gates, we want to utilize technologies that operate where work is already being done to reduce context-switching. Below is our final process diagram, to which we have added the Continuous Automation column:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ueQPLKL9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://blog.gitguardian.com/content/images/2024/04/Vulnerability-Management-Lifecycle-in-DevSecOps--3-.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ueQPLKL9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://blog.gitguardian.com/content/images/2024/04/Vulnerability-Management-Lifecycle-in-DevSecOps--3-.png" alt="Continuous automation added" width="800" height="1215"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  IDE Plugins
&lt;/h3&gt;

&lt;p&gt;The IDE is as far “left” as we can get when providing vulnerability feedback – it’s where our software engineers are writing the code! By using security scanners as IDE plugins, we can get instant feedback in the linter and see issues in the code we are already working on. All sorts of security scans can take place in our IDEs: SAST, SCA, secrets, IaC, and more. These plugins also work the same whether or not the code has been committed.&lt;/p&gt;

&lt;h3&gt;
  
  
  Local container image scan
&lt;/h3&gt;

&lt;p&gt;Because container images aren’t built until the later stages of our software development pipeline, they don’t get stopped by a security gate until quite far into the process. The further “right” we are when an issue is discovered, the more time-consuming it is for us to fix it.&lt;/p&gt;

&lt;p&gt;To reduce the number of times we get stopped by a container image scan late in our security gates, we need to give our software engineers the ability to run the &lt;em&gt;same&lt;/em&gt; container image scans locally on their workstations in the earliest stages of development.&lt;/p&gt;

&lt;h3&gt;
  
  
  Automated Dependency Updates
&lt;/h3&gt;

&lt;p&gt;Out of everything in our DevSecOps pipeline, one of the most challenging things we have to deal with is the constant discovery of vulnerabilities in our dependencies. Our software can pass through all our security gates and reach deployment, only to have a critical vulnerability discovered in some component that our app uses the very next day. While our software engineers can prevent the introduction of many types of vulnerabilities, this will always be something that they have to deal with. This problem can be helped by using &lt;a href="https://www.chainguard.dev/unchained/minimal-container-images-towards-a-more-secure-future?ref=blog.gitguardian.com"&gt;&lt;u&gt;minimal container base images&lt;/u&gt;&lt;/a&gt; that have fewer packages, but we will always have some dependencies.&lt;/p&gt;

&lt;p&gt;To make this ongoing challenge easier and less time consuming, we must use tools that automate dependency updates for us. We can use SCA and Container Image scanners to identify vulnerable dependencies in our runtimes, container registries, and/or git branches. Robust dependency scanners will have the option to automatically create Pull Requests that update the vulnerable packages to a fixed version. Then, all we need to do is accept the fix and merge the PR.&lt;/p&gt;
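&lt;p&gt;The core of such an auto-fix is small: rewrite the vulnerable version pin and draft the PR metadata. The package names and versions below are made up for illustration:&lt;/p&gt;

```python
# Sketch of the "auto-fix PR" step: given a vulnerable pinned dependency
# and the first fixed version, rewrite the pin and draft a PR title.
# Package names and versions here are illustrative.
def bump_requirement(lines, package, fixed_version):
    out = []
    for line in lines:
        name = line.split("==")[0].strip()
        out.append(f"{package}=={fixed_version}" if name == package else line)
    return out

reqs = ["requests==2.19.0", "flask==2.3.2"]
updated = bump_requirement(reqs, "requests", "2.32.0")
pr_title = "chore(deps): bump requests to 2.32.0 (security fix)"
print(updated)
print(pr_title)
```

&lt;p&gt;A real dependency bot layers lockfile resolution, changelog links, and compatibility checks on top of this, but the merge-and-move-on workflow for the engineer is the same.&lt;/p&gt;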

&lt;h3&gt;
  
  
  Others
&lt;/h3&gt;

&lt;p&gt;As technology evolves, there may be new ways to automate security fixes. For example, security researcher &lt;a href="https://github.com/JLLeitschuh?ref=blog.gitguardian.com"&gt;&lt;u&gt;Jonathan Leitschuh&lt;/u&gt;&lt;/a&gt; has used &lt;a href="https://codeql.github.com/?ref=blog.gitguardian.com"&gt;&lt;u&gt;CodeQL&lt;/u&gt;&lt;/a&gt; and &lt;a href="https://github.com/openrewrite/rewrite?ref=blog.gitguardian.com"&gt;&lt;u&gt;OpenRewrite&lt;/u&gt;&lt;/a&gt; to automate vulnerability discovery and patches across many open-source repositories. There is also a chance that we can use future AI models to find and/or fix vulnerabilities in our code for us. Ultimately, our goal is to help developers work less while maintaining security. Any chance we get to do that will be a huge win for DevSecOps and the security of our software.&lt;/p&gt;

&lt;h2&gt;
  
  
  Reducing friction
&lt;/h2&gt;

&lt;p&gt;Now, we’ve covered the entire process of creating a secure-by-design software development pipeline. Before wrapping up, I’d like to provide some final advice from my experiences in implementing this architecture at my own company.&lt;/p&gt;

&lt;h3&gt;
  
  
  Importance of consistency
&lt;/h3&gt;

&lt;p&gt;Throughout this blog post and the last one, I’ve mentioned the importance of reducing context switching. Application security is a complex problem, even for those of us who dedicate our entire career to it. To simplify things for our non-security collaborators, &lt;strong&gt;the implementation of our DevSecOps architecture needs to be as consistent as possible&lt;/strong&gt;. There are two main ways to accomplish this.&lt;/p&gt;

&lt;p&gt;The first way to provide a consistent security experience for our software engineers is to &lt;strong&gt;ensure that our scan results are the same at every stage of the SDLC&lt;/strong&gt;. If we use a specific container image scanner in our Kubernetes admission controller, we need to make sure that our software engineers have access to the &lt;em&gt;same scan&lt;/em&gt; on their local workstation and build workflow. They need to be getting the &lt;em&gt;same information&lt;/em&gt; in the &lt;em&gt;same format&lt;/em&gt; so there are no surprises in the later stages of our software development pipeline.&lt;/p&gt;

&lt;p&gt;The other way to drive consistency in our DevSecOps implementation is to &lt;strong&gt;consolidate scanning tools where possible&lt;/strong&gt;. In our last blog post, we discussed how no one tool is the best at scanning for every type of vulnerability. But if we have the option of using the same provider for a few of our scans (e.g., SAST, SCA, and Container) rather than having a different tool for each one, it will reduce management overhead and provide a more consistent experience for our software engineers.&lt;/p&gt;

&lt;h3&gt;
  
  
  Order of operations
&lt;/h3&gt;

&lt;p&gt;Lastly, I wanted to provide some generalized guidance on what order we should go about implementing the various components in our process diagram. The first thing you should do is help your software engineers get familiar with the security tooling and vulnerability scans. Take the time to help them get their IDE plugins set up, show them how to address false positive findings, set expectations and timelines, configure automated dependency fixes, and &lt;strong&gt;gather feedback&lt;/strong&gt; about all the tools.&lt;/p&gt;

&lt;p&gt;Once your software engineers are familiar with the tools and findings, start implementing warnings in your security gates. Don’t block anything right away; just display warnings for the things that would cause blocks in the future. &lt;strong&gt;Gather feedback&lt;/strong&gt; about the warnings.&lt;/p&gt;

&lt;p&gt;Finally, meet with leaders on the software engineering side and agree upon severity thresholds that will be blocking based on the feedback from the warnings and the perceived workload increase. Maybe start with PR checks and work from there on the rest.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;There are a lot of technologies involved in the implementation of our secure-by-design SDLC. As we evaluate new code security tools, we should be trying to answer these questions to find the right fit for our DevSecOps architecture:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  How and where can we use this tool as a required security gate?&lt;/li&gt;
&lt;li&gt;  How can we automate fixes or feedback with this tool to protect our velocity while meeting security requirements?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the answer to either of these questions suggests that the tool won’t work well with our goals, it may be worth checking out alternatives to see if any can do these things.&lt;/p&gt;

&lt;p&gt;Implementing a secure-by-design SDLC is a complex challenge. But by leveraging technology and automation, we can guarantee that our digital products meet a minimum level of security without much drag on our ability to deliver. Here is &lt;a href="https://lucid.app/documents/view/429eb71f-a389-4a11-a545-b46f3ac5f2b5?ref=blog.gitguardian.com"&gt;&lt;u&gt;the template for the process diagram&lt;/u&gt;&lt;/a&gt; from this post, so you can use it to track your own progress.&lt;/p&gt;

&lt;p&gt;This concludes the second entry in this series on DevSecOps architecture. The next post will be the final entry, where we will be exploring how to protect the security and integrity of the systems involved in our SDLC.&lt;/p&gt;

</description>
      <category>devsecops</category>
      <category>security</category>
      <category>cybersecurity</category>
      <category>tooling</category>
    </item>
    <item>
      <title>Vulnerability Management Lifecycle in DevSecOps</title>
      <dc:creator>Thomas Segura</dc:creator>
      <pubDate>Thu, 25 Apr 2024 09:45:28 +0000</pubDate>
      <link>https://dev.to/gitguardian/vulnerability-management-lifecycle-in-devsecops-3opi</link>
      <guid>https://dev.to/gitguardian/vulnerability-management-lifecycle-in-devsecops-3opi</guid>
      <description>&lt;p&gt;This is the first blog post in a series that will take a deep dive into DevSecOps program architecture. The goal of this series is to provide a holistic overview of DevSecOps as a collection of technology-driven, automated processes. What are the tools and technologies that play a role in DevSecOps? How can we use technology to set software engineering teams up for success? How do we align roles and responsibilities to ensure cohesion, safety, and velocity? These are some of the questions that will be answered as we progress through the series.&lt;/p&gt;

&lt;p&gt;I’ve spent the last 3 years in my professional role building the DevSecOps program at my company from the ground up. There will be parts of this series that may not apply directly to your company, but the themes and overall approach may still be valuable to you. The following mission statement summarizes my approach to DevSecOps:&lt;/p&gt;

&lt;p&gt;“My job is to implement a secure-by-design software development process that empowers engineering teams to own the security for their own digital products. We will ensure success through controls and training, and we will reduce friction and maintain velocity through a technology-driven, automated DevSecOps architecture.”&lt;/p&gt;

&lt;h2&gt;
  
  
  Vulnerability management overview
&lt;/h2&gt;

&lt;p&gt;In this post, we will be examining how technology can support the “people and processes” side of DevSecOps. Controls and automation will increasingly play a role in this series, but we need to start by defining the roles and responsibilities of the humans involved.&lt;/p&gt;

&lt;p&gt;At its core, DevSecOps is about managing vulnerabilities within software products. Some of these vulnerabilities are introduced by the software engineers creating the product, and others come from third-party dependencies that teams have little control over. The work needed to remediate these vulnerabilities must compete with other work like new features and bug fixes. In this blog post, we will look at a technology-driven vulnerability management lifecycle that allows informed decisions to be made about work selection.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stages of vulnerability management
&lt;/h2&gt;

&lt;p&gt;The vulnerability management process can be broken down into 3 stages: Identification, Observability, and Management. Each stage is critical for the success of the next one. Below is a simplified diagram that we will add to as we go through this article. Technology plays a central role in each stage, but humans are an important part of this process as well.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://blog.gitguardian.com/content/images/2024/04/Vulnerability-Management-Lifecycle-in-DevSecOps-4.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.gitguardian.com%2Fcontent%2Fimages%2F2024%2F04%2FVulnerability-Management-Lifecycle-in-DevSecOps-4.png" alt="Stages"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The 3 stages of the vulnerability management process&lt;/p&gt;

&lt;h2&gt;
  
  
  Identification
&lt;/h2&gt;

&lt;p&gt;This probably goes without saying, but you can’t fix vulnerabilities that you aren’t aware of. Identifying vulnerabilities is a complex challenge, though. It’s not just a matter of scanning source code; there are many artifacts in the layers of a software product that require their own type of vulnerability scanner.&lt;/p&gt;

&lt;p&gt;The first stage of our diagram is a list of places in the software development lifecycle (SDLC) where vulnerabilities or misconfigurations can arise, as well as the types of vulnerability scanning tools that can identify security issues in these areas. Here’s a quick rundown of each area:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://blog.gitguardian.com/content/images/2024/04/Vulnerability-Management-Lifecycle-in-DevSecOps-1.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.gitguardian.com%2Fcontent%2Fimages%2F2024%2F04%2FVulnerability-Management-Lifecycle-in-DevSecOps-1.png" alt="Identification"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Version Control System
&lt;/h4&gt;

&lt;p&gt;GitHub is an example of a VCS, and it is often overlooked as an area that can create security risks. Things like &lt;a href="https://blog.gitguardian.com/github-actions-security-cheat-sheet/" rel="noopener noreferrer"&gt;&lt;u&gt;misconfigured actions&lt;/u&gt;&lt;/a&gt; and poor access control can allow attackers to gain read access to internal code or even the ability to inject malicious code.&lt;/p&gt;

&lt;h4&gt;
  
  
  Code
&lt;/h4&gt;

&lt;p&gt;In this context, “Code” refers to the custom code that was written by the software engineers at your own company. Vulnerability scanners such as SAST and DAST (Static/Dynamic Application Security Testing) can identify various vulnerabilities that were accidentally created in the code. But &lt;a href="https://blog.gitguardian.com/why-sast-dast-cant-be-enough/" rel="noopener noreferrer"&gt;&lt;u&gt;they can’t catch everything&lt;/u&gt;&lt;/a&gt;, which is why we have other specialized tools for vulnerability identification.&lt;/p&gt;

&lt;h4&gt;
  
  
  Secrets
&lt;/h4&gt;

&lt;p&gt;Finding and preventing leaked secrets is what GitGuardian is all about. Hackers and red teamers regularly find secrets in plain text that allow them to elevate their access. Identifying leaks and prioritizing their cleanup is a critical piece of DevSecOps.&lt;/p&gt;

&lt;h4&gt;
  
  
  Dependencies/SBOM
&lt;/h4&gt;

&lt;p&gt;SBOM stands for Software Bill of Materials, and it refers to a document containing all the third-party dependencies in your code. Why would you want to track software dependencies? New vulnerabilities emerge every day in open-source code. If your software uses a vulnerable dependency, it may also be exploitable. Software Composition Analysis (SCA) tools identify vulnerabilities in the dependencies you use.&lt;/p&gt;
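&lt;p&gt;Conceptually, an SCA check is a lookup of each SBOM component against an advisory database. A toy version of that lookup might look like this; the advisory entry and package name are fabricated, and real SCA tools resolve full version ranges and transitive dependencies:&lt;/p&gt;

```python
# Toy SCA check: match an SBOM's pinned components against a tiny,
# made-up advisory list. Real tools handle version ranges, transitive
# dependencies, and ecosystem-specific metadata.
ADVISORIES = {
    # package -> [(affected version, advisory id)] -- illustrative only
    "libexample": [("1.2.3", "ADV-0001")],
}

def vulnerable_components(sbom):
    hits = []
    for comp in sbom["components"]:
        for version, advisory in ADVISORIES.get(comp["name"], []):
            if comp["version"] == version:
                hits.append((comp["name"], advisory))
    return hits

sbom = {"components": [{"name": "libexample", "version": "1.2.3"}]}
print(vulnerable_components(sbom))  # [('libexample', 'ADV-0001')]
```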

&lt;h4&gt;
  
  
  IaC
&lt;/h4&gt;

&lt;p&gt;Infrastructure-as-Code (IaC) refers to deployment code such as Terraform or Kubernetes that declaratively defines how your deployment will be configured. IaC scanners identify security misconfigurations and unsafe exposure in your planned deployments.&lt;/p&gt;

&lt;h4&gt;
  
  
  Containers
&lt;/h4&gt;

&lt;p&gt;Containers add another layer of dependencies to your software, because they provide all the operating system programs that your code needs to run. These added dependencies can also introduce security issues into the runtime of your application. Container image scanners help you identify these kinds of issues.&lt;/p&gt;

&lt;h4&gt;
  
  
  Deployment
&lt;/h4&gt;

&lt;p&gt;Application deployments are the final area that we will cover. If you run your app in the cloud, you can use Cloud-Native Application Protection Platform (CNAPP) or Cloud Security Posture Management (CSPM) tools to identify misconfigurations and known vulnerabilities in your deployed application. Pentesting your live applications is another way to find previously undiscovered vulnerabilities. Human-operated security testing is the only way to find some kinds of vulnerabilities, such as business logic flaws.&lt;/p&gt;

&lt;p&gt;This list may not be exhaustive, and new threats and tools are emerging all the time. But this highlights the plethora of vulnerability types that we’re dealing with. The complexity of security scanners is the &lt;strong&gt;first major challenge of vulnerability management&lt;/strong&gt;. There are many things that need to be evaluated for security issues, and no single tool can do it all. If you have coverage in all these areas, you will likely end up with a mix of sources that identify vulnerabilities. This can get messy, which is what leads us to the next stage of our process.&lt;/p&gt;

&lt;h2&gt;
  
  
  Observability
&lt;/h2&gt;

&lt;p&gt;Once we have identified vulnerabilities, we need to make sense of the noise. The observability stage is where the magic happens in our process diagram. In this stage, we are translating the technical security issues into higher-level metrics and scores to inform our decision-makers who are not security experts.&lt;/p&gt;

&lt;p&gt;Before we get into the diagram, I want to take a moment to rant about my biggest pet peeve in security. “Critical” has lost all meaning thanks to poor moderation by vendors of security tools. Security scanners aren’t perfect, and there will always be some number of “false positive” findings because of the wide net that they cast. But I think the word “critical” is used so often that we no longer have a term that invokes a swift response without interpretation and guidance from a security professional. If we did have a more urgent term than “critical” it would probably become useless to us too.&lt;/p&gt;

&lt;p&gt;This leads us to the &lt;strong&gt;second major challenge of vulnerability management&lt;/strong&gt;: translating risk for non-security professionals and prioritizing vulnerability findings. The observability stage of our vulnerability management process is where we attempt to do that translation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://blog.gitguardian.com/content/images/2024/04/Vulnerability-Management-Lifecycle-in-DevSecOps-2.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.gitguardian.com%2Fcontent%2Fimages%2F2024%2F04%2FVulnerability-Management-Lifecycle-in-DevSecOps-2.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Observability stage&lt;/p&gt;

&lt;p&gt;On the right side of our diagram, we see that observability mainly happens in the places where our software engineers do their work. By taking advantage of features like Pull Request checks and IDE plugins, we are surfacing vulnerabilities &lt;em&gt;where&lt;/em&gt; they are relevant. &lt;strong&gt;Minimizing context switching&lt;/strong&gt; is a key focus in DevSecOps architecture that aims to reduce friction for those involved.&lt;/p&gt;

&lt;p&gt;On the management-focused side, there is a bit more going on. The Product Owners (POs) or Managers are the ones that make the decisions about what gets worked on. For these people to make informed decisions, we need to present existing vulnerabilities in a way that is digestible for someone who is not a security expert.&lt;/p&gt;

&lt;p&gt;Informing POs and managers about vulnerabilities can be as simple as granting them access to the dashboards of our vulnerability scanning tools. All security scanners have their own severity scoring system that attempts to rank the findings, but some tools are better than others. GitGuardian allows you to tune the severities yourself, which is a great feature.&lt;/p&gt;

&lt;p&gt;Using individual tools to inform our less technical decision-makers isn’t ideal, though. Decision-makers don’t want to go to a bunch of different dashboards and learn how each tool works. They want a single place they can go to get a clear view of the security risk in the digital products they are responsible for.&lt;/p&gt;

&lt;p&gt;To set our decision-makers up for success, we need to distill the multitude of vulnerabilities into a prioritized list that is &lt;em&gt;relevant to our business&lt;/em&gt;. A tool category that aims to do this is Application Security Posture Management (ASPM).&lt;/p&gt;

&lt;p&gt;The term “ASPM” is relatively new. &lt;a href="https://www.gartner.com/en/documents/4326999?ref=blog.gitguardian.com" rel="noopener noreferrer"&gt;&lt;u&gt;Gartner defined it in May 2023&lt;/u&gt;&lt;/a&gt; as a solution that “continuously manages application risk through collection, analysis, and prioritization of security issues across the software life cycle.” The main idea is that you ingest the findings from your vulnerability scanning tools into an ASPM, and it does the prioritization and metrics for you. If you want to read more about ASPM, check out &lt;a href="https://blog.gitguardian.com/good-application-security-posture-requires-good-data/" rel="noopener noreferrer"&gt;&lt;u&gt;this article&lt;/u&gt;&lt;/a&gt; from GitGuardian.&lt;/p&gt;

&lt;p&gt;I’m not saying you &lt;em&gt;need&lt;/em&gt; an ASPM for success, but you will need many of the capabilities that an ASPM provides. For the sake of our process diagram, “ASPM” can refer to a process or solution that provides the following benefits for the observability of security issues:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  “Single pane of glass” overview of your risk across multiple types of vulnerabilities&lt;/li&gt;
&lt;li&gt;  Hierarchical grouping of projects, teams, product groups, etc.&lt;/li&gt;
&lt;li&gt;  Context-based prioritization based on public exposure, known exploitation, business criticality, commit frequency, etc.&lt;/li&gt;
&lt;li&gt;  Deduplication of vulnerabilities across multiple tools&lt;/li&gt;
&lt;li&gt;  Generation of SLIs such as time-to-remediation, risk scores, etc.&lt;/li&gt;
&lt;li&gt;  Team or developer-based metrics to identify training needs&lt;/li&gt;
&lt;li&gt;  Corporate memory (who has fixed similar vulnerabilities that can help)&lt;/li&gt;
&lt;/ul&gt;
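&lt;p&gt;The context-based prioritization bullet above can be sketched as a scoring function; the weights and contextual factors below are illustrative, not a standard formula:&lt;/p&gt;

```python
# Toy context-based prioritization in the spirit of an ASPM: combine
# base severity with business context into one sortable risk score.
# The weights are made up for illustration.
def risk_score(finding):
    base = {"low": 1, "medium": 4, "high": 7, "critical": 9}[finding["severity"]]
    multiplier = 1.0
    if finding.get("internet_facing"):
        multiplier += 0.5
    if finding.get("known_exploited"):
        multiplier += 1.0
    if finding.get("business_critical"):
        multiplier += 0.5
    return base * multiplier

findings = [
    {"id": "A", "severity": "critical"},  # internal, no known exploit
    {"id": "B", "severity": "high", "internet_facing": True,
     "known_exploited": True},
]
ranked = sorted(findings, key=risk_score, reverse=True)
print([f["id"] for f in ranked])  # ['B', 'A']
```

&lt;p&gt;Notice how the known-exploited, internet-facing “high” outranks the quiet “critical” – exactly the kind of reordering that raw scanner severities can’t give you.&lt;/p&gt;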

&lt;p&gt;You may be able to build your own solution for some of these features, or your existing tools might get you close enough. You could also hire someone whose job it is to organize and track vulnerability findings. In the end, the &lt;strong&gt;most important outcomes&lt;/strong&gt; of the observability stage are the &lt;strong&gt;metrics&lt;/strong&gt; and &lt;strong&gt;prioritization&lt;/strong&gt; of your open vulnerability findings. If your vulnerability management program feels like a mess of findings, you’re probably lacking some of the organizational capabilities listed above.&lt;/p&gt;

&lt;p&gt;Lastly, as you tune the prioritization or scoring for your observability layer, remember the key warning from this section: &lt;strong&gt;if everything is critical, nothing is&lt;/strong&gt;. Do your best to present the risk in a way that accurately reflects the likelihood and business impact of the vulnerabilities. In the next section, we will cover what it looks like for your product owners or business leaders to make decisions about vulnerability risk and remediation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Management
&lt;/h2&gt;

&lt;p&gt;Once the team leaders have access to high-fidelity data on their team’s security risk, they can make informed decisions about work selection, risk management, and training opportunities. Below is the last stage of our process diagram.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://blog.gitguardian.com/content/images/2024/04/Vulnerability-Management-Lifecycle-in-DevSecOps-3-1.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.gitguardian.com%2Fcontent%2Fimages%2F2024%2F04%2FVulnerability-Management-Lifecycle-in-DevSecOps-3-1.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Management stage&lt;/p&gt;

&lt;p&gt;Most teams use Jira or another ticketing system to plan and track work. Product owners or managers can use these systems to create issues for vulnerabilities that need to be remediated. If the observability stage has been successful in prioritizing risks, team leaders should be equipped to make informed decisions about this work themselves.&lt;/p&gt;

&lt;p&gt;If more guidance is desired, Service Level Objectives (SLOs) are another way to think about work selection. Examples of SLOs could be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Introducing no more than 1 preventable vulnerability per sprint cycle&lt;/li&gt;
&lt;li&gt;  Remediating critical vulnerabilities within 7 days&lt;/li&gt;
&lt;li&gt;  Remediating high vulnerabilities within 4 sprint cycles&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are just made-up examples. Security, product teams, and decision-makers should collaborate on the creation of SLOs to find a balance that works best for the business. Additionally, the security team will continue to play an important role in auditing open vulnerabilities for imminent threats.&lt;/p&gt;
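&lt;p&gt;If you do adopt SLOs like these, tracking them is straightforward. This sketch mirrors the made-up thresholds above, assuming two-week sprints for the four-sprint window:&lt;/p&gt;

```python
from datetime import date, timedelta

# Toy SLO tracker for the example objectives above. The day limits
# mirror those made-up numbers (4 two-week sprints ~= 56 days).
SLO_DAYS = {"critical": 7, "high": 56}

def slo_breached(finding, today):
    limit = SLO_DAYS.get(finding["severity"])
    if limit is None:
        return False  # no SLO defined for this severity
    return today > finding["opened"] + timedelta(days=limit)

f = {"severity": "critical", "opened": date(2024, 6, 1)}
print(slo_breached(f, date(2024, 6, 5)))   # False: within 7 days
print(slo_breached(f, date(2024, 6, 12)))  # True: 11 days open
```

&lt;p&gt;A report of breached SLOs per team is a concrete, non-alarmist artifact to bring to the escalation conversations described below.&lt;/p&gt;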

&lt;p&gt;There may be times when an SLO cannot be met. In those cases, the security risk needs to be escalated to higher-level decision-makers along with the current workload of the product team. Security issues must fight for developer time with new features, bug fixes, and tech debt. Sometimes, business leaders may decide to accept the risk of a security issue or delay the fix because they believe it’s in &lt;strong&gt;the best interest of the company&lt;/strong&gt;. Other times, a critical security issue may need to delay the launch of a new feature, and timelines need to be shifted.&lt;/p&gt;

&lt;p&gt;The last piece of vulnerability management is an often-overlooked area: security training for non-security personnel. Insights from our observability stage can highlight which types of vulnerabilities a developer or team is struggling with. These insights help security teams and leaders identify training needs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The most important part of security training is to create a positive learning culture&lt;/strong&gt;. Security training shouldn’t feel like a punishment. If it does, then the implementation needs some work. A great example of security training is to have a regular security segment in a meeting dedicated to sharing and learning. The security topics covered should prioritize the needs of the audience and be presented in a way that is positive, not pointing fingers.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final process diagram
&lt;/h2&gt;

&lt;p&gt;Now that we’ve covered each stage of vulnerability management, this is the final process diagram. Here’s a link to a &lt;a href="https://lucid.app/documents/view/cd7405fe-6fb2-4990-93ef-1b6eb59d95d4?ref=blog.gitguardian.com" rel="noopener noreferrer"&gt;&lt;u&gt;copyable version on Lucidchart&lt;/u&gt;&lt;/a&gt; that you can use to track your progress. You can color-code areas to mark them as having full, partial, or no coverage.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://blog.gitguardian.com/content/images/2024/04/Vulnerability-Management-Lifecycle-in-DevSecOps--1-.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.gitguardian.com%2Fcontent%2Fimages%2F2024%2F04%2FVulnerability-Management-Lifecycle-in-DevSecOps--1-.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Final process diagram&lt;/p&gt;

&lt;h2&gt;
  
  
  Roles and responsibilities
&lt;/h2&gt;

&lt;p&gt;We’ve covered the whole process diagram, but we still need to define roles and responsibilities. The following section is my suggested role structure to support the success of the vulnerability management process that I’ve laid out in this article.&lt;/p&gt;

&lt;h2&gt;
  
  
  Software Engineers and Product Owners
&lt;/h2&gt;

&lt;p&gt;Product teams own the security work for the business’s software products. They are responsible for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Being aware of the existing vulnerabilities in their team's projects&lt;/li&gt;
&lt;li&gt;  Managing and performing the work related to resolving vulnerabilities&lt;/li&gt;
&lt;li&gt;  Adhering to security-related SLOs, if any&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Managers, Directors, Tech Executives
&lt;/h2&gt;

&lt;p&gt;Higher-level decision-makers own or escalate the security risk for the business’s software products. They are responsible for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Being aware of critical severity vulnerabilities in the business’s digital products&lt;/li&gt;
&lt;li&gt;  Accepting or escalating risk when prioritizing other work over vulnerability remediation&lt;/li&gt;
&lt;li&gt;  Working with the business when prioritizing vulnerability remediation over other types of work&lt;/li&gt;
&lt;li&gt;  Auditing security-related SLOs, if any&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Security Team
&lt;/h2&gt;

&lt;p&gt;The software security team owns the work that supports the success of the vulnerability management process. They are responsible for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Managing technical tools for vulnerability identification and prevention&lt;/li&gt;
&lt;li&gt;  Providing conceptual and tool-based training for software engineers&lt;/li&gt;
&lt;li&gt;  Managing technical tools for vulnerability observability&lt;/li&gt;
&lt;li&gt;  Auditing open vulnerabilities for imminent threats&lt;/li&gt;
&lt;li&gt;  Consulting for individual vulnerabilities as needed&lt;/li&gt;
&lt;li&gt;  Human-operated penetration testing for additional vulnerability identification&lt;/li&gt;
&lt;li&gt;  Working with other roles to establish security-related SLOs&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;We’ve covered a lot in this blog post, so let’s do a quick recap. This blog post showed how technology can support the “people and processes” side of the vulnerability management lifecycle. In the context of DevSecOps, our vulnerability management process includes 3 major stages: identification, observability, and management. In each stage, a considerate implementation of the technologies we covered is critical for setting ourselves up for success. In the end, our goal is to empower digital product teams to make informed decisions about how much work needs to be dedicated to remediating security risks.&lt;/p&gt;

&lt;p&gt;After reading this blog post, I hope you’ve got a solid idea of how to approach DevSecOps from the vulnerability management side of software development. The next blog post in this series will cover the software engineering side of things, where we will look at the implementation of controls and automation to create a secure-by-design software development pipeline.&lt;/p&gt;

</description>
      <category>devsecops</category>
      <category>security</category>
      <category>cybersecurity</category>
      <category>tooling</category>
    </item>
    <item>
      <title>The Open-Source Backdoor That Almost Compromised SSH</title>
      <dc:creator>Thomas Segura</dc:creator>
      <pubDate>Wed, 24 Apr 2024 15:45:38 +0000</pubDate>
      <link>https://dev.to/gitguardian/the-open-source-backdoor-that-almost-compromised-ssh-2lcg</link>
      <guid>https://dev.to/gitguardian/the-open-source-backdoor-that-almost-compromised-ssh-2lcg</guid>
      <description>&lt;p&gt;Researchers discovered a backdoor in the compression tools &lt;code&gt;xz Utils&lt;/code&gt; versions 5.6.0 and 5.6.1, targeting SSH authentication in several Linux distributions. Although not affecting production releases,the early detection prevented a potential catastrophe, underscoring the importance of vigilant open-source software monitoring.&lt;/p&gt;

&lt;p&gt;Here is a technical overview of the supply chain attack attempt by Thomas Roccia:&lt;br&gt;
&lt;iframe class="tweet-embed" id="tweet-1774342248437813525-527" src="https://platform.twitter.com/embed/Tweet.html?id=1774342248437813525"&gt;
&lt;/iframe&gt;

&lt;/p&gt;

&lt;h2&gt;
  
  
  Incident Recap
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt; A malicious backdoor was found in &lt;code&gt;xz Utils&lt;/code&gt; versions 5.6.0 and 5.6.1.&lt;/li&gt;
&lt;li&gt; The backdoor was designed to interfere with SSH public key verification, potentially allowing remote code execution.&lt;/li&gt;
&lt;li&gt; This vulnerability targeted SSH connections in Linux distributions such as Red Hat, Debian, and Arch Linux.&lt;/li&gt;
&lt;li&gt; Homebrew, a package manager for macOS, also incorporated the compromised version but has since reverted to an unaffected version.&lt;/li&gt;
&lt;li&gt; Early detection prevented the backdoored versions from being widely deployed, averting widespread impact.&lt;/li&gt;
&lt;/ol&gt;
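&lt;p&gt;For a quick local check, a shell snippet along these lines can tell you whether an installed &lt;code&gt;xz&lt;/code&gt; belongs to one of the backdoored releases. The &lt;code&gt;is_affected_xz&lt;/code&gt; helper is ours, and the version parsing assumes the usual &lt;code&gt;xz (XZ Utils) X.Y.Z&lt;/code&gt; banner printed by &lt;code&gt;xz --version&lt;/code&gt;:&lt;/p&gt;

```shell
# Returns success (0) only for the two known-backdoored releases.
is_affected_xz() {
  case "$1" in
    5.6.0|5.6.1) return 0 ;;
    *)           return 1 ;;
  esac
}

# Check the locally installed xz, if any. The last field of the first
# line of `xz --version` is the version number on typical builds.
if command -v xz >/dev/null 2>&1; then
  ver=$(xz --version | head -n1 | awk '{print $NF}')
  if is_affected_xz "$ver"; then
    echo "WARNING: xz $ver is a backdoored release, update or downgrade now"
  else
    echo "xz $ver is not one of the known backdoored releases"
  fi
fi
```

&lt;p&gt;On distributions that shipped the compromised versions in testing channels, the fix was to roll back to an unaffected release via the package manager.&lt;/p&gt;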

&lt;h2&gt;
  
  
  Long Story
&lt;/h2&gt;

&lt;p&gt;In a frankly alarming discovery, researchers have uncovered a malicious backdoor deeply embedded within &lt;code&gt;xz Utils&lt;/code&gt;, a widely utilized Linux compression tool. The discovery was particularly concerning given the broad usage of &lt;code&gt;xz Utils&lt;/code&gt; across multiple Linux distributions, including heavyweights like Red Hat and Debian. Even more distressing was the fact that these compromised versions had made their way into beta releases of these distributions, specifically targeting software that manages SSH connections – a backbone of secure remote access.&lt;/p&gt;

&lt;p&gt;This backdoor was ingeniously masked and went unnoticed for over a month, posing a critical risk to SSH's encrypted authentication process. &lt;strong&gt;Had a vigilant developer not discovered it early, the ramifications could have been dire, potentially allowing unauthorized access or remote code execution on countless systems worldwide&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The malicious code was sneakily introduced through updates under the guise of standard improvements. It was later revealed that these changes aimed at tampering with SSH functions, thereby jeopardizing secure remote access. The perpetrator behind these changes was a prominent developer within the project, raising questions about security and trust within the open-source community. This chain of events underscores not just the ever-present risks in software development and distribution but also the importance of early detection and rigorous vetting in the open-source ecosystem.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"As systems grow more interconnected, the impact of a single compromised tool can be exponential. The xz Utils case highlights the importance of scrutinizing even seemingly benign updates for potential security implications." said Mackenzie Jackson, Security Advocate.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Thankfully, the quick response from various stakeholders, including the immediate rollback of vulnerable versions and rigorous patching efforts, helped mitigate the potential damage.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lesson Learned
&lt;/h2&gt;

&lt;p&gt;This incident vividly highlights the critical need for vigilant monitoring of software dependencies. While invaluable, open-source software can also serve as a conduit for significant security threats when not properly scrutinized. In this case, the backdoor targeted SSH, a cornerstone of secure system management and operations.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Early detection and constant vigilance are the cornerstones of modern cybersecurity. This incident is a testament to that principle. Trust but verify should be the mantra in open-source software development." said Eric Fourrier, GitGuardian co-founder and CEO.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;GitGuardian's Software Composition Analysis (SCA) tool exemplifies how advanced monitoring and early detection can serve as a critical defense against such threats. By scanning dependencies for vulnerabilities and offering actionable remediation guidance, GitGuardian SCA enables a proactive approach to software security.&lt;/p&gt;

&lt;h2&gt;
  
  
  Recommendations
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;  Regularly update and patch software from trusted sources.&lt;/li&gt;
&lt;li&gt;  Employ continuous monitoring solutions to detect and address vulnerabilities promptly.&lt;/li&gt;
&lt;li&gt;  Review and vet contributions to critical software components, especially those with extensive access or privileges.&lt;/li&gt;
&lt;li&gt;  Integrate security early in the software development lifecycle to detect and mitigate threats preemptively.&lt;/li&gt;
&lt;li&gt;  Foster a culture of openness and prompt communication within development communities.&lt;/li&gt;
&lt;li&gt;  Prioritize security when selecting and updating software dependencies.&lt;/li&gt;
&lt;li&gt;  Engage with the open-source community for collective defense against emerging threats and vulnerabilities.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For a closer look at software supply chain attacks, exactly what they are, and six steps you can follow to protect your supply chain and limit the impact of an attack, see &lt;a href="https://blog.gitguardian.com/supply-chain-attack-6-steps-to-harden-your-supply-chain/"&gt;Supply Chain Attacks: 6 Steps to protect your software supply chain&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>security</category>
      <category>infosec</category>
      <category>opensource</category>
      <category>github</category>
    </item>
    <item>
      <title>The State of Secrets Sprawl 2024</title>
      <dc:creator>Thomas Segura</dc:creator>
      <pubDate>Wed, 10 Apr 2024 13:47:28 +0000</pubDate>
      <link>https://dev.to/gitguardian/the-state-of-secrets-sprawl-2024-bc2</link>
      <guid>https://dev.to/gitguardian/the-state-of-secrets-sprawl-2024-bc2</guid>
      <description>&lt;p&gt;We're excited to share the State of Secrets Sprawl 2024 report, GitGuardian's annual deep dive into the secrets exposed on public GitHub repositories. This year's findings issue a stark reminder of the escalating challenge we face, with a staggering 12.8 million new secrets leaked in 2023—a 28% increase from the previous year.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.gitguardian.com/files/the-state-of-secrets-sprawl-report-2024"&gt;Read the full report&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--2_48YBuM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://blog.gitguardian.com/content/images/2024/03/image-1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--2_48YBuM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://blog.gitguardian.com/content/images/2024/03/image-1.png" alt="New secrets detected per year on GitHub" width="800" height="551"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Secrets Sprawl on GitHub
&lt;/h2&gt;

&lt;p&gt;Our research, now more comprehensive than ever, shows the rapid growth of exposed secrets since our first report in 2021, quadrupling in number. With GitHub's repository count growing by 50 million in just the last year, the risk of both accidental and deliberate secret exposures skyrockets.&lt;/p&gt;

&lt;p&gt;In 2023 alone, GitGuardian's vigilance over 1.1 billion new commits revealed that secret sprawl is not just widespread but deepening, affecting a vast range of industries from IT to Education, Retail, and Finance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;7 commits out of 1,000 exposed at least one secret;&lt;/li&gt;
&lt;li&gt;4.6% of active repositories leaked a secret;&lt;/li&gt;
&lt;li&gt;11.7% of authors who contributed leaked a secret.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  GenAI Secrets Leaks
&lt;/h2&gt;

&lt;p&gt;A particularly alarming trend is the 1212x surge in OpenAI API key leaks, spotlighting the growing allure of AI services among developers—and the risks that accompany their popularity.&lt;/p&gt;

&lt;p&gt;Looking at other AI services, another trend emerged: the slow but steady rise of open-source AI:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--BoP9YgZV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://blog.gitguardian.com/content/images/2024/03/image-2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--BoP9YgZV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://blog.gitguardian.com/content/images/2024/03/image-2.png" alt="The slow but steady rise of open-source AI" width="800" height="482"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  A Hidden Threat: Zombie Leaks
&lt;/h2&gt;

&lt;p&gt;An alarming revelation from our report is the persistence of "zombie leaks," with over 90% of exposed secrets remaining active five days post-leakage.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ftrWsVrb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://blog.gitguardian.com/content/images/2024/03/image-3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ftrWsVrb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://blog.gitguardian.com/content/images/2024/03/image-3.png" alt="Validity rate over time" width="800" height="453"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This negligence, often resulting from deleting leaky commits or privatizing repositories without revoking the exposed secrets, creates a gaping security vulnerability.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Developers erasing leaky commits or repositories instead of revoking are creating a major security risk for companies, which will remain vulnerable to threat actors mirroring public GitHub activity for as long as the credential remains valid. These zombie leaks are the worst,” said Eric Fourrier, CEO and Founder of GitGuardian.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  And More...
&lt;/h2&gt;

&lt;p&gt;Our investigation also delves into:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The most sensitive file types on GitHub&lt;/li&gt;
&lt;li&gt;The fastest-remediated secrets&lt;/li&gt;
&lt;li&gt;The use of DMCA notices to stop leaks&lt;/li&gt;
&lt;li&gt;The potential of Large Language Models (LLMs) in secret detection&lt;/li&gt;
&lt;li&gt;The intersection of private and public secret leaks&lt;/li&gt;
&lt;li&gt;Secrets exposure within Python's official package management system, PyPI&lt;/li&gt;
&lt;li&gt;Strategies to combat secrets sprawl&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://www.gitguardian.com/files/the-state-of-secrets-sprawl-report-2024"&gt;Read the full report&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Stay tuned for more updates from GitGuardian as we continue to monitor the ever-changing threat landscape and provide the latest insights and recommendations to help you stay ahead of the curve.&lt;/p&gt;

</description>
      <category>github</category>
      <category>security</category>
      <category>devops</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Understanding the Risks of Long-Lived Kubernetes Service Account Tokens</title>
      <dc:creator>Thomas Segura</dc:creator>
      <pubDate>Mon, 29 Jan 2024 09:20:54 +0000</pubDate>
      <link>https://dev.to/gitguardian/understanding-the-risks-of-long-lived-kubernetes-service-account-tokens-dok</link>
      <guid>https://dev.to/gitguardian/understanding-the-risks-of-long-lived-kubernetes-service-account-tokens-dok</guid>
      <description>&lt;p&gt;The popularity of Kubernetes (K8s) as the defacto orchestration platform for the cloud is not showing any sign of pause. This graph, taken from the 2023 Kubernetes Security Report by the security company Wiz, clearly illustrates the trend:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://blog.gitguardian.com/content/images/2024/01/kubernetes-adoption-continues-to-soar.png"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--CvUEZJ0j--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://blog.gitguardian.com/content/images/2024/01/kubernetes-adoption-continues-to-soar.png" alt="Public K8s API server endpoints worldwide" width="800" height="454"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As adoption continues to soar, so do the security risks and, most importantly, the attacks threatening K8s clusters. One such threat comes in the form of long-lived service account tokens. In this blog, we are going to dive deep into what these tokens are, their uses, the risks they pose, and how they can be exploited. We will also advocate for the use of short-lived tokens for a better security posture.&lt;/p&gt;

&lt;p&gt;Service account tokens are &lt;strong&gt;bearer tokens&lt;/strong&gt; (a type of token mostly used for authentication in web applications and APIs) used by service accounts to authenticate to the Kubernetes API. Service accounts provide an identity for processes (applications) that run in a Pod, enabling them to interact with the Kubernetes API securely. &lt;/p&gt;

&lt;p&gt;Crucially, these tokens are &lt;strong&gt;long-lived&lt;/strong&gt;: when a service account is created, Kubernetes automatically generates a token and stores it indefinitely as a Secret, which can be mounted into pods and used by applications to authenticate API requests.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Note: in more recent versions, including Kubernetes v1.29, API credentials are obtained directly by using the &lt;a href="https://kubernetes.io/docs/reference/kubernetes-api/authentication-resources/token-request-v1/?ref=blog.gitguardian.com"&gt;TokenRequest&lt;/a&gt; API, and are mounted into Pods using a &lt;a href="https://kubernetes.io/docs/reference/access-authn-authz/service-accounts-admin/?ref=blog.gitguardian.com#bound-service-account-token-volume"&gt;projected volume&lt;/a&gt;. The tokens obtained using this method have bounded lifetimes, and are automatically invalidated when the Pod they are mounted into is deleted.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;As a reminder, the &lt;strong&gt;Kubelet&lt;/strong&gt; on each node is responsible for mounting service account tokens into pods, so they can be used by applications within those pods to authenticate to the Kubernetes API when needed:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://blog.gitguardian.com/content/images/2024/01/21W27-Blog-Illu-DevSecOps-SDLC-image4--1-.png"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--HPzJHX44--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://blog.gitguardian.com/content/images/2024/01/21W27-Blog-Illu-DevSecOps-SDLC-image4--1-.png" alt="Topography of a K8s cluster" width="800" height="468"&gt;&lt;/a&gt;&lt;br&gt;
If you need a refresher on K8s components, look &lt;a href="https://blog.gitguardian.com/hardening-your-k8-pt-1/#:~:text=right%20into%20it.-,1.%20Kubernetes%20Components,-First%2C%20let%27s%20do"&gt;&lt;u&gt;here&lt;/u&gt;&lt;/a&gt;.&lt;/p&gt;
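&lt;p&gt;Concretely, an application inside a pod finds the mounted credentials at a well-known path and presents the token as a bearer credential to the API server. This is a minimal sketch that only attempts the request when a token is actually mounted:&lt;/p&gt;

```shell
# Default path where the kubelet mounts service account credentials;
# "kubernetes.default.svc" resolves to the API server from inside a cluster.
SA_DIR=/var/run/secrets/kubernetes.io/serviceaccount

if [ -f "$SA_DIR/token" ]; then
  TOKEN=$(cat "$SA_DIR/token")
  NS=$(cat "$SA_DIR/namespace")
  # Present the token as a bearer credential, trusting the cluster CA:
  curl --cacert "$SA_DIR/ca.crt" \
       --header "Authorization: Bearer $TOKEN" \
       "https://kubernetes.default.svc/api/v1/namespaces/$NS/pods"
else
  echo "No service account token mounted at $SA_DIR"
fi
```

&lt;p&gt;Anything that can read that file, a co-located container, a log pipeline, or an attacker with a shell in the pod, can make the same call, which is exactly why token longevity matters.&lt;/p&gt;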
&lt;h2&gt;
  
  
  &lt;strong&gt;The Utility of Service Account Tokens&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Service account tokens are essential for enabling applications running on Kubernetes to interact with the Kubernetes API. They are used to deploy applications, manage workloads, and perform administrative tasks programmatically. For instance, a Continuous Integration/Continuous Deployment (CI/CD) tool like Jenkins would use a service account token to deploy new versions of an application or roll back a release.&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;strong&gt;The Risks of Longevity&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;While service account tokens are indispensable for automation within Kubernetes, their longevity can be a significant risk factor. Long-lived tokens, if compromised, give attackers ample time to explore and exploit a cluster. Once in the hands of an attacker, these tokens can be used to gain unauthorized access, elevate privileges, exfiltrate data, or even disrupt the entire cluster's operations.&lt;/p&gt;

&lt;p&gt;Here are a few leak scenarios that could lead to some serious damage:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Misconfigured Access Rights&lt;/strong&gt;: A pod or container may be misconfigured to have broader file system access than necessary. If a token is stored on a shared volume, other containers or compromised pods could potentially access it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Insecure Transmission&lt;/strong&gt;: If the token is transmitted over the network without proper encryption (for example, over HTTP instead of HTTPS), it could be intercepted by network sniffing tools.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Code Repositories&lt;/strong&gt;: Developers might inadvertently commit a token to a source code repository. If the repository is public or becomes exposed, the token is readily available to anyone who accesses it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Logging and Monitoring Systems&lt;/strong&gt;: Tokens might get logged by applications or monitoring systems and could be exposed if logs are not properly secured or if verbose logging is accidentally enabled.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Insider Threat&lt;/strong&gt;: A malicious insider with access to the Kubernetes environment could extract the token and use or leak it intentionally.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Application Vulnerabilities&lt;/strong&gt;: If an application running within the cluster has a vulnerability (e.g., a Remote Code Execution flaw), an attacker could exploit it to gain access to the pod and extract the token.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  &lt;strong&gt;How Could an Attacker Exploit Long-Lived Tokens?&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Attackers can collect long-lived tokens through network eavesdropping, exploiting vulnerable applications, or leveraging social engineering tactics. With these tokens, they can manipulate Kubernetes resources at will. Here is a non-exhaustive list of potential abuses:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Abuse the cluster's (often barely limited) infrastructure resources for &lt;strong&gt;cryptocurrency mining&lt;/strong&gt; or as part of a botnet.&lt;/li&gt;
&lt;li&gt;With API access, &lt;strong&gt;deploy malicious containers&lt;/strong&gt;, alter running workloads, exfiltrate sensitive data, or even take down the entire cluster.&lt;/li&gt;
&lt;li&gt;If the token has broad permissions, modify roles and bindings to &lt;strong&gt;elevate privileges&lt;/strong&gt; within the cluster.&lt;/li&gt;
&lt;li&gt;Create additional resources that provide persistent access (a &lt;strong&gt;backdoor&lt;/strong&gt;) to the cluster, making the attacker's presence harder to remove.&lt;/li&gt;
&lt;li&gt;Access sensitive data stored in the cluster, or reachable through it, leading to &lt;strong&gt;data theft or leakage&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  &lt;strong&gt;Why Aren’t Service Account Tokens Short-Lived by Default?&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://blog.gitguardian.com/a-maturity-model-for-secrets-management/"&gt;&lt;u&gt;Short-lived tokens are a security best practice in general&lt;/u&gt;&lt;/a&gt;, particularly for managing access to very sensitive resources like the Kubernetes API. They reduce the window of opportunity for attackers to exploit a token and facilitate better management of permissions as application access requirements change. Automating token rotation limits the impact of a potential compromise and aligns with the principle of least privilege—granting only the access necessary for a service to operate.&lt;/p&gt;

&lt;p&gt;The problem is that implementing short-lived tokens comes with some overhead.&lt;/p&gt;

&lt;p&gt;First, implementing short-lived tokens typically requires a &lt;strong&gt;more complex setup&lt;/strong&gt;. You need an automated process to handle token renewal before it expires. This may involve additional scripts or Kubernetes operators that watch for token expiration and request new tokens as necessary.&lt;/p&gt;

&lt;p&gt;This often means integrating a &lt;strong&gt;secret management system&lt;/strong&gt; that can securely store and automatically rotate the tokens. This adds a new dependency for system configuration and maintenance.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Note: it goes without saying that using a secrets manager with Kubernetes is highly recommended, even for non-production workloads. But the overhead should not be understated.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Second, software teams running their CI/CD workers on top of the cluster will need adjustments to support dynamic retrieval and injection of these tokens into the deployment process. This could require changes in the pipeline configuration and additional error handling to manage potential token expiration during a pipeline run, which can be a true headache.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;And secrets management is just the tip of the iceberg.&lt;/strong&gt; You will also need monitoring and alerts if you want to troubleshoot renewal failures. Fine-tuning token expiry time could break the deployment process, requiring immediate attention to prevent downtime or deployment failures.&lt;/p&gt;

&lt;p&gt;Finally, there could also be performance considerations, as many more API calls are needed to retrieve new tokens and update the relevant Secrets.&lt;/p&gt;

&lt;p&gt;By default, Kubernetes opts for a straightforward setup by issuing service account tokens without a built-in expiration. This approach simplifies initial configuration but lacks the security benefits of token rotation. It is the Kubernetes admin’s responsibility to configure more secure practices by implementing short-lived tokens and the necessary infrastructure for their rotation, thereby enhancing the cluster's security posture.&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;strong&gt;Mitigation Best Practices&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;For many organizations, the additional overhead is justified by the security improvements. Tools like service mesh implementations (e.g., &lt;a href="https://istio.io/latest/about/service-mesh/?ref=blog.gitguardian.com"&gt;Istio&lt;/a&gt;), secret managers (e.g., &lt;a href="https://docs.cyberark.com/conjur-enterprise/latest/en/Content/Integrations/k8s-ocp/k8s-architecture.htm?ref=blog.gitguardian.com"&gt;CyberArk Conjur&lt;/a&gt;), or cloud provider services can manage the lifecycle of short-lived certificates and tokens, helping to reduce the overhead.&lt;/p&gt;

&lt;p&gt;Additionally, recent versions of Kubernetes offer features like the &lt;a href="https://kubernetes.io/docs/reference/kubernetes-api/authentication-resources/token-request-v1/?ref=blog.gitguardian.com"&gt;TokenRequest&lt;/a&gt; API, which can automatically rotate tokens and project them into the running pods.&lt;/p&gt;
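&lt;p&gt;As a sketch of that mechanism, a Pod can request a short-lived, audience-bound token through a projected volume; the kubelet then rotates it automatically. The pod name, image, mount path, audience, and expiry below are illustrative values, not recommendations (&lt;code&gt;expirationSeconds&lt;/code&gt; must be at least 600):&lt;/p&gt;

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  serviceAccountName: build-robot
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - mountPath: /var/run/secrets/tokens   # app reads the token here
      name: api-token
  volumes:
  - name: api-token
    projected:
      sources:
      - serviceAccountToken:
          path: api-token            # filename under the mount path
          expirationSeconds: 3600    # token lifetime; kubelet refreshes it
          audience: my-api           # token is only valid for this audience
```

&lt;p&gt;Because the projected token is bound to the Pod's lifetime, it is invalidated when the Pod is deleted, unlike a legacy Secret-based token.&lt;/p&gt;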

&lt;p&gt;Even without any additional tool, you can mitigate the risks by limiting the Service Account auto-mount feature. To do so, you can opt out of the default API credential automounting with a single flag in the service account or pod configuration. Here are two examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  For a Service Account:
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ServiceAccount&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
 &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;build-robot&lt;/span&gt;
&lt;span class="na"&gt;automountServiceAccountToken&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
&lt;span class="nn"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;  And for a specific Pod:
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Pod&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
 &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-pod&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
 &lt;span class="na"&gt;serviceAccountName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;build-robot&lt;/span&gt;
 &lt;span class="na"&gt;automountServiceAccountToken&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
 &lt;span class="s"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The bottom line is that &lt;strong&gt;if an application does not need to access the K8s API, it should not have a token mounted.&lt;/strong&gt; This also limits the number of service account tokens an attacker can access if the attacker manages to compromise any of the Kubernetes hosts.&lt;/p&gt;

&lt;p&gt;Okay, you might say, but how do we enforce this policy everywhere? Enter Kyverno, a policy engine designed for K8s.&lt;/p&gt;
&lt;h3&gt;
  
  
  Enforcement with Kyverno
&lt;/h3&gt;

&lt;p&gt;Kyverno allows cluster administrators to manage, validate, mutate, and generate Kubernetes resources based on custom policies. To prevent the creation of long-lived service account tokens, one can define the following Kyverno policy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;kyverno.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ClusterPolicy&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;deny-secret-service-account-token&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;validationFailureAction&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Enforce&lt;/span&gt;
  &lt;span class="na"&gt;background&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
  &lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;check-service-account-token&lt;/span&gt;
    &lt;span class="na"&gt;match&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;any&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;kinds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;Secret&lt;/span&gt;
    &lt;span class="na"&gt;validate&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;cel&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;expressions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Long&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;lived&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;API&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;tokens&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;are&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;not&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;allowed"&lt;/span&gt;
          &lt;span class="na"&gt;expression&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="s"&gt;object.type != "kubernetes.io/service-account-token"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This policy rejects any Secret of type &lt;code&gt;kubernetes.io/service-account-token&lt;/code&gt;, effectively blocking the creation of long-lived service account tokens!&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Applying the Kyverno Policy&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;To apply this policy, you need to have Kyverno installed on your Kubernetes cluster (&lt;a href="https://kyverno.io/docs/installation/methods/?ref=blog.gitguardian.com"&gt;&lt;u&gt;tutorial&lt;/u&gt;&lt;/a&gt;). Once Kyverno is running, you can apply the policy by saving the above YAML to a file and using &lt;code&gt;kubectl&lt;/code&gt; to apply it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; deny-secret-service-account-token.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After applying this policy, any attempt to create a Secret of the prohibited service-account-token type will be denied, enforcing safer token lifecycle management.&lt;/p&gt;
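&lt;p&gt;To see the policy in action, you can try to create exactly the kind of Secret it forbids. The manifest below reuses the earlier &lt;code&gt;build-robot&lt;/code&gt; example to request a long-lived token and should be denied by Kyverno:&lt;/p&gt;

```yaml
# A long-lived token request: the Kyverno policy should deny this
# Secret, because its type is kubernetes.io/service-account-token.
apiVersion: v1
kind: Secret
metadata:
  name: build-robot-secret   # illustrative name
  annotations:
    kubernetes.io/service-account.name: build-robot
type: kubernetes.io/service-account-token
```

&lt;p&gt;Applying this manifest with &lt;code&gt;kubectl apply -f&lt;/code&gt; should fail with the policy's message, "Long lived API tokens are not allowed".&lt;/p&gt;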

&lt;h2&gt;
  
  
  &lt;strong&gt;Wrap up&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;In Kubernetes, managing the lifecycle and access of service account tokens is a critical aspect of cluster security. By preferring short-lived tokens over long-lived ones and enforcing policies with tools like Kyverno, organizations can significantly reduce the risk of token-based security incidents. Stay vigilant, automate security practices, and ensure your Kubernetes environment remains robust against threats.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>cloud</category>
      <category>security</category>
      <category>learning</category>
    </item>
    <item>
      <title>How to Secure Your Secrets Manager with GitGuardian Honeytoken</title>
      <dc:creator>Thomas Segura</dc:creator>
      <pubDate>Fri, 08 Dec 2023 14:59:40 +0000</pubDate>
      <link>https://dev.to/gitguardian/how-to-secure-your-secrets-manager-with-gitguardian-honeytoken-52m7</link>
      <guid>https://dev.to/gitguardian/how-to-secure-your-secrets-manager-with-gitguardian-honeytoken-52m7</guid>
      <description>&lt;p&gt;Protecting sensitive data is a crucial responsibility for modern businesses. To ensure the security of critical information, organizations utilize various tools and strategies.&lt;/p&gt;

&lt;p&gt;One such tool is a secrets manager, which securely stores and manages sensitive data like passwords, API keys, and encryption keys. Secrets managers offer a centralized and encrypted repository, providing a secure alternative to storing secrets in configuration files or source code. Popular secrets managers include &lt;a href="https://developer.hashicorp.com/vault/tutorials/getting-started/getting-started-install?ref=blog.gitguardian.com"&gt;HashiCorp Vault&lt;/a&gt;, &lt;a href="https://aws.amazon.com/fr/secrets-manager/?ref=blog.gitguardian.com"&gt;AWS Secrets Manager&lt;/a&gt;, &lt;a href="https://www.doppler.com/?ref=blog.gitguardian.com"&gt;Doppler&lt;/a&gt;, and &lt;a href="https://www.cyberark.com/products/secrets-management/?ref=blog.gitguardian.com"&gt;CyberArk&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In this blog, we will explore how GitGuardian Honeytoken can enhance the security of secrets managers by enabling easy and scalable breach detection.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding Honeytokens
&lt;/h2&gt;

&lt;p&gt;Honeytokens are digital baits designed to lure attackers into a trap. They have no permissions attached and serve as alerts when unauthorized users attempt to use them.&lt;/p&gt;

&lt;p&gt;When a honeytoken is triggered, it alerts you of the intrusion and provides crucial information about the attacker, such as their IP address and user agent. By storing honeytokens in a secure environment like a secrets manager, you can effectively detect and respond to unauthorized access attempts.&lt;/p&gt;

&lt;p&gt;In the upcoming section, we will show you how to place a honeytoken in your &lt;strong&gt;HashiCorp Vault&lt;/strong&gt; instance; the same guidelines apply to any other secrets manager.&lt;/p&gt;

&lt;p&gt;➡️ Before we begin, follow the step-by-step instructions below to create your first honeytoken! 👇&lt;/p&gt;

&lt;p&gt;&lt;a href="https://blog.gitguardian.com/how-to-create-and-use-honeytokens-step-by-step/"&gt;How to Create and Use Honeytokens: Step-by-Step Instructions&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Enhancing Secrets Manager Security with Honeytokens
&lt;/h2&gt;

&lt;p&gt;HashiCorp Vault is a popular secrets manager that securely stores and controls access to sensitive elements in a DevOps environment. While Vault provides advanced security features, the possibility of an instance compromise cannot be completely eliminated.&lt;/p&gt;

&lt;p&gt;Attackers understand the value of compromising a secrets manager, as it can provide them with access to sensitive information and facilitate lateral movement within your system. Integrating GitGuardian Honeytoken with Vault adds an additional layer of security, alerting you to any unauthorized attempts to access your secrets.&lt;/p&gt;

&lt;h2&gt;
  
  
  Integrating GitGuardian Honeytoken with Vault (or any Secrets Manager)
&lt;/h2&gt;

&lt;p&gt;Before integrating GitGuardian Honeytoken, ensure that Vault is installed and running in your environment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Installing Vault:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Download the latest version of Vault from the &lt;a href="https://developer.hashicorp.com/vault/tutorials/getting-started/getting-started-install?ref=blog.gitguardian.com"&gt;official website&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;  Extract the downloaded file in a directory of your choice.&lt;/li&gt;
&lt;li&gt;  Add the Vault binary to your system &lt;code&gt;PATH&lt;/code&gt; for easy access.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Configuring Vault:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Start the Vault server with the appropriate configuration for your organization.&lt;/li&gt;
&lt;li&gt;  Initialize Vault to create initial tokens and unseal keys that are critical for Vault's operations.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Using GitGuardian Honeytoken:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Sign in to your GitGuardian account and navigate to the &lt;a href="https://dashboard.gitguardian.com/workspace/honeytokens?ref=blog.gitguardian.com"&gt;Honeytoken section&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;  Create a new honeytoken with a descriptive name and description.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://blog.gitguardian.com/content/images/2023/10/image-2.png"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--cdJPeIiX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://blog.gitguardian.com/content/images/2023/10/image-2.png" alt="" width="800" height="772"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Note down the keys (AWS Access and Secret Keys), as these will be used later in Vault.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://blog.gitguardian.com/content/images/2023/10/image-4.png"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--MbeEless--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://blog.gitguardian.com/content/images/2023/10/image-4.png" alt="" width="800" height="599"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Honeytoken details&lt;/p&gt;

&lt;h3&gt;
  
  
  Integration Procedure: Step-by-Step Guide
&lt;/h3&gt;

&lt;p&gt;Vault supports multiple secret engines, and for simplicity, we will consider the key value (KV) secret engine. This engine is a generic Key-Value store used to store arbitrary secrets within Vault.&lt;/p&gt;

&lt;p&gt;💡A set of permissions policies will be needed to list, write, and manage secrets through the KV engine. &lt;a href="https://developer.hashicorp.com/vault/tutorials/policies/policies?ref=blog.gitguardian.com"&gt;Read more here&lt;/a&gt;&lt;/p&gt;
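&lt;p&gt;As a sketch, a minimal policy granting those capabilities for a KV v2 engine mounted at &lt;code&gt;secret/&lt;/code&gt; might look like this (the paths and capabilities are illustrative, not a recommendation):&lt;/p&gt;

```hcl
# Illustrative Vault policy for a KV v2 engine mounted at secret/.
# KV v2 prefixes data reads/writes with "data/" and listings with "metadata/".
path "secret/data/*" {
  capabilities = ["create", "update", "read"]
}

path "secret/metadata/*" {
  capabilities = ["list"]
}
```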

&lt;p&gt;First, ensure that your Vault cluster is reachable by running the command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;vault status
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you get an error message, follow Vault's &lt;a href="https://developer.hashicorp.com/vault/tutorials/secrets-management/static-secrets?ref=blog.gitguardian.com"&gt;Lab Setup instructions here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Next, verify that the KV secrets engine is enabled and set to version 2 at the &lt;code&gt;secret/&lt;/code&gt; path. This should be the default configuration if you started a &lt;code&gt;dev&lt;/code&gt; server.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;vault secrets list &lt;span class="nt"&gt;-detailed&lt;/span&gt;

Path          Type         Accessor           ...   Options           Description
&lt;span class="nt"&gt;----&lt;/span&gt;          &lt;span class="nt"&gt;----&lt;/span&gt;         &lt;span class="nt"&gt;--------&lt;/span&gt;                 &lt;span class="nt"&gt;-------&lt;/span&gt;           &lt;span class="nt"&gt;-----------&lt;/span&gt;
cubbyhole/    cubbyhole    cubbyhole_9d52aeac ...   map[]             per-token private secret storage
identity/     identity     identity_acea5ba9  ...   map[]             identity store
secret/       kv           kv_2226b7d3        ...   map[version:2]    key/value secret storage
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, write the previously generated AWS secret values at the path of your choice:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;vault kv put secret/aws aws_access_key_id &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"AKIA..."&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
        aws_secret_access_key &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"QzfQeV..."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If everything went well, you should see this output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;======&lt;/span&gt; Secret Path &lt;span class="o"&gt;======&lt;/span&gt;
secret/aws

&lt;span class="o"&gt;=======&lt;/span&gt; Metadata &lt;span class="o"&gt;=======&lt;/span&gt;
Key                Value
&lt;span class="nt"&gt;---&lt;/span&gt;                &lt;span class="nt"&gt;-----&lt;/span&gt;
created_time       &amp;lt;CREATION_TIME&amp;gt;
custom_metadata    &amp;lt;nil&amp;gt;
deletion_time      n/a
destroyed          &lt;span class="nb"&gt;false
&lt;/span&gt;version            1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This confirms that your bait credentials have been set. If you receive an email alert about this honeytoken being triggered, you can assume that your secrets manager is compromised.&lt;/p&gt;

&lt;p&gt;💡 Currently, GitGuardian Honeytoken supports only one type of secret: &lt;strong&gt;AWS key.&lt;/strong&gt; More types will become available in the future, allowing you to place honeytokens in other Vault engines, such as the SSH or database secrets engines.&lt;/p&gt;

&lt;p&gt;You can further &lt;a href="https://docs.gitguardian.com/honeytoken/configure-alerts?ref=blog.gitguardian.com"&gt;&lt;strong&gt;configure your alerts&lt;/strong&gt;&lt;/a&gt; to receive instant notifications through custom webhooks, allowing you to customize your alerting workflow.&lt;/p&gt;

&lt;h3&gt;
  
  
  Benefits of protecting your Secrets Manager with Honeytokens
&lt;/h3&gt;

&lt;p&gt;Taking proactive measures to secure sensitive information is both prudent and essential. There are a number of security benefits of integrating GitGuardian Honeytoken with Vault:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Proactive Breach Detection&lt;/strong&gt;: By integrating GitGuardian Honeytoken with Vault, you can proactively detect and respond to potential breaches. Honeytokens act as a silent alarm, alerting you when unauthorized users attempt to use them. This allows you to take immediate action and strengthen your system's security.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Enhanced Threat Intelligence&lt;/strong&gt;: Honeytokens provide valuable information about the actions taken by intruders, such as their IP address and user agent. This information helps you analyze and understand the threat, enabling you to further secure your system and track down the attackers.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Scalable Solution&lt;/strong&gt;: GitGuardian Honeytoken offers a scalable solution for breach detection. You can create and deploy multiple honeytokens, strategically placing them in your secrets manager to effectively deceive attackers. This scalability ensures comprehensive coverage and protection for your critical assets.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Customizable Alerting Workflow&lt;/strong&gt;: GitGuardian Honeytoken allows you to configure custom alerts through webhooks. This flexibility enables you to tailor your alerting workflow according to your specific requirements, ensuring that you receive instant notifications when a breach is detected.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;By securing your secrets manager with GitGuardian Honeytoken, you can significantly enhance the security of your sensitive information. The proactive breach detection, enhanced threat intelligence, easy integration, scalability, and customizable alerting workflow provided by honeytokens make them a valuable addition to your security arsenal. Follow the step-by-step instructions in this blog to get started and protect your critical assets effectively.&lt;/p&gt;

</description>
      <category>vault</category>
      <category>secrets</category>
      <category>security</category>
      <category>devsecops</category>
    </item>
    <item>
      <title>HasMySecretLeaked - Building a Trustless and Secure Protocol</title>
      <dc:creator>Thomas Segura</dc:creator>
      <pubDate>Wed, 15 Nov 2023 15:27:52 +0000</pubDate>
      <link>https://dev.to/gitguardian/hasmysecretleaked-building-a-trustless-and-secure-protocol-3ife</link>
      <guid>https://dev.to/gitguardian/hasmysecretleaked-building-a-trustless-and-secure-protocol-3ife</guid>
      <description>&lt;p&gt;&lt;a href="https://blog.gitguardian.com/announcing-has-my-secret-leaked/"&gt;&lt;strong&gt;HasMySecretLeaked&lt;/strong&gt;&lt;/a&gt; is the first free service that allows security practitioners to proactively verify if their secrets have leaked on &lt;a href="https://blog.gitguardian.com/announcing-has-my-secret-leaked/"&gt;GitHub.com&lt;/a&gt;. With access to GitGuardian's extensive database of over 20 million records of detected leaked secrets, including their locations on GitHub, users can easily query and protect their sensitive information. This database is compiled from scanning billions of code files, commits, GitHub gists, and issues since 2017.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;By opening this project to everyone, we strive to enhance secrets protection at scale while acknowledging the responsibility and security challenges involved.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;At GitGuardian, we believe in transparent, accessible and understandable security. This is why we aim to uphold complete transparency regarding the protocols behind HasMySecretLeaked. Our goal is to ensure the service remains resilient against malicious misuse while also fostering a trustless environment: &lt;strong&gt;with HasMySecretLeaked, you can rest assured that your secrets are secure - at no point do we have access to them, so you don’t even need to trust us!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To support these claims, this blog post dives into the technical choices that went into creating a safe yet pragmatic and easy-to-use protocol. Without further ado, let’s get started.&lt;/p&gt;

&lt;h2&gt;
  
  
  How We Built the Protocol
&lt;/h2&gt;

&lt;p&gt;HasMySecretLeaked is a highly efficient REST API designed to take a string of characters as input. Its key function is to answer, with 100% assurance, the critical question: 'Has my secret leaked?'&lt;/p&gt;

&lt;p&gt;Here is an example request for the &lt;code&gt;fffff&lt;/code&gt; input to show you what a typical response looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ curl https://api.hasmysecretleaked.com/v1/prefix/fffff
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For this input, the HasMySecretLeaked API response's payload is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"matches": [
    {
      "hint": "0ea3b034632487a17ba18e7086156c14cc8eb9993e33aec8e348183044b299b1",
      "payload": "adO72+jWqHstJso2DacOj..."
    },
    {
      "hint": "ddf4ef722ca2ec541074811692eff789bf48d963fa28e42d5542edc8e887239c",
      "payload": "sHrlooMIShmf+Oh0yqcYm..."
    },
    ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As you can see, there is more going on under the hood. To understand how the client uses this information and the API design decisions involved, let's start from scratch with a naive, insecure protocol implementation and iterate from there.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Naive approach
&lt;/h2&gt;

&lt;p&gt;In the most straightforward approach, the user would send their secret directly to the API. Problem: the service, aka GitGuardian, would "see" the users' secrets in cleartext, which is unacceptable.&lt;/p&gt;

&lt;p&gt;So, &lt;em&gt;how can a user know whether their secret is in our database of leaked credentials without sharing their secret with us?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Response: using a hashed version of the secret.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Hash-based approach
&lt;/h2&gt;

&lt;p&gt;Now, let’s imagine the user sends a hashed version of their secret to the API. This is definitely an improvement, as the secret is obfuscated, and the service cannot reverse the hash. There is still a privacy problem, though: by definition, if the hashed secret is present in our database, it implies that its cleartext version was once publicly accessible, indicating that GitGuardian has or had knowledge of it. In other words, the user would be leaking their secret to us. And that’s not acceptable either.&lt;/p&gt;

&lt;p&gt;So, &lt;em&gt;how can we prove GitGuardian has zero-knowledge about the requested secret?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Response: by responding with multiple possibilities.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Bucket-based approach
&lt;/h2&gt;

&lt;p&gt;In this approach, the user only sends a fragment of their secret's hash - specifically, the initial 5 characters. The service then retrieves all secrets it holds that align with these five characters. By ensuring the response 'bucket' is sufficiently large, we effectively veil the original secret request from the service, maintaining the user's privacy.&lt;/p&gt;
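&lt;p&gt;Client-side, this step is easy to sketch. The snippet below uses plain SHA-256 purely for illustration; the real client uses a hardened, peppered hash, covered later in this article:&lt;/p&gt;

```python
import hashlib

def hash_prefix(secret: str, length: int = 5) -> str:
    """Hash the secret locally and keep only a short prefix.

    Plain SHA-256 is used here for illustration only; the production
    service uses a peppered, memory-hard hash function.
    """
    digest = hashlib.sha256(secret.encode("utf-8")).hexdigest()
    return digest[:length]

# Only these 5 hexadecimal characters ever leave the client:
print(hash_prefix("my-super-secret"))
```

&lt;p&gt;Because many distinct hashes share the same 5-character prefix, the service can only narrow the request down to a large bucket, never to an individual secret.&lt;/p&gt;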

&lt;p&gt;💡The bucket's size is intrinsically linked to the hash chunk's length: a longer string yields a smaller bucket. Through hands-on experimentation, we've found that a five-character size currently delivers optimal results. However, this is not set in stone and may be fine-tuned in the future.&lt;/p&gt;

&lt;p&gt;👍 A hearty acknowledgment to Troy Hunt's &lt;a href="https://haveibeenpwned.com/?ref=blog.gitguardian.com"&gt;HaveIBeenPwned&lt;/a&gt;, which significantly inspired this design!&lt;/p&gt;

&lt;p&gt;Now, we have a secure service, yet it's missing a crucial feature. For the service to prove its claim that the secret has been leaked, it must provide the URL location of that leak. Without this feature, the service's usefulness would be almost null.&lt;/p&gt;

&lt;p&gt;So, &lt;em&gt;how can we share the URL of the leak while maintaining the confidentiality of the remaining URLs in the bucket?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Response: by encrypting this information so only the secret's owner can retrieve it.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Adding encryption
&lt;/h2&gt;

&lt;p&gt;The idea is simple: each element in the response bucket is now encrypted (using AES-GCM) with the (full-length) hash of the secret as the key. This way, we can guarantee only a user knowing the hash (and therefore the secret) will be able to decrypt the payload and retrieve the URL location.&lt;/p&gt;

&lt;p&gt;This additional layer of encryption significantly reduces the risk of an enumeration attack. Without it, a cybercriminal could potentially attempt every five-character combination, thereby harvesting our entire database of hashed secrets. However, there remains another potential attack scenario that we'll be addressing shortly.&lt;/p&gt;

&lt;p&gt;We are now faced with a usability issue: the client must painstakingly decipher each item individually to potentially locate the one they possess the key for. There is no way to quickly get an answer to the crucial question: has my secret leaked?&lt;/p&gt;

&lt;p&gt;So, &lt;em&gt;how can we allow the client to get an answer quicker?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Response: by providing an easy-to-check hint for each match.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Providing a hint
&lt;/h2&gt;

&lt;p&gt;The API now provides encrypted responses, coupled with a unique feature - a hint, essentially a hash of the hash. This allows the client to swiftly calculate the hint and determine if a match exists within the response. Put simply, they can rapidly confirm if the hash of their secret is present in the database. To dig deeper, they only need to decrypt the specific item using the original hash of their secret.&lt;/p&gt;
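&lt;p&gt;In code, the client-side check boils down to one extra hash and a scan of the bucket. A minimal sketch, assuming the hint is a SHA-256 of the hex-encoded hash (the exact function and encoding are illustrative, not the service's actual choices):&lt;/p&gt;

```python
import hashlib

def find_match(matches, secret_hash):
    """Return the bucket entry whose hint matches our secret, if any.

    The hint is assumed here to be SHA-256 of the hex-encoded hash;
    the real function and encoding may differ.
    """
    my_hint = hashlib.sha256(secret_hash.encode("utf-8")).hexdigest()
    for entry in matches:
        if entry["hint"] == my_hint:
            # Hit: entry["payload"] can now be decrypted using the
            # full-length hash as the AES-GCM key.
            return entry
    return None

# Simulated bucket containing our own secret's hint:
secret_hash = hashlib.sha256(b"my-super-secret").hexdigest()
bucket = [
    {"hint": hashlib.sha256(secret_hash.encode("utf-8")).hexdigest(),
     "payload": "adO72+jWqHstJso2DacOj..."},
]
print(find_match(bucket, secret_hash) is not None)  # True
```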

&lt;p&gt;We’ve now built a robust service, which corresponds to the example structure presented at the beginning of this article. Let's address the attack scenario mentioned before.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. Preventing other risk scenarios
&lt;/h2&gt;

&lt;h3&gt;
  
  
  a. An attacker compromises a similar database
&lt;/h3&gt;

&lt;p&gt;This scenario implies that an attacker manages to compromise a database of hashed secrets using the same hash function as the one used by our service. In that very special case, they could use our service to reverse the hashes, collect all the locations, and uncover the cleartext secrets.&lt;/p&gt;

&lt;p&gt;To mitigate the above threat, we added a pepper to our hashing function.&lt;/p&gt;

&lt;p&gt;A "pepper" is an additional string added into the hashing process to further increase security. Unlike a "salt", which is unique for each item, a "pepper" is global.&lt;/p&gt;

&lt;p&gt;Adding a pepper makes our hashing function unique and mitigates the possibility of an attacker reversing our hashes with a dumped database.&lt;/p&gt;

&lt;h3&gt;
  
  
  b. An attacker uses a "rainbow table"
&lt;/h3&gt;

&lt;p&gt;What happens if someone attempts to guess the secrets? Similar to passwords, many secrets are not randomly generated but hastily created. This makes them vulnerable to attackers who may use a "rainbow table" - a precomputed table for reversing cryptographic hash functions - to probe our service with a list of common combinations in an attempt to uncover any leaks.&lt;/p&gt;

&lt;p&gt;To mitigate this risk, we've implemented two safeguards:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;The service only discloses the first location where a secret is found. This strategy restricts the amount of information that can be misused without hindering legitimate usage.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Unauthenticated users are limited to five service queries per day. This IP-based rate limit creates a barrier for anyone hoping to strike it lucky with random attempts (we also rate limit authenticated users, but it is less stringent).&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  7. Wrapping up: final design implementation
&lt;/h2&gt;

&lt;p&gt;Here is the final interaction diagram to summarize what happens client-side and server-side:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ek5SbcGJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://blog.gitguardian.com/content/images/2023/10/image--13-.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ek5SbcGJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://blog.gitguardian.com/content/images/2023/10/image--13-.png" alt="" width="800" height="361"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;HasMySecretLeaked protocol&lt;/p&gt;

&lt;p&gt;Two interfaces implement the client-side protocol we described here to interact with the service API:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;the Javascript code your browser downloads when visiting &lt;a href="https://www.gitguardian.com/hasmysecretleaked?ref=blog.gitguardian.com"&gt;HasMySecretLeaked&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;the &lt;a href="https://github.com/GitGuardian/ggshield?ref=blog.gitguardian.com"&gt;GitGuardian CLI &lt;code&gt;ggshield&lt;/code&gt;&lt;/a&gt;, with the new option &lt;code&gt;ggshield hmsl&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the end of our journey! To wrap up, here are the key takeaways you need to remember about HasMySecretLeaked:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;At no point does GitGuardian have access to your secret.&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;We fully recognize the immense responsibility associated with managing sensitive data, and we take the security challenges seriously.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Your secret is obfuscated client-side and cannot be reversed by our services.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Your secret request remains private because you only send the initial five characters of the hash.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If we find a hit, only you, as the owner of the secret, can decrypt the payload and retrieve the URL location.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;We have ensured that checking if your secret is included in the response is quick and simple without compromising privacy.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;To reduce potential risks, we have added extra security measures to our system, such as making our hashing function more complex and restricting the number of service queries allowed per day.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We hope you are now ready to take &lt;a href="https://www.gitguardian.com/hasmysecretleaked?ref=blog.gitguardian.com"&gt;HasMySecretLeaked&lt;/a&gt; for a spin with the peace of mind knowing your secrets are safe!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Get started with HasMySecretLeaked&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Addendum: Deep dive into technical choices&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Prefix size
&lt;/h3&gt;

&lt;p&gt;As said earlier, it is important that users share only the prefix of their hashed secret with us, and it is as important for GitGuardian to design the service so that buckets are big enough. This approach ensures that GitGuardian retains minimal, if any, knowledge of the user’s secret.&lt;/p&gt;

&lt;p&gt;The challenge here lies in striking a balance: if the buckets are too small, GitGuardian inadvertently gains excessive insight into the user’s secret. Conversely, overly large buckets can burden the API with hefty payloads. To determine the ideal prefix size, we need to estimate the potential number of secrets in our database.&lt;/p&gt;

&lt;p&gt;Currently, our HMSL database houses approximately 22 million unique secrets. We've selected a prefix composed of 5 hexadecimal characters, yielding roughly 1 million buckets and an average of about 22 secrets per bucket. Assuming hash values are uniformly distributed, bucket sizes follow an approximately normal distribution, and we've ensured no bucket contains fewer than 8 secrets, which aligns with our protocol.&lt;/p&gt;
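&lt;p&gt;The arithmetic behind these figures is easy to verify:&lt;/p&gt;

```python
# 5 hexadecimal characters give 16**5 possible prefixes (buckets).
buckets = 16 ** 5           # 1_048_576, roughly 1 million
secrets = 22_000_000        # ~22 million unique secrets in the database
avg_per_bucket = secrets / buckets
print(buckets, round(avg_per_bucket))
```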

&lt;p&gt;It's important to note that as more secrets are leaked over time, the buckets will gradually fill up. If they become too cumbersome for our API, we can simply extend the prefix length by one, effectively reducing the average bucket size by a factor of 16.&lt;/p&gt;

&lt;h3&gt;
  
  
  Hashing function
&lt;/h3&gt;

&lt;p&gt;Our chosen hashing function must adhere to the canonical properties of a hashing function, primarily to deter users from reversing a hash and, consequently, extracting passwords from our database. However, there's an additional complexity to consider. Passwords don't follow a random distribution; in fact, they're quite predictable. Passwords can be systematically enumerated, beginning with the most common ones.&lt;/p&gt;

&lt;p&gt;If an individual knows the prefix of a user's hash, they could swiftly compute hashes from a standard list of passwords and compare them to the prefix. If they stumble upon a match, they can reasonably conclude that they've cracked the user's password.&lt;/p&gt;

&lt;p&gt;To counter this, we need to select a hashing method that strikes a balance between speed and complexity. It must be quick enough to ensure a smooth user experience yet intricate enough to thwart attackers from rapidly generating a vast list of hashes. This led us to select &lt;a href="https://en.wikipedia.org/wiki/Scrypt?ref=blog.gitguardian.com"&gt;scrypt&lt;/a&gt; as our preferred hashing method.&lt;/p&gt;

</description>
      <category>security</category>
      <category>github</category>
      <category>secrets</category>
      <category>softwareengineering</category>
    </item>
    <item>
      <title>Yes, GitHub's Copilot can Leak (Real) Secrets</title>
      <dc:creator>Thomas Segura</dc:creator>
      <pubDate>Wed, 08 Nov 2023 16:18:46 +0000</pubDate>
      <link>https://dev.to/gitguardian/yes-githubs-copilot-can-leak-real-secrets-3jld</link>
      <guid>https://dev.to/gitguardian/yes-githubs-copilot-can-leak-real-secrets-3jld</guid>
      <description>&lt;p&gt;There has been a growing focus on the ethical and privacy concerns surrounding advanced language models like ChatGPT and OpenAI GPT technology. These concerns have raised important questions about the potential risks of using such models. However, it is not only these general-purpose language models that warrant attention; specialized tools like code completion assistants also come with their own set of concerns.&lt;/p&gt;

&lt;p&gt;Read: &lt;a href="https://blog.gitguardian.com/chatgpt-security-concern/" rel="noopener noreferrer"&gt;Why ChatGPT is a security concern for your organization (even if you don't use it)&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A year into its launch, GitHub’s code-generation tool Copilot has been used by a million developers, adopted by more than 20,000 organizations, and generated more than three billion lines of code, GitHub said in a blog &lt;a href="https://github.blog/2023-06-27-the-economic-impact-of-the-ai-powered-developer-lifecycle-and-lessons-from-github-copilot/?ref=blog.gitguardian.com" rel="noopener noreferrer"&gt;post&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;However, since its inception, many have raised security concerns about the legal risks associated with copyright issues, privacy concerns, and, of course, insecure code suggestions, of which examples abound, &lt;a href="https://vlad-rad.medium.com/github-copilot-security-conserns-d4209f0d5c28?ref=blog.gitguardian.com" rel="noopener noreferrer"&gt;including dangerous suggestions to hard-code secrets in code.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Extensive security research is currently being conducted to accurately assess the potential risks associated with these newly advertised productivity-enhancing tools.&lt;/p&gt;

&lt;p&gt;This blog post delves into &lt;a href="https://blog.gitguardian.com/yes-github-copilot-can-leak-secrets/Neural%20Code%20Completion%20Tools%20Can%20Memorize%20Hard-coded%20Credentials" rel="noopener noreferrer"&gt;recent research&lt;/a&gt; by Hong Kong University to test the possibility of abusing GitHub’s Copilot and Amazon’s CodeWhisperer to collect secrets that were exposed during the models' training.&lt;/p&gt;

&lt;p&gt;As highlighted by &lt;a href="https://www.gitguardian.com/files/the-state-of-secrets-sprawl-report-2023?ref=blog.gitguardian.com" rel="noopener noreferrer"&gt;GitGuardian's 2023 State of Secrets Sprawl&lt;/a&gt;, hard-coded secrets are highly pervasive on GitHub, with 10 million new secrets detected in 2022, up 67% from 6 million one year earlier.&lt;/p&gt;

&lt;p&gt;Given that Copilot is trained on GitHub data, it is concerning that coding assistants can potentially be exploited by malicious actors to reveal real secrets in their code suggestions.&lt;/p&gt;

&lt;p&gt;To test this hypothesis, the researchers designed a prompt-building algorithm to try to extract credentials from the LLMs.&lt;/p&gt;

&lt;p&gt;The conclusion is unambiguous: by constructing 900 prompts from GitHub code snippets, they managed to successfully collect 2,702 hard-coded credentials from Copilot and 129 secrets from CodeWhisperer (false positives were filtered out with a special methodology described below).&lt;/p&gt;

&lt;p&gt;Impressively, among those, &lt;strong&gt;at least 200 of Copilot's valid secrets (7.4%) and 18 of CodeWhisperer's (14%) were real hard-coded secrets they could identify on GitHub&lt;/strong&gt;. While the researchers refrained from confirming whether these credentials were still active, it suggests that these models could potentially be exploited as an avenue of attack, enabling the extraction and likely compromise of leaked credentials with a high degree of predictability.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Design of a Prompt Engineering Machine
&lt;/h2&gt;

&lt;p&gt;The idea of the study is to see if an attacker could extract secrets by crafting appropriate prompts. To test the odds, the researchers built a prompt testing machine, dubbed the Hard-coded Credential Revealer (HCR). &lt;/p&gt;

&lt;p&gt;The machine has been designed to maximize the chances of triggering a memorized secret. To do so, it needs to build a strong prompt that will "force" the model to emit the secret. The way to build this prompt is to first look on GitHub for files containing hard-coded secrets using regex patterns. Then, the original hard-coded secret is redacted, and the machine asks the model for code suggestions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://lh4.googleusercontent.com/Y-md-b4KyuFazSfBgwgAvPs43Z6Eet1lgGTwb1HZzC5JeU3ldanxwmsQMuDu7AWebH8dVj7vD6QyOk1ZsjTCpDbRwck_n7FOhASoW9uYGC_AU6UFA_fJ21gNjuJZC2hS76gCehiMKyy9w3dn0-xOXqo" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh4.googleusercontent.com%2FY-md-b4KyuFazSfBgwgAvPs43Z6Eet1lgGTwb1HZzC5JeU3ldanxwmsQMuDu7AWebH8dVj7vD6QyOk1ZsjTCpDbRwck_n7FOhASoW9uYGC_AU6UFA_fJ21gNjuJZC2hS76gCehiMKyy9w3dn0-xOXqo" alt="the Hard-coded Credential Revealer (HCR)"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Of course, the model needs to be queried many times to have even a slight chance of extracting valid credentials, because it often outputs "imaginary" credentials.&lt;/p&gt;

&lt;p&gt;The researchers also need to test many prompts before finding an operational credential, one that actually allows them to log into a system.&lt;/p&gt;

&lt;p&gt;In this study, 18 patterns are used to identify code snippets on GitHub, corresponding to 18 different types of secrets (AWS Access Keys, Google OAuth Access Token, GitHub OAuth Access Token, etc.).&lt;/p&gt;

&lt;p&gt;&lt;em&gt;💡Although 18 secret types is far from exhaustive (the GitGuardian secrets scanner can detect 350+ types of secrets), they are still representative of services widely used by software developers and are easily identifiable.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Then, the secrets are removed from the original file, and the code assistant is used to suggest new strings of characters. Those suggestions are then passed through four filters to eliminate as many false positives as possible.&lt;/p&gt;

&lt;p&gt;Secrets are discarded if they:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;don't match the regex pattern&lt;/li&gt;
&lt;li&gt;don't show enough entropy (not random enough, e.g., &lt;code&gt;AKIAXXXXXXXXXXXXXXXX&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;have a recognizable pattern (e.g., &lt;code&gt;AKIA3A3A3A3A3A3A3A3A&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;include common words (e.g., &lt;code&gt;AKIAIOSFODNN7EXAMPLE&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A secret that passes all these tests is considered &lt;em&gt;valid&lt;/em&gt;, which means it could realistically be a true secret (hard-coded somewhere else in the training data).&lt;/p&gt;
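&lt;p&gt;The entropy filter, for instance, can be approximated with a Shannon-entropy check. The 3 bits/character threshold below is our illustrative choice, not necessarily the paper's:&lt;/p&gt;

```python
import math
from collections import Counter

# One of the four filters, sketched: discard candidates whose Shannon
# entropy is too low to plausibly be a randomly generated credential.
def shannon_entropy(s: str) -> float:
    """Average bits of information per character of s."""
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in Counter(s).values())

def passes_entropy_filter(candidate: str, threshold: float = 3.0) -> bool:
    return shannon_entropy(candidate) >= threshold

print(passes_entropy_filter("AKIAXXXXXXXXXXXXXXXX"))  # False: not random enough
print(passes_entropy_filter("AKIAIOSFODNN7EXAMPLE"))  # True: survives this filter
# (the last one is caught by the common-words filter instead)
```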

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;p&gt;Among 8,127 suggestions from Copilot, 2,702 valid secrets were successfully extracted. The overall valid rate is therefore 2702/8127 = 33.2%, meaning that &lt;strong&gt;Copilot generates 2702/900 = 3.0 valid secrets per prompt on average.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;CodeWhisperer suggested 736 code snippets in total, among which 129 valid secrets were identified. The valid rate is thus 129/736 = 17.5%.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;💡Keep in mind that in this study, a valid secret doesn't mean the secret is real. It means that it successfully passed the filters and, therefore, has the properties of a real secret.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;So, how can we know if these secrets are genuine operational credentials?&lt;/strong&gt; For ethical reasons, the authors only tried a subset of the valid credentials: test keys (like Stripe Test Keys) that are designed for developers to test their programs.&lt;/p&gt;

&lt;p&gt;For the remaining credentials, the authors looked for another way to assess authenticity: memorization, i.e., where the secret appeared on GitHub.&lt;/p&gt;

&lt;p&gt;The rest of the research focuses on the characteristics of the valid secrets. They look for the secret using GitHub Code Search and differentiate &lt;em&gt;strongly&lt;/em&gt; memorized secrets, which are identical to the secret removed in the first place, and &lt;em&gt;weakly&lt;/em&gt; memorized secrets, which came from one or multiple other repositories. Finally, there are secrets that could not be located on GitHub and which might come from other sources.&lt;/p&gt;

&lt;h2&gt;
  
  
  Consequences
&lt;/h2&gt;

&lt;p&gt;The research paper uncovers a significant privacy risk posed by code completion tools like GitHub Copilot and Amazon CodeWhisperer. The findings indicate that these models not only leak the original secrets present in their training data but also suggest other secrets that were encountered elsewhere in their training corpus. This exposes sensitive information and raises serious privacy concerns.&lt;/p&gt;

&lt;p&gt;For instance, even if a hard-coded secret was removed from the git history after being leaked by a developer, an attacker can still extract it using the prompting techniques described in the study. The research demonstrates that these models can suggest valid and operational secrets found in their training data.&lt;/p&gt;

&lt;p&gt;These findings are supported by another recent study conducted by a researcher from Wuhan University, titled &lt;a href="https://arxiv.org/pdf/2310.02059.pdf?ref=blog.gitguardian.com" rel="noopener noreferrer"&gt;Security Weaknesses of Copilot Generated Code in GitHub&lt;/a&gt;. The study analyzed 435 code snippets generated by Copilot from GitHub projects and used multiple security scanners to identify vulnerabilities.&lt;/p&gt;

&lt;p&gt;According to the study, 35.8% of the Copilot-generated code snippets exhibited security weaknesses, regardless of the programming language used. By classifying the identified security issues using Common Weakness Enumerations (CWEs), the researchers found that "Hard-coded credentials" (CWE-798) were present in 1.15% of the code snippets, accounting for 1.5% of the 600 CWEs identified.&lt;/p&gt;

&lt;h2&gt;
  
  
  Mitigations
&lt;/h2&gt;

&lt;p&gt;Addressing the privacy attack on LLMs requires mitigation efforts from both programmers and machine learning engineers.&lt;/p&gt;

&lt;p&gt;To reduce the occurrence of hard-coded credentials, the authors recommend using centralized credential management tools and code scanning to prevent the inclusion of code with hard-coded credentials.&lt;/p&gt;

&lt;p&gt;During the various stages of code completion model development, different approaches can be adopted:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Before pre-training, hard-coded credentials can be excluded from the training data by cleaning it.&lt;/li&gt;
&lt;li&gt;During training or fine-tuning, algorithmic defenses such as Differential Privacy (DP) can be employed, providing strong guarantees of model privacy.&lt;/li&gt;
&lt;li&gt;During inference, the model output can be post-processed to filter out secrets.&lt;/li&gt;
&lt;/ul&gt;
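&lt;p&gt;The inference-time option can be as simple as a regex pass over the model's output before it reaches the user. A minimal sketch; the two patterns below are examples, not an exhaustive detector:&lt;/p&gt;

```python
import re

# Mask anything that looks like a credential in a completion before
# showing it to the user.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),      # AWS access key ID
    re.compile(r"ghp_[0-9A-Za-z]{36}"),   # GitHub personal access token
]

def redact_completion(completion: str) -> str:
    for pattern in SECRET_PATTERNS:
        completion = pattern.sub("[REDACTED]", completion)
    return completion

print(redact_completion('key = "AKIAIOSFODNN7EXAMPLE"'))  # key = "[REDACTED]"
```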

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This study exposes a significant risk associated with code completion tools like GitHub Copilot and Amazon CodeWhisperer. By crafting prompts and analyzing publicly available code on GitHub, the researchers successfully extracted numerous valid hard-coded secrets from these models. &lt;/p&gt;

&lt;p&gt;To mitigate this threat, programmers should use centralized credential management tools and code scanning to prevent the inclusion of hard-coded credentials. Machine learning engineers can implement measures such as excluding these credentials from training data, applying privacy preservation techniques like Differential Privacy, and filtering out secrets in the model output during inference.&lt;/p&gt;

&lt;p&gt;These findings extend beyond Copilot and CodeWhisperer, emphasizing the need for security measures in all neural code completion tools. Developers must take proactive steps to address this issue before releasing their tools.&lt;/p&gt;

&lt;p&gt;In conclusion, addressing the privacy risks associated with large language models and code completion tools requires a collaborative effort between programmers, machine learning engineers, and tool developers. The recommended mitigations (centralized credential management, code scanning, and the exclusion of hard-coded credentials from training data) go a long way toward reducing these risks, but it is crucial for all stakeholders to work together to ensure the security and privacy of these tools and the data they handle.&lt;/p&gt;

</description>
      <category>github</category>
      <category>githubcopilot</category>
      <category>security</category>
      <category>llm</category>
    </item>
    <item>
      <title>From Code to Cloud: Security for Developers [cheat sheet included]</title>
      <dc:creator>Thomas Segura</dc:creator>
      <pubDate>Sun, 15 Oct 2023 22:00:00 +0000</pubDate>
      <link>https://dev.to/gitguardian/from-code-to-cloud-security-for-developers-cheat-sheet-included-3mbo</link>
      <guid>https://dev.to/gitguardian/from-code-to-cloud-security-for-developers-cheat-sheet-included-3mbo</guid>
      <description>&lt;p&gt;From code development and approval to continuous integration and deployment, each stage presents unique challenges and opportunities to enhance the security of your applications.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://blog.gitguardian.com/content/images/2023/08/Snyk-x-GG-cheat-sheet-2.png"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--HlbDfYjo--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://blog.gitguardian.com/content/images/2023/08/Snyk-x-GG-cheat-sheet-2.png" alt="5 tips to Supercharge App Security from Code to Cloud" width="800" height="587"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.gitguardian.com/files/snyk-gitguardian-cheatsheet"&gt;Download the cheat sheet ⬇️⬇️&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;By following the recommendations in this cheat sheet, you will be able to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Identify and fix vulnerabilities&lt;/strong&gt; in your code early using the Snyk CLI and Snyk IDE plugins.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Prevent secrets from being committed&lt;/strong&gt; to your repository with GitGuardian's ggshield.&lt;/li&gt;
&lt;li&gt;  Safely &lt;strong&gt;store and share secrets&lt;/strong&gt; using encryption or secrets managers.&lt;/li&gt;
&lt;li&gt;  Protect your source code from compromise using &lt;strong&gt;&lt;a href="https://www.gitguardian.com/honeytoken?ref=blog.gitguardian.com"&gt;honeytokens&lt;/a&gt;&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;  Conduct &lt;strong&gt;code reviews&lt;/strong&gt; and address security weaknesses during the code approval stage.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Analyze infrastructure code&lt;/strong&gt; for misconfigurations using Snyk Infrastructure as Code.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Scan your code and artifacts for vulnerabilities&lt;/strong&gt; during the continuous integration and deployment process.&lt;/li&gt;
&lt;li&gt;  Ensure that &lt;strong&gt;no secrets are leaked&lt;/strong&gt; in the final artifact with GitGuardian integration.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Be alerted&lt;/strong&gt; if your CI/CD service gets compromised using Honeytokens.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Scan container images&lt;/strong&gt; for vulnerabilities and leaked secrets.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With these security measures in place, you can have confidence in the integrity and security of your code as it moves from development to deployment in the cloud.&lt;/p&gt;

&lt;p&gt;Let's dive into each stage and explore the tools and practices that will help you build secure and robust applications.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stage 1: Code Development
&lt;/h2&gt;

&lt;p&gt;At this stage, you are crafting your code, adding dependencies, preparing your app to interface with external systems (backends, SaaS, databases, etc.), and hopefully writing tests!&lt;/p&gt;

&lt;p&gt;The main threats here are committing a secret, misconfiguration, coding a vulnerability, or adding a malicious dependency. To prevent these outcomes, there is a lot that you can set up to detect and fix issues early on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Get security feedback right into your IDE&lt;/strong&gt; with the &lt;a href="https://snyk.co/ufRCQ?ref=blog.gitguardian.com"&gt;Snyk IDE&lt;/a&gt; plugin. This plugin provides real-time security feedback as you code, helping you identify and fix vulnerabilities and issues directly within your development environment.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Detect and fix vulnerabilities with the Snyk CLI&lt;/strong&gt;. Install the &lt;a href="https://snyk.co/ufRCP?ref=blog.gitguardian.com"&gt;Snyk CLI&lt;/a&gt; globally by running &lt;code&gt;npm install -g snyk&lt;/code&gt;. This allows you to run various commands such as &lt;code&gt;snyk test&lt;/code&gt;, &lt;code&gt;snyk code test&lt;/code&gt;, &lt;code&gt;snyk container test&lt;/code&gt;, and &lt;code&gt;snyk iac test&lt;/code&gt; to scan your code, dependencies, containers, and infrastructure-as-code for vulnerabilities and misconfigurations.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Never commit a secret again with GitGuardian's ggshield&lt;/strong&gt;. On macOS, install ggshield by running &lt;code&gt;brew install gitguardian/tap/ggshield&lt;/code&gt; (see the GitGuardian documentation for other platforms).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then automatically get a token for scanning with &lt;code&gt;ggshield auth login&lt;/code&gt;.&lt;br&gt;&lt;br&gt;
To prevent committing secrets, add the ggshield secret scan pre-commit hook to your local .git/hooks folder by using the command: &lt;code&gt;ggshield install --mode local --hook-type pre-commit -a&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;You can also scan your local repository immediately to identify hard-coded secrets by running &lt;code&gt;ggshield secret scan repo .&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Store and share secrets safely&lt;/strong&gt;: To securely store and share secrets, you can either encrypt your secrets in your repository using tools like &lt;a href="https://blog.gitguardian.com/a-comprehensive-guide-to-sops/"&gt;SOPS&lt;/a&gt; or call a secrets manager from your code using clients such as &lt;a href="https://github.com/hvac/hvac?ref=blog.gitguardian.com"&gt;hvac&lt;/a&gt; (for Hashicorp’s Vault) or &lt;a href="https://github.com/boto/boto3?ref=blog.gitguardian.com"&gt;boto3&lt;/a&gt; (for AWS Secrets Manager). Check out examples of how to handle secrets in Python &lt;a href="https://blog.gitguardian.com/how-to-handle-secrets-in-python/"&gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Protect against code compromise&lt;/strong&gt;: Malicious actors actively seek source code to compromise organizations of all scales. Fortunately, you can protect your source code by using honeytokens. Use the &lt;code&gt;ggshield honeytoken create&lt;/code&gt; command to create honeytokens if enabled in your workspace. Honeytokens can help detect if an unauthorized actor accessed your source code or if it was publicly exposed on GitHub. Learn how to secure your SCM repositories with GitGuardian honeytokens &lt;a href="https://blog.gitguardian.com/how-to-secure-your-scm-repositories-with-gitguardian-honeytokens/"&gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Stage 2: Code Approval
&lt;/h2&gt;

&lt;p&gt;Whether you are working alongside dozens of fellow coders or contributing to an open-source project, the next step is to push your work to a central repository. This is the occasion for your changes to get reviewed by peers. The goal is to keep code quality consistent, get feedback, and fix security vulnerabilities and license issues.&lt;/p&gt;

&lt;p&gt;To ensure the security of your code during the code approval stage, consider the following guidelines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Get feedback&lt;/strong&gt;: When working in a team, it's essential to have at least one pair of eyes review the code for quality, security, and adherence to coding standards. Code reviews can help identify potential security vulnerabilities and provide valuable feedback.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Check for security failures&lt;/strong&gt;: Use &lt;a href="https://snyk.co/ufRCR?ref=blog.gitguardian.com"&gt;Snyk Open Source&lt;/a&gt; to assess the reliability of the added dependencies. Additionally, it can help identify if a license is compliant with your organization's policy. This ensures that the dependencies you are using are secure and meet the necessary licensing requirements.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Check for common security weaknesses&lt;/strong&gt;: Leverage &lt;a href="https://snyk.co/ufRCS?ref=blog.gitguardian.com"&gt;Snyk Code&lt;/a&gt;, a real-time static analysis tool, to identify common security weaknesses in your code. It can provide alerts and recommendations to help you address these weaknesses and improve the overall security of your codebase.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Ensure no secrets get leaked&lt;/strong&gt;: Implement a check for leaked secrets in the main repository by integrating GitGuardian into your code approval process. By scanning for hard-coded secrets during the code approval stage, you can ensure that no secrets are leaked and avoid the need for a secret rotation procedure.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Analyze infrastructure code (IaC) for misconfigurations&lt;/strong&gt;: Use &lt;a href="https://snyk.co/ufRCT?ref=blog.gitguardian.com"&gt;Snyk Infrastructure as Code&lt;/a&gt; (IaC) to analyze your IaC templates and configurations for misconfigurations and vulnerabilities. This helps ensure that your infrastructure is properly configured and secure before deployment.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Stage 3: Continuous Integration and Deployment
&lt;/h2&gt;

&lt;p&gt;After being approved, your changes finally get merged into the main branch (or the appropriate trunk) of your project. The code is now ready to be built, tested, and eventually deployed. However, before that can happen, multiple checks are needed. Welcome to the automated test kingdom, where the inner workings of the code are scrutinized to ensure quality and security requirements are met.&lt;/p&gt;

&lt;p&gt;During the Continuous Integration and Deployment (CI/CD) pipeline, consider implementing the following steps to ensure the security of your code:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Scan the source code for vulnerabilities&lt;/strong&gt;: Integrate Snyk with your CI/CD pipelines to scan for vulnerabilities during the build process. This ensures that any vulnerabilities in your code are identified and can be addressed before deployment.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Ensure no secrets are found in the final artifact&lt;/strong&gt;: Integrate GitGuardian into your CI/CD pipeline to detect any hard-coded secrets during the pipeline execution. This helps prevent secrets from being included in the final artifact and ensures that sensitive information is not exposed.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Detect intrusions in your CI/CD pipelines&lt;/strong&gt;: Enhance the security of your build process by using honeytokens. Place honeytokens in the secure store of your CI/CD provider (as secrets or environment variables) to ensure that you receive alerts if your builds are compromised. This helps detect any unauthorized access to pipelines and allows you to take appropriate actions. Learn more &lt;a href="https://blog.gitguardian.com/how-to-add-gitguardian-honeytokens-in-ci-cd-pipelines/"&gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Scan artifacts&lt;/strong&gt;: Even if your source code passes all security checks, it's important to scan the final artifact for vulnerabilities. Use tools like Snyk Container and ggshield secret scan docker to scan container images and ensure that the end product is clean and free from vulnerabilities and leaked secrets.&lt;/li&gt;
&lt;/ul&gt;
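&lt;p&gt;As a concrete starting point, the secrets-scanning step above might look like the following in a GitHub Actions workflow. This is a sketch assuming GitGuardian's published ggshield action; check the current action name and version in GitGuardian's documentation, and adapt it to your CI provider:&lt;/p&gt;

```yaml
# Hypothetical workflow -- action names and versions are illustrative.
name: security-checks
on: [push, pull_request]
jobs:
  scan-secrets:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0   # full history, so every new commit is scanned
      - name: Scan for hard-coded secrets
        uses: GitGuardian/ggshield-action@v1
        env:
          GITGUARDIAN_API_KEY: ${{ secrets.GITGUARDIAN_API_KEY }}
```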

&lt;p&gt;By following these steps in the CI/CD pipeline, you can ensure that your code is secure and meets the necessary quality and security requirements before deployment.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stage 4: Deploy
&lt;/h2&gt;

&lt;p&gt;Once your code has been approved and tested, it's time to deploy it to a cloud environment or infrastructure. However, before doing so, it's important to ensure the security of your deployment process. Here are some steps you can take:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Secure your deployment pipeline&lt;/strong&gt;: Implement secure deployment practices such as using secure connections (HTTPS), verifying the integrity of your deployment artifacts, and using secure authentication methods.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Scan your deployment configuration&lt;/strong&gt;: Use Snyk IaC to scan your infrastructure-as-code (IaC) templates and configurations for misconfigurations and vulnerabilities. This will help you identify any security issues before deploying your code.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Monitor your deployment&lt;/strong&gt;: Implement monitoring and logging solutions to track the performance and security of your deployment. This will help you identify any anomalies or potential security threats.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Implement security best practices&lt;/strong&gt;: Follow security best practices for cloud deployments, such as using strong access controls, regularly updating your software and dependencies, and encrypting sensitive data.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Stage 5: Cloud&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Once your code is deployed to a cloud environment, it's important to continuously monitor the security of your cloud infrastructure and applications. Here are some steps you can take to ensure the security of your cloud environment:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Implement logging and monitoring&lt;/strong&gt;: Set up logging and monitoring solutions to track and analyze the activity in your cloud environment. This will help you detect any security vulnerabilities or threats.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Implement threat detection&lt;/strong&gt;: Use threat detection tools and services to identify and respond to potential security threats in real-time. This can include detecting unauthorized access attempts, unusual network traffic, or suspicious behavior.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Regularly update and patch&lt;/strong&gt;: Keep your cloud infrastructure and applications up to date by regularly applying security patches and updates. This will help protect against known vulnerabilities and exploits.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Implement access controls&lt;/strong&gt;: Use strong access controls and authentication mechanisms to ensure that only authorized users and services can access your cloud resources. This can include implementing multi-factor authentication, role-based access controls, and least privilege principles. Get the &lt;a href="https://blog.gitguardian.com/understanding-identity-and-access-management-best-practices-cheat-sheet-included/"&gt;IAM cheat sheet here&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Regularly review and audit&lt;/strong&gt;: Conduct regular security reviews and audits of your cloud environment to identify any potential security weaknesses or misconfigurations. This can help you proactively address any security issues before they are exploited.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By following these steps and utilizing tools like Snyk and GitGuardian, developers can ensure the security of their code throughout the entire development and deployment process.&lt;/p&gt;

</description>
      <category>codenewbie</category>
      <category>security</category>
      <category>programming</category>
      <category>cloud</category>
    </item>
    <item>
      <title>Microsoft AI involuntarily exposed a secret giving access to 38TB of confidential data for 3 years</title>
      <dc:creator>Thomas Segura</dc:creator>
      <pubDate>Tue, 10 Oct 2023 13:29:59 +0000</pubDate>
      <link>https://dev.to/gitguardian/microsoft-ai-involuntarily-exposed-a-secret-giving-access-to-38tb-of-confidential-data-for-3-years-372j</link>
      <guid>https://dev.to/gitguardian/microsoft-ai-involuntarily-exposed-a-secret-giving-access-to-38tb-of-confidential-data-for-3-years-372j</guid>
      <description>&lt;p&gt;The WIZ Research team &lt;a href="https://www.wiz.io/blog/38-terabytes-of-private-data-accidentally-exposed-by-microsoft-ai-researchers?ref=blog.gitguardian.com"&gt;recently discovered&lt;/a&gt; that an overprovisioned SAS token had been lying exposed on GitHub for nearly three years. This token granted access to a massive 38-terabyte trove of private data. This Azure storage contained additional secrets, such as private SSH keys, hidden within the disk backups of two Microsoft employees. This revelation underscores the importance of robust data security measures.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://lh5.googleusercontent.com/ZnQJ4P9Bfg9RdOk1JVAsVUxXVdYEEplUooiCJk02G2-1kxQ5NVlPNQgu9ctJMKOtMZWHFEqrJltSnHtsg0_Q1bF8jZDeSO0ZTEe8vGU95TEl05LazW1Mv0wLvyL8Y79aE8D8HPhhkC_0M5008kR6H3s"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--8ey3W19V--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://lh5.googleusercontent.com/ZnQJ4P9Bfg9RdOk1JVAsVUxXVdYEEplUooiCJk02G2-1kxQ5NVlPNQgu9ctJMKOtMZWHFEqrJltSnHtsg0_Q1bF8jZDeSO0ZTEe8vGU95TEl05LazW1Mv0wLvyL8Y79aE8D8HPhhkC_0M5008kR6H3s" alt="Microsoft AI" width="728" height="380"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What happened?
&lt;/h2&gt;

&lt;p&gt;WIZ Research recently disclosed a data exposure incident found on Microsoft’s AI GitHub repository on June 23, 2023.&lt;/p&gt;

&lt;p&gt;The researchers managing the GitHub repository used an Azure Storage sharing feature (a SAS token) to give access to a bucket of open-source AI training data.&lt;/p&gt;

&lt;p&gt;This token was misconfigured, giving access to the account's entire cloud storage rather than the intended bucket.&lt;/p&gt;

&lt;p&gt;This storage comprised 38TB of data, including a disk backup of two employees’ workstations with secrets, private keys, passwords, and more than 30,000 internal Microsoft Teams messages.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;SAS (Shared Access Signatures)&lt;/strong&gt; are signed URLs for sharing Azure Storage resources. They are configured with fine-grained controls over how a client can access the data: what resources are exposed (full account, container, or selection of files), with what permissions, and for how long. See &lt;a href="https://learn.microsoft.com/en-us/azure/storage/common/storage-sas-overview?ref=blog.gitguardian.com"&gt;Azure Storage documentation&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
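&lt;p&gt;For illustration only, this is roughly what generating a narrowly scoped, short-lived SAS looks like with the Azure CLI (the account name, container name, and expiry date below are placeholders, not values from the incident):&lt;/p&gt;

```shell
# Sketch: generate a read-only SAS for a single container that expires soon,
# instead of an account-level token (all names and dates are placeholders)
az storage container generate-sas \
  --account-name mystorageaccount \
  --name training-data \
  --permissions r \
  --expiry 2024-07-01T00:00Z \
  --auth-mode key \
  --output tsv
```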

&lt;p&gt;After the incident was disclosed to Microsoft, the SAS token was invalidated. From its first commit to GitHub (July 20, 2020) to its revocation, &lt;strong&gt;nearly three years elapsed&lt;/strong&gt;. See the timeline presented by the Wiz Research team:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://lh5.googleusercontent.com/cKXTq7_8XSwl-MeJMQ6FOvNpsByk365bxgDZKHfnipBf1d0ou793OW88RD8EyjocC2d40xaWKaoOyp2l5_4nxeoV89TmLTtzek75zm4GFX0a9-C4w5lwkNlipSojgW8I0oWXu4y3h8NpL7Q5NMFmXBY"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--oGelPzv8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://lh5.googleusercontent.com/cKXTq7_8XSwl-MeJMQ6FOvNpsByk365bxgDZKHfnipBf1d0ou793OW88RD8EyjocC2d40xaWKaoOyp2l5_4nxeoV89TmLTtzek75zm4GFX0a9-C4w5lwkNlipSojgW8I0oWXu4y3h8NpL7Q5NMFmXBY" alt="" width="677" height="342"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Why did the token have such an extended lifespan? If you take a look at the timeline, you'll see that the token's expiration date was extended by an additional 30 &lt;em&gt;years&lt;/em&gt; post-expiration. This longevity isn't surprising when you consider that the token was intentionally engineered to be shared and grant access to training data.&lt;/p&gt;

&lt;p&gt;Yet, as emphasized by the WIZ Research team, &lt;strong&gt;there was a misconfiguration with the Shared Access Signature (SAS).&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Data Exposure
&lt;/h2&gt;

&lt;p&gt;The token allowed anyone to access an additional 38TB of data, including sensitive data such as secret keys, personal passwords, and over 30,000 internal Microsoft Teams messages from hundreds of Microsoft employees.&lt;/p&gt;

&lt;p&gt;Here is an excerpt from some of the most sensitive data recovered by the Wiz team:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://lh4.googleusercontent.com/MCljgPSAna_g8PKXRwMGge4ej3rIddZA9nYCwasv3qDvt-1anWVHy2B84dgRzQTv_MY-IcjWb8SbFyvPvWsXIzHHkO06zJxhN8PaGFvNGRrMf--n93wXJfx32BeEVj9Y18xNq6PQpBSLTMn_mvxGB7o"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--zoDoX-gE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://lh4.googleusercontent.com/MCljgPSAna_g8PKXRwMGge4ej3rIddZA9nYCwasv3qDvt-1anWVHy2B84dgRzQTv_MY-IcjWb8SbFyvPvWsXIzHHkO06zJxhN8PaGFvNGRrMf--n93wXJfx32BeEVj9Y18xNq6PQpBSLTMn_mvxGB7o" alt="1695041318-03-files.jpg" width="800" height="579"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Not only was the access scope excessively permissive, but the token was also misconfigured to grant "full control" permissions instead of read-only. This means that an attacker not only had the ability to view all the files in the storage account but could also delete and overwrite existing files.&lt;/p&gt;

&lt;p&gt;As highlighted by the researchers, this could have allowed an attacker to inject malicious code into the storage blob that could then automatically execute with every download by a user (presumably an AI researcher) trusting in Microsoft's reputation, which could have led to a &lt;strong&gt;supply chain attack&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Also read&lt;/em&gt; &lt;a href="https://www.gitguardian.com/learning-center/software-supply-chain-security?ref=blog.gitguardian.com"&gt;&lt;em&gt;Examples of software supply chain attacks&lt;/em&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Security Risks
&lt;/h2&gt;

&lt;p&gt;According to the researchers, Account SAS tokens such as the one presented in their research pose a high security risk. This is because these tokens are highly permissive and long-lived, and they escape the monitoring perimeter of administrators.&lt;/p&gt;

&lt;p&gt;When a user generates a new token, it is signed by the browser and doesn't trigger any Azure event. To revoke a token, an administrator needs to rotate the signing account key, thereby revoking all the other tokens at once.&lt;/p&gt;

&lt;p&gt;Ironically, the security risk of a Microsoft product feature (Azure SAS tokens) caused an incident for a Microsoft research team, a risk recently referenced by the second version of the Microsoft threat matrix for storage services:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://lh4.googleusercontent.com/PDWyEgvTaoye_ZZ7gLGO__EeWIULPLwuO3A7t36zNXcqjF8bs_ZQobcuZijV_c9UttP49AspRAkepjqNJYmHW7Nk4Hhok8_BdX2SiyqLZD63gRZZ6EIl_0le8Lqg_JfRaHoxI357MDSLTWbELcqbFDU"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--D_j67t_5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://lh4.googleusercontent.com/PDWyEgvTaoye_ZZ7gLGO__EeWIULPLwuO3A7t36zNXcqjF8bs_ZQobcuZijV_c9UttP49AspRAkepjqNJYmHW7Nk4Hhok8_BdX2SiyqLZD63gRZZ6EIl_0le8Lqg_JfRaHoxI357MDSLTWbELcqbFDU" alt="" width="626" height="283"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Secrets Sprawl
&lt;/h2&gt;

&lt;p&gt;This example perfectly underscores the pervasive issue of &lt;strong&gt;secrets sprawl&lt;/strong&gt; within organizations, even those with advanced security measures. Intriguingly, it highlights how an AI research team, or any data team, can independently create tokens that could potentially jeopardize the organization. These tokens can cleverly sidestep the security safeguards designed to shield the environment.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.gitguardian.com/files/the-state-of-secrets-sprawl-report-2023?ref=blog.gitguardian.com"&gt;&lt;em&gt;Read the State of Secrets Sprawl 2023&lt;/em&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Mitigation strategies
&lt;/h2&gt;

&lt;h3&gt;
  
  
  For Azure Storage users:
&lt;/h3&gt;

&lt;h4&gt;
  
  
  1 - avoid Account SAS tokens
&lt;/h4&gt;

&lt;p&gt;The lack of monitoring makes this feature a security hole in your perimeter. A better way to share data externally is using a Service SAS with a &lt;a href="https://learn.microsoft.com/en-us/rest/api/storageservices/define-stored-access-policy?ref=blog.gitguardian.com"&gt;Stored Access Policy&lt;/a&gt;. This feature binds a SAS token to a policy, providing the ability to centrally manage token policies.&lt;/p&gt;

&lt;p&gt;Better yet, if you don't need this Azure Storage sharing feature at all, simply disable SAS access for each account you own.&lt;/p&gt;
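&lt;p&gt;As a rough sketch with the Azure CLI (the policy, container, and account names here are made up), binding a service SAS to a stored access policy, or disabling shared key access altogether, looks like this:&lt;/p&gt;

```shell
# Create a stored access policy; SAS tokens bound to it can be revoked
# centrally by deleting or editing the policy (names are placeholders)
az storage container policy create \
  --account-name mystorageaccount \
  --container-name training-data \
  --name read-only-policy \
  --permissions r \
  --expiry 2024-07-01T00:00Z

# Generate a service SAS bound to that policy
az storage container generate-sas \
  --account-name mystorageaccount \
  --name training-data \
  --policy-name read-only-policy \
  --output tsv

# If the sharing feature is not needed at all, disable shared key
# (and therefore account SAS) access on the storage account
az storage account update \
  --name mystorageaccount \
  --allow-shared-key-access false
```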

&lt;h4&gt;
  
  
  2 - enable Azure Storage analytics
&lt;/h4&gt;

&lt;p&gt;Active SAS token usage can be monitored through the &lt;a href="https://learn.microsoft.com/en-us/azure/storage/common/manage-storage-analytics-logs?ref=blog.gitguardian.com"&gt;Storage Analytics logs&lt;/a&gt; for each of your storage accounts. &lt;a href="https://learn.microsoft.com/en-us/azure/storage/blobs/blob-storage-monitoring-scenarios?ref=blog.gitguardian.com"&gt;Azure Metrics&lt;/a&gt; lets you monitor SAS-authenticated requests and identify storage accounts that have been accessed through SAS tokens, for up to 93 days.&lt;/p&gt;

&lt;h3&gt;
  
  
  For all:
&lt;/h3&gt;

&lt;h4&gt;
  
  
  1 - Audit your GitHub perimeter for sensitive secrets
&lt;/h4&gt;

&lt;p&gt;With around 90 million developer accounts, 300 million hosted repositories, and 4 million active organizations, including 90% of Fortune 100 companies, GitHub holds a much larger attack surface than meets the eye.&lt;/p&gt;

&lt;p&gt;Last year, GitGuardian uncovered 10 million leaked secrets on public repositories, up 67% from the previous year.&lt;/p&gt;

&lt;p&gt;GitHub must be actively monitored as part of any organization's security perimeter. Incidents involving leaked credentials on the platform continue to cause massive breaches for large companies, and this security hole in Microsoft's protective shell is reminiscent of the &lt;a href="https://blog.gitguardian.com/toyota-accidently-exposed-a-secret-key-publicly-on-github-for-five-years/"&gt;Toyota data breach from a year ago&lt;/a&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;On October 7, 2022 Toyota, the Japanese-based automotive manufacturer, revealed they had accidentally exposed a credential allowing access to customer data in a public GitHub repo for nearly 5 years. The code was made public from December 2017 through September 2022. While Toyota says they have invalidated the key, any exposure this long could mean multiple malicious actors had already acquired access.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Being able to detect exposed sensitive tokens on GitHub is a unique feature of &lt;a href="https://www.gitguardian.com/monitor-public-github-for-secrets?ref=blog.gitguardian.com"&gt;GitGuardian's Public Monitoring&lt;/a&gt; system. It allows security analysts to quickly inspect an organization's footprint on the platform, identify valid secrets, and assess the severity of incidents. What is more, the engine can include developers’ personal public repositories (where 80% of corporate credentials are leaked) in an organization's perimeter.&lt;/p&gt;

&lt;p&gt;If your company has development teams, it is very likely that some of your company's secrets (API keys, tokens, passwords) end up on public GitHub, so you should evaluate your GitHub attack surface by &lt;a href="https://www.gitguardian.com/complimentary-audit-secrets-leaks-public-github?ref=blog.gitguardian.com"&gt;requesting a complimentary audit&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  2 - Lay out traps in the form of honeytokens
&lt;/h3&gt;

&lt;p&gt;Do you need time to restructure governance around cloud storage access, yet need to be alerted if highly sensitive parts get scanned by a malicious actor?&lt;/p&gt;

&lt;p&gt;Your best allies are &lt;a href="https://www.gitguardian.com/honeytoken?ref=blog.gitguardian.com"&gt;honeytokens&lt;/a&gt;. These tokens are decoy AWS secrets you can deploy strategically across your software assets to regain observability in the grey areas of your IT infrastructure. Knowing the attackers' IP addresses, user agents, the actions they attempted, and the timestamp of each attempt will help you thwart attacks before they can inflict damage on your software supply chain.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final words
&lt;/h2&gt;

&lt;p&gt;Every organization, regardless of size, needs to be prepared to tackle a wide range of emerging risks. These risks often stem from insufficient monitoring of extensive software operations within today's modern enterprises. In this case, an AI research team inadvertently created and exposed a misconfigured cloud storage sharing link, bypassing security guardrails. But how many other departments - support, sales, operations, or marketing - could find themselves in a similar situation? The increasing dependence on software, data, and digital services amplifies cyber risks on a global scale.&lt;/p&gt;

&lt;p&gt;Combatting the spread of confidential information and its associated risks necessitates reevaluating security teams' oversight and governance capabilities. It also requires the provision of appropriate tools to identify and counteract emerging threat categories. While human errors are an inevitable part of the process, GitGuardian is here to guide you along your security journey.&lt;/p&gt;

&lt;p&gt;Read also:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://blog.gitguardian.com/findings-from-the-sophos-2023-active-adversary-report/"&gt;Compromised Credentials Leading Cause Of Initial Attack Access And Faster Attacks: Findings from the Sophos 2023 Active Adversary Report&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://blog.gitguardian.com/a-maturity-model-for-secrets-management/"&gt;When it Comes to Secrets, How Mature is Your Organization?&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Securing your CI/CD: an OIDC Tutorial</title>
      <dc:creator>Thomas Segura</dc:creator>
      <pubDate>Wed, 04 Oct 2023 13:00:00 +0000</pubDate>
      <link>https://dev.to/gitguardian/securing-your-cicd-an-oidc-tutorial-57k6</link>
      <guid>https://dev.to/gitguardian/securing-your-cicd-an-oidc-tutorial-57k6</guid>
      <description>&lt;p&gt;Let's start with a story: have you heard the news about &lt;a href="https://techcrunch.com/2023/01/05/circleci-breach/?ref=blog.gitguardian.com"&gt;CircleCI's breach&lt;/a&gt;? No, not the one where they accidentally leaked some customer credentials a few years back. This time, it's a bit more serious:&lt;/p&gt;

&lt;p&gt;It seems that some unauthorized individuals were able to gain access to CircleCI's systems, compromising the secrets stored in CircleCI. CircleCI advised users to rotate "any and all secrets" stored in CircleCI, including those stored in project environment variables or contexts.&lt;/p&gt;

&lt;p&gt;The CircleCI breach serves as a stark reminder of the risks associated with storing sensitive information in CI/CD systems. Next, let's talk about CI/CD security a bit more.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. CI/CD Security
&lt;/h2&gt;

&lt;p&gt;CI/CD systems, like CircleCI, are platforms used by developers to automate build/deploy processes, which, by definition, means that they need to access other systems to deploy software or use some services, like cloud services.&lt;/p&gt;

&lt;p&gt;For example, after building some artifacts, you probably need to push them to a repository; likewise, when deploying your cloud infrastructure as code, you need to access public cloud providers to create resources.&lt;/p&gt;

&lt;p&gt;As we can imagine, this means that a lot of sensitive information gets passed through the CI/CD platforms daily, because for CI/CD to interact with other systems, some type of authentication and authorization is required, and in most cases, passwords are used for this.&lt;/p&gt;

&lt;p&gt;So, needless to say, the security of the CI/CD systems themselves is critical. Unfortunately, although CI/CD systems are designed to automate software development processes, they might not necessarily be built with security in mind and they are not 100% secure (well, nothing is).&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Best Practices to secure CI/CD systems
&lt;/h2&gt;

&lt;h3&gt;
  
  
  2.1 Best Practice #1: No Long-Lived Credentials
&lt;/h3&gt;

&lt;p&gt;One of the best practices, of course, is not to use long-lived credentials at all.&lt;/p&gt;

&lt;p&gt;For example, when you access AWS, always use temporary security credentials (IAM roles) instead of long-term access keys. Now, when you try to create an access key, AWS even reminds you not to do this and recommends SSO or other methods instead.&lt;/p&gt;

&lt;p&gt;In fact, in many scenarios, you don't need long-term access keys that never expire; instead, you can create IAM roles and generate temporary security credentials. Temporary security credentials consist of an access key ID and a secret access key, but they also include a security token that indicates when the credentials expire.&lt;/p&gt;
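&lt;p&gt;For instance, a CI job can trade an IAM role for temporary credentials with a single STS call (the role ARN and session name below are placeholders):&lt;/p&gt;

```shell
# Request temporary credentials valid for one hour; the response contains
# AccessKeyId, SecretAccessKey, SessionToken, and an Expiration timestamp
aws sts assume-role \
  --role-arn arn:aws:iam::123456789012:role/ci-deploy-role \
  --role-session-name ci-session \
  --duration-seconds 3600
```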

&lt;h3&gt;
  
  
  2.2 Best Practice #2: Don't Store Secrets in CI/CD Systems
&lt;/h3&gt;

&lt;p&gt;By storing secrets in CI systems, we are essentially placing our trust in a third-party service to keep sensitive information safe. However, if that service is ever compromised, as was the case with CircleCI, then all of the secrets stored within it are suddenly at risk, which can result in serious consequences.&lt;/p&gt;

&lt;p&gt;What we can do is to use some secrets manager to store secrets, and use a secure way in our CI/CD systems to retrieve those secrets. If you are not familiar with data security or secrets managers, maybe give &lt;a href="https://blog.gitguardian.com/talking-about-data-security-an-introduction-to-aws-kms-and-hashicorp-vault/"&gt;this blog&lt;/a&gt; a quick read.&lt;/p&gt;

&lt;h3&gt;
  
  
  2.3 Best Practice #3: Rotate/Refresh Your Passwords
&lt;/h3&gt;

&lt;p&gt;Not all systems you are trying to access from your CI/CD systems support some kind of short-lived credentials like AWS does. There are certain cases where you would have to use long-lived passwords, and in those cases, you need to make sure you rotate and refresh those credentials as they periodically expire.&lt;/p&gt;

&lt;p&gt;Some secrets managers can even rotate secrets for you, reducing operational overhead. For example, HashiCorp's Vault supports multiple "engines" (components that store, generate, or encrypt data), and most of its database engines support root password rotation, where Vault manages the rotation automatically for you:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--0ZAhfQnP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://developer.hashicorp.com/_next/image%3Furl%3Dhttps%253A%252F%252Fcontent.hashicorp.com%252Fapi%252Fassets%253Fproduct%253Dtutorials%2526version%253Dmain%2526asset%253Dpublic%25252Fimg%25252Fvault%25252Fui-vault-database-rotate-root.png%2526width%253D963%2526height%253D376%26w%3D2048%26q%3D75" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--0ZAhfQnP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://developer.hashicorp.com/_next/image%3Furl%3Dhttps%253A%252F%252Fcontent.hashicorp.com%252Fapi%252Fassets%253Fproduct%253Dtutorials%2526version%253Dmain%2526asset%253Dpublic%25252Fimg%25252Fvault%25252Fui-vault-database-rotate-root.png%2526width%253D963%2526height%253D376%26w%3D2048%26q%3D75" alt="" width="800" height="312"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you are interested in more best practices, there is a blog on &lt;a href="https://blog.gitguardian.com/how-to-secure-your-ci-cd-pipeline/"&gt;how to secure your CI/CD pipeline&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. How OIDC (OpenID Connect) Works
&lt;/h2&gt;

&lt;p&gt;Following these best practices, let's dive deep into two hands-on tutorials to harden your CI/CD security. Before that, let's do a very short introduction to the technology that enables us to do so: OpenID Connect (OIDC).&lt;/p&gt;

&lt;p&gt;If you'd rather not read the official definition of OIDC from the &lt;a href="https://openid.net/?ref=blog.gitguardian.com"&gt;official website&lt;/a&gt;, here's the TL;DR version: OIDC allows us to use short-lived tokens instead of long-lived passwords, following our best practice #1 mentioned earlier.&lt;/p&gt;

&lt;p&gt;If integrated with CI, we can configure our CI to request short-lived access tokens and use that to access other systems (of course, other systems need to support OIDC on their end).&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Tutorial: GitHub Actions OIDC with AWS
&lt;/h2&gt;

&lt;p&gt;To use OIDC in GitHub Actions workflows, first, we need to configure AWS.&lt;/p&gt;

&lt;h3&gt;
  
  
  4.1 Create an OIDC provider in AWS
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://blog.gitguardian.com/content/images/2023/07/1.png"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--on2BJ5Aa--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://blog.gitguardian.com/content/images/2023/07/1.png" alt="" width="800" height="687"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  For Configure provider, choose OpenID Connect.&lt;/li&gt;
&lt;li&gt;  For the provider URL: Use &lt;a href="https://token.actions.githubusercontent.com/?ref=blog.gitguardian.com"&gt;https://token.actions.githubusercontent.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  Choose "Get thumbprint" to verify the server certificate of your IdP.&lt;/li&gt;
&lt;li&gt;  For the "Audience": Use sts.amazonaws.com.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After creation, copy the provider ARN, which will be used next.&lt;/p&gt;

&lt;p&gt;To learn more about this step, see &lt;a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_providers_create_oidc.html?ref=blog.gitguardian.com#manage-oidc-provider-console"&gt;the official document here&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  4.2 Create a Role with Assume Role Policy
&lt;/h3&gt;

&lt;p&gt;Next, let's configure the role and trust in IAM.&lt;/p&gt;

&lt;p&gt;Here, I created a role named "gha-oidc-role" and attached the AWS-managed policy "AmazonS3ReadOnlyAccess" (ARN: &lt;code&gt;arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;Then, the tricky part is the trust relationships, and here's an example of the value I used:&lt;/p&gt;
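&lt;p&gt;The exact policy isn't reproduced here, but a minimal trust policy for this setup looks roughly like the following (the account ID is a placeholder):&lt;/p&gt;

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
        },
        "StringLike": {
          "token.actions.githubusercontent.com:sub": "repo:IronCore864/vault-oidc-test:*"
        }
      }
    }
  ]
}
```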

&lt;ul&gt;
&lt;li&gt;  The Principal is the OIDC provider's ARN we copied from the previous step.&lt;/li&gt;
&lt;li&gt;  The &lt;code&gt;token.actions.githubusercontent.com:sub&lt;/code&gt; in the condition defines which org/repo can assume this role; here I used &lt;code&gt;IronCore864/vault-oidc-test&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After creation, copy the IAM role ARN, which will be used next.&lt;/p&gt;

&lt;p&gt;To learn more about creating roles for OIDC, see &lt;a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-idp_oidc.html?ref=blog.gitguardian.com"&gt;the official document here&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  4.3 Test AWS Access in GitHub Action Using OIDC
&lt;/h3&gt;

&lt;p&gt;Let's create a simple test workflow:&lt;/p&gt;
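&lt;p&gt;A minimal version of such a workflow might look like this (the role ARN and region are placeholders; check the repository linked in this section for the real file):&lt;/p&gt;

```yaml
name: AWS
on: workflow_dispatch

permissions:
  id-token: write   # required to request the OIDC JWT
  contents: read

jobs:
  test-aws-access:
    runs-on: ubuntu-latest
    steps:
      - name: Configure AWS credentials via OIDC
        uses: aws-actions/configure-aws-credentials@v2
        with:
          role-to-assume: arn:aws:iam::123456789012:role/gha-oidc-role
          aws-region: us-east-1
      - name: Run a simple AWS command
        run: aws s3 ls
```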

&lt;p&gt;This workflow named "AWS" is triggered manually, tries to assume the role we created in the previous step, and runs some simple AWS commands to test that we have access.&lt;/p&gt;

&lt;p&gt;The job or workflow run requires a permission setting with &lt;code&gt;id-token: write&lt;/code&gt;. You won't be able to request the OIDC JWT ID token if the permissions setting for id-token is set to read or none.&lt;/p&gt;

&lt;p&gt;For your convenience, I put the workflow YAML file &lt;a href="https://github.com/IronCore864/vault-oidc-test/blob/main/.github/workflows/aws.yaml?ref=blog.gitguardian.com"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;After triggering the workflow, everything works with no access keys or secrets needed whatsoever:&lt;br&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--9K1l2KHD--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://blog.gitguardian.com/content/images/2023/07/2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--9K1l2KHD--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://blog.gitguardian.com/content/images/2023/07/2.png" alt="2" width="800" height="980"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://blog.gitguardian.com/github-actions-security-cheat-sheet/"&gt;GitHub Actions Security Best Practices [cheat sheet included]&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  5. Tutorial: GitHub Actions OIDC with HashiCorp Vault
&lt;/h2&gt;

&lt;p&gt;Unfortunately, not all systems that you are trying to access from your CI/CD workflows support OIDC, and sometimes you would still need to use passwords.&lt;/p&gt;

&lt;p&gt;However, using hardcoded passwords means we need to duplicate and store them in GitHub as secrets, and this violates our aforementioned best practice.&lt;/p&gt;

&lt;p&gt;A better approach is to use a secrets manager to store secrets and set up OIDC between your CI and your secrets manager to retrieve secrets from your secrets manager, with no password used in the process.&lt;/p&gt;

&lt;h3&gt;
  
  
  5.1 Install HashiCorp Vault
&lt;/h3&gt;

&lt;p&gt;In this tutorial, we will run a local dev server (DO NOT DO THIS IN PRODUCTION) and expose it to the public internet so that GitHub Actions can reach it.&lt;/p&gt;

&lt;p&gt;The quickest way to install Vault on a Mac is probably &lt;code&gt;brew&lt;/code&gt;. First, install the HashiCorp tap, a repository of all of HashiCorp's Homebrew packages: &lt;code&gt;brew tap hashicorp/tap&lt;/code&gt;. Then, install Vault: &lt;code&gt;brew install hashicorp/tap/vault&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;For other systems, refer to &lt;a href="https://developer.hashicorp.com/vault/docs/install?ref=blog.gitguardian.com"&gt;the official doc here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;After installation, we can quickly start a local dev server by running:&lt;/p&gt;
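&lt;p&gt;The dev server is a single command (again, dev mode only, never production):&lt;/p&gt;

```shell
# Starts an in-memory, auto-unsealed Vault on 127.0.0.1:8200
# and prints a root token to stdout (dev mode only!)
vault server -dev

# In another terminal, point the CLI at the dev server:
export VAULT_ADDR='http://127.0.0.1:8200'
```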

&lt;p&gt;However, this is only running locally on our laptop, not accessible from the public internet. To expose it to the internet so that GitHub Actions can reach it, we use &lt;a href="https://ngrok.com/?ref=blog.gitguardian.com"&gt;ngrok&lt;/a&gt;, the fastest way to put your app on the internet. For detailed installation and usage, see the official doc. After installation, we can simply run &lt;code&gt;ngrok http 8200&lt;/code&gt; to expose the Vault port. Take note of the public URL of your local Vault.&lt;/p&gt;

&lt;h3&gt;
  
  
  5.2 Enable JWT Auth
&lt;/h3&gt;

&lt;p&gt;Execute the following to enable JWT auth in Vault:&lt;/p&gt;
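&lt;p&gt;Enabling the auth method is one command:&lt;/p&gt;

```shell
# Mount the JWT auth method at the default path auth/jwt
vault auth enable jwt
```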

&lt;p&gt;Apply the configuration for GitHub Actions:&lt;/p&gt;
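&lt;p&gt;This points Vault at GitHub's OIDC issuer so it can validate incoming tokens:&lt;/p&gt;

```shell
# Trust tokens issued by GitHub's OIDC provider
vault write auth/jwt/config \
  bound_issuer="https://token.actions.githubusercontent.com" \
  oidc_discovery_url="https://token.actions.githubusercontent.com"
```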

&lt;p&gt;Create a policy that grants access to the specified paths:&lt;/p&gt;
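&lt;p&gt;A minimal policy granting read access to the secret we will create later (the policy name &lt;code&gt;gha-policy&lt;/code&gt; is an arbitrary choice):&lt;/p&gt;

```shell
# Grant read access to the kv-v2 secret at secret/data/aws
echo 'path "secret/data/aws" { capabilities = ["read"] }' | vault policy write gha-policy -
```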

&lt;p&gt;Create a role to use the policy:&lt;/p&gt;
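&lt;p&gt;And the role, passed as JSON on stdin (the role name and TTL here are illustrative):&lt;/p&gt;

```shell
# Accept JWTs whose repository_owner claim is IronCore864,
# and attach the policy created above to the resulting token
echo '{
  "role_type": "jwt",
  "user_claim": "actor",
  "bound_claims": { "repository_owner": "IronCore864" },
  "policies": ["gha-policy"],
  "ttl": "10m"
}' | vault write auth/jwt/role/gha-role -
```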

&lt;p&gt;When creating the role, ensure that the &lt;code&gt;bound_claims&lt;/code&gt; parameter matches your security requirements and has at least one condition. To check arbitrary claims in the received JWT payload, the &lt;code&gt;bound_claims&lt;/code&gt; parameter contains a set of claims and their required values. In the above example, the role will accept authentication requests from any repo owned by the user (or org) IronCore864.&lt;/p&gt;

&lt;p&gt;To see all the available claims supported by GitHub's OIDC provider, see &lt;a href="https://docs.github.com/en/actions/deployment/security-hardening-your-deployments/about-security-hardening-with-openid-connect?ref=blog.gitguardian.com#configuring-the-oidc-trust-with-the-cloud"&gt;"About security hardening with OpenID Connect"&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  5.3 Create a Secret in Vault
&lt;/h3&gt;

&lt;p&gt;Next, let's create a secret in Vault for testing purposes, and we will try to use GitHub Actions to retrieve this secret using OIDC.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--d19SbBna--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://blog.gitguardian.com/content/images/2023/07/3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--d19SbBna--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://blog.gitguardian.com/content/images/2023/07/3.png" alt="3" width="800" height="348"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here we created a secret named "aws" under "secret", and there is a key named "accessKey" in the secret with some random testing value.&lt;/p&gt;

&lt;p&gt;To verify, we can run:&lt;/p&gt;
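&lt;p&gt;With the kv helper this is:&lt;/p&gt;

```shell
# Read the secret back; -field extracts just the accessKey value
vault kv get -field=accessKey secret/aws
```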

&lt;p&gt;Note that the "Secret Path" is actually &lt;code&gt;secret/data/aws&lt;/code&gt;, rather than &lt;code&gt;secret/aws&lt;/code&gt;. This is because, with the &lt;a href="https://developer.hashicorp.com/vault/docs/secrets/kv/kv-v2?ref=blog.gitguardian.com"&gt;kv engine v2&lt;/a&gt;, the &lt;a href="https://developer.hashicorp.com/vault/api-docs/secret/kv/kv-v2?ref=blog.gitguardian.com"&gt;API path&lt;/a&gt; has an added "data" segment.&lt;/p&gt;

&lt;h3&gt;
  
  
  5.4 Retrieve Secret from Vault in GitHub Actions Using OIDC
&lt;/h3&gt;

&lt;p&gt;Let's create another simple test workflow:&lt;/p&gt;
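&lt;p&gt;A minimal sketch using the official &lt;code&gt;hashicorp/vault-action&lt;/code&gt; might look like this (the Vault URL is a placeholder for your ngrok URL; see the repository linked in this section for the real file):&lt;/p&gt;

```yaml
name: Vault
on: workflow_dispatch

permissions:
  id-token: write   # required to request the OIDC JWT
  contents: read

jobs:
  retrieve-secret:
    runs-on: ubuntu-latest
    steps:
      - name: Import secret from Vault
        id: secrets
        uses: hashicorp/vault-action@v2
        with:
          url: https://your-ngrok-url.example
          method: jwt
          path: jwt
          role: gha-role
          secrets: |
            secret/data/aws accessKey | AWS_ACCESS_KEY
      - name: Use the secret (via env)
        run: echo "Retrieved a value of length ${#AWS_ACCESS_KEY}"
```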

&lt;p&gt;This workflow named "Vault" is triggered manually, tries to assume the role we created in the previous steps, and retrieves the secret we just created.&lt;/p&gt;

&lt;p&gt;To use the secret, we can either use "env" or step outputs, as shown in the example above.&lt;/p&gt;

&lt;p&gt;Similarly to the previous AWS job, it requires a permission setting with &lt;code&gt;id-token: write&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;For your convenience, I put the workflow YAML file &lt;a href="https://github.com/IronCore864/vault-oidc-test/blob/main/.github/workflows/vault.yaml?ref=blog.gitguardian.com"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;After triggering the workflow, everything works with no secrets used to access our Vault:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--VrmwmaGF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://blog.gitguardian.com/content/images/2023/07/4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--VrmwmaGF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://blog.gitguardian.com/content/images/2023/07/4.png" alt="4" width="650" height="858"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;In this blog, we started with the infamous CircleCI breach, went on to talk about security in CI/CD systems with some best practices, did a quick introduction to OIDC, and did two hands-on tutorials on how to use it with your CI. After this tutorial, you should be able to configure secure access between GitHub Actions and your cloud providers and retrieve secrets securely using OIDC.&lt;/p&gt;

&lt;p&gt;If you enjoyed this article, please like, comment, and subscribe. See you in the next one!&lt;/p&gt;

</description>
      <category>devops</category>
      <category>tutorial</category>
      <category>github</category>
      <category>cicd</category>
    </item>
  </channel>
</rss>
