Biggest security takeaway of 2020 - Don't leak secrets on GitHub

mackenziejj ・8 min read

2020 has been crazy, especially in security.

We could list all the insane things that have happened this year, but you were there; you lived through it. In the world of cybersecurity, we have seen some unprecedented malicious activity: widespread phishing attacks exploiting the fear of Covid-19, attacks on hospitals, a huge increase in nation-state attacks... Heck, even Jeff Bezos and Kanye got their Twitter profiles hacked. Picking a takeaway from the year is difficult, to say the least. But towards the end of the year in particular, we saw a massive increase in high-profile attacks that exploited a common vulnerability: leaked credentials in git repositories, namely on GitHub.

The recent headlines

One of the biggest data breaches we saw this year came from Brazil. Data from 16 million Brazilian Covid-19 patients was exposed online, a list that included the president of Brazil, Jair Bolsonaro. This leak contained a trove of sensitive information such as addresses, names, medication regimens and even medical history.

And more recently, there is the still-unfolding hack of SolarWinds. The company has acknowledged that hackers injected malware into a software update for its Orion platform, a suite of products broadly used across the U.S. federal government and Fortune 500 organizations to monitor the health of their IT networks. The complete extent of this hack is still unknown, as it now affects thousands of SolarWinds customers, and the repercussions won't be fully known until well into 2021. What we do know right now:

  • Attackers gained access to the SolarWinds update server and injected a small amount of malicious code into an update.
  • The hackers were able to use the injected malware to breach SolarWinds customers through the update.
  • Attackers gained access to email communications in the U.S. Treasury and Commerce departments.
  • The intrusion was also used to infiltrate computer networks at the U.S. Department of Homeland Security (DHS).
  • Up to 18,000 customers (again, government and Fortune 500 organizations) have been affected by the intrusion.

So what do these two massive security incidents have in common?

Both involved employees leaking secrets into personal public GitHub accounts.

Analyzing the incidents

Analyzing all the breaches this year would take a very long series of articles and include some pretty big names like Starbucks and even Mercedes-Benz. But if we take the two high-profile examples above, which are possibly the biggest security events of the year, we can paint a picture of what set the incidents in motion.

Brazilian Covid-19 data leak

In this case, the leak came to light after a GitHub user spotted a spreadsheet containing the passwords to government medical systems within a personal GitHub repository. The GitHub account in question was owned by an employee of the Albert Einstein Hospital in the city of Sao Paulo. The user who discovered the spreadsheet later notified the Brazilian newspaper Estadao, which analyzed the data and notified the hospital and the Brazilian Ministry of Health.

Among the systems that had credentials exposed were E-SUS-VE and Sivep-Gripe, two government databases used to store data on COVID-19 patients. E-SUS-VE was used for recording COVID-19 patients with mild symptoms, while Sivep-Gripe was used to keep track of hospitalized cases.

The two databases contained sensitive details such as patient names, addresses and ID information, but also healthcare records such as medical history and medication regimens.

Estadao reporters said that data for Brazilians across all 27 states was included in the two databases, including high profile figures like the country's president Jair Bolsonaro, the president's family, seven government ministers, and the governors of 17 Brazilian states.

SolarWinds breach

Disclaimer: SolarWinds has yet to confirm the root cause of the hack that saw malicious code injected into a software update, and with the current information there is no way to know with certainty that a leaked credential was the entry point for this sophisticated and complex attack. What is known is that a SolarWinds credential giving access to the update server was publicly exposed; it has not yet been connected to the attack.

What we do know is that security researcher Vinoth Kumar gained access to a SolarWinds FTP update server on the 19th of November 2019 after discovering credentials to the server within the public GitHub repository of a SolarWinds employee. Mr Kumar notified the SolarWinds public disclosure team via email. One line in particular stands out: "Via this [FTP credential] any hacker could upload a malicious exe and update it with release SolarWinds product", which is essentially what has been reported to have happened. The FTP credentials were exposed within a config file named PurgeApp.exe.config, which was committed back in June 2018. This was confirmed by the SolarWinds team in a reply email to Mr Kumar.

The attack was no doubt carefully planned and executed, but the attackers first needed to gain entry into the company's systems to be able to embed their malicious code into the software update. The leaked FTP credentials, though not confirmed as the entry point, are certainly a vulnerability that could have been used by the attackers to achieve this objective.

"If a group of well-funded hackers can succeed in modifying just a bit of code somewhere and getting folks to install it as part of a legitimate software suite, they are gaining insider access to organisations which may be otherwise impenetrable, such as governments."  Jackie Singh

Why this is such a predominant issue

Credential theft has long been a known and reported technique for attackers, as outlined in the MITRE ATT&CK framework. But there are some unique challenges that git, a shift in software development and even the year that was 2020 have exacerbated.

Leaks occurring outside of organizations' control

The first thing to point out is that in both of the cases mentioned, the secret leak was on an employee's personal GitHub account. Organizations have no authority to force employees to have any security measures in place on their personal profiles. It is, after all, outside of the company's scope and control. So even if organizations have detection capabilities within their own version control systems (which they absolutely should have), there is no guarantee that an employee won't make a mistake on their own repository. Leaked credentials are a human mistake we are trying to detect, and arguably human mistakes are harder to detect than malicious activity. They will always happen, and there is a huge number of ways secrets could leak:

  • Pushed to wrong repositories
  • Temporary code merged with secrets
  • Private repositories made public with secrets buried in history
  • Application logs, debug logs and config files committed with secrets

We can't prevent all human errors, but we can detect them and take action on them.

Rapid distribution of teams due to Covid-19

Another aspect that has influenced this issue is how suddenly many employees found themselves working from home as we tried to deal with lockdowns throughout the world. Many no longer had the safety bubble of a secure, closed network, yet office systems still needed to be accessed and developers still needed access to secrets to continue their work. While this problem can be overcome, the rapid rate at which organizations and people have had to adapt meant that processes, documentation and tools could not be properly set up. As a result, more sensitive information was being shared between newly distributed teams and stored in multiple locations.

Challenges associated with detecting credentials

Finally, detecting secrets is hard, really hard! API keys are often random, high-entropy strings:

SECRET_KEY = 'wzftctmm@%c#ffp$v04i=mh8su*o!am^6op9+22xt2f2f#yrc*'

But high-entropy strings are also used everywhere in code: as UUIDs (universally unique identifiers), within URLs and even as names for stored assets. They are used to make the chance of a collision almost impossible. This means that telling secrets apart from other high-entropy strings is challenging and requires factoring in many different weak signals. To do this, algorithms need to be trained on massive amounts of data, which is expensive in resources and time-consuming.
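To make the entropy problem concrete, here is a minimal sketch that computes Shannon entropy (bits per character) for the key above, a freshly generated UUID and an ordinary identifier. The function name `shannon_entropy` and the sample labels are illustrative assumptions, not part of any particular detection tool.

```python
import math
import uuid
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    # Sum -p * log2(p) over the observed character frequencies.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    samples = {
        "secret key": "wzftctmm@%c#ffp$v04i=mh8su*o!am^6op9+22xt2f2f#yrc*",
        "random UUID": str(uuid.uuid4()),  # not a secret, but still random
        "plain identifier": "get_user_profile",
    }
    for label, value in samples.items():
        print(f"{label:>16}: {shannon_entropy(value):.2f} bits/char")
```

Running it shows why entropy alone is only a weak signal: the UUID is random by design yet harmless, so a detector that relied on a single entropy threshold would tend to either flag it or miss real keys.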

All this has meant that within the year of 2020, leaked credentials have proven to be one of the biggest security trends we have faced yet.


So how do we tackle the issue of secrets sprawling into git repositories so this isn't the takeaway of 2021? This is a complex issue, particularly at a large scale, but there are some general practices that we can put in place.

Manage secrets using vaults or key managers at an organization level

This first one is the most obvious. Secrets are the crown jewels of every organization, so they need to be tightly wrapped with fine-grained auditing. There are many platforms that do this well, and all companies should be implementing secure secret management as part of their security infrastructure. Which system to use and how to implement it greatly depends on the organization, but HashiCorp Vault has long been the gold standard. These systems help prevent secret sprawl and are the first step in making sure your secrets do not end up in a git repository for a bad guy to find.

Tool recommendations:
HashiCorp Vault
AWS Secrets Manager
Azure Key Vault
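Whichever manager is used, the application code should ask for the secret at runtime instead of carrying it in source. The sketch below assumes the secret manager or deployment tooling injects the value into the process environment; `get_secret` and `MissingSecretError` are hypothetical names for illustration, not the API of any of the tools above.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is not present in the environment."""


def get_secret(name: str) -> str:
    """Read a secret from the process environment instead of source code."""
    value = os.environ.get(name)
    if not value:
        # Fail loudly rather than falling back to a hardcoded default.
        raise MissingSecretError(f"required secret {name!r} is not set")
    return value
```

A settings file would then contain `SECRET_KEY = get_secret("SECRET_KEY")`, so nothing sensitive is ever committed, even if the repository later becomes public.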

Implement automated secrets detection within the organization's assets

To prevent secrets being used in an attack, we need visibility into all of the company's services and systems. Secrets sprawl because they are often not visible: buried in git history, hardcoded into temporary source code and hidden within internal communication. Identifying the secrets that have sprawled across internal systems will allow you to minimize the risk of a secret leaking into a space outside of the organization's control.

Tool recommendation
GitGuardian private monitoring
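As a rough illustration of what automated detection involves, the sketch below combines two weak signals: a variable name that looks secret-related and a high-entropy value. The regular expression, the 3.5 bits-per-character threshold and the names `scan_text` and `Finding` are assumptions chosen for this example; it is not how GitGuardian or any other product works internally.

```python
import math
import re
from collections import Counter
from typing import Iterator, NamedTuple

# Hypothetical heuristic: a secret-looking variable name assigned a quoted string.
ASSIGNMENT = re.compile(
    r"""(?ix)
    \b([a-z0-9_]*(?:secret|token|passw(?:or)?d|api_?key)[a-z0-9_]*)  # variable name
    \s*[:=]\s*                                                        # assignment
    (['"])(.{8,}?)\2                                                  # quoted value
    """
)


class Finding(NamedTuple):
    line_number: int
    name: str
    entropy: float


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def scan_text(text: str, min_entropy: float = 3.5) -> Iterator[Finding]:
    """Yield assignments whose name looks secret and whose value looks random."""
    for number, line in enumerate(text.splitlines(), start=1):
        for match in ASSIGNMENT.finditer(line):
            name, value = match.group(1), match.group(3)
            score = shannon_entropy(value)
            if score >= min_entropy:
                yield Finding(number, name, round(score, 2))
```

The same function can be pointed at the output of `git log -p` to look through history rather than just the current files. A heuristic this simple will miss weak passwords and unquoted values, which is exactly why production detectors weigh many more signals.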

Implement automated secrets detection on the organization's perimeter

As discussed, the incidents originated within employees' personal public GitHub accounts. Organizations have no authority to enforce security policies here, but they can monitor them (because the bad guys are). If you have 2 developers working for a company, this is a small problem, but if you have 500, even finding these developers on platforms like GitHub is a huge challenge. GitGuardian is able to link employees to their companies and then monitor their repositories for company assets and secrets.

Tool recommendation
GitGuardian enterprise public monitoring

Rotate secrets regularly

Some secrets are easily rotated and some are not. The secrets that can be rotated easily should be rotated frequently; a secrets management service like Vault will also help manage this. This way, if a secret is buried in a git repository that is made public, the chances of that secret still being an active vulnerability are reduced.
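A rotation policy can be as simple as comparing each secret's creation date with a maximum age. The sketch below assumes you keep an inventory mapping secret names to timezone-aware creation timestamps; the inventory format and the names `needs_rotation` and `stale_secrets` are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


def needs_rotation(
    created_at: datetime, max_age: timedelta, now: Optional[datetime] = None
) -> bool:
    """Return True when a secret is at least `max_age` old."""
    now = now or datetime.now(timezone.utc)
    return now - created_at >= max_age


def stale_secrets(
    inventory: Dict[str, datetime], max_age: timedelta, now: Optional[datetime] = None
) -> List[str]:
    """List the names of all secrets in the inventory that are due for rotation."""
    return sorted(
        name
        for name, created_at in inventory.items()
        if needs_rotation(created_at, max_age, now)
    )
```

With a 90-day policy, a credential created in June 2018 and checked in December 2020 would have been flagged long before anyone found it in a public repository.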

Set minimum permissions to secrets

The last thing we want to do is set our permissions to always be the minimum feasible. Once attackers have access to a system, they are often able to elevate privileges and move laterally between services. So when secrets are created, they should be given the minimum scope, so that in the worst-case scenario, where an attacker holds a valid credential, we can limit the damage and prevent lateral movement.
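One way to check for over-permissioned credentials is to compare the scopes a credential was granted with the scopes its consumer actually needs. This is only a sketch: the scope strings and the `excess_scopes` helper are hypothetical, and real platforms each have their own permission model.

```python
from typing import FrozenSet, Iterable


def excess_scopes(granted: Iterable[str], required: Iterable[str]) -> FrozenSet[str]:
    """Return the scopes that were granted but are not actually required."""
    return frozenset(granted) - frozenset(required)


if __name__ == "__main__":
    # Hypothetical credential used only to download release artifacts.
    granted = {"artifacts:read", "artifacts:write", "admin"}
    required = {"artifacts:read"}
    print(sorted(excess_scopes(granted, required)))
```

Anything the function returns is permission an attacker would get for free if the credential leaked, so it is a good candidate for removal.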

Wrap up

2020 has been a crazy year for security, and just about everything else. But one of the biggest trends we have seen this year is the number of data breaches and attacks that have happened because of leaked secrets in public GitHub repositories. This is exacerbated by the lack of visibility companies have into git repositories outside of their control, the rapid displacement of employees in 2020 due to Covid-19 lockdowns and the probabilistic nature of secrets detection.

While it is not an easy task, we can do things to prevent breaches from leaked secrets. This includes implementing secret management systems, scanning for secrets within company assets and employees' repositories, rotating keys frequently and setting minimal permissions for secrets.
