
Bernd Verst for Microsoft Azure

Originally published at bernd.dev

From Cloud Security Alert to Open Source Bugfix

Perhaps you've seen vulnerability reports in your CI/CD pipeline or in tools like npm. Cloud infrastructure has these too, and I was surprised to get such an alert recently. Naturally, I had to investigate where I went wrong (and, of course, mitigate the problem in the process). Little did I know what journey I was embarking on...

The security alert

It all started a few weeks ago when I received an email from a colleague with the following content.

Secure transfer to storage accounts should be enabled

Subscription ID: devrel-berndverst-demo-test  
Resource: /subscriptions/XXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXXX/
    resourcegroups/kubeflowrelease/providers/
    microsoft.storage/storageaccounts/ > fairingXXXXXXXXXXXXXXXXX  
[1-Click mitigation link]

Note: several similar results have been omitted

This was my gut reaction:

😱 This cannot be. What did I do wrong? Did I mess something up? 😰

My colleague had used Azure Security Center to generate a report of security issues. Since I had never received such a message before (I could have set up email alerts in Azure Security Center myself), I was a bit skeptical and first verified the authenticity of the email. The email domain, the Sender Policy Framework (SPF) check, and the DomainKeys Identified Mail (DKIM) signature all indicated that the email was legitimate.
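A minimal sketch of that kind of check, assuming the receiving mail server records its verdicts in an `Authentication-Results` header. The sample message, hostnames, and addresses are hypothetical, and the header is only trustworthy when it was added by your own mail provider:

```python
import re
from email import message_from_string

# Hypothetical raw message; a real one would come from your mail client's
# "show original" / "view source" feature.
SAMPLE = (
    "Authentication-Results: mx.example.com;\n"
    " spf=pass smtp.mailfrom=example.org;\n"
    " dkim=pass header.d=example.org\n"
    "From: alerts@example.org\n"
    "Subject: Security report\n"
    "\n"
    "body\n"
)


def auth_results(raw_message):
    """Collect the first SPF/DKIM/DMARC verdict found in Authentication-Results."""
    msg = message_from_string(raw_message)
    results = {}
    for header in msg.get_all("Authentication-Results", []):
        found = re.findall(r"\b(spf|dkim|dmarc)=(\w+)", str(header), flags=re.I)
        for method, verdict in found:
            # Keep the first verdict seen for each method.
            results.setdefault(method.lower(), verdict.lower())
    return results


def looks_legitimate(raw_message):
    """True only if both SPF and DKIM are reported as 'pass'."""
    results = auth_results(raw_message)
    return results.get("spf") == "pass" and results.get("dkim") == "pass"
```

This only reads verdicts that a mail server already computed; it does not perform the SPF or DKIM checks itself.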

The 1-Click mitigation link (which I verified went to the Azure web portal domain) took me directly to the recommendations section in the Azure Security Center for my Azure account and jumped to the relevant entries. The Security Center recommendations documentation also had more information on the alert I had received.

What I did wrong (supposedly)

Azure Blob Storage accounts should be configured to serve traffic only over HTTPS.

The 1-Click mitigation in Azure Security Center resolved the issue, but I could also have followed the instructions in the docs.
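The same mitigation can be scripted. The following is a sketch, assuming the `azure-mgmt-storage` package and an already authenticated `StorageManagementClient` passed in as `client`; the helper names are made up for illustration:

```python
def accounts_missing_https(accounts):
    """Return the names of accounts whose secure-transfer flag is not set.

    `accounts` is any iterable of objects with `name` and
    `enable_https_traffic_only` attributes, such as the items returned by
    client.storage_accounts.list_by_resource_group(...).
    """
    return [
        account.name
        for account in accounts
        if not getattr(account, "enable_https_traffic_only", False)
    ]


def require_https(client, resource_group, account_name):
    """Turn on 'Secure transfer required' for one existing storage account."""
    # Imported lazily so the audit helper above works without the SDK installed.
    from azure.mgmt.storage.models import StorageAccountUpdateParameters

    params = StorageAccountUpdateParameters(enable_https_traffic_only=True)
    return client.storage_accounts.update(resource_group, account_name, params)
```

A typical use would be to list the accounts in a resource group, pass them to `accounts_missing_https`, and call `require_https` for each name that comes back.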

🤔 Why would I ever not want secure blob transfers? This does not sound like me.

Taking a closer look

I noticed that all affected blob storage accounts had programmatically generated names. Furthermore they all resided in my Azure resource group kubeflowrelease and contained the string fairing in the account names.
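Spotting such a pattern by eye works for a handful of alerts. For a longer list, a small sketch like the one below could group the flagged resource IDs. The IDs used with it are hypothetical placeholders that only mimic the shape of the ID in the alert:

```python
from collections import Counter


def parse_resource_id(resource_id):
    """Split an Azure resource ID into lowercase-key / value pairs.

    The ID alternates keys and values, for example
    /subscriptions/<id>/resourcegroups/<group>/providers/<namespace>/<type>/<name>
    """
    parts = [p for p in resource_id.split("/") if p]
    return {parts[i].lower(): parts[i + 1] for i in range(0, len(parts) - 1, 2)}


def summarize(resource_ids, marker="fairing"):
    """Count flagged accounts per (resource group, name contains marker)."""
    counts = Counter()
    for rid in resource_ids:
        fields = parse_resource_id(rid)
        group = fields.get("resourcegroups", "").lower()
        name = fields.get("storageaccounts", "")
        counts[(group, marker in name.lower())] += 1
    return counts
```

If every entry lands in a single bucket such as `("kubeflowrelease", True)`, one tool is very likely responsible for all of the accounts.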

Days earlier I had ported an end-to-end Kubeflow tutorial that uses the MNIST training set to Azure. In the process, of course, I deployed Kubeflow to my Kubernetes cluster and worked through the tutorial I wrote.

The Kubeflow project is dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable and scalable.

The culprit: Kubeflow

💡Kubeflow somehow created the storage accounts in question

The Kubeflow Fairing helper library created the storage account without forcing secure transport.

How to correctly create secure storage accounts

Looking at the relevant documentation I discovered something very interesting:

By default, the Secure transfer required property is enabled when you create a storage account in Azure portal. However, it is disabled when you create a storage account with SDK.

In other words, the SDKs default to exactly the behavior that Azure Security Center alerted me about: creating storage accounts that do not require secure transfer.

❓Kubeflow Fairing is written in Python. I wonder how to create secure storage accounts with the Azure Python SDK.

The Python SDK sample code at github.com/Azure-Samples/storage-python-manage does not explicitly set the secure option. The sample code uses the insecure default [Edit: it turns out that this depends on the SDK version used]. I made this pull request to fix this.
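For illustration, here is a minimal sketch of creating an account with the option set explicitly. It is not the code from the pull request. It assumes the `azure-mgmt-storage` package and an authenticated `StorageManagementClient` passed in as `client`; the SKU and kind are illustrative defaults, and the helper names are hypothetical:

```python
def secure_create_kwargs(location, sku_name="Standard_LRS", kind="StorageV2"):
    """Keyword arguments for StorageAccountCreateParameters with HTTPS enforced."""
    return {
        "sku": {"name": sku_name},
        "kind": kind,
        "location": location,
        # Set explicitly so the result does not depend on the SDK/API default.
        "enable_https_traffic_only": True,
    }


def create_secure_account(client, resource_group, account_name, location):
    """Create a storage account that only accepts HTTPS traffic."""
    # Imported lazily so the helper above can be used without the SDK installed.
    from azure.mgmt.storage.models import Sku, StorageAccountCreateParameters

    kwargs = secure_create_kwargs(location)
    kwargs["sku"] = Sku(**kwargs["sku"])
    params = StorageAccountCreateParameters(**kwargs)

    operations = client.storage_accounts
    # Newer SDK releases name the long-running call begin_create,
    # older ones call it create; both return a poller.
    start = getattr(operations, "begin_create", None) or operations.create
    return start(resource_group, account_name, params).result()
```

Passing `enable_https_traffic_only=True` explicitly means the account is created securely no matter which default the installed SDK version happens to use.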

Fixing the sample code led to several unrelated issues with the tests, all of which I fixed:

  • The Azure Python SDK version was not locked and the API output had changed, no longer matching the mocked responses
  • Travis was running tests for a version of Python that was no longer supported
  • Travis was not running tests for current versions of Python 3.7 and 3.8.

It took me hours, but the above pull request fixed it all.

What did Kubeflow Fairing do?

Taking a close look, we can see that Kubeflow Fairing essentially copied the Azure Python sample code.

I would have done the same. Good thing that is fixed now!

Only use secure transport mode for new Azure Storage Accounts #477

What this PR does / why we need it:

Currently Fairing creates new Storage Accounts on Azure (if necessary) that do not use best practices of enabling strict transport security. Azure Security Center will flag these storage accounts.

This PR uses the recommended default (which unfortunately isn't the default in the Python SDK).

References:
https://github.com/Azure-Samples/storage-python-manage#create-account
https://docs.microsoft.com/en-us/azure/storage/common/storage-require-secure-transfer
https://docs.microsoft.com/en-us/python/api/azure-mgmt-storage/azure.mgmt.storage.v2018_07_01.models.storageaccountcreateparameters?view=azure-python

Release note: NONE


Success?

- ✅ Kubeflow Fairing now creates secure storage accounts.
- ✅ Anyone finding the Azure Python SDK samples
  for storage account creation will create secure accounts.

What about the API / server-side default?

Since API version 2019-04-01 the secure mode is the default. All current SDKs will be using this API version or a newer one which defaults to the secure setting.
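To make the version dependence concrete, here is a small sketch. The cut-off date is taken from the statement above. The request shape follows the Azure Resource Manager REST convention for storage accounts; the SKU and kind values are illustrative, and the helper names are hypothetical:

```python
import json

# Per the text above: secure transfer is the default from this API version on.
SECURE_DEFAULT_SINCE = "2019-04-01"


def secure_by_default(api_version):
    """True if the given management API version defaults to HTTPS-only.

    API versions are ISO dates, optionally with a suffix such as "-preview",
    so comparing the first ten characters as strings orders them correctly.
    """
    return api_version[:10] >= SECURE_DEFAULT_SINCE


def create_request(subscription_id, resource_group, account_name, location, api_version):
    """Build the URL and JSON body for a storage account create (PUT) call."""
    url = (
        f"https://management.azure.com/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}/providers/Microsoft.Storage"
        f"/storageAccounts/{account_name}?api-version={api_version}"
    )
    body = {
        "sku": {"name": "Standard_LRS"},
        "kind": "StorageV2",
        "location": location,
        # Explicit, so the outcome is the same for old and new API versions.
        "properties": {"supportsHttpsTrafficOnly": True},
    }
    return url, json.dumps(body)
```

With this cut-off, `secure_by_default("2018-07-01")` is false, which matches the API version of the models referenced in the pull request above.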

So why the Kubeflow Fairing issue after all?

Kubeflow Fairing was using the deprecated azure library instead of azure-mgmt-storage. As such, it was consuming an outdated version of the API with the insecure default setting. API and SDK versioning is hard!
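A quick way to spot this situation in a Python environment is to look at which distributions are installed. This is a sketch using only the standard library (Python 3.8 or newer); the wording of the findings is made up for illustration:

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def audit(lookup=installed_version):
    """List findings about the Azure storage packages in this environment."""
    findings = []
    legacy = lookup("azure")
    if legacy is not None:
        findings.append(f"deprecated 'azure' meta-package installed ({legacy})")
    if lookup("azure-mgmt-storage") is None:
        findings.append("'azure-mgmt-storage' is not installed")
    return findings
```

Calling `audit()` with no arguments inspects the current environment; the `lookup` parameter exists only so the logic can be exercised without changing what is installed.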

The edits to the Azure Python sample documentation weren't strictly necessary. But at least I fixed the tests.

We must go deeper
