Wassim Chegham for Microsoft Azure

Posted on Sep 30, 2020

CLAD Model for Serverless Security

#javascript #serverless #mscreate #security

This is a guest post by Guy Podjarny (@guypod), co-founder and president of Snyk. Prior to that, Guy was CTO at Akamai.

How does serverless help with security?

Let's start with just a little bit of taxonomy. What does serverless mean? Serverless could mean different things to different people. So, just for the purpose of this post, I'm going to use the definition of serverless functions, as in Azure Functions. Basically, think about functions, in the context of serverless, that run on top of the cloud platform that manages the VM and the operating system for you. You just need to deploy the functions.

In this post, we're going to dig a lot into the security gaps, because this post is primarily meant to be practical and give you something to take away. However, let me briefly talk about the security advantages that serverless brings in.

Serverless does implicitly tackle a bunch of security concerns by pushing the handling of them to the underlying platform. The three notable ones are:

1. Unpatched operating systems

Serverless takes away the server wrangling. It takes away the need to patch your servers and your operating system, and unpatched systems are one of the primary ways attackers can get in.

Serverless means that the platform patches the servers and the operating systems for you, and generally this is a core competency of these platforms, so they do it quite well.

2. Denial of service attacks

Serverless tackles denial of service attacks well. Serverless naturally scales elastically to handle large volumes of good traffic, and because of that, it can also handle a substantial amount of bad traffic that might be trying to use up your capacity so that you can’t serve legitimate users.

You can still get DDoS'ed, and you can get a big bill if you are using serverless, but it's harder for the attacker to do successfully.

3. Long-standing compromised servers

This is probably something that doesn't get as much credit as it should. Serverless means that the servers are very short-lived. The components that run your software come in, and they go away. That implies that a very typical attack, where the attacker compromises a server and then stays on it while advancing step by step, cannot really be done. Attackers need to do an end-to-end attack in one go, which is harder and carries a higher risk of exposure.

What's left?

So, even though serverless helps with all these things, it doesn't completely secure your application. There's a lot of responsibility that still lives with you, the developer. Let's dig into that responsibility.

We're going to go through them in a model I call CLAD:

Code: this is your function’s code, which might contain vulnerabilities.
Libraries: the components or binaries that you pulled in through your app, from npm, Maven or PyPI. They're still part of your application, and over time they might have known vulnerabilities in them that attackers can exploit.
Access: which is where you may have given too much permission to a function and therefore made it either riskier if an attacker compromises it, or made it easier for an attacker to access it.
Data: which is a little bit different in serverless, because you take away the transient data that might live on a server.

So, let's go one by one.

Code (the function)

Code is kind of the heart of what we're trying to do. Here's an example of a Node.js function.

const { execSync } = require("child_process");
module.exports = async function (context, req) {
  // ...
  // code logic here
  // ...

  const path = `/tmp/${req.body.orderId}`;
  const cmd = `echo -e "Date: ${Date.now()}" >> ${path}`;
  try {
    execSync(cmd);
  } catch (err) {
    context.done(err);
    return; // stop here so context.done() is not called a second time below
  }

  // ...
  // more code logic here
  // ...
  context.done();
};

This is an Azure function written in Node.js. It simulates an e-commerce store that might create a file in Azure file storage for every order made. This function gets called when the order is fulfilled, so that the date is appended to the file to indicate that it was fulfilled.

If you've been observant, you might have noticed that the scariest piece of this sample code is probably the execSync() call. That is indeed where the damage happens, but the actual security mistake happens a little further up, over here:

  const path = `/tmp/${req.body.orderId}`;

The orderId can hold any UTF-8 character, and that includes, for instance, a semicolon (;).

So over here:

  const cmd = `echo -e "Date: ${Date.now()}" >> ${path}`;
  execSync(cmd);

When I build the cmd shell command and pass it to execSync(), I'm potentially allowing remote command execution. Consider a payload that looks like this:

{ "orderId": "123abc;cat /etc/passwd" }

This payload starts with a valid ID, 123abc, but instead of a complete order ID, it continues with a semicolon ; and then a malicious shell command. So, it's a pretty bad attack. And it's all because of that semicolon.
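To see why the semicolon matters, here is a minimal sketch that builds the same command string as the function above without executing it. The buildCommand helper is a hypothetical name introduced only for this illustration.

```javascript
// Hypothetical helper: mirrors how the vulnerable function builds its shell command.
// It only returns the string; nothing is executed here.
function buildCommand(orderId) {
  const path = `/tmp/${orderId}`;
  return `echo -e "Date: ${Date.now()}" >> ${path}`;
}

const cmd = buildCommand("123abc;cat /etc/passwd");
console.log(cmd);
// A shell would treat everything after the ';' as a second, separate command.
```

The string ends with `>> /tmp/123abc;cat /etc/passwd`, so the attacker-supplied part becomes its own command once it reaches the shell.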

There's nothing in serverless that would protect you against this kind of vulnerability. This type of remote command execution vulnerability can happen in serverless just as much as it can happen in a non-serverless application. For serverless, you will have to:

secure your code, and beware of event inputs and triggers.
treat every function as a perimeter.
and to be able to do those at scale, you really need to use shared security libraries. You're going to have many functions, and it's just not practical or realistic to think that your developers would always sanitise every source of inputs for every function. So, it's easier if you create or choose an external sanitisation library that they can use.
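As one possible shape for such a shared helper, the sketch below validates orderId against an allow-list pattern and appends to the file through the fs module instead of a shell. The names isValidOrderId and recordFulfilment are hypothetical, and the pattern (1 to 64 alphanumeric characters) is an assumption about what a valid order ID looks like, not something the original function defines.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Assumption: order IDs are 1 to 64 alphanumeric characters. Adjust to your real format.
const ORDER_ID_PATTERN = /^[A-Za-z0-9]{1,64}$/;

// Returns true only for strings made up entirely of allowed characters.
function isValidOrderId(orderId) {
  return typeof orderId === "string" && ORDER_ID_PATTERN.test(orderId);
}

// Appends the fulfilment date to the order's file without ever invoking a shell.
function recordFulfilment(orderId, baseDir = os.tmpdir()) {
  if (!isValidOrderId(orderId)) {
    throw new Error("Invalid orderId");
  }
  const filePath = path.join(baseDir, orderId);
  fs.appendFileSync(filePath, `Date: ${Date.now()}\n`);
  return filePath;
}
```

Rejecting anything outside the allow-list also blocks path tricks such as `../`, and skipping the shell removes the injection point altogether, so the two measures back each other up.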

Libraries

We get used to thinking about libraries as part of the app or the function. But in practice, they behave very much like infrastructure. Just like an operating system or a server might have an unpatched Nginx, a function might have an unpatched express.js or other libraries.

There are quite a few of them. Let me share some numbers:

Language	Median # of direct deps	Median # of total deps	# of 0-days (new vulnerability disclosures), last 12 months
JavaScript	6	462	565
Java	16	145	812
Python	13	73	206
.NET	7	85	88

Source: Serverless projects in Snyk.io, Vulnerabilities in Snyk.io Vulnerability DB.

I looked at the projects we protect at Snyk.io. We protect about a million of them, and many of them are serverless. I did a quick analysis of the median number of dependencies that serverless functions have. And it's substantial: the median function uses 6 to 16 libraries directly, depending on the ecosystem. But maybe more interesting is that the components used by the functions use other components, which use other components. In total, the number of dependencies (libraries) is dramatically bigger! It's often around an order of magnitude bigger than the number of direct dependencies, and sometimes more (for JavaScript, 462 versus 6). So, there are a lot of components that might have a vulnerability. And a lot of them can grow stale: a component might not have had a known vulnerability when you added it, but then a new disclosure comes along and reveals that it has a security flaw.

The third column shows, for each of these four ecosystems, how many 0-days, or rather new disclosures of vulnerabilities in these components, took place in the last 12 months alone. As you can see, that's a lot! And if you do the math, the likelihood of you having a significant number of ways in for an attacker is very high. So, you need to ensure that you deal with this. It’s an infrastructure-ish type of risk that you need to control.

So, what do you do about it? Well, first, you must know what you've got. You want to make sure that you invest in tracking which components are being used by every function. You should keep note of which function, especially the ones in production, use which components, and then track whether new vulnerabilities get released on them.

Second, you want to invest in remediation. You're going to get these alerts often; the reality is that this happens all the time. So you want to make sure that, once you find out about an issue, it's easy for you to fix it. Typically this means upgrading the component and rolling it out.

To recap:

find and prevent vulnerable libraries.
streamline and automate remediation.
know your inventory, be ready for 0-days.
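To make "know your inventory" a bit more concrete, here is a minimal sketch that matches a per-function inventory against a list of advisories. Everything in it is hypothetical: the function names, the library names, the data shapes, and the simplified version comparison. A real setup would build the inventory from each function's lockfile and pull advisories from a vulnerability database.

```javascript
// Hypothetical inventory: which function uses which library versions.
const inventory = {
  "order-fulfilled": { "lib-a": "1.2.0", "lib-b": "4.17.1" },
  "send-receipt": { "lib-a": "1.4.2" },
};

// Hypothetical advisory feed: versions lower than `fixedIn` are considered vulnerable.
const advisories = [{ name: "lib-a", fixedIn: "1.3.0" }];

// Compares plain "major.minor.patch" versions; returns a negative number, zero,
// or a positive number. Assumption: no pre-release tags or ranges, which a real
// semver library would handle.
function compareVersions(a, b) {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < 3; i++) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// Returns one finding per (function, library) pair that needs an upgrade.
function findVulnerableFunctions(inventory, advisories) {
  const findings = [];
  for (const [fn, deps] of Object.entries(inventory)) {
    for (const { name, fixedIn } of advisories) {
      const used = deps[name];
      if (used && compareVersions(used, fixedIn) < 0) {
        findings.push({ function: fn, library: name, used, fixedIn });
      }
    }
  }
  return findings;
}

console.log(findVulnerableFunctions(inventory, advisories));
```

With the sample data, only order-fulfilled is flagged, because it still uses lib-a 1.2.0 while send-receipt is already past the fixed version.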

Access and Permissions

This is really about the difference between what your function can do and what it should be able to do.

In serverless, what you oftentimes see is a pattern where you have a YAML or other config file with the configuration and IAM or access permissions of all functions in one single file, and this pattern happens in every ecosystem.

Once you give a function some permission, and it runs, it's scary to take that permission away. You really don't know what might break. The reality is that permissions never contract; they just expand until somebody adds an asterisk. So, you really want to invest in shrinking them and having the right policies in place from the get-go.

A single security policy might be easier. But the safe way to go is to invest in having a policy per function. If you do that, not only are you overcoming a problem, you're actually better off than you were before. In the monolith situation, where a single app has all those functions in one, the platforms don't allow you to do it: you can't say this piece of the code has this permission and that piece of the code has the other. But with functions and serverless, you can. So take advantage of it instead of letting it become a flaw.

So, we've talked about:

giving functions the minimal permissions, even if it's harder.
isolating experiments from production.

And if you really want to level up, build a system that tracks unused permissions and reduces them over time. You can do it through logs, or in more of a "chaos engineering" style: remove a permission and see what happens. If you manage to build this competency, it will be very powerful in keeping your functions and your application as secure and safe as they can be.
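The log-based variant can be sketched as a simple comparison between what each function has been granted and what it was actually seen using. The function names, permission strings and data shapes below are all made up for illustration; they are not real Azure role or permission names.

```javascript
// Hypothetical data: permissions granted to each function in its policy...
const granted = {
  "order-fulfilled": ["storage:read", "storage:write", "queue:send"],
  "send-receipt": ["queue:receive", "email:send"],
};

// ...and permissions actually observed in access logs over some review window.
const observed = {
  "order-fulfilled": ["storage:write"],
  "send-receipt": ["queue:receive", "email:send"],
};

// Returns, per function, the granted permissions that never showed up in the logs.
// Functions with nothing to remove are left out of the report.
function findUnusedPermissions(granted, observed) {
  const report = {};
  for (const [fn, permissions] of Object.entries(granted)) {
    const used = new Set(observed[fn] || []);
    const unused = permissions.filter((p) => !used.has(p));
    if (unused.length > 0) report[fn] = unused;
  }
  return report;
}

console.log(findUnusedPermissions(granted, observed));
```

The report is only a list of candidates for removal. A permission that is used rarely, say once a quarter, will look unused in a short review window, so the window length matters.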

Data: input and output into your functions

At the end of the day, applications typically just process data: they apply some piece of logic that takes some data in and puts some data out. Serverless is no different. These functions still process data, and they need to do it well.

However, with serverless, there's also the concern that you lose the opportunity to store transient data. Things like session data or log data, which you might have temporarily put on the machine or even held in memory, can't be kept there anymore. The result is that much more of that data gets stored outside the function.

The data might get stored in a Redis session cache. It might get stored somewhere else. And you have to be mindful of how you secure that data because, just like before when we talked about the perimeter, you don't know who has access to that data or where it might go.

One recommendation: when storing data outside, always turn on encryption. If the data is not encrypted at rest, who knows who has access to it?

Data is important. Serverless doesn't magically make your data security concerns go away. You just need to be mindful. More specifically with serverless, I would advise you to keep secrets away from code, using something like Azure Key Vault. Serverless makes everything so easy, but secrets are a little bit harder. It's very tempting to just check a secret key into your code repository. Don't do that: secrets in a repository are hard to rotate. So, try to use Key Vault, or at least environment variables.
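For the environment-variable route, a minimal sketch might look like the following. In Azure Functions, application settings are exposed to Node.js code through process.env. The getRequiredSecret helper and the PAYMENT_API_KEY setting name are hypothetical.

```javascript
// Reads a required secret from the environment and fails fast if it is missing,
// so the secret never has to live in the code repository.
// `env` defaults to process.env and can be replaced in tests.
function getRequiredSecret(name, env = process.env) {
  const value = env[name];
  if (typeof value !== "string" || value.length === 0) {
    throw new Error(`Missing required secret: ${name}`);
  }
  return value;
}

// Usage inside a function (PAYMENT_API_KEY is a hypothetical setting name):
// const apiKey = getRequiredSecret("PAYMENT_API_KEY");
```

Failing fast on a missing value means a misconfigured deployment shows up as a clear error rather than as a request sent with an empty key.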

Secure data in transit. When you think about these functions, data moves between network entities, between functions, far more than before. Are you securing it when it's in transit? When it's going to third-party components? When you're reading data back? Because it's not all on the same machine, you cannot blindly trust the channel that these functions communicate through. You can choose to, but if you don't treat every function as having its own perimeter, then as things move around you're quite fragile. Consider also encrypting your data, and consider verifying the identity of the other entity that you're talking to.
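One simple way to check that a message really comes from a peer you trust is to sign it with a shared secret. The sketch below uses HMAC-SHA256 from Node's built-in crypto module. It is one option among several (mutual TLS or platform-managed identities are others), and the signPayload and verifyPayload names are hypothetical.

```javascript
const crypto = require("crypto");

// Signs a message body with a secret shared by the sender and the receiver.
function signPayload(body, sharedSecret) {
  return crypto.createHmac("sha256", sharedSecret).update(body).digest("hex");
}

// Recomputes the signature and compares it in constant time before the
// receiver trusts the body. Returns false for any mismatch or malformed input.
function verifyPayload(body, signature, sharedSecret) {
  const expected = Buffer.from(signPayload(body, sharedSecret), "hex");
  const received = Buffer.from(String(signature), "hex");
  return expected.length === received.length && crypto.timingSafeEqual(expected, received);
}
```

The sender attaches signPayload(body, secret) to the message, and the receiver calls verifyPayload before processing it. The shared secret itself should come from a secret store or environment variable, never from the code.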

Then, finally, think about that transient data, that session data. This isn't more severe, it's just a little bit newer for serverless development. If you've come from non-serverless development, where you might have been used to, say, holding session data in memory, you might not have thought to encrypt it. Now, when you store it off to the side in Redis, maybe you should.
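As a rough sketch of what encrypting session data before it leaves the function could look like, the example below uses AES-256-GCM from Node's built-in crypto module. The encryptSession and decryptSession names and the storage layout (nonce, auth tag, then ciphertext in one base64 string) are assumptions made for this illustration. The resulting string is what you would hand to your Redis client.

```javascript
const crypto = require("crypto");

// Encrypts a session object with AES-256-GCM. `key` must be a 32-byte Buffer,
// ideally loaded from a secret store rather than hard-coded.
function encryptSession(session, key) {
  const iv = crypto.randomBytes(12); // fresh nonce for every encryption
  const cipher = crypto.createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(session), "utf8"),
    cipher.final(),
  ]);
  const tag = cipher.getAuthTag(); // 16 bytes by default
  // Store nonce, auth tag and ciphertext together as one base64 string.
  return Buffer.concat([iv, tag, ciphertext]).toString("base64");
}

// Reverses encryptSession. Throws if the stored value was tampered with
// or if the wrong key is used.
function decryptSession(stored, key) {
  const raw = Buffer.from(stored, "base64");
  const iv = raw.subarray(0, 12);
  const tag = raw.subarray(12, 28);
  const ciphertext = raw.subarray(28);
  const decipher = crypto.createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  const plaintext = Buffer.concat([decipher.update(ciphertext), decipher.final()]);
  return JSON.parse(plaintext.toString("utf8"));
}
```

GCM is used here because it also authenticates the data, so a value modified inside the cache fails to decrypt instead of silently yielding a corrupted session.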

So, that's the CLAD model. It basically says that serverless is amazing; it implicitly takes care of a lot of security concerns for you. But it leaves you with Code, Libraries, Access and Data, all of which you need to secure.

Final thoughts

Let me leave you with two more thoughts.

Scale

With serverless today, you might have 20, 30, or 50 functions. It might seem manageable. It's a small enough number that you might be auditing them or surveying their security manually, but over time that won't work. Serverless is all about scale. Tomorrow, you're going to have 500, then 5,000 functions. And if you don't invest in automation and observability to be able to know what's going on, you're going to get in trouble.

Now that you're building out your practices, make sure that you are aware of which functions there are, what their current security status is, which components they run, and what their permissions are. That way, you really get ahead of this. Otherwise, it's going to be hard later to untangle the mess that might get created.

DevSecOps

Serverless is all about speed. It's about being able to deploy these functions again and again and have them be small units that just work with good APIs.

There's no room, there's no time, there's no opportunity for an external security team to be brought in. It won't fit the business needs to have a security team come in, stop the deployment process, and audit. And so, the only way to scale is the DevSecOps approach, where you want to empower developers, and give them the tools, the ownership, and the mandate to secure what they're building.

Then, you want to have a security team whose job is really to help these developers secure what they’re building, better and more easily all the time, and make sure that they’ve done it properly. With that model, you can scale security beyond serverless: for cloud native development and, for that matter, development as a whole.