vorsprung

Why Serverless needs Ops

Dr Werner Vogels famously said, "No server is easier to manage than no server".

As he is the CTO of Amazon and so one of the people pushing the move to Serverless, you'd expect him to hold this view.

And in many ways he is right. But this isn't the end of the story.

What is software for?

Whether it is serverless or a server, it is a platform to run software on. It's worth remembering what software is for:

  • Whatever it is your customers are paying you for
  • Unique value, differentiation
  • Making your product automated, scalable, 24x7

And serverless can help with each of these.

Whatever it is your customers are paying you for - that can be delivered more quickly and more responsively than when using servers.

The unique value and differentiation of the incredible software you are making should shine through - so let SaaS do the mundane stuff, leaving you more resources for specialization.

Making your product automated, scalable, 24x7 - this is one of the real "batteries included" features of SaaS.

How to manage servers

To manage a server you need to be an expert at these things:

  • Scalability
  • Resilience
  • Availability
  • Maintainability
  • Simplicity in complex systems
  • Instrumentation and visibility
  • Graceful degradation

Being an expert at these things is server operations. Without them your software isn't going to perform on a server.

But remember, serverless does away with servers, and part of that is taking on these operational issues for you.

The operations that Serverless includes

A great way of understanding the advantage of serverless is to see how it can usurp people doing server operations.

  • Scalability - serverless does this automatically
  • Resilience - the cloud infrastructure has resilience
  • Availability - cloud is 24x7
  • Maintainability - cloud infrastructure has good maintenance features
  • Simplicity in complex systems - cloud products are simple and reliable
  • Instrumentation and visibility - basic metrics are available
  • Graceful degradation - can be implemented

Of course, the serverless providers have teams of people doing this for you. And they can afford to hire the best people, due to economies of scale

Serverless isn't magic, it's specialisation.

Instead of you having to hire server operations people, this bit of work is supplied by the serverless provider.

Why serverless isn't the end of the story for operations

However, if we dig a little deeper, serverless often still needs operations people.

  • Scalability - serverless does this automatically, but within limits that must be understood. For example, AWS S3 is documented to handle 3,500 PUT operations per second per prefix, rather than per bucket, and going beyond that can result in throttling. It's important to know the limits and understand the alternatives

  • Resilience - the cloud infrastructure has resilience, usually as some form of "provision". For example, with AWS DynamoDB there is a provisioned-capacity mode, which caps cost, and an "on-demand" mode, and the choice between them must be understood and set up

  • Availability - cloud is 24x7, except for very unusual events. For example in 2017, there was an outage on AWS S3. Even AWS itself was caught out by this as the service had previously been so reliable.

  • Maintainability - cloud infrastructure has good maintenance features, but their use has to be planned. For example, if you use AWS API Gateway as part of your stack, changes to the API have to be understood as a threat to stability and planned ahead

  • Simplicity in complex systems - cloud products are simple and reliable. But emergent behaviour is still there. It's easy to add a microservice, and another, and another until there are many of them. Microservice architectures have the services calling each other. To put some numbers on it: if there are 10 services, each with one input and one output, then wiring them into a single loop alone can be done in 9 factorial, or 362880, different ways. In the real world most of these combinations would be unsuitable, but the possibility of misconnecting things is still there. And when something like this does occur, it's down to you to find it, in production

  • Instrumentation and visibility - basic metrics are available. But application-specific metrics have to be added. Going on from the example above with 10 cross-connected microservices, imagine how much easier it is to fix them if application-level logging is implemented for transactions. Companies like Honeycomb have a business supplying this extra, necessary level of traceability for applications implemented on serverless or PaaS systems - which show emergent behaviour but do not monitor everything out of the box

  • Graceful degradation - can be implemented. And this can be difficult. The ideal form of graceful degradation is that valued customers get to use the part that is still working while the less important traffic is discarded. Maybe someone better informed than me can suggest a pattern for doing this within serverless
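The figure of 9 factorial for ten microservices can be checked with a short script. This is only a sketch, and it rests on one assumption that the text does not spell out: the services are wired output-to-input into a single closed loop, which is the reading under which the count comes to 9 factorial. The function names are made up for illustration.

```python
from itertools import permutations
from math import factorial


def count_single_loops(n: int) -> int:
    """Brute-force count of ways to wire n services into one closed loop.

    Each service's single output feeds exactly one service's single input,
    and following the wires from service 0 must visit all n services.
    """
    count = 0
    for targets in permutations(range(n)):
        # targets[i] is the service that receives the output of service i.
        seen = set()
        current = 0
        while current not in seen:
            seen.add(current)
            current = targets[current]
        if len(seen) == n:
            count += 1
    return count


def single_loops_formula(n: int) -> int:
    """Closed form for the same count: (n - 1)!"""
    return factorial(n - 1)


# The brute force is only practical for small n, so use it to check the
# formula there and then apply the formula to 10 services.
for n in range(2, 7):
    assert count_single_loops(n) == single_loops_formula(n)

print(single_loops_formula(10))  # 362880
```

Under a looser reading, where any output may feed any input, the count is larger still, which only strengthens the point about misconnection.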
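One possible pattern for the graceful degradation point is priority-based load shedding: each request carries a priority, and as load rises the least important priorities are rejected first. The sketch below is not tied to any provider and is not a recommendation from a particular serverless platform. The names `Request` and `LoadShedder`, the priority scale, and the threshold values are all hypothetical, and how `load` is measured is left open.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Request:
    customer_id: str
    priority: int  # hypothetical scale: 0 = most valued, larger = less important


class LoadShedder:
    """Admit or reject requests by priority as measured load rises."""

    def __init__(self, thresholds: Dict[int, float]):
        # thresholds maps a priority to the highest load (0.0 to 1.0) at
        # which requests of that priority are still admitted.
        self.thresholds = thresholds

    def admit(self, request: Request, load: float) -> bool:
        # Unknown priorities are treated as least important and shed first.
        limit = self.thresholds.get(request.priority, 0.0)
        return load <= limit


shedder = LoadShedder({0: 1.0, 1: 0.8, 2: 0.5})
load = 0.9  # assumed to come from a metric of your choosing

print(shedder.admit(Request("valued-customer", 0), load))  # True
print(shedder.admit(Request("anonymous", 2), load))        # False
```

In a serverless function a check like this would sit at the top of the handler, so that shed requests return early and cheaply while valued customers keep using the part that still works.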

In summary: operational problems don't go away because you are running serverless!

Top comments (2)

Phil Ashby

Well observed, thank you :)

Do you find the separation of responsibilities between customers and providers is cleaner or easier to manage in Serverless environments? What about PaaS, IaaS, etc.? These are 'interesting' questions when compliance and auditors are involved (eg: obtaining PCI-DSS accreditation)!

vorsprung

Do you find the separation of responsibilities between customers and providers is cleaner or easier to manage in Serverless environments?

It's hard to say but my gut instinct is no.

The cloud providers' mode of operation is that all the customers are the same and all the products are the same. By "the same" I mean that no matter how big or small you are, the service is pretty similar - which is usually a big positive.

However, it's a two-edged thing. The S3-hosted web page on the $10 a month account and the $100,000 RDS get similar stuff, but the corporate droids running the corporate RDS are expecting less (in a way) than the beginners with their S3.

The S3 dudes don't get the separation of responsibilities. Is this their fault? Kind of. But if they had a guiding hand on their side maybe it would go better.

'interesting' questions when compliance and auditors are involved (eg: obtaining PCI-DSS accreditation)!

Usually auditors just want to see paperwork, which cloud providers are good at. I guess there might be some responsibility (eg: password change restriction) that is with the customer. If the low-end, normal users had some good advice they'd be fine. Otherwise they'd be screwed; the serverless ops won't help.