DEV Community

After 5 years, I'm out of the serverless compute cult

Brent Mitchell on March 30, 2022

I have been using serverless computing and storage for nearly five years and I'm finally tired of it. I do feel like it has become a cult. In a cu...

Elias Brange • Mar 30 '22 • Edited

Most of the points here sound like a result of bad communication and organization, and all of them could surface without Serverless as well. Teams that build stuff without communicating on certain standards are bound to build services that don't interact very nicely with each other, regardless of whether they are running in a Lambda, container or VM.

Depending directly on other teams' functions?

Stop doing that, and expose each individual service as an API. Then it does not matter whether there are lambdas, containers or even VMs behind the API. Care must be taken to not break the APIs, and that needs to be done regardless of what kind of compute you are using.

10 different ways to authorize

Also boils down to communication between teams. If all services are exposed as APIs, it would be preferable to use a common auth mechanism for them. Here you could even centralize that function to a team and let them be in charge of an authorizer that other teams can use in API Gateways. For other services that do not run behind API Gateway, expose the authorizer functionality as an API that can be called from inside containers in an API middleware or similar.
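
For illustration, a minimal sketch of such a shared authorizer, written as a TOKEN-type Lambda authorizer in Python. The token table and principal names are hypothetical placeholders for real verification (JWT, OAuth introspection, and so on); the only part taken from API Gateway itself is the response shape, a `principalId` plus a policy document that allows or denies `execute-api:Invoke` on the method ARN.

```python
# Hypothetical shared authorizer, owned by one team and attached to every API Gateway.
# VALID_TOKENS is a placeholder for real token verification.
VALID_TOKENS = {"team-a-secret": "team-a", "team-b-secret": "team-b"}


def build_policy(principal_id, effect, method_arn):
    """Build the response API Gateway expects from a Lambda authorizer."""
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": effect,
                    "Resource": method_arn,
                }
            ],
        },
    }


def handler(event, context):
    """Entry point for a TOKEN-type Lambda authorizer."""
    token = event.get("authorizationToken", "")
    if token.startswith("Bearer "):
        token = token[len("Bearer "):]
    principal = VALID_TOKENS.get(token)
    if principal is None:
        # Unknown token: explicitly deny the call.
        return build_policy("anonymous", "Deny", event["methodArn"])
    return build_policy(principal, "Allow", event["methodArn"])
```

Services that are not behind API Gateway could call the same check from a middleware, so there is still only one place where auth rules live.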

Account chaos

More an organizational issue. I wouldn't want to log in to a console where 20 teams have 4 EC2 instances each either. Use AWS Organizations and automate creation of accounts, preferably one account per service (or some similar scope).

DNS migration

Same goes for ALB, NLB and other services.

leading engineers to spend most of their time with YAML configuration

Kubernetes says hello.

Brent Mitchell • Mar 30 '22

Hey there, Elias, appreciate the feedback! There is no doubt that team communication is critical, serverless or otherwise. My larger point was that focusing on business objectives (that move business forward) rather than technical ones should be the priority of every productive team.

Additionally, I probably could have been more clear that it doesn't matter whether a developer is hitting an API or lambda directly. The testing difficulty of serverless means many developers and teams are first testing in their dev and test environments. This means my functions/apis may start suddenly failing in dev because another team is making a change to their api, which is now broken for whatever reason. Again, this is because it is very hard to impossible to test anything locally.

As I mentioned, appreciate the feedback!

Elias Brange • Mar 30 '22

Hello there! Great answer! :)

I totally agree that testing distributed systems, where different teams are responsible for different services, is very hard. I also agree that all the serverless offerings still need a bit of work before they can be mimicked perfectly locally. However, I feel that this problem would still be there even if the compute layer was running on something else.

How we solved it at my previous place was that every team had a development environment, where they experimented and stuff was not expected to always work. Other services were mocked where needed. We then had staging and production deploys of every service where all services were expected to be stable, with integration suites testing and verifying that the actual business use cases (often spanning multiple services) were working as expected.
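
A minimal sketch of the "mocked where needed" part, using Python's standard `unittest.mock`. The `order_summary` function and the `inventory_client` with its `get_stock` method are hypothetical names, standing in for our own logic and a thin wrapper around another team's API.

```python
from unittest.mock import Mock


def order_summary(order_id, inventory_client):
    """Business logic in our service; inventory_client wraps another team's API."""
    stock = inventory_client.get_stock(order_id)
    status = "ready" if stock["available"] else "backordered"
    return {"order_id": order_id, "status": status}


def test_order_summary_with_mocked_inventory():
    # In dev we never call the other team's (possibly broken) deployment.
    inventory = Mock()
    inventory.get_stock.return_value = {"available": False}

    result = order_summary("o-1", inventory)

    assert result == {"order_id": "o-1", "status": "backordered"}
    inventory.get_stock.assert_called_once_with("o-1")


test_order_summary_with_mocked_inventory()
```

The real client is only wired in for the staging integration suites, so a broken dev deploy on another team's side does not block local work.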

Since all contracts between teams were defined by APIs, it didn't really matter what was running underneath. Some teams used API Gateway + Lambdas extensively, others used ALB + Fargate, and some teams used NLB + EC2.

tdsanchez • Apr 20 '22

Sounds like your dev teams never heard of blue-green release patterns.

It sounds like your organization REALLY needs to consult with cloud native practitioners. Your organization will continue to fail at the cloud until it does and makes changes on how it communicates and enforces best practices.

Brian Burton • Mar 30 '22

I'm a big proponent of serverless, especially in bootstrapped startups, but it has its place. We have long-running processes that we have running on K8s, but for everything else we use Google Cloud Run and it's frankly amazing.

I agree with Elias, your post sounds like technical decisions are being made by people who don't have a full understanding of how it all works individually or together. The "API Responses" section seems like a rant about the quality of your engineers, to be honest. There are definitely solid negatives to using serverless, but your post comes across not as, "serverless is bad" but "we're bad at serverless" if that makes sense.

Questions I asked myself reading your post:

Why aren't you using an OpenAPI schema and tests to standardize and validate your API endpoints?
Why are you allowing developers to deploy code that hasn't been reviewed for consistency and quality?
Why aren't you containerizing your services?
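
To make the first question concrete, here is a rough sketch of a response contract check in Python. It is a hand-rolled stand-in, not a real OpenAPI validator: the schema dict, the field names and the `contract_violations` helper are all hypothetical, and a real setup would load the schema from the service's OpenAPI document and use a proper validation library.

```python
# Hypothetical contract for a GET /orders/{id} response.
# A real setup would derive this from the OpenAPI document instead of hard-coding it.
ORDER_RESPONSE_SCHEMA = {
    "order_id": str,
    "status": str,
    "total_cents": int,
}


def contract_violations(payload, schema):
    """Return a list of problems; an empty list means the payload conforms."""
    problems = []
    for field, expected_type in schema.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            problems.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(payload[field]).__name__}"
            )
    return problems
```

Run against every endpoint in CI, a check like this fails the build when a response drifts from the agreed shape.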

At our company we have a very elegant environment on GCP that has been properly architected and that frankly is working incredibly well. We have processes for each step, from pushing code to CI/CD to building new services from skeletons, that have helped maintain consistency across the product. We utilize services such as Pub/Sub when appropriate if we need to run background processes that may take several seconds to minutes and may fail and need to be retried. All internal security, including the compartmentalization of customer data, is handled by IAM. It's elegant and easy to manage.

Also if it helps, many of your complaints about the quality of the services at AWS are why I prefer Google Cloud. For example, Anthos Service Mesh is the solution you're looking for for Problem #4. From project management to permissions to serverless, the entire experience is so much more logical and easier to manage. Hellooooo folders!

Brent Mitchell • Mar 30 '22

Hey there Brian,

Totally agree, in a perfect world, a senior engineer or architect would guide these decisions. However, there are instances (especially as things scale, like in large organizations) where it is out of a developer's control and technical decisions have been made without fully understanding the tech. The path is wide for poor decision-making in a serverless application, where many times those configurations or options are not available in a non-serverless application (i.e. domain name, choosing whether you want auth or not). This can lead to a lot of pain. Why blame or subject yourself to the lack of knowledge when the human problem can be eliminated? (Yes, I realize this approach still necessitates a senior engineer or architect laying the proper groundwork.)

It's not always (and rarely in my experience) a breaking schema change when an API breaks. For example, OpenAPI test validations don't do much when a developer forgets to add an environment variable for some refactoring. Now, their function is breaking everyone as soon as they deploy, even though it passed locally fine because they had their environment set correctly.
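
As a sketch of how that specific failure could be caught outside of code review, a function can check its required environment variables in one place and fail loudly. The variable names and the `load_config` helper below are hypothetical; this is one possible approach, not something we had in place.

```python
import os

# Hypothetical names; list whatever the function actually reads.
REQUIRED_ENV = ("ORDERS_TABLE", "INVENTORY_API_URL")


class MissingConfigError(RuntimeError):
    """Raised when a required environment variable is unset or empty."""


def load_config(environ=os.environ, required=REQUIRED_ENV):
    """Return the required settings, or raise naming every missing variable."""
    missing = [name for name in required if not environ.get(name)]
    if missing:
        raise MissingConfigError(
            "missing environment variables: " + ", ".join(missing)
        )
    return {name: environ[name] for name in required}
```

Calling `load_config()` from a post-deploy smoke test (or at cold start) turns a forgotten variable into a failed deploy instead of a function that breaks everyone downstream.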

Services are containerized. The problem is these container definitions change too easily as people move teams and teams grow in their understanding of serverless.

Can these be prevented through code review? Maybe. But now the code reviewer has to check all of these things before even beginning to check actual business impact of the function. Again, why not enforce this outside the code review process to eliminate the human aspect?

Adam Hisley • Apr 4 '22 • Edited

It's nice to hear critical perspectives, but I can't agree with most of these points. Serverless is "cult-like" because.... it doesn't support KISS and DRY "without help from frameworks?" This sounds more like a local problem than a fundamental critique of serverless architectures. You seem to acknowledge that good frameworks are the answer here, so what's wrong with CDK, serverless.com, etc.?

In addition, it seems like many of the listed problems boil down to "junior devs with weak support and practices ran into these issues." But have junior members on standalone DevOps/Ops teams never created an overly permissive IAM policy? Never copy/pasted YAML? Have devs never cut corners on local testing prior to serverless?

I think it's important to differentiate problems that are endemic to serverless/cloud native architectures from generic challenges that apply to all organizations adapting to new technologies and software principles. In my experience, the real cult-like mentality in software development is the pernicious belief that the benefits of innovation can be bought cheaply:

-- "We can just lift & shift our data center into the cloud, and reap massive benefits overnight!"
-- "QA automation totally replaces our need for dedicated QA!"
-- "Software devs will handle cloud infrastructure, plus all their traditional responsibilities, with minimal support! And output will be higher than ever before!"

In many such cases, after the hype wave crashes, the most successful software businesses are the ones who followed through on innovative practices, but also paid their entry fee and have an impressive list of "Challenges we solved..." by the end of their journey. If your organization's leadership isn't prepared to support a serverless transformation, I would certainly stick to more traditional and practiced approaches, but I struggle to imagine many of these problems you listed going away with a simple tech stack change.

Steven Chu • Mar 31 '22

I've always felt the same way about serverless, but never knew it was okay to talk about :P

I'm definitely repeating a lot of what you say but I thought I'd throw my hat in the ring to both sympathize and keep myself honest. My biggest pain points are as follows:

Copy pasta -- yes layers are a way of modularizing your code but each layer comes with a version bump and, let's be honest, it's much harder to develop a stable library than it is to simply import from a namespace; yes those namespaced utils may change a lot over time but I'm more okay with that than constantly redeploying a new version of my layers just because I keep writing buggy code; OTOH, maybe I personally am just not very good at writing modular code, no amount of process fixes that, and really, what's the cost of a version bump?
An AWS resource for everything! With every new lambda I have to add a metric, a metric filter, a log group, and probably at least 2 other things if I want to get basic log lines in CloudWatch, and then on top of that I have to remember what each resource I create was named! OTOH, building a centralized logging system is no walk in the park either and bespoke solutions are never really as ready-out-the-box as advertised
Local testing. I'm running into this now where my laptop is just too darn slow and I constantly forget what's a dev dependency and what's a prod dependency and whether I need to update my .env files to point to the stage or prod instance when I'm testing. I have no clue how this would work if it weren't a side project :P I've loved, loved, loved dedicated dev environments at my 9-5 but even there (OTOH) sometimes dev isn't quite up to par with prod and it does take time and a process (and money and servers!) to make sure that you have a decent dev environment to begin with

Alex Lohr • Mar 31 '22

Serverless is such a misnomer, as it is the exact opposite: managed servers as a service. You get what you pay for. It's your own decision if the advantages outweigh the disadvantages. It's certainly a nice choice for more front-end oriented people.

Adam Crockett 🌀 • Mar 30 '22

I didn't ever fully invest in serverless because things worked just fine the way they were. I have bookmarked your post because I will read it. I want to believe I'm the one with the problem, and I hope you convince me that perhaps I'm not.

Mike Talbot ⭐ • Mar 30 '22

Great article, thought provoking and mirrored at least partially in my experiences.

Thomas Hansen • Apr 3 '22

I worked for a company once who were using CosmosDB. All inserts had to go through stored procedures because we needed atomicity (sigh!), we paid €8,000 per month in Azure fees, we couldn't do partial document updates, which complicated our code and forced us to implement locking manually, and we were basically using the thing as a "badly implemented RDBMS".

Use the right tool for the right job - For software that is (almost) never CosmosDB, Lambdas or Dynamo ... :/

JDanson • Sep 26 '22 • Edited

Brent's article has some valid ideas, but I would debate these in terms of what he is missing.

I am now running a 10,000 user load test on daily users, at 5 cents per user per year.

If I want to scale up or down, the revenue to profit ratio is unchanged. Try doing that with fixed costs and 2-yr upfront costs for EC2 and RDS.

The vast cost reduction and the revenue-cost mirroring of a 100% scalable system are the reasons why serverless is difficult, but still necessary.
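
A back-of-the-envelope sketch of that mirroring in Python. Only the 5 cents per user per year comes from my load test; the revenue per user and the fixed yearly bill are made-up numbers, chosen purely to show the shape of the two curves.

```python
COST_PER_USER = 0.05      # dollars per user per year, from the load test above
REVENUE_PER_USER = 1.00   # hypothetical
FIXED_COST = 2000.00      # hypothetical yearly bill for reserved capacity


def margin_pay_per_use(users):
    """Profit as a share of revenue when cost scales with users."""
    revenue = users * REVENUE_PER_USER
    return (revenue - users * COST_PER_USER) / revenue


def margin_fixed(users):
    """Profit as a share of revenue when the infrastructure bill is fixed."""
    revenue = users * REVENUE_PER_USER
    return (revenue - FIXED_COST) / revenue


for users in (1_000, 10_000, 100_000):
    print(users, round(margin_pay_per_use(users), 2), round(margin_fixed(users), 2))
```

Under these assumptions the pay-per-use margin stays the same at every user count, while the fixed-cost margin is negative at small scale and only catches up once the user base is large.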

All of the CICD and reliability challenges and problems which Brent mentions need enforcement in both legacy client-server apps AND the newer serverless apps. Blaming mismanagement on serverless ignores the agnostic nature of failure. A mismanaged project is precisely that, not down to tech A or tech B approaches.

Alex Hutton • Apr 19 '22

Good article. You're definitely on to something, but I couldn't help but disagree with you on almost all of your specific points. The problems you raise all seem solvable.

When I look at serverless, I see something beautiful and unique (let's call it a monolith), that's been shattered into a thousand pieces. Now each piece can be purchased individually as a turnkey solution. Is this genius or is this a crime -- a difficult question to answer. If an experienced developer is not used to serverless, it's only natural for them to feel like something is wrong when making the adjustment. On the other hand, you're five years in and you still have a problem with serverless. I'd be interested in hearing more of your perspective in terms of comparing serverless with concrete examples of "non-serverless" and explanations as to why the latter is better.

DanielSanchez97 • Apr 5 '22

I agree with a lot of the things that were brought up in this article about the shortcomings of using a Serverless compute platform. I think Serverless compute platforms created fundamentally different deployment primitives, which is the underlying cause for a lot of the problems in this article. I am currently trying to build a framework to solve these underlying problems at staging.cdevframework.io/ if anyone is interested in trying it out. It definitely still has a lot of rough edges, but I think it shows some promise.

Tony Nguyễn • Mar 31 '22 • Edited

nice and interesting viewpoint!
Thanks for sharing!

Rob Muhlestein • Apr 20 '22

This is extremely well written. I appreciate the specific examples you cite. It is very hard to convince people of these conclusions even though they are objective and obvious to others.

tdsanchez • Apr 20 '22

It sounds like your organization decided not to adopt cloud native practices and it is the fault of your org, not Serverless PaaS technologies.