Michael Levan

Posted on Feb 21

Cloud-Native Is In Shambles

#kubernetes #devops #programming #cloud

Have you ever seen this meme?

Yeah, sure you have. It’s one of the most popular memes when everything is blowing up in an environment, in your head, or around you and you have to pretend it’s all good.

That’s “cloud-native” in today’s world.

Engineers are confused, there are a lot of tools doing the same thing, and no one knows which direction is up or what advice to take.

In this blog post, I’ll help point you in the direction of the “why” and the “how” to make your life a bit easier when it comes to the cloud-native realm.

Too Much Confusion

First and foremost, plenty of engineers are confused in today’s cloud-native world, and it’s not their fault. There are too many chefs in the kitchen. Too much going on between vendors and conferences and clouds to even get a grasp on what’s right or what’s wrong.

If you google a question you may have, you’ll see ten different blog posts with ten different answers.

This is something that’s always been around in the tech space. There’s never one right answer. However, the velocity at which the cloud-native space is moving and the number of tools/vendors that are arising are causing more confusion.

As the world moves forward, we want more stuff faster (modern-day consumerism) and it’s rubbing off on the cloud-native space. We’re living in a time where we have the largest technology advancements and the most options, but as we’re all learning, that may not be a good thing.

For engineers who have been in the space for 7-10 years, there’s a trend happening. It’s almost like a 180 is occurring. For example, going from “cloud all of the things” to “hybrid cloud and on-prem is actually good”. Spoiler alert - we’ve been doing on-prem since the 70’s. In engineering, we’re simply doing what we were already doing.

For engineers that are new in the space (2-5 years in), how the cloud-native space is moving is causing a massive amount of confusion because it’s too much too fast. The engineers that have 7-10 years of experience are confused as well, but less so because they’re seeing the “trend” (the 180).

How about tools?

Too Many Tools

From a tools and vendors perspective, there are a lot… and by a lot, I mean a TON. Engineering has always had several tools, but it’s never been this congested.

As an example, think about Active Directory. For years, Active Directory was the de facto standard when it came to authentication and authorization. Now there are 10+ authentication and authorization tools/platforms along with things like RBAC and authentication tied into platforms/systems (cloud providers, Kubernetes, etc.).

In today’s world, there are 10+ tools for each category, and they all have a slight tweak that makes the tool stand out. Unfortunately, it’s rarely enough of a tweak to truly help you understand what tool you should use.

The problem is that when humans have too many options, we’re naturally inclined to deny all options. If we have too much information coming at us, we become overwhelmed and want to run in the other direction.

That’s how the majority of engineers feel when hearing from vendors.

So, how do we fix all of this?

What’s The Fix?

The fix for both the confusion factor and the tools factor isn’t small and there will be major pushback, but it’s very doable.

Confusion Fix

First, let’s talk about reducing confusion. To reduce confusion when going into a new or existing cloud-native environment, ask yourself one question:

What’s the expected outcome?

This question is the only thing you need to remove confusion. Let’s talk about a scenario.

Engineer A gets super excited about all of the cool Kubernetes and cloud-native goodness they read on the socials. They see that Kubernetes is super popular and everyone is talking about it, so naturally Engineer A thinks that Kubernetes will solve all of the problems that they’re facing. So, what happens? Kubernetes gets implemented without truly understanding what’s necessary or what the expected outcome is and tech debt occurs.

Engineer B, on the other hand, doesn’t get swept up by the hype. They ask the main question - what’s the expected outcome? Now, as an engineer, it’s Engineer B’s responsibility to decipher the answer. If Engineer B is talking to a manager or someone in leadership, they may not get a technical answer. They’ll have to come up with the technical answer themselves. They do, however, know what’s expected. Engineer B can then come up with a solution for the expected outcome. It may be Kubernetes or it may not.

Tools Fix

From a “tools fix” perspective, this is a tricky one. After all, every engineer can’t get together and boycott vendors from creating more tools… so how does the tool problem get fixed?

It’s a three-step approach.

First, identify the confusion fix. Understand the expected outcome.

Second, research the tools in the relevant category for the expected outcome, whether that’s an implementation you’re working on or a problem you’re trying to fix. You’re going to want other engineers’ opinions, so Google around on various forums and see what engineers say they use in their environments. Remember, their opinions aren’t set in stone, so take them with a grain of salt. However, you’ll have a good starting point.

Third, narrow the 5-7 tools you end up with down to 2-3 and test them rigorously. Take a minimum of 3-4 days to evaluate the tools and see which one works best for you. They may do the same thing, but something as simple as “the installation here was way simpler” can make a night and day difference from a scalability perspective.
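
If it helps to make the three steps concrete, here is a minimal sketch of a weighted scoring sheet for the final 2-3 candidates. Everything in it is an assumption for illustration: the tool names (`tool-a`, `tool-b`, `tool-c`), the criteria, the weights, and the 1-5 scores are all hypothetical placeholders you would replace with what you learned during your own hands-on testing.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str               # hypothetical tool name
    scores: dict[str, int]  # criterion -> 1..5 score from hands-on testing


def rank(candidates, weights):
    """Return (name, weighted_score) pairs, best first."""
    total_weight = sum(weights.values())
    ranked = []
    for c in candidates:
        # Weighted average: criteria that matter more pull the score harder.
        score = sum(weights[k] * c.scores[k] for k in weights) / total_weight
        ranked.append((c.name, round(score, 2)))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Step 1: the expected outcome drives the weights (illustrative values).
weights = {"meets_expected_outcome": 5, "ease_of_install": 3, "team_familiarity": 2}

# Steps 2 and 3: the shortlist, scored after a few days of testing.
candidates = [
    Candidate("tool-a", {"meets_expected_outcome": 4, "ease_of_install": 2, "team_familiarity": 5}),
    Candidate("tool-b", {"meets_expected_outcome": 4, "ease_of_install": 5, "team_familiarity": 3}),
    Candidate("tool-c", {"meets_expected_outcome": 2, "ease_of_install": 5, "team_familiarity": 4}),
]

shortlist = rank(candidates, weights)
for name, score in shortlist:
    print(f"{name}: {score}")
```

With these made-up numbers, `tool-b` comes out on top because it ties on the expected outcome and wins on installation. The score isn’t the decision. The point is that writing the weights down forces you to answer “what’s the expected outcome?” before comparing tools.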

Wrapping Up

There’s a massive amount of confusion in today’s cloud-native world. Every engineer, from junior level to mid to principal, is feeling it. The good news is there are a few different ways to navigate the problem. Remember, always ask yourself one key question - “what’s the expected outcome?”.

Top comments (22)

miraculixx • Feb 22 • Edited

So very true. Not only for cloud and related tools, but for nearly everything in Dev/IT world. Let's go back to essentials:

what is the objective
how do we reach that efficiently
how do we keep operating and running the system, at best automatically
last but not least, how easy is it to change or add something?

Only start once we can answer these questions without reverting to some recent hype as the deciding factor.

Ah yes, do try out stuff. Hands-on. Actually use it. Build a prototype. Kick its tires, look inside. Don't ask your procurement department for help unless you know exactly what you need - or else spend valuable time & effort in a useless vendor shootout, and be prepared to solve politics instead of engineering a system that delivers value.

Michael Levan • Feb 22

I agree with you 100%! Thanks for sharing your perspective.

Miguel Quintero • Feb 22

Very true. TONS of cognitive stress (no matter which way you shift it) on devs. Many tools in the API ecosystem. Ask the questions noted in this post before you embark on putting together an API Platform.

Michael Levan • Feb 22

Agreed!

Andy Edwards • Feb 28

I’ve never used Active Directory, but I’m very glad there are alternatives now, esp since (I assume?) I would have to maintain Windows servers

Livio Ribeiro • Feb 22

I believe that part of what caused the "too many tools" problem was the CNCF embracing everything.

For a long time Kubernetes was the only project under CNCF, and being under the foundation meant receiving money and attention, so (maybe intentionally, maybe not) Kubernetes became the only option for the workload orchestrator.

But then CNCF started receiving other projects, all of them tools that extend Kubernetes functionality, and most only work with Kubernetes. Everyone wanted their own k8s "extension", and we got lots of projects with lots of duplicate functionality, with some hidden gems here and there.

And let's not forget that "cloud native" meaning "made for Kubernetes" was very likely a consequence of all this.

Michael Levan • Feb 22

Too many chefs in the kitchen.

kstudio • Feb 23

I totally see your viewpoint, however, I’d like you to think about building a car 🚘. You are going to need a tonne of various different components 🛞⚙️🔩🛢️ for this project. And we need a toolset 🧰🔧🪛as well.

For each of those components ⚙️and tools 🧰there are multiple vendors. It’s a maze 🤬😡out there. The supply chain ecosystem looks like a mess.

But here’s the thing, everything you need is free 🆓 and high performance 🏎️

If you know what type of car 🚙 🏎️🚛 you want and know how to build it 📖 💪👩🏻‍🔧.

Well then the CNCF ecosystem is the biggest parts store in the world.

one man's garbage is another man's treasure 😄

Sergey • Feb 23

I honestly don't feel any confusion.

There are many products but all of them fit into just a few categories. For those of us who worked with self-managed (on-prem and remote VMs) and managed (App Service, Beanstalk) servers, containers (AKS/EKS, ECS/ACI), and serverless (Lambdas, Azure Functions) it's pretty easy to choose the right tool for a particular project.

We still have the same recommended DBs as we had 10 years ago: Postgres, SQL Server, MySQL, Mongo, Redis. There are some new DBs like Cosmos and Dynamo, but they're easy to grasp if you've worked with DBs for a few years. There are also some specialized DBs for things like analytics, ML, online gaming, but not everyone will have to touch those directly.

We still have to collect Traces, Metrics, and Logs. Cloud made it super simple to enable all of those without even thinking about it. Modern SDKs have adopted OpenTelemetry, which lets you use a lot of data aggregators out of the box. It doesn't matter if it's New Relic, Splunk, Datadog, Jaeger, App Insights, etc. Add a few commands to instrument your app and it just works.

CI/CD is simple - everything is integrated with everything. Plug a few commands and deploy your artifacts to any platform you need within seconds. Everything has built-in secret managers, integrations, packaged actions, telemetry, blue/green deployment capabilities, rollbacks, retries, audit, gated deployments, RBAC.

Modern IDEs have a lot of tools to deal with all of that out of the box. If anything, it has never been as easy as it is today to quickly develop a new system, roll it out to the end users, and scale from small to large with minimal efforts.

I understand that it's a lot for newcomers to comprehend, but that's what the current senior+ engineers are for.

Dimitar • Feb 23

It really annoys me to see all the unnecessary technical bloat people add just because it's new and it will look good on their CV. They just end up adding technical debt and leaving someone else to deal with the problems they created when they leave.

Why do we need to know 10 JS libraries that all do the same thing?

It's the same with cloud.

Christopher John Pepper • Feb 22

Agreed, but I would also add that "what is your company already using" should factor heavily too. At the end of the day, using a tool that other engineers in the business already know how to use is a big plus. As is "what tools can I easily hire more engineers to work with".

The comparison I would draw would be something like React. React is a pretty good tool for the job, but there are definitely cases where something else could be better (e.g. SolidJS, Svelte, Vue). If those are less heavily adopted in the company/country, then you're baking in a need to train/upskill. Again, not a deal breaker at all, but it should factor into your decision. As should "how hard is this to change".

Ecks Fiftyobe • Feb 22

This article says one thing. "Native-Cloud" is bad. Then it goes on to explain that it's a people issue, not a cloud issue. With engineer A making bad choices and not understanding the thing they chose to work with. It's not a bad thing to have 10 options. But it's up to the people to make sure the one, or combination of services they use will meet the requirements and are good choices. Having just one option doesn't make it better. If the engineers are confused, maybe they need simpler jobs.

I've been doing this over 20 years and was anti cloud. I was forced to move to 100% cloud and while I miss some of the on-prem control I had to resolve issues quicker in many cases, it's outweighed by the flexibility the cloud offers.