DevGraph

Posted on Apr 22, 2021 • Originally published at blog.engineyard.com

A Discussion with Clare Liguori from AWS Container Services

#aws #gitops #kubernetes #devops

Steam Powered - a Podcast from Engine Yard

Steam Powered is a podcast from the folks at Engine Yard where we talk about all things Cloud, Kubernetes, IaaS, PaaS, Ruby, and what's new in the world of development.

In the seventh episode of the podcast, our host Darren sits down with Clare Liguori, Principal Software Engineer for AWS Container Services, to discuss the influence of DevOps and GitOps on the container space, why migrating your app to containers also improves observability, and how Chaos Engineering has become even more important.

Check out the entire conversation that took place between Darren and Clare below. To listen to this podcast, click on this link.

Darren:

This is Steam Powered, the Engine Yard podcast where the time to deploy is now. We'll be talking about the tools and technologies that you care about: Ruby, cloud infrastructure, containers, and going from DevOps to NoOps. So let's see where the train is pulling into the station today.

We have a great show today. We have a very special guest from AWS. Clare Liguori is a principal software engineer at AWS. Her current focus is on the developer experience for container services, building tools at the intersection of containers and the software development lifecycle. So that includes local development, infrastructure as code, CI/CD, a bunch of great stuff. Clare, welcome to the podcast.

Clare:

Thanks, I'm happy to be here.

Darren:

It's a really interesting time, right? Seven or eight years ago, the development and deployment landscape was really different. People were deploying to EC2 or bare metal still, probably they were using Jenkins. And I've even heard people call that “CI/CD 1.0”. How do you see the state of the world now? Do you feel like we're in “CI/CD 2.0” at this point, or what stands out to you right now in the landscape?

Clare:

I think of CI/CD as one of the tools in our toolbox for really implementing our development practice. And I have seen huge changes in development practices over the last decade.

Seven or eight years ago, before I came to Amazon, I was working on largely shrink-wrapped software that was released once a year. At that time, we had a Cruise Control box that was doing continuous integration. But in order to create this release, the 1.0 or 1.1 release, we had a release engineer who had to coordinate across all these engineers and teams and bring together all the changes that needed to be part of that release.

And I think there are two things that I've seen change quite a bit since then, especially in coming to Amazon and learning more about Amazon's development practice. One is this transition to microservices and Software-as-a-service, where we no longer need to think about the 1.0, 1.1, or 2.0 release. We can think of changes to that service as more of a continuous flow, which is where we get that continuous delivery from our CI/CD pipelines today.

And then I think the other one is the rise of DevOps in the industry. So developers being able to be empowered really to deploy changes out to those microservices that they own and take responsibility for those changes going out to production. It's a very interesting time.

Darren:

While we're talking about DevOps and process, I've worked at places that have almost zero process. I've done government contracts that have more process than you'd ever want to know about. So having been at Amazon and being very interested in the developer experience and lifecycle, where do you put Amazon's own development process on that spectrum?

Clare:

What I like about how we practice software development at Amazon is that it's very empowering to individual teams. We put a lot of trust into the Amazon two-pizza team, meaning a team that can largely work independently of other teams and own their microservices in the system they're working in. And so a lot of development practice is up to the team, whether they're practicing Agile, Kanban, Scrum, or anything that works for them.

But that is coupled with some guardrails and best practices that we've learned over time and that really apply to a large swath of teams. Especially within AWS, our pipelines look at whether we have integration tests, whether we have monitoring, whether we have canaries running against production: all of those best practices that we know can help teams to more successfully roll out their software to production.

Darren:

Yes. Full disclaimer, I worked at AWS for four and a half years. Unfortunately, Clare and I didn't get to work together. But I love what you said, especially regarding empowering the team. I had the feeling there that it was almost like you were a small startup. But at the same time you had all of these resources, like you were saying, the guardrails and tooling, so it was kind of a best-of-both-worlds approach.

Let me ask you, everyone's been working from home now for quite a while. How have the pandemic and most folks being remote affected things?

Clare:

It's interesting, I think, largely software development has been unaffected because we were already sort of working asynchronously with tools. Code reviews have been asynchronous forever. So you send out a code review to your team, and hopefully a couple hours later, or maybe the next day, they'll take a look at it. And so that has translated really well into working from home.

The part that I think everybody at Amazon and in the industry is struggling with is the lack of whiteboards, that sort of collaboration on generating new ideas. And so I think the thing that the teams I work with have had to learn this year is how we continue to work together on things like box-and-line diagrams as we're thinking through an architecture or a solution, and putting that down on paper.

Darren:

Absolutely, I see teams that already are in a development cadence and are established being a lot less affected for the reasons that you said. But if you have a large scale, greenfield effort going on with a lot of design, it's not quite the same as just standing there at a whiteboard with someone.

One other thing to build on what you said. GitLab has their survey, and one of the really positive impacts they talked about is that, because people are really grasping onto the flexibility, the mean time to turn around a pull request or code review has actually gone down, because people are available at more times of the day.

In that same survey, they asked developers what changes they had made to their process in the last year. The top answer, at 21%, was continuous integration, followed by continuous development at 15%. Source code management and automated testing were also at 15%. These topics are clearly on people's minds.

You mentioned DevOps being one of the enabling factors. If you look into your crystal ball, what do you see for 2021 from the DevOps perspective?

Clare:

One of the newer trends that I am kind of excited about in this space is GitOps. And today, I think, this is largely being driven by the container space and Kubernetes space. GitOps is this idea that developers love to work in their source code repositories. That's what we want to be doing. And I think, especially in the container space, but across the board in service development, infrastructure as code has really exploded as a way to manage both your infrastructure and to describe how to deploy your application.

There are Kubernetes manifests and CloudFormation templates that contain the container image that you want to deploy for your service. And so GitOps is a way to continue to use that very familiar source code management and source code review and approval process that we're used to for application code, but now kind of move it into infrastructure as code and even release management. So I think that is something really interesting that people are starting to implement a little bit in this space. And I'm interested to see how much it starts to extend out past the early adopters.
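To make the GitOps idea a little more concrete, here is a minimal Python sketch that was not part of the conversation. It assumes the desired state has already been read from reviewed files in a Git repository into a simple service-to-image mapping, and it compares that with what is currently running. The `Action` type and `reconcile` function are hypothetical names for illustration, not the API of any real GitOps tool.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Action:
    """One change needed to bring the running system in line with Git."""
    kind: str                      # "create", "update", or "delete"
    service: str
    image: Optional[str] = None    # target image; None for deletes


def reconcile(desired: dict[str, str], actual: dict[str, str]) -> list[Action]:
    """Compare service -> image maps and return the changes to apply.

    `desired` stands in for what the merged manifests in Git declare;
    `actual` stands in for what the cluster reports as running.
    """
    actions: list[Action] = []
    for service, image in sorted(desired.items()):
        if service not in actual:
            actions.append(Action("create", service, image))
        elif actual[service] != image:
            actions.append(Action("update", service, image))
    # Anything running that Git no longer declares gets removed.
    for service in sorted(actual.keys() - desired.keys()):
        actions.append(Action("delete", service))
    return actions
```

In this sketch, changing a deployed image means changing the mapping in Git and going through the usual review and approval. A tool of this kind would then compute and apply the difference rather than someone running deploy commands by hand.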

Darren:

So let's shift gears for a second and talk a little bit about container use cases. From my perspective, with a lot of the folks that I'm talking with and our customers, I'm seeing a growing perspective that containers are almost becoming the default choice. Some shops are taking the view that they would only use instances if they had some requirements that really dictated going in that direction. Of course, in that landscape, you've also got serverless, right, which is a huge abstraction for developers. Do you see that becoming almost more of a default choice for application deployment?

Clare:

I think that one of the things that I like about containers, and that I think is still true today for why it's so helpful to adopt containers, is really this idea that it's so easy to package an application as a container. Before containers, I think we didn't really have a standardized way to package applications for the cloud or for deployment. And that's really what containers give us, that really standardized way no matter what language you're writing in or how it's going to run in that container. There's a standardized packaging and build process.

Another one is then being able to run it on your laptop, just like you run it in the cloud. I think everybody has run into those cases as well. It worked on my laptop, but for some reason it didn't work when I deployed it into the cloud. And containers really give us a better way to describe the environment that we want that application to be running in. And so I think that that is helpful, sort of across the board for any type of application.

I think serverless is really exciting, and we especially see this with teams at Amazon, where we own both the development process and operating the application. Because so much of the serverless application is fully managed and scales on its own, it kind of takes care of itself. The operational burden for developers is so much less in many cases, because you don't have to think about auto scaling, you don't have to think about scaling up and down when you have a burst of traffic. It's gonna take care of that for you.

And you get to write quite a bit less code because you don't have to think about how this application is going to start up, do health checks, and things like that. I see it as a trade off of running applications that you're familiar with, how they run in containers, and using your favorite frameworks and things like that, or getting some of those operational benefits from serverless.

Darren:

Yes, reliable deployments and getting rid of the “it works on my machine” problem are such nice features. Better resource utilization is another thing that containers bring you, maybe not at the forefront of every developer’s mind, but certainly on the minds of architects and managers. Is that something that comes up quite a bit?

You could also contrast that with serverless, where there's not a fixed resource, it’s completely pay as you go. But how much does that come up?

Clare:

I do think that resource utilization concerns have led to the rise of containers in general. But I actually think we are moving into a really interesting place in the container space, in terms of serverless containers with services like AWS Fargate, where you don't really need to worry about resource utilization. Then taking that even further with serverless functions like Lambda, you're not worrying about utilization or auto scaling at all. And so for me, that's really the best of both worlds: being able to get this great developer experience with containers and serverless, and no longer needing to even think about tuning that resource utilization in your container cluster.

Darren:

Let me ask you about different use cases for containers. We have a lot of customers that are migrating from on-premise to the cloud, or even from instance-based applications. We tend to see this distinction between application components, your web app, whatever asynchronous processes you might have running, and then other infrastructure components. For example, your database, a Redis setup, or you might have shared storage. It seems like the application components have a much easier migration path into a container. And so we try to do that as a first step, and then leave some of those other infrastructure components as the second step.

How do you see the different use cases? And what are your thoughts on dealing with existing applications, which may have seven or eight different components, and moving them into a container world?

Clare:

I think containers are a really great tool for modernizing applications. So one of the things that I tend to see is if you're moving from maybe an on-premise application, or an application that you're running on EC2 instances, it's so easy in that world to really build more of a monolithic application where you have the database running on the same instance or virtual machine. Your logging tools are right there, built into that instance. And one of the things that we start to see with containers is customers taking it as an opportunity to break apart some of those applications into individual components that can be deployed individually, scaled individually, and monitored and patched individually.

That gives them a lot of freedom to modernize that application and start to incorporate some of those development practices that we were talking about before of CI/CD and individual ownership of some of those components instead of deploying them all as kind of a monolithic application.

Darren:

We have some customer challenges around apprehension. How much time and effort is this migration going to take me? Or do I lose a little bit of control? You know, there used to be this, quote unquote, physical server that I could SSH into and go look at logs. Kubernetes is not trivial. I don't want to say complicated, but there are a number of pieces and parts to it. So do you feel like with these customer apprehensions, there is a better story at the end of this with observability? Or what would your message be to technologists as they look at these transitions?

Clare:

We've seen some of this even inside of Amazon. For a long time, we also ran applications on instances, and we were very used to SSHing into them. And I think there are two things there. I think one is, even on instances, we really try to discourage logging in. Because it's kind of a shortcut, right? It's letting you not think ahead of time about where these logs should be going, what tools you need to troubleshoot this application, and what you need to put in place to be able to troubleshoot it without that sort of break-glass SSH into the instance.

I think moving into containers forces that thought a little bit more, because you start to have to think about this in a totally different context where you're packaging and running your application in a different way. But one of the things that I think you have at the end of the tunnel is a much better story around observability, with things like CloudWatch Container Insights. A lot of the teams inside of Amazon that have been adopting containers have seen real excitement around some of the CloudWatch features that are coming out, where all of a sudden we can do these really complex queries against all of the logs that are coming out across all of our containers, start to see trends, and start to troubleshoot complex situations that are going on across a fleet of containers. That's hard to do when there are things going on across a large-capacity fleet. So I think there are a lot of benefits that you gain from thinking about observability and how you're going to manage that.

Darren:

I like how you said it. It almost feels like SSHing in is this temptation, because you can do it. It may help, but you know it's not going to scale. I can't log into 100 boxes and see what's going on.

I think there's a huge promise there, and let me know if you see this, that you get better observability at the individual component level. So a typical use case would be I have different containers for different microservices. And I can scale those individually, which is really nice. But is it then easier to pinpoint where the problems are occurring? What's behaving badly? What's not performing the way that I want it to? Do you see advantages there as well?

Clare:

Yes, absolutely. I think one example is if you think about that monolithic application running on an instance. The only metric that you have for something that has kind of gone into a death spiral spinning on the CPU is that something is at 100%. And now you need to figure out, what process is it? How long has this been going on? You don't necessarily have those really in-depth metrics that you could get from thinking about that early and instrumenting that through something like containers or deploying them as individual components in your system. You don't always have that insight into what's going on.

And so, I do see that as we start to break these things apart into individual components, we can start to not only monitor them individually, but take action on them individually as well.

Darren:

Let me ask you this, we've been talking about migrating existing applications. Is there anything that if you're building a new application, you need to think differently about?

Clare:

One of the things that I think you have to get used to with containers is this idea that they are somewhat more ephemeral than instances. We tend to think of instances as having really, really long uptime. And that tends to bleed a little bit into how we design our applications. We don't think as much about making sure that we are gracefully shutting down connections when the application is shutting down, because it doesn't shut down that often.

And so with containers, especially with container orchestrators being able to rebalance and auto scale your applications across a fleet of compute capacity, you have to start thinking about what if the lifetime of this container is only 12 hours? What if it's only four hours?

And so I think the rise of things like chaos engineering is actually really great in this space, because it helps to inject some of that into your application, and start to exercise some of those code paths a little bit more often. And you start to see those patterns coming up in your metrics. For example, if you do have an issue with how your application shuts down, you might only see that occasionally, maybe once a week when you're deploying. But with both containers and chaos engineering, you start to see that all the time in your metrics, and you start to pay attention to it.

So thinking about and finding those edge cases in your applications becomes a little bit more interesting, it's actually forcing us into better practices.
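As an illustration of the graceful-shutdown point above, here is a minimal Python sketch that was not part of the conversation. It assumes the orchestrator signals the container with SIGTERM before stopping it, which is the common default. The `serve` function, the work items, and the `close_connections` callback are hypothetical stand-ins for a real service loop and its connection pool.

```python
import signal
import threading

# Set by the signal handler; checked by the main loop between units of work.
shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Only record the request; the main loop performs the actual cleanup."""
    shutdown_requested.set()


def serve(work_items, close_connections):
    """Process items until shutdown is requested, then always clean up."""
    processed = []
    try:
        for item in work_items:
            if shutdown_requested.is_set():
                break  # stop taking new work; finished items stay finished
            processed.append(item)  # placeholder for handling a request
    finally:
        # Runs on normal completion, on shutdown, and on errors alike.
        close_connections()
    return processed


if __name__ == "__main__":
    # Register the handler so a stop request triggers a clean exit.
    signal.signal(signal.SIGTERM, handle_sigterm)
    serve(range(3), lambda: print("connections closed"))
```

If containers are replaced every few hours, this shutdown path runs constantly instead of once a week, which is why exercising it deliberately pays off.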

Darren:

Tell us a little bit about AWS Proton. Proton was announced at re:Invent.

Clare:

Proton is an application deployment and delivery service. And where it comes from is that we've started to see a challenge with the amount of infrastructure as code that developers are needing to manage for their applications, and how users are sort of struggling to achieve some standardization and best practices across all of this infrastructure code across all of their applications. And so Proton really helps with that problem by letting you abstract some of this infrastructure and other AWS resources, like managed databases and CloudWatch log groups, away from the developers. So they can start to focus a little bit more on their application code and less on the infrastructure as code, and you can achieve some standardization across your organization.

The way that it works is that you create what we call Proton templates, which really represent all of the resources that are needed for a particular type of application. So for example, you might have a container service that's a request-reply API service running behind a load balancer. And we find that that's a really common sort of pattern across a lot of applications, especially as you start to get into these modern application best practices like using microservices. So you can register this Proton template that contains all of the resources that would be needed by any API service, as well as the pipeline definition for deploying that application. And then when an application developer comes into Proton, they just need to select that Proton template, fill in a few little parameters that they want to tweak, and then they'll have all of the infrastructure needed. They'll have their container image deployed, and their application code will be automatically deployed by a CI/CD pipeline that's provisioned by Proton as well. So I'm really excited about it. This is something that I've been focusing on for a while now. So I'm really excited to see it in preview at re:Invent in 2020 and love hearing all the feedback from customers. We have a public roadmap on GitHub, so it's really great to have customers trying it out and giving us feedback.
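The template-plus-parameters flow described above can be pictured with a small Python sketch that was not part of the conversation. This is not Proton's actual template format or API. `API_SERVICE_TEMPLATE`, `instantiate`, and the parameter names are hypothetical, chosen only to show a platform team fixing the set of resources while developers fill in a few values.

```python
from __future__ import annotations

# Hypothetical template owned by a platform team: a fixed set of resources
# plus the small number of parameters a developer is allowed to tweak.
API_SERVICE_TEMPLATE = {
    "resources": ["load_balancer", "container_service", "log_group", "pipeline"],
    "parameters": {"cpu": 256, "memory": 512, "desired_count": 2},
}


def instantiate(template: dict, service_name: str, image: str, **overrides) -> dict:
    """Return a concrete service definition from a template and overrides.

    Overrides outside the template's declared parameters are rejected,
    which is how the template enforces standardization.
    """
    unknown = set(overrides) - set(template["parameters"])
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    return {
        "service_name": service_name,
        "image": image,
        "resources": list(template["resources"]),
        # Template defaults first, then the developer's chosen values.
        "parameters": {**template["parameters"], **overrides},
    }
```

A developer would call something like `instantiate(API_SERVICE_TEMPLATE, "orders", "orders:1.0", cpu=512)` and get the full set of standard resources without writing any infrastructure code themselves.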

Darren:

One last question for you, and maybe this ties into what you just talked about with Proton. I know when I was at AWS, the team worked really hard, and when we delivered, especially at re:Invent, it was so amazing to see this value get into customers' hands. But is there anything from your time at AWS that stands out to you? Maybe a product launch or just something personally that you had a lot of fun working on?

Clare:

I would actually say Proton is it. This project is something that I've been involved with since the beginning, since we started hearing customers tell us over and over again that they were building what they would call an internal development platform as a way to try to manage all of this complexity that was being put on their developers deploying their applications to the cloud. And so taking that from just, you know, a tiny kernel of an idea, of a trend that we were seeing and what customers were telling us, all the way to designing and launching the service has been really exciting for me personally.

Darren:

That's awesome. Well, Clare, thank you so much for joining us. We really appreciate you coming on the show, it's so great to get your insights and to talk with you. And we just wish you and your team all the best in 2021.

Clare:

Thank you.

Darren:

Once again we want to thank Clare for joining us. You've been listening to Steam Powered, the podcast from Engine Yard. Use Engine Yard Kontainers Platform-as-a-Service to deploy your apps now with one click onto a fully managed Kubernetes stack. For more information visit us at engineyard.com.

AWS, Containers, Engine Yard
