
Jennifer Davis


Testing Infrastructure at ✨ Corp, a DevOps Story

George, a junior SRE, and Sonia, a senior SRE, are collaborating on the first stage of an infrastructure migration. As they both work remotely, they pair using Visual Studio Code and Live Share so that they can collaboratively edit and debug while keeping their own editor features and preferences.

They start up a Zoom meeting and post the connection details in their team channel so anyone can join in during their pairing sessions.

"I've cloned the packer repo here on to my local machine," George said while opening up the folder in VS Code, knowing that Sonia could follow along in her view. "Let's start with a feature branch. We should name it after the issue number from the project board, right?"

"Great," said Sonia. "That's perfect."

"So I took the existing JSON template here. I removed all the extra customization and made a base template. We can reuse this base template to customize rather than copying a customized template," explained George.

"That's a great idea," said Sonia. "Let's go ahead and commit that change so we have separate commits if needed."

"You know, I'm wondering if we write some InSpec code to test against our infrastructure now, then we can use those same tests to validate our applications on the cloud provider instances. That way we can increase our confidence that the application is working regardless of whether it's on-prem or in the cloud, " George commented. Taking his hands off his keyboard, he looked directly at the screen pausing in the coding.

"I haven't heard of InSpec. What is it?" asked Sonia switching to camera view, as she wanted to focus on what George was saying.

"It's a testing tool for infrastructure, mainly advertised as a way to do compliance as code, and it totally can do that, but I've found it super useful to use it for tests against my infrastructure in general just to identify whether I've set something up the way I expected to. For example, if I was setting up an http server, I could use an InSpec http resource to validate a specific endpoint. So just like developers write unit and integration tests for their code, we can write tests for our infrastructure, " George explained.

"This sounds pretty interesting, but I'd be concerned that we would spend a lot of time evaluating the options and whether it would be useful to us in the short term without adding a lot of overhead to the project plan," Sonia said. While the idea sounded intriguing to her, there wasn't a lot of excess capacity on the team to support learning a new tool.

"Well, one of the things I did when I was starting to work on this packer project, was to add an InSpec verification action after the packer build action, and..." started George.

"Uh-oh, and?" asked Sonia.

"Well, it was super helpful for me to state explicitly my expectations of what the image should include when it was built..."

"But, if you are adding InSpec to the image, couldn't that modify the potential interactions with existing dependencies" interrupted Sonia.

Shaking his head, George explained, "No, it runs externally against the build, and that way we can validate the image before we publish it."

"Oh, that is pretty interesting. So we could eliminate some of the known issues with the builds and have a way to collect what information we are already verifying. Oh! So then Erin from Security Engineering could look directly at the repo as well and see what we are verifying as well. That might streamline how we do security audits as well," Sonia said. She was sensitive to the time constraints that the schedule had, although security audits generally took days. Magnify those days across each part of this project, and the time saved in manual checks, it might make it worth the time spent learning a new tool.

"Can we walk through what you've got so far?" she asked.

George switched back to VS Code, brought up the profiles directory, and opened a default_spec.rb file. "This is just the default set of tests to run against the image. Right now it's pretty basic. I've got it testing whether the default user still exists on the image, and whether we have access to the system. It also checks to make sure that NTP is running because some of our services are time-sensitive. The way to declare a test is in the form of a describe statement:

describe service('service_name') do
  it { should be_installed }
  it { should be_enabled }
  it { should be_running }
end

So for our NTP service, I have:

describe service('ntp') do
  it { should be_enabled }
  it { should be_running }
end

If anything fails, the image isn't uploaded to the repository. There are a lot of different resources. It's Ruby, so it's not hard to create complex resources," said George with enthusiasm.

"This is really awesome. Let's share this with the rest of the SRE team, as I think it will be really useful for us to adopt the practice of testing our infrastructure in different ways than we do now. This seems like the shell script that we currently use to validate instances could be replaced by this even!

Let's hold off on adding that to the repository, though, until you have a chance to share this at the weekly team meeting where we share what we learned during the week," said Sonia.

"Oh no! I couldn't share this yet, it's not complete." said George.

"It's totally ready for being shared. I think it's at the stage where we want to get additional feedback from the team to see what they think of it." said Sonia.

"I thought that meeting was more for showing off what people had accomplished for the week, " said George.

"No. I admit that it can look like that at times, but in reality, it's a way of sharing new ideas and spreading learning across the team. We can't all be specialists, and we can't learn about everything. But it's a way of spreading knowledge and understanding as much as we can as well as encourages people to experiment like you've done here. By sharing this kind of possible tool and process change early, we can prevent folks from getting surprised by introducing a ready baked change." explained Sonia.

"That's a good point, I hadn't thought of it that way," said George.

"I'll update the team meeting agenda for the week, and you can share your work on Wednesday. Let's go ahead and review this packer change for the base image so we can get a pull request in," said Sonia.

After reviewing, they realize that there are a few changes that are needed before the commit can be integrated back into the central shared repository.

git commit -m "Copied customized template to a new base template.

This introduces a new base template that can be used to build a base
Sparkle Corp image with all the security-mandated packages, as well as
monitoring and minimum versions of shared common packages.

This new base template can also be used as a starting point for new customized images.

Co-authored-by: george <george@sparkle.corp>
Co-authored-by: sonia <sonia@sparkle.corp>"

"Great, the commit explains the context for the change that you've made, along with why and how. This way we can look back on it and know what we were thinking here," said Sonia.

After the encouragement from Sonia, George presented a short tutorial about InSpec to his team, along with his sample tests against the packer-built base images. Everyone on the SRE team was really excited and paired in turns with George to start creating profiles for each of the different services. As the team already used Chef to manage their infrastructure as code, the mapping of test resources felt seamless. In addition, the team could leverage the compliance profiles available in the community to do more rigorous security testing.

Plugging these tests into their continuous integration and continuous deployment system allowed the SRE team to speed up the testing of the infrastructure that they were building out and increased their confidence that the systems were configured correctly.

Top comments (2)

charles1303

Nice illustration.

For me, there are three important takeaways here for progressive team learning and solution delivery:

  1. How George took his time to explain clearly what he was trying to implement to Sonia.

  2. The fact that Sonia listened to George and tried to follow in an unbiased manner.

  3. The approach Sonia suggested for the onboarding process of other team members.

sigje
Jennifer Davis

THANK YOU :D It's so great to get this kind of feedback from folks and the takeaways as it gives me a way to see whether what I'm trying to convey I actually accomplished!