Gennovacap explains how devops automation and devops best practices can help cut costs, deploy apps faster, and achieve 99.9% uptime.
SUMMARY
In this guide we analyze a common devops problem that startup companies face as they scale up their applications. At Gennovacap, we build devops automations based on our devops best practices to help our customers scale to the moon. Our devops automation resolves scalability issues, reduces cloud costs, and speeds up software delivery by 6X to 10X.
To encapsulate these learnings, Gennovacap decided to share our devops best practices with interested technical and business leaders. The following strategy proposes a well-architected solution, with a technical guide and the benefits for businesses that choose this solution.
TABLE OF CONTENTS
- Summary
- Tech Debt in Devops
- Fast and cheap Devops – the Growth Killer
- Our Devops Automation and Managed Cloud Services Practices
- Devops Automation Benefits
- How Can Gennovacap’s Devops Best Practices Help My Company?
- Devops Automation Case Study
- Devops Challenges: Slow releases, Unstable Deployments, and Service Disruptions
- Application Challenges: Memory Leaks and Auto Scaling Issues
- Solution: Resilient Cloud Architecture
- Solution: Devops Automation Tools and Cloud Resources
- Technical Guide: Devops Automation and Devops Best Practices
- Results: Saved $5400 /month, Released 15X Faster, and Reached 99.9% Uptime
Estimated reading time: 11 minutes
TECH DEBT IN DEVOPS
Over the past 10 years, Gennovacap’s team has been a part of shipping software for technology companies. We’ve seen firsthand how underrated devops is during the early stages of product development. Most startup companies spend 80% of their development cycles shipping features and the other 20% patching bugs.
At the early stage, startups choose deployment tools like Jenkins or Heroku. These deployment tools are good to start with, but they don’t scale efficiently with operations and certainly aren’t built for automation. The tech debt and costs incurred from these tools grow as the applications grow. If a company’s customer base doubles in one year, then devops automation becomes a critical strategy to achieve high growth and scale. This is where we can help you with our devops best practices and devops automation strategies.
FAST AND CHEAP DEVOPS – THE GROWTH KILLER
There’s an old saying in software development that goes something like, “Fast, good, or cheap – pick two.” Anyone who has ever built software has felt the pressure of weighing the opposing forces of features, speed and cost against each other.
Startup companies usually choose fast and cheap for their early stage devops strategy, which is why they choose Heroku or Jenkins. It’s smarter at that early stage to get the product launched and into customers’ hands to iterate quickly. However, when success catches up to you, you need a feature-rich devops process to improve development and product quality. At Gennovacap, we focus on quality with each devops iteration so that our customers can reach scaling goals, deploy features faster, and achieve high growth.
OUR DEVOPS AUTOMATION AND MANAGED CLOUD SERVICES PRACTICES
Last year, Gennovacap put together a case study covering Cloud Cost Optimization using AWS Devops: How an AI company saved 90% on cloud costs. CTOs at software companies always want to know how we did it. How did we save a company $18,000 / month using our devops best practices?
The answer is not simple. In fact, it’s a lengthy process involving operations, code repositories, deployments, network, storage, pipelines, and databases in the cloud. At Gennovacap we break these steps into these devops best practice areas:
- Upgrade the application build process, to make it ready for running on Kubernetes (AWS EKS) as a Docker container, by following the 12-factor app methodology.
- Adopt Infrastructure as Code practice for managing and evolving cloud resources, leveraging industry standard, cloud-agnostic tools and processes.
  - Infrastructure as code
  - Platform as code
  - Configuration as code
  - Policy as code
- Create CI/CD Processes for consistently and reliably building and deploying managed artifacts. See our latest article on 11 Open Source Kubernetes CI CD Tools to Improve Your Devops
- 24x7x365 support, continuous monitoring, and incident response for AWS Resources, GCP Resources, or Azure Resources.
  - Logging
  - Monitoring
  - Alerting
  - Tracing
- Security and Disaster Recovery Plans
- Security patches, software upgrades, infrastructure maintenance
DEVOPS AUTOMATION BENEFITS
By implementing devops automation using these devops best practices, every company we consult for can achieve benefits like:
- Reduced Cloud Costs
- Faster Releases
- Compliance
- Hardened Security
- 99.9% Uptime
In the remainder of this article, we examine a case study that lists some of the devops automation tools and outlines a basic cloud architecture for devops automation.
HOW CAN GENNOVACAP’S DEVOPS BEST PRACTICES HELP MY COMPANY?
In the course of working with clients, we documented every issue and have created a series of devops best practices. From the devops best practices, we compiled a case from an existing client who faced scaling problems with their existing devops setup on AWS.
In this case we break down the devops automation tools and managed support strategies we put in place for this startup company. By implementing these devops best practices, the client lowered costs by $5,400/month, deployed 15X faster, and reached 99.9% uptime.
DEVOPS AUTOMATION CASE STUDY
COMPANY PROFILE
Founded: 2012
Company Market: B2B SaaS Startup
Customers: Globally focused small and medium sized businesses with regional branches of multinational companies.
Cloud costs: $9,000 / month
APPLICATION STACK
- Ruby on Rails
- PostgreSQL
- Background Jobs
- Redis
- Multi-tenant SaaS
CLOUD SERVICES
- AWS OpsWorks
- AWS Certificate Manager
- RDS
- EC2 (Ubuntu Linux)
- Classic Load Balancer
- CloudWatch Logs and Alarms
APPLICATION AND CLOUD INFRASTRUCTURE: VANILLA RUBY ON RAILS AND AWS EC2
The startup company built their software on a very vanilla Ruby on Rails stack, using PostgreSQL on AWS RDS and EC2 instances.
DEVOPS CHALLENGES: SLOW RELEASES, UNSTABLE DEPLOYMENTS, AND SERVICE DISRUPTIONS
They deployed the software on an early version of AWS OpsWorks, which did not include any CI/CD tools. As a result, release cycles were very slow and unstable. Above all, their engineering team needed to focus on the core product instead of triaging scaling issues. To sum up, here are the issues they faced:
- Older cloud technology / devops tools caused a devops process bottleneck
- Rising cloud costs – $9,000 / month
- 2 days / week used to deploy and stabilize code
- Need additional support to handle IT related issues (servers, maintenance, security, upgrades)
APPLICATION CHALLENGES: MEMORY LEAKS AND AUTO SCALING ISSUES
In addition to the IT and devops problems, the multi-tenant SaaS application contained fundamental problems, like memory leaks. The memory leaks forced the engineering team to add EC2 instances every week after each deployment. Once they fixed the leaks, they had to spin the servers back down to keep costs low.
Further, they did not have autoscaling to handle large API requests, which often caused service disruptions for customers. In short, their devops needed a serious upgrade and they also needed a flexible auto scaling option for their application.
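The article does not say which scaling policy was adopted, so the following is only an illustrative sketch. It uses the proportional rule documented for the Kubernetes Horizontal Pod Autoscaler (desired = ceil(current replicas × current metric / target metric)). The function name, the replica bounds, and the CPU figures are assumptions made for the example.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int = 2,
                     max_replicas: int = 20) -> int:
    """Suggest a replica count with the proportional scaling rule.

    desired = ceil(current_replicas * current_metric / target_metric),
    clamped to [min_replicas, max_replicas]. The bounds are hypothetical.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# 4 pods averaging 90% CPU against a 60% target: scale out.
print(desired_replicas(4, 90.0, 60.0))
# 6 pods averaging 20% CPU against a 60% target: scale in.
print(desired_replicas(6, 20.0, 60.0))
```

The real controller also applies a tolerance band and stabilization windows to avoid flapping; the sketch leaves those out to keep the core arithmetic visible.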
SOLUTION: RESILIENT CLOUD ARCHITECTURE
The devops automation tools, cloud infrastructure, and application architecture we proposed for this startup company included: Terraform, GitLab, AWS CodeBuild, AWS CodeDeploy, AWS EKS, AWS ECS, AWS EC2 Auto Scaling Spot Fleet, and numerous Kubernetes tools with monitoring and alerting.
SOLUTION: DEVOPS AUTOMATION TOOLS AND CLOUD RESOURCES
The following is a complete list of all the AWS cloud resources and devops automation tools needed to solve the scaling problems, address the application issues, and provide monitoring and support:
Containers and Microservices
- EKS
- ECS
- Docker
- Kubernetes
Identity and Access
- IAM with RBAC
Continuous Delivery and Continuous Integration
- CodeBuild
- GitLab
Infrastructure as code
- CloudFormation
- Terraform
- Systems Manager
- Config
Monitoring and logging
- CloudWatch
- Grafana
- Prometheus
Version Control
- GitLab
Databases
- RDS
Network Services
- Auto Scaling
- Load Balancers
- VPC
DNS
- Route 53
Storage
- S3
Certificates
- ACM certificate
Servers
- EC2
- EC2 Spot Instances
TECHNICAL GUIDE: DEVOPS AUTOMATION AND DEVOPS BEST PRACTICES
To implement the full devops automation strategy, we took the steps listed below. This strategy follows all of our devops best practices. For brevity’s sake, we will not dive into how to configure all these components and systems in this article. If you’re interested in receiving the technical devops automation guides, please sign up for our newsletter in the form below.
Upgrade the application build process, to make it ready for running on Kubernetes (AWS EKS) as a Docker container, by following the 12-factor app methodology:
- Minimum impact on existing development workflows
- Immutable images that can be stored in an artifact repository
- Ready to run on Kubernetes
- Optimal resource usage
- Native IAM integration for fine-grained EKS service roles
- Native integration with AWS CloudWatch Logs log shipper
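One 12-factor rule that makes immutable images practical is keeping configuration in the environment, so the same image can run in every environment. The sketch below illustrates that idea only. The client application is Ruby on Rails, and Python is used here purely for illustration; the variable names `DATABASE_URL`, `REDIS_URL`, and `LOG_LEVEL` are assumptions, not taken from the client's setup.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class AppConfig:
    """Settings the container reads at startup (hypothetical names)."""
    database_url: str
    redis_url: str
    log_level: str = "INFO"


def load_config(env: Mapping[str, str] = os.environ) -> AppConfig:
    """Build the config from environment variables and fail fast if a
    required variable is missing, instead of failing later at first use."""
    required = ("DATABASE_URL", "REDIS_URL")
    missing = [key for key in required if key not in env]
    if missing:
        raise RuntimeError(
            "missing required environment variables: " + ", ".join(missing)
        )
    return AppConfig(
        database_url=env["DATABASE_URL"],
        redis_url=env["REDIS_URL"],
        log_level=env.get("LOG_LEVEL", "INFO"),
    )
```

Passing the environment in as a mapping keeps the loader easy to test: a plain dictionary can stand in for `os.environ`.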
Adopt Infrastructure as Code practice for managing and evolving cloud resources, leveraging industry standard, cloud-agnostic tools and processes:
- Use Terraform as an Infrastructure as Code tool to manage changes in a controlled way, leveraging Git as the source of truth and collaboration tool
- Changes applied via CI/CD system (see next topic), avoiding hard-to-track manual changes applied via AWS console or API
- Self-document changes via Git history
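The core idea behind Infrastructure as Code is declarative: the desired state lives in Git, and a tool works out which changes move the real infrastructure toward it. The toy model below illustrates that comparison step. It is not how Terraform is implemented, and the resource names and attributes are invented for the example.

```python
from typing import Any, Dict, List, Tuple

Resources = Dict[str, Dict[str, Any]]


def plan(desired: Resources, actual: Resources) -> List[Tuple[str, str]]:
    """Return (action, resource_name) pairs that move `actual` to `desired`."""
    changes: List[Tuple[str, str]] = []
    for name in sorted(desired.keys() | actual.keys()):
        if name not in actual:
            changes.append(("create", name))
        elif name not in desired:
            changes.append(("delete", name))
        elif desired[name] != actual[name]:
            changes.append(("update", name))
    return changes


# Hypothetical state: what Git declares versus what currently exists.
desired_state = {
    "bucket.assets": {"versioning": True},
    "eks.prod": {"node_count": 3},
}
actual_state = {
    "bucket.assets": {"versioning": False},
    "ec2.legacy": {"instance_type": "m4.large"},
}

for action, name in plan(desired_state, actual_state):
    print(f"{action:7} {name}")
```

Because the plan is derived from files in Git, reviewing a pull request and reviewing an infrastructure change become the same activity.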
CI/CD Processes for consistently and reliably building and deploying managed artifacts:
- Enable and set up build and test pipelines in Jenkins or another build platform (e.g., AWS CodeBuild or GitLab) using modern standards and patterns
- Custom workflows for enabling live test environments for each development branch, that can be automatically disposed of as soon as testing is done
- Allow developers to also make and propose changes via PRs (pull requests), that can then be automatically applied to the system once approved
- Autoscaling workers for the build system for reduced costs and faster builds
- Implement monitors for the entire CI/CD process, that can be used to inform the status of events of interest for developers and OPS operators, and integrate them with team communication tools, like Slack
- GitOps Continuous Delivery pipeline:
  - Releases for multiple environments (dev, prod, etc.)
  - Controlled release rollouts (for prod)
  - Database migration controller
  - Deployment rollback mechanism
  - Kubernetes cluster add-ons (DNS manager, load balancer ingress controller, among others)
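Per-branch test environments need a predictable, valid name for each branch. One possible approach, sketched below, derives a Kubernetes namespace name from the Git branch name. Namespace names must be valid DNS labels (lowercase alphanumerics and hyphens, at most 63 characters); the `preview` prefix and the function name are assumptions made for the example.

```python
import hashlib
import re

MAX_LABEL_LENGTH = 63  # Kubernetes namespace names must be valid DNS labels.


def preview_namespace(branch: str, prefix: str = "preview") -> str:
    """Turn a Git branch name into a namespace name for a throwaway environment."""
    # Lowercase, then collapse every run of disallowed characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")
    if not slug:
        raise ValueError(f"branch {branch!r} has no usable characters")
    name = f"{prefix}-{slug}"
    if len(name) > MAX_LABEL_LENGTH:
        # Truncate, then append a short hash of the full branch name so that
        # two long branches sharing a prefix still get distinct namespaces.
        digest = hashlib.sha256(branch.encode("utf-8")).hexdigest()[:8]
        name = f"{name[:MAX_LABEL_LENGTH - 9].rstrip('-')}-{digest}"
    return name


print(preview_namespace("feature/ABC-123_Add-Login"))
```

A deterministic name lets the pipeline find and delete the environment when the branch is merged or closed, without storing any extra state.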
Managed AWS Services and Resources:
- EKS Cluster setup for each environment
- Auto Scaling setup for different node pools (application and delayed job workers)
- Docker image Container Registry with automatic security assessments for vulnerabilities
- Shared Application Load Balancers for decreasing costs on both prod and development environments
- AWS SSO integration with EKS via IAM and RBAC
- Redis ElastiCache
- RDS
- Fine grained IAM policies for DNS records, S3 buckets and Service Roles
- EKS integration with CloudWatch Logs, for both app and infrastructure layers
- CloudWatch alarms for monitoring key platform components
- ACM certificate integration for public-facing and internal-facing components
- Automated DNS management with Route53 and EKS
- VPC endpoints for private connections between the VPC and supported AWS resources
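To make "fine grained IAM policies" concrete, here is a minimal sketch that builds a read-only policy document scoped to a single prefix of one S3 bucket. The document layout (`Version`, `Statement`, `Effect`, `Action`, `Resource`, `Condition`) follows the standard IAM JSON policy format. The bucket name, prefix, and statement IDs are hypothetical, and the client's actual policies are not shown in this article.

```python
import json


def s3_read_only_policy(bucket: str, prefix: str) -> dict:
    """Build an IAM policy allowing read access to one prefix of one bucket."""
    prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is a bucket-level action, so it is narrowed
                # with a condition on the requested prefix.
                "Sid": "ListPrefixOnly",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                # Reading is an object-level action, scoped by the object ARN.
                "Sid": "ReadObjectsUnderPrefix",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
        ],
    }


# Hypothetical bucket and prefix, for illustration only.
print(json.dumps(s3_read_only_policy("example-app-uploads", "tenant-42"), indent=2))
```

In a Terraform-managed setup, a document like this would normally be declared in the Terraform configuration rather than generated by a script; the Python form is used here only to show the shape of the policy.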
Security and Disaster Recovery:
- Segmented VPC with isolated subnets and NAT gateways spawned across multiple availability zones for increased reliability
- CloudTrail integration with CloudWatch Logs
- Integrated secrets management between AWS resources, CI/CD systems, applications and Terraform
- Backup AWS account setup
- Enforcement of Multi-Factor authentication and password rollouts
- SSO authentication for multi-account access
- AWS Organizations setup (Backup and Main AWS accounts), including consolidated billing
- Fine-grained policies for S3 access, DNS management, load balancer registration and certificate renewals
- Leverage in-house tools or managed services like Skeddly for managing database backups and S3 data replication on another AWS account
- Extensive use of in-transit and at-rest data encryption mechanisms for inter-resource communications
- Continuous and automatic rollout of security updates for all EC2 instances
- Optional use of AWS Trusted Advisor, for extra security recommendations and reports
- ClientVPN setup to provide access to internal resources
- WAF integration via Terraform
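A segmented VPC starts with an address plan: one subnet per tier per availability zone, with no overlaps. The sketch below carves such a plan out of a VPC CIDR block using Python's standard `ipaddress` module. The CIDR block, the zone names, the `/20` subnet size, and the two tier names are all assumptions for illustration, not the client's real network layout.

```python
import ipaddress
from typing import Dict, List


def plan_subnets(vpc_cidr: str, zones: List[str],
                 new_prefix: int = 20) -> Dict[str, Dict[str, str]]:
    """Assign one non-overlapping subnet per tier and availability zone."""
    network = ipaddress.ip_network(vpc_cidr)
    blocks = network.subnets(new_prefix=new_prefix)  # yields blocks in order
    layout: Dict[str, Dict[str, str]] = {}
    for tier in ("public", "private"):
        layout[tier] = {}
        for zone in zones:
            try:
                layout[tier][zone] = str(next(blocks))
            except StopIteration:
                raise ValueError("VPC CIDR is too small for this layout") from None
    return layout


# Hypothetical VPC range and zones.
for tier, subnets in plan_subnets(
    "10.0.0.0/16", ["us-east-1a", "us-east-1b", "us-east-1c"]
).items():
    for zone, cidr in subnets.items():
        print(f"{tier:8} {zone}  {cidr}")
```

Spreading each tier across several zones is what lets the application keep serving traffic when a single zone has problems.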
Ongoing operational support:
- Add an on-call site reliability engineer for monitoring and support
- Prometheus and Grafana stacks for analyzing historical cluster and application data, providing details that can be used to tweak and optimize different operational aspects
- Documentation and playbooks for common and potential issues
- Alert routing setup for on-call engineers
- Optional: Amazon DevOps Guru for insights and early fault detection and notification
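Alert routing decides who gets notified about what. The sketch below shows a simple first-match routing table keyed on alert labels. It does not reproduce the configuration format of any particular tool, and the label names, receivers, and channel names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Alert:
    name: str
    labels: Dict[str, str]


# Each route is (labels that must all match, receiver). Order matters:
# the first matching route wins. All values here are illustrative.
Route = Tuple[Dict[str, str], str]
ROUTES: List[Route] = [
    ({"severity": "critical"}, "pager:on-call-sre"),
    ({"team": "backend"}, "slack:#backend-alerts"),
]
DEFAULT_RECEIVER = "slack:#ops"


def route(alert: Alert, routes: List[Route] = ROUTES,
          default: str = DEFAULT_RECEIVER) -> str:
    """Return the receiver of the first route whose labels all match."""
    for matchers, receiver in routes:
        if all(alert.labels.get(key) == value for key, value in matchers.items()):
            return receiver
    return default


print(route(Alert("PodOOMKilled", {"severity": "critical", "team": "backend"})))
print(route(Alert("SlowQueries", {"severity": "warning", "team": "backend"})))
print(route(Alert("DiskFilling", {"severity": "warning"})))
```

Putting the critical route first means a page always reaches the on-call engineer, even when a team-specific route would also match.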
RESULTS: SAVED $5400 /MONTH, RELEASED 15X FASTER, AND REACHED 99.9% UPTIME
By employing a cloud cost optimization strategy and migrating to AWS EC2 Spot Instances, we cut the client’s cloud costs by 60%. Additionally, they were able to deploy the application continuously, several times a day.
Furthermore, the alerting and monitoring solution notified the developers when the application failed from memory leaks. The engineering team solved application problems more quickly, and bugs were triaged immediately. With more stable releases, enhanced monitoring, and IT support for their infrastructure, they reached 99.9% uptime.
Cloud Cost Optimization:
- Before Devops: $9000 / month
- After Devops: $3600 / month
Released Software 15X Faster:
- Before Devops: 1 time / week
- After Devops: 3 times / day
99.9% Uptime:
- Before Devops: 97.5% Uptime
- After Devops: 99.9% Uptime
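The headline numbers can be checked from the before and after figures above. The short calculation below does that. Two assumptions are not stated in the source: a five-day working week for the release comparison and a 30-day month for the downtime figures.

```python
HOURS_PER_MONTH = 30 * 24  # assumes a 30-day month


def cost_savings(before: int, after: int) -> tuple:
    """Return (dollars saved per month, percentage saved)."""
    saved = before - after
    return saved, 100 * saved / before


def release_speedup(before_per_week: float, after_per_day: float,
                    working_days_per_week: int = 5) -> float:
    """Compare weekly release counts; the 5-day week is an assumption."""
    return after_per_day * working_days_per_week / before_per_week


def downtime_hours(uptime_percent: float, hours: float = HOURS_PER_MONTH) -> float:
    """Hours of downtime per month implied by an uptime percentage."""
    return (100 - uptime_percent) / 100 * hours


saved, percent = cost_savings(9000, 3600)
print(f"saved ${saved}/month ({percent:.0f}%)")
print(f"release speedup: {release_speedup(1, 3):.0f}X")
print(f"downtime at 97.5%: {downtime_hours(97.5):.1f} h/month")
print(f"downtime at 99.9%: {downtime_hours(99.9) * 60:.0f} min/month")
```

Under these assumptions, moving from 97.5% to 99.9% uptime cuts monthly downtime from roughly 18 hours to about 43 minutes. With a seven-day week, the release comparison would come out at 21X rather than 15X.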
This concludes our case study for Devops Automation and Devops Best Practices. We hope you enjoyed our article. Feel free to contact us if you’re interested in having Gennovacap help you cut cloud costs and achieve scale for growth.