Josue Luzardo Gebrim

DevOps: {UN}complicated?!

A guide to the DevOps cycle, tools, free certifications, and more… :)

According to AWS, we can understand DevOps as “a combination of cultural philosophies, practices, and tools that increases a company’s ability to deliver applications and services at high speed: evolving and improving products at a faster pace than organizations using traditional software development and infrastructure management processes. This speed allows companies to better serve their customers and compete more effectively in the market.”

# Infrastructure as code

This is a practice in which we provision and configure infrastructure programmatically, using software engineering techniques such as version control and continuous integration.

With automation, we can make deliveries much more effective and efficient across many different scenarios. Some of the scenarios where we can apply infrastructure-as-code practices are:

1. Virtual machine management: Vagrant

One of the big problems during software development is configuring the environment and its dependencies: developers often lose time installing everything needed to work locally, and the development environment frequently ends up very different from the server where the project will actually be installed and running (where the deployment happens).

To minimize this problem, we can create a virtual machine that can be replicated both in the developer's environment and on the server where the code will be hosted.
To provision virtual machines programmatically, we can use Vagrant, a solution developed by HashiCorp to simplify the creation and configuration of virtual machines. While provisioning a virtual machine with Vagrant, we can change network settings, SSH, ports, and so on.

https://www.vagrantup.com/

2. Configuration management: Puppet

When we start to have an environment with several virtual machines, configuring each of them by hand becomes very complicated; it is common to end up with conflicting configurations or with a machine missing its configuration entirely.

// Imagine having to set up an environment of 400 machines, one at a time, in the middle of a Friday… :)

A solution to this problem is Puppet, a code-driven configuration manager: in practice, a set of configuration files that are applied while provisioning machines or containers, used in conjunction with Vagrant or Terraform.

https://puppet.com/

// You probably think you can use a shell script for this; good luck…

3. Automated provisioning: Ansible

Another solution that can be used for automated provisioning, unified configuration management, and as a happy alternative to shell scripts is Ansible.
With Ansible, a few files are enough to perform the most diverse installations and configurations on machines or containers, requiring only Python installed and SSH configured at the destination where the configurations will be applied. It also has a very active community maintaining it. Using Ansible, we can create reusable automation for the most diverse scenarios and environments.

// It will even set up a GitLab for you… :)

https://docs.ansible.com/
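
To make the idea concrete, here is a minimal sketch of an Ansible playbook; the host group, package, and file paths are hypothetical examples, not something prescribed by the tool itself:

```yaml
# playbook.yml - hypothetical example: prepare a group of web servers
# Run with: ansible-playbook -i inventory.ini playbook.yml
- name: Configure web servers
  hosts: webservers            # group defined in your inventory file
  become: true                 # escalate privileges with sudo
  tasks:
    - name: Install nginx
      apt:
        name: nginx
        state: present
        update_cache: true

    - name: Copy the site configuration
      copy:
        src: files/mysite.conf
        dest: /etc/nginx/conf.d/mysite.conf
      notify: Restart nginx

  handlers:
    - name: Restart nginx
      service:
        name: nginx
        state: restarted
```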

Extra: Packer

Have you ever imagined creating a machine image, provisioning it in a cloud such as AWS, Azure, or Google Cloud, and configuring it with Ansible or Puppet, all from a single JSON file? That is Packer’s idea.

https://www.packer.io/

4. Containers

Over time, it became clear that maintaining several virtual machines could be a problem: each one needs its own operating system and drivers installed, and together they consume a lot of computational resources.

Containers emerged to solve the same problems as virtual machines. Both are widely used in the DevOps culture with the mission of isolating the application, but containers consume fewer computational resources and make environment management and configuration easier with tools such as Kubernetes and Rancher.

  • Docker

This is the most famous and widely adopted solution on the market for creating customized container images, which can then be shared on Docker Hub publicly or privately.

https://www.docker.com/

  • Kubernetes

This is a solution for creating and managing clusters on which containers are deployed. It can be driven entirely via API and has extensive documentation and a large community maintaining it.

https://kubernetes.io/
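
As an illustration, a minimal Deployment manifest might look like the sketch below; the names and image are placeholders:

```yaml
# deployment.yml - hypothetical example: run 3 replicas of a containerized API
# Apply with: kubectl apply -f deployment.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-api
spec:
  replicas: 3                          # Kubernetes keeps 3 pods running at all times
  selector:
    matchLabels:
      app: my-api
  template:
    metadata:
      labels:
        app: my-api
    spec:
      containers:
        - name: my-api
          image: myregistry/my-api:1.0.0   # image pulled from a registry
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
```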

  • Helm

This is a powerful package manager for Kubernetes, allowing even complex applications to be defined, installed, and upgraded in Kubernetes with a few CLI commands and a set of YAML definition files: the chart.

https://helm.sh/

We already have charts to deploy Apache Spark and other Big Data solutions in an automated way in Kubernetes.
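
For reference, the heart of a chart is a small amount of YAML metadata plus templated manifests; a minimal, hypothetical Chart.yaml and values.yaml could look like this:

```yaml
# Chart.yaml - hypothetical chart metadata
apiVersion: v2
name: my-api
description: A chart that packages the my-api Deployment and Service
type: application
version: 0.1.0          # version of the chart itself
appVersion: "1.0.0"     # version of the application being deployed
---
# values.yaml - default values consumed by the chart's templates
replicaCount: 3
image:
  repository: myregistry/my-api
  tag: "1.0.0"
service:
  type: ClusterIP
  port: 8080
```

Installing or upgrading it is then a single command, such as `helm install my-api ./my-api` or `helm upgrade my-api ./my-api --set replicaCount=5`.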

  • Rancher

It is a Kubernetes-as-a-Service delivery platform; that is, it makes the management, configuration, and use of a Kubernetes environment at least 300% easier. :)

Code quality and security: SonarQube

With SonarQube, we check code quality and security in an automated way, from the development stage through continuous integration and deployment to the production environment.

Registry: JFrog Artifactory

A registry is a place to store container images (and other artifacts) so they can be reused at the right moment. There are currently several solutions, and one of the most used is JFrog Artifactory, thanks to its versatility and easy repository management.

// An alternative is the Sonatype Nexus …

Continuous Integration (CI):

Working in a development team, over time it becomes noticeable that many developers only push their code to the common repository very close to, or right at, the delivery deadline. This leads to many situations where functionality stops working or there is no time left to finish all the tests and development.

To solve this, pipelines began to appear within the DevOps cultural movement that integrate and automate part of the development process, so that development, testing, and acceptance of the code happen correctly and within the stipulated deadlines.

During development, the application is generally organized either as a mono-repo, where the entire solution lives in a single centralized repository, or as a multi-repo, where the solution is divided into independent modules, each in its own repository.

One of the most adopted proposals in this sense is GitOps: code is versioned with Git, SVN, or another versioning solution in a repository such as GitHub or GitLab, and each change triggers an automated pipeline (with Jenkins, for example) that will:

  • Reduce risks and bugs and improve quality and security during development, using SonarQube to apply automated checks and create customized reports.

  • Save the compiled code, libraries, binaries, and images for future use in a registry such as JFrog Artifactory or Nexus.

  • Deploy to the development, staging (homologation), and production environments in a continuous and automated way, or hand that step over to the Argo Project.

Jenkins is an open-source solution for creating automated pipelines. It is widely adopted by the market and has an enormous number of plugins for integrating with the most diverse tools and strategies.

https://www.jenkins.io/

// In addition to Jenkins, we can use GoCD, Bamboo, Travis CI, TeamCity, CircleCI, GitLab CI, AWS CodePipeline, Azure…
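
Just to visualize the stages of such a pipeline, here is a minimal sketch written as a GitLab CI file (one of the alternatives mentioned in the comment above) rather than a Jenkinsfile, purely to keep the examples in YAML; the job names and commands are hypothetical:

```yaml
# .gitlab-ci.yml - hypothetical pipeline: build, test, analyze, publish
stages:
  - build
  - test
  - publish

build:
  stage: build
  script:
    - ./gradlew assemble              # compile the application

unit-tests:
  stage: test
  script:
    - ./gradlew test                  # run the automated tests

sonarqube-check:
  stage: test
  script:
    - ./gradlew sonarqube             # push quality/security analysis to SonarQube

publish-image:
  stage: publish
  script:
    - docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA    # store the image in the registry
  only:
    - main                            # publish only from the main branch
```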

Continuous Delivery / Continuous Deployment (CD):

Continuous deployment is usually the last or penultimate phase of the pipeline: the entire system has already been extensively tested for quality and code security and is “ready” to be installed in the target environments (deployed).

The big difference between continuous deployment and continuous delivery is the human interaction: in continuous delivery, a new version only goes to production when the team considers the application mature enough, and that release has to be approved by the teams involved and sometimes even by the business area.

Argo is one of the many existing solutions. Its big differentiators are its ease of use, rich graphical interface, and advanced controls that make it possible to apply different deployment strategies and even recover (roll back) in some scenarios.

https://argoproj.github.io/argo-cd/

// An alternative to Argo is urban{code} …
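
To give an idea of what GitOps with Argo CD looks like in practice, below is a minimal Application manifest sketch; the repository URL, paths, and names are placeholders:

```yaml
# application.yml - hypothetical Argo CD Application: keep the cluster in sync with Git
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-api
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://gitlab.example.com/team/my-api-deploy.git   # Git repo with the manifests
    targetRevision: main
    path: k8s                    # folder containing the Kubernetes manifests (or a Helm chart)
  destination:
    server: https://kubernetes.default.svc
    namespace: my-api
  syncPolicy:
    automated:
      prune: true                # remove resources that were deleted from Git
      selfHeal: true             # revert manual changes made directly in the cluster
```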

The most used CD strategies are:

  • Blue-Green
    The deployment generates a new version (green) while keeping the old one (blue) running in the environment; between the two versions sits a router that, at some predefined moment, gradually directs the traffic to the new version, keeping the old version available in case something unforeseen happens (see the sketch after this list).

  • Canary
    We can consider canary an evolution of blue-green, where only a small portion of users is directed to the new version for testing, and only after that does the full switch of versions happen.

  • Feature Toggles
    This strategy places a configuration flag in the code, before the deployment, that marks a given user as a possible tester of the new version.
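
One simple way to picture blue-green on Kubernetes is a Service whose label selector is switched from the blue Deployment to the green one. This is a minimal, hypothetical sketch (names and ports are placeholders), not the only way to implement the strategy:

```yaml
# service.yml - hypothetical blue-green switch: the Service routes traffic
# to whichever Deployment carries the matching labels
apiVersion: v1
kind: Service
metadata:
  name: my-api
spec:
  selector:
    app: my-api
    version: blue        # change to "green" to send traffic to the new version
  ports:
    - port: 80
      targetPort: 8080
```

Both the blue and green Deployments stay running; flipping `version: blue` to `version: green` (with `kubectl apply` or `kubectl patch`) moves the traffic, and flipping it back is the rollback.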

Microservices / Function as a Service (FaaS) / Serverless
With the DevOps culture and the emergence of a large number of solutions and tools to improve and automate the development process, large applications began to be broken up into sets of APIs and, over time and with the adoption of Kubernetes environments, into functions. It is now very common to have a front end calling a multitude of functions hosted in some cloud using a serverless architecture.

https://jlgjosue.medium.com/introduction-to-serverless-and-functions-as-a-service-faas-for-dummies-d764863f842b

  • Service Mesh: Istio

With the broad adoption of Kubernetes and the ease of creating APIs in automated ways, we started to see scenarios where a given function is heavily consumed, receiving thousands of requests per second; that is where service meshes like Istio come in.

Istio aims to provide intelligent traffic control (including rolling out or removing the deployments of a given service), automatic security through the management of authentication, authorization, and encryption between services, control of access policies for the different APIs that use the service, and also monitoring and logging of the services.

https://istio.io/
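
As a hedged sketch of the traffic-control side, an Istio VirtualService can split requests between two versions of a service, which is also how a canary rollout is often expressed; the host and subset names below are placeholders:

```yaml
# virtual-service.yml - hypothetical canary split: 90% of traffic to v1, 10% to v2
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: my-api
spec:
  hosts:
    - my-api                 # Kubernetes Service name inside the mesh
  http:
    - route:
        - destination:
            host: my-api
            subset: v1       # subsets are defined in a separate DestinationRule
          weight: 90
        - destination:
            host: my-api
            subset: v2
          weight: 10
```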

  • Messaging / Streams: Apache Kafka

Another strategy for handling a large volume of requests is to use a solution for event streams such as Apache Kafka. The requests would be recorded in one queue or a group of queues and answered in another queue.

http://kafka.apache.org/

NOTE: These two strategies work very well if applied in the right scenario!

https://javier-ramos.medium.com/service-mesh-vs-kafka-f60c00044f20

# Feedback / Monitoring

With decentralized architectures in increasingly diverse environments, monitoring an application can become an almost impossible activity depending on the solution, and this is usually only noticed when a major problem occurs.

// It is very common to only think about monitoring the project when it is already in a production environment… :(

Several solutions can, in addition to monitoring the environment, check the health of the application or microservices; here are some.

1. ELK Stack (Elasticsearch + Logstash + Kibana)

The company Elastic created a set of solutions that started out with the purpose of monitoring logs; over time, new features were added, and today we can monitor the environment in different ways with Beats, use Logstash to work on the quality of the data in those logs almost like an ETL solution, index everything in Elasticsearch, a customizable search engine, and visualize it with Kibana.

  • Beats:

Beats are agents installed in the environment that monitor it and generate logs and metrics that can be queried via API. They cover everything from the use of computational resources such as memory and CPU, to the application's network traffic and the integrity of files, and there is even a Beat that can be deployed as a Function-as-a-Service (FaaS) in some clouds.

// In addition to those maintained by Elastic, the community, mainly on GitHub, creates new Beats with very different purposes every day…

https://www.elastic.co/beats/
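
For example, a minimal Filebeat configuration that tails application logs and ships them to Logstash could look like this sketch (paths and hosts are placeholders):

```yaml
# filebeat.yml - hypothetical example: tail application logs and ship them to Logstash
filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - /var/log/my-api/*.log            # log files to tail
    fields:
      service: my-api                    # extra metadata attached to every event

output.logstash:
  hosts: ["logstash.example.com:5044"]   # Logstash Beats input port
```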

  • Logstash

This is an open-source solution for centralizing logs, transforming data, and inserting or indexing it in a destination; it can be configured to integrate with Beats and other solutions such as Apache Kafka.

https://www.elastic.co/logstash

  • Elasticsearch

Elasticsearch is a distributed RESTful search and analytics engine capable of handling large volumes of data. As the heart of the Elastic Stack, it centrally stores your data for fast searches, tuned relevance, and powerful analytics that can be easily scaled.

https://www.elastic.co/elasticsearch/

  • Kibana

This is simply an open-source “interface” for viewing the most diverse forms of data in Elasticsearch.

https://www.elastic.co/kibana

// There are some differences between the free and open-source products and their paid versions.

2. Prometheus + Grafana

Another monitoring strategy is to have the application generate its own metrics, let Prometheus collect and index them, and use Grafana to visualize and explore those metrics.

  • Prometheus

This is a set of open-source tools for monitoring and alerting. It is composed of a server, where the data is stored as time series; client libraries, which can be used in the application to expose metrics; a push gateway, to capture metrics from short-lived jobs; exporters, to expose metrics from third-party systems; and Alertmanager, to create alerts and send them to the most diverse channels, such as e-mail or Telegram.

https://prometheus.io/
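
As an illustration, a minimal Prometheus configuration that scrapes an application exposing metrics on /metrics might look like the sketch below; the target addresses are placeholders:

```yaml
# prometheus.yml - hypothetical example: scrape the application every 15 seconds
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: my-api
    metrics_path: /metrics              # default path, shown here for clarity
    static_configs:
      - targets:
          - my-api.example.com:8080     # host:port exposing Prometheus metrics

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager.example.com:9093   # where alerts are delivered
```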

  • Grafana

Grafana is an open-source project that lets you query, visualize, alert on, and understand your metrics, regardless of where they are stored. You can create, explore, and share dashboards with your team and promote a data-driven culture, building the most diverse forms of visualization: dynamic dashboards that let you explore metrics and logs, create alerts, and integrate different data sources, whether a database such as MySQL or MongoDB or an indexed search solution such as Elasticsearch.

https://grafana.com/
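
Grafana itself can also be provisioned as code; a minimal (hypothetical) datasource provisioning file pointing at Prometheus could look like this:

```yaml
# grafana/provisioning/datasources/prometheus.yml - hypothetical example
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy                            # Grafana's backend proxies the queries
    url: http://prometheus.example.com:9090  # Prometheus server address
    isDefault: true
```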

// It is possible to integrate Grafana with chatbots to make alerts on Telegram.
// With the recent hype around Artificial Intelligence (AI): we can monitor the environment, the container, and the function, but the model, its quality, and its efficiency are another story. A strategy that has been taking shape in the community is monitoring model drift…

https://towardsdatascience.com/model-drift-in-machine-learning-models-8f7e7413b563

https://towardsdatascience.com/why-machine-learning-models-hate-change-f891d0d086d8

**Cloud infrastructure automation as code: Terraform**

We currently have several cloud options that we can use to maintain our infrastructure, such as Google Cloud, AWS, and Azure. Each has different configurations and different ways of being managed; to abstract this difficulty, we can use Terraform.

Terraform is an infrastructure-as-code solution, created and maintained by HashiCorp, to provision and manage infrastructure in a simple and automated way, regardless of the chosen cloud.

In practice, we use Terraform in conjunction with other solutions such as Ansible to apply configurations more easily and manage the environment in a customizable way.

https://www.terraform.io/

Imagine having to provision multiple Kubernetes nodes in different clouds and manage applications on AWS and Azure at the same time…

DataOps — DevOps for Data Science

With the growing adoption of the DevOps culture, data engineers and architects started using the most diverse solutions from this world to automate configuration management and environment provisioning: Vagrant for creating virtual machines, Ansible for applying configurations at scale, and Terraform for provisioning the most different environments in the main available clouds, for example.

https://medium.com/data-hackers/o-que-%C3%A9-dataops-organizando-o-futuro-do-data-science-para-neg%C3%B3cios-dc46af338e12

https://medium.com/data-hackers/dataops-a6d008549aa6

Many Big Data solutions have started to gain versions for container environments; see below:

Some examples of Hive implementations with Docker:

https://blog.newnius.com/setup-apache-hive-in-docker.html

https://github.com/big-data-europe/docker-hive

Apache Spark for Kubernetes:

https://spark.apache.org/docs/latest/running-on-kubernetes.html

https://datenworks.github.io/quickstart-apache-spark-no-kubernetes/

https://towardsdatascience.com/performance-of-apache-spark-on-kubernetes-has-caught-up-with-yarn-73730878a792

https://databricks.com/session_na20/running-apache-spark-on-kubernetes-best-practices-and-pitfalls

HDFS:

https://hasura.io/blog/getting-started-with-hdfs-on-kubernetes-a75325d4178c/
https://github.com/apache-spark-on-k8s/kubernetes-HDFS/blob/master/charts/README.md

HBASE:

https://github.com/harryge00/hbase-kubernetes

Apache Kylin:

https://medium.com/analytics-vidhya/building-and-managing-apache-kylin-cubes-on-docker-like-a-pro-22fc236c335e

Apache Airflow:

https://medium.com/ninjavan-tech/setting-up-a-complete-local-development-environment-for-airflow-docker-pycharm-and-tests-3577ddb4ca94

Kafka:

https://www.confluent.io/blog/getting-started-apache-kafka-kubernetes/

https://www.confluent.io/blog/kafka-devops-with-confluent-kubernetes-and-gitops/

https://medium.com/swlh/apache-kafka-with-kubernetes-provision-and-performance-81c61d26211c

// Many other Big Data solutions are gaining implementations suited to this new reality. Are you ready?

# MLOps — ML + DevOps and more…

With the popularization of artificial intelligence and its use in the most different areas at the end of 2018, large communities around the world saw the need to automate the process of studying, training, testing, deploying, and monitoring new models; in some situations, it took months to get a model into the production environment.

By adopting the DevOps culture, many data scientists and machine learning engineers were able to create customized pipelines, and even open-source solutions, that optimize the creation and maintenance of models for the most different scenarios and environments.

Example of an automated pipeline for AI models:

https://medium.com/analytics-vidhya/polyaxon-argo-and-seldon-for-model-training-package-and-deployment-in-kubernetes-fa089ba7d60b

Conclusion

This publication was just an introduction to the DevOps culture and to some of the many existing solutions, their uses, and the problems they seek to solve, whether in development, infrastructure, Big Data, or even Artificial Intelligence.

More References:

https://aws.amazon.com/pt/devops/what-is-devops/

https://www.redhat.com/pt-br/topics/containers/containers-vs-vms

https://www.valuehost.com.br/blog/o-que-sao-containers/

https://buoyant.io/2017/04/25/whats-a-service-mesh-and-why-do-i-need-one/

Recommended courses:

https://www.alura.com.br/cursos-online-infraestrutura/

https://www.schoolofnet.com/cursos/infraestrutura/


Follow me on Medium :)
