I wrote a bit about this in other articles, but today I bring a more detailed, descriptive, step-by-step post about these DevOps tasks.
What do DevOps, CI and CD stand for?
DevOps is a compound of Development and Operations.
It's a set of practices that combines software development (Dev) and IT operations (Ops). It aims to shorten the systems development life cycle and provide continuous delivery with high software quality.
IT performance can be measured in terms of throughput and stability. Throughput can be measured by deployment frequency and lead time for changes; stability can be measured by mean time to recover. The State of DevOps Reports found that investing in practices that increase these throughput and stability measures increases IT performance.
The goals of DevOps span the entire delivery pipeline. They include:
- Improved deployment frequency
- Faster time to market
- Lower failure rate of new releases
- Shortened lead time between fixes
- Faster mean time to recovery (in the event of a new release crashing or otherwise disabling the current system).
With a DevOps approach, simple processes become increasingly programmable and dynamic. DevOps aims to maximize the predictability, efficiency, security, and maintainability of operational processes; very often, automation supports this objective.
DevOps integration targets product delivery, continuous testing, quality testing, feature development, and maintenance releases in order to improve reliability and security and provide faster development and deployment cycles.
Practices that correlate with deployment frequency are:
- Continuous delivery
- Using version control for all production artifacts
Practices that correlate with a lead time for change are:
- Using version control for all production artifacts
- Automated testing
Practices that correlate with a mean time to recovery for change are:
- Using version control for all production artifacts
- Monitoring system and application health
While DevOps describes an approach to work rather than a distinct role (like system administrator), job advertisements are increasingly using terms like "DevOps Engineer".
There is no official certificate or degree in "DevOps Engineering", but it's true that any developer could and should know how to implement, test and deliver a piece of the DevOps process, at least the pieces we are involved in.
If you know nothing about it and you apply to a company that implements DevOps across the entire development process, it can turn into a culture shock and a big misunderstanding of how things work and why they work like that.
I won't dig deeper into terms like IaC (Infrastructure as Code), containerization, orchestration or software measurement in this post, because each of those topics deserves an entire detailed post of its own and isn't needed for the purpose of this one. You can look these concepts up to learn what they mean, but you don't need to know more at this point.
In software engineering, continuous integration (CI) is the practice of merging all developers' working copies to a shared mainline *several times a day.
*It doesn't have to be several times a day, but whenever it's reasonably needed.
When embarking on a change, a developer takes a copy of the current code base on which to work. As other developers submit changed code to the source code repository, this copy gradually ceases to reflect the repository code. Not only can the existing code base change, but new code can be added as well as new libraries, and other resources that create dependencies, and potential conflicts.
The longer development continues on a branch without merging back to the mainline, the greater the risk of multiple integration conflicts and failures when the developer branch is eventually merged back. When developers submit code to the repository they must first update their code to reflect the changes in the repository since they took their copy. The more changes the repository contains, the more work developers must do before submitting their own changes.
Eventually, the repository may become so different from the developers' baselines that they enter what is sometimes referred to as "merge hell", or "integration hell", where the time it takes to integrate exceeds the time it took to make their original changes.
As you can see, it's about merging all developers' work into the repository and usually deploying it to a test machine (usually called a pre-production or test environment) to test the different merged changes together, looking for possible bugs.
There are two ways to reach this CI heaven. The best one, from my point of view, is merging the develop branch into my feature branch so I can test my changes locally; then, when the feature is finished, I can easily merge my feature branch into develop and create a merge request from develop to the master branch.
The reason for doing it this way comes from the sudden need to make minor changes that are not semantically hotfixes. If I push an unfinished feature into develop several times, it can block a deploy to production. That would force other teammates to use hotfixes for things that are not hotfixes, or to spend time commenting out my feature, which is not desirable (I'd need to fix the merge later and uncomment it when pulling changes).
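That merge-develop-first workflow can be sketched with plain git commands. This is a minimal, runnable demo in a throwaway repository; the branch name feature/login and the file names are made up for the example:

```shell
# Sketch of the "merge develop into my feature branch first" workflow.
# Uses a disposable local repo so the whole flow can be replayed safely.
set -e
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
git checkout -qb develop                  # develop is the integration branch
echo "base" > app.txt
git add app.txt && git commit -qm "initial commit"

git checkout -qb feature/login            # start a feature from develop

git checkout -q develop                   # meanwhile, a teammate lands work
echo "teammate change" > other.txt
git add other.txt && git commit -qm "teammate work"

git checkout -q feature/login
git merge -q --no-edit develop            # bring develop in early: conflicts surface locally

echo "login form" > login.txt             # finish the feature
git add login.txt && git commit -qm "finish login feature"

git checkout -q develop
git merge -q --no-edit feature/login      # clean, easy merge back into develop
git log --oneline                         # develop now holds both changes
```

From here, creating a merge request from develop to master is what triggers the production pipeline discussed in this post.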
Continuous delivery (CD) is a software engineering approach in which teams produce software in short cycles, ensuring that the software can be reliably released at any time and, when releasing the software, doing so manually. It aims at building, testing, and releasing software with greater speed and frequency. The approach helps reduce the cost, time, and risk of delivering changes by allowing for more incremental updates to applications in production. A straightforward and repeatable deployment process is important for continuous delivery.
CD contrasts with continuous deployment, a similar approach in which software is also produced in short cycles but through automated deployments rather than manual ones.
You will need to choose one or the other depending on the project. If you have automated tests in the deploy process you may want to go with continuous deployment, while if you need to test the changes manually you'll go with continuous delivery.
The reason for these practices is simple to understand. If you deploy tons of changes at once, your application changes reach customers over bigger time spans, and testing all the changes together becomes hell.
If you deliver changes with short cycles there will be few changes to test and the customers will receive updates quickly.
Achieving CI/CD easily, step by step
The first concept you should be aware of is that CI/CD (or DevOps) is a way of working, not something you do specifically to your app.
I'm assuming you already use a VCS (version control system) like Git, and that you already test your changes properly (unit tests, end-to-end tests, or manually by the test/QA department if the application doesn't support automated tests for some reason).
The first change towards DevOps comes from Git usage: you'll need to use Git and git-flow properly. You can check this tutorial if you know little or nothing about it.
Using git-flow properly will lead you to integrate changes properly into the repository.
At this point all you need is a CI/CD script to automate all tasks that can be automated on your application.
Let's assume you are starting a web app from scratch and you want to add this workflow to it.
How does it work?
Continuous integration works by pushing small code chunks to your application's code base hosted in a Git repository and, on every push, running a pipeline of scripts to build, test, and validate the code changes before merging them into the main branch.
Continuous delivery and deployment go a step further than CI, deploying your application to production on every push to the default branch of the repository.
These methodologies allow you to catch bugs and errors early in the development cycle, ensuring that all the code deployed to production complies with the code standards you established for your app.
The Script, step by step
I'm using GitLab as my main repository host, but an equivalent script can be set up on any Git platform that offers CI; on GitLab this script must be named .gitlab-ci.yml (you'll need to adapt the name and syntax depending on your chosen platform).
```yaml
stages:
  - deploy

deploy:
  stage: deploy
  image: debian:stretch-slim
  only:
    - master
  script:
    - apt-get update && apt install -y --no-install-recommends lftp
    - lftp -e "set ftp:ssl-allow no; set ssl:verify-certificate no; mirror -R ./ "$REMOTE_ROUTE -p 21 -u $USER,$PSWD $HOST
```
*The uppercase names preceded by a dollar symbol are variables that you can set in your repository's configuration/preferences.
This will spin up a Docker image with your code inside, perform some actions inside that container, and then destroy it, all "on the fly".
Line by line:
We add a stage called deploy to the stage list, then define this deploy job using the Docker image debian:stretch-slim.
It will run only when a merge/push into the master branch happens.
Then we define the script that will run at this point of the deploy stage.
In this case it simply updates the package index and installs lftp (-y = assume yes to all questions), and finally it uses lftp to mirror the repository content (./) into the desired server directory ($REMOTE_ROUTE) over FTP (this server directory will be the root dir for a given domain).
This would be continuous deployment, as it triggers each time code is pushed into the master branch.
It could also be continuous delivery if we block pushes to master, add another stage that takes code from develop instead of master, and deliver that code to a test machine. We would then manually perform a merge into master in order to run the production deploy pipeline.
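A minimal sketch of that delivery setup might look like the following (the job names and the $TEST_ROUTE variable are made up for the example; when: manual is GitLab's keyword for jobs that a human must trigger from the UI):

```yaml
stages:
  - deliver
  - deploy

# Every push to develop is delivered to the test machine automatically
deliver_test:
  stage: deliver
  image: debian:stretch-slim
  only:
    - develop
  script:
    - apt-get update && apt-get install -y --no-install-recommends lftp
    - lftp -e "set ftp:ssl-allow no; set ssl:verify-certificate no; mirror -R ./ "$TEST_ROUTE -p 21 -u $USER,$PSWD $HOST

# The production deploy waits for someone to press "play" in GitLab
deploy_production:
  stage: deploy
  image: debian:stretch-slim
  only:
    - master
  when: manual
  script:
    - apt-get update && apt-get install -y --no-install-recommends lftp
    - lftp -e "set ftp:ssl-allow no; set ssl:verify-certificate no; mirror -R ./ "$REMOTE_ROUTE -p 21 -u $USER,$PSWD $HOST
```

With when: manual on the production job, the same file gives you continuous delivery to production and continuous deployment to the test environment at the same time.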
OK, but this is very simple. Can it perform more actions?
Well, the first thing we need to do is add some security. It's a plain FTP transfer, which can easily be converted into an FTP over SSL (FTPS) transfer.
```yaml
stages:
  - deploy

deploy:
  stage: deploy
  image: debian:stretch-slim
  only:
    - master
  script:
    - apt-get update && apt install -y --no-install-recommends lftp
    # ftp:ssl-allow is left at its default (yes); setting it to "no" would disable SSL entirely
    - lftp -e "set ftp:ssl-protect-data true; set ftp:ssl-force true; set ssl:verify-certificate no; mirror -R ./ "$REMOTE_ROUTE -p 21 -u $API_USER,$API_PSWD $HOST
```
Now we added set ftp:ssl-protect-data true; set ftp:ssl-force true; so the transfer runs over SSL (a certificate must be installed on the target machine).
The best way to perform these actions is using SSH instead, but I have a shared hosting plan that I play with in my free time, and I have no terminal access to it. The best I can do here is FTP over SSL; no SFTP, no SSH (both need SSH access, local or remote, which you don't get on a shared hosting).
OK, now we have secured the process by encrypting the file transfers with the SSL certificate. Now... what else?
Imagine I have a folder called app/ in the repository root, where I keep a React/Angular/Svelte/Preact (whatever) project, or simply a plain bunch of HTML, CSS and JS files for the front end.
Nowadays it's common to use a bundler (webpack, Rollup, Parcel and so on); in this case I'm using Parcel, which is my favorite.
You could add the dist/ folder (the bundler's output) to your repository so you can move it directly from the repository into the production folder:
```yaml
stages:
  - deploy

deploy:
  stage: deploy
  image: debian:stretch-slim
  only:
    - master
  script:
    - apt-get update && apt install -y --no-install-recommends lftp
    - lftp -e "set ftp:ssl-protect-data true; set ftp:ssl-force true; set ssl:verify-certificate no; mirror -R ./app/dist/ "$REMOTE_ROUTE -p 21 -u $API_USER,$API_PSWD $HOST
```
This is a bad practice: after all, dist/ is the result of a transpiling/compiling process over your original code, and generated artifacts don't belong in the repository.
We can solve this by simply performing the build inside the container raised during the deploy process, like this:
```yaml
stages:
  - deploy

deploy:
  stage: deploy
  image: debian:stretch-slim
  only:
    - master
  script:
    - apt-get update && apt install -y --no-install-recommends lftp
    - cd ./app/
    - apt-get install -y gnupg2
    - apt-get install -y curl
    - apt-get install -y apt-transport-https ca-certificates
    - curl -sL https://deb.nodesource.com/setup_12.x | bash -
    - apt-get install -y nodejs
    - curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | apt-key add -
    - echo "deb https://dl.yarnpkg.com/debian/ stable main" | tee /etc/apt/sources.list.d/yarn.list
    - apt update && apt install -y yarn
    - yarn install
    - yarn build-prod
    - lftp -e "set ftp:ssl-protect-data true; set ftp:ssl-force true; set ssl:verify-certificate no; mirror -R ./dist/ "$REMOTE_ROUTE -p 21 -u $USER,$PSWD $HOST
```
- I'm using Yarn, but you can change the commands to npm or your favorite package manager.
Wow it grew fast, huh?
At the beginning I move into the app/ directory, where my front-end app lives. Then I install the system dependencies (and their dependencies) so I can run the yarn install and yarn build-prod commands.
yarn install (or npm install, or...) reads the package.json inside the app folder, downloads and installs the dependencies specified in it, and then generates the node_modules folder.
yarn build-prod is a custom script that uses Parcel to bundle my app; the details don't matter in this context. All you need to know is that it generates a dist/ folder inside app/ containing my front-end code ready for production (obfuscated, minified and so on).
At the end (remember we are inside the ./app/ directory), I push the content of the dist/ folder to $REMOTE_ROUTE, the production root directory.
So now we have:
- Avoided pushing dist/ folder into the repository
- Avoided pushing node_modules into the repository (never push your node_modules folder to a repository, for goodness' sake! No excuses, don't do it, ever.)
- Automated the deploy process each time code is pushed or merged into the master branch
- Added reliability: if some step fails (because we added a new dependency but didn't declare it in package.json, for example), the pipeline throws an error and cancels the process
As you can see, you can add Linux commands to the process.
Remember that all commands run as root, so avoid using sudo in your commands. You also need to avoid any user interaction inside the process (for example, adding the -y flag to an apt-get install command automatically assumes Yes for all questions); otherwise the process will fail (you cannot interact with the process, it's an automation).
Automated Tests on the Pipeline:
Then you can add triggers for your tests to this process. This varies a lot depending on the tools you use for testing (Cypress, Selenium, Protractor...); each tool has documentation about adding its tests to CI pipelines, for example the Cypress CI documentation.
I can't show examples for every tool because I simply haven't worked with all of them, and it would be of little use for the generic purpose of this post. If you already have a pipeline working at this point, please read your testing tool's documentation about CI pipelines; there are tons of tools for different technologies, languages and methodologies that I can't cover here.
Adding the tests to your pipeline, plus another stage that deploys develop to a pre-production/test environment, will complete your script.
You'll get automated tests before deploying, with a visible log if something fails, and your CI/CD pipeline will be properly complete.
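As a rough sketch of where this ends up, a pipeline with a test stage could be laid out like this (the node:12 image, the yarn test script and the job names are assumptions for the example; your testing tool's CI docs will give you the real commands):

```yaml
stages:
  - test
  - deploy

# Runs on every push; the deploy jobs only start if this job passes
run_tests:
  stage: test
  image: node:12
  script:
    - cd ./app/
    - yarn install
    - yarn test            # assumed package.json script that runs your test suite

# Deliver develop to the pre-production / test environment
deploy_test_env:
  stage: deploy
  image: debian:stretch-slim
  only:
    - develop
  script:
    - apt-get update && apt-get install -y --no-install-recommends lftp
    - lftp -e "set ssl:verify-certificate no; mirror -R ./ "$TEST_ROUTE -p 21 -u $USER,$PSWD $HOST
```

Because stages run in order and a failed stage stops the pipeline, broken tests never reach the deploy step.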
At this point, all that's left is monitoring.
Monitoring provides a heartbeat of how your apps are performing as you're deploying new versions of your code in various environments. Catching issues earlier in the process empowers teams to quickly remedy the issue, and continue to test and monitor the subsequent changes.
There are a bunch of tools for this purpose, some of them are Datadog, Nagios, Zabbix, Sensu, Prometheus, SysDig...
Each one has a different approach to the job and offers different features (not only monitoring your apps but also server state, data graphs and so on), so you'll need to dig a bit to find which one fits you best.
For the same reason I gave for tests, I can't cover monitoring in detail here. Moreover, monitoring tools are not integrated into this pipeline process but installed, configured and maintained on your server, and the tools you use will differ depending on your platform (cloud server, dedicated server, VPS (virtual private server), shared hosting...) as well as on your application and data monitoring needs.
If you apply for a job at any modern company (or an old company with modern software), they will usually be using a pipeline like this. It may vary a lot depending on the technologies they use, but the basics and the process will be essentially the same: plan, code, build, test, release, deploy, operate, monitor, and repeat, over and over again.
You may find back-end commands in the deploy stage, such as a framework command (some artisan command if you use Laravel with PHP, for example). You may also find front-end setups that use another package manager, a different bundler or custom scripts.
You can also find integrations with the concepts I left unexplained, like orchestration or measurement.
You can also find CI/CD pipelines in a serverless environment with some lambda functions, or using Kubernetes with Jenkins or with Azure. As there are tons of possibilities, I tried to cover a use case that you can apply to almost all your projects and reproduce at home at little cost: the shared hosting I use is about 4.95 USD per month, with discounts for yearly billing (go to Web hosting to see the shared hosting plans). I can recommend it because it's easy to handle and includes a cPanel that makes tons of things easier, so you can play and learn with it.
But all those possible integrations are just steps inside the script before the deploy, or SSH commands run at a specific line of the pipeline for a given purpose. You probably won't be able to get a proper monitoring tool working on a shared hosting, and you probably won't need orchestration tools: those concepts apply to big apps, or at least to a cloud or dedicated server (you can also configure them on a VPS), so if you are learning you may not want to spend around 300 USD per year on a low-performance VPS. If you are about to create a startup, then you will need to care about this and get a proper server with monitoring; orchestration will fit only if your architecture needs it, not always.
I hope this helped you understand the whole process. Don't be shy to ask me in the comments section if you have any doubts :)