In this post, I describe a few of the amazing concepts of GitLab CICD pipelines. They made my daily work on the CICD pipelines significantly easier.
All examples can be found here.
1. YAML Anchors
This is probably a no-brainer for many of you. When I got into CICD, I was also pretty new to working with YAML files, and I did not start from scratch: I had to deal with a huge CICD file which was over 1000 lines long. I was able to shrink it down to a couple of hundred lines while also adding more functionality to the pipelines. I achieved this by using parallelization and YAML Anchors.
What are YAML Anchors?
YAML Anchors are reusable blocks that you define once and insert later in the file. You can define an entire job this way and then change its behaviour through the variables you set in each job that uses it. Here is an example.
Let's say we have two builds in our pipeline to perform, one for development and one for production. But for production, we need a different .env file than for development. We could just create two jobs like the ones below, which will result in this kind of pipeline:
stages:
  - deploy

dev_deploy:
  # variables must be a mapping, not a list
  variables:
    ENV_FILE: .env.test
  stage: deploy
  script:
    - source $ENV_FILE
    - echo "DEPLOY APPLICATION"

master_deploy:
  variables:
    ENV_FILE: .env.prod
  stage: deploy
  script:
    - source $ENV_FILE
    - echo "DEPLOY APPLICATION"
These two jobs are fairly easy to read, but imagine a more complex build/bundle script plus different rules for when each job should run. We can do better if we use YAML Anchors because most parts of the jobs are the same. So we can transform the above code block to the following, which will result in this kind of pipeline:
stages:
  - deploy

.deploy: &deploy
  stage: deploy
  before_script:
    - if [[ "$ENV_FILE" == '' ]] ; then echo "ENV_FILE is not set" ; exit 1 ; fi
  script:
    - source $ENV_FILE
    - echo "BUILDING APPLICATION"

dev_deploy:
  <<: *deploy
  variables:
    ENV_FILE: .env.test

staging_deploy:
  <<: *deploy

master_deploy:
  <<: *deploy
  variables:
    ENV_FILE: .env.prod
As you can see, we now share the code across the different jobs. I also added a staging job into the mix to show that a job can fail early if the variables it requires are not set. When it comes to overrides, keys written in the job itself take precedence over the ones merged in from the anchor. I "spread" the deploy anchor at the top of the job so that the overrides visibly come after it.
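One detail worth knowing when overriding: the `<<` merge is shallow, so a nested block such as variables is replaced as a whole rather than combined key by key. A minimal sketch of that behaviour, using hypothetical names (`.base`, `REGION`, `override_job`) that are not part of the example above:

```yaml
stages:
  - deploy

# Hidden template job; REGION and its value are illustrative only.
.base: &base
  stage: deploy
  variables:
    ENV_FILE: .env.test
    REGION: eu-west-1
  script:
    - echo "ENV_FILE=$ENV_FILE REGION=$REGION"

override_job:
  <<: *base
  # This mapping replaces the anchor's variables block entirely,
  # so REGION is no longer defined for this job.
  variables:
    ENV_FILE: .env.prod
```

If a job should keep the template's other variables, they have to be repeated in that job. GitLab's extends keyword, which merges nested keys, may be worth a look as an alternative.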
2. Parallelization
Parallelization has similar use cases to YAML Anchors, but works somewhat differently. Sometimes a job is exactly the same and only one variable differs, so it needs to be run again and again. Going back to the first example, we could also improve on it in the following manner, which results in this kind of pipeline:
stages:
  - deploy

dev_deploy:
  stage: deploy
  parallel:
    matrix:
      - ENV:
          - test
          - prod
  script:
    - echo $ENV
    - echo .env.$ENV
    - source .env.$ENV
    - echo "DEPLOY TO $ENV"
So instead of defining multiple jobs, we define one job with a parallel matrix. This will spin up the job multiple times and inject the ENV variable. This is very useful if you, for instance, need to build or test your app against different environment files, because then only one CI variable is different. The downside is that the number of parallel jobs is capped (at 50 when I wrote this).
On the other hand, parallel jobs are often used to split up a big job into smaller parts and then bring everything together in the next job, or you can split your test files into parallel jobs.
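Splitting tests across parallel jobs could be sketched like this. GitLab sets `CI_NODE_INDEX` and `CI_NODE_TOTAL` in each copy of a job that uses `parallel`; the `run_tests.sh` script and its flags are hypothetical and stand in for whatever your test runner needs:

```yaml
stages:
  - test

split_tests:
  stage: test
  # Start three copies of this job; each copy gets CI_NODE_INDEX (1 to 3)
  # and CI_NODE_TOTAL (3).
  parallel: 3
  script:
    - echo "Running slice $CI_NODE_INDEX of $CI_NODE_TOTAL"
    # Hypothetical script that picks its share of the test files.
    - ./run_tests.sh --slice "$CI_NODE_INDEX" --total "$CI_NODE_TOTAL"
```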
3. CI_JOB_TOKEN
The CI_JOB_TOKEN is a predefined variable which allows you to access or trigger other resources within a group. So if you need to trigger a multi-project pipeline, for instance to start the frontend deployment after the backend is deployed, the CI_JOB_TOKEN comes in very handy. But there is more! If you use the CI_JOB_TOKEN, GitLab will actually know about the connection between these pipelines and show it, so you can jump from one project's pipeline to another project's pipeline. The call would look like this:
stages:
  - trigger_pipeline_in_other_project

trigger:
  stage: trigger_pipeline_in_other_project
  script:
    # The trigger API also requires a ref (branch or tag); <REF_NAME> is a placeholder.
    - curl --request POST --form token=${CI_JOB_TOKEN} --form ref=<REF_NAME> https://gitlab.com/api/v4/projects/<PROJECT_ID>/trigger/pipeline
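A multi-project trigger can also be written without curl by giving the trigger keyword a project path. A minimal sketch, assuming a downstream project at `my-group/frontend` and a branch named `main` (both are placeholders, not taken from the example above):

```yaml
stages:
  - trigger_pipeline_in_other_project

trigger_frontend:
  stage: trigger_pipeline_in_other_project
  trigger:
    # Hypothetical project path and branch; replace with your own.
    project: my-group/frontend
    branch: main
```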
A resulting pipeline could look like this:
4. Clean Up Jobs
Clean up jobs are jobs which run after another job and whose execution depends on the pipeline status. So you can basically run a different job depending on whether the earlier jobs succeeded or failed. For instance, you can clear the cache on failure or invalidate a CloudFront distribution. To use this concept, you can do something like the following, which results in a pipeline like this:
stages:
  - build
  - deploy

build:
  stage: build
  script:
    - echo "BUILD APPLICATION"

deploy_on_failure:
  stage: deploy
  when: on_failure
  script:
    - echo "CLEAR ARTIFACTS"

deploy_on_success:
  stage: deploy
  when: on_success
  script:
    - echo "DEPLOY APPLICATION"
deploy_on_failure runs only if the build has failed, while deploy_on_success will run when the build has succeeded. This can come in very handy but has limitations, which is why I really like the next concept.
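Besides on_failure and on_success, when also accepts always, which fits clean-up work that should happen regardless of the outcome. A small sketch; the `cleanup` job and stage names are assumptions added for illustration:

```yaml
stages:
  - build
  - cleanup

build:
  stage: build
  script:
    - echo "BUILD APPLICATION"

cleanup:
  stage: cleanup
  # Runs whether the earlier stages succeeded or failed.
  when: always
  script:
    - echo "REMOVE TEMPORARY FILES"
```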
5. Child Pipelines & Dynamic Child Pipelines
Child Pipelines are pipelines which are started using the combination of the trigger and include keywords. They are detached from the parent pipeline and, by default, start running as soon as they are triggered. So if one stage in your CICD file is a "trigger job" for a child pipeline, it will trigger the pipeline and then the next job will start immediately after. Child pipelines are defined in a second CICD file, which is included in the main file. Here is an example, which would result in this kind of pipeline:
The main file would look like this:
stages:
  - test
  - build
  - deploy

test:
  stage: test
  trigger:
    include: test.yml

build:
  stage: build
  script:
    - echo "BUILD APPLICATION"

deploy:
  stage: deploy
  script:
    - echo "DEPLOY APPLICATION"
As you can see, the test stage includes a second YAML file, which will be triggered into the detached (child) pipeline. The file could look like this:
stages:
  - test

test:
  stage: test
  script:
    - echo "TEST APPLICATION"
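Because the trigger job does not wait for the child pipeline by default, later stages in the parent can start while the tests are still running. If the parent should wait instead, trigger accepts a strategy setting. A minimal sketch based on the test job of the main file, assuming `strategy: depend` is available in your GitLab version:

```yaml
test:
  stage: test
  trigger:
    include: test.yml
    # Wait for the child pipeline and reflect its result in this job,
    # so build and deploy only start after the child tests pass.
    strategy: depend
```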
So, child pipelines allow you to split your YAML files into multiple files. But they also have a constraint: they can only be nested up to two levels deep. That means the first child pipeline can trigger another child pipeline, but this pipeline cannot trigger a third child pipeline. But why is this exciting when we can use other tools for splitting YAML files?
This is exciting because the triggered YAML File does not have to exist before the pipeline starts!
The above statement leads us right into Dynamic Child Pipelines. This concept is really powerful and deserves an article on its own (Let me know if I should write more about it).
Most programming languages have some sort of package to convert a JSON-like structure into a YAML file. So what you can do is have a pre-job which computes the YAML file for you and then passes it as an artifact to the trigger job. This way, you can decide on the fly what the child pipeline should look like.
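To make the idea concrete, here is a minimal sketch in which the pre-job writes the child pipeline with a plain shell loop instead of a YAML library. The service names `api` and `web`, the file name `generated.yml`, and the job names are all hypothetical:

```yaml
stages:
  - generate
  - run

generate_child:
  stage: generate
  script:
    # Hypothetical generator: appends one test job per service name.
    - |
      for svc in api web; do
        printf '%s:\n  script:\n    - echo "TEST %s"\n' "test_$svc" "$svc" >> generated.yml
      done
    - cat generated.yml
  artifacts:
    paths:
      - generated.yml

run_child:
  stage: run
  trigger:
    include:
      - artifact: generated.yml
        job: generate_child
```

The generated file defines no stages of its own, so its jobs fall back to GitLab's default test stage.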
What I am going to show is not the most elegant or dynamic way, but it is the easiest way to grasp the concept. I call this set up a pipeline switch. Let's say we have a job which computes something for us.
Example conditions:
- For instance, gas prices for blockchains: if the gas prices are low, we want to deploy the new contracts (basically when deployment costs are low)
- Every Sunday, we want to deploy our frontend in a random color
You get the gist, so we have some condition on which we want to alter the pipeline.
In the below example the pipeline depends on the outcome of the condition (the deployment fees):
stages:
  - build
  - check_deployment_costs
  - trigger_dynamic_child

build:
  stage: build
  script:
    - echo "BUILD APPLICATION"

check_deployment_costs:
  stage: check_deployment_costs
  script:
    - echo "RUNS SCRIPT TO CHECK DEPLOYMENT COSTS"
    - echo "query computed costs per contract are 50 Finney"
    - bash pipelineSwitch.sh 50
  artifacts:
    paths:
      - './dynamicChildPipeline.yml'

trigger_dynamic_child:
  stage: trigger_dynamic_child
  trigger:
    include:
      - artifact: dynamicChildPipeline.yml
        job: check_deployment_costs
So in the step check_deployment_costs we check the deployment costs and pass them to our bash script. The bash script performs a simple check and then copies the matching file from the template folder to the location from which we will upload the artifact.
#!/usr/bin/env bash
echo "input value: $1"
# Compare numerically; `<` inside [[ ]] would compare the values as strings.
if [[ "$1" -lt 51 ]]; then
  echo "should deploy"
  cp ./CICDTemplates/deployment.yml ./dynamicChildPipeline.yml
else
  echo "should wait with deployment"
  cp ./CICDTemplates/sendNotification.yml ./dynamicChildPipeline.yml
fi
As stated earlier, this solution might not be as elegant as others, but it is still a viable quick approach. The resulting pipelines look like this, depending on whether the price is too high or okay.
LET ME KNOW
- Do you need help with anything written above?
- What would be top on your list?
- Do you think I can improve? Then let me know
- Did you like the article?