When was the last time you updated your CI/CD workflow? A year ago? Never? You are not alone, my friends. Reconfiguring workflows can be one of the most daunting tasks for DevOps practitioners. But with new opportunities to benefit from CircleCI plans, there’s one simple and effective place to start: understanding concurrency and parallelism.
Using concurrency and parallelism can cut your build times significantly. But you need to know what they are and how to find them in your config file.
What is concurrency?
Concurrency means that multiple computing tasks are happening at the same time. It’s everywhere in computing. We have so many things happening concurrently: applications running, computers communicating in a network, or even users visiting a website. Concurrency is so commonplace that it’s easy to assume we know how it works. While the definition always remains the same, the practical application of concurrency can be nuanced. So what does concurrency mean within CircleCI?
Simply put, concurrency in CircleCI is the number of tasks that are being executed at any point in time. For example, CircleCI’s free plan provides a concurrency limit of 30, meaning you can run up to 30 tasks at the same time.
Computing how many concurrent tasks are happening is a little more complicated than finding out how many jobs are running or how many containers are being used. Pipelines can have all sorts of twists and turns that change the number of concurrent tasks, like conditional logic or test splitting. For this post, I’ll focus on how to manage concurrency and when it matters.
What is parallelism?
It’s hard to write about concurrency in CircleCI without talking about parallelism. The two concepts are often conflated, but have distinct applications. We already know concurrency is the number of executing tasks at any given time in a workflow. We can affect concurrency by changing parallelism, which is where some confusion can start.
Parallelism splits work between identical copies of a particular job.
Parallelism is most often used to split up test suites. All of the copies of the job have the same instructions, but run with different environment variables. Parallelism is set in the CircleCI configuration file, and every copy of a job counts toward your concurrency total.
Applying concurrency and parallelism
Here’s an example using the CircleCI free plan, which has a concurrency limit of 30. Say you have 10 jobs that each take 1 minute to execute. Run one after another, these jobs would take 10 minutes to complete. With concurrency, all 10 jobs can run at the same time, and the workflow is done in 1 minute instead of 10. If you also set parallelism to 3 on each job, you can still run all 10 jobs at the same time and stay within the concurrency limit: 3 copies of each of the 10 jobs is 30 total tasks, all running within the same minute.
You could just as easily run 4 copies (parallelism of 4) on 7 jobs for a total of 28 concurrent tasks. The free plan allows a maximum parallelism of 4, but other plans have more options if you really need the speed.
If you have a situation where the total number of simultaneous tasks is more than 30, some of the work will have to wait. For example, if you set parallelism to 4 on 8 jobs, one of those jobs (and all of its copies) will have to wait until there are resources free. The copies of jobs created by parallelism always run simultaneously, so even if you’re only over the concurrency limit by 2 (as in this case), all four copies of one job will wait until another job finishes.
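The arithmetic in these three scenarios can be checked with a few lines of Python. This is a minimal sketch of a simplified model, not CircleCI's actual scheduler: it assumes every job uses the same parallelism, that all copies of a job must start together (as described above), and that the limit of 30 comes from the free plan mentioned in this post. The function name `concurrent_jobs` is made up for illustration.

```python
def concurrent_jobs(num_jobs: int, parallelism: int, limit: int = 30) -> tuple[int, int]:
    """Return (jobs running now, jobs waiting) under a concurrency limit.

    Simplified model: every job has the same parallelism, and a job only
    starts when there is room for all of its copies at once.
    """
    if parallelism < 1 or parallelism > limit:
        raise ValueError("parallelism must be between 1 and the concurrency limit")
    # How many whole jobs (each needing `parallelism` slots) fit under the limit.
    running = min(num_jobs, limit // parallelism)
    return running, num_jobs - running


# The three scenarios from the text: (number of jobs, parallelism per job).
for jobs, par in [(10, 3), (7, 4), (8, 4)]:
    running, waiting = concurrent_jobs(jobs, par)
    print(f"{jobs} jobs x parallelism {par}: "
          f"{running * par} tasks running, {waiting} job(s) waiting")
```

In the last scenario the model leaves 2 of the 30 slots idle, because a job that needs 4 slots cannot start with only 2 available.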
How to add parallelism and concurrency to your pipelines
Most of the time, you don’t need to manage concurrency directly. Concurrency limits are set by your CircleCI plan and are enforced in the background. Most often, you will take advantage of your concurrency by working with parallelism. Setting parallelism is easy: set the value of the parallelism key for a job in config.yml. Any value greater than 1 means you’re running parallel tasks.
How to set parallelism
~/.circleci/config.yml
version: 2
jobs:
  test:
    docker:
      - image: cimg/<language>:<version TAG>
        auth:
          username: mydockerhub-user
          password: $DOCKERHUB_PASSWORD # context / project UI env-var reference
    parallelism: 3
This example has one job (named test) running on Docker. Setting parallelism to 3 means three copies of this job (called tasks) will run simultaneously on three separate Docker containers. The only differences between these tasks are environment variables so that work can be split among these tasks. This is most often used for test splitting, which you can read more about in our documentation.
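To make the idea of splitting work by environment variable concrete, here is a minimal Python sketch. It relies on the `CIRCLE_NODE_INDEX` and `CIRCLE_NODE_TOTAL` variables that CircleCI sets in each parallel task. The helper `split_for_node` and the file names are hypothetical, and the round-robin strategy is only an illustration of the concept; it is not how the CircleCI CLI's own test-splitting commands choose files.

```python
import os


def split_for_node(items, node_index, node_total):
    """Return the share of `items` assigned to one parallel task.

    Items are sorted first so every task sees the same order, then dealt
    out round-robin: task 0 gets items 0, N, 2N, ...; task 1 gets 1, N+1, ...
    """
    if node_total < 1 or not 0 <= node_index < node_total:
        raise ValueError("node_index must be in the range 0..node_total-1")
    return sorted(items)[node_index::node_total]


# Outside CircleCI these variables are unset, so fall back to a single task.
index = int(os.environ.get("CIRCLE_NODE_INDEX", "0"))
total = int(os.environ.get("CIRCLE_NODE_TOTAL", "1"))

# Hypothetical test files, used only for illustration.
test_files = ["test_api.py", "test_auth.py", "test_db.py", "test_ui.py"]

print(f"task {index} of {total} runs: {split_for_node(test_files, index, total)}")
```

With parallelism set to 3, each of the three tasks would compute a different, non-overlapping share of the same list, and together the shares cover every file exactly once.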
Conclusion
As DevOps practitioners, we care about concurrency because it helps get work done faster by running multiple processes at the same time. Knowing the concurrency limit of your plan and all the tasks that count toward it helps you optimize your builds.
Concurrency can be a straightforward concept. Jobs finish sooner when at least some of them can run at the same time. Parallelism helps you customize which tasks are done concurrently and is easily set in the config file. Concurrency and parallelism both help finish tasks more quickly so you can get on to the important business of failing, passing, and shipping software.
The next step is to learn more about test splitting and keep automating and speeding up your jobs. If you’re interested in expert reviews of your CircleCI configuration to help optimize your work, you can get a custom evaluation from a dedicated support engineer by signing up for a premium support plan.