<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Alexej Penner</title>
    <description>The latest articles on DEV Community by Alexej Penner (@alexejpenner).</description>
    <link>https://dev.to/alexejpenner</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F828553%2Ffe1b44da-e106-4bff-958a-2e5e40edc8e3.jpeg</url>
      <title>DEV Community: Alexej Penner</title>
      <link>https://dev.to/alexejpenner</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/alexejpenner"/>
    <language>en</language>
    <item>
      <title>How we made our integration tests delightful by optimizing our GitHub Actions workflow</title>
      <dc:creator>Alexej Penner</dc:creator>
      <pubDate>Fri, 11 Mar 2022 08:55:46 +0000</pubDate>
      <link>https://dev.to/alexejpenner/how-we-made-our-integration-tests-delightful-by-optimizing-our-github-actions-workflow-fh1</link>
      <guid>https://dev.to/alexejpenner/how-we-made-our-integration-tests-delightful-by-optimizing-our-github-actions-workflow-fh1</guid>
      <description>&lt;h1&gt;
  
  
  📍 What's the point of Github Actions?
&lt;/h1&gt;

&lt;p&gt;Software projects are complex beasts with a multitude of moving pieces that can affect each other in surprising ways. &lt;br&gt;
If you're reading &lt;br&gt;
this then chances are, you've been part of such projects where you have to build and improve software &lt;br&gt;
while ensuring you don't break anything in the process. Testing and code standards are the main antidotes at your &lt;br&gt;
disposal. As such, these two tools should be considered integral parts of the software development lifecycle to allow&lt;br&gt;
for Continuous Integration.&lt;/p&gt;

&lt;p&gt;The first thing that pops up when you search for the term 'Continuous Integration' is the following definition:&lt;br&gt;
'Continuous integration (CI) is the practice of automating the integration of code changes from multiple contributors &lt;br&gt;
into a single software project.' A big part of CI is automated quality control before allowing the new code&lt;br&gt;
to be merged into the production codebase. This quality control can get arbitrarily complex, layered and even&lt;br&gt;
convoluted. As such it is important to grow your quality control framework at the same pace as your product increases&lt;br&gt;
in scale and complexity.&lt;/p&gt;

&lt;p&gt;This quality control can address multiple different aspects of the code. Static code analysis, otherwise known as &lt;br&gt;
&lt;a href="https://www.perforce.com/blog/qac/what-lint-code-and-why-linting-important"&gt;linting&lt;/a&gt;, is one such aspect of code quality &lt;br&gt;
control. Linting deals with programmatic and stylistic issues, such as making sure typed inputs and outputs match up. &lt;br&gt;
Unit and integration tests are also important ways of adding quality control to your code.&lt;/p&gt;

&lt;p&gt;The practice of DevOps aims to seamlessly integrate such quality control workflows into the development workflow. &lt;br&gt;
When you use Github for code versioning, you can use Github Actions as a tool that can orchestrate these code quality &lt;br&gt;
checks. Github Actions enables you to define workflows in &lt;em&gt;yaml&lt;/em&gt; files. A 'workflow' is a recipe of instructions that you&lt;br&gt;
would like to run at a given time. &lt;/p&gt;

&lt;p&gt;'Workflows' can be equipped with triggers that define when they should run. Your workflow could run, for example, when someone makes a &lt;br&gt;
commit on a branch, creates a pull request for a specific branch or when manually triggered from the UI. Workflows contain one or &lt;br&gt;
more 'jobs'. &lt;/p&gt;
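&lt;p&gt;As a minimal sketch of such a trigger section (the branch names here are placeholders, not from our actual setup), combining these cases might look like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;on:
  # run on every push to main
  push:
    branches: [main]
  # run on pull requests targeting main
  pull_request:
    branches: [main]
  # allow manual runs from the Actions tab in the UI
  workflow_dispatch:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;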

&lt;p&gt;A 'job' is a collection of instructions that is run on a virtual machine of its own. As such jobs are &lt;br&gt;
perfectly encapsulated from one another. The user can choose between a few operating systems for each 'job' as part of &lt;br&gt;
the 'job.strategy' attribute. However, when developing a Python library, you might want it to be tested on all operating &lt;br&gt;
systems and maybe even on multiple Python versions; enter 'strategy matrices'. A matrix allows you to define a set of&lt;br&gt;
environment configurations. In our case we chose three operating systems (Linux, MacOS, Windows) and two different Python &lt;br&gt;
versions (3.7 and 3.8). The matrix then makes sure that the job is run for all six possible combinations of these &lt;br&gt;
configurations (e.g. macOS + Python 3.7). &lt;/p&gt;
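&lt;p&gt;A sketch of such a strategy matrix, mirroring the configurations described above, could look like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;jobs:
  test:
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
        python-version: ["3.7", "3.8"]
    # each of the 3 x 2 = 6 combinations gets a runner of its own
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;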

&lt;p&gt;Finally, a job consists of one or many 'steps'. Such steps can be defined through arbitrary command line commands. &lt;br&gt;
Bash scripts can be invoked, Python scripts can be called or you can use a plethora of steps produced by the broader &lt;br&gt;
community (see &lt;a href="https://github.com/marketplace?type=actions"&gt;Github Action Marketplace&lt;/a&gt;). These community steps are &lt;br&gt;
called 'actions'. Steps can be executed conditionally. Importantly, all steps of a job run on the same machine. As such&lt;br&gt;
they can read and write to and from the same filesystem. Information from previous steps can also be directly accessed.&lt;/p&gt;
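&lt;p&gt;To illustrate (the script path and commands here are hypothetical), a job mixing a community action, an arbitrary shell command and a conditional step might look like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;steps:
  # a community action from the marketplace
  - uses: actions/checkout@v2
  # an arbitrary command line command
  - name: Run unit tests
    run: bash scripts/run_tests.sh
  # a conditionally executed step; all steps share the same filesystem
  - name: Linux-only step
    if: runner.os == 'Linux'
    run: echo "only runs on Linux"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;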
&lt;h1&gt;
  
  
  🚂 Where we start our journey
&lt;/h1&gt;

&lt;p&gt;Here at ZenML, we've made it our mission to build a tool that spans the complete development process of machine learning &lt;br&gt;
solutions in Python. Such a lofty vision comes with its own set of challenges, not least of which is the sheer scale &lt;br&gt;
of other tools that need to be integrated. You might have guessed where this is going: many integrated tools imply a &lt;br&gt;
large number of integration tests. This is especially true if you also want to verify interoperability.&lt;/p&gt;

&lt;p&gt;We started this journey with a very standard, cookie-cutter, monolithic workflow. I'm sure many projects start out this way. &lt;br&gt;
One yaml file defines a workflow that checks out the code, performs linting, unit testing and integration testing, and &lt;br&gt;
uploads coverage to &lt;a href="https://codecov.io/"&gt;codecov&lt;/a&gt; on a matrix of operating systems and Python versions. Here is a sample &lt;br&gt;
of what the workflow used to look like.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--1mq2UFFT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/m3keedmzdc28yncojhw2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--1mq2UFFT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/m3keedmzdc28yncojhw2.png" alt="Sample of the monolithic workflow that encompassed the complete code quality control" width="583" height="793"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;One of the largest problems we ran into was the different dependencies each step needed and the &lt;br&gt;
consequent nightmare of unexpected upgrades or downgrades of some low-level packages. This would then lead to &lt;br&gt;
confusing error messages and very long debugging sessions, at the end of which our reaction was something like &lt;br&gt;
this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--LNq-CZs9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/jq7xmeb9x7gl2btr35cw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--LNq-CZs9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/jq7xmeb9x7gl2btr35cw.png" alt="Turbo Facepalm when the hour long debugging session leads to the conclusion that some dependency was sneakily downgraded" width="674" height="432"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As you might imagine, the team was growing frustrated with the long testing times and the sporadic errors, and a solution&lt;br&gt;
needed to be found. Here are five changes we implemented to upgrade our CI pipeline.&lt;/p&gt;
&lt;h2&gt;
  
  
  ⏩ 1. Speed up your workflows with caching
&lt;/h2&gt;

&lt;p&gt;Caching is a powerful way to speed up repeating processes. Running 'poetry install' is one such process, necessary for &lt;br&gt;
each aspect of our CI pipeline. We didn't want to commit the 'poetry.lock' file, to ensure we would keep ZenML compatible &lt;br&gt;
with the newest versions of the packages that we integrate with, and test regardless of the state on the developer's&lt;br&gt;
machine. On average the &lt;a href="https://python-poetry.org/"&gt;Poetry&lt;/a&gt; installation would take 10-15 minutes for each cell of the OS/Python version&lt;br&gt;
matrix. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--WvJhvOfy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/z4sdin6w0zqb64dl39u7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--WvJhvOfy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/z4sdin6w0zqb64dl39u7.png" alt="Resolving and installing of dependencies used to take 10–15 minutes" width="623" height="43"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Using caching we are able to make this step nearly instantaneous, assuming a cached venv is available. See the &lt;em&gt;yaml&lt;/em&gt; excerpt&lt;br&gt;
below to see how caching is done within a Github Actions workflow.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;    &lt;span class="s"&gt;...&lt;/span&gt;

    &lt;span class="s"&gt;- name&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Load cached venv&lt;/span&gt;
      &lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cached-poetry-dependencies&lt;/span&gt;
      &lt;span class="s"&gt;uses&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/cache@v2.1.6&lt;/span&gt;
      &lt;span class="s"&gt;with&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;.venv&lt;/span&gt;
          &lt;span class="s"&gt;poetry.lock&lt;/span&gt;
        &lt;span class="c1"&gt;# Cache the complete venv dir for a given os, python version, pyproject.toml&lt;/span&gt;
        &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;venv-${{ runner.os }}-python-${{ matrix.python-version }}-${{ hashFiles('pyproject.toml') }}&lt;/span&gt;

    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Install Project&lt;/span&gt;
      &lt;span class="na"&gt;shell&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bash&lt;/span&gt;
      &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;steps.cached-poetry-dependencies.outputs.cache-hit != 'true'&lt;/span&gt;
      &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;poetry install&lt;/span&gt;

    &lt;span class="s"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As you can see, the cache is saved with a unique key as a function of the runner operating system, the Python version and a hash of the&lt;br&gt;
'pyproject.toml'. As a consequence, the cache can be invalidated by changing the 'pyproject.toml'.&lt;/p&gt;

&lt;p&gt;Unfortunately, this caching action currently does not give the user control over the exact&lt;br&gt;
moment in the pipeline when the cache is written to. As things currently stand, if there is no cache-hit then the cache entry is created&lt;br&gt;
at the end of the job. This means you need to structure your jobs purposefully, in such a way that they reflect the &lt;br&gt;
state you want to cache at their end.&lt;/p&gt;

&lt;p&gt;The keen-minded among you might have caught on to an inconsistency in my argument from above. We don't commit the &lt;br&gt;
'poetry.lock' file, as we want to always guarantee compatibility with the bleeding-edge changes of our integrations and&lt;br&gt;
dependencies. But by caching the virtual environment directory as a function of the 'pyproject.toml', aren't we just &lt;br&gt;
locking on to the versions present when we cache for the first time? That is correct; however, we are now not dependent on the &lt;br&gt;
state of a developer's machine; instead we have a state for each combination of OS and Python version. On top of this, we can &lt;br&gt;
now decide on a cadence by which we periodically invalidate the cache.&lt;/p&gt;

&lt;p&gt;Currently there is no way to explicitly invalidate the cache, so you'll have to use a &lt;br&gt;
workaround, like changing something innocuous in a hashed file, or adding date stubs to the cache key.&lt;/p&gt;
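&lt;p&gt;As a sketch of the date-stub workaround (the weekly cadence here is just an example), one could compute a date string in a prior step and mix it into the cache key:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;    - name: Get current calendar week
      id: date
      shell: bash
      run: echo "::set-output name=week::$(date +'%Y-%U')"

    - name: Load cached venv
      uses: actions/cache@v2.1.6
      with:
        path: .venv
        # a new week number automatically invalidates the cache
        key: venv-${{ runner.os }}-${{ steps.date.outputs.week }}-${{ hashFiles('pyproject.toml') }}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;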
&lt;h2&gt;
  
  
  🏭 2. Modularize your monolith with reusable workflows
&lt;/h2&gt;

&lt;p&gt;The &lt;a href="https://nalexn.github.io/separation-of-concerns/"&gt;Separation of Concerns (SoC)&lt;/a&gt; is an important principle&lt;br&gt;
in software development: "The principle is simple: don’t write your program as one solid block,&lt;br&gt;
instead, break up the code into chunks that are finalized tiny pieces of the system each able to complete a &lt;br&gt;
simple distinct job." This makes your code more understandable, reusable and maintainable.&lt;/p&gt;

&lt;p&gt;In order to grow our Github Actions setup, that meant splitting the monolithic workflow that was responsible for everything&lt;br&gt;
into multiple sub-workflows with one purpose each. Luckily, Github Actions can do all of this.&lt;/p&gt;

&lt;p&gt;Reusable workflows are a way to use full-fledged workflows as jobs within an overarching workflow.&lt;br&gt;
In our case this means we have one CI workflow that calls the linting, unit test and integration test workflows&lt;br&gt;
respectively. This enables us to use any combination of these sub-workflows but also trigger them separately. What this &lt;br&gt;
also gives us is a perfect encapsulation of each separate job. Now our linting dependencies do not interfere with the&lt;br&gt;
integrations that we must install for our integration tests. This also allows us more fine-grained control over the &lt;br&gt;
runners, Python versions and other peripheral configurations, which can now be set at the level of each 'reusable &lt;br&gt;
workflow'.&lt;/p&gt;

&lt;p&gt;Here's an excerpt from our 'ci.yml' file. Within the jobs section, we simply give each step in the job a name and call the &lt;br&gt;
corresponding reusable workflow.&lt;/p&gt;

&lt;p&gt;'.github/workflows/ci.yml'&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;poetry-install&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./.github/workflows/poetry-install.yml&lt;/span&gt;

  &lt;span class="na"&gt;lint-code&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;needs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;poetry-install&lt;/span&gt;
    &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./.github/workflows/lint.yml&lt;/span&gt;

  &lt;span class="na"&gt;unit-test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;needs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;poetry-install&lt;/span&gt;
    &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./.github/workflows/unit-test.yml&lt;/span&gt;

  &lt;span class="na"&gt;integration-test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;needs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;poetry-install&lt;/span&gt;
    &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./.github/workflows/integration-test.yml&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Within these reusable workflows themselves we just need to make sure to add &lt;br&gt;
&lt;code&gt;workflow_call&lt;/code&gt; to the list of triggers under &lt;code&gt;on:&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;'.github/workflows/lint.yml'&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Integration Test the Examples&lt;/span&gt;

&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;workflow_call&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="s"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each reusable workflow takes the place of a job and is run on a separate machine. &lt;br&gt;
As such, outputs from one job need to be defined explicitly as outputs/inputs to pass information between jobs.&lt;/p&gt;
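&lt;p&gt;As a minimal sketch (the 'coverage' output name and 'unit-test' job are hypothetical), a reusable workflow declares outputs under its &lt;code&gt;workflow_call&lt;/code&gt; trigger, and the calling workflow then reads them via &lt;code&gt;needs&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# in the reusable workflow
on:
  workflow_call:
    outputs:
      coverage:
        description: "Coverage reported by the unit tests"
        # forwards a job-level output defined inside this workflow
        value: ${{ jobs.unit-test.outputs.coverage }}

# a later job in the calling workflow can then reference:
#   ${{ needs.unit-test.outputs.coverage }}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;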

&lt;p&gt;As you can see, the jobs that reference the different workflows have dependencies on one another. Here we make sure the &lt;br&gt;
'poetry install' only has to be done once per OS/Python version combination before branching into the three separate &lt;br&gt;
workflows. Currently, each of the sub-workflows runs on the same matrix. &lt;/p&gt;

&lt;p&gt;One downside of this approach is that the 'poetry-install' job is only considered done &lt;br&gt;
when all six matrix cells are complete. This means that even if the Ubuntu/Py3.8 runner is done with 'poetry-install' &lt;br&gt;
after 1 minute, the Ubuntu/Py3.8 runner for 'lint-code' can only start once every other runner on the 'poetry-install' &lt;br&gt;
job is done.&lt;/p&gt;
&lt;h2&gt;
  
  
  🔁 3. Avoid code duplication with composite actions
&lt;/h2&gt;

&lt;p&gt;As we were designing the different reusable workflows it became obvious that we were generating a lot of duplicated &lt;br&gt;
code. Each workflow would set up Python, do some OS-specific fine-tuning, install Poetry and load the cached venv&lt;br&gt;
or create it. &lt;/p&gt;

&lt;p&gt;This is where composite actions come into play. A composite action condenses multiple steps into one step and makes it&lt;br&gt;
usable as a step across all of your workflows. Here is a small example of how we use it. &lt;/p&gt;

&lt;p&gt;In the '.github' directory we create an 'actions' folder which in turn is populated by a folder for each composite &lt;br&gt;
action that you want to create -- in our case '.github/actions/setup_environment'. Within this folder you then create &lt;br&gt;
a file with the name 'action.yml'. Now you just need to add all your steps to the &lt;code&gt;runs&lt;/code&gt; section and add the &lt;br&gt;
&lt;code&gt;using: "composite"&lt;/code&gt; entry to it. &lt;/p&gt;

&lt;p&gt;'.github/actions/setup_environment/action.yml'&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;runs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;using&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;composite"&lt;/span&gt;
  &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Set up Python&lt;/span&gt;
      &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/setup-python@v2&lt;/span&gt;
      &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;python-version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ matrix.python-version }}&lt;/span&gt;

    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Install Poetry&lt;/span&gt;
      &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;snok/install-poetry@v1&lt;/span&gt;
      &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;virtualenvs-create&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;virtualenvs-in-project&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt;

    &lt;span class="s"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All that is left to do now is reference this action from within your workflows to start using it.&lt;/p&gt;

&lt;p&gt;'.github/workflows/lint.yml'&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;    &lt;span class="s"&gt;...&lt;/span&gt;

    &lt;span class="s"&gt;- name&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Setup environment with Poetry&lt;/span&gt;
    &lt;span class="s"&gt;uses&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./.github/actions/setup_environment&lt;/span&gt;

    &lt;span class="s"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You might be asking yourself: "What is the difference between these 'composite actions' and 'reusable workflows'?" The short &lt;br&gt;
answer is: 'composite actions' are a collection of commands, while 'reusable workflows' also contain information on where&lt;br&gt;
and how to be executed. For more information check out &lt;a href="https://chris48s.github.io/blogmarks/github/2021/11/06/composite-actions-reusable-workflows.html#:~:text=A%20composite%20action%20is%20presented,separately%20in%20the%20summary%20output."&gt;this&lt;/a&gt;&lt;br&gt;
blogpost that goes into a bit more detail on the differences.&lt;/p&gt;
&lt;h2&gt;
  
  
  💬 4. Expose control of Github Actions to developers with comment interaction
&lt;/h2&gt;

&lt;p&gt;It is hard to find the right granular triggers for your workflows. "Should we run this on every pull request?&lt;br&gt;
Should we only run this on PRs from 'dev' to 'main'? Should we run this only for changes within a given directory?" &lt;br&gt;
These are some questions you'll inevitably run into while growing your CI pipeline. All of them are useful &lt;br&gt;
ways to critically examine the motivations and reasons behind each part of your CI pipeline. &lt;/p&gt;

&lt;p&gt;Automating most of these triggers helps ensure your code deployment runs smoothly with guaranteed checks&lt;br&gt;
in place. However, there are times when you want to have some more fine-grained control.&lt;/p&gt;

&lt;p&gt;We ran into one such case at ZenML. One of our integrations is &lt;a href="https://www.kubeflow.org/docs/components/pipelines/"&gt;Kubeflow Pipelines&lt;/a&gt;.&lt;br&gt;
This integration needs to spin up a cluster of pods using &lt;a href="https://k3d.io/"&gt;k3d&lt;/a&gt; in order to deploy Kubeflow Pipelines. Then all of our other integration tests&lt;br&gt;
are run on this cluster. This whole process takes about 1 hour to run and so it is not something we want running for each &lt;br&gt;
push on each PR. Instead, we want to have some control over when it is appropriate. &lt;/p&gt;

&lt;p&gt;The first part of the fix to this problem is to include &lt;code&gt;workflow_dispatch&lt;/code&gt; as one of the triggers for our reusable workflow that is&lt;br&gt;
dedicated to Kubeflow Pipelines integration tests. In order to make this even easier and more integrated into our &lt;br&gt;
normal workflow surrounding pull requests, we also added the 'pull-request-comment-trigger' action to our CI pipeline.&lt;/p&gt;

&lt;p&gt;Given a specific comment on a pull request, the test gets activated for this PR, meaning that each commit on that PR &lt;br&gt;
will now trigger the specified Kubeflow Pipelines integration test. &lt;/p&gt;

&lt;p&gt;As we are using the step as a basis to decide if a certain workflow should be executed, it needs to be part of a job of &lt;br&gt;
its own. As such we are explicitly defining the output of the 'check_comments' job, so it can be used to conditionally &lt;br&gt;
run the Kubeflow tests job. &lt;/p&gt;

&lt;p&gt;'.github/workflows/ci.yml'&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;    &lt;span class="s"&gt;...&lt;/span&gt;

&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;pull_request&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;types&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;opened&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;synchronize&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;issue_comment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;types&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;created&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

    &lt;span class="s"&gt;...&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;

  &lt;span class="s"&gt;...&lt;/span&gt; 

  &lt;span class="s"&gt;check_comments&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;outputs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;kf_trigger&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ steps.check.outputs.triggered }}&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;khan/pull-request-comment-trigger@master&lt;/span&gt;
        &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;check&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;trigger&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;LTKF!'&lt;/span&gt;
          &lt;span class="na"&gt;reaction&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;rocket&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;GITHUB_TOKEN&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;${{&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;secrets.GITHUB_TOKEN&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;}}'&lt;/span&gt;

&lt;span class="err"&gt;  &lt;/span&gt;&lt;span class="na"&gt;kubeflow-tests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;needs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;poetry-install&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;check_comments&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="c1"&gt;# Run this one only when pull-request-comment-trigger was triggered&lt;/span&gt;
    &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ needs.check_comments.outputs.kf_trigger == 'true' }}&lt;/span&gt;
    &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./.github/workflows/kubeflow.yml&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Make sure you add 'pull_request' and 'issue_comment' to the workflow triggers if you want &lt;br&gt;
to use the 'pull-request-comment-trigger'.&lt;/p&gt;

&lt;p&gt;In our case, if you want to run integration tests on Kubeflow Pipelines specifically, you simply comment 'LTKF!', short&lt;br&gt;
for 'Let The Kubes Flow'.&lt;/p&gt;

&lt;p&gt;You may have noticed that there is also a reaction that can be specified: &lt;code&gt;reaction: rocket&lt;/code&gt;. This might be more gimmick &lt;br&gt;
than anything. But isn't it tiny things like this that can take your code from being merely functional to the next level of&lt;br&gt;
delightful?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--QrUeaets--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6wy3xu1akiz00t3l3rvd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--QrUeaets--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6wy3xu1akiz00t3l3rvd.png" alt="Interaction with Github Actions on the PR comments" width="357" height="158"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Using 'issue_comment' as a trigger currently seems to be supported only if the workflow with&lt;br&gt;
this type of trigger has already landed on the default branch of the repo. As such this feature is not fully tested on&lt;br&gt;
our side yet, as it will only reach our main branch in the coming week. I'll make sure to update this if something&lt;br&gt;
changes along the way.&lt;/p&gt;
&lt;h2&gt;
  
  
  ♻️ 5. Reduce wasted compute resources by avoiding unwanted concurrency
&lt;/h2&gt;

&lt;p&gt;I'm sure this has happened to you before. After some intense hours coding away, you are ready to commit and push your&lt;br&gt;
work. Mere seconds after you have pushed and opened your PR, you realize that you left something in the code that does &lt;br&gt;
not belong. No problem, you think, and within seconds you make the change, commit and push. &lt;/p&gt;

&lt;p&gt;Trust me, I've been there more times than I care to admit. In these cases I would go into the Github Actions view and&lt;br&gt;
manually cancel the Github Action of the first push to free up the runners. But there is an easier way. &lt;br&gt;
By defining what the Github Action does in case of concurrency, this can be handled automatically. In our case, we want &lt;br&gt;
to cancel the action of the older commit, as we want to know if the most recent code version passes our CI pipeline.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;    &lt;span class="s"&gt;...&lt;/span&gt;

    &lt;span class="s"&gt;concurrency&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
      &lt;span class="c1"&gt;# New commit on branch cancels running workflows of the same branch&lt;/span&gt;
      &lt;span class="na"&gt;group&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ github.workflow }}-${{ github.ref }}&lt;/span&gt;
      &lt;span class="na"&gt;cancel-in-progress&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt;

    &lt;span class="s"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  🚀 Our final process (for now)
&lt;/h1&gt;

&lt;p&gt;When I set out on the journey to improve our CI pipelines, the Github Actions weren't even part of the plan. &lt;br&gt;
All I wanted to do was create a &lt;a href="https://docs.pytest.org/"&gt;pytest&lt;/a&gt; fixture that creates a separate virtual environment for each integration test&lt;br&gt;
(If this is something that interests you, let us know, and we'll do a separate blogpost on that whole story).&lt;br&gt;
The changes to our Github Actions just happened naturally on the side. The whole process did feel a bit like this though.&lt;/p&gt;

&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/AbSehcT19u0"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;As of early March 2022 this is the new CI pipeline that we use here at &lt;a href="https://github.com/zenml-io/zenml"&gt;ZenML&lt;/a&gt; and the &lt;br&gt;
feedback from my colleagues -- fellow engineers -- has been very positive overall. I am sure there will be tweaks, changes and refactorings in the future, but for&lt;br&gt;
now, this feels Zen. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--2wKE9DOH--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/54v38u2lle2rf17ib66q.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--2wKE9DOH--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/54v38u2lle2rf17ib66q.gif" alt="New Github Actions on the complete CI pipeline" width="600" height="531"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Check it out yourself &lt;a href="https://github.com/zenml-io/zenml/blob/develop/.github/workflows/ci.yml"&gt;here&lt;/a&gt; and feel free to &lt;br&gt;
drop in on &lt;a href="https://zenml.io/slack-invite/"&gt;Slack&lt;/a&gt; and let us know if this helped you or if you have tips on how we can &lt;br&gt;
do even better.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Alexej Penner is a Machine Learning Engineer at ZenML.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>startup</category>
      <category>mlops</category>
      <category>devops</category>
      <category>python</category>
    </item>
  </channel>
</rss>
