Omri Gabay

Posted on Nov 29, 2019 • Edited on Dec 13, 2023

Getting Started with SimpleCov in GitLab CI

#ruby #rails #simplecov #gitlab

EDIT (10/04/2020): Edited to reflect slightly cleaner practices, and use
the collate method in SimpleCov that does a lot of work for you.

SimpleCov is a code coverage tool for Ruby projects. It wraps Ruby's built-in Coverage API and is, by far, the most popular gem for code coverage.
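To make the relationship with the Coverage API concrete, here is a minimal sketch that uses Ruby's standard-library Coverage module directly, without SimpleCov. The throwaway file and the shout method are made up for illustration; SimpleCov does a lot more (filtering, grouping, merging, formatting) on top of this raw data.

```ruby
require 'coverage'
require 'tempfile'

# Write a tiny throwaway Ruby file to measure (hypothetical example code).
source = Tempfile.new(['coverage_demo', '.rb'])
source.write(<<~RUBY)
  def shout(text, loud)
    if loud
      text.upcase
    else
      text.downcase
    end
  end
  shout('hi', false)
RUBY
source.close

# Coverage only tracks files that are loaded after it has been started.
Coverage.start
load source.path

# Coverage.result maps each loaded file to an array of per-line hit counts;
# nil marks lines that are not executable (such as `else` and `end`).
_path, line_hits = Coverage.result.find do |path, _hits|
  File.basename(path) == File.basename(source.path)
end

relevant = line_hits.compact
covered  = relevant.count(&:positive?)
puts "#{covered} / #{relevant.size} LOC covered"
```

The `loud` branch is never taken here, so at least one executable line is reported with zero hits. That is the kind of blank spot a coverage report is meant to surface.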

simplecov-ruby / simplecov

Code coverage for Ruby with a powerful configuration library and automatic merging of coverage across test suites


I spent a good chunk of this past weekend trying to integrate SimpleCov into a framework called Marty that we work on at my company, PennyMac Loan Services.

We use GitLab CI at PennyMac for continuous development workflows, and while it's a fantastic piece of software, getting it to work with a code coverage tool like SimpleCov is not quite a "batteries included" situation.

GitLab coverage in merge request — Example of GitLab CI Coverage percentage in a Merge Request

While SimpleCov does support merging coverage reports from different test libraries and suites, I ran into some difficulties getting it to merge results from multiple different runs using subsets of the same test suite. This is particularly important for our use case since we parallelize our test suite during CI pipeline runs by category, and we'd like to get an overarching coverage report for the whole suite instead of just the coverage of individual categories.

This post assumes that you already know what SimpleCov is and have it integrated into a Ruby project, but want to learn how to use it effectively with GitLab CI. Unfortunately, it won't cover deploying coverage reports to GitLab Pages, as there is already a fantastic article (which has inspired large parts of this post) on the matter. Enjoy!

Quick setup details

The most basic way to get started with SimpleCov is to add the gem to your bundle, and then put something like this at the beginning of your spec_helper.rb, or whatever conventional pre-test file you load before running your suite:

# spec/spec_helper.rb

require 'simplecov'
SimpleCov.start

ENV['RAILS_ENV'] ||= 'test'
ENV['TZ'] ||= 'America/Los_Angeles'

This is fine, although if you need to do any more configuration than this
(which we will need for this scenario), I recommend having a helper file for
SimpleCov. I put mine in lib/ so I could reuse it in different applications:

# lib/marty/simplecov_helper.rb
# Credit goes to https://gitlab.com/gitlab-org/gitlab-foss/blob/master/spec/simplecov_env.rb

require 'simplecov'
require 'active_support/core_ext/numeric/time'

module SimpleCovHelper
  def self.configure_profile
    SimpleCov.configure do
      enable_coverage ENV.fetch('COVERAGE_METHOD', 'line').to_sym

      track_files '{app,config,lib,spec}/**/*.rb'

      add_filter 'db/migrate'
      add_filter 'vendor/'
      add_filter 'extjs/'

      add_group 'Channels', 'app/channels'
      add_group 'Netzke Components', 'app/components'
      add_group 'Controllers', 'app/controllers'
      add_group 'Helpers', 'app/helpers'
      add_group 'Jobs', ['app/jobs', 'app/workers']
      add_group 'Mailers', 'app/mailers'
      add_group 'Models', 'app/models'
      add_group 'Services', 'app/services'
      add_group 'Configs', 'config/'
      add_group 'Libraries', 'lib/'
      add_group 'Specs', 'spec/'

      use_merging true
      merge_timeout 1.day
    end
  end

  def self.start!
    return unless ENV['COVERAGE'] == 'true'

    configure_profile

    SimpleCov.start
  end
end

A quick explanation of what's going on here:

We expose the SimpleCov.start function via start! in the module. That allows us to add more options before actually starting coverage tracking.
We only run coverage if the COVERAGE environment variable is set to true. Coverage generation is potentially an expensive operation and can make RSpec suite runs take much longer, so it should be an opt-in feature.

We also call the configure_profile method, which defines some basic but
helpful configuration variables.

SimpleCov usually recommends using the rails profile when calling start.
This works for most people; however, it filters out the config/ and spec/
directories, and I wanted coverage for both. So I chose instead to build a
configuration similar to the rails profile that keeps coverage for config/
and spec/:

# lib/marty/simplecov_profile.rb

require 'simplecov'
require 'active_support/core_ext/numeric/time'

SimpleCov.profiles.define :marty do
  enable_coverage ENV.fetch('COVERAGE_METHOD', 'line').to_sym

  track_files '{app,config,lib,spec}/**/*.rb'

  add_filter 'db/migrate'
  add_filter 'vendor/'
  add_filter 'extjs/'

  add_group 'Channels', 'app/channels'
  add_group 'Netzke Components', 'app/components'
  add_group 'Controllers', 'app/controllers'
  add_group 'Helpers', 'app/helpers'
  add_group 'Jobs', ['app/jobs', 'app/workers']
  add_group 'Mailers', 'app/mailers'
  add_group 'Models', 'app/models'
  add_group 'Services', 'app/services'
  add_group 'Configs', 'config/'
  add_group 'Libraries', 'lib/'
  add_group 'Specs', 'spec/'

  use_merging true
  merge_timeout 1.day
end

This allows me to reuse this profile in multiple applications with little hassle.
It also makes my SimpleCovHelper module look something like this:

# lib/marty/simplecov_helper.rb

# Load the custom :marty profile so that SimpleCov.start can find it
require 'marty/simplecov_profile'

module SimpleCovHelper
  def self.start!
    return unless ENV['COVERAGE'] == 'true'

    SimpleCov.start :marty
  end
end

I also like to pre-require my SimpleCov helper using RSpec's dotfile, .rspec:

--require marty/simplecov_helper
--require spec_helper

This requires marty/simplecov_helper and then spec_helper in that order,
allowing me to just throw this at the very beginning of spec_helper:

# spec/spec_helper.rb

SimpleCovHelper.start!

ENV['RAILS_ENV'] ||= 'test'
ENV['TZ'] ||= 'America/Los_Angeles'

Getting Coverage in GitLab CI

If you enable coverage and run any RSpec test (a single file or a whole suite), you'll get something like this:

$ COVERAGE=true bundle exec rspec
.................................

Finished in 1 minute 15.14 seconds (files took 1.73 seconds to load)
249 examples, 0 failures, 8 pending

Coverage report generated for RSpec to #{Rails.root}/coverage. 1170 / 1230 LOC (95.12%) covered.

See that last part down there? The one that says 1170 / 1230 LOC? That's the result of the test coverage report, and it's exactly what we're looking to parse. Specifically, the percentage.

GitLab CI's configuration lives in a .gitlab-ci.yml file in the root of your project, and a great deal of configuration can go into it. What we specifically care about here is the job that runs your RSpec tests. Here is a rudimentary example of such a job:

# .gitlab-ci.yml

image: ruby:2.6.3-buster

before_script:
  - gem install bundler
  - bundle install --jobs $(nproc) --path vendor "${FLAGS[@]}"

rspec:
  stage: test
  script:
    - bundle exec rspec

After enabling coverage via variables, we will need to extract it from the output of the job, and this is exactly what the coverage key is for!

The coverage key takes a regular expression that matches the output percentage of your job, and embeds it into the results that we saw in the above screenshot:

# .gitlab-ci.yml

rspec:
  stage: test
  coverage: '/LOC\s\(\d+\.\d+%\)\scovered/'
  variables:
    COVERAGE: "true"
  script:
    - bundle exec rspec

This is most likely not the simplest regular expression to match "LOC (xx.xx%) covered", but it's working fine for me so far, so feel free to use it.

GitLab CI will now parse the output of the job looking for a string matching that regex, and when it finds one, it'll display the percentage on your Merge Request page and in the job results!
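If you want to check the pattern locally before pushing, a small Ruby sketch like the one below can help. It only illustrates what the regex matches; extract_coverage is a hypothetical helper of mine, not GitLab's actual implementation, and the sample line is modelled on the RSpec output shown earlier.

```ruby
# The same pattern as the `coverage:` key, written as a Ruby regexp.
COVERAGE_PATTERN = /LOC\s\(\d+\.\d+%\)\scovered/

# Returns the percentage found in the job output as a Float, or nil when
# nothing matches. (Hypothetical helper, for local experimentation only.)
def extract_coverage(job_output)
  matched = job_output[COVERAGE_PATTERN]
  matched && matched[/\d+\.\d+/].to_f
end

sample = 'Coverage report generated for RSpec to /app/coverage. ' \
         '1170 / 1230 LOC (95.12%) covered.'

puts extract_coverage(sample)                     # prints 95.12
puts extract_coverage('no report here').inspect   # prints nil

# The percentage itself is just covered lines over relevant lines.
puts((1170.0 / 1230 * 100).round(2))              # prints 95.12
```

Note that the pattern requires a decimal point, so a line such as "LOC (100%) covered" would not match, while "LOC (100.0%) covered" would.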

Viewing Artifacts

Since you may want to view the coverage reports generated by CI later on, I suggest publishing them as job artifacts so they can be stored and downloaded for review:

# .gitlab-ci.yml

rspec:
  stage: test
  coverage: '/LOC\s\(\d+\.\d+%\)\scovered/'
  variables:
    COVERAGE: "true"
  script:
    - bundle exec rspec
  artifacts:
    paths:
      - "coverage/"
    name: "Pipeline $CI_PIPELINE_ID Coverage Report"

This'll generate an artifact with a unique, descriptive name for each pipeline that runs, including the pipeline's ID (viewable through your MR or the Pipelines section in CI/CD).

Multiple/Parallelized Test Suite

Things get much more interesting if you decide to parallelize your test suite; this means you don't run all of your specs in one go, but instead in different CI jobs. This is fine, but it makes getting complete suite coverage much more difficult.

To solve for this use case, we'll have to find a way to combine all of our results in one place and merge them (and fortunately, SimpleCov already has us covered on that front).

Getting all your coverage results together

Suppose that your .gitlab-ci.yml looks something like the above examples, except that you've decided to use a base job to split off your rather massive test suite into multiple categories:

# .gitlab-ci.yml

.base-rspec:
  stage: test
  coverage: '/LOC\s\(\d+\.\d+%\)\scovered/'
  image: ruby:2.6.3-buster
  before_script:
    - gem install bundler
    - bundle install --jobs $(nproc) --path vendor "${FLAGS[@]}"
  variables:
    COVERAGE: "true"

rspec-models:
  extends: .base-rspec
  script:
    - bundle exec rspec spec/models

rspec-controllers:
  extends: .base-rspec
  script:
    - bundle exec rspec spec/controllers

rspec-lib:
  extends: .base-rspec
  script:
    - bundle exec rspec spec/lib

rspec-features:
  extends: .base-rspec
  script:
    - bundle exec rspec spec/features

Each one of these guys will generate its own unique coverage report, but they won't be complete.
Most likely, one category of tests does not span all your project's files and as such will result in odd, incomplete reports of 51.43% or even 37.22% coverage.

What we actually want is the product of all of our divided test suites together, and this has to be done by combining them in a single job, and then generating the report.
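To see why the combined number is the one that matters, here is a simplified model of merging per-line hit counts for a single file. The hit counts are invented, and merge_line_hits and covered_percent are hypothetical helpers; SimpleCov's own merging logic is more involved, so treat this as a sketch of the idea rather than its implementation.

```ruby
# Merge two per-line hit arrays for the same file. nil means "not an
# executable line"; numbers are hit counts. Assumes equal-length arrays.
def merge_line_hits(first, second)
  first.zip(second).map do |a, b|
    a.nil? && b.nil? ? nil : a.to_i + b.to_i
  end
end

# Percentage of executable lines that were hit at least once.
def covered_percent(line_hits)
  relevant = line_hits.compact
  return 100.0 if relevant.empty?

  (relevant.count(&:positive?) * 100.0 / relevant.size).round(2)
end

# Hypothetical hit counts for one file, as seen by two different CI jobs.
models_job   = [1, 2, nil, 0, 0, nil]
features_job = [1, 0, nil, 3, 0, nil]

merged = merge_line_hits(models_job, features_job)

puts covered_percent(models_job)    # prints 50.0
puts covered_percent(features_job)  # prints 50.0
puts covered_percent(merged)        # prints 75.0
```

Each job on its own reports 50%, but together they exercise three of the four executable lines.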

To do that, create a new stage in your pipeline stages called codecov:

# .gitlab-ci.yml

stages:
  - build
  - test
  - codecov
  - deploy

You could also use the default deploy stage; any stage works, as long as it comes after test.

To combine all the reports, we're going to want them to have unique names when they all end up together in the code coverage job, so one doesn't overwrite the other. SimpleCov has a mechanism to give jobs unique names, and we can do this based on environment variables.

Add this to the SimpleCov Profile that we created:

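# lib/marty/simplecov_profile.rb

SimpleCov.profiles.define :marty do
  # ...
  if ENV['GITLAB_CI']
    # Give each CI job its own result folder and command name so that
    # reports from different jobs don't overwrite each other.
    job_name = ENV['CI_JOB_NAME']
    coverage_dir "coverage/#{job_name}"
    command_name job_name
    # Compute and store this job's result on exit.
    SimpleCov.at_exit { SimpleCov.result }
  end
end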

So now, if the job is titled rspec-features, SimpleCov will write its results to a folder called coverage/rspec-features in the project root.
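As a quick sanity check of the naming scheme, the sketch below mirrors that branch in plain Ruby. coverage_dir_for is a hypothetical helper that takes the environment as a hash, so you can try it without touching your real ENV.

```ruby
# Mirrors the branch in the profile: a per-job folder on GitLab CI,
# SimpleCov's default "coverage" folder everywhere else.
def coverage_dir_for(env)
  return 'coverage' unless env['GITLAB_CI']

  File.join('coverage', env.fetch('CI_JOB_NAME'))
end

ci_env = { 'GITLAB_CI' => 'true', 'CI_JOB_NAME' => 'rspec-features' }

puts coverage_dir_for(ci_env)  # prints coverage/rspec-features
puts coverage_dir_for({})      # prints coverage
```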

Next, we want to make sure we pass on our individual, uniquely named coverage reports to the job that will combine them, and we will do this using artifacts. Take our .base-rspec job and add the following artifacts field to it:

# .gitlab-ci.yml

.base-rspec:
  stage: test
  coverage: '/LOC\s\(\d+\.\d+%\)\scovered/'
  image: ruby:2.6.3-buster
  before_script:
    - gem install bundler
    - bundle install --jobs $(nproc) --path vendor "${FLAGS[@]}"
  variables:
    COVERAGE: "true"
  artifacts:
    paths:
      - "coverage/$CI_JOB_NAME" # look for the unique CI_JOB_NAME folder for this run
    name: "$CI_JOB_NAME Coverage" # Give it a unique artifact name
    when: always # Always extract the artifact, even if not all tests passed.

Now we have all of our coverage reports being passed on as artifacts into one job, and we just have to take care of merging them.

Merging

If you followed my instructions, your hypothetical coverage/ folder in your final, aggregating job should look something like this:

#{Rails.root}/coverage
 ├─"rspec-models"
 ├─"rspec-controllers"
 ├─"rspec-lib"
 └─"rspec-features"

We now want to merge all of our results together.
To do that, we're going to use the wonderful collate functionality
built into SimpleCov.

Define this method, merge_all_results!, in your SimpleCovHelper module:

# lib/marty/simplecov_helper.rb

module SimpleCovHelper
  def self.merge_all_results!
    # Collate and combine all the previous coverage results that we produced
    # in other RSpec runs. Combine them using the custom profile we created.
    # This method also handles storing them for you
    SimpleCov.collate(Dir['coverage/**/.resultset.json'], :marty)

    # Singleton responsible for keeping track of the last merged result in
    # memory
    merged_result = SimpleCov::ResultMerger.merged_result

    # Print out to console all the groups and their percents + hits/line.
    groups = merged_result.groups.map do |group, files|
      [group, files.covered_percent, files.covered_strength]
    end

    # Sort by percentage and print everything
    sorted_groups = groups.sort_by { |_gr, per, _str| -per }
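    sorted_groups.each do |group|
      gr_name, percent, strength = group
      # LOGGER is assumed to be a Logger defined elsewhere in your project;
      # swap in `puts` if you don't have one.
      LOGGER.info(
        "Group '#{gr_name}': #{percent} covered at #{strength} hits/line"
      )
    end

    merged_result.format!
  end
  # ...
end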

This recursively globs all .resultset.json files from the coverage/ folder
(which include details of each coverage run), then reads and merges them for you.
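If you are unsure whether the glob picks up the hidden result files, you can try it against a fake layout first. The sketch below builds that layout in a temporary directory; the job names match the examples in this post, and the file contents are placeholder JSON rather than real SimpleCov results.

```ruby
require 'fileutils'
require 'json'
require 'tmpdir'

found = Dir.mktmpdir do |project_root|
  Dir.chdir(project_root) do
    # Fake the layout that the artifacts of two test jobs would leave behind.
    %w[rspec-models rspec-features].each do |job|
      FileUtils.mkdir_p("coverage/#{job}")
      File.write("coverage/#{job}/.resultset.json", JSON.generate(job => {}))
    end

    # Same glob as in merge_all_results!: "**" descends into every job folder,
    # and the explicit leading dot lets the pattern match the hidden file.
    Dir['coverage/**/.resultset.json'].sort
  end
end

puts found
# prints:
# coverage/rspec-features/.resultset.json
# coverage/rspec-models/.resultset.json
```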

We want a way to be able to call this code from our CI job, so let's use a Rake task for that. Insert this into your Rakefile:

# Rakefile

desc 'Merge the results of various SimpleCov coverage reports'
task merge_coverage_reports: :environment do
  # Same load path as the --require line in .rspec
  require 'marty/simplecov_helper'
  puts 'Merging code coverage reports...'
  SimpleCovHelper.merge_all_results!
end

Finishing CI Touches

The last thing we want to do is to create the aggregating code coverage job. It will look something like this:

# .gitlab-ci.yml

code coverage:
  stage: codecov
  when: always # Always try to generate a code coverage report
  coverage: '/LOC\s\(\d+\.\d+%\)\scovered/'
  script:
    - bundle exec rake merge_coverage_reports
  artifacts: # Release the final, merged coverage for review
    paths:
      - "coverage/"
    name: "Pipeline $CI_PIPELINE_ID Merged Coverage"

If you did everything right, your pipeline should now run the RSpec jobs in the test stage, followed by the code coverage job in the codecov stage.

Conclusion

Code Coverage is not a perfect metric by any means, but I think it's a fantastic heuristic to figure out where to focus testing efforts, an undertaking that my team will soon begin.

This is the first article that I've ever written, and I appreciate you reading it. I truly hope it saves you a weekend of work!