Everyone seems to have a love/hate relationship with Atlassian products. I've only really worked at "Atlassian shops" my entire career: Jira, Confluence, Bitbucket, StatusPage. It's nice to have everything in "one place", but it seems like someone is always "fighting" a limitation of their products. Can't get Jira to do the thing? I guess it's Excel again. Can't get Bitbucket to work with Playwright Test sharding? You've come to the right place.
So what is sharding? The concept is pretty simple and the execution even simpler. The command line pretty much looks like this:
npx playwright test --shard=1/3
Then you do the same for shard 2 of 3, and 3 of 3. Ideally, each command runs on its own machine/Docker container and is assigned its own little subset of tests.
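Spelled out, the full three-way split is just:

npx playwright test --shard=1/3
npx playwright test --shard=2/3
npx playwright test --shard=3/3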
And how does reporting work? If you are able to gather up all of the artifacts written (by default) to ./blob-report, then it's just this:
npx playwright merge-reports --reporter html ./blob-report
Sounds pretty sweet, right? A bunch of tests running in parallel across different pipeline jobs, then you merge a report and serve it up somewhere.
All of this is made super easy in GitHub Actions but is unfortunately non-existent in Bitbucket Pipelines. The idea of a "job triggering other jobs" is just not a thing.
So how can this be done? Everything is done through shell scripts and some imagination. First, let's take a look at the top-level pipelines we'll need:
pipelines:
  custom:
    execute-tests:
      - variables:
          - name: Environment
            default: dev
            allowed-values:
              - dev
              - stage
          - name: MaxNumberOfShards
            default: 1
      - step: *run-tests
    run-shard:
      - variables:
          - name: Environment
          - name: ShardNumber
          - name: MaxNumberOfShards
      - step: *run-shard
So what's happening here? The run-shard job is basically how our individual shards will be run. This is what it looks like from the Bitbucket Pipelines UI:

If you really wanted to, you could go into the Bitbucket Pipelines UI and resubmit this form for every shard you want to run. The idea here is to use our execute-tests pipeline job to automate all of that!
So what does our run-shard definition actually look like?
definitions:
  steps:
    - step: &run-shard
        name: Run shard for playwright tests
        image: mcr.microsoft.com/playwright:v1.37.0-jammy
        size: 2x
        caches:
          - node
        script:
          - echo "TEST_ENV=$Environment" > .env
          - export DEBIAN_FRONTEND=noninteractive # Interactive installation of aws-cli causes issues
          - apt-get update && apt-get install -y awscli
          - npm install
          - npx playwright test --shard="$ShardNumber"/"$MaxNumberOfShards" || true # Run test shard
          - aws s3 cp blob-report/ s3://my-bucket/blob-report --recursive # Copy blob report to S3
        artifacts:
          - playwright-report/**
          - test-results/**
          - blob-report/**
          - logs/**
          - .env
Looking a little nasty, isn't it? We have our Playwright Docker image executing what we want, namely the playwright test --shard CLI command that we needed. From there, we upload the blob report to S3, which means installing aws-cli during the pipeline. To me, this seemed a lot easier than trying to fetch artifacts from various pipeline jobs that can be fairly difficult to track down.
We now have an individual run-shard job that can run $ShardNumber out of $MaxNumberOfShards (i.e. 1/6, 2/6, etc.). I refer to these as "child pipelines". Take note that we've added || true to the playwright test step: honestly, we're not interested in seeing the individual test statuses for the child pipelines. We want to focus on examining test results from our "parent pipeline", and not have a bunch of failed child pipelines divert our attention.
And so what does our parent pipeline look like? Admittedly it's a mess of shell scripts designed to do a few different things.
    - step: &run-tests
        name: Run all UI tests
        image: mcr.microsoft.com/playwright:v1.37.0-jammy
        size: 2x
        caches:
          - node
        script:
          - echo "TEST_ENV=$Environment" > .env
          - export DEBIAN_FRONTEND=noninteractive # Interactive installation of aws-cli causes issues
          - apt-get update && apt-get install -y awscli
          - aws s3 rm s3://my-bucket/blob-report --recursive # Clear out old blob reports from previous test runs
          - npm install
          - /bin/bash ./scripts/start_playwright_shards.sh # Start child pipelines
          - /bin/bash ./scripts/monitor_shards.sh # Monitor child pipelines from parent pipeline
          - /bin/bash ./scripts/merge_reports_from_shards.sh # Download sharded blob reports from S3 and merge
          # Fail the parent pipeline if test failures are found across shards
          - |
            if grep -qE "[0-9]+ failed" ./logs/test-results.log; then
              echo "Failed tests found in log file"
              exit 1
            fi
        artifacts:
          - playwright-report/**
          - test-results/**
          - logs/**
          - .env
This parent pipeline, through some shell scripts, will accomplish the following:
- Iterate from 1 through $MaxNumberOfShards and send a POST to Bitbucket's API to start the run-shard pipeline job. The pipeline variables are sent as part of its payload.
- Poll for any IN_PROGRESS child pipeline jobs using the Bitbucket API. If the number of run-shard jobs is 0, that means we're all done and the parent pipeline can finish.
- Download the blob-report folder from S3 and execute merge-reports. Here, I opt to create an HTML report as well as a list report, which is the Playwright default. The former is found as an artifact in playwright-report, while the latter ends up in logs/test-results.log, a file that is normalized and parsed for results.
- If the generated log file contains "X failed", it means at least 1 test failed across all children. And if any of the individual children fail, then the parent is deemed a failure too (hey, just like in real life!)
I'll spare you the gory details of the bash scripts; for the most part, the work involved inspecting Bitbucket's network requests and mimicking those via curl, and rough sketches of the three scripts follow below. From there, it's also a good idea to make your test reporting shareable and easily accessible for your team.
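First, kicking off the shards. Here's a minimal sketch of the trigger script, built on Bitbucket's documented "run pipeline" endpoint rather than my script verbatim. BB_TOKEN (say, a repository access token), WORKSPACE, and REPO_SLUG are hypothetical variables you'd configure yourself, while Environment, MaxNumberOfShards, and BITBUCKET_BRANCH are already present in the parent step's environment.

#!/usr/bin/env bash
# start_playwright_shards.sh (sketch): trigger one run-shard pipeline per shard
set -euo pipefail

for shard in $(seq 1 "$MaxNumberOfShards"); do
  curl -sf -X POST \
    -H "Authorization: Bearer $BB_TOKEN" \
    -H "Content-Type: application/json" \
    "https://api.bitbucket.org/2.0/repositories/$WORKSPACE/$REPO_SLUG/pipelines/" \
    -d @- <<EOF
{
  "target": {
    "type": "pipeline_ref_target",
    "ref_type": "branch",
    "ref_name": "$BITBUCKET_BRANCH",
    "selector": { "type": "custom", "pattern": "run-shard" }
  },
  "variables": [
    { "key": "Environment", "value": "$Environment" },
    { "key": "ShardNumber", "value": "$shard" },
    { "key": "MaxNumberOfShards", "value": "$MaxNumberOfShards" }
  ]
}
EOF
done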
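Next, the polling loop. Same assumptions as above, plus one more: jq isn't in the Playwright image, so it would need installing alongside awscli. I also count PENDING pipelines here so the loop doesn't exit before a shard has actually started.

#!/usr/bin/env bash
# monitor_shards.sh (sketch): wait until no run-shard pipelines are still active
set -euo pipefail

while true; do
  active=$(curl -sf \
    -H "Authorization: Bearer $BB_TOKEN" \
    "https://api.bitbucket.org/2.0/repositories/$WORKSPACE/$REPO_SLUG/pipelines/?sort=-created_on&pagelen=50" \
    | jq '[.values[]
           | select(.target.selector.pattern? == "run-shard")
           | select(.state.name == "PENDING" or .state.name == "IN_PROGRESS")
          ] | length')

  echo "run-shard pipelines still active: $active"
  if [ "$active" -eq 0 ]; then
    break
  fi
  sleep 30
done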
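And finally the merge, which is close to what you've already seen: pull down whatever the shards uploaded, then produce both reports.

#!/usr/bin/env bash
# merge_reports_from_shards.sh (sketch): fetch blob reports from S3 and merge
set -euo pipefail

mkdir -p logs

# Pull down every shard's blob report from the shared bucket
aws s3 cp s3://my-bucket/blob-report ./blob-report --recursive

# The HTML report lands in ./playwright-report, which is kept as a pipeline artifact
npx playwright merge-reports --reporter html ./blob-report

# The default list reporter writes to stdout; capture it for the grep in the
# parent step (the real script also normalizes this file before parsing)
npx playwright merge-reports --reporter list ./blob-report | tee ./logs/test-results.log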
Well, that's all there is to it. I wish it were simpler in Bitbucket but... it's not. GitHub Actions lets a couple dozen lines of YAML do the same thing. But here we have another thing to deal with when it comes to Atlassian. Thanks for the blog idea, though.