DEV Community

Pete Miloravac
Originally published at semaphore.io

Semaphore CI/CD Benchmark: Performance and Cost Analysis

This benchmark compares Semaphore to GitHub Actions, GitLab CI, Buildkite, and CircleCI using the same repository, pipeline logic, versions, and equivalent machine classes. 

The goal is to measure real execution time and compute cost under identical conditions.

Repository and Workload

Repository: Redmine (Ruby on Rails application). The workload consists of dependency installation and full test execution. Runs were measured after cache warm-up to focus on steady-state execution time. After initializing the cache, 10 consecutive runs were executed per provider. No outliers were removed.

Infrastructure Configuration

All providers used 2 vCPU machines, with memory matched as closely as each provider's machine classes allow (4 to 8 GB):

  • Semaphore: f1-standard-2 (2 vCPU, 8 GB RAM)
  • GitHub Actions: ubuntu-latest (Linux runner) (2 vCPU, 7 GB RAM)
  • GitLab: saas-linux-small-amd64 (2 vCPU, 8 GB RAM)
  • CircleCI: Docker medium (2 vCPU, 4 GB RAM)
  • Buildkite: LINUX_AMD64_2X4 (2 vCPU, 4 GB RAM)

OS family, Ruby version, dependency installation strategy, and database backend are the same across the board. This benchmark measures single-job execution speed (no parallelism).

Semaphore Pipeline Overview

The Semaphore pipeline defines a single job running on f1-standard-2. It checks out the repository, restores cache, installs dependencies, sets up the database, and executes the full test suite.

version: v1.0
name: Redmine
agent:
  machine:
    type: f1-standard-2
    os_image: ubuntu2404
blocks:
  - name: Tests
    task:
      jobs:
        - name: tests
          commands:
            - checkout
            - cache restore gems-$(checksum Gemfile)-ruby-4.0
            - sudo apt-get update
            - sudo DEBIAN_FRONTEND=noninteractive apt-get install --yes --quiet build-essential pkg-config libpq-dev postgresql-client ghostscript gsfonts locales bzr cvs imagemagick
            - sudo locale-gen en_US.UTF-8
            - |
              cat > policy.xml <<'EOF'
              <policymap>
                <policy domain="coder" rights="read | write" pattern="PDF" />
              </policymap>
              EOF
            - sudo rm -f /etc/ImageMagick-6/policy.xml
            - sudo mv policy.xml /etc/ImageMagick-6/policy.xml
            - sem-version ruby "4.0"
            - sem-service start postgres 14 --db redmine_test
            - |
              cat > config/database.yml <<'EOF'
              test:
                adapter: postgresql
                database: redmine_test
                username: postgres
                password:
                host: 127.0.0.1
              EOF
            - bundle config set path 'vendor/bundle'
            - bundle install --jobs 4 --retry 3
            - cache store gems-$(checksum Gemfile)-ruby-4.0 vendor/bundle
            - export RAILS_ENV=test
            - 'export SCMS=subversion,git,git_utf8,filesystem,bazaar,cvs'
            - 'bundle exec rake ci:about'
            - 'bundle exec rake ci:setup'
            - 'bundle exec rake db:environment:set'
            - bin/rails test
            - LANG=en_US.ISO8859-1 LC_ALL=en_US.ISO8859-1 bin/rails test test/unit/repository_bazaar_test.rb
            - 'bin/rails test:autoload'

Benchmark Results












<table>
<thead>
  <tr>
    <th>Provider</th>
    <th>Runs 1-10 (mm:ss)</th>
    <th>Average (mm:ss)</th>
    <th>Semaphore faster by</th>
    <th>Price / minute</th>
    <th>Cost / run</th>
    <th>Cost increase</th>
  </tr>
</thead>
<tbody>
  <tr>
    <td>Semaphore</td>
    <td>05:10, 04:25, 04:30, 05:12, 04:34, 04:47, 06:27, 04:54, 05:00, 05:06</td>
    <td>05:01</td>
    <td></td>
    <td>$0.0075</td>
    <td>$0.04</td>
    <td></td>
  </tr>
  <tr>
    <td>GitHub Actions</td>
    <td>09:56, 10:29, 09:50, 10:03, 09:40, 09:37, 09:15, 09:42, 09:08, 09:44</td>
    <td>09:44</td>
    <td>94.48%</td>
    <td>$0.0060</td>
    <td>$0.06</td>
    <td>55.58%</td>
  </tr>
  <tr>
    <td>GitLab</td>
    <td>10:13, 11:45, 11:11, 11:27, 09:45, 11:56, 11:10, 11:23, 11:30, 12:06</td>
    <td>11:15</td>
    <td>124.49%</td>
    <td>$0.0100</td>
    <td>$0.11</td>
    <td>199.32%</td>
  </tr>
  <tr>
    <td>Buildkite</td>
    <td>05:08, 09:13, 05:51, 06:38, 07:36, 07:31, 06:53, 08:47, 06:44, 08:12</td>
    <td>07:15</td>
    <td>44.86%</td>
    <td>$0.0130</td>
    <td>$0.09</td>
    <td>151.09%</td>
  </tr>
  <tr>
    <td>CircleCI</td>
    <td>10:44, 11:00, 10:56, 17:08, 14:02, 09:25, 16:05, 13:25, 14:38, 15:33</td>
    <td>13:18</td>
    <td>165.42%</td>
    <td>$0.0060</td>
    <td>$0.08</td>
    <td>112.34%</td>
  </tr>
</tbody>
</table>


Cost Calculation Method

Formulas used:

  • Semaphore faster by = (ProviderAvg – SemaphoreAvg) / SemaphoreAvg
  • Cost per run = AverageDurationMinutes × PricePerMinute
  • Cost increase = (ProviderCost – SemaphoreCost) / SemaphoreCost
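
As a rough check, the three formulas can be applied to the raw run times and per-minute prices from the results table. The sketch below is illustrative only: the function names are made up for this example and it is not the tooling used to produce the benchmark. Working from the unrounded averages, it appears to reproduce the published percentages.

```python
# Run durations (mm:ss) and prices per minute copied from the results table.
RUNS = {
    "Semaphore": "05:10 04:25 04:30 05:12 04:34 04:47 06:27 04:54 05:00 05:06",
    "GitHub Actions": "09:56 10:29 09:50 10:03 09:40 09:37 09:15 09:42 09:08 09:44",
    "GitLab": "10:13 11:45 11:11 11:27 09:45 11:56 11:10 11:23 11:30 12:06",
    "Buildkite": "05:08 09:13 05:51 06:38 07:36 07:31 06:53 08:47 06:44 08:12",
    "CircleCI": "10:44 11:00 10:56 17:08 14:02 09:25 16:05 13:25 14:38 15:33",
}
PRICE_PER_MINUTE = {
    "Semaphore": 0.0075,
    "GitHub Actions": 0.0060,
    "GitLab": 0.0100,
    "Buildkite": 0.0130,
    "CircleCI": 0.0060,
}


def to_seconds(mmss):
    """Convert an 'mm:ss' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def average_minutes(provider):
    """Mean duration of all runs for a provider, in minutes (unrounded)."""
    seconds = [to_seconds(run) for run in RUNS[provider].split()]
    return sum(seconds) / len(seconds) / 60


def cost_per_run(provider):
    """Cost per run = AverageDurationMinutes x PricePerMinute."""
    return average_minutes(provider) * PRICE_PER_MINUTE[provider]


def faster_by(provider):
    """Semaphore faster by = (ProviderAvg - SemaphoreAvg) / SemaphoreAvg."""
    base = average_minutes("Semaphore")
    return (average_minutes(provider) - base) / base


def cost_increase(provider):
    """Cost increase = (ProviderCost - SemaphoreCost) / SemaphoreCost."""
    base = cost_per_run("Semaphore")
    return (cost_per_run(provider) - base) / base


for name in RUNS:
    print(
        f"{name:15s} avg={average_minutes(name):5.2f} min  "
        f"faster_by={faster_by(name):7.2%}  "
        f"cost=${cost_per_run(name):.4f}  "
        f"increase={cost_increase(name):7.2%}"
    )
```

Swapping in your own run times and per-minute prices gives the same three metrics for any other workload.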

Productivity Impact

Cutting build time from 9-13 minutes to 5 minutes shortens every feedback cycle. For a team running 100 builds per day, saving even 4 minutes per build adds up to 400 minutes daily, or roughly 6.7 engineer hours regained per day.
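
The arithmetic behind that estimate is simple enough to rerun with your own numbers. A minimal sketch, using the build count and per-build saving from the paragraph above (the function name is only illustrative):

```python
def hours_saved_per_day(builds_per_day, minutes_saved_per_build):
    """Total waiting time removed per day, converted from minutes to hours."""
    return builds_per_day * minutes_saved_per_build / 60


# 100 builds per day, 4 minutes saved per build (figures from the text above).
print(f"{hours_saved_per_day(100, 4):.2f} hours/day")  # prints "6.67 hours/day"
```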

Budget and Engineering Capacity Impact at Scale

Assume your organization consumes 1,000,000 build minutes on Semaphore for this workload.

Based on the benchmark runtime ratios, the same workload would require:

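  • Semaphore: 1.00M build minutes (baseline)
  • GitHub Actions: 1.94M build minutes (+15,670 hours)
  • GitLab: 2.24M build minutes (+20,709 hours)
  • Buildkite: 1.45M build minutes (+7,420 hours)
  • CircleCI: 2.65M build minutes (+27,519 hours)

The hour figures are the additional pipeline hours relative to the Semaphore baseline.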
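
The projection can be sketched as a runtime ratio applied to the 1,000,000-minute baseline. This example assumes the ratios are taken from the rounded mm:ss averages in the results table, which seems to be how the hour figures above were derived; the function names are illustrative.

```python
# Average durations (mm:ss) as published in the results table.
AVERAGE = {
    "Semaphore": "05:01",
    "GitHub Actions": "09:44",
    "GitLab": "11:15",
    "Buildkite": "07:15",
    "CircleCI": "13:18",
}
BASELINE_MINUTES = 1_000_000  # build minutes consumed on Semaphore


def to_seconds(mmss):
    """Convert an 'mm:ss' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def projected_minutes(provider):
    """Build minutes needed for the same workload, scaled by the runtime ratio."""
    ratio = to_seconds(AVERAGE[provider]) / to_seconds(AVERAGE["Semaphore"])
    return BASELINE_MINUTES * ratio


def extra_hours(provider):
    """Additional pipeline hours compared with the Semaphore baseline."""
    return (projected_minutes(provider) - BASELINE_MINUTES) / 60


for name in AVERAGE:
    print(
        f"{name:15s} {projected_minutes(name) / 1e6:.2f}M build minutes  "
        f"{extra_hours(name):+,.0f} hours"
    )
```

Changing `BASELINE_MINUTES` to your own monthly or yearly consumption scales every row proportionally.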

These are not abstract numbers. They represent real waiting time in feedback loops, and a slower CI means:

  • Longer pull request cycles
  • Slower bug detection
  • Slower incident resolution
  • More context switching
  • Reduced deployment frequency

Now consider your internal engineering cost model:

  • H = fully loaded engineering hourly cost

Then the organizational impact of slower CI is:

Additional personnel cost exposure = Extra pipeline hours × H

Without assuming a specific compensation level, the relationship is linear and direct:

  • If H increases, the cost penalty increases proportionally
  • If your build volume increases, the cost penalty scales proportionally
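
To make the linear relationship concrete, the sketch below multiplies the extra pipeline hours from the projection above by an hourly cost H. The H values used here are arbitrary placeholders chosen only to show the scaling. They are not figures from the benchmark, so substitute your own fully loaded rate.

```python
# Extra pipeline hours per 1,000,000 Semaphore build minutes (from the list above).
EXTRA_PIPELINE_HOURS = {
    "GitHub Actions": 15_670,
    "GitLab": 20_709,
    "Buildkite": 7_420,
    "CircleCI": 27_519,
}


def personnel_cost_exposure(extra_hours, hourly_cost):
    """Additional personnel cost exposure = extra pipeline hours x H."""
    return extra_hours * hourly_cost


# Hypothetical values of H, for illustration only.
for hourly_cost in (50, 100):
    print(f"H = {hourly_cost}")
    for name, hours in EXTRA_PIPELINE_HOURS.items():
        exposure = personnel_cost_exposure(hours, hourly_cost)
        print(f"  {name:15s} {exposure:>12,.0f}")
```

Doubling H doubles every row, and so does doubling the build volume, which is the proportionality described in the two bullets above.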

CI performance therefore affects the budget in two ways:

  • Direct compute cost
  • Indirect engineering time cost

The benchmark shows that under identical workload conditions, Semaphore minimizes both simultaneously.

When performance and cost efficiency move in the same direction, CI infrastructure reduces both compute waste and feedback loop waiting times.

TL;DR

Under an identical workload, equivalent machine classes, and single-job execution, Semaphore delivered the fastest average execution time and the lowest cost per run in this benchmark configuration.

For engineering teams optimizing feedback loops and infrastructure spend, execution time and cost per run are measurable levers. This benchmark quantifies both.

Next Step

CI performance directly influences two variables that scale with your organization: infrastructure spend and engineering throughput.

This benchmark demonstrates measurable differences under controlled conditions. The most relevant comparison, however, is against your own repository, build frequency, and test volume. 

Create a project, run your existing pipeline, and measure the execution time and cost against your current setup.

You will have the data you need to quantify the difference where it matters.

The post Semaphore CI/CD Benchmark: Performance and Cost Analysis appeared first on Semaphore.
