Iuri Covaliov

Posted on Mar 19

GitLab CI Caching Didn’t Speed Up My Pipeline — Here’s Why

#devops #gitlab #cicd #python

Most DevOps guides say:

“Enable caching — it will speed up your CI pipelines.”

I’ve done that many times in my career. Here I’d like to share some thoughts on the topic, illustrated with a small experiment.

I built a small GitLab CI lab and added dependency caching. You would expect faster runs, right?

The result might surprise you:

My pipeline didn’t get faster at all.

In fact, in some cases, it was slightly slower.

Before jumping to conclusions — this is not a post against caching.

Caching worked exactly as expected.
It just didn’t translate into a shorter pipeline duration in this particular setup.

And that’s the part worth understanding.

This article is not about how to enable caching.
It’s about what actually happens after you enable it — and why the outcome might not match expectations.

What I Wanted to Test

I wanted to validate a simple assumption:

Does dependency caching really reduce pipeline duration?
Where does the improvement come from?
When is caching actually worth it?

So I built a small Python project with a multi-stage GitLab CI pipeline and measured the results.

The Setup

The pipeline has three stages:

prepare → install dependencies
quality → compile/lint
test → run tests

Each job installs dependencies independently — just like in many real-world pipelines.

To make the effect visible, I used slightly heavier dependencies:

pandas
scipy
scikit-learn
matplotlib

Baseline: No Cache

Each job runs:

time pip install -r requirements.txt

As expected:

dependencies are downloaded in every job
work is repeated across stages
every pipeline run starts from scratch
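
If you want to collect timings outside the job log, the same measurement can be scripted. The sketch below times an arbitrary command with Python's standard library. `time_command` is a hypothetical helper, not something from the lab repo, and the pip invocation in the comment simply mirrors the command above.

```python
import subprocess
import sys
import time

def time_command(cmd):
    """Run cmd (a list of arguments) and return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(cmd, check=False)
    return completed.returncode, time.perf_counter() - start

# In a CI job you would pass the install step, for example:
#   time_command([sys.executable, "-m", "pip", "install", "-r", "requirements.txt"])
# Here a trivial command keeps the sketch runnable anywhere.
code, elapsed = time_command([sys.executable, "-c", "print('hello')"])
print(f"exit code {code}, took {elapsed:.2f}s")
```

Calling pip through `sys.executable -m pip` makes sure the timing covers the same interpreter the job uses.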

Results

Run	Duration
#1	~38s
#2	~34s

Adding Cache

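I introduced a GitLab cache:

# Hidden job (leading dot): it never runs itself, it only serves as a template
.cache:
  cache:
    key:
      files:
        - requirements.txt   # the cache key changes when this file changes
    paths:
      - .cache/pip           # the directory pip is pointed at via PIP_CACHE_DIR
    policy: pull-push        # restore at job start, upload at job end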

And pointed pip's cache at that directory (GitLab only caches paths inside the project directory):

variables:
  PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"

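Now downloaded packages should be reused between jobs and runs. pip still has to install them in every job, but it no longer needs to fetch them from the index.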

The Result

Mode	Run	Duration
No cache	1	~38s
No cache	2	~34s
With cache	1	~40s
With cache	2	~38s

Almost no difference. If anything, the cached runs were a few seconds slower.
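
To put numbers on that, here is a small sketch that averages the durations from the table above. The values are the rounded figures reported in this post, and the variable and function names are only illustrative.

```python
from statistics import mean

# Rounded pipeline durations (seconds) from the results table above.
durations = {
    "no cache": [38, 34],
    "with cache": [40, 38],
}

def summarize(runs_by_mode):
    """Return the mean duration per mode, in seconds."""
    return {mode: mean(runs) for mode, runs in runs_by_mode.items()}

averages = summarize(durations)
difference = averages["with cache"] - averages["no cache"]

for mode, avg in averages.items():
    print(f"{mode}: {avg:.1f}s on average")
print(f"with cache minus no cache: {difference:+.1f}s")
```

With only two runs per mode, a gap of this size is small enough to be ordinary run-to-run variation, so it should not be read as proof that caching slows things down.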

Why Didn’t It Get Faster?

1. Fast package source

If your runner uses a nearby mirror (for example, Hetzner), downloads are already fast, so there is little download time left for a cache to save.

2. pip is efficient

Modern Python packaging relies on prebuilt wheels, so pip usually has nothing to compile: it downloads the wheel and unpacks it, which makes installs quick.

3. Cache has overhead

archive creation
upload/download
extraction

On a fast network, this overhead can cancel out the download time the cache saves.
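
The trade-off can be written as a simple inequality: caching pays off only when the download time it saves exceeds the time spent restoring and saving the cache archive. A minimal sketch follows. The function name and the example timings are hypothetical and are not measurements from this lab.

```python
def cache_net_saving(download_saved_s, restore_s, save_s):
    """Seconds saved per job by caching; negative means caching made it slower.

    download_saved_s: download time avoided because packages come from the cache
    restore_s: time to download and extract the cache archive at job start
    save_s: time to create and upload the cache archive at job end
    """
    return download_saved_s - (restore_s + save_s)

# Hypothetical job on a fast mirror: little to save, so overhead dominates.
fast_mirror = cache_net_saving(download_saved_s=4, restore_s=3, save_s=3)

# Hypothetical job on a slow network: the same overhead is easily repaid.
slow_network = cache_net_saving(download_saved_s=60, restore_s=3, save_s=3)

print(f"fast mirror: {fast_mirror:+d}s, slow network: {slow_network:+d}s")
```

The same cache configuration can therefore be a net loss on one runner and a clear win on another.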

4. CI jobs spend time elsewhere

container startup
image pulling
repo checkout

A dependency cache does not shorten any of these steps.
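
A rough way to reason about this: if downloading dependencies is only a small share of a job's total time, removing it entirely cannot speed the job up by much. The sketch below computes that upper bound. The function name and the timings are hypothetical, chosen only to illustrate the point.

```python
def max_speedup(total_s, download_s):
    """Best-case speedup if dependency download time dropped to zero.

    total_s: total job time in seconds (startup, checkout, install, work)
    download_s: the part of total_s spent downloading dependencies
    """
    if not 0 <= download_s < total_s:
        raise ValueError("download_s must be in [0, total_s)")
    return total_s / (total_s - download_s)

# Hypothetical 12-second job in which only 2 seconds are downloads.
print(f"{max_speedup(12, 2):.2f}x at best")
```

This bound ignores the cache's own overhead, so the real gain is smaller still.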

The Real Takeaway

Dependency caching is not automatically a performance optimization.

Its impact depends on:

dependency size
network conditions
runner configuration
pipeline structure

When Caching Helps

large dependency trees
slow networks
distributed runners
frequent pipeline runs

When It Might Not Help

small projects
fast mirrors
short pipelines
high cache overhead

Not Just About Speed

Caching can still:

reduce outbound traffic
improve resilience
reduce dependency on external registries

What’s Next

Next step:

testing a shared cache backed by S3-compatible storage

Repo

You can find the full lab here:
👉 https://github.com/ic-devops-lab/devops-labs/tree/main/GitLabCIPipelinesWithDependencyCaching

Final Thought

Not every best practice gives a measurable improvement — but understanding why is where real DevOps begins.
