Most DevOps guides say:
“Enable caching — it will speed up your CI pipelines.”
I’ve done that many times in my career. Here I'd like to share some thoughts on the topic, illustrated with a small experiment.
I built a small GitLab CI lab and added dependency caching. You'd expect faster runs, right?
The result might surprise you:
My pipeline didn’t get faster at all.
In fact, in some cases, it was slightly slower.
Before jumping to conclusions — this is not a post against caching.
Caching worked exactly as expected.
It just didn’t translate into a shorter pipeline duration in this particular setup.
And that’s the part worth understanding.
This article is not about how to enable caching.
It’s about what actually happens after you enable it — and why the outcome might not match expectations.
What I Wanted to Test
I wanted to answer three simple questions:
- Does dependency caching really reduce pipeline duration?
- Where does the improvement come from?
- When is caching actually worth it?
So I built a small Python project with a multi-stage GitLab CI pipeline and measured the results.
The Setup
The pipeline has three stages:
- prepare → install dependencies
- quality → compile/lint
- test → run tests
Each job installs dependencies independently — just like in many real-world pipelines.
To make the effect visible, I used slightly heavier dependencies:
- pandas
- scipy
- scikit-learn
- matplotlib
Baseline: No Cache
Each job runs:

    time pip install -r requirements.txt
As expected:
- dependencies are downloaded in every job
- work is repeated across stages
- every pipeline run starts from scratch
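To record the install time per job rather than only the total pipeline duration, the shell `time` call can be mirrored in a few lines of Python. This is a minimal sketch, not part of the lab: `time_command` is a hypothetical helper name, and it only measures wall-clock time for whatever command it is given.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd (a list of arguments) and return (exit_code, wall_clock_seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(cmd, check=False)
    elapsed = time.perf_counter() - start
    return completed.returncode, elapsed


# Hypothetical usage, mirroring the install step each job runs:
# code, seconds = time_command(
#     [sys.executable, "-m", "pip", "install", "-r", "requirements.txt"]
# )
# print(f"pip install exited with {code} after {seconds:.1f}s")
```

Logging this number in every job makes it easier to see how much of a run is spent on installs compared with everything else.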
Results
| Run | Duration |
|---|---|
| #1 | ~38s |
| #2 | ~34s |
Adding Cache
I introduced a GitLab cache:

    .cache:
      cache:
        key:
          files:
            - requirements.txt
        paths:
          - .cache/pip
        policy: pull-push
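The point of `key: files:` is that the cache key only changes when `requirements.txt` changes. The sketch below shows that idea with a plain content hash. It is an analogy, not GitLab's algorithm: GitLab computes its own key, which its documentation describes as based on the commits that last changed the listed files. The function name `content_cache_key` and the `pip` prefix are made up for illustration.

```python
import hashlib
from pathlib import Path


def content_cache_key(paths, prefix="pip"):
    """Build a cache key that changes only when the listed files change.

    Illustration only: this hashes file contents, which is not how
    GitLab derives its key.
    """
    digest = hashlib.sha256()
    for path in sorted(paths):
        digest.update(Path(path).read_bytes())
    return f"{prefix}-{digest.hexdigest()[:16]}"
```

As long as the requirements stay the same, every job and every run resolves to the same key and can reuse the same archive. Editing the file produces a new key and a cold cache.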
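And pointed pip at the cached directory with a CI variable:

    PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"

Now downloaded packages should be reused between jobs and runs. Note that pip's cache holds downloaded files, not an installed environment, so every job still runs the install step itself.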
The Result
| Mode | Run | Duration |
|---|---|---|
| No cache | 1 | ~38s |
| No cache | 2 | ~34s |
| With cache | 1 | ~40s |
| With cache | 2 | ~38s |
Almost no difference.
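To put a number on "almost no difference", here is the arithmetic on the table above. The durations are the approximate values from the table, and two runs per mode is a very small sample, so read the result as a rough indication only.

```python
from statistics import mean

# Approximate durations in seconds, copied from the results table.
durations = {
    "no cache": [38, 34],
    "with cache": [40, 38],
}

averages = {mode: mean(runs) for mode, runs in durations.items()}
difference = averages["with cache"] - averages["no cache"]
relative = difference / averages["no cache"]

for mode, avg in averages.items():
    print(f"{mode}: {avg:.0f}s on average")
print(f"cached runs were {difference:.0f}s slower ({relative:.0%})")
```

On these numbers the cached pipeline averages 39s against 36s without a cache, roughly 8% slower, which is well within the spread between two runs of the same mode.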
Why Didn’t It Get Faster?
1. Fast package source
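If your runner uses a nearby mirror (for example, Hetzner), downloads are already fast, so there is little download time left for a cache to save.
2. pip is efficient
Most popular packages, including the four used here, ship prebuilt wheels, so pip unpacks them instead of compiling from source.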
3. Cache has overhead
- archive creation
- upload/download
- extraction
This overhead can cancel the benefit.
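The trade-off can be written as a simple break-even check: caching pays off only when the download time it removes is larger than the time spent restoring and saving the archive. The sketch below is a toy model with illustrative numbers, not measurements from this lab, and `cache_net_saving` is a hypothetical helper.

```python
def cache_net_saving(download_s, restore_s, save_s=0.0):
    """Seconds saved per job by caching; a negative value means caching costs time.

    download_s: time pip would spend downloading without a cache
    restore_s:  time to fetch and extract the cache archive
    save_s:     time to archive and upload the cache (0 for pull-only jobs)
    """
    return download_s - (restore_s + save_s)


# Illustrative numbers only.
fast_mirror = cache_net_saving(download_s=4, restore_s=3, save_s=3)
slow_network = cache_net_saving(download_s=60, restore_s=3, save_s=3)
print(fast_mirror)   # -2
print(slow_network)  # 54
```

The same overhead that is negligible on a slow network wipes out the gain on a fast one. The `save_s` term also hints at one lever: GitLab supports a `pull` cache policy, so jobs that only read the cache do not have to pay for the upload.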
4. CI jobs spend time elsewhere
- container startup
- image pulling
- repo checkout
A dependency cache speeds up none of these, so they set a floor on job duration.
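This is the same reasoning as Amdahl's law: speeding up one phase can only save as much time as that phase takes. The sketch below uses a made-up breakdown of a single job (the phase names and durations are assumptions, not measurements from this lab) to show the best possible outcome if the install step became free.

```python
def best_case_duration(phases, optimized, speedup=float("inf")):
    """Job duration if only the `optimized` phase gets `speedup` times faster."""
    return sum(
        seconds / speedup if name == optimized else seconds
        for name, seconds in phases.items()
    )


# Hypothetical breakdown of one job, in seconds.
job = {
    "image pull": 4,
    "container start": 2,
    "checkout": 1,
    "pip install": 4,
    "tests": 2,
}

total = sum(job.values())                       # 13
floor = best_case_duration(job, "pip install")  # 9.0, install takes no time
print(f"current: {total}s, best case with a perfect cache: {floor:.0f}s")
```

Even a perfect cache with zero overhead would leave 9 of the 13 seconds untouched in this example, and a real cache adds its own restore and save time on top.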
The Real Takeaway
Dependency caching is not automatically a performance optimization.
Its impact depends on:
- dependency size
- network conditions
- runner configuration
- pipeline structure
When Caching Helps
- large dependency trees
- slow networks
- distributed runners
- frequent pipeline runs
When It Might Not Help
- small projects
- fast mirrors
- short pipelines
- high cache overhead
Not Just About Speed
Caching can still:
- reduce outbound traffic
- improve resilience
- reduce dependency on external registries
What’s Next
Next step:
- testing shared cache with S3-compatible storage
Repo
You can find the full lab here:
👉 https://github.com/ic-devops-lab/devops-labs/tree/main/GitLabCIPipelinesWithDependencyCaching
Final Thought
Not every best practice gives a measurable improvement — but understanding why is where real DevOps begins.