Claudio Cesar

From Chaos to Perfect Flow: My Experience Automating a Massive Real GitLab Migration (4,000 Repos)

Migrating thousands of repositories can feel like one of those nightmares that wake engineers up in the middle of the night. But for me, it became one of the most transformative experiences of my journey with DevOps and automation.
This article is an honest and practical report of how I automated the migration of approximately 4,000 GitLab Community Edition projects to GitLab Enterprise Edition — without disrupting teams, without losing history, and without breaking production pipelines.
It is a story about chaos, discipline, automation, and, above all, how errors are an essential part of the process.

Why This Migration Needed to Happen

GitLab Community Edition works well in the beginning. But as teams grow, repositories multiply, and CI/CD volume explodes, clear limitations appear:

  • lack of governance
  • inconsistent configurations
  • heterogeneous runners
  • unstable pipelines
  • absence of essential enterprise features

Migrating to GitLab Enterprise Edition was more than a technical improvement.

It was a step toward organizational maturity.

The challenge?
There is no official migration process that preserves everything correctly.
With no alternative, I had to create a custom solution.

What Makes This Migration So Complex?

Moving 4,000 projects is not just moving folders.
It means preserving an entire ecosystem, including:

  • full commit history
  • protected branches
  • release tags
  • variables and secrets
  • scattered CI/CD includes
  • issues and comments
  • archived project state
  • runners and permissions

GitLab’s native import/export tool breaks large parts of this along the way.
I needed automation capable of migrating everything — reliably.

The Toolkit That Built the Bridge

I developed a modular set of Bash scripts to maintain absolute control over each part of the migration.
It is publicly available here:

👉 https://github.com/clcesarval/migrar-gitlab

The architecture had two clear layers:

1. The Git Layer

Responsible for:

  • cloning all repositories
  • reconstructing branches
  • ensuring DAG integrity
  • preserving tags exactly as in CE
  • reconfiguring remotes
  • restoring archived state

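To make the idea concrete, here is a minimal sketch of what this layer does for a single repository, assuming SSH access to both instances. The hostnames and the projects.txt list are placeholders of mine, not the exact interface of the published scripts.

    #!/usr/bin/env bash
    # Sketch of the Git layer: mirror each repository from the old CE instance
    # to the new EE instance, preserving branches, tags, and the full commit DAG.
    # OLD_HOST, NEW_HOST, and projects.txt are illustrative placeholders.
    set -euo pipefail

    OLD_HOST="gitlab-ce.example.com"
    NEW_HOST="gitlab-ee.example.com"

    while read -r project_path; do
      workdir="$(mktemp -d)"

      # --mirror fetches every ref and keeps the commit history intact
      git clone --mirror "git@${OLD_HOST}:${project_path}.git" "${workdir}"

      # Repoint the remote at the destination instance
      git -C "${workdir}" remote set-url origin "git@${NEW_HOST}:${project_path}.git"

      # Push branches and tags explicitly; GitLab refuses writes to its hidden
      # internal refs, so a blind `git push --mirror` can report spurious errors
      git -C "${workdir}" push origin --all
      git -C "${workdir}" push origin --tags

      rm -rf "${workdir}"
    done < projects.txt   # one "group/subgroup/project" path per line
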
2. The API Layer

Responsible for:

  • variables and secrets
  • issues and comments
  • group hierarchies
  • permissions
  • recreating projects in the destination

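The sketch below shows the flavor of this layer for a single project, using the GitLab REST API with curl and jq. The hosts, tokens, IDs, and project name are assumptions for illustration; the real scripts in the repository handle far more edge cases.

    #!/usr/bin/env bash
    # Sketch of the API layer: recreate a project on the destination and copy
    # its CI/CD variables with their scopes. Hosts, tokens, and IDs below are
    # illustrative placeholders, not the exact interface of the toolkit.
    set -euo pipefail

    OLD_API="https://gitlab-ce.example.com/api/v4"
    NEW_API="https://gitlab-ee.example.com/api/v4"
    OLD_TOKEN="change-me"     # token with read access on the CE instance
    NEW_TOKEN="change-me"     # token with write access on the EE instance

    src_project_id=123        # project ID on the CE instance
    dest_namespace_id=45      # target group ID on the EE instance

    # 1. Recreate the project under the right group on the destination
    new_id=$(curl --silent --fail --request POST \
      --header "PRIVATE-TOKEN: ${NEW_TOKEN}" \
      --data "name=my-service&namespace_id=${dest_namespace_id}" \
      "${NEW_API}/projects" | jq -r '.id')

    # 2. Copy each CI/CD variable, preserving key, value, and environment scope
    curl --silent --fail --header "PRIVATE-TOKEN: ${OLD_TOKEN}" \
      "${OLD_API}/projects/${src_project_id}/variables?per_page=100" |
      jq -c '.[]' | while read -r var; do
        curl --silent --fail --request POST \
          --header "PRIVATE-TOKEN: ${NEW_TOKEN}" \
          --data-urlencode "key=$(jq -r '.key' <<<"${var}")" \
          --data-urlencode "value=$(jq -r '.value' <<<"${var}")" \
          --data-urlencode "environment_scope=$(jq -r '.environment_scope' <<<"${var}")" \
          "${NEW_API}/projects/${new_id}/variables" >/dev/null
      done
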
Each script did only one thing — but did it precisely.

Some key components:

  • clone-projects.sh
  • replace_gitlab-ci.sh
  • push_projects.sh
  • migrar-variaveis.sh
  • migrate-issues.sh
  • recursive scripts for subgroups
  • complete runner inventory

It was purposeful automation — not magic.

The Hidden Chaos That Appeared

The official documentation helped, but only partially. Many sections were inconsistent, incomplete, or simply failed when applied at large scale.

I also identified issues such as:

  • CI includes using absolute paths that would break after migration
  • duplicated or unsafely scoped variables
  • runners with unpredictable configurations
  • archived projects reappearing as active when using import/export (addressed in the sketch below)
  • tags being incorrectly recreated

Using the official process could have caused severe operational impact across the organization.
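
The archived-state problem is a good example of what pushed me toward the API: archived status can be carried over explicitly instead of trusting import/export. A simplified sketch, assuming each project keeps the same "group/project" path on the destination; hosts and tokens are placeholders:

    #!/usr/bin/env bash
    # Sketch: walk every archived project on the CE instance and archive its
    # counterpart on the EE instance, so archived repositories stay archived.
    # Hosts and tokens are placeholders; assumes paths match on both sides.
    set -euo pipefail

    OLD_API="https://gitlab-ce.example.com/api/v4"
    NEW_API="https://gitlab-ee.example.com/api/v4"
    OLD_TOKEN="change-me"
    NEW_TOKEN="change-me"

    page=1
    while :; do
      # Page through the archived projects on the source instance
      batch=$(curl --silent --fail --header "PRIVATE-TOKEN: ${OLD_TOKEN}" \
        "${OLD_API}/projects?archived=true&per_page=100&page=${page}")
      [ "$(jq 'length' <<<"${batch}")" -eq 0 ] && break

      jq -r '.[].path_with_namespace' <<<"${batch}" | while read -r path; do
        encoded=$(jq -rn --arg p "${path}" '$p|@uri')
        curl --silent --fail --request POST \
          --header "PRIVATE-TOKEN: ${NEW_TOKEN}" \
          "${NEW_API}/projects/${encoded}/archive" >/dev/null \
          && echo "archived on destination: ${path}"
      done

      page=$((page + 1))
    done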

Pilot Tests… and the Errors That Saved the Migration

To ensure everything would work, I ran several pilot tests in isolated environments.
And all of them — without exception — failed at first.

I faced:

  • pipelines breaking due to outdated includes, for example:

        include: /old-group/subgroup/template.yml

  • missing variables that caused job failures
  • runners rejecting executions
  • invalid YAML caused by nearly invisible formatting details
  • tags disappearing or being recreated incorrectly
  • group hierarchies being created out of order

And that was exactly why the migration succeeded.

I built a script that recursively analyzed every include inside the cloned directories and automatically replaced paths according to the new GitLab structure.
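
In spirit, that rewrite step looks something like the sketch below. The clones directory and the old/new group prefixes are placeholders, and the real script in the repository does more validation.

    #!/usr/bin/env bash
    # Sketch of the include-rewrite step: find CI/YAML files in the cloned
    # repositories that still reference the old group path and rewrite them
    # to the new structure. CLONES_DIR and both prefixes are placeholders.
    set -euo pipefail

    CLONES_DIR="./clones"
    OLD_PREFIX="/old-group/"
    NEW_PREFIX="/new-group/platform/"

    find "${CLONES_DIR}" -type f -name '*.yml' -print0 |
      while IFS= read -r -d '' file; do
        if grep -q "${OLD_PREFIX}" "${file}"; then
          # Rewrite the absolute include path in place; '#' avoids escaping '/'
          sed -i "s#${OLD_PREFIX}#${NEW_PREFIX}#g" "${file}"
          echo "rewrote includes in ${file}"
        fi
      done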

Every failure revealed the real behavior of GitLab.
Errors taught more than any documentation ever could.
With each test I:

  • added new validations
  • created filters
  • strengthened logs
  • fixed edge cases
  • implemented idempotent checks (see the sketch below)

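"Idempotent checks" sounds abstract, so here is a minimal sketch of the pattern: before creating anything, ask the destination whether it already exists, so re-running a script never duplicates work. The host, token, and project path are placeholders.

    #!/usr/bin/env bash
    # Sketch of an idempotent check: only create a project on the destination
    # if it does not already exist, so re-running the migration is always safe.
    # NEW_API, NEW_TOKEN, and the project path are illustrative placeholders.
    set -euo pipefail

    NEW_API="https://gitlab-ee.example.com/api/v4"
    NEW_TOKEN="change-me"
    project_path="new-group/platform/my-service"

    # The API accepts a URL-encoded "group/project" path instead of a numeric ID
    encoded=$(jq -rn --arg p "${project_path}" '$p|@uri')

    if curl --silent --fail --output /dev/null \
         --header "PRIVATE-TOKEN: ${NEW_TOKEN}" \
         "${NEW_API}/projects/${encoded}"; then
      echo "already exists, skipping: ${project_path}"
    else
      echo "missing on destination, creating: ${project_path}"
      # ...project-creation call from the API layer would go here...
    fi
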
When the tests finally ran clean, the automation proved truly reliable.

The Real Migration Day

When the full migration pipeline ran, the outcome was exactly what you would expect from a well-engineered solution:

  • no history lost
  • no broken tags
  • pipelines working properly
  • variables recreated with correct scopes
  • hierarchy preserved
  • code arriving exactly as expected
  • archived repositories staying archived
  • runners behaving consistently

It was like transforming a chaotic environment into a predictable, standardized, governed system.

The Impact on the Teams

Even without formal DORA metrics collected early on, the post-migration improvement was clear.

After the migration:

  • Deploys became more frequent
  • Lead time dropped drastically
  • Pipelines became more stable
  • Change Failure Rate decreased
  • Standardized runners reduced unexpected failures
  • Recovery became faster thanks to intact history

The entire organization began delivering faster.

What This Experience Taught Me

This project taught me that:

  • official documentation never covers 100% of cases
  • standardization is the foundation of reliability
  • pipelines depend deeply on clean, well-scoped variables
  • runners define the invisible health of CI/CD
  • errors play an essential role in engineering
  • good automation builds confidence

And the biggest lesson:

When you reorganize the foundation, the entire development flow improves.

Conclusion

Migrating thousands of projects is not just a technical challenge.
It’s a test of discipline, engineering, and resilience.

With deterministic automation, real pilot tests, and learning through errors, it was possible to migrate around 4,000 repositories safely, preserving code integrity, pipelines, history, and governance.

A future evolution is also under consideration: rewriting the toolkit in Python, enabling more flexibility and fluidity without losing the robustness of the current Bash solution.

The solution is publicly available and may help others facing similar challenges:

⭐ If you liked the project, please leave a ⭐ star ⭐ on GitHub — it helps support and recognize the work.
👉 https://github.com/clcesarval/migrar-gitlab

If you are planning a large-scale migration, remember:

Errors are not enemies.
They are guides.
And good automation turns fear into confidence.

References

GitLab Documentation — https://docs.gitlab.com/ee/

GitLab Import/Export Guide — https://docs.gitlab.com/ee/user/project/settings/import_export.html

GitLab API Reference — https://docs.gitlab.com/ee/api/

Google Cloud — DORA Research & Four Key Metrics — https://cloud.google.com/blog/products/devops-sre/using-the-four-keys-to-measure-your-devops-performance

Accelerate: State of DevOps Report (DORA) — https://www.devops-research.com/research.html

Martin Fowler — Continuous Delivery & IaC — https://martinfowler.com/

Google SRE Book — https://sre.google/sre-book/

Pro Git — Git Internals — https://git-scm.com/book/en/v2/Git-Internals-Plumbing-and-Porcelain

Advanced Bash Scripting Guide — https://tldp.org/LDP/abs/html/

GitHub Engineering — https://github.blog/engineering/
