Dependency Management: Monorepo or Polyrepo? My Choices

#dependencymanagement #monorepo #polyrepo #softwarearchitecture

When developing software, regardless of the project's size or team dynamics, a fundamental decision about dependency management eventually needs to be made: Should the codebase be kept in a single large repository (monorepo) or should each component reside in its own separate repository (polyrepo)? While this might seem like a simple choice of folder structure, it's actually a strategic decision that impacts many areas, from development speed to deployment reliability, and even inter-team communication.

In my twenty years of working with systems and software, there have been times when I got lost in fragmented polyrepo structures and times when I struggled with the complexities introduced by monorepos. Both approaches have their own advantages and disadvantages. In this post, I will compare these two approaches based on my own experiences, explaining what I prefer in which situations and why.

The Polyrepo Approach: The Challenges of Fragmented Management

The polyrepo approach, as its name suggests, means that each service, application, or library is kept in its own separate code repository. This structure, which became popular with microservice architectures, supports the philosophy of "let everything live on its own."

This approach brings some obvious advantages at first glance. Each repository can have its own independent lifecycle. It can have its own CI/CD pipeline, its own versioning strategy, and its own unique dependencies. This provides more autonomy to teams, especially in large organizations where different teams work on different products. Each team can progress at its own pace, choose its own technology stack, and deploy independently of other projects. However, this autonomy comes at a cost.

⚠️ The Nightmare of Version Conflicts

One of the biggest problems I experienced with the polyrepo structure was version conflicts, also known as "dependency hell." In a client project, we encountered unexpected errors in the production environment because 3 different microservices were using different minor versions of the same core library. It took me 2 full days to find this error and bring all services into alignment.

While working on a production ERP, when I updated a core integration module, I had to update and deploy the 7 microservices that used it one by one. Creating a separate pull request for each service, running its tests, and managing its deployment took me 5-6 hours in total. That is an unnecessary operational burden for a small change. Furthermore, keeping CI/CD pipelines in sync across repositories becomes complex over time, and maintaining a separate Jenkinsfile or GitHub Actions configuration for each repository invites inconsistencies.

# Dependency updates in an example polyrepo
# Microservice A
cd microservice-a
# npm install with an explicit version; npm update only moves within the range already in package.json
npm install common-library@1.2.0
npm test
# -a stages the modified package.json and package-lock.json before committing
git commit -am "Update common-library to 1.2.0"
git push
cd ..

# Microservice B (uses the same library but managed by a different team)
cd microservice-b
# They might be stuck on 1.1.0 instead of 1.2.0 here, or a different error might occur when they update.
npm install common-library@1.2.0
npm test
# If tests fail, the debugging process begins...

In such scenarios, especially when shared libraries are frequently updated or when all applications need to be compatible simultaneously, the polyrepo approach proved very challenging for me. When a change affects multiple repositories, coordinating these changes and deploying them atomically becomes almost impossible.
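
One way to catch the kind of version drift described above before it reaches production is a small audit script. What follows is a minimal sketch in Python, not a tool I am claiming to have used: it assumes each service's declared dependencies have already been read into a dictionary (for example from its package.json), and the service names, the express entry, and the function name find_version_drift are all hypothetical.

```python
from collections import defaultdict


def find_version_drift(manifests):
    """Return dependencies that are declared with more than one version.

    manifests maps a service name to its {dependency: version} table,
    e.g. the "dependencies" section of each service's package.json.
    """
    # dependency -> version -> list of services declaring that version
    versions_seen = defaultdict(lambda: defaultdict(list))
    for service, deps in manifests.items():
        for dep, version in deps.items():
            versions_seen[dep][version].append(service)

    # Keep only dependencies where the services disagree on the version.
    return {
        dep: {version: sorted(services) for version, services in by_version.items()}
        for dep, by_version in versions_seen.items()
        if len(by_version) > 1
    }


# Hypothetical data: three services sharing one internal library.
manifests = {
    "microservice-a": {"common-library": "1.2.0", "express": "4.18.2"},
    "microservice-b": {"common-library": "1.1.0", "express": "4.18.2"},
    "microservice-c": {"common-library": "1.2.0"},
}

drift = find_version_drift(manifests)
for dep, by_version in drift.items():
    print(dep, by_version)
```

Run on a schedule against every repository, a report like this would flag microservice-b as the one lagging behind, while express, on which all services agree, stays out of the output.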

The Monorepo Approach: A Holistic View

A monorepo is an approach where all projects, libraries, and services are kept in a single version control repository. While this might sound like "putting all your eggs in one basket" at first, it can provide significant advantages when managed with the right tools and strategies.

For me, the biggest advantage of a monorepo is simpler dependency management. Since all code is in one place, it becomes natural to use a single version of shared libraries and internal modules, which heads off most "dependency hell" scenarios. In a production ERP where critical units like production planning, operator screens, and reporting modules all lived in the same codebase, I could update a common security module and every application that depended on it in a single PR, which greatly simplified auditing and deployment.

ℹ️ The Power of Atomic Commits

A single commit I make in a monorepo can include both a change in a common library and the adaptations in all services using that library. This is an invaluable advantage, especially when performing a major refactoring or API change. A single PR keeps the entire codebase consistent: there is no commit in which the library and its callers disagree.

When I consolidated all the services of my side product's backend into a single monorepo, dependency conflicts dropped by roughly 80%. Code sharing and reuse between projects also became much easier: instead of copying code from one project to another, I can import it directly or extract a common library and use it across all projects. This improves code quality and reduces repetitive work.

However, monorepos also have their own challenges. The repository size can grow very large over time. In an ERP project, when the monorepo size exceeded 50GB, git clone times reached 15 minutes. This prolonged the onboarding time for new team members and could even slow down the daily development flow. Additionally, managing a monorepo without the right tooling can turn into a nightmare. Running all monorepo tests with every commit could extend CI pipeline times up to 45 minutes. This made it impossible to get quick feedback.

# Example pnpm-workspace.yaml file (monorepo tooling example)
packages:
  - 'apps/*' # My applications: web, admin, api
  - 'packages/*' # My common libraries: ui-kit, common-utils, types

Thanks to this structure, a single pnpm install at the repository root installs the dependencies of every project in one fast operation. I overcame the initial CI slowdowns with the affected-project detection in tools like Nx (the nx affected commands) and Turborepo. By running tests and builds only for changed projects, I sped up CI cycles by about 70%. This showed me how the right tools can neutralize the potential disadvantages of a monorepo.
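
The idea behind affected-project detection can be sketched in a few lines. The Python below is a simplified illustration of the general technique, not how Nx or Turborepo implement it: it assumes the project graph is already known, and the dependency edges between the apps and packages from the workspace file above are hypothetical.

```python
from collections import deque


def affected_projects(changed_files, project_roots, depends_on):
    """Return the set of projects that need to be rebuilt and retested.

    changed_files: paths changed in the commit, relative to the repo root.
    project_roots: {project name: directory of that project}.
    depends_on:    {project name: list of projects it imports from}.
    """
    # 1. Projects whose own files changed.
    directly_changed = {
        name
        for name, root in project_roots.items()
        for path in changed_files
        if path.startswith(root.rstrip("/") + "/")
    }

    # 2. Invert the graph: for each project, which projects depend on it?
    dependents = {name: set() for name in project_roots}
    for name, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(name)

    # 3. Breadth-first walk from the changed projects to everything downstream.
    affected = set(directly_changed)
    queue = deque(directly_changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected


# Hypothetical layout matching the apps/* and packages/* globs.
project_roots = {
    "web": "apps/web",
    "admin": "apps/admin",
    "api": "apps/api",
    "ui-kit": "packages/ui-kit",
    "common-utils": "packages/common-utils",
    "types": "packages/types",
}
# Hypothetical dependency edges.
depends_on = {
    "web": ["ui-kit", "types"],
    "admin": ["ui-kit", "types"],
    "api": ["common-utils", "types"],
    "ui-kit": ["common-utils"],
}

print(sorted(affected_projects(["packages/ui-kit/src/button.tsx"], project_roots, depends_on)))
# ['admin', 'ui-kit', 'web']
```

A change to ui-kit leaves api untouched, so its tests and build can be skipped. That skipped work is where the CI savings come from.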

Practical Solutions and Tools for Dependency Management

Both approaches have their own challenges, so the tools and practices around dependency management matter a great deal. My goal is to increase reliability while reducing complexity.

In polyrepo scenarios, I strictly adhere to Semantic Versioning (SemVer) rules to minimize version conflicts. I also use tools to pin dependencies. For example, in Python projects, I use pip-compile (from the pip-tools package) to generate requirements.txt files that pin every direct and transitive dependency to an exact version. I used this method for different microservices in a client's e-commerce infrastructure, and it largely prevented issues caused by version discrepancies in the production environment. In the JavaScript ecosystem, committing package-lock.json or yarn.lock files serves the same purpose.

# Example of pinning dependencies in a Python project
# My requirements.in file
flask
requests==2.28.1

# Creating requirements.txt with pip-compile
pip-compile requirements.in -o requirements.txt

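# Contents of the generated requirements.txt (illustrative; exact pins depend on when it is compiled)
#
# This file is autogenerated by pip-compile
# To update, run:
#
#    pip-compile --output-file=requirements.txt requirements.in
#
blinker==1.6.2
    # via flask
certifi==2023.5.7
    # via requests
charset-normalizer==2.1.1
    # via requests
click==8.1.3
    # via flask
flask==2.3.2
    # via -r requirements.in
idna==3.4
    # via requests
itsdangerous==2.1.2
    # via flask
jinja2==3.1.2
    # via flask
markupsafe==2.1.3
    # via
    #   jinja2
    #   werkzeug
requests==2.28.1
    # via -r requirements.in
urllib3==1.26.16
    # via requests
werkzeug==2.3.6
    # via flask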
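
A pinned file only helps if it stays pinned, so a small CI guard can be worth having. The following is a deliberately simple sketch in Python, assuming a plain requirements.txt; it does a text-level check rather than a full requirement parse, and the function name unpinned_requirements and the sample lines are hypothetical.

```python
def unpinned_requirements(lines):
    """Return requirement lines that are not pinned to an exact version."""
    unpinned = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and options such as -r or --hash
            continue
        # Drop environment markers such as `; python_version < "3.11"`.
        requirement = line.split(";", 1)[0].strip()
        # "==" with a wildcard (e.g. 1.26.*) still allows several versions.
        if "==" not in requirement or "*" in requirement:
            unpinned.append(requirement)
    return unpinned


# Hypothetical file contents with two problems hidden among valid pins.
sample = """
flask==2.3.2
    # via -r requirements.in
requests>=2.28
urllib3==1.26.*
idna==3.4  # via requests
""".splitlines()

print(unpinned_requirements(sample))
# ['requests>=2.28', 'urllib3==1.26.*']
```

Failing the build when this list is non-empty keeps a hand-edited range from slipping into a file that is supposed to be fully resolved.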

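On the monorepo side, the workspace features of the modern JavaScript ecosystem make my life easier. With npm workspaces, Yarn workspaces, or my favorite, pnpm workspaces, I can manage multiple projects and libraries within a single repository. In the monorepo that holds my side product's React frontend and FastAPI backend, I manage the JavaScript dependencies from a single place with pnpm workspaces; the backend's Python dependencies need their own tooling (such as the pip-compile flow above), since pnpm only handles Node packages. The pnpm install command installs all dependencies in about 30 seconds and also saves disk space, because pnpm keeps one copy of each package version in a shared store and links it into each project.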

To make monorepos more efficient, I use specialized build systems like Nx, Turborepo, or Bazel. These tools detect which parts of the code changed and run only the relevant tests and builds in CI/CD pipelines. In one of my projects, I used Nx's affected commands to run tests and builds only for changed projects and the projects that depend on them. This accelerated CI cycles by more than 70%: pipelines that used to take 45 minutes finished in 5-7 minutes. Without these optimizations, it's not possible to fully realize the benefits of a monorepo.

💡 Smart Build Systems

If you're using a monorepo, definitely check out tools like Nx or Turborepo. These tools can dramatically speed up your CI/CD processes by running builds and tests only on changed code parts. This is key to maintaining development speed in large monorepos.
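
Besides skipping unaffected projects, these tools also avoid repeating work through caching: a task's inputs are hashed, and a stored result is reused when the hash has not changed. The Python below is a simplified illustration of that general idea, not the actual Nx or Turborepo implementation. The in-memory cache, the function names, and the sample file are hypothetical.

```python
import hashlib


def task_cache_key(task_name, input_files, dependency_keys=()):
    """Derive a deterministic cache key for one build or test task.

    input_files:     {relative path: file contents as bytes}
    dependency_keys: cache keys of upstream tasks this one depends on
    """
    digest = hashlib.sha256()
    digest.update(task_name.encode() + b"\0")
    # Sort so the key does not depend on dict or filesystem ordering.
    for path in sorted(input_files):
        digest.update(path.encode() + b"\0")
        digest.update(hashlib.sha256(input_files[path]).digest())
    for key in sorted(dependency_keys):
        digest.update(key.encode() + b"\0")
    return digest.hexdigest()


cache = {}  # hypothetical in-memory cache: key -> stored result


def run_cached(task_name, input_files, run, dependency_keys=()):
    """Call run() only if no result is stored for the current inputs."""
    key = task_cache_key(task_name, input_files, dependency_keys)
    if key not in cache:
        cache[key] = run()
    return cache[key]


runs = []
inputs = {"packages/ui-kit/button.tsx": b"export const Button = 1;"}
run_cached("test:ui-kit", inputs, lambda: runs.append("ran") or "passed")
run_cached("test:ui-kit", inputs, lambda: runs.append("ran") or "passed")  # cache hit
print(len(runs))  # 1: the second call reused the stored result
```

Including the upstream keys in the hash means a change in a shared library invalidates the cached results of everything built on top of it, which is what keeps the shortcut safe.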

I also use services like Dependabot that automatically update dependencies. This is critical for patching security vulnerabilities and keeping libraries up-to-date. However, automatic updates need to pass through testing processes, and care must be taken against potential breaking changes. These tools significantly reduce the burden of manual dependency management.
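
To decide which automated updates deserve extra scrutiny, the SemVer bump type is a useful first signal. The sketch below is a hypothetical triage policy in Python, not a Dependabot feature: it assumes plain MAJOR.MINOR.PATCH versions without pre-release tags, and the function names and the review policy are illustrative.

```python
def parse_version(version):
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints (no pre-release support)."""
    major, minor, patch = version.split(".")
    return int(major), int(minor), int(patch)


def classify_bump(current, proposed):
    """Return 'major', 'minor', 'patch', or 'none' if it is not an upgrade."""
    old, new = parse_version(current), parse_version(proposed)
    if new <= old:
        return "none"
    if new[0] != old[0]:
        return "major"
    if new[1] != old[1]:
        return "minor"
    return "patch"


def needs_manual_review(current, proposed):
    """Hypothetical policy: review major bumps and any bump of a 0.x package.

    Under SemVer, a major bump signals breaking changes, and 0.x releases
    make no stability promises at all.
    """
    bump = classify_bump(current, proposed)
    return bump == "major" or (bump != "none" and parse_version(current)[0] == 0)


print(classify_bump("1.1.0", "1.2.0"))         # minor
print(needs_manual_review("2.28.1", "3.0.0"))  # True
print(needs_manual_review("1.1.0", "1.2.0"))   # False
```

Everything that does not need manual review still goes through the normal test suite. The policy only decides where a human looks first.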

Factors I Consider When Making a Decision

When choosing between a monorepo and a polyrepo, it's necessary to look not only at technical details but also at organizational and operational factors. In my own projects and the companies I've worked for, I consider the following when making these decisions:

Team Size and Structure: In projects with a small, single team, the simplicity and speed offered by a monorepo are generally more advantageous. Coordination is easier because everyone is on the same codebase. However, for large, independent teams working on different products or business areas, polyrepos can provide more autonomy. Still, even in large teams, managing a monorepo with the right tools (like Nx) can be beneficial for ensuring the consistency of common libraries.
Project Size and Complexity: How many common components does the project have? How often do these components change? If there are many shared libraries and modules that are frequently updated, the atomic commit capability of a monorepo provides a significant advantage. If projects are completely independent and have very few common points, a polyrepo might make more sense. It doesn't make much sense for the native code of my Android spam application and the web-based financial calculator code of one of my side products to be in the same repository, because their technology stacks and dependencies are completely different.
CI/CD Processes and Required Performance: How critical are build and deploy times? A monorepo can be optimized with smart build systems, but without the right tools its CI/CD pipelines slow down as the codebase grows. In one of my projects, a build taking more than 20 minutes was unacceptable, which made build optimizations such as the incremental builds offered by monorepo tooling critical. In a polyrepo, setting up and managing a separate pipeline for each project can require more operational effort.

🔥 Cost Factor

In my experience, CI/CD costs in an unoptimized monorepo can be 2-3 times higher than in a polyrepo. Because every commit lands in one large codebase, running all tests and builds each time consumes both time and resources (CI/CD minutes, server cost). Therefore, smart build systems and caching mechanisms are vital.

Security Concerns: Keeping all code in a single repository means that, in theory, a single compromised account or leaked credential can expose every project. In a polyrepo, the risk is more distributed. However, with good security practices and access control, this risk can also be managed in a monorepo. The important thing is to correctly determine the risk tolerance and the security layers to be applied.
Organizational Culture: Do teams prefer autonomy or tight coordination? This factor is often overlooked, but it is very important. If teams are very fond of their independence, the transition to a monorepo can be culturally challenging. However, in an environment where inter-team collaboration is encouraged to achieve common goals, a monorepo can be adopted more easily.

By combining these factors, I try to determine the most suitable approach for each project's unique needs. I've learned from my own experiences that there isn't always a single right answer.

My Preferences and Reasons

In my nearly two decades of experience, I've tried monorepo and polyrepo approaches countless times and have ultimately developed some clear preferences for myself. I now lean more towards monorepos in most scenarios, especially for projects that are related and use common libraries.

Why? Because the benefits of atomic changes and unified dependency management far outweigh the tooling complexity, which modern monorepo tools (Nx, Turborepo) largely take care of. Especially in an ERP project with critical units like production planning, being able to test and deploy a dependency change that could affect the production line, together with all operator screens and reporting modules, within a single PR is a huge advantage for me. It narrows the scope of potential errors and simplifies rollbacks, since reverting one commit reverts the library and every caller together.

💡 Refactoring Power

A monorepo provides incredible ease for large-scale refactoring operations. When you change an API in a shared library, you can open all projects using that library in a single IDE, update them with a single commit, and test them. In a polyrepo, this can turn into a nightmare requiring days, numerous PRs, and deployments.

In most of my side products, I keep the web frontend, API backend, and common libraries in a single repository. This has increased my development speed and deployment reliability. For example, in the financial calculators I built for my own site, having both the frontend and backend use the same type definition files (TypeScript types) becomes much easier thanks to the monorepo. When I change a type definition, ensuring that both the frontend and backend adapt to this change is possible with a single PR.

However, I don't always prefer a monorepo. For independent products with very different technology stacks, almost no interaction with each other, and managed by different teams, a polyrepo can still make sense. For instance, as I mentioned, the native code of my Android spam application and the web-based financial calculator code of one of my side products should not be in the same repository. The dependency between these two projects is almost zero, and they are written in different languages for different platforms. In this case, a monorepo would only introduce unnecessary complexity.

Sometimes I also adopt a "hybrid" approach: a structure that behaves logically like a monorepo but consists of physically separate repositories. This usually comes up during the modernization of very large and old systems. Common libraries can be kept in a separate repository, while applications have their own repositories. In this case, it's necessary to implement a strict versioning and release management strategy.

In conclusion, my preference is mostly towards a monorepo. However, this preference is always shaped by the specific needs, scale, and expectations of the project and the team. What's important is to know the strengths and weaknesses of both approaches and to be able to choose the right tool at the right time.

Conclusion

For dependency management in software projects, monorepo and polyrepo approaches are fundamental architectural decisions, each with its own advantages and disadvantages. Over nearly two decades, I have seen first-hand how these two approaches work in different scenarios, what challenges they bring, and what benefits they provide.

While a polyrepo might initially seem simpler for small, independent projects with different technology stacks, it can lead to problems like "dependency hell" and operational complexity, especially when common libraries are frequently updated and inter-project coordination is critical. A monorepo, when managed with the right tools (Nx, Turborepo) and strategies, can offer significant advantages such as atomic commits, simplified dependency management, and easier refactoring, thereby increasing development speed and reliability.

It's important to remember that there is no single "best" solution. When making a decision, factors such as team size, project complexity, CI/CD expectations, security concerns, and even organizational culture need to be considered. While my personal preference is generally for a monorepo, I believe that each project should be evaluated within its own context. The key is to understand all the nuances of these approaches and consciously choose the one that best suits your project's needs.